Carlo Manuali @ Cluster Studenti
Installing, Configuring and Using OpenPBS 2.3.16 on Linux Debian 4.0 (kernel 2.6.18-6-486) with gcc 3.4 or 4.1 compiler
(Remarks on May-June 2008)
1) ./configure && make:
1.a) Front-End:
./configure --disable-gui --enable-syslog --enable-clients \
--set-default-server=<server_name> --with-scp
1.b) Nodes:
./configure --disable-gui --enable-syslog --enable-clients \
--enable-mom --disable-server --set-default-server=<server_name> --with-scp
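The two flag sets above differ only in the MOM/server switches, so they can be kept in one helper. This is a minimal sketch: the role names "frontend" and "node" and the host name master.example.org are placeholders invented here, not OpenPBS terms.

```shell
# Print the ./configure flags from step 1 for a given role.
# Roles "frontend" and "node" are names chosen for this sketch only.
pbs_configure_flags() {
    role=$1
    server=$2
    common="--disable-gui --enable-syslog --enable-clients"
    case "$role" in
        frontend)
            echo "$common --set-default-server=$server --with-scp" ;;
        node)
            echo "$common --enable-mom --disable-server --set-default-server=$server --with-scp" ;;
        *)
            echo "unknown role: $role" >&2
            return 1 ;;
    esac
}

# Show the full command for an execution node (server name is a placeholder).
echo "./configure $(pbs_configure_flags node master.example.org)"
```

The helper only prints the command line; running ./configure and make is still done by hand from the source directory.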
2) Error:
make[4]: *** No rule to make target `<built-in>', needed by `attr_atomic.o'
2.a) Modify buildutils/makedepend-sh, adding the 'grep -v ">$"' line to the pipeline:
eval $CPP $arg_cc $d/$s $errout | \
sed -n -e "s;^\# [0-9][0-9 ]*\"\(.*\)\";$f: \1;p" | \
grep -v "$s\$" | \
grep -v ">$" | \
sed -e 's;\([^ :]*: [^ ]*\).*;\1;' \
>> $TMP
2.b) make clean, then repeat step 1)
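What the extra grep does can be seen on a few sample lines. The dependency entries below are made up for the demo; they imitate the "target: file" lines produced by the first sed stage, where newer gcc preprocessors also report pseudo-files such as <built-in> and <command line>.

```shell
# Illustrative dependency lines in the "target: file" form.
# These entries are invented for the demo, not copied from a real build.
sample_deps() {
    printf '%s\n' \
        'attr_atomic.o: ../include/pbs_ifl.h' \
        'attr_atomic.o: <built-in>' \
        'attr_atomic.o: <command line>' \
        'attr_atomic.o: ../include/list_link.h'
}

# The added filter: drop every line whose last character is '>'.
sample_deps | grep -v '>$'
```

Only the two lines naming real header files survive, so make no longer looks for a rule to build `<built-in>'.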
3) Error:
pbs_log.c:114: error: `_POSIX_PATH_MAX' undeclared here (not in a function)
3.a) Add to /usr/local/include/limits.h (root user):
#define _POSIX_PATH_MAX 255
3.b) make (continue)
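The define from step 3.a can be appended so that repeating the step does no harm. A sketch only: the #ifndef guard and the function name add_posix_path_max are additions made here, and the demo writes to a scratch file rather than to /usr/local/include/limits.h.

```shell
# Append the definition from step 3.a to a header, only if the header does
# not mention _POSIX_PATH_MAX yet. The #ifndef guard is an assumption of this
# sketch; the original note adds the bare #define.
add_posix_path_max() {
    header=$1
    if grep -q '_POSIX_PATH_MAX' "$header" 2>/dev/null; then
        return 0    # already present, leave the file alone
    fi
    printf '%s\n' \
        '#ifndef _POSIX_PATH_MAX' \
        '#define _POSIX_PATH_MAX 255' \
        '#endif' >> "$header"
}

# Demo on a scratch file; the second call is a no-op.
scratch=$(mktemp)
add_posix_path_max "$scratch"
add_posix_path_max "$scratch"
cat "$scratch"
rm -f "$scratch"
```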
4) Error:
math.h:27:3: error: #error "This Intel <math.h> is for use with only the Intel compilers!"
4.a) mv /usr/local/include/math.h /usr/local/include/math.h.old (root user)
4.b) make clean, then repeat step 1)
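Before renaming the header it may be worth checking that it really is the Intel one. A minimal sketch, assuming the marker text from the error message appears in the file; set_aside_intel_header is a name invented here, and the demo runs in a scratch directory.

```shell
# Rename a header to <name>.old only if it contains the Intel-only marker
# seen in the error message; any other file is left untouched.
set_aside_intel_header() {
    hdr=$1
    if [ -f "$hdr" ] && grep -q 'Intel compilers' "$hdr"; then
        mv "$hdr" "$hdr.old"
        echo "moved $hdr to $hdr.old"
    else
        echo "left $hdr alone"
    fi
}

# Demo with a fake header in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' '#error "This Intel <math.h> is for use with only the Intel compilers!"' > "$demo_dir/math.h"
set_aside_intel_header "$demo_dir/math.h"
rm -rf "$demo_dir"
```

On the real system the argument would be /usr/local/include/math.h, run as root.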
5) Errors:
pbs_log.c:306: undefined reference to `errno'
svr_connect.c:141: undefined reference to `errno' (Front-End installation only)
OR
/usr/bin/ld: errno: TLS definition in /lib/libc.so.6 section .tbss mismatches non-TLS reference in
/lib/libc.so.6: could not read symbols: Bad value
5.a) Add to src/lib/Liblog/pbs_log.c:
#include <errno.h>
5.b) Add to src/server/svr_connect.c (Front-End installation only):
#include <errno.h>
5.c) make (continue)
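The include from steps 5.a and 5.b can be added with a small helper. This is a sketch under one assumption: the line is placed at the very top of the file, which the notes above do not specify. The function name ensure_errno_include is invented here.

```shell
# Insert '#include <errno.h>' as the first line of a C file unless the file
# already has that include. Placing it at the top is a choice of this sketch.
ensure_errno_include() {
    src=$1
    if grep -q '^#include <errno.h>' "$src"; then
        return 0
    fi
    tmp="$src.tmp.$$"
    { echo '#include <errno.h>'; cat "$src"; } > "$tmp" && mv "$tmp" "$src"
}

# Demo on a scratch file standing in for pbs_log.c.
scratch=$(mktemp)
printf '%s\n' 'extern int errno;' 'int main(void) { return 0; }' > "$scratch"
ensure_errno_include "$scratch"
head -n 2 "$scratch"
rm -f "$scratch"
```

In the source tree it would be called once for src/lib/Liblog/pbs_log.c and, on the Front-End only, once for src/server/svr_connect.c.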
6) Error (gcc 4.1 compiler only):
../include/pbs_nodes.h:196: error: array type has incomplete element type
6.a) Modify src/include/pbs_nodes.h:
-extern struct attribute_def node_attr_def[]; /* node attributes defs */
+extern struct attribute_def *node_attr_def[]; /* node attributes defs */
6.b) make (continue)
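The one-line change in step 6.a can also be applied with sed. A minimal sketch: patch_node_attr_def is a name invented here, it writes the patched text to standard output, and the demo works on a scratch file rather than on src/include/pbs_nodes.h.

```shell
# Print the file with the node_attr_def declaration rewritten as in step 6.a.
# The original file is not modified; redirect the output to keep the result.
patch_node_attr_def() {
    sed 's/^extern struct attribute_def node_attr_def\[\];/extern struct attribute_def *node_attr_def[];/' "$1"
}

# Demo on a scratch file holding only the affected line.
scratch=$(mktemp)
printf '%s\n' 'extern struct attribute_def node_attr_def[]; /* node attributes defs */' > "$scratch"
patch_node_attr_def "$scratch"
rm -f "$scratch"
```

For the real header, redirect the output to a new file, compare it with the original, and only then move it into place.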
7) make install (root user)
Installation Complete!
8) Tasks on the Execution Nodes (root user):
8.a) Check $PBS_HOME/mom_priv/config:
$logevent 0x1ff
$clienthost <master_node_ip..1>
$clienthost <master_node_ip..N>
8.b) Start /usr/local/sbin/pbs_mom and add it to rc.local
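With several master addresses, the mom_priv/config file from step 8.a can be generated instead of typed. A sketch only: write_mom_config is a name invented here, and the 192.0.2.x addresses are documentation placeholders, not real master nodes.

```shell
# Write a MOM config with the log mask from step 8.a and one $clienthost
# line per address given on the command line.
write_mom_config() {
    out=$1
    shift
    {
        echo '$logevent 0x1ff'
        for ip in "$@"; do
            echo "\$clienthost $ip"
        done
    } > "$out"
}

# Demo: write to a scratch file and show the result.
scratch=$(mktemp)
write_mom_config "$scratch" 192.0.2.1 192.0.2.2
cat "$scratch"
rm -f "$scratch"
```

On a node the output file would be $PBS_HOME/mom_priv/config, written as root before pbs_mom is started.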
9) Tasks on the Master Node (Queue Server and Scheduler - root user):
9.a) Check $PBS_HOME/sched_priv/sched_config (scheduler configuration):
load_balancing: true ALL (optional)
9.b) Create $PBS_HOME/server_priv/nodes (queue server configuration), for example:
nodo1 np=1 dipmat
nodo4 np=1 dipmat
nodo2 np=1 dipmat
nodo3 np=1 dipmat
nodo5 np=1 dipmat
9.c) Create /etc/pbs.conf:
pbs_home=/usr/spool/PBS
pbs_exec=/usr/local
start_server=1
start_sched=1
start_mom=1
9.d) Create qmgr.conf:
# Create and define queue <qsar>.
create queue qsar
set queue qsar queue_type = Execution
set queue qsar enabled = True
set queue qsar started = True
# Set server attributes.
set server scheduling = True
set server acl_hosts = *.*
set server default_queue = qsar
set server log_events = 511
set server mail_from = <admin_user>
set server query_other_jobs = True
set server default_node = 1
9.e) qmgr < qmgr.conf (set up the server)
9.f) Set up and start /etc/init.d/pbs (available in the source distribution)
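The nodes file from step 9.b has one line per node in the form "name np=N property", so it can be produced from a list of host names. A minimal sketch: write_nodes_file is a name invented here, and it assumes every node shares the same np value and property.

```shell
# Print one nodes-file line per host: "<name> np=<np> <property>".
# Assumes all nodes share the same processor count and property.
write_nodes_file() {
    np=$1
    prop=$2
    shift 2
    for node in "$@"; do
        echo "$node np=$np $prop"
    done
}

# Reproduce the example from step 9.b (listed here in numeric order).
write_nodes_file 1 dipmat nodo1 nodo2 nodo3 nodo4 nodo5
```

Redirecting the output to $PBS_HOME/server_priv/nodes gives the file that the queue server reads.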
10) Some useful commands:
10.a) pbsnodes -a (node states)
10.b) qstat (queue state), for example:
qstat
qstat -q
qstat -Q
qstat -Q -f
10.c) qsub <program> (submit a job)
10.d) qdel <job_id> (delete a job)
10.e) qmgr -c "print server" > qmgr.conf (backup server configuration)
10.f) qmgr < qmgr.conf (restore server configuration)
10.g) qmgr -c "delete queue <queue_name>" (delete a queue)
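The output of pbsnodes -a can be filtered to show only the nodes that are not free. This is a sketch under an assumption about the layout: each node name on an unindented line, followed by indented "key = value" lines such as "state = free". The sample text is made up for the demo, and nodes_not_free is a name invented here.

```shell
# Read pbsnodes-style text on stdin and print "<node> <state>" for every
# node whose state is not "free". The input layout is an assumption.
nodes_not_free() {
    awk '
        /^[^ \t]/ { node = $1 }    # unindented, non-empty line: node name
        $1 == "state" && $2 == "=" && $3 != "free" { print node, $3 }
    '
}

# Demo on invented sample text; on the cluster use: pbsnodes -a | nodes_not_free
printf '%s\n' \
    'nodo1' \
    '     state = free' \
    '     np = 1' \
    '' \
    'nodo2' \
    '     state = down' \
    '     np = 1' | nodes_not_free
```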
11) A 'Hello World' example:
11.a) test.sh:
#!/bin/sh
## (Specifying the PBS job's name)
#PBS -N test
## (Specifying the standard error file)
#PBS -e test.err
## (Specifying the standard output file)
#PBS -o test.out
## (Specifying the number of nodes and their property)
#PBS -l nodes=5:dipmat
## Program (script/executable)
echo "This is a test."
echo -n "Today is:" `date`; echo
echo -n "I am in:" `hostname`; echo
echo -n "The current working directory is:" `pwd`; echo
11.b) qsub test.sh
<number>.<master_node_name> (job_id)
11.c) Reading test.out:
This is a test.
Today is: fri may 29 10:14:16 CEST 2008
I am in: nodo5
The current working directory is: /home/carlo
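The job_id printed by qsub has the form <number>.<master_node_name>, so a script can split it with plain shell parameter expansion. A minimal sketch: the helper names and the sample id 42.master.example.org are invented for the demo.

```shell
# Split a job id of the form <number>.<master_node_name>.
job_number() { echo "${1%%.*}"; }   # text before the first dot
job_server() { echo "${1#*.}"; }    # text after the first dot

# Demo with a made-up id; on the cluster: job_id=$(qsub test.sh)
job_id="42.master.example.org"
echo "job number: $(job_number "$job_id")"
echo "server:     $(job_server "$job_id")"
```

Keeping the full id in a variable also makes it easy to pass to qdel later.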
12) Log files are in $PBS_HOME/*_logs/
13) Happy submitting!
carlo@unipg.it