Carlo Manuali @ Cluster Studenti
Installing, Configuring and Using OpenPBS 2.3.16 on Linux Debian 4.0 (kernel 2.6.18-6-486) with the gcc 3.4 or 4.1 compiler (remarks from May-June 2008)

1) ./configure && make:

   1.a) Front-End:

        ./configure --disable-gui --enable-syslog --enable-clients \
            --set-default-server=<server_name> --with-scp

   1.b) Nodes:

        ./configure --disable-gui --enable-syslog --enable-clients \
            --enable-mom --disable-server --set-default-server=<server_name> --with-scp

2) Error:

        make[4]: *** No rule to make target `<built-in>', needed by `attr_atomic.o'

   2.a) Modify buildutils/makedepend-sh including 'grep -v ">$"':

        eval $CPP $arg_cc $d/$s $errout | \
        sed -n -e "s;^\# [0-9][0-9 ]*\"\(.*\)\";$f: \1;p" | \
        grep -v "$s\$" | \
        grep -v ">$" | \
        sed -e 's;\([^ :]*: [^ ]*\).*;\1;' \
        >> $TMP

   2.b) make clean -> 1)

3) Error:

        pbs_log.c:114: error: `_POSIX_PATH_MAX' undeclared here (not in a function)

   3.a) Add in /usr/local/include/limits.h (root user):

        #define _POSIX_PATH_MAX 255

   3.b) make (continue)

4) Error:

        math.h:27:3: error: #error "This Intel <math.h> is for use with only the Intel compilers!"

   4.a) mv /usr/local/include/math.h /usr/local/include/math.h.old (root user)

   4.b) make clean -> 1)

5) Errors:

        pbs_log.c:306: undefined reference to `errno'
        svr_connect.c:141: undefined reference to `errno' (Front-End installation only)

   OR

        /usr/bin/ld: errno: TLS definition in /lib/libc.so.6 section .tbss mismatches non-TLS reference in /lib/libc.so.6: could not read symbols: Bad value

   5.a) Add in src/lib/Liblog/pbs_log.c:

        #include <errno.h>

   5.b) Add in src/server/svr_connect.c (Front-End installation only):

        #include <errno.h>

   5.c) make (continue)

6) Error (gcc 4.1 compiler only):

        ../include/pbs_nodes.h:196: error: array type has incomplete element type

   6.a) Modify src/include/pbs_nodes.h:

        -extern struct attribute_def node_attr_def[]; /* node attributes defs */
        +extern struct attribute_def *node_attr_def[]; /* node attributes defs */

   6.b) make (continue)

7) make install (root user)

Installation Complete!
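
Steps 5.a and 5.b can also be applied from a script, which helps when the same source tree is rebuilt on several nodes. The following is only a minimal sketch: the function name add_errno_include is hypothetical (it is not part of OpenPBS), and it puts the include on the first line of the file, which is an assumption because the notes above do not say where the line should go.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenPBS): add '#include <errno.h>' to a
# C source file unless the file already contains that line.
add_errno_include() {
    file=$1
    # Skip files that already include errno.h, so re-running is harmless.
    if grep -q '^#include <errno\.h>' "$file"; then
        return 0
    fi
    tmp="$file.tmp.$$"
    # Assumption: placing the include on the first line is acceptable.
    { printf '#include <errno.h>\n'; cat "$file"; } > "$tmp" && mv "$tmp" "$file"
}

# Usage, from the top of the source tree (paths from steps 5.a and 5.b):
#   add_errno_include src/lib/Liblog/pbs_log.c
#   add_errno_include src/server/svr_connect.c
```

Running the helper twice on the same file leaves a single include line, so it is safe to call before every rebuild.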

8) Things to do on the Execution Nodes (root user):

   8.a) Check $PBS_HOME/mom_priv/config:

        $logevent 0x1ff
        $clienthost <master_node_ip..1>
        $clienthost <master_node_ip..N>

   8.b) Start /usr/local/sbin/pbs_mom and add it to rc.local

9) Things to do on the Master Node (Queue Server and Scheduler - root user):

   9.a) Check $PBS_HOME/sched_priv/sched_config (scheduler configuration):

        load_balancing: true ALL        (optional)

   9.b) Create $PBS_HOME/server_priv/nodes (queue server configuration), for example:

        nodo1 np=1 dipmat
        nodo4 np=1 dipmat
        nodo2 np=1 dipmat
        nodo3 np=1 dipmat
        nodo5 np=1 dipmat

   9.c) Create /etc/pbs.conf:

        pbs_home=/usr/spool/PBS
        pbs_exec=/usr/local
        start_server=1
        start_sched=1
        start_mom=1

   9.d) Create qmgr.conf:

        # Create and define queue <qsar>.
        create queue qsar
        set queue qsar queue_type = Execution
        set queue qsar enabled = True
        set queue qsar started = True
        # Set server attributes.
        set server scheduling = True
        set server acl_hosts = *.*
        set server default_queue = qsar
        set server log_events = 511
        set server mail_from = <admin_user>
        set server query_other_jobs = True
        set server default_node = 1

   9.e) qmgr < qmgr.conf (set up the server)

   9.f) Set up and start /etc/init.d/pbs (available from the source distribution)

10) Some useful commands:

   10.a) pbsnodes -a (node state)

   10.b) qstat (queue state), for example:

        qstat
        qstat -q
        qstat -Q
        qstat -Q -f

   10.c) qsub <program> (job submission)

   10.d) qdel <job_id> (job deletion)

   10.e) qmgr -c "print server" > qmgr.conf (back up the server configuration)

   10.f) qmgr < qmgr.conf (restore the server configuration)

   10.g) qmgr -c "delete queue <queue_name>" (queue deletion)

11) A 'Hello World' example:

   11.a) test.sh:

        ## (Specifying the PBS job's name)
        #PBS -N test
        ## (Specifying the standard error file)
        #PBS -e test.err
        ## (Specifying the standard output file)
        #PBS -o test.out
        ## (Specifying the number of nodes and their properties)
        #PBS -l nodes=5:dipmat
        ## Program (script/executable)
        #!/bin/sh
        echo "This is a test."
        echo -n "Today is:" `date`; echo
        echo -n "I am in:" `hostname`; echo
        echo -n "The current working directory is:" `pwd`; echo

   11.b) qsub test.sh

        <number>.<master_node_name> (job_id)

   11.c) Reading test.out:

        This is a test.
        Today is: fri may 29 10:14:16 CEST 2008
        I am in: nodo5
        The current working directory is: /home/carlo

12) Log files are in $PBS_HOME/*_logs/*

13) Happy submitting!

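
On clusters with more than a handful of nodes, the nodes file from step 9.b can be generated instead of typed by hand. The following is a minimal sketch under stated assumptions: the function name print_nodes is hypothetical, and it assumes every node has the same np value and a single shared property, as in the example of step 9.b.

```shell
#!/bin/sh
# Hypothetical helper: print one "<name> np=<n> <property>" line per node,
# in the format of the server_priv/nodes example from step 9.b.
# Arguments: <np> <property> <node_name>...
print_nodes() {
    np=$1
    prop=$2
    shift 2
    for node in "$@"; do
        printf '%s np=%s %s\n' "$node" "$np" "$prop"
    done
}

# Usage (root user, with PBS_HOME set as in the steps above):
#   print_nodes 1 dipmat nodo1 nodo2 nodo3 nodo4 nodo5 > "$PBS_HOME/server_priv/nodes"
```

The function only writes to standard output, so the result can be inspected before it is redirected into the real nodes file.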
carlo@unipg.it