GridEngine
On FarmShare we're using the Debian packages of "Sun Grid Engine". It isn't quite "Sun" anymore since Oracle bought Sun, and the Debian packages are a bit behind the current forks (Open Grid Engine, Son of Grid Engine and Univa Grid Engine).
documentation
Start with 'man sge_intro'. Move on to 'man qsub'. Try submitting a simple job with 'echo "sleep 3600" | qsub', then run 'qstat' to see it and 'qdel' to remove it.
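The submit / look / delete cycle above can be sketched as a pair of shell functions. This is only a sketch: the function names are made up for this page, and the parsing assumes qsub prints its usual confirmation line of the form 'Your job 12345 ("STDIN") has been submitted'.

```shell
# Extract the numeric job id from qsub's confirmation line on stdin.
# Assumes the wording: Your job 12345 ("STDIN") has been submitted
job_id_from_qsub() {
    awk '$1 == "Your" && $2 == "job" { print $3; exit }'
}

# Submit a one-hour sleep, show the queue, then delete the job again.
# Nothing calls this automatically; run it by hand on a submit host.
try_sleep_job() {
    id=$(echo "sleep 3600" | qsub | job_id_from_qsub)
    [ -n "$id" ] || return 1   # give up if no job id could be parsed
    qstat
    qdel "$id"
}
```

After sourcing the snippet, running 'try_sleep_job' on a submit host should show the job in the qstat listing before it is deleted.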
shell mode
GE can start jobs in "POSIX mode" (posix_compliant) or "Unix mode" (unix_behavior). See the "shell_start_mode" section of 'man sge_conf'. Our setting:
# qconf -sconf|grep shell
shell_start_mode unix_behavior
login_shells bash,sh,ksh,csh,tcsh
So you'll want to explicitly specify your shell with the -S flag to qsub. E.g.
# get rid of spurious messages about tty/terminal types
#$ -S /bin/sh
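For context, here is one way a complete job script carrying that directive might look. It is a sketch only: the function name, the file name hello.sh and the script body are invented for illustration. Lines starting with "#$" are read by qsub as embedded options and are plain comments to the shell.

```shell
# Write a small job script to the path given as $1.
# The "#$ -S /bin/sh" line asks Grid Engine to run the job under /bin/sh.
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/sh
# get rid of spurious messages about tty/terminal types
#$ -S /bin/sh
echo "job running on $(uname -n)"
EOF
}

# Usage on a submit host (hello.sh is a hypothetical name):
#   write_job_script hello.sh && qsub hello.sh
```

Passing '-S /bin/sh' on the qsub command line has the same effect as the embedded line.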
useful commands
- see all hosts: qhost
- see which jobs are on which hosts: qhost -j
- see all jobs: qstat -f -u "*"
- see all host attributes: qstat -f -F -u "*"
- explain state 'a': qstat -explain a
- summary of slots: qstat -g c
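These commands combine well with ordinary text tools. As a rough sketch, the helper below counts jobs per state from plain 'qstat -u "*"' output. It assumes the default listing layout (two header lines, then one row per job with the state in the fifth column) and will not work on the '-f' format; the function name is made up.

```shell
# Count jobs per state (r, qw, Eqw, ...) from a qstat listing on stdin.
# Skips the two header lines and tallies column 5 of every job row.
count_job_states() {
    awk 'NR > 2 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Usage:
#   qstat -u "*" | count_job_states
```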
queues
Under SGE, a 'queue' is a set of settings that get applied to jobs that are assigned to that queue.
We are using pretty much the default settings. This page should contain a description of all of the settings we've changed away from the defaults along with a reason for doing so.
We created four queues: bigmem.q, long.q, main.q and test.q. Use 'qconf -sql' to list the queues and 'qconf -sq queue_name' to see the settings of one queue.
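To compare the time limits of all queues at once, the two qconf calls can be combined in a loop. This is a sketch that assumes the limit is configured as the hard run-time limit h_rt; this page does not say which attribute is actually used, so check the full 'qconf -sq' output if the result is "unknown" or INFINITY. The function name is made up.

```shell
# Print "<queue> <h_rt>" for every queue known to the cluster.
queue_time_limits() {
    for q in $(qconf -sql); do
        # Pull the value of the h_rt line out of the queue configuration.
        limit=$(qconf -sq "$q" | awk '$1 == "h_rt" { print $2 }')
        printf '%s %s\n' "$q" "${limit:-unknown}"
    done
}
```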
precise.q
Most jobs end up here. Time limit is 48hrs.
precise-long.q
10 day time limit instead of 48hrs
test.q
15min time limit instead of 48hrs
If you want to submit a job to this test queue, you must pass the '-l testq=1' flag to qsub.
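If you use the test queue a lot, a tiny wrapper saves typing. A minimal sketch, with a made-up function name:

```shell
# Submit to test.q by requesting the testq resource.
# "$@" passes any further qsub options and the script path through unchanged.
submit_to_testq() {
    qsub -l testq=1 "$@"
}

# Usage:
#   submit_to_testq job.script
```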
making the test.q
- add testq to the global complex attributes list (qconf -mc)
senpai1:/root# qconf -sc |grep testq
testq testq BOOL == FORCED NO 0 0
"requestable" is set to FORCED to make that attribute required (per 'man 5 complex')
- add testq to the complex attributes of the queue (qconf -mq test.q), then check the result with qconf -sq test.q
senpai1:/root# qconf -sq test.q|grep testq
hostlist barley-testq.stanford.edu
complex_values testq=1
- users now need to pass "-l testq=1" to qsub for a job to go to that queue instance. Jobs submitted without testq=1 will never run in that queue.
relevant thread: http://gridengine.org/pipermail/users/2012-March/002972.html
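To double-check the setup, the "requestable" column can be read straight out of the 'qconf -sc' listing. A small sketch, assuming the column order shown in the output above (name, shortcut, type, relop, requestable, ...); the function name is made up.

```shell
# Print the "requestable" setting (YES, NO or FORCED) of one complex.
# Reads 'qconf -sc' output on stdin; $1 is the complex name.
complex_requestable() {
    awk -v name="$1" '$1 == name { print $5 }'
}

# Usage:
#   qconf -sc | complex_requestable testq
```

For testq this should print FORCED.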
mem_free
Each node has a "load value" named "mem_free" that tracks actual free memory available.
Each node has a requestable and consumable "complex value" named "mem_free" that is set to 95G (190G for barley05).
Each job requests 4G of mem_free by default, unless the user specifies a different value. The default is visible in 'qconf -sc'.
You can see current values for the execution hosts with qstat -F, e.g. 'qstat -F -f -u "*"' and then search for mem_free.
#request 37GB RAM for this one-slot job
qsub -l mem_free=37G job.script
When scheduling the job, Grid Engine compares the request against the lower of the two mem_free values on the host (the load value and the consumable complex value).
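As a worked example of that rule, here is a toy model of the check. It is a simplification, not the scheduler's actual code: all values are whole gigabytes, and the function name is made up.

```shell
# Succeed if a request fits on a host.
# $1 = requested mem_free, $2 = mem_free load value (actual free memory),
# $3 = remaining mem_free consumable on the host; all in whole G.
fits_mem_free() {
    request=$1 load=$2 complex=$3
    # The effective limit is the lower of the two host values.
    if [ "$load" -lt "$complex" ]; then limit=$load; else limit=$complex; fi
    [ "$request" -le "$limit" ]
}

# Usage:
#   fits_mem_free 37 40 95 && echo "fits"
```

So a 37G request fits on a host with 40G actually free and 95G of consumable left, but not on a host where only 20G is actually free, even though the consumable would allow it.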
precise upg
We use the precise package gridengine-client v6.2u5-4
submission host
- qsub is actually a link to a gridengine wrapper script. That is normal: the wrapper is provided by the distro package and sets SGE_CELL and such.
- sge_request in senpai1:/var/lib/gridengine/default/common/sge_request has the default settings applied to all jobs, it reads:
-jsv /usr/sbin/jsv.tcl
This runs that tcl script as a "job submission verification" (JSV) script. /usr/sbin/jsv.tcl includes /usr/sbin/jsv_include.tcl; the latter is the standard GE-provided script.
jsv.tcl checks if you have a kerberos ticket. If you do, it adds the KRB5CCNAME env var to the job environment and runs "auks -a" to add this krb cred to AUKS. If there is no krb ticket, it does nothing.
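The decision jsv.tcl makes can be paraphrased in shell as follows. This is not the actual script (which is Tcl and uses the JSV interface to change the job); it is a sketch of the logic only. It assumes 'klist -s' is available and exits 0 only when a valid ticket exists, and the function name is made up.

```shell
# Print the KRB5CCNAME setting that would be added to the job environment,
# or print nothing when the submitting user has no valid kerberos ticket.
krb_env_for_job() {
    # klist -s is silent; its exit status says whether a valid ticket exists.
    if klist -s 2>/dev/null; then
        # Hand the credential to AUKS so execution hosts can fetch it later.
        auks -a >/dev/null 2>&1 || return 1
        printf 'KRB5CCNAME=%s\n' "$KRB5CCNAME"
    fi
}
```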
execution host
- prolog: /usr/local/libexec/gridengine/prolog.sh with parameter 1200
- epilog: /usr/local/libexec/gridengine/epilog.sh
- shepherd_cmd: /usr/local/libexec/gridengine/shepherd.sh
The job gets started under an sge_shepherd, so the shepherd_cmd runs first. shepherd.sh sets the krb5 config _if_ KRB5CCNAME was set in the job environment, and then runs shepherd_pag.sh. If KRB5CCNAME was not set, it runs the regular sge_shepherd.
shepherd_pag.sh gets the krb cred from auks (auks -g -u $uid) using the barley-sgeadmin principal, runs 'aklog' to get the AFS tokens, and then runs the regular sge_shepherd.
The prolog runs before the job starts but after the shepherd runs, and it runs renew_cred.sh _if_ there's a kerberos principal available.
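The branch taken by shepherd.sh can be sketched like this. It only names which program would run next and does not start anything; the real script is not reproduced on this page, so the function name and the way KRB5CCNAME is passed in are assumptions.

```shell
# Decide which shepherd the wrapper hands off to.
# $1 is the job's KRB5CCNAME value (empty when the job has none).
choose_shepherd() {
    if [ -n "$1" ]; then
        # Kerberos job: shepherd_pag.sh fetches the cred from auks,
        # runs aklog, and only then starts the regular sge_shepherd.
        echo shepherd_pag.sh
    else
        # No kerberos cred was forwarded: plain shepherd.
        echo sge_shepherd
    fi
}
```

The overall order on the execution host is then: shepherd_cmd, (shepherd_pag.sh for kerberos jobs), sge_shepherd, prolog, job, epilog.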