GridEngine

From FarmShare


Latest revision as of 12:21, 3 February 2015

We're using the Debian packages of "Sun Grid Engine" which isn't quite "Sun" anymore since Oracle bought Sun, and the Debian packages are a bit behind the current forks of Open Grid Engine or Son of Grid Engine or Univa Grid Engine.

useful commands

  • see all hosts: qhost
  • see all jobs, split out by host: qhost -j
  • see all jobs, split out by queue: qstat -f -u "*"
  • see all jobs and memory usage per job: /farmshare/user_data/chekh/qmem/qmem -u
  • see all host attributes: qstat -f -F -u "*"
  • explain state 'a': qstat -explain a
  • summary of slots: qstat -g c
  • see your job: qstat -f -j JOBID
  • delete your job: qdel JOBID

documentation

Start with 'man sge_intro'. Move on to 'man qsub'. Try submitting a simple job with 'echo "sleep 3600" | qsub', then run 'qstat' and 'qdel'.

shell mode

GE can run in "POSIX mode" or "Unix mode". See the "shell_start_mode" section of 'man sge_conf'.

 # qconf -sconf|grep shell
 shell_start_mode             unix_behavior
 login_shells                 bash,sh,ksh,csh,tcsh

So you'll want to explicitly specify your shell with the -S flag to qsub. E.g.

 # get rid of spurious messages about tty/terminal types
 #$ -S /bin/sh
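
For instance, a complete minimal submission script using that directive might look like this (the file name, job name, and extra directives here are hypothetical examples, not farmshare requirements):

```shell
#!/bin/sh
# hello.sh -- minimal SGE job script (hypothetical example)
#$ -S /bin/sh        # run the job under /bin/sh (see shell_start_mode above)
#$ -N hello          # job name shown by qstat
#$ -cwd              # run in the directory you submitted from
hostname
```

Submit it with 'qsub hello.sh'; by default the job's stdout lands in a file named after the job (hello.o&lt;jobid&gt;) in the submission directory.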


queues

Under SGE, a 'queue' is a set of settings that get applied to jobs that are assigned to that queue.


Grid Engine settings on farmshare

We are using pretty much the default settings. This page should contain a description of all of the settings we've changed away from the defaults along with a reason for doing so.

We created three queues: raring.q, raring-long.q and test.q. Use 'qconf -sql' to list queues, 'qconf -sq queue_name' to see the queue settings.

The name raring comes from the Ubuntu release name for 13.04:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 13.04
Release:	13.04
Codename:	raring

raring.q

This queue is for Ubuntu 13.04 (raring) jobs. Most jobs end up here. The time limit is 48 hours; if your job is still running at 48 hours, it will be killed.

raring-long.q

Similar to raring.q but with a 30-day time limit instead of 48 hours. See FarmShare_tutorial#submit_a_large_or_long_job for examples.

test.q

15-minute time limit instead of 48 hours. This queue is intended for debugging your submission scripts. It will almost never have queued jobs, so you can run right away.

If you want to submit a job to this test queue, you must use the '-l testq=1' flag to qsub.

For example:

$ echo "hostname" | qsub -l testq=1


making the test.q

  • add testq to the global complex attributes list (qconf -mc)
 senpai1:/root# qconf -sc |grep testq
 testq               testq      BOOL        ==    FORCED         NO         0        0

"requestable" is set to FORCED to make that attribute required (per 'man 5 complex').

  • add testq to the complex attributes of the queue (qconf -sq test.q)
 senpai1:/root# qconf -sq test.q|grep testq
 hostlist              barley-testq.stanford.edu
 complex_values        testq=1
  • users now need to use the qsub parameter "-l testq=1" to have the job go to that queue instance; jobs without testq=1 will not be scheduled into that queue.

relevant thread: http://gridengine.org/pipermail/users/2012-March/002972.html


mem_free

Each node has a "load value" named "mem_free" that tracks actual free memory available.

Each node has a requestable and consumable "complex value" named "mem_free" that is set to 95G (190G for barley05).

Each job requests 4G of mem_free by default per slot (unless user specifies a different value). (qconf -sc)

You can see current values for the execution hosts with qstat -F, e.g. 'qstat -F -f -u "*"' and then search for mem_free.

 #request 37GB RAM for this one-slot job
 qsub -l mem_free=37G job.script

Grid Engine will compare to the lower of the two mem_free values on the host when scheduling the job.
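
One caveat worth illustrating: because the default (and any -l mem_free value you pass) is counted per slot, a parallel job multiplies the request. A hedged example, assuming a parallel environment named 'smp' exists (check the actual PE names with 'qconf -spl'):

```shell
# 4 slots, 8G per slot: Grid Engine deducts 4 x 8G = 32G from the host's
# consumable mem_free when scheduling this job
# (the PE name 'smp' is an assumption; job.script is a placeholder)
qsub -pe smp 4 -l mem_free=8G job.script
```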

precise upg

We use the precise package gridengine-client v6.2u5-4.

submission host

  • qsub is actually a link to a gridengine wrapper script; this is normal, as the wrapper is provided by the distro package and sets SGE_CELL and such
  • sge_request in senpai1:/var/lib/gridengine/default/common/sge_request has the default settings applied to all jobs; it reads:

 -jsv /usr/sbin/jsv.tcl

This runs that tcl script as a "job submission verification" script. /usr/sbin/jsv.tcl includes /usr/sbin/jsv_include.tcl; the latter is a standard GE-provided script.

jsv.tcl checks whether you have a Kerberos ticket; if so, it adds the KRB5CCNAME env var to the job environment and runs "auks -a" to add the credential to AUKS. If there is no Kerberos ticket, it does nothing.
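
The ticket check can be sketched in plain shell (a hypothetical equivalent of the tcl logic, not the actual jsv.tcl):

```shell
# Hypothetical shell equivalent of the jsv.tcl credential check.
register_krb_cred() {
    # 'klist -s' exits 0 only if a valid Kerberos ticket cache exists
    if klist -s 2>/dev/null; then
        # the real jsv.tcl also pushes KRB5CCNAME into the job environment here
        auks -a 2>/dev/null   # hand the credential to AUKS for the exec host
    fi
    return 0                  # no ticket: do nothing, submission proceeds
}
register_krb_cred
```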

execution host

  • prolog: /usr/local/libexec/gridengine/prolog.sh with parameter 1200
  • epilog: /usr/local/libexec/gridengine/epilog.sh
  • shepherd_cmd: /usr/local/libexec/gridengine/shepherd.sh

The job gets started under an sge_shepherd, so the shepherd_cmd runs first. shepherd.sh sets the krb5 config _if_ KRB5CCNAME was set in the job environment, and then runs shepherd_pag.sh. If KRB5CCNAME was not set, it runs the regular sge_shepherd.

shepherd_pag.sh actually gets the krb cred from auks (auks -g -u $uid) using the barley-sgeadmin principal, runs 'aklog' to get the AFS tokens, and then runs the regular sge_shepherd.

The prolog runs before the job starts but after the shepherd runs, and it runs renew_cred.sh _if_ there's a kerberos principal available.
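
The dispatch in shepherd.sh can be sketched as follows (a hypothetical reconstruction based on the description above; the stock sge_shepherd path is an assumption):

```shell
# Hypothetical sketch of shepherd.sh's dispatch, per the description above.
pick_shepherd() {
    if [ -n "$KRB5CCNAME" ]; then
        # credential present: the PAG wrapper fetches it from AUKS and
        # runs aklog for AFS tokens before starting the real shepherd
        echo /usr/local/libexec/gridengine/shepherd_pag.sh
    else
        # no credential: run the stock shepherd directly (path assumed)
        echo /usr/lib/gridengine/sge_shepherd
    fi
}
# shepherd.sh would then do: exec "$(pick_shepherd)" "$@"
```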
