Barley info

From FarmShare

Revision as of 10:49, 29 October 2013

Follow the FarmShare tutorial or the User Guide

Current barley policies

  • 480 max jobs per user (look for max_u_jobs in output of 'qconf -sconf')
  • 3000 max jobs in the system (look for max_jobs in output of 'qconf -sconf')
  • 48hr max runtime for any job in regular queue (look for h_rt in output of 'qconf -sq raring.q')
  • 30 days max runtime for the long queue (look for h_rt in output of 'qconf -sq raring-long.q')
  • 15min max runtime in test.q
  • 4GB default mem_free request per slot ('qconf -sc |grep mem_free')
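The limits above can be verified from any barley login session using the qconf commands named in each bullet; a quick sketch:

```shell
# Per-user and system-wide job limits (max_u_jobs, max_jobs)
qconf -sconf | grep -E 'max_u_jobs|max_jobs'

# Runtime limits (h_rt) for the regular and long queues
qconf -sq raring.q | grep h_rt
qconf -sq raring-long.q | grep h_rt

# Default mem_free request per slot
qconf -sc | grep mem_free
```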

Technical details

  • 19 new machines, AMD Magny Cours 24 cores each, 96GB RAM
  • 1 new machine, AMD Magny Cours 24 cores, 192GB RAM
  • ~450GB local scratch on each
  • ~100TB in /farmshare/user_data shared across all barley and corn systems (introduced summer 2013)
  • Open Grid Scheduler 2011.11p1
  • 10GbE interconnect (Juniper QFX3500 switch)

How to use the barley machines

To start using these new machines, check out the man pages for 'sge_intro' and for the 'qhost', 'qstat', 'qsub', and 'qdel' commands.
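A quick sketch of the basic command cycle with those four tools (the script name and job ID below are placeholders; see the man pages for the full option lists):

```shell
qhost                     # list execution hosts and their current load
qsub -cwd my_job.script   # submit a batch job from the current directory
qstat -u "$USER"          # show your own pending and running jobs
qdel 12345                # cancel a job by its job ID (12345 is a placeholder)
```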

Initial issues:

  • You are limited in space to your AFS homedir ($HOME) and local scratch disk on each node ($TMPDIR)
  • The execution hosts don't accept interactive jobs, only batch jobs for now.
  • You'll want to make sure you have your Kerberos TGT and your AFS token.
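One way to check for a valid Kerberos TGT and AFS token before submitting jobs — klist, tokens, kinit, and aklog are the standard Kerberos/OpenAFS tools, though exact output varies by site:

```shell
klist    # show current Kerberos tickets; a valid krbtgt entry is your TGT
tokens   # show current AFS tokens and their expiry times
# If either has expired, renew the TGT and then the AFS token:
kinit && aklog
```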

If you want to use the newer bigger storage:

  1. log into any FarmShare machine: ssh sunetid@corn.stanford.edu
  2. cd to /farmshare/user_data/<your username> (or wait 5 minutes if it doesn't exist yet)
  3. write a job script: "$EDITOR test_job.script"
    1. see 'man qsub' for more info
    2. use env var $TMPDIR for local scratch
    3. use /farmshare/user_data/<your username> for shared data directory
  4. submit the job for processing: "qsub -cwd test_job.script"
  5. monitor the jobs with "qstat -f -j JOBID"
    1. see 'man qstat' for more info
  6. check the output files that you specified in your job script (the input and output files must be in /farmshare/user_data/)
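Putting steps 3 through 6 together, a minimal sketch of a job script — the input and output file names and the sort command are hypothetical placeholders, and the '#$' lines are qsub options embedded in the script (see 'man qsub'):

```shell
#!/bin/bash
#$ -cwd               # run from the directory where qsub was invoked
#$ -l h_rt=1:00:00    # request a 1-hour runtime limit

DATA=/farmshare/user_data/$USER   # shared data directory (step 3.3)

# Stage input onto fast local scratch (step 3.2)
cp "$DATA/input.txt" "$TMPDIR/"

# Do the actual work on local scratch (placeholder command)
sort "$TMPDIR/input.txt" > "$TMPDIR/output.txt"

# Copy results back to the shared directory so they survive the job
cp "$TMPDIR/output.txt" "$DATA/"
```

Submit it with "qsub -cwd test_job.script" and monitor it with "qstat -f -j JOBID" as described above.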

If you have any questions, please email 'farmshare-discuss@lists.stanford.edu'. Some good introductory usage examples are here: http://gridscheduler.sourceforge.net/howto/basic_usage.html
