FAQ
From FarmShare
Revision as of 13:20, 14 September 2017
Policy
No. FarmShare is not approved for use with high-risk data, including protected health information and personally identifiable information. Do not use FarmShare resources to store or process protected information.
Shell and Environment
How do I change my shell?
bash is the default shell for most users, and should be the default shell for all new accounts. Older accounts may use tcsh by default instead. If you would like to change your shell for any reason, you can send e-mail to srcc-support@stanford.edu. The bash, zsh, fish, mksh, and tcsh shells are installed, but not all are equally well-supported.
FarmShare uses Stanford's central account infrastructure, so changing your shell on FarmShare will affect all other systems that use this infrastructure (for example, myth.stanford.edu). Please acknowledge your understanding of this by including something like the following in the text of your request:
Please change my default shell to $SHELL. I understand that this is a global change and will affect not only FarmShare systems, but all other systems at Stanford that use the University's central account infrastructure.
Why does my shell exit when running the module command?
An early version of the default tcsh shell configuration set an option, printexitvalue, that conflicted with the Lmod configuration. This issue has been fixed for new users, but existing users may have configurations that still set printexitvalue in ~/.tcshrc.set. You can edit this file to remove the statement, make the statement a comment (by prepending #), or copy over a corrected version of the default file from /etc/skel.
cp /etc/skel/.tcshrc.set ~
To make the change take effect, either log out and back in again or run unset printexitvalue once.
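If you prefer to comment out the option rather than replace the file, the edit can be scripted. The following is a minimal sketch, assuming the option appears on its own line as set printexitvalue. The function name comment_out_printexitvalue is hypothetical, and the code is written for sh or bash rather than tcsh.

```shell
# Comment out any active "set printexitvalue" line in the given file.
# Assumption: the option is set on a line of its own, e.g. "set printexitvalue".
# A copy of the original file is kept next to it with a .bak suffix.
comment_out_printexitvalue() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Prefix matching lines with "# "; lines that are already comments do not match.
    sed 's/^\([[:space:]]*set[[:space:]][[:space:]]*printexitvalue\)/# \1/' \
        "$file.bak" > "$file"
}
```

From a bash session, comment_out_printexitvalue ~/.tcshrc.set would apply the change and leave the original in ~/.tcshrc.set.bak.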
Storage
Where are my files?
FarmShare no longer uses AFS for users' home directories. AFS is still accessible on rice systems, and you can access your AFS home directory using the convenience link, ~/afs-home.
Why can't I access files in ~/afs-home or /afs?
AFS access requires valid Kerberos credentials and an AFS token. You can use the klist and tokens commands to view your existing credentials, if any; if you're having trouble accessing files in AFS, try re-authenticating.
kinit && aklog
See Advanced Connection Options for a suggested SSH configuration that can help reduce the occurrence of token issues at login.
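The re-authentication step can be wrapped so that you are prompted for a password only when no valid ticket is cached. This is a minimal sketch, assuming MIT Kerberos, where klist -s prints nothing and reports ticket validity through its exit status. The function name afs_reauth is hypothetical.

```shell
# Refresh AFS access, asking for a password only when it is needed.
# Assumption: MIT Kerberos, where "klist -s" exits 0 if a valid ticket is cached.
afs_reauth() {
    if klist -s; then
        # A valid ticket exists; only the AFS token needs to be renewed.
        aklog
    else
        # No valid ticket; obtain one (prompts for a password), then a token.
        kinit && aklog
    fi
}
```

After defining the function in a bash session, run afs_reauth and then tokens to confirm that a token was issued.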
Are my data backed up?
We take regular snapshots of data in your home directory (/home/$USER) and may be able to recover lost or damaged files in some cases. Data in your AFS home directory (~/afs-home) are backed up every night, and backups are kept for 30 days. The most recent backup is mounted at ~/afs-home/.backup; if you need to recover data from an older backup, you should submit a HelpSU request. Data stored on the scratch volume (/farmshare/user_data/$USER) are not backed up and may be purged without warning.
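Recovering a file from the most recent AFS backup is an ordinary copy out of ~/afs-home/.backup. The following is a minimal sketch, assuming the backup mirrors the layout of your AFS home directory. The function name restore_from_backup, the AFS_HOME override, and the .restored suffix are hypothetical choices made for illustration.

```shell
# Copy one file out of the most recent AFS backup without overwriting
# the current version. The argument is a path relative to the AFS home.
# AFS_HOME is a hypothetical override; it defaults to ~/afs-home.
restore_from_backup() {
    rel=$1
    base=${AFS_HOME:-$HOME/afs-home}
    src=$base/.backup/$rel
    if [ ! -e "$src" ]; then
        echo "not found in backup: $rel" >&2
        return 1
    fi
    # Recreate the parent directory in case it was deleted too.
    mkdir -p "$(dirname "$base/$rel")"
    # Keep timestamps and permissions; write next to the original path.
    cp -p "$src" "$base/$rel.restored"
}
```

For example, restore_from_backup project/notes.txt would create ~/afs-home/project/notes.txt.restored, which you can compare with the current file before replacing it.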
Slurm
Why can't I submit jobs to the gpu partition?
You must explicitly request GPU resources using the --gres option when you submit a job to the gpu partition.
sbatch --partition=gpu --gres=gpu:1
See the man page for sbatch for more information.
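The same options can also be set inside the batch script with #SBATCH directives. The sketch below writes a hypothetical script named gpu-job.sh; the file name and the echo commands are placeholders. It assumes that Slurm exports CUDA_VISIBLE_DEVICES for allocated GPUs, which depends on how the site is configured.

```shell
# Write a small batch script that requests one GPU in the gpu partition.
# The file name gpu-job.sh and the script body are placeholders.
cat > gpu-job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

# Report where the job ran and which GPUs it was given.
# Assumption: Slurm sets CUDA_VISIBLE_DEVICES for the allocated devices.
echo "Running on $(uname -n)"
echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
EOF
```

Submit the script with sbatch gpu-job.sh. Options given on the sbatch command line take precedence over the directives in the file.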
No Launcher Bar Farmvnc
After connecting to a Farmvnc session, press Ctrl+Alt+T to open a terminal window. Inside the terminal, run:
dconf reset -f /org/compiz/ && setsid unity
error: failed receiving gdi request response for mid=1 (got syncron message receive timeout error)
This is a cryptic error that means the master system is overloaded and did not respond before the timeout period. Try again in a couple of minutes.
error: can't open output file "/afs/ir/users/c/h/chekh/YYY.oXXXX"
Check that you have valid Kerberos credentials and AFS tokens, as described on the AFS page.
Where is GView version 5?
GaussView 5 (GV5) is available via the gaussian module.
SAS error message
When I attempt to run the command "sas", I receive the following error message:
ERROR: User does not have appropriate authorization level for library SASUSER.
NOTE: Unable to initialize the options subsystem.
ERROR: (SASXKINI): PHASE 3 KERNEL INITIALIZATION FAILED.
ERROR: Unable to initialize the SAS kernel.
Try re-authenticating (kinit ; aklog) and then running module load sas. Also check that you are not over quota, using either the 'fs quota' command or the '/usr/bin/check-stanford-afs-quota' command.
Received disconnect from <IP address>: 2: Too many authentication failures for...
That error message comes from OpenSSH and means that it is not letting you log in because you do not have the right credentials. Check that your Kerberos tickets are what you expect and that you are providing the correct password.
Decrypt integrity check failed
k5start: error getting credentials: Decrypt integrity check failed
This means that you mistyped your Kerberos password when kinit (or another program) prompted you for it.
How to submit a binary for execution
Use the '-b y' option to 'qsub'; read the qsub man page for more information. However, you should probably write a small wrapper script instead.
Why won't my job run?
If your job stays in state 'qw' longer than you expect, check its full output with qstat -f -j JOBID to see the scheduling reason. The output explains in detail why the job can't run in each queue instance. For cluster-wide information, check qstat -g c, qstat -f -u "*", and qhost -j, or e-mail us for further explanation.
Why does pressing 'd' cause my windows to disappear?
The GNOME keybinding for d may be broken when using VNC for remote display. You can edit the relevant keyboard shortcut ("Hide All Normal Windows and Set Focus to the Desktop") using the GNOME Control Center (gnome-control-center). Alternatively, you can edit ~/.gconf/apps/metacity/global_keybindings/%gconf.xml manually. For example:
<?xml version="1.0"?>
<gconf>
  <entry name="show_desktop" mtime="0123456789" type="string">
    <stringvalue>&lt;Control&gt;&lt;Alt&gt;d</stringvalue>
  </entry>
</gconf>
We added this setting system-wide to cardinal and corn on 2012-04-18.
I get a CPLEX license error, what should I do?
You may see something like:
corn04:~> cplex
Failed to initialize CPLEX environment.
CPLEX Error 32201: ILM Error 8: CPLEX: access key has expired.
Exiting
Check that you're using the latest version of CPLEX:
[chekh@corn05.stanford.edu] ~ [0] $ module load cplex
[chekh@corn05.stanford.edu] ~ [0] $ which cplex
/farmshare/software/non-free/CPLEX_Studio124/cplex/bin/x86-64_sles10_4.1/cplex
My job errored out, what should I do?
First, check what the error message is, with something like:
qstat -f -j $JOBID | grep err
If you think that the error is not with your job script or the job parameters, try to just resubmit.
If your job is now in state E or in state Eqw, probably the best thing to do is something like:
qresub $JOBID # this will resubmit the job with a new job id
qdel $JOBID # delete the old errored one
If the job failed because you have an error in your script, it'll error out again. Sometimes we have intermittent filesystem problems, so the job will run fine when you resubmit.
You can also log in to the machine 'senpai1' and run 'qacct -j $JOBID' there. It shows the accounting information written at the end of the job, which may include a different error message.
tset: standard error: Invalid argument
In your job script, you probably don't explicitly specify which shell to use. Your default shell is probably csh, so your csh startup scripts are loaded, and something in them generates this error because the job does not run in an interactive session. The solution is either to specify a shell on the first line of your job script in the usual Unix way, e.g.
#!/bin/bash
or else use the -S flag to qsub, e.g.
# get rid of spurious messages about tty/terminal types
#$ -S /bin/sh
tty errors
You may see things like
tset: standard error: Invalid argument
Undefined tty
stdin: is not a tty
or
Warning: no access to tty (Bad file descriptor). Thus no job control in this shell.
See the question above, and specify a shell. See the 'shell_start_mode' section of 'man sge_conf' for more info.
"failed searching requested shell because:" or other execvp errors
Often these come up because you created your job script on a Windows machine and then copied it over without adjusting the line endings. To see the exact line endings, run 'cat -ve' on your file or open it in a text editor. To convert the line endings, run 'dos2unix' on it, or re-upload it in "ASCII" rather than "binary" mode.
Another example error:
error reason 1: 11/05/2014 12:16:26 [233269:25790]: execvp(/farmshare/software/free/oge/2011.11p1/FSsaucy/spool/barley09/job_scripts/2171682, "/farmshare/software/free/oge/2011.11p1/FSsaucy/spool/barley09/job_scripts/2171682") failed: No such file or directory Job is in error state
For more info, see https://en.wikipedia.org/wiki/Newline
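If dos2unix is not available, the check and the conversion can be approximated with standard tools. The following is a minimal sketch. The function names has_cr and strip_cr are hypothetical, and strip_cr removes every carriage return in the file, not only those at the ends of lines.

```shell
# Succeed if the file contains at least one carriage return (CR) character,
# which is what Windows line endings (CRLF) leave behind.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Remove all carriage returns from the file, rewriting it in place.
# Writing back with cat (instead of mv) keeps the file's permissions.
strip_cr() {
    tr -d '\r' < "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}
```

For example, has_cr job.sh && strip_cr job.sh converts a hypothetical job.sh only when it actually contains carriage returns.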
How do I block my jobs from running on certain machines?
Use a qsub option of the form
-l 'h=!barley17.stanford.edu&!barley18.stanford.edu'
In this example, the job is blocked from running on barley17 and barley18. This can be useful because the machines are not identical and may not have the same versions of some software.
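For longer lists of hosts, the expression can be assembled by a small helper instead of typed by hand. This is a minimal sketch; the function name exclude_hosts is hypothetical, and it only builds the string in the form shown above.

```shell
# Build a host-exclusion expression of the form h=!host1&!host2
# from the hostnames given as arguments.
exclude_hosts() {
    spec=
    for host in "$@"; do
        # Join entries with "&"; the "!" is single-quoted so that an
        # interactive shell does not treat it as history expansion.
        spec="${spec:+$spec&}"'!'"$host"
    done
    printf 'h=%s\n' "$spec"
}
```

It could then be used as qsub -l "$(exclude_hosts barley17.stanford.edu barley18.stanford.edu)" followed by the name of your job script.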