R
From FarmShare
Latest revision as of 23:00, 22 June 2018
Which R are you using?
To see which R binary is first on your PATH, run
which R
To see its version, run
R --version
We have two versions of R installed. If you do nothing, you get the default R that ships with Ubuntu 14.04 (R v3.0.2 as of 2014-07), which includes many R libraries distributed by Ubuntu.
As of 2015-01, a second, newer version (v3.1.2) is available via 'module load r'. It also includes rstudio.
To use the newer version, log in with X11 forwarding or via FarmVNC, then run:
    module load r
    rstudio
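To confirm which R a session actually picks up (for instance before and after 'module load r'), a small shell check along these lines may help. This is only a sketch: the function name report_r is made up for illustration, and it relies on nothing beyond the 'which R' and 'R --version' checks described above.

```shell
# Hypothetical helper: print the location and version of the R found on PATH.
report_r() {
  if command -v R >/dev/null 2>&1; then
    printf 'R found at: %s\n' "$(command -v R)"
    # the first line of "R --version" carries the version string
    R --version | head -n 1
  else
    printf 'R not found on PATH\n'
  fi
}

report_r
```

Running it once before and once after loading the module should show whether the path and version change.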
Looking at installed packages
You can list the installed R libraries by calling library() in R:
library();
We have a lot of packages already installed. You can ask us to install more, or install them yourself in your home directory.
You can also use
.libPaths();
to check which directories R will look in. https://stat.ethz.ch/R-manual/R-devel/library/base/html/libPaths.html
Installing CRAN Packages
Most CRAN packages can be installed per-user by running install.packages() in an interactive session:
install.packages("package_name", dependencies = TRUE)
R initially attempts to install to /usr/local/lib/R, and you don't have permissions to write there, so it will prompt for the creation of a library subdirectory in ~/R (if necessary) and fall back to installation there when the initial attempt fails. If your package requires dependencies available from the standard Ubuntu repositories you can e-mail us requesting installation.
You can, of course, install R libraries into any arbitrary path and add that path to your R library search path (for example with .libPaths()). Be aware that this will probably break the next time R is upgraded, since your packages were built with the older version.
If you run into an SSL error, you can explicitly specify an HTTP mirror, e.g.
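One way to reduce the upgrade breakage mentioned above is to keep a separate library directory per R version and point R at it through the R_LIBS_USER environment variable, which R reads at startup. The sketch below assumes a directory layout of <base>/R/library-<version>; that layout and the function name use_versioned_lib are illustrative choices, not a FarmShare convention.

```shell
# Hypothetical helper: create a per-version library directory and export
# R_LIBS_USER so that R adds it to its library search path.
# usage: use_versioned_lib "$HOME" 3.1.2
use_versioned_lib() {
  base=$1
  r_version=$2
  lib="$base/R/library-$r_version"   # assumed layout, not an R default
  mkdir -p "$lib" || return 1
  export R_LIBS_USER="$lib"
  echo "R_LIBS_USER=$R_LIBS_USER"
}
```

After an R upgrade you would call the function with the new version number and reinstall packages into the fresh directory, leaving the old one untouched.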
install.packages("spatstat", dependencies=TRUE, repos="http://cran.cnr.Berkeley.edu/")
R Sample Job
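Here's an example R file that generates a large array and then sleeps for 5 minutes. Note that gaussian() returns a model family object rather than random numbers, so the second line replaces x instead of filling the array. The job still peaks at almost exactly 8GB of RAM, and you know it's going to run for about 5 minutes.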
Save this as 8GB.R:
    x <- array(1:1073741824, dim=c(1024,1024,1024))
    x <- gaussian()
    Sys.sleep(300)
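The 8GB figure can be sanity-checked with a little arithmetic. R stores each integer in 4 bytes, so one copy of the 1024 x 1024 x 1024 array takes 4 GiB. The sketch below assumes that two copies exist at the peak (the 1:1073741824 sequence and the array built from it); that is a plausible explanation for the observed usage, not something measured here. The function name ints_to_gib is made up for illustration.

```shell
# Hypothetical helper: GiB needed to hold N 4-byte integers.
ints_to_gib() {
  echo $(( $1 * 4 / 1024 / 1024 / 1024 ))
}

n=$((1024 * 1024 * 1024))
one_copy=$(ints_to_gib "$n")
# assumption: the sequence and the array are both in memory at the peak
echo "one copy: ${one_copy} GiB, two copies: $((one_copy * 2)) GiB"
# prints "one copy: 4 GiB, two copies: 8 GiB"
```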
Here's an example SGE submit script that runs that R file. Save it as r_test.script:
    #!/bin/bash

    # use the current directory
    #$ -cwd
    #$ -S /bin/bash

    # mail this address
    #$ -M $USER@stanford.edu
    # send mail on begin, end, suspend
    #$ -m bes

    # request 8GB of RAM, not hard-enforced on FarmShare
    #$ -l mem_free=8G

    # request 6 mins of runtime, is hard-enforced on FarmShare
    #$ -l h_rt=00:06:00

    R --vanilla --no-save < 8GB.R
You can submit it with just
qsub r_test.script
Here's the stdout output file that I get (SGE writes a separate file for stderr):
    $ cat r_test.script.o2029205

    R version 3.0.1 (2013-05-16) -- "Good Sport"
    Copyright (C) 2013 The R Foundation for Statistical Computing
    Platform: x86_64-pc-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > x <- array(1:1073741824, dim=c(1024,1024,1024))
    > x <- gaussian()
    > Sys.sleep(300)
    >
And here's the e-mail I get about the job, which shows the runtime and memory usage:
    Job 2029205 (r_test.script) Complete
    User = chekh
    Queue = saucy.q@barley12.Stanford.EDU
    Host = barley12.Stanford.EDU
    Start Time = 07/10/2014 12:54:31
    End Time = 07/10/2014 13:00:08
    User Time = 00:00:29
    System Time = 00:00:06
    Wallclock Time = 00:05:37
    CPU = 00:00:35
    Max vmem = 8.107G
    Exit Status = 0
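Because the h_rt limit is hard-enforced, it is worth checking how close a job came to it. The sketch below converts the HH:MM:SS values from the e-mail and the submit script into seconds and prints the headroom; the function name hms_to_seconds is made up for illustration.

```shell
# Hypothetical helper: convert HH:MM:SS to a number of seconds.
hms_to_seconds() {
  printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

wallclock=$(hms_to_seconds 00:05:37)   # "Wallclock Time" from the e-mail
limit=$(hms_to_seconds 00:06:00)       # h_rt from the submit script
echo "used ${wallclock}s of ${limit}s, headroom $((limit - wallclock))s"
# prints "used 337s of 360s, headroom 23s"
```

With only 23 seconds to spare, a slightly slower node could push this job over the limit, so a more generous h_rt may be sensible for real work.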
Another R Sample Job
R script, let's call it R-jags.R (the job script below reads it under that name)
    print("Hello World")
    library(rjags) #this just loaded some settings from that library
    print("Finished")
Job script, let's call it R-jags.submit.script
    #!/bin/bash

    # use the current directory
    #$ -cwd
    #$ -S /bin/bash

    # mail this address
    #$ -M $USER@stanford.edu
    # send mail on begin, end, suspend
    #$ -m bes

    R --vanilla --no-save < R-jags.R
Submit it to the test queue with a small memory requirement:
qsub -l mem_free=200M -l testq=1 R-jags.submit.script
Looking at the output files, it errored out because R can't find the package rjags. You have two alternatives:
- include the R library from /mnt/glusterfs/software
- use modules to specify the full R install from /mnt/glusterfs/software
The first way, you would add this line to your R script:
.libPaths(c("/mnt/glusterfs/software/free/R-2.15.0/lib/R/library", "/usr/lib/R/library"))
The second way, your script will look like this:
    $ cat R-jags.submit.script
    #!/bin/bash

    # use the current directory
    #$ -cwd
    #$ -S /bin/bash

    # mail this address
    #$ -M chekh@stanford.edu
    # send mail on begin, end, suspend
    #$ -m bes

    eval `tclsh /mnt/glusterfs/software/free/modules/tcl/modulecmd.tcl sh autoinit`
    module load R-2.15.0
    R --vanilla --no-save < R-jags.R
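Since the job script reads the R file by name, a typo in either file name only shows up after the job has been queued and run. A small pre-submit check like the following can catch that earlier. It is a sketch only: submit_checked is a made-up helper, and it falls back to printing the command when qsub is not installed on the current host.

```shell
# Hypothetical helper: refuse to submit when the R input file is missing.
# usage: submit_checked R-jags.submit.script R-jags.R
submit_checked() {
  job_script=$1
  r_file=$2
  if [ ! -r "$r_file" ]; then
    echo "cannot read R file: $r_file" >&2
    return 1
  fi
  if command -v qsub >/dev/null 2>&1; then
    qsub "$job_script"
  else
    # qsub is only present on hosts with SGE installed
    echo "qsub not found; would run: qsub $job_script"
  fi
}
```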
Jupyter
R can also be run in a Jupyter notebook on FarmShare servers and used via a web browser.
IRkernel is available as part of the prebuilt Jupyter environment accessible via the Jupyter installation guide.
Links
Some other departments have some other more detailed examples:
- http://wiki.genomics.upenn.edu/index.php/HPC:ExamplesR
- http://me.eng.uab.edu/wiki/index.php?title=R-userinfo
- https://www.stanford.edu/dept/statistics/cgi-bin/projects/stat-sysadminwiki/index.php/R_Jobs
- http://www.glennklockwood.com/di/R-para.php
building our local R
Here's how I usually do it.
- cd /mnt/glusterfs/software/free
- wget http://cran.cnr.berkeley.edu/src/base/R-2/R-2.15.1.tar.gz
- tar zxvf R-2.15.1.tar.gz
- cd R-2.15.1
- ./configure --enable-R-shlib
- make
- don't "make install"
- write new FarmShare module, e.g. /mnt/glusterfs/software/free/modules/tcl/modulefiles/R-2.15.1
2014-07-10
R 3.1.1 released today, I compiled it as chekh on corn40 (Ubuntu 13.10)
- cd /farmshare/software/free/r
- wget http://cran.cnr.berkeley.edu/src/base/R-3/R-3.1.1.tar.gz
- cd R-3.1.1
- ./configure --enable-R-shlib
    R is now configured for x86_64-unknown-linux-gnu

    Source directory: .
    Installation directory: /usr/local

    C compiler: gcc -std=gnu99 -g -O2
    Fortran 77 compiler: gfortran -g -O2

    C++ compiler: g++ -g -O2
    C++ 11 compiler: g++ -std=c++11 -g -O2
    Fortran 90/95 compiler: gfortran -g -O2
    Obj-C compiler: gcc -g -O2 -fobjc-exceptions

    Interfaces supported: X11, tcltk
    External libraries: readline, ICU, lzma
    Additional capabilities: PNG, JPEG, TIFF, NLS, cairo
    Options enabled: shared R library, shared BLAS, R profiling

    Recommended packages: yes
- make
- write /farmshare/software/mf/saucy/r/3.1.1.lua
- also added rstudio
2015-02-03
Added R 3.1.2 as above to Ubuntu 14.04.
2015-07-13
As chekh on corn25 (oldest CPU)
    cd /farmshare/software/free/r
    wget http://cran.r-project.org/src/base/R-3/R-3.2.1.tar.gz
    tar zxvf R-3.2.1.tar.gz
    cd R-3.2.1
    ./configure --enable-R-shlib
    make
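The download step in these build notes follows a regular pattern, so the tarball URL can be derived from the version number instead of being retyped for each release. The sketch below copies the URL layout from the wget lines above and assumes it holds for other versions too, which may not always be true. The function name r_tarball_url is made up for illustration.

```shell
# Hypothetical helper: build the source tarball URL for a given R version,
# following the .../src/base/R-<major>/R-<version>.tar.gz layout used above.
r_tarball_url() {
  version=$1
  major=${version%%.*}   # "3.2.1" -> "3"
  printf 'http://cran.r-project.org/src/base/R-%s/R-%s.tar.gz\n' "$major" "$version"
}

r_tarball_url 3.2.1
# prints "http://cran.r-project.org/src/base/R-3/R-3.2.1.tar.gz"
```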
lapack issues
If you see messages like:
unable to load shared object '/usr/lib/R/modules//lapack.so':
most likely you're mixing R versions and libraries.
Double-check that your R library path does not point to directories with older libraries.
This test script should run without errors if everything is set correctly:
    $ cat lapack.r
    data(iris)
    zz = lm(Sepal.Length ~., data = iris)
    summary(zz)

    $ R --no-save < lapack.r
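To spot mixed versions before running R, one rough heuristic is to scan a colon-separated library path (such as the value of R_LIBS or R_LIBS_USER) for directories that name a different R version. The sketch below assumes that versioned directories contain a component like R-2.15.0, as the paths on this page do; paths without a version in their name are ignored. The function name stale_lib_paths is made up for illustration.

```shell
# Hypothetical helper: list path entries that mention an R version other
# than the wanted one. Entries with no "R-x.y" component are skipped.
stale_lib_paths() {
  wanted=$1
  paths=$2
  printf '%s\n' "$paths" | tr ':' '\n' \
    | grep -E 'R-[0-9]+\.[0-9]+' \
    | grep -v -F "R-$wanted"
}

stale_lib_paths 3.1.2 "/mnt/glusterfs/software/free/R-2.15.0/lib/R/library:/usr/lib/R/library"
# prints "/mnt/glusterfs/software/free/R-2.15.0/lib/R/library"
```

Any path it prints is a candidate to remove from your library settings before rerunning the lapack test script.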
links
- proclus - http://web.stanford.edu/group/proclus/cgi-bin/mediawiki/index.php/Software-R
- sherlock - http://sherlock.stanford.edu/mediawiki/index.php/R