From FarmShare

Trying to compile a yen-specific R with Intel MKL. Following

as chekh on yen4

 mkdir /farmshare/software/free/r/yen-r
 tar zxvf R-3.2.2.tar.gz
 cd R-3.2.2/
 module load intel/2016
 CFLAGS="-O2 -march=native" CXXFLAGS="-O2 -march=native" FFLAGS="-O2 -march=native" FCFLAGS="-O2 -march=native" ./configure --with-blas="-L${MKLROOT}/lib/intel64 -lmkl_gf_lp64 -lmkl_sequential -lmkl_lapack95_lp64 -lmkl_core" --with-lapack --prefix=/farmshare/software/free/r/yen-r/ir --enable-R-shlib

That compiles R with gcc and gfortran, linking against Intel MKL for BLAS.

configure results:

R is now configured for x86_64-pc-linux-gnu

  Source directory:          .
  Installation directory:    /farmshare/software/free/r/yen-r/ir

  C compiler:                gcc -std=gnu99  -O2 -march=native
  Fortran 77 compiler:       gfortran  -O2 -march=native

  C++ compiler:              g++  -O2 -march=native
  C++ 11 compiler:           g++  -std=c++11 -O2 -march=native
  Fortran 90/95 compiler:    gfortran -O2 -march=native
  Obj-C compiler:            gcc -g -O2 -fobjc-exceptions

  Interfaces supported:      X11, tcltk
  External libraries:        readline, BLAS(ATLAS), LAPACK(generic), zlib, bzlib, lzma, PCRE, curl
  Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo, ICU
  Options enabled:           shared R library, R profiling

  Capabilities skipped:      
  Options not enabled:       shared BLAS, memory profiling

  Recommended packages:      yes
 make install
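
The configure summary above lists BLAS(ATLAS) under external libraries, so it may be worth checking which BLAS the installed R actually loads at run time. A minimal sketch, assuming ldd is available; classify_blas is a hypothetical helper name, and the library-name patterns are assumptions about how the shared objects are named on the system:

```shell
# classify_blas is a hypothetical helper, not part of R or MKL.
# It reads `ldd` output on stdin and prints which BLAS family appears in it.
classify_blas() {
    awk '
        /libmkl/                                  { mkl = 1 }
        /libopenblas/                             { openblas = 1 }
        /libatlas|libsatlas|libtatlas|libf77blas/ { atlas = 1 }
        /libblas\.so/                             { generic = 1 }
        END {
            if (mkl)           print "mkl"
            else if (openblas) print "openblas"
            else if (atlas)    print "atlas"
            else if (generic)  print "generic"   # may be an alternatives symlink to another BLAS
            else               print "none"
        }'
}
```

With the prefix used above, it could be run as `ldd /farmshare/software/free/r/yen-r/ir/lib/R/lib/libR.so | classify_blas`. The lib/R/lib/libR.so location is an assumption (it can be lib64 on some systems).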


Use the file proclus:~chekh/alex_blas_test.R (first example from ) as the benchmark.

It prints the runtime at the end, but I guess you actually need to adjust by the number of cores it used, which you can get from e.g. the %CPU field in the output of /usr/bin/time.
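
The adjustment can be sketched as multiplying the reported seconds by the %CPU figure divided by 100. This is only a rough estimate, since /usr/bin/time measures the whole R process (startup and matrix setup included), not just the timed operation. core_seconds is a hypothetical helper name:

```shell
# core_seconds is a hypothetical helper: scale a timing by CPU utilisation.
# $1: seconds printed by the R script
# $2: the %CPU figure from /usr/bin/time, e.g. "1761%CPU" or just "98"
core_seconds() {
    awk -v secs="$1" -v pct="$2" 'BEGIN {
        sub(/%.*/, "", pct)                  # strip a trailing "%CPU" if present
        printf "%.1f\n", secs * pct / 100    # approximate core-seconds
    }'
}

core_seconds 6.871 98%CPU      # proclus, R 2.15 mkl, %*% timing
core_seconds 1.84 1761%CPU     # yen4, Ubuntu R, %*% timing
```

Using the numbers from the runs recorded on this page, the two calls print 6.7 and 32.4.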

proclus login node, R 2.15 mkl:

module load R
/usr/bin/time R --vanilla < alex_blas_test.R
%*%	 6.871 	crossprod	 6.421> 
68.40user 0.61system 1:10.39elapsed 98%CPU (0avgtext+0avgdata 2796640maxresident)k
16inputs+0outputs (0major+20901minor)pagefaults 0swaps

So that was 6.8 and 6.4 on one CPU.

yen4, Ubuntu R:

$ /usr/bin/time R --vanilla < alex_blas_test.R 
%*%	 1.84 	crossprod	 1.141> 
179.62user 290.75system 0:26.70elapsed 1761%CPU (0avgtext+0avgdata 697980maxresident)k
112inputs+24outputs (1major+514924minor)pagefaults 0swaps

So that was 1.8 and 1.1, but with ~18 CPUs.

yen4, newly built R:

$ /usr/bin/time /farmshare/software/free/r/yen-r/ir/bin/R --vanilla < alex_blas_test.R 
%*%	 13.382 	crossprod	 12.467> 
125.97user 29.37system 2:35.23elapsed 100%CPU (0avgtext+0avgdata 748872maxresident)k
0inputs+0outputs (0major+331647minor)pagefaults 0swaps

So that was 13.3 and 12.4 on one CPU.

yen4, previously built R, super slow:

$ module load r
$ which R
$ /usr/bin/time R --vanilla < alex_blas_test.R
%*%	 80.502 	crossprod	 83.449> 
784.90user 64.11system 14:10.21elapsed 99%CPU (0avgtext+0avgdata 686988maxresident)k
30232inputs+16352outputs (7major+341459minor)pagefaults 0swaps

80s on one CPU, about 6x slower than the MKL build, and about 40x slower than the Ubuntu multi-core run.
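
The ratios quoted here come from dividing the %*% timings. A small sketch of that arithmetic, where slowdown is a hypothetical helper name:

```shell
# slowdown is a hypothetical helper: how many times slower $1 is than $2.
slowdown() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

slowdown 80.502 13.382    # previously built R vs. newly built R
slowdown 80.502 1.84      # previously built R vs. Ubuntu R (multi-core)
```

These print 6.0 and 43.8 for the %*% timings. The second figure compares wall-clock time only and does not account for the ~18 CPUs used by the Ubuntu run.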

srn-chek desktop, RRO R per RRO, made my fans spin up a bit

export PATH=~/RRO/usr/lib64/RRO-3.2.2/R-3.2.2/lib/R/bin/:$PATH
/usr/bin/time R --vanilla < test.R
%*%	 1.229 	crossprod	 0.956> 
78.78user 4.11system 0:28.95elapsed 286%CPU (0avgtext+0avgdata 849888maxresident)k
26480inputs+0outputs (69major+1029313minor)pagefaults 0swaps

So 1.2 and 0.9 with ~3 cores, about 6 times less total CPU time than the Ubuntu R run on yen4 (roughly 83 s versus 470 s of user plus system time).
