HierRegressions
From FarmShare
Hierarchical Linear Regressions
http://www.r-tutor.com/gpu-computing/rbayes/rhierlmc
http://www.r-tutor.com/content/download
installing rpud
Rpud comes in two parts. The first is a standard R library that can be installed the usual way, except that we set NVCC_OPT to instruct the code generator to produce code for the compute capability 2.0 devices we are using. The second part is distributed as a binary blob that ships with its own install.sh script.
Here we download and install the first part into your home directory (under ~/lib/R).
module load cuda
mkdir -p /tmp/$USER
cd /tmp/$USER
wget http://www.r-tutor.com/sites/default/files/rpud/rpud_0.3.3.tar.gz
mkdir -p ~/lib/R
export R_LIBS_USER="${HOME}/lib/R"
NVCC_OPT="-arch compute_20" R CMD INSTALL --library=$R_LIBS_USER rpud_0.3.3.tar.gz
Here we download and install the second part:
module load cuda
cd /tmp/$USER
wget http://www.r-tutor.com/sites/default/files/rpud/rpudplus_0.3.3.tar.gz
tar xzf rpudplus_0.3.3.tar.gz
export R_LIBS_USER="${HOME}/lib/R"
cd rpudplus_0.3.3
./install.sh
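Note that R reads R_LIBS_USER afresh in every session, so the export above must be repeated whenever you start a new shell. As a convenience (this is our suggestion, not part of the rpud install itself, and it assumes a bash login shell), you can append the export to your ~/.bashrc:

```shell
# Persist R_LIBS_USER so future shells and batch jobs can find rpud.
# Assumes bash; adapt for your login shell.
echo 'export R_LIBS_USER="$HOME/lib/R"' >> ~/.bashrc
```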
running on CPU
As a comparison, we will first run on the CPU using the standard bayesm R library.
Save the following source code as rpudtestcpu.r:
library(bayesm)

data("cheese")
str("cheese")

retailer <- levels(cheese$RETAILER)
nreg <- length(retailer); nreg

regdata <- NULL
for (i in 1:nreg) {
    filter <- cheese$RETAILER==retailer[i]
    y <- log(cheese$VOLUME[filter])
    X <- cbind(1, # intercept placeholder
               cheese$DISP[filter],
               log(cheese$PRICE[filter]))
    regdata[[i]] <- list(y=y, X=X)
}

Data <- list(regdata=regdata)
Mcmc <- list(R=2000)

system.time(
    outcpu <- bayesm::rhierLinearModel(
        Data=Data,
        Mcmc=Mcmc))

beta.3cpu <- mean(as.vector(outcpu$betadraw[, 3, 201:2000]))
beta.3cpu
Run it. The final value printed is the posterior mean of the coefficient on log price (the third column of X), averaged over draws 201-2000 so that the first 200 draws are discarded as burn-in. In this case the run took about 42 seconds.
cp -p /farmshare/software/examples/R/hierregress/cheese.rda .
R --no-save < rpudtestcpu.r
You should see output similar to:
bishopj@rye01:~$ R --no-save < rpudtestcpu.r

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> library(bayesm)
>
> data("cheese")
> str("cheese")
 chr "cheese"
>
> retailer <- levels(cheese$RETAILER)
> nreg <- length(retailer); nreg
[1] 88
>
> regdata <- NULL
>
> for (i in 1:nreg) {
+   filter <- cheese$RETAILER==retailer[i]
+   y <- log(cheese$VOLUME[filter])
+   X <- cbind(1, # intercept placeholder
+              cheese$DISP[filter],
+              log(cheese$PRICE[filter]))
+   regdata[[i]] <- list(y=y, X=X)
+ }
>
> Data <- list(regdata=regdata)
> Mcmc <- list(R=2000)
>
> system.time(
+   outcpu <- bayesm::rhierLinearModel(
+     Data=Data,
+     Mcmc=Mcmc))
Z not specified -- putting in iota
Starting Gibbs Sampler for Linear Hierarchical Model
   88 Regressions
   1 Variables in Z (if 1, then only intercept)
Prior Parms:
Deltabar
     [,1] [,2] [,3]
[1,]    0    0    0
A
     [,1]
[1,] 0.01
nu.e (d.f. parm for regression error variances)= 3
Vbeta ~ IW(nu,V)
nu = 6
V
     [,1] [,2] [,3]
[1,]    6    0    0
[2,]    0    6    0
[3,]    0    0    6
MCMC parms:
R= 2000 keep= 1

MCMC Iteration (est time to end - min)
 100 ( 0.7 )
 200 ( 0.7 )
 300 ( 0.6 )
 400 ( 0.6 )
 500 ( 0.5 )
 600 ( 0.5 )
 700 ( 0.5 )
 800 ( 0.4 )
 900 ( 0.4 )
 1000 ( 0.4 )
 1100 ( 0.3 )
 1200 ( 0.3 )
 1300 ( 0.2 )
 1400 ( 0.2 )
 1500 ( 0.2 )
 1600 ( 0.1 )
 1700 ( 0.1 )
 1800 ( 0.1 )
 1900 ( 0 )
 2000 ( 0 )
 Total Time Elapsed: 0.71
   user  system elapsed
 42.288   0.036  42.448
>
> beta.3cpu <- mean(as.vector(outcpu$betadraw[, 3, 201:2000]))
> beta.3cpu
[1] -2.146799
>
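The per-retailer regressions above all share the same design matrix X = [1, DISP, log(PRICE)], and the statistic the scripts report is the coefficient on log price (a price elasticity, hence the negative value). As a minimal illustration of one such regression, here is a pure-Python ordinary least squares fit via the normal equations on invented, noise-free data, so the coefficients are recovered exactly. The data and coefficient values (intercept 10, display lift 1, elasticity -2) are made up for this sketch and do not come from the cheese dataset:

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Hypothetical data for one retailer: display indicator and log price.
disp     = [0.0, 1.0, 0.0, 1.0, 0.5]
logprice = [0.0, 0.2, 0.5, 0.8, 1.0]
# Noise-free log volume: intercept 10, display lift 1, elasticity -2.
y = [10 + 1.0 * d - 2.0 * p for d, p in zip(disp, logprice)]

X = [[1.0, d, p] for d, p in zip(disp, logprice)]
# Normal equations: (X'X) beta = X'y
XtX = [[sum(row[a] * row[b] for row in X) for b in range(3)] for a in range(3)]
Xty = [sum(X[i][a] * y[i] for i in range(len(X))) for a in range(3)]
beta = solve(XtX, Xty)
print([round(v, 6) for v in beta])  # intercept, display, and log-price coefficients
```

The hierarchical model does more than this, of course: it shrinks the 88 per-retailer coefficient vectors toward a common mean via the Gibbs sampler, rather than fitting each regression in isolation.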
running on GPU
Save the following source code as rpudtestgpu.r:
library(bayesm)

data("cheese")
str("cheese")

retailer <- levels(cheese$RETAILER)
nreg <- length(retailer); nreg

regdata <- NULL
for (i in 1:nreg) {
    filter <- cheese$RETAILER==retailer[i]
    y <- log(cheese$VOLUME[filter])
    X <- cbind(1, # intercept placeholder
               cheese$DISP[filter],
               log(cheese$PRICE[filter]))
    regdata[[i]] <- list(y=y, X=X)
}

Data <- list(regdata=regdata)
Mcmc <- list(R=2000)

library(rpud) # load rpudplus

system.time(
    outgpu <- rpud::rhierLinearModel(
        Data=Data,
        Mcmc=Mcmc,
        output="bayesm"))

beta.3 <- mean(as.vector(outgpu$betadraw[, 3, 201:2000]))
beta.3
Run it. On the GPU it takes 0.35 seconds, down from 42.448 seconds on the CPU alone.
bishopj@rye01:~$ module load cuda
bishopj@rye01:~$ export R_LIBS_USER=$HOME/lib/R
bishopj@rye01:~$ R --no-save < rpudtestgpu.r

R version 2.15.2 (2012-10-26) -- "Trick or Treat"
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> library(bayesm)
>
> data("cheese")
> str("cheese")
 chr "cheese"
>
> retailer <- levels(cheese$RETAILER)
> nreg <- length(retailer); nreg
[1] 88
>
> regdata <- NULL
>
> for (i in 1:nreg) {
+   filter <- cheese$RETAILER==retailer[i]
+   y <- log(cheese$VOLUME[filter])
+   X <- cbind(1, # intercept placeholder
+              cheese$DISP[filter],
+              log(cheese$PRICE[filter]))
+   regdata[[i]] <- list(y=y, X=X)
+ }
>
> Data <- list(regdata=regdata)
> Mcmc <- list(R=2000)
>
> library(rpud) # load rpudplus
Rpudplus 0.3.3
http://www.r-tutor.com
Copyright (C) 2010-2013 Chi Yau. All Rights Reserved.
Rpudplus is free for academic use only. There is absolutely NO warranty.

Attaching package: 'rpud'

The following object(s) are masked from 'package:bayesm':

    rhierLinearModel, rmultireg

>
> system.time(
+   outgpu <- rpud::rhierLinearModel(
+     Data=Data,
+     Mcmc=Mcmc,
+     output="bayesm"))
Starting Gibbs Sampler for Linear Hierarchical Model
   88 Regressions
   1 Variables in Z (if 1, then only intercept)
Prior Parms:
===========
Deltabar
     [,1] [,2] [,3]
[1,]    0    0    0
A
     [,1]
[1,] 0.01
nu.e (d.f. for regression error variances) = 3
Vbeta ~ IW(nu,V)
nu = 6
V
     [,1] [,2] [,3]
[1,]    6    0    0
[2,]    0    6    0
[3,]    0    0    6
MCMC parms:
==========
R = 2000, keep = 1

....................
   user  system elapsed
  0.192   0.156   0.350
>
> beta.3 <- mean(as.vector(outgpu$betadraw[, 3, 201:2000]))
> beta.3
[1] -2.145573
>
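As a quick sanity check on the headline numbers, the elapsed times reported by system.time() in the two transcripts give the speedup directly:

```python
# Speedup implied by the elapsed times in the two runs above.
cpu_elapsed = 42.448  # seconds, CPU run with bayesm
gpu_elapsed = 0.350   # seconds, GPU run with rpud
speedup = cpu_elapsed / gpu_elapsed
print(f"{speedup:.0f}x")  # prints 121x
```

Note also that the two posterior means (-2.146799 on CPU, -2.145573 on GPU) agree to about three decimal places, as expected for two independent MCMC runs of the same model.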