MCglmnet {MCglmnet}    R Documentation

Monte Carlo Resampling Enhanced Regularized Generalized Linear Models and Cox Proportional Hazard Models

Description

Monte Carlo Resampling Enhanced Regularized Generalized Linear Models and Cox Proportional Hazard Models

Usage

MCglmnet(d.train, xCol, yCol, d.test = NULL, xCol.test,
  yCol.test = NULL, idCol.test, RespType = c("B", "M", "S", "C"),
  CV.iter = 10, n.fold = 5, n.glmnet.fold = 5, seed.iter = 50,
  Strat = TRUE, perf.glmnet = "auc", lambda.method = c("opt", "1se"),
  resample.method = c("CV", "bootstrap"), SigGenesList = "SigList.csv",
  BoxPlotName = "boxplot.pdf", CoefPlotName = "coefplot.pdf",
  ofile = TRUE, ofilename = "results.csv",
  RobustGenesList = "robust.genes.list.csv", Path = "./",
  Ext.Val = FALSE, Output.Ext.Raw = FALSE,
  results.ext.RAW = "ext.raw.preds.csv",
  Results.Ext = "external.results.csv", seed = 263, XParm = 4,
  YParm = 4, pre.filter = NULL, time.cut = NULL, ...)

Arguments

d.train

training data

xCol

the index of the first predictor column in the training data. This column and all columns to its right are used as predictors.

yCol

the column number for the response variable in the training data.
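As a rough illustration of the xCol and yCol convention, the sketch below splits a toy data frame into a response vector and a predictor matrix. The data frame "toy" and the helper "split_xy" are hypothetical and not part of MCglmnet; how the function handles these columns internally is an assumption.

```r
# Hypothetical helper (not part of MCglmnet): split a data frame into
# response and predictors using the xCol / yCol convention.
split_xy <- function(d, xCol, yCol) {
  stopifnot(xCol >= 1, xCol <= ncol(d), yCol >= 1, yCol <= ncol(d))
  list(
    y = d[[yCol]],                                   # response column
    x = as.matrix(d[, xCol:ncol(d), drop = FALSE])   # xCol and everything to its right
  )
}

toy <- data.frame(resp = c(0, 1, 0, 1),
                  g1 = c(1.2, 0.7, 3.1, 2.2),
                  g2 = c(5.0, 4.1, 6.3, 5.5))
parts <- split_xy(toy, xCol = 2, yCol = 1)
dim(parts$x)        # 4 rows, 2 predictor columns
colnames(parts$x)   # "g1" "g2"
```

Under this convention the response column should sit to the left of xCol; otherwise it would be picked up as a predictor.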

d.test

test data.

xCol.test

the start column number for predictors in the test data.

yCol.test

the column number for the response variable in the test data.

idCol.test

id column in the test set. (ATTENTION: May not need this in final package.)

RespType

the response variable type: "B" for binary, "M" for multiple categories, "C" for continuous, and "S" for survival.

CV.iter

number of iterations for n-fold outside cross validation. Default is 10.

n.fold

number of folds in the n-fold outside cross validation. Default is 5.

n.glmnet.fold

number of folds in the inside cross validation for tuning parameter selection. Default is 5.

seed.iter

number of MC iterations. Default is 50.

Strat

indicator (TRUE or FALSE) for whether stratification is used when generating the n-fold cross validation folds. Default is TRUE.
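To make the idea of stratified folds concrete, here is a minimal base R sketch that spreads each class evenly across the folds. The helper "stratified_folds" is hypothetical; the fold generation actually used by MCglmnet is not documented here and may differ.

```r
# Hypothetical helper (not part of MCglmnet): assign each sample to one of
# n.fold folds so that every class is spread evenly across the folds.
stratified_folds <- function(y, n.fold) {
  folds <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    if (length(idx) > 1) idx <- sample(idx)   # shuffle within the class
    folds[idx] <- rep_len(seq_len(n.fold), length(idx))
  }
  folds
}

set.seed(263)
y <- rep(c(0, 1), times = c(20, 10))
folds <- stratified_folds(y, n.fold = 5)
table(y, folds)   # each fold holds 4 samples of class 0 and 2 of class 1
```

Without stratification, a small class can end up missing from a fold by chance, which makes per-fold measures such as AUC unstable or undefined.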

perf.glmnet

the performance measure used in the inside cross validation for tuning parameter selection. Default is "auc" for Binary Response.

lambda.method

the method used for selecting tuning parameter lambda in inside cross validation. "opt" or "1se". Default is "opt".
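The sketch below illustrates the two selection rules on a made-up cross validation curve. It assumes that "opt" picks the lambda with the best cross-validated performance and that "1se" applies the usual one-standard-error rule, as glmnet does with lambda.min and lambda.1se; this page does not confirm that mapping.

```r
# Toy cross validation curve (made-up numbers): mean CV error and its
# standard error for a decreasing sequence of lambda values.
lambda <- c(1.00, 0.50, 0.25, 0.10, 0.05)
cvm    <- c(0.40, 0.31, 0.27, 0.25, 0.26)   # mean CV error per lambda
cvsd   <- c(0.03, 0.03, 0.03, 0.03, 0.03)   # standard error per lambda

i.min      <- which.min(cvm)
lambda.opt <- lambda[i.min]                                  # lowest CV error
lambda.1se <- max(lambda[cvm <= cvm[i.min] + cvsd[i.min]])   # one-standard-error rule

lambda.opt   # 0.10
lambda.1se   # 0.25
```

The "1se" choice favors a larger penalty and therefore a sparser model. For a measure where higher is better, such as AUC, the comparison would be reversed.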

resample.method

resampling method used in MC iterations. "CV" (cross validation) or "bootstrap". Default is "CV".
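The difference between the two resampling schemes can be sketched with index vectors in base R. Using the out-of-bag samples as the held-out set for the bootstrap is an assumption made for illustration; the page does not say how MCglmnet evaluates bootstrap resamples.

```r
set.seed(263)
n <- 10

# Cross validation: every sample lands in exactly one held-out fold.
cv.fold  <- sample(rep_len(1:5, n))
cv.test  <- which(cv.fold == 1)              # held-out samples for fold 1
cv.train <- setdiff(seq_len(n), cv.test)

# Bootstrap: draw n samples with replacement; samples never drawn
# ("out-of-bag") can serve as the held-out set.
bt.train <- sample(seq_len(n), n, replace = TRUE)
bt.test  <- setdiff(seq_len(n), bt.train)
```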

SigGenesList

output file for signature markers. Default is "SigList.csv".

BoxPlotName

output file for boxplots of the signature markers. Default is "boxplot.pdf".

CoefPlotName

output file for the coefficient plot of the signature markers. Default is "coefplot.pdf".

ofile

indicator for whether to save to the log file. TRUE or FALSE. Default is TRUE.

ofilename

log file name. Default is "results.csv".

RobustGenesList

output file for robust markers. The frequency with which each marker was selected during cross validation is recorded. Default is "robust.genes.list.csv".
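A selection-frequency table of this kind can be sketched in base R as follows. The marker names and selections are made up, and the column layout is hypothetical; the actual format of the robust markers file is not described on this page.

```r
# Made-up selections from four resampling iterations (hypothetical marker names).
selected <- list(
  c("g1", "g3"),
  c("g1", "g2", "g3"),
  c("g1"),
  c("g1", "g3", "g4")
)

# Count how often each marker was selected, most frequent first.
freq <- sort(table(unlist(selected)), decreasing = TRUE)
robust <- data.frame(marker = names(freq),
                     frequency = as.integer(freq),
                     proportion = as.integer(freq) / length(selected))
robust
```

Markers selected in most or all iterations, such as "g1" here, are the ones a frequency table like this would flag as robust.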

Path

the path for output files. Default is the current directory "./".

Ext.Val

indicator for whether to conduct external validation on the test data. TRUE or FALSE. Default is FALSE.

Output.Ext.Raw

indicator for whether prediction results on the test data are saved. TRUE or FALSE. Default is FALSE.

results.ext.RAW

output file for prediction results on test data. Default is "ext.raw.preds.csv".

Results.Ext

output file for predictive performance for external validation on the test data. Default is "external.results.csv".

seed

the seed for random number generation that is set before running MCglmnet. Default is 263.

XParm

the number of rows in the layout of the Boxplots in the boxplot output file.

YParm

the number of columns in the layout of the Boxplots in the boxplot output file.

pre.filter

number of prefiltered markers.

time.cut

time cutoff values for evaluating performance for a time-to-event response.

...

extra parameters passed to "glmnet" inside the MCglmnet algorithm, e.g., the elastic net mixing parameter "alpha".

Value

The MCglmnet function does not return an object. All the results are saved in the output files.

Author(s)

Feng Hong, Lu Tian, Viswanath Devanarayan

References

Feng Hong, Lu Tian, and Viswanath Devanarayan (In Review). Improving the robustness of variable selection and predictive performance of regularized generalized linear models and Cox proportional hazard models.

Examples

## Not run: 
MCglmnet(dat.train, xCol = 2, yCol = 1, RespType = "B",
         d.test = dat.test, xCol.test = 2, yCol.test = 1, Ext.Val = TRUE,
         lambda.method = "1se",
         CV.iter = 10, n.fold = 10, seed.iter = 50,
         Path = "./", ofile = TRUE, alpha = 1,
         pre.filter = 10)

## End(Not run)


[Package MCglmnet version 1.0.0 Index]