We will mainly use two packages during this workshop: dada2 and phyloseq.
It is recommended that you update R so that these packages run smoothly.
You should have R version 3.6.0 or newer and Bioconductor version 3.9. Using Bioconductor is the easiest way to install both dada2 and phyloseq. You can check your current R version by running R.version in R.
R.version
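Reading the version off the R.version output works, but the comparison can also be done in code. The following is a minimal sketch using base R's getRversion(); the variable name min_version is only illustrative, and the threshold is the 3.6.0 mentioned above.

```r
# Compare the running R version against the minimum needed for the workshop.
min_version <- "3.6.0"  # illustrative name; threshold taken from the text above
current <- getRversion()

if (current >= min_version) {
  message("R ", as.character(current), " is recent enough.")
} else {
  message("R ", as.character(current), " is older than ", min_version, "; please update.")
}
```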
If your version is older than 3.6, follow the next steps. If not, skip ahead to the package installation steps. On Windows, the installr package can perform the update:
install.packages("installr")
library(installr)
updateR()
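This should migrate the packages you’ve installed on your older version of R to the folder corresponding to the newest version of R. It also updates all the packages you’ve installed from the CRAN repository. The installr approach only works on Windows; on macOS, the updateR package from GitHub can be used instead: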
install.packages('devtools')
library(devtools)
install_github('andreacirilloac/updateR')
library(updateR)
updateR(admin_password = 'Admin user password')
To check that everything went according to plan, run R.version again:
R.version
Now that you’ve got an up-to-date version of R, we’re going to install dada2 and phyloseq.
The current installation method uses BiocManager, which is preferable to the older biocLite method.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("dada2", version = "3.9")
BiocManager::install("phyloseq", version = "3.9")
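Once the installation finishes, a quick check can confirm that both packages are available. This is only a sketch using base R functions (requireNamespace and packageVersion); the variable names pkgs and installed are illustrative.

```r
# Report which of the workshop packages can be found, and their versions.
pkgs <- c("dada2", "phyloseq")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

for (pkg in pkgs) {
  if (installed[[pkg]]) {
    message(pkg, " ", as.character(packageVersion(pkg)), " is installed.")
  } else {
    message(pkg, " is NOT installed yet.")
  }
}
```

If either package is reported as missing, rerun the corresponding BiocManager::install call and read its output for errors.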
The old way of doing this used biocLite:
source("https://bioconductor.org/biocLite.R") # Installs Bioconductor
biocLite("dada2") # Installs dada2
biocLite("phyloseq") # Installs phyloseq
If this doesn’t work, you can try the alternative installation methods described in the DADA2 and phyloseq documentation.
If you don’t succeed, we’ll help you out in the first timeslot of the workshop.
In RStudio, you can open the documentation for a function or package by typing ?name or help(name). For example:
library(dada2) # Loads the DADA2 package in the environment
?dada2
?filterAndTrim # One of DADA2's functions
help(filterAndTrim)
This is particularly useful for understanding the parameters a function uses.
You can also see the source code of a function by typing its name without parentheses.
library(dada2) # Loads the dada2 package in the environment
filterAndTrim
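When the full source is more than you need, base R can also list just a function's parameters. The sketch below uses list.files, a base function chosen here only so that the example runs without dada2 installed; the same calls should work on filterAndTrim once dada2 is loaded.

```r
# Inspect a function's parameters without reading its whole source.
# list.files is only a stand-in that ships with base R.
param_names <- names(formals(list.files))  # argument names as a character vector
print(param_names)

# Show the full signature, including default values.
print(args(list.files))
```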
Whenever you launch R, it selects a default working directory (i.e. the folder in which your scripts and objects are read and saved by default). The function getwd returns the path to the directory you are currently in. To change it, use the function setwd and specify the path to the directory you want. Another useful function is list.files, which returns the names of all the files present in the current directory.
getwd()
setwd("/Users/Serina/SelfStudy/ASM/")
list.files()
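The path above is specific to one machine. As a self-contained sketch of the same three functions, the example below works in a temporary folder and then restores the original working directory; the names old_wd, tmp_dir, and example.txt are purely illustrative.

```r
# Remember where we started so the working directory can be restored.
old_wd <- getwd()

# Create a fresh temporary folder and move into it.
tmp_dir <- tempfile("demo_dir_")
dir.create(tmp_dir)
setwd(tmp_dir)

# Create an empty file, then list the contents of the current directory.
file.create("example.txt")
files_seen <- list.files()
print(files_seen)

# Go back to the original working directory.
setwd(old_wd)
```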
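Note: It is worth your time to create a new R project and use the here package, which builds file paths relative to the project root instead of relying on setwd:
install.packages("here")
library(here) # Loads the here package
here("data", "file_wanted.csv") # Path to data/file_wanted.csv inside the project
See the recommendation by Jenny Bryan on this topic.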
When you run a pipeline or analysis, it is convenient to save the objects such as matrices or dataframes. By doing so, you won’t have to run your analysis every single time you want to view the object in question.
To save an object, use the function saveRDS. The object can then be read back with the readRDS function. This pair of functions is an alternative to save and load. Their advantage is that readRDS returns the object, so you can assign it to any name you like when you load it, whereas load restores objects under their original names.
mat <- matrix(sample(0:1, 12, replace=TRUE),3,4) # A 3 by 4 matrix containing 0s and 1s
saveRDS(mat, "nem.rds") # Save mat to a file named nem.rds
the.matrix <- readRDS("nem.rds") # Read nem.rds; the object is now named the.matrix instead of mat in your environment
identical(mat, the.matrix) # Checks if mat and the.matrix are identical
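To make the difference with save and load concrete, here is a small sketch that stores the same matrix both ways. The file paths come from tempfile() so nothing is written to your project folder, and the names renamed and restored_names are illustrative.

```r
# A small matrix to store on disk.
mat <- matrix(1:12, nrow = 3, ncol = 4)
original <- mat  # Keep a copy for comparison

rds_path <- tempfile(fileext = ".rds")
rda_path <- tempfile(fileext = ".RData")

saveRDS(mat, rds_path)      # Stores the object only, without its name
save(mat, file = rda_path)  # Stores the object together with its name
rm(mat)                     # Remove mat from the environment

# readRDS returns the object, so any name can be chosen.
renamed <- readRDS(rds_path)

# load recreates the object under its original name and
# returns the names of the objects it restored.
restored_names <- load(rda_path)
print(restored_names)

identical(renamed, original)  # Same content under the new name
identical(mat, original)      # mat is back under its old name
```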