A hands-on course on statistics for heterogeneous biological data

(sequences, images, clinical data, mRNA, single-cell measurements, ...) using R and many Bioconductor packages.

Course Schedule

Live lectures run from 1:30 pm to 2:50 pm Pacific time on Mondays and Wednesdays in Room 203 of Building 200 (the History Corner).

Complete the assigned readings before each lecture.

Weekly labs can be completed either on your own or during lab sessions run by the teaching team.

Here is a sketch of the schedule; it may change once the survey results on preferred topics come in.

Date Topic
Wednesday Sept 27 Introduction (covering: book introduction and chapter 1; lecture 1)
Monday Oct 2 Bioconductor, RR and Simulations (covering: book chapters 1 & 2; lectures 1 & 2)
Wednesday Oct 4 Simulations (covering: book chapters 1 & 2; lectures 1 & 2)
Monday Oct 9 Graphics (covering: book chapter 3; online lecture 3)
Wednesday Oct 11 Statistics, Mixture Models and Variance Stabilization (covering: book chapters 2 & 4; lecture 4)
Monday Oct 16 Clustering (covering: book chapter 5; lecture 5)
Wednesday Oct 18 Testing and RNA-Seq (covering: book chapters 6 & 8; lectures 6 - 8)
Monday Oct 23 Multivariate Analysis (covering: book chapter 7; lectures 9 & 10)
Wednesday Oct 25 Single-cell data with SingleCellExperiment
Monday Oct 30 RNA-Seq, single-cell data and GLMs
Wednesday Nov 1 Multi-domain, multitable, heterogeneous multi-omics data (book chapter 9)
Monday Nov 6 Networks, graphs and phylogenetic trees (book chapter 10)
Wednesday Nov 8 Embeddings and nonlinear pseudo-time manifold learning
Monday Nov 13 Microbial ecology; abundance testing
Wednesday Nov 15 Working with image data (book chapter 11)
Monday Nov 27 Supervised learning methods for heterogeneous data (book chapter 12)
Wednesday Nov 29 Experimental design and good analysis practice (book chapter 13), computational tools
Monday Dec 4 Reproducible Research

The book

The book for the course is available from Amazon and Cambridge University Press.

It is also available for free online at both Stanford and EMBL.

The data are all available together as a single large compressed tar file.