4/11: Topic and date assignments posted in the schedule.
3/30: Please email the following information by Sunday 4/3 (11:59pm):
- Presentation preferences. Email the staff list a list of five of the selected topics and five times to present, both in order of preference.
- Whether you are taking the class for 2 or 3 units
- Lectures: Tues/Thurs 1:30-2:50pm in Clark Center, room S361 (enter by going through Peet's Coffee on the third floor)
- Alex Bishara (office hours: 12:30-1:30pm Tues and Thurs in Clark S260, email: ude.drofnats@arahsiba)
The course will consist primarily of student presentations on topics in selected applications. Presentations will be prepared with the help of the instructor. Students will help form the class content by choosing the topics they would like to present. This course offers a great opportunity to explore cutting-edge research across the field of computational biology, to critically read and discuss recent research, and to practice presentation skills.
Selected Topics and Papers
- Meet with Serafim at least two weeks before presenting to discuss the papers and to outline the presentation. Make sure to read the papers carefully before the meeting.
- Meet with one of the TAs at least one week before presenting to discuss your presentation outline.
- Meet with one of the TAs at least two weekdays before presenting for feedback on a completed set of slides.
- The presentation should be approximately 40 minutes in length plus discussion.
- Please send the slides in PDF format to the TAs on the day of your presentation. If you would like to present from PowerPoint, please send the slides in .ppt or .pptx format as well.
- Choose one paper from the assigned topic section to critique.
- The critique should be 2 to 3 pages long (single-spaced, using a 12pt font and standard page setup). Please send it in PDF format.
- The assignment must be submitted to the TAs before the topic is presented in class.
- Write summary one on a topic presented on or before 5/3. Due Tuesday, 5/3 (11:59pm).
- Write summary two on a topic presented after 5/3. Due Tuesday, 5/31 (11:59pm).
- The summary should be 1 page long (single-spaced, using a 12pt font and standard page setup). Please send it in PDF format.
- You do not need to sign up for summaries; you just need to turn in each summary before its due date.
- Write each summary on one paper related to a topic on the schedule. The summaries cannot be on one of the papers presented in class and cannot be on a topic that you are presenting or critiquing.
- Please submit this assignment to the TAs in PDF format, with the filename formatted as lastname.topicnumber.pdf. Also, please attach a copy of the paper itself to the email.
- One presentation, one critique and two summaries
- Two presentations (if enough slots available)
- One presentation and one critique
- One presentation and two summaries
Attendance is required for all students in the class (only two unexcused absences are allowed).
Honor Code: The Stanford Honor Code applies to every document you submit for this class. In particular, please be careful not to plagiarize the papers you are presenting or summarizing. Make sure that you always correctly cite your sources (for example, when you show figures or use illustrations in your presentations). Summaries and critiques must be written in your own words, not by copying text from the respective papers. If you need to cite text verbatim from some source, always put it in quotes and mention the source. If you have any questions about this, or in case of doubt, please contact the TA.
SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing.
Bankevich et al. Journal of Computational Biology, 2012
ExSPAnder: a universal repeat resolver for DNA fragment assembly.
Prjibelski et al. Bioinformatics, 2014
Efficient de novo assembly of large genomes using compressed data structures.
Simpson et al. Genome Research, 2012
Velvet: Algorithms for de novo short read assembly using de Bruijn graphs.
Zerbino et al. Genome Research, 2008
How to apply de Bruijn graphs to genome assembly.
Compeau et al. Nature Biotechnology, 2011
IDBA -- A Practical Iterative de Bruijn Graph De Novo Assembler.
Peng et al. RECOMB, 2010
De novo assembly of a haplotype-resolved human genome.
Cao et al. Nature Biotechnology, 2015
Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing.
Berlin et al. Nature Biotechnology, 2014
De novo sequencing and variant calling with nanopores using PoreSeq.
Szalay et al. Nature Biotechnology, 2015
A complete bacterial genome assembled de novo using only nanopore sequencing data.
Loman et al. Nature Methods, 2015
Assembly and diploid architecture of an individual human genome via single-molecule technologies.
Pendleton et al. Nature Methods, 2015
Hybrid error correction and de novo assembly of single-molecule sequencing reads.
Koren et al. Nature Biotechnology, 2012
Single haplotype assembly of the human genome from a hydatidiform mole.
Steinberg et al. Genome Research, 2014
Haplotype-resolved genome sequencing of a Gujarati Indian individual.
Kitzman et al. Nature Biotechnology, 2011
Accurate whole-genome sequencing and haplotyping from 10 to 20 human cells.
Peters et al. Nature, 2012
Whole-genome haplotyping using long reads and statistical methods.
Kuleshov et al. Nature Biotechnology, 2014
Haplotyping germline and cancer genomes with high-throughput linked-read sequencing.
Zheng et al. Nature Biotechnology, 2016
The genome sequence of the colonial chordate, Botryllus schlosseri.
Voskoboynik et al. eLife, 2013
Illumina TruSeq Synthetic Long-Reads Empower De Novo Assembly and Resolve Complex, Highly-Repetitive Transposable Elements.
McCoy et al. PLOS ONE, 2014
Read clouds uncover variation in complex regions of the human genome.
Bishara et al. Genome Research, 2015
Targeted sequencing by proximity ligation for comprehensive variant detection and local haplotyping.
de Vree et al. Nature Biotechnology, 2014
Synthetic long-read sequencing reveals intraspecies diversity in the human microbiome.
Kuleshov et al. Nature Biotechnology, 2016
Accurate, multi-kb reads resolve complex populations and detect rare microorganisms.
Sharon et al. Genome Research, 2015
Basic Local Alignment Search Tool.
Altschul et al. Journal of Molecular Biology, 1990
Fast and accurate short read alignment with Burrows-Wheeler transform.
Li et al. Bioinformatics, 2009
Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM.
Li et al. arXiv, 2013
Fast gapped-read alignment with Bowtie 2.
Langmead et al. Nature Methods, 2012
Improved genome inference in the MHC using a population reference graph.
Dilthey et al. Nature Genetics, 2014
De novo assembly and genotyping of variants using colored de Bruijn graphs.
Iqbal et al. Nature Genetics, 2012
TopHat: discovering splice junctions with RNA-Seq.
Trapnell et al. Bioinformatics, 2009
Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation.
Trapnell et al. Nature Biotechnology, 2010
Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events.
Tilgner et al. Nature Biotechnology, 2015
Genome-guided transcript assembly by integrative analysis of RNA sequence data.
Boley et al. Nature Biotechnology, 2014
Near-optimal RNA-Seq quantification.
Bray et al. arXiv, 2015
Using populations of human and microbial genomes for organism detection in metagenomes.
Ames et al. Genome Research, 2015
Kraken: ultrafast metagenomic sequence classification using exact alignments.
Wood et al. Genome Biology, 2014
Scalable metagenomic taxonomy classification using a reference genome database.
Ames et al. Bioinformatics, 2013
Binning metagenomic contigs by coverage and composition.
Alneberg et al. Nature Methods, 2014
Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes.
Albertsen et al. Nature Biotechnology, 2013
Detection of low-abundance bacterial strains in metagenomic datasets by eigengenome partitioning.
Cleary et al. Nature Biotechnology, 2015
IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth.
Peng et al. Bioinformatics, 2012
Accurate, multi-kb reads resolve complex populations and detect rare microorganisms.
Sharon et al. Genome Research, 2015
Synthetic long-read sequencing reveals intraspecies diversity in the human microbiome.
Kuleshov et al. Nature Biotechnology, 2016
Identification and assembly of genomes and genetic elements in complex metagenomic samples without using reference genomes.
Nielsen et al. Nature Biotechnology, 2014
Species-level deconvolution of metagenome assemblies with Hi-C-based contact probability maps.
Burton et al. G3, 2014
Compressive Genomics.
Loh et al. Nature Biotechnology, 2012
Compressive Mapping for Next-Generation Sequencing.
Yorukoglu et al. Nature Biotechnology, 2016
A unified mixed-model method for association mapping that accounts for multiple levels of relatedness.
Yu et al. Nature Genetics, 2005
Genome-wide efficient mixed-model analysis for association studies.
Zhou et al. Nature Genetics, 2012
FaST linear mixed models for genome-wide association studies.
Lippert et al. Nature Methods, 2011
A mixed-model approach for genome-wide association studies of correlated traits in structured populations.
Korte et al. Nature Genetics, 2012
Efficient multivariate linear mixed model algorithms for genome-wide association studies.
Zhou et al. Nature Methods, 2014
Accurate non-parametric estimation of recent effective population size from segments of identity by descent.
Browning et al. American Journal of Human Genetics, 2015
Parente2: a fast and accurate method for detecting identity by descent.
Rodriguez et al. Genome Research, 2015
Improving the accuracy and efficiency of identity-by-descent detection in population data.
Browning et al. Genetics, 2013
High-resolution detection of identity by descent in unrelated individuals.
Browning et al. American Journal of Human Genetics, 2010
Genes mirror geography within Europe.
Novembre et al. Nature, 2008
Inferring parental genomic ancestries using pooled semi-Markov processes.
Zou et al. Bioinformatics, 2015
RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference.
Maples et al. American Journal of Human Genetics, 2013
Sensitive Detection of Chromosomal Segments of Distinct Ancestry in Admixed Populations.
Price et al. PLOS Genetics, 2009
Fast and accurate inference of local ancestry in Latino populations.
Baran et al. Bioinformatics, 2012
A model-based approach for analysis of spatial structure in genetic data.
Yang et al. Nature Genetics, 2012
ZIFA: dimensionality reduction for zero-inflated single-cell gene expression analysis.
Pierson et al. Genome Biology, 2015
Computational analysis of cell-to-cell heterogeneity in single-cell RNA sequencing data reveals hidden subpopulations of cells.
Buettner et al. Nature Biotechnology, 2015
Spatial reconstruction of single-cell gene expression data.
Satija et al. Nature Biotechnology, 2015
A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies.
Howie et al. PLOS Genetics, 2009
Genotype Imputation with Millions of Reference Samples.
Browning et al. American Journal of Human Genetics, 2016
minimac2: faster genotype imputation.
Fuchsberger et al. Bioinformatics, 2014
Haplotype Estimation Using Sequencing Reads.
Delaneau et al. American Journal of Human Genetics, 2013
A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals.
Browning et al. American Journal of Human Genetics, 2009
Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.
Moncunill et al. Nature Biotechnology, 2014
Similarity network fusion for aggregating data types on a genomic scale.
Wang et al. Nature Methods, 2014
MuSiC: identifying mutational significance in cancer genomes.
Dees et al. Genome Research, 2012
Network-based stratification of tumor mutations.
Hofree et al. Nature Methods, 2013
Predicting effects of noncoding variants with deep learning–based sequence model.
Zhou et al. Nature Methods, 2015
Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.
Alipanahi et al. Nature Biotechnology, 2015
The human splicing code reveals new insights into the genetic determinants of disease.
Xiong et al. Science, 2015
- 3/29: Course Logistics and Introduction
- 3/31: Intro to Assembly I slides
- Functional Genomics, Christine Tataru
- Compressive Genomics, Richard Tang
- Single-cell RNA Sequencing, Yuan Xue
- Cancer, Mark Berger and
- RNA sequencing and transcript assembly/quantification, Nimit Jain
- Ancestry Inference, Kat Gregory
- Imputing Genotypes and Haplotype Phase, Nico Chaves
- Genome-Wide Association Studies with Mixed Models, Kai Kent
- Read Alignment and Resequencing, Lauren Ellis
- Metagenomics, Danielle Kain
- Functional Genomics (p2), Dana Wyman
- De novo Assembly II, Sheila Ramaswamy
5/10, 5/12: Guest Lecture TBA
5/17:
- Ancestry Inference (p2), Sebastian Le Bras
- Metagenomics (p2), Yuki Yoshiyasu
- Read Alignment and Resequencing (p2), Ponan Li
- IBD, Pavitra Rengarajan
- Cancer (p2), Shirin Sadri
- De novo Assembly III, Alec Tarashansky
- Genome-Wide Association Studies with Mixed Models (p2), Kapil Kanagal
- Imputing Genotypes and Haplotype Phase (p2), Joe Wan
- Epidemiology, Russ Islam
- Gene regulatory network inference, Russ Islam