Francisco M. De La Vega, D.Sc.

Adjunct Professor in Biomedical Data Science

1265 Welch Road MC5464, MSOB West Wing, Third Floor



Twitter: @ribozyme


I am a geneticist and computational biologist interested in the applications of genomics in research and healthcare, with extensive industry experience leading complex technical projects that combine algorithms, genomic data, and software to deliver new genetic analysis and clinical diagnostic tools. I have also worked in many foundational public-private consortia in genomics, such as the 1000 Genomes Project, the Genome-in-a-Bottle consortium, and the ICGC Pan-Cancer Analysis of Whole Genomes, which developed tools and population-scale data that we are now applying in the clinic to implement precision medicine approaches. Challenges to the realization of this vision include the massive amounts of data, the management and semantic normalization of associated metadata, the complex relationships among the relevant data types, and the need to make the data easily accessible. The computational aspects of effectively collecting, representing, and analyzing vast omics data to elucidate the etiology of rare and common genetic diseases have been a major theme of my work. I am currently interested in the application of genome sequencing data in complex trait genetics, population genetics, genetic epidemiology, cancer genomics, precision medicine, and clinical diagnostics.


BIODS 235: Best practices for developing data science software for clinical and healthcare applications.

A new seminar-style course that aims to provide an overview of the strategies, processes, and regulatory hurdles involved in developing software that implements new algorithms or analytical approaches for use in clinical diagnosis or medical practice. Planned for Winter 2021, it will be open to graduate students across Stanford and will combine short lectures, guest industry speakers, and workshop sessions.


Adjunct Professor, Dept. of Biomedical Data Science, Stanford School of Medicine, Stanford, CA, USA, January 2016-Present.

Vice President, Hereditary Disease, Tempus Labs, Redwood City, CA. 11/2020-present.

Chief Scientific Officer, Head of Product R&D, Fabric Genomics, Inc., Oakland, CA. 11/2018-10/2020.

Chief Bioinformatics Officer, TOMA Biosciences, Inc. Foster City, CA. 9/2015-11/2017.

Consulting Professor, Dept. of Genetics, Stanford School of Medicine, Stanford, CA, USA, January-December, 2015.

Chief Scientific Officer, Annai Systems, Inc., Burlingame, CA, USA. 2/2014-9/2015.

Vice President, Genome Science. Real Time Genomics, Inc., San Francisco, CA, USA. 8/2012-2013.

Distinguished Scientific Fellow & Vice President, Advance Genomics Research. Applied Biosystems, Foster City, CA, USA, 1998-2010.

Assistant Professor. Department of Genetics. CINVESTAV-IPN, Mexico City, Mexico, 1989-1998.


Editorial Board Member, NAR Genomics and Bioinformatics, Oxford University Press. 2019-present.

External Advisory Board Member, ImmPort Bioinformatics Integration Support Contract (NIAID-NIH). 2013-present.

Steering Committee Member, Genome-in-a-Bottle consortium, NIST. 2014-present.

Steering Committee Member, High Throughput Sequencing algorithms (HiTSeq) - Community of Special Interest (COSI) of the ISCB. 2011-present.

Member of the Board of Directors, International Society for Computational Biology (ISCB). 2016-2018.

Steering Committee Member, 1000 Genomes Project Consortium. 2008-2010.


Doctor of Science (D.Sc.) Genetics and Molecular Biology, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), Mexico City, Mexico. 2000.

Master of Science (M.Sc.) Pharmacology and Toxicology, Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV-IPN), Mexico City, Mexico. 1989.

Biologist (B.Sc.) Experimental Biology, Metropolitan Autonomous University (UAM), Mexico City, Mexico. 1986.

MIT Sloan Executive Certificate in Management and Leadership. Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA. 2007.


Senior Member, International Society for Computational Biology, 2017.

Inaugural Inductee, I^2 Society (Invention x Innovation), Life Technologies, 2009.

Bio-IT World's Best Practices Award, Basic Research category, Boston, MA, 2008.

DNA (Demonstrated Noteworthy Achievement) Award. Applera Corporation, 2002.

Candidate to National Researcher, National System of Researchers, Mexico, 1992.

Wood-Whelan Research Fellowship, International Union of Biochemistry, 1990.

Selected Publications (Google Scholar Citations)

ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature. 2020 Feb;578(7793):82-93.

Zook JM, McDaniel J, Parikh H, Heaton H, Irvine SA, Trigg L, Truty R, McLean CY, De La Vega FM, Xiao C, Sherry S, Salit M, Genome in a Bottle Consortium. Reproducible integration of multiple sequencing datasets to form high-confidence SNP, indel, and reference calls for five human genome reference materials. Nat Biotechnol. 2019 Apr. doi: 10.1038/s41587-019-0074-6.

Krusche P, Trigg L, Boutros PC, Mason CE, De La Vega FM, Moore BL, Gonzalez-Porta M, Eberle MA, Tezak Z, Lababidi S, Truty R, Asimenos G, Funke B, Fleharty M, Salit M, Zook JM, and Global Alliance for Genomics and Health Benchmarking Team. Best practices for benchmarking germline small variant calls in human genomes. Nat Biotechnol. 2019 Apr. doi: 10.1038/s41587-019-0054-x.

De La Vega FM, Bustamante CD. Polygenic risk scores: a biased prediction? Research Highlight. Genome Med. 2018 Dec 27;10(1):100.

So AP, Vilborg A, Bouhlal Y, Koehler RT, Grimes SM, Pouliot Y, Mendoza D, Ziegle J, Stein J, Goodsaid F, Lucero MY, De La Vega FM, Ji HP. A robust targeted sequencing approach for low input and variable quality DNA from clinical samples. NPJ Genom Med. 2018 Jan 15;3:2. doi: 10.1038/s41525-017-0041-4.

1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015 Oct 1;526(7571):68-74.

Shringarpure SS, Carroll A, De La Vega FM, Bustamante CD. Inexpensive and Highly Reproducible Cloud-Based Variant Calling of 2,535 Human Genomes. PLoS One. 2015 Jun 25;10(6):e0129277.

Cleary JG, Braithwaite R, Gaastra K, Hilbush BS, Inglis S, Irvine SA, Jackson A, Littin R, Nohzadeh-Malakshah S, Rathod M, Ware D, Trigg L, De La Vega FM. Joint variant and de novo mutation identification on pedigrees from high-throughput sequencing data. J Comput Biol. 2014 Jun;21(6):405-19.

Craig DW, O'Shaughnessy JA, Kiefer JA, Aldrich J, Sinari S, Moses TM, Wong S, Dinh J, Christoforides A, Blum JL, Aitelli CL, Osborne CR, Izatt T, Kurdoglu A, Baker A, Koeman J, Barbacioru C, Sakarya O, De La Vega FM, Siddiqui A, Hoang L, Billings PR, Salhia B, Tolcher AW, Trent JM, Mousses S, Von Hoff D, Carpten JD. Genome and transcriptome sequencing in prospective metastatic triple-negative breast cancer uncovers therapeutic vulnerabilities. Mol Cancer Ther. 2013 Jan;12(1):104-16.

1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012 Nov 1;491(7422):56-65.

Bustamante CD, Gonzalez-Burchard, E, and De La Vega FM. Genomics for the world. Commentary. Nature, 2011, 475(7355):163-5.

Kidd JM, Gravel S, Byrnes J, Moreno-Estrada A, Musharoff S, Bryc K, Degenhardt JD, Brisbin A, Sheth V, Chen R, McLaughlin SF, Peckham HE, Omberg L, Bormann Chung CA, Stanley S, Pearlstein K, Levandowsky E, Acevedo-Acevedo S, Auton A, Keinan A, Acuña-Alonzo V, Barquera-Lozano R, Canizales-Quinteros S, Eng C, Burchard EG, Russell A, Reynolds A, Clark AG, Reese MG, Lincoln SE, Butte AJ, De La Vega FM, Bustamante CD. Population genetic inference from personal genome data: impact of ancestry and admixture on human genomic variation. Am J Hum Genet. 2012 Oct 5;91(4):660-71.

1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010 Oct 28;467(7319):1061-73.

Leary RJ, Kinde I, Diehl F, Schmidt K, Clouser C, Duncan C, Antipova A, Lee C, McKernan K, De La Vega FM, Kinzler KW, Vogelstein B, Diaz LA Jr, Velculescu VE. Development of personalized tumor biomarkers using massively parallel sequencing. Science Translational Med. 2010 Feb 24;2(20):20ra14.

McKernan KJ, Peckham HE, Costa GL, McLaughlin SF, Fu Y, Tsung EF, Clouser CR, Duncan C, Ichikawa JK, Lee CC, Zhang Z, Ranade SS, Dimalanta ET, Hyland FC, Sokolsky TD, Zhang L, Sheridan A, Fu H, Hendrickson CL, Li B, Kotler L, Stuart JR, Malek JA, Manning JM, Antipova AA, Perez DS, Moore MP, Hayashibara KC, Lyons MR, Beaudoin RE, Coleman BE, Laptewicz MW, Sannicandro AE, Rhodes MD, Gottimukkala RK, Yang S, Bafna V, Bashir A, MacBride A, Alkan C, Kidd JM, Eichler EE, Reese MG, De La Vega FM, Blanchard AP. Sequence and structural variation in a human genome uncovered by short-read, massively parallel ligation sequencing using two-base encoding. Genome Research, 2009 Sep;19(9):1527-41.

Hampe J, Franke A, Rosenstiel P, Till A, Teuber M, Huse K, Albrecht M, Mayr G, De La Vega FM, Briggs J, Gunther S, Prescott NJ, Onnie CM, Hasler R, Sipos B, Folsch UR, Lengauer T, Platzer M, Mathew CG, Krawczak M, Schreiber S. (2007) A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genetics, 39(2):207-11.