
Amir Bahmani, Ph.D.

Amir Bahmani is the Director of the Stanford Deep Data Research Computing Center (DDRCC), the Director of Science and Technology at the Stanford Healthcare Innovation Lab (SHIL), and a lecturer at Stanford University. He has been working on distributed and parallel computing applications since 2008.

Currently, Amir is an active researcher in the VA Million Veteran Program (MVP), the Human Tumor Atlas Network (HTAN), the Human BioMolecular Atlas Program (HuBMAP), the Stanford Metabolic Health Center (MHC), and Integrated Personal Omics Profiling (iPOP).

Please note that we have open positions at the Center. We offer internship opportunities for talented individuals. Contact Amir for details.



Research Interests
  • Computationally Intensive Medical Applications and Cloud Computing
  • In-Situ Data Analysis of HPC Applications
  • Data Privacy in Medical Applications
  • High Performance Machine Learning
  • Database Management Systems
  • Pervasive and Ubiquitous Computing

  • Stanford Data Ocean is the first serverless precision medicine educational platform for people of all experience levels to explore important questions. The platform requires zero provisioning or administration, so individuals and organizations can efficiently allocate resources to experiment with code and scale innovative solutions. December 2020 - present.


  • The large amount of biomedical data derived from wearable sensors, electronic health records, and molecular profiling (e.g., genomics data) is rapidly transforming our healthcare systems. The increasing scale and scope of biomedical data not only generates enormous opportunities for improving health outcomes but also raises new challenges, ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, we developed the Personal Health Dashboard (PHD), which uses state-of-the-art security and scalability technologies to provide an end-to-end solution for big biomedical data analytics. December 2017 - present.


  • Genomic data analysis across multiple cloud platforms is an ongoing challenge, especially when large amounts of data are involved. Here, we present Swarm, a framework for federated computation that promotes minimal data motion and facilitates crosstalk between genomic datasets stored on various cloud platforms. We demonstrate its utility via common inquiries of genomic variants across BigQuery on the Google Cloud Platform (GCP), Athena on Amazon Web Services (AWS), Apache Presto, and MySQL. Compared to single-cloud platforms, the Swarm framework significantly reduced computational costs, run-time delays, and the risks of security breaches and privacy violations. Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with the VA's Million Veteran Program), Winter 2019 - present.


  • A major drawback of executing genomic applications on cloud computing facilities is the lack of tools to predict which instance type is the most appropriate, often resulting in an over- or under-matching of resources. Determining the right configuration before actually running the applications saves money and time. Here, we introduce Hummingbird, a tool for predicting the performance of computing instances with varying memory and CPU on multiple cloud platforms. Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with the VA's Million Veteran Program), Winter 2017 - present.


  • The objective of this work was to create a cloud-based annotation engine that automatically annotates the user's VCF files and scales over the cloud. - Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with the VA's Million Veteran Program and Google Genomics), Summer 2016 - present.


  • ScalaTrace is an MPI tracing toolset that provides communication traces that are orders of magnitude smaller, if not near-constant in size, regardless of the number of nodes, while preserving structural information. By combining intra- and inter-node compression techniques for MPI events, the trace tool extracts an application's communication structure. A replay tool allows communication events recorded by our trace tool to be issued in an order-preserving manner without running the original application code. - NCSU, Spring 2013 - 2017.



  • The objective of this work is to create a software framework for highly parallel analytics of medical big data in the cloud. Our long-term idea is to take patient data as it becomes available during MRI imaging and DNA testing, and consult existing medical databases to uncover potential data correlations that imply specific diseases. - NCSU, Duke, UNC, Spring 2016.


  • Worked as a software consultant (intern) at Illumina Inc. The overarching objective of the project was to develop a system that performs literature search indexing for Illumina research products automatically and efficiently. Main tasks were 1) implementing the automation system, 2) regenerating ontology/dictionary files, and 3) improving the indexing process using Spark and Hadoop MapReduce. June 2015 - August 2015.
  • Worked as an HPC engineer (intern) at Impulonic Corporation. The company released a product for acoustic analysis called Acoustect SDK. This SDK contains two broad categories of acoustic simulation algorithms: ARD and GA. 1) Deployed Acoustect SDK on the Windows Azure and Amazon EC2 platforms, 2) adapted the existing C# / WPF front-end in the Acoustect SDK to create a desktop front-end that runs ARD on Azure and EC2, 3) provided an option in the front-end to launch multiple simulations on multiple compute nodes on Azure and EC2, and 4) deployed MPARD, a cluster-based version of ARD, on Azure. May 2014 - August 2014.
  • Worked as a research assistant and Java developer on the PERCEPOLIS project, whose overarching objective is to develop an educational cyberinfrastructure that facilitates resource sharing, collaboration, and personalized learning in higher education. We leverage advances in agent-based software engineering, databases, global information sharing processes, and pervasive computing to create this cyberinfrastructure. - Missouri S&T, Fall 2010 - Summer 2012.
  • System anomalies, such as performance bottlenecks, resource hotspots, and service level objective (SLO) violations, constitute major threats to large-scale hosting infrastructures. Handling such anomalies in a dynamic execution environment requires an adaptive anomaly management system. ALERT is a self-evolving, context-aware anomaly prediction scheme capable of raising alerts before an anomaly occurs so that the administrator or an automated anomaly prevention system can apply the necessary countermeasures. The current implementation of ALERT uses a decision tree (DT) based classification scheme. The effectiveness of ALERT's prediction model depends on the optimality of the DT. Learning an optimal decision tree is an NP-complete problem, so we replaced the DT with a Bayesian classifier scheme and tested our implementation on the Google App Engine and PlanetLab wide-area network system testbeds. Spring 2013.




  • David J. Florez Rodriguez, Electrical Engineering MS Candidate, Stanford University 2022
  • Ryan Park, Computer Science BS/MS Candidate, Stanford University 2022
  • Claire Muscat, Computer Science MS Candidate, Stanford University 2022
  • Nicholas Midler, Biomedical MS Candidate, Stanford University 2021
  • Peter Knowles, Psychology Undergraduate Student, Stanford University 2020
  • Camille Lauren Berry, Design Impact Engineering MS Candidate, Stanford University 2020
  • Jason Kenichi Li, Computer Science MS Candidate, Stanford University 2020
  • Diego Celli, Computer Science MS Candidate, Stanford University 2020
  • Sushil Upadhyayula, Computer Science MS Candidate, Stanford University, Spring 2020
  • Josh Payne, Computer Science MS Candidate, Stanford University, Spring 2020
  • Gregory Young, Computer Science MS Candidate, MIT, Spring 2020
  • Hoangminh Huynhnguyen, Computer Science Ph.D. Candidate, University of Illinois at Chicago, Summer 2019
  • Lek Tin, Computer Science MS Candidate, University of California, Riverside, Summer 2019
  • Audrey Haque, Design Engineering MS Candidate, Harvard University, Summer 2019
  • Kyle Ferriter, Computer Science MS Candidate, North Carolina State University, Summer 2019
  • Zhanfu Yang, Computer Science MS Candidate, Purdue University, Summer 2019
  • Arash Alavi, Computer Science Ph.D. Candidate, University of California, Riverside, Summer 2018
  • Utsab Ray, Computer Science Ph.D. Candidate, North Carolina State University, Summer 2018, 2019
  • Negin Forouzesh, Computer Science Ph.D. Candidate, Virginia Tech, Summer 2018
  • Ziye Xing, Computer Science MS, University of California, Los Angeles, Summer 2018

    Contact Me

    Stanford Center for Genomics and Personalized Medicine, Stanford Medicine, Stanford University, Palo Alto, CA, USA

    abahman [You know]

    amirbahmani [dot] h [You know] gmail!


    Amir's research activities are dedicated to the memory of his late wife, Someyra, who passed away from cancer in 2014.