News

August 2022
[Article] Deep Data Needs and Precision Health, Inside Precision Medicine 9.4 (2022): 44-46.
June 2022
[Video] Deep Data Needs and Challenges in Precision Medicine at Precision Medicine World Conference (PMWC) 2022 Silicon Valley.
October 2021
[Video] All-In-One: Secure, Scalable, Intelligent Solutions for Precision Medicine at Stanford CyberFest 2021.
May 2021
[Video] Interview with Eric Schmidt (Former CEO of Google) and Mark Russinovich (CTO of Microsoft Azure) on The Future of Cloud Computing and Its Impact on Healthcare Applications.

GENE222/BMI222/CS273C: Cloud Computing for Biology and Healthcare

[Swarm Tool] Our paper Swarm: A Federated Cloud Framework for Large-scale Variant Analysis published in PLOS Computational Biology.
March 2021
[Hummingbird Tool] Our paper Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud published in Bioinformatics.
December 2020
[MyPHD App iOS/Android] Our team launched the first pre-symptomatic alerting system for infectious diseases using smartwatch data [COVID-19 Wearables Study].

[Behind the Paper, COVID-19] Early Detection of COVID-19 at Scale Using Wearables.

Research

Research Interests

Computationally Intensive Medical Applications and Cloud Computing
In-Situ Data Analysis of HPC Applications
Data Privacy in Medical Applications
High Performance Machine Learning
Database Management Systems
Pervasive and Ubiquitous Computing

Stanford Data Ocean

Stanford Data Ocean is the first serverless precision medicine educational platform for people of all experience levels to explore important questions. The platform requires zero provisioning or administration, so individuals and organizations can efficiently allocate resources to experiment with code and scale innovative solutions. December 2020 - present.
SDO
Personal Health Dashboard

The large amount of biomedical data derived from wearable sensors, electronic health records, and molecular profiling (e.g., genomics data) is rapidly transforming our healthcare systems. The increasing scale and scope of biomedical data not only generates enormous opportunities for improving health outcomes but also raises new challenges ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, we developed the Personal Health Dashboard (PHD), which utilizes state-of-the-art security and scalability technologies to provide an end-to-end solution for big biomedical data analytics. December 2017 - present.
MyPHD
Swarm: A Federated Cloud Framework for Large-scale Variant Analysis

Genomic data analysis across multiple cloud platforms is an ongoing challenge, especially when large amounts of data are involved. Here, we present Swarm, a framework for federated computation that promotes minimal data motion and facilitates crosstalk between genomic datasets stored on various cloud platforms. We demonstrate its utility via common inquiries of genomic variants across BigQuery in the Google Cloud Platform (GCP), Athena in Amazon Web Services (AWS), Apache Presto, and MySQL. Compared to single-cloud platforms, the Swarm framework significantly reduced computational costs, run-time delays, and the risk of security breaches and privacy violations. Stanford Center for Genomics and Personalized Medicine (SCGPM) (In collaboration with VA's Million Veteran Program), Winter 2019 - present.
Swarm
Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud

A major drawback of executing genomic applications on cloud computing facilities is the lack of tools to predict which instance type is the most appropriate, often resulting in an over- or under-matching of resources. Determining the right configuration before actually running the applications will save money and time. Here, we introduce Hummingbird, a tool for predicting performance of computing instances with varying memory and CPU on multiple cloud platforms. Stanford Center for Genomics and Personalized Medicine (SCGPM) (In collaboration with VA's Million Veteran Program), Winter 2017 - present.
Hummingbird
AnnotationHive: Design and Implementation of a Cloud-based Annotation Engine - (Java/Google Dataflow/Google Genomics)

The objective of this work was to create a cloud-based annotation engine that automatically annotates the user's VCF files and scales in the cloud. - Stanford Center for Genomics and Personalized Medicine (SCGPM) (In collaboration with VA's Million Veteran Program and Google Genomics), Summer 2016 - present.
AnnotationHive
ScalaJack and ScalaTrace - (C/C++/MPI)

ScalaTrace is an MPI tracing toolset that provides orders of magnitude smaller, if not near-constant size, communication traces regardless of the number of nodes while preserving structural information. Combining intra- and inter-node compression techniques of MPI events, the trace tool extracts an application's communication structure. A replay tool allows communication events recorded by our trace tool to be issued in an order-preserving manner without running the original application code. - NCSU, Spring 2013 - 2017.
ScalaJack

ScalaTrace
ElasticMedFlow: Design and Implementation of a Scalable, Adaptable Multistage Pipeline for Medical Applications - (Python/Scala/Apache Spark/C/C++/MPI)

The objective of this work is to create a software framework for highly parallel analytics of medical big data in the cloud. Our long-term idea is to take patient data as it becomes available during MRI imaging as well as DNA testing and consult existing medical databases to uncover potential data correlations that imply specific diseases. - NCSU, Duke, UNC, Spring 2016.
ElasticMedFlow
Automation of Literature Search Indexing for NextBio Research - (Java/SQL/Shell Scripting/Apache Spark)

Worked as a software consultant (Intern) at Illumina, Inc. The overarching objective of the project was to develop a system that performs literature search indexing for the Illumina research product automatically and efficiently. Main tasks were 1) implementing the automation system, 2) regenerating ontology/dictionary files, and 3) improving the indexing process using Spark and Hadoop MapReduce. June 2015 - August 2015.
Cloud-Based Acoustect SDK - (C#/C++/MPI/Shell Scripting)

Worked as an HPC engineer (Intern) at Impulonic Corporation. The company released a product for acoustic analysis, called Acoustect SDK. This SDK contains two broad categories of acoustic simulation algorithms: ARD and GA. Main tasks were 1) deploying Acoustect SDK on the Windows Azure and Amazon EC2 platforms, 2) adapting the existing C# / WPF front-end in the Acoustect SDK to create a desktop front-end that runs ARD on Azure and EC2, 3) providing an option in the front-end to launch multiple simulations on multiple compute nodes on Azure and EC2, and 4) deploying MPARD, a cluster-based version of ARD, on Azure. May 2014 - August 2014.
Pervasive Cyberinfrastructure for Personalized Learning and Instructional Support (PERCEPOLIS) - (Java/JADE)

Worked as a research assistant and Java developer on the PERCEPOLIS project, the overarching objective of which is to develop an educational cyberinfrastructure that facilitates resource sharing, collaboration, and personalized learning in higher education. We leverage advances in agent-based software engineering, databases, global information sharing processes, and pervasive computing to create this cyberinfrastructure. - Missouri S&T, Fall 2010 - Summer 2012.
Context-Aware Anomaly Prediction Using Bayesian Classifiers - (Python/Google App Engine)

System anomalies, such as performance bottlenecks, resource hotspots, and service level objective (SLO) violations, constitute major threats to large-scale hosting infrastructures. Handling such anomalies in a dynamic execution environment requires an adaptive anomaly management system. ALERT is a self-evolving, context-aware anomaly prediction scheme capable of raising alerts before an anomaly occurs so that the administrator or an automated anomaly prevention system can apply the necessary countermeasures. The current implementation of ALERT uses a decision tree (DT) based classification scheme. The effectiveness of ALERT's prediction model depends on the optimality of the DT. Learning an optimal decision tree is an NP-complete problem, so we replaced the DT with a Bayesian classifier scheme and tested our implementation on the Google App Engine and PlanetLab wide-area network system testbeds. Spring 2013.
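
As a rough illustration of the Bayesian classification idea mentioned above, the following is a minimal Gaussian naive Bayes sketch in Python. It is not the ALERT implementation and does not model ALERT's context awareness; the class name, the two metrics (CPU and memory utilization), and the sample values are all hypothetical and chosen only to show how such a classifier can label a metric sample as "normal" or "anomaly" without building a decision tree.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Tiny Gaussian naive Bayes classifier (illustrative sketch only)."""

    def fit(self, samples, labels):
        # Group the training rows by class label.
        grouped = defaultdict(list)
        for features, label in zip(samples, labels):
            grouped[label].append(features)

        total = len(samples)
        self.priors = {}
        self.stats = {}
        for label, rows in grouped.items():
            self.priors[label] = len(rows) / total
            per_feature = []
            # zip(*rows) iterates over feature columns.
            for column in zip(*rows):
                mean = sum(column) / len(column)
                var = sum((x - mean) ** 2 for x in column) / len(column)
                # Small constant avoids division by zero for constant features.
                per_feature.append((mean, var + 1e-9))
            self.stats[label] = per_feature
        return self

    def log_posterior(self, features, label):
        # log P(label) + sum of per-feature Gaussian log-likelihoods
        # (features are assumed conditionally independent given the label).
        score = math.log(self.priors[label])
        for x, (mean, var) in zip(features, self.stats[label]):
            score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return score

    def predict(self, features):
        # Pick the label with the highest (unnormalized) log-posterior.
        return max(self.priors, key=lambda label: self.log_posterior(features, label))


# Hypothetical training data: (CPU utilization %, memory utilization %).
train_x = [(20, 30), (25, 35), (30, 28), (85, 90), (90, 88), (95, 92)]
train_y = ["normal", "normal", "normal", "anomaly", "anomaly", "anomaly"]
model = GaussianNaiveBayes().fit(train_x, train_y)
```

Training here takes a single pass over the data to estimate per-class means and variances, which is one reason a Bayesian classifier can be attractive when searching for an optimal decision tree is too expensive.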

Publications

Updated list: Google Scholar
[Nature Med'22] A. Alavi, G. Bogu, M. Wang, E. Rangan, A. Brooks, Q. Wang, E. Higgs, A. Celli, T. Mishra, A. Metwally, K. Cha, P. Knowles, A. Alavi, R. Bhasin, S. Panchamukhi, D. Celis, T. Aditya, A. Honkala, B. Rolnik, E. Hunting, O. Dagan-Rosenfeld, A. Chauhan, J. Li, C. Bejikian, V. Krishnan, L. McGuire, X. Li, A. Bahmani**, M. Snyder** "Real-time Alerting System for COVID-19 and other Stress Events using Wearable Data", Nature Medicine, November 2021.
[Nature Comm'21] A. Bahmani*, A. Alavi*, T. Buergel*, S. Upadhyayula, Q. Wang, S. Ananthakrishnan, A. Alavi, D. Celis, D. Gillespie, G. Young, Z. Xing, M. Nguyen, A. Haque, A. Mathur, J. Payne, G. Mazaheri, J. Li, P. Kotipalli, L. Liao, R. Bhasin, K. Cha, B. Rolnik, A. Celli, O. Dagan-Rosenfeld, E. Higgs, W. Zhou, C. Berry, K. Winkle, K. Contrepois, U. Ray, K. Bettinger, S. Datta, X. Li, M. Snyder "A Scalable, Secure, and Interoperable Platform for Deep Data-driven Health Management", Nature Communications, October 2021.
[Nature Med'21] J. Dunn, L. Kidzinski, R. Runge, D. Witt, J. Hicks, S. Rose, X. Li, A. Bahmani, S. Delp, T. Hastie, M. Snyder "Wearable Sensors Enable Personalized Predictions of Clinical Laboratory Measurements", Nature Medicine, May 2021.
[PLOS Computational Biology'21] A. Bahmani*, K. Ferriter*, V. Krishnan, AR. Alavi, AM. Alavi, P. Tsao, M. Snyder, C. Pan "Swarm: A Federated Cloud Framework for Large-scale Variant Analysis", PLOS Computational Biology 17(5): e1008977, May 2021.
[Europe PMC'21] B. Keating, et al "Early Detection of SARS-CoV-2 and other Infections in Solid Organ Transplant Recipients and Household Members using Wearable Devices", Transplant International: Official Journal of the European Society for Organ Transplantation, March 2021.
[Bioinformatics'21] A. Bahmani*, Z. Xing*, V. Krishnan*, U. Ray*, F. Mueller, A. Alavi, P. Tsao, M. Snyder, and C. Pan "Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud", Bioinformatics, March 2021.
[BIBM'20] K. Ferriter, F. Mueller, A. Bahmani, C. Pan "VCFC: Structural and Semantic Compression and Indexing of Genetic Variant Data", IEEE International Conference on Bioinformatics and Biomedicine (BIBM), December 2020. (9.4% acceptance rate)
[Nature Biomed'20] T. Mishra*, M. Wang*, A. Metwally*, G. Bogu*, A. Brooks*, A. Bahmani*, A. Alavi*, A. Celli, E. Higgs, O. Dagan-Rosenfeld, B. Fay, S. Kirkpatrick, R. Kellogg, M. Gibson, T. Wang, E. Hunting, P. Mamic, A. Ganz, B. Rolnik, X. Li, M. Snyder, "Pre-symptomatic detection of COVID-19 from smartwatch data", Nature Biomedical Engineering, November 2020.
[Cell'20] K. Contrepois, et al "Molecular Choreography of Acute Exercise", Cell, May 2020.
[Cell'20] O. Rozenblatt-Rosen, et al "The Human Tumor Atlas Network: charting tumor transitions across space and time at single-cell resolution", Cell, April 2020.
[Nature'19] S. Lin, et al (HuBMAP Consortium) "The human body at cellular resolution: the NIH Human Biomolecular Atlas Program", Nature, October 2019.
[Nature'19] W. Zhou, et al "Longitudinal Multi-Omics of Host-Microbe Dynamics in Prediabetes", Nature, May 2019.
[IPDPS'18] A. Bahmani, F. Mueller "Chameleon: Online Clustering Of MPI Program Traces", The 32nd IEEE International Parallel and Distributed Processing Symposium, Vancouver, Canada, May 21-25, 2018. (24.5% acceptance rate).
[JPDC'17] A. Bahmani, F. Mueller "Scalable Communication Event Tracing via Clustering", Journal of Parallel and Distributed Computing, March 2017.
[JPDC'16] A. Bahmani, F. Mueller "Efficient Clustering for Ultra-Scale Application Tracing", Journal of Parallel and Distributed Computing, August 2016.
[HICOMB'16] A. Bahmani, A. Sibley, M. Parsian, K. Owzar, F. Mueller "Sparkscore: Leveraging Apache Spark for Distributed Genomic Inference", The 15th IEEE International Workshop on High Performance Computational Biology (IPDPS'16), May 2016.
[Big Data'15] A. Bahmani, F. Mueller "ACURDION: An Adaptive Clustering-Based Algorithm For Tracing Large-Scale MPI Applications", 2015 IEEE International Conference on Big Data, Nov 2015. (18% acceptance rate).
[ICS'14] A. Bahmani, F. Mueller "Scalable Tracing Of MPI Programs Through Signature-Based Clustering Algorithms", The 28th International Conference on Supercomputing, June 2014. (21% acceptance rate).
[DEXA'12] A. Bahmani, S. Sedigh, A. Hurson "Ontology-Based Recommendation Algorithms For Personalized Education", The 23rd International Conference on Database and Expert Systems Applications, LNCS, September 2012, Austria.
[FIE'11] A. Bahmani, S. Sedigh, A. Hurson "Context-Aware Recommendation Algorithms For The PERCEPOLIS Personalized Education Platform", The 41st ASEE/IEEE Frontiers in Education Conference, F4E-1-F4E-6, October 12 - 15, 2011, Rapid City, South Dakota, USA.
[IEEE CCECE'08] A. Bahmani, M. Naghibzadeh, B. Bahmani "Automatic Database Normalization and Primary Key Generation", The 21st Canadian Conference on Electrical and Computer Engineering, May 4 - 7, 2008, Ontario, Canada.

* shared first authorship
** shared senior authorship

Interns

David J. Florez Rodriguez, Electrical Engineering MS Candidate, Stanford University 2022

Ryan Park, Computer Science BS/MS Candidate, Stanford University 2022

Claire Muscat, Computer Science MS Candidate, Stanford University 2022

Nicholas Midler, Biomedical MS Candidate, Stanford University 2021

Peter Knowles, Psychology Undergraduate Student, Stanford University 2020

Camille Lauren Berry, Design Impact Engineering MS Candidate, Stanford University 2020

Jason Kenichi Li, Computer Science MS Candidate, Stanford University 2020

Diego Celli, Computer Science MS Candidate, Stanford University 2020

Sushil Upadhyayula, Computer Science MS Candidate, Stanford University, Spring 2020

Josh Payne, Computer Science MS Candidate, Stanford University, Spring 2020

Gregory Young, Computer Science MS Candidate, MIT, Spring 2020

Hoangminh Huynhnguyen, Computer Science Ph.D. Candidate, University of Illinois at Chicago, Summer 2019

Lek Tin, Computer Science MS Candidate, University of California, Riverside, Summer 2019

Audrey Haque, Design Engineering MS Candidate, Harvard University, Summer 2019

Kyle Ferriter, Computer Science MS Candidate, North Carolina State University, Summer 2019

Zhanfu Yang, Computer Science MS Candidate, Purdue University, Summer 2019

Arash Alavi, Computer Science Ph.D. Candidate, University of California, Riverside, Summer 2018

Utsab Ray, Computer Science Ph.D. Candidate, North Carolina State University, Summer 2018, 2019

Negin Forouzesh, Computer Science Ph.D. Candidate, Virginia Tech, Summer 2018

Ziye Xing, Computer Science MS, University of California, Los Angeles, Summer 2018

Education

Ph.D. 2014 - 2017
NC State University, Raleigh, NC, USA

Computer Science

Research Title: ScalaJack: Scalable Trace-Based Tools for In-Situ Data Analysis of HPC Applications

Supervisor: Prof. Frank Mueller
M.Sc. 2012 - 2014
NC State University, Raleigh, NC, USA

Computer Science

Graduate Research and Teaching Assistant
M.Sc. 2010 - 2012
Missouri University of Science and Technology, Rolla, MO, USA

Computer Science

Graduate Research Assistant, Teaching Assistant and Instructor
B.Sc. 2004 - 2009
Azad University of Mashhad, Mashhad, Iran

Computer Software Engineering

Thesis Title: Automatic Database Normalization and Primary Key Generation

Advisor: Prof. Mahmoud Naghibzadeh

Contact Me

Stanford Center for Genomics and Personalized Medicine, Stanford Medicine, Stanford University, Palo Alto, CA, USA

https://web.stanford.edu/~abahman/

abahman [You know] stanford.edu

amirbahmani [dot] h [You know] gmail!

Personal

Amir's research activities are dedicated to the memory of his wife Someyra, who passed away due to cancer in 2014.