[Swarm Tool] Our paper "Swarm: A Federated Cloud Framework for Large-scale Variant Analysis" was published in PLOS Computational Biology.
- March 2021
[Hummingbird Tool] Our paper "Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud" was published in Bioinformatics.
- December 2020
[Behind the Paper, COVID-19] Early Detection of COVID-19 at Scale Using Wearables.
- Computationally Intensive Medical Applications and Cloud Computing
- In-Situ Data Analysis of HPC Applications
- Data Privacy in Medical Applications
- High Performance Machine Learning
- Database Management Systems
- Pervasive and Ubiquitous Computing
Swarm: A Federated Cloud Framework for Large-scale Variant Analysis
Genomic data analysis across multiple cloud platforms is an ongoing challenge, especially when large amounts of data are involved. Here, we present Swarm, a framework for federated computation that promotes minimal data motion and facilitates crosstalk between genomic datasets stored on various cloud platforms. We demonstrate its utility via common inquiries of genomic variants across BigQuery on the Google Cloud Platform (GCP), Athena on Amazon Web Services (AWS), Apache Presto, and MySQL. Compared to single-cloud platforms, the Swarm framework significantly reduced computational costs, run-time delays, and the risks of security breaches and privacy violations. - Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with VA's Million Veteran Program), Winter 2019 - present.
Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud
A major drawback of executing genomic applications on cloud computing facilities is the lack of tools to predict which instance type is the most appropriate, which often results in over- or under-matching of resources. Determining the right configuration before actually running the applications saves money and time. Here, we introduce Hummingbird, a tool for predicting the performance of computing instances with varying memory and CPU on multiple cloud platforms. - Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with VA's Million Veteran Program), Winter 2017 - present.
AnnotationHive: Design and Implementation of a Cloud-based Annotation Engine - (Java/Google Dataflow/Google Genomics)
The objective of this work was to create a cloud-based annotation engine that automatically annotates users' VCF files and scales in the cloud. - Stanford Center for Genomics and Personalized Medicine (SCGPM) (in collaboration with VA's Million Veteran Program and Google Genomics), Summer 2016 - present.
ScalaJack and ScalaTrace - (C/C++/MPI)
ScalaTrace is an MPI tracing toolset that produces communication traces that are orders of magnitude smaller, if not near-constant in size, regardless of the number of nodes, while preserving structural information. By combining intra- and inter-node compression techniques for MPI events, the trace tool extracts an application's communication structure. A replay tool allows communication events recorded by our trace tool to be issued in an order-preserving manner without running the original application code. - NCSU, Spring 2013 - 2017.
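The idea of intra-node compression with order-preserving replay can be illustrated with a minimal sketch. It is not ScalaTrace's actual algorithm: it assumes a hypothetical, simplified event record (call name and peer rank) and uses plain run-length encoding as a stand-in, only to show why a loop that repeats the same MPI call many times need not grow the trace.

```python
from typing import Iterator, List, Tuple

# Hypothetical simplified event record: (MPI call name, peer rank).
# Real traces carry far more detail; this is an assumption for illustration.
Event = Tuple[str, int]


def compress_trace(events: List[Event]) -> List[Tuple[Event, int]]:
    """Run-length encode consecutive identical events as (event, count)."""
    compressed: List[Tuple[Event, int]] = []
    for event in events:
        if compressed and compressed[-1][0] == event:
            # Same event as the previous one: bump its repeat count.
            compressed[-1] = (event, compressed[-1][1] + 1)
        else:
            compressed.append((event, 1))
    return compressed


def replay_order(compressed: List[Tuple[Event, int]]) -> Iterator[Event]:
    """Expand the compressed trace back into the original event order."""
    for event, count in compressed:
        for _ in range(count):
            yield event


# A loop of 1000 sends, then 1000 receives, then one barrier.
trace = [("MPI_Send", 1)] * 1000 + [("MPI_Recv", 1)] * 1000 + [("MPI_Barrier", -1)]
compressed = compress_trace(trace)
print(len(trace), "events stored as", len(compressed), "entries")
```

In this sketch the compressed size depends on the number of distinct runs rather than on the iteration count, and replaying the entries reproduces the events in their original order without the application code.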
ElasticMedFlow: Design and Implementation of a Scalable, Adaptable Multistage Pipeline for Medical Applications - (Python/Scala/Apache Spark/C/C++/MPI)
The objective of this work is to create a software framework for highly parallel analytics of medical big data in the cloud. Our long-term idea is to take patient data as it becomes available during MRI imaging and DNA testing, and to consult existing medical databases to uncover potential data correlations that imply specific diseases. - NCSU, Duke, UNC, Spring 2016.
Automation of Literature Search Indexing for NextBio Research - (Java/SQL/Shell Scripting/Apache Spark)
Worked as a software consultant (Intern) at Illumina Inc. The overarching objective of the project was to develop a system that performs literature search indexing for an Illumina research product automatically and efficiently. Main tasks were 1) implementing the automation system, 2) regenerating ontology/dictionary files, and 3) improving the indexing process using Spark and Hadoop MapReduce. June 2015 - August 2015.
Cloud-Based Acoustect SDK - (C#/C++/MPI/Shell Scripting)
Worked as an HPC engineer (Intern) at Impulonic Corporation. The company released a product for acoustic analysis, called Acoustect SDK. This SDK contains two broad categories of acoustic simulation algorithms: ARD and GA. Main tasks were 1) deploying Acoustect SDK on the Windows Azure and Amazon EC2 platforms, 2) adapting the existing C# / WPF front-end in the Acoustect SDK to create a desktop front-end that runs ARD on Azure and EC2, 3) providing an option in the front-end to launch multiple simulations on multiple compute nodes on Azure and EC2, and 4) deploying MPARD, a cluster-based version of ARD, on Azure. May 2014 - August 2014.
Pervasive Cyberinfrastructure for Personalized Learning and Instructional Support (PERCEPOLIS) - (Java/JADE)
Worked as a research assistant and Java developer on the PERCEPOLIS project, the overarching objective of which is to develop an educational cyberinfrastructure that facilitates resource sharing, collaboration, and personalized learning in higher education. We leverage advances in agent-based software engineering, databases, global information sharing processes, and pervasive computing to create this cyberinfrastructure. - Missouri S&T, Fall 2010 - Summer 2012.
Context-Aware Anomaly Prediction Using Bayesian Classifiers - (Python/Google App Engine)
System anomalies, such as performance bottlenecks, resource hotspots, and service level objective (SLO) violations, constitute major threats to large-scale hosting infrastructures. Handling such anomalies in a dynamic execution environment requires an adaptive anomaly management system. ALERT is a self-evolving, context-aware anomaly prediction scheme capable of raising alerts before an anomaly occurs so that the administrator or an automated anomaly prevention system can apply the necessary countermeasures. The current implementation of ALERT uses a decision tree (DT) based classification scheme. The effectiveness of ALERT's prediction model depends on the optimality of the DT. Learning an optimal decision tree is an NP-complete problem, so we replaced the DT with a Bayesian classifier scheme and tested our implementation on the Google App Engine and PlanetLab wide-area network system testbeds. - Spring 2013.
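To make the Bayesian classification step concrete, here is a minimal sketch of a Gaussian naive Bayes classifier in plain Python. It is an illustration only, not the project's implementation: the class name, the two features (CPU and memory utilization), and the sample values are all hypothetical, and the source does not say which Bayesian variant was used.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian naive Bayes; a hypothetical stand-in for illustration."""

    def fit(self, X, y):
        # Group training rows by class label.
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.priors = {}
        self.stats = {}
        for label, rows in by_class.items():
            self.priors[label] = len(rows) / len(X)
            self.stats[label] = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small constant keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                self.stats[label].append((mean, var))
        return self

    def predict(self, row):
        # Pick the class with the highest log posterior (up to a constant).
        best_label, best_score = None, -math.inf
        for label, prior in self.priors.items():
            score = math.log(prior)
            for v, (mean, var) in zip(row, self.stats[label]):
                score += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical samples: [cpu_utilization, memory_utilization].
X = [[0.20, 0.30], [0.25, 0.35], [0.30, 0.28],
     [0.90, 0.85], [0.95, 0.92], [0.88, 0.80]]
y = ["normal", "normal", "normal", "anomaly", "anomaly", "anomaly"]

model = GaussianNaiveBayes().fit(X, y)
print(model.predict([0.92, 0.90]))
print(model.predict([0.22, 0.31]))
```

Unlike the search for an optimal decision tree, training here is a single pass that computes per-class means and variances, which is one reason such classifiers are cheap to retrain as system behavior changes.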
- [PLOS Computational Biology'21] A. Bahmani*, K. Ferriter*, V. Krishnan, AR. Alavi, AM. Alavi, P. Tsao, M. Snyder, C. Pan "Swarm: A Federated Cloud Framework for Large-scale Variant Analysis", PLOS Computational Biology 17(5): e1008977, May 2021.
- [Europe PMC'21] B. Keating, et al "Early Detection of SARS-CoV-2 and other Infections in Solid Organ Transplant Recipients and Household Members using Wearable Devices", Transplant International: Official Journal of the European Society for Organ Transplantation, March 2021.
- [Bioinformatics'21] A. Bahmani*, Z. Xing*, V. Krishnan*, U. Ray*, F. Mueller, A. Alavi, P. Tsao, M. Snyder, and C. Pan "Hummingbird: Efficient Performance Prediction for Executing Genomic Applications in the Cloud", Bioinformatics, March 2021.
- [BIBM'20] K. Ferriter, F. Mueller, A. Bahmani, C. Pan "VCFC: Structural and Semantic Compression and Indexing of Genetic Variant Data", IEEE International Conference on Bioinformatics and Biomedicine (BIBM), December 2020. (9.4% acceptance rate)
- [Nature’20] T. Mishra*, M. Wang*, A. Metwally*, G. Bogu*, A. Brooks*, A. Bahmani*, A. Alavi*, A. Celli, E. Higgs, O. Dagan-Rosenfeld, B. Fay, S. Kirkpatrick, R. Kellogg, M. Gibson, T. Wang, E. Hunting, P. Mamic, A. Ganz, B. Rolnik, X. Li, M. Snyder, "Pre-symptomatic detection of COVID-19 from smartwatch data", Nature Biomedical Engineering, November 2020.
- [Cell'20] K. Contrepois, et al "Molecular Choreography of Acute Exercise", Cell, May 2020.
- [Cell'20] O. Rozenblatt-Rosen, et al "The Human Tumor Atlas Network: charting tumor transitions across space and time at single-cell resolution", Cell, April 2020.
- [Nature'19] S. Lin, et al (HuBMAP Consortium) "The human body at cellular resolution: the NIH Human Biomolecular Atlas Program", Nature, October 2019.
- [Nature'19] W. Zhou, et al "Longitudinal Multi-Omics of Host-Microbe Dynamics in Prediabetes", Nature, May 2019.
- [IPDPS'18] A. Bahmani, F. Mueller "Chameleon: Online Clustering Of MPI Program Traces", The 32nd IEEE International Parallel and Distributed Processing Symposium, Vancouver, Canada, May 21-25, 2018. (24.5% acceptance rate).
- [JPDC'17] A. Bahmani, F. Mueller "Scalable Communication Event Tracing via Clustering", Journal of Parallel and Distributed Computing, March 2017.
- [JPDC'16] A. Bahmani, F. Mueller "Efficient Clustering for Ultra-Scale Application Tracing", Journal of Parallel and Distributed Computing, August 2016.
- [HICOMB'16] A. Bahmani, A. Sibley, M. Parsian, K. Owzar, F. Mueller "Sparkscore: Leveraging Apache Spark for Distributed Genomic Inference", The 15th IEEE International Workshop on High Performance Computational Biology (IPDPS'16), May 2016.
- [Big Data'15] A. Bahmani, F. Mueller "ACURDION: An Adaptive Clustering-Based Algorithm For Tracing Large-Scale MPI Applications", 2015 IEEE International Conference on Big Data, Nov 2015. (18% acceptance rate).
- [ICS'14] A. Bahmani, F. Mueller "Scalable Tracing Of MPI Programs Through Signature-Based Clustering Algorithms", The 28th International Conference on Supercomputing, June 2014. (21% acceptance rate).
- [DEXA'12] A. Bahmani, S. Sedigh, A. Hurson "Ontology-Based Recommendation Algorithms For Personalized Education", The 23rd International Conference on Database and Expert Systems Applications, LNCS, September 2012, Austria.
- [FIE'11] A. Bahmani, S. Sedigh, A. Hurson "Context-Aware Recommendation Algorithms For The PERCEPOLIS Personalized Education Platform", The 41st ASEE/IEEE Frontiers in Education Conference, F4E-1-F4E-6, October 12 - 15, 2011, Rapid City, South Dakota, USA.
- [IEEE CCECE'08] A. Bahmani, M. Naghibzadeh, B. Bahmani "Automatic Database Normalization and Primary Key Generation", The 21st Canadian Conference on Electrical and Computer Engineering, May 4 - 7, 2008, Ontario, Canada.
- Ph.D. 2014 - 2017
NC State University, Raleigh, NC, USA
Supervisor: Prof. Frank Mueller
- M.Sc. 2012 - 2014
NC State University, Raleigh, NC, USA
Graduate Research and Teaching Assistant
- M.Sc. 2010 - 2012
Missouri University of Science and Technology, Rolla, MO, USA
Graduate Research Assistant, Teaching Assistant and Instructor
- B.Sc. 2004 - 2009
Azad University of Mashhad, Mashhad, Iran
Computer Software Engineering
Advisor: Prof. Mahmoud Naghibzadeh
Stanford Center for Genomics and Personalized Medicine, Stanford Medicine, Stanford University, Palo Alto, CA, USA
abahman [You know] stanford.edu
amirbahmani [dot] h [You know] gmail!
Amir's research activities are dedicated to the memory of his late wife, Someyra, who passed away from cancer in 2014.