

Announcements
Research
My main research areas of interest include:
 Datadriven healthcare: predictions and dynamic decisions.
 Graphical models and messagepassing algorithms.
 Probabilistic and statistical methods for inference from largescale data.
Teaching

Spring 2015: OIT 673Datadriven Decision Making and Applications in Healthcare (PhD)

Winter 2015: OIT 536 Data for Action: From Insights to Applications (MBA Elective)

Winters 20132015: OIT 367 Business Intelligence from Big Data (MBA Core)

Spring 2012: OIT 267 Data and Decisions  Accelerated (MBA Core)
Selected Projects (for a complete list click here)
DataDriven Decision Making in Healthcare

M. Bayati, M. Braverman, M. Gillam, K. Mack, G. Ruiz, M. Smith, and E. Horvitz,
DataDriven Decisions for Reducing Readmissions for Heart Failure: General Methodology and Case Study
, PLoS ONE, DOI: 10.1371/journal.pone.0109264, 2014.
 J. Goh, M. Bayati, S. Zenios, S. Singh, and D. Moore,
Data Uncertainty in Markov Chains: Application to Costeffectiveness Analyses of Medical Innovations ,
Working paper.
CoWinner of the William Pierskalla best paper award in health care.

N. Baker, M. Bayati, R. Torguson, K. Mack, H. Rappaport, E. Horvitz, R. Waksman,
Identifying Patients at High Risk for Readmission following Treatment for Acute Myocardial Infarction: a DataCentric Approach
,
American Heart Association Conference, Chicago 2014.

M. Braverman, M. Bayati, E. Horvitz, and M. Gillam, Health care policy development and execution, US Patent application, June 2010.
Summary: Rehospitalization patient admission to a hospital soon after the discharge is both common and costly. Nearly one in every five
patients is readmitted to the hospital within 30 days of their discharge. The estimated cost of unplanned rehospitalizations to
Medicare in 2004 was around $17.4 billion. Research shows that hospital initiatives such as: patient education programs, followup home
visits by pharmacists, extensive discharge packages, etc., can avert many rehospitalizations. However, proper allocation of these
costly and limited resources is a challenging problem. The first paper above represent a project on applications of
statistical learning for identifying patients with highest risk of being rehospitalized and obtaining a decision support mechanism for
allocating scarce resources to postdischarge support.
The result is an automated risk prediction system that is currently in use in several hospitals across US and Europe. Here is a short
video describing this project.
can be found here.


The second paper develops a theoretical framework for costeffectiveness analysis of medical
innovations challenged by data inadequacy. When Markov chains
are used as a computational tool for such studies, a possible consequence of this lack of data is that the
transition matrix of the chain cannot be estimated precisely. This paper shows, how to compute maximal
and minimal values for the discounted value of the chain (with respect to a vector of statewise costs or
rewards) as these uncertain transition parameters jointly vary within a given uncertainty set. It shows that
these problems are computationally tractable if the uncertainty set has a rowwise structure. Conversely,
It proves that if the rowwise structure is relaxed slightly, the problems become computationally intractable
(NPhard). The developed model is used to assess the costeffectiveness of fecal immunochemical testing (FIT), a
new screening method for colorectal cancer. The results show that despite the large uncertainty in existing data on
FIT's performance, it is a costeffective alternative screening modality to colonoscopy.
Graphical Models, Approximate Message Passing, and Statistical Learning
 M. Bayati, and A. Montanari,
The dynamics of message passing on dense graphs, with applications to compressed sensing,
IEEE Transcations on Information Theory, Vol. 57, No. 2, 2011.
 M. Bayati, and A. Montanari, The LASSO risk for gaussian matrices,
IEEE Transcations on Information Theory, Vol. 58, No. 4, 2012.
 M. Bayati, M. Lelarge and A. Montanari, Universality in Polytope Phase Transitions and Message Passing Algorithms,
Accepted to Annals of Applied Probability, 2013.
 M Bayati, M. A. Erdogdu, and A. Montanari,
Estimating LASSO risk and noise level, NIPS, 2013.
Summary: Recently, Donoho, Maleki and Montanari introduced
approximate message passing (AMP) as an extremely effective algorithm for
reconstructing high dimensional sparse signals from a small number of observations.
They also showed (through extensive numerical experiments) that dynamics of AMP is accurately
tracked by a simple onedimensional iteration termed "state evolution". In these papers we provided the first
rigorous foundation to state evolution. We proved that indeed it holds asymptotically in the large system
limit for sensing matrices with i.i.d. gaussian entries. Our techniques also provide a new approach
for the analysis of messagepassing algorithms on dense graphs.
In the second paper, we applied our results to the popular approach for dimensionality reduction known as the LASSO.
We proved asymptotically sharp predictions and explicit expressions for operating characteristics of the LASSO
(e.g., mean square error) that is the first rigorous derivation of such explicit formulas for random instances.
Through simulations on real data matrices (gene expression data and hospital medical records) we observed that
these results can be relevant in a broad array of practical applications.
The third paper extends the first paper beyond the gaussian assumption. In particular it proves a slightly weaker form of the result
for sensing matrices with i.i.d. distribution with subgaussian tail and using this
a conjecture from polytope geometry due to Donoho and Tanner is resolved. The fourth paper extends the results of the second paper to the more realistic
situations when the distribution of original signal and noise variance are unknown. It also obtains a consistent estimate for the noise variance which outperforms
existing estimate methods.
Modeling of Large Networks with Given Constraints
 M. Bayati, J. H. Kim and A. Saberi,
A sequential algorithm for generating random graphs,
Algorithmica, Vol. 58, No. 4, 2010.
Software (by David Gleich)
 M. Bayati, A. Montanari, and A. Saberi,
Generating Random Graphs without Short Cycles,
Submitted 2013.
Summary: A central hurdle in working with large networks is the lack
of good random graph models that capture their specific properties. Such random graph models can be
useful for example in simulating algorithms for the Internet or detecting motifs (patterns) in biological systems.
Unfortunately, the existing algorithms for generating random graphs with given degrees have
large running times (quadratic on problem size), making them impractical to use for networks with
millions of nodes. The first paper provides the fastest (almost linear running time) algorithm and
proved its correctness when the node degrees are in certain range. The second paper
extends that algorithm (with a different analysis technique that yields sharper bounds)
to design an efficient algorithm for generating random graphs with no short cycles
These graphs are used to construct high performance codes that can achieve the Shannon capacity.
Belief Propagation Algorithm for Distributed Optimization: Theory and Applications

M Bayati, C. Borgs, J. Chayes, R. Zecchina,
BeliefPropagation for Weighted bMatchings on Arbitrary Graphs and its Relation to Linear Programs with Integer Solutions
SIAM Journal in Discrete Math, 25, pp. 9891011, 2011.
 M.Bayati, D. Shah and M. Sharma,
Maxproduct for maximum weight matching: convergence, correctness and LP duality,
IEEE Transactions on Information Theory, Vol. 54, No 3, 2008.
 M. Bayati, C. Borgs, A. Braunstein, J. Chayes, A. Ramezanpour, and R. Zecchina,
Statistical Mechanics of Steiner Trees, Physical Review Letters (PRL), 101, 037208, 2008.
 M Bayati, D. F. Gleich, A. Saberi, Y. Wang,
Messagepassing algorithms for sparse network alignment,
ACM Transactions of Knowledge Discovery and Data mining (TKDD), vol 7, pages 3:1331, 2013.
Software and Data
Summary:
Belief propagation (BP) is a distributed algorithm that has been very successfully used in many areas including
machine learning and optimization. Despite its spectacular success, the
BP algorithm is known to be correct only in cyclefree networks. However, in practice, the (loopy) BP algorithm
works rather well in networks that have many cycles. The first two papers use BP algorithm for solving the wellknown
maximum weight matchings (MWM) problem. They prove that BP solves this problem correctly on all graphs when the
linear programming (LP) relaxation of the problem has integer solutions. They also prove that BP and LP are
equivalent for finding MWM. The proof technique gives a better understanding of the often noted,
but poorly understood connection between BP and LP.
The third paper uses BP for the NPhard problem of finding minimum weight Steiner tree and obtains a
very efficient algorithm. The algorithm is effective in systems biology
for finding previously undetected protein associations in cell signaling, as well as in
data clustering.
The fourth paper applies BP method to the network alignment problem that generalizes and unifies several approaches
for forming a matching or alignment between the nodes of two networks.
The paper proposes a mathematical programming framework for network alignment problem and
a sparse variation of it where only a small number of matches between
the nodes of the two networks are possible. It proposes new message passing algorithms that allow to compute,
very efficiently, near optimal solutions to the sparse network alignment problems
with network sizes as large as hundreds of thousands of nodes. Using (rigorous) linear programming upperbounds it
shows that the proposed algorithms produce near optimal solutions on problems in bioinformatics,
in ontology matching, and on synthetic problems.


Contact Info
Knight Management Center
655 Knight Way, E363
Stanford, CA 94305
Phone: (650) 7252285
Fax:(650) 7247402
Email: aaa@stanford.edu
aaa= bayati


Assistant: Sandra Davis
655 Knight Way, E324
Stanford, CA 94305
Phone:(650) 7360939
Fax:(650) 7247402
Email: bbb@stanford.edu
bbb=srdavis

