Reza Zadeh

Technical Advisor at Databricks
Consulting Professor at Stanford University


I am a consulting professor at Stanford within ICME, where I conduct research and teach courses aimed at doctoral students.

I focus on Machine Learning, Distributed Computing, and Discrete Applied Mathematics.

For fun, I fly planes as a private pilot, climb rocks as a PCIA instructor, and run.

Curriculum Vitae | Google Scholar | Third-person Bio


Teaching CME 305: Discrete Mathematics and Algorithms (MS&E 316) [Winter 2014] [Winter 2015]

Teaching CME 323: Distributed Algorithms and Optimization

Taught MS&E 317: Algorithms for Modern Data Models (CS 263)

Leading SMACC Consulting and ICME Computational Consulting

I contribute to the Apache Spark project, and am the initial creator of the Linear Algebra package in Spark.

I taught a Spark workshop and a Spark class. See my keynote at Bay Area ACM, and an interview.

Conferences Organized

Distributed Machine Learning and Matrix Computations, NIPS 2014

Modern Massive Data Sets 2014 (MMDS 2014), at UC Berkeley

Large Scale Matrix Analysis and Inference Workshop, NIPS 2013

Clustering Theory Workshop, NIPS 2009


On the Evolution of Machine Learning: from Linear Models to Neural Networks [full pdf] [oreilly report]
Reza Bosagh Zadeh, interviewed by David Beyer
O'Reilly Media, 2015

Machine Learning using Big Data: How Apache Spark Can Help [magazine pdf]
Reza Bosagh Zadeh
Biomedical Computation Review, Spring 2015

MLlib: Machine Learning in Apache Spark [arxiv pdf]
Xiangrui Meng, Joseph Bradley, Burak Yavuz, Evan Sparks, Shivaram Venkataraman, Davies Liu,
Jeremy Freeman, DB Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin,
Michael J. Franklin, Reza Zadeh, Matei Zaharia, Ameet Talwalkar

Generalized Low Rank Models [pdf] [julia code] [python code] [spark code]
Madeleine Udell, Corinne Horn, Reza Bosagh Zadeh, Stephen Boyd
arXiv, submitted October 2014.

Matrix Completion and Low-Rank SVD via Fast Alternating Least Squares [pdf] [spark code]
Trevor Hastie, Rahul Mazumder, Jason Lee, Reza Bosagh Zadeh
Journal of Machine Learning Research

Factorbird - a Parameter Server Approach to Distributed Matrix Factorization [pdf]
Sebastian Schelter, Venu Satuluri, Reza Bosagh Zadeh
NIPS 2014 Workshop on Distributed Matrix Computations

Estimate of Shaking Intensity by Combining Earthquake Characteristics with Tweets [pdf] [slides] [video] [demo] [full]
Mahalia Miller, Lynne Burks, Reza Bosagh Zadeh
Tenth U.S. National Conference on Earthquake Engineering 2014
Press: TechCrunch, Mashable, Engadget, VentureBeat, Scientific American, Video Coverage, Full Coverage

Large Scale Graph Completion
Reza Bosagh Zadeh
Ph.D. Dissertation, April 2014. Advisor: Gunnar Carlsson
Award: Gene Golub Outstanding Thesis Award, for best thesis in the department.

Dimension Independent Matrix Square using MapReduce [pdf] [arxiv] [slides] [poster] [scalding code] [spark code]
Reza Bosagh Zadeh, Gunnar Carlsson
Poster at Symposium on the Theory of Computing (STOC 2013).
Press: Gigaom

On the Precision of Social and Information Networks [pdf] [slides]
Reza Bosagh Zadeh, Ashish Goel, Kamesh Munagala, Aneesh Sharma
Conference on Online Social Networks (COSN 2013). (Acceptance rate: 15%)

WTF: The Who to Follow Service at Twitter [pdf]
Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, Reza Bosagh Zadeh
World Wide Web Conference (WWW 2013). (Acceptance rate: 15%)
Press: TechCrunch

Dimension Independent Similarity Computation [pdf] [summary post] [slides] [spark code]
Reza Bosagh Zadeh, Ashish Goel
Journal of Machine Learning Research 2012.
Translation: Translated into Chinese by Xu Wenhao

Group Heterogeneity Increases the Risks of Large Group Size [abstract] [pdf]
Jonathan Cummings, Sara Kiesler, Reza Bosagh Zadeh, Aruna Balakrishnan
Psychological Science Journal 2012.

What's in a Move? Normal Disruption and a Design Challenge [pdf] [slides]
Reza Bosagh Zadeh, Sara Kiesler, Aruna Balakrishnan, Jonathan Cummings
Computer Human Interaction (CHI 2011). (Acceptance rate: 26%)

Research Team Integration: What It Is and Why It Matters [pdf]
Aruna Balakrishnan, Sara Kiesler, Jonathan Cummings, Reza Bosagh Zadeh
Computer Supported Cooperative Work (CSCW 2011). (Acceptance rate: 26%)

Supervised Clustering [pdf] [poster] [slides] [video]
Pranjal Awasthi, Reza Bosagh Zadeh
Neural Information Processing Systems (NIPS 2010). (Acceptance rate: 24%)

A Uniqueness Theorem for Clustering [pdf] [extension] [slides]
Reza Bosagh Zadeh, Shai Ben-David
Uncertainty in Artificial Intelligence (UAI 2009). (Acceptance rate: 27%)

Industry Publications

Distributing the Singular Value Decomposition with Spark [example code], Databricks Developer Blog

Efficient similarity algorithm now in Spark, thanks to Twitter [example code], Databricks Developer Blog

All-pairs similarity via DIMSUM, Twitter Engineering Blog
Covered by: Gigaom

Using Twitter to measure earthquake impact in almost real time, Twitter Engineering Blog
Covered by: TechCrunch, Mashable, Engadget, VentureBeat, Scientific American, Video Coverage, Full Coverage

Dimension Independent Similarity Computation (DISCO), Twitter Engineering Blog
Translated into Chinese by Xu Wenhao: DISCO in Mandarin

Selected Talks

These are in addition to the talks that accompanied the papers above.

MLlib and Distributing the Singular Value Decomposition, Stanford University

Dimension Independent Matrix Square, MMDS 2014, UC Berkeley

The Libraries of Spark, Keynote at Data Science Bootcamp

MLlib and All-Pairs Similarity, University of Maryland

Distributed Computing with Spark, University of Maryland

Distributing Matrix Computations with Spark MLlib, Spark Meetup

Distributed Computing with Spark, eBay, Bay Area ACM

Towards a Principled Theory of Clustering, Carnegie Mellon University

Matrix Factorization and Spark, Codeneuro, San Francisco

Apache Spark in Four Parts, Raytheon

Spark Camp: An Introduction to Apache Spark with Hands-on Tutorials, Strata 2015

Dimension Independent Matrix Square using MapReduce, McGill University

Distributed Machine Learning on Spark, Toronto Hadoop User Group


Stanford University, Ph.D. in Computational Mathematics, 2010 - 2013, Stanford, California
Gene Golub Outstanding Thesis Award. Student Leadership Award. President of C2. GPA 4.0.

Twitter, Senior Data Scientist, 2011 - 2013, San Francisco, California
Graph Completion for Who To Follow, large scale machine learning, random walks on large graphs.

Carnegie Mellon University, Master of Computer Science, 2008 - 2010, Pittsburgh, Pennsylvania
Research masters. Clustering Theory. Applications of machine learning to social science. GPA 4.0.

Morgan Stanley, Financial Engineering, Jan 2008 - May 2008, Manhattan, New York
Implemented models for Credit Derivatives pricing.

Google, Researcher, Jan 2006 - Aug 2007, Mountain View, California
Statistical Machine Translation (SMT) alignment models. Language modelling for SMT. Large scale machine learning.

IBM, Software Engineer, May 2005 - Sep 2005, Toronto, Ontario
iSeries Development Studio team. Built plugins for Eclipse and WDSC.

University of Waterloo, Bachelor of Mathematics, Honors Computer Science, 2004 - 2008, Waterloo, Ontario
Graduated with Distinction on Dean's Honor list. Awarded several RAships. Active in programming contests. GPA 92% (=4.0).


Erdős number: 3 | Bacon number: 3 | Erdős-Bacon number: 6 [details].

Blog: No Gravity Math

Some videos of my talks and lectures

I enjoy generating prefactored random numbers, among other things.

Reviewer for JMLR, PRJ, NIPS, ECML PKDD, and KDD

Lived on 3 continents, in 4 countries and 9 cities:
Ahvaz [8y] → London [5y] → Tehran [2y] → London [2y] → Toronto [1y] → Waterloo [8m] → Toronto [4m] → Waterloo [4m] → Mountain View [4m] → Waterloo [4m] → Mountain View [4m] → Waterloo [4m] → Mountain View [4m] → Waterloo [4m] → New York [4m] → Waterloo [4m] → Pittsburgh [2y] → Palo Alto [?].

Contact info

Twitter: @Reza_Zadeh

Phone: (650) 898-3193

Email: my domain is and my username is rezab

Office: ICME, Huang building. (directions)

Stanford University