
Xinyu Hu
Computational & Mathematical Engineering
Stanford University

Meet with me?!
Email: xhu17 [at] stanford [dot] edu

Research Interests

My research focuses on computational social science and data science. Above all, I love delving into the beauty of mathematical theory, especially geometry and number theory (the story of my favorite number can be found in 1729). Beyond that, seeing these theories applied with learning algorithms in sociology, finance and education excites me and motivates me to explore further.

I've worked in convex optimization, portfolio management and extreme value statistics. Over the years, through internships at universities and various European AI startups, I've also built a tech stack spanning high frequency computing (C++/C), web development (MERN), iOS (Swift), and data science/machine learning (Python/R, Tableau/PowerBI, TensorFlow, PyTorch, PySpark).

Research Experience


Graduate Research Assistant,  Stanford School of Engineering
Mar. 2021 - Current
Focus: Computational Social Science


Research Intern,  Stanford Biomedical Data Science
Sep. 2020 - Dec. 2020
Focus: Genomic Modeling using Deep Learning, COVID-19 Severity Prediction


Deep Learning Research Assistant,  University of Michigan
Jun. 2019 - Oct. 2019
Focus: Gaussian Convolutional Processes Modeling, Convergence Theory in Deep Learning


Quantitative Finance Research Assistant,  Erasmus University Rotterdam
Dec. 2018 - Jun. 2020
Focus: Semicovariance Estimation

Work Experience


Data Science Intern,  PayPal Global Data Science
Upcoming  Jun. 2021 - Sep. 2021


Data Science & Engineering Developer,  World Bank
Jan. 2021 - Current
Tools: Python, Tableau, ReactJS, Node.js, MongoDB, D3.js


Data Engineer Intern,  Dashmote
Jun. 2020 - Aug. 2020
Tools: Python, MySQL, PowerBI, AWS Lambda, AWS S3, Web Scraping

Selected Projects


Transformers for Textual Reasoning and Question Answering
Natural language models and systems have achieved great success in question answering tasks. However, much of that success is measured on datasets such as SQuAD by Rajpurkar et al. (2016) and RuleTakers by Clark et al. (2020), where questions require only local phrase matching or shallow textual reasoning. As a result, the high performance that transformers achieve on these tasks does not demonstrate an ability to learn long-range relations or a holistic understanding of the text. We propose methods that reduce the transformer's attention mechanism from a fully connected graph to one with sparser edge connections, and test whether this improves performance on difficult reasoning tasks, generalizability, and learning efficiency.
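
To make the idea of sparser attention edges concrete, here is a minimal NumPy sketch of scaled dot-product attention restricted by a boolean adjacency mask. The local-window pattern, the function names, and the toy sizes are illustrative assumptions and not necessarily the sparsity scheme used in this project.

```python
import numpy as np

def local_window_mask(seq_len, window):
    """Boolean adjacency matrix: token i may attend to token j iff |i - j| <= window.
    (Hypothetical sparsity pattern chosen only for illustration.)"""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention that keeps only the edges allowed by `mask`."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)              # remove disallowed edges
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)                              # masked entries become 0
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example: 6 tokens, 4-dimensional embeddings, each token sees its neighbours.
rng = np.random.default_rng(0)
seq_len, dim = 6, 4
q, k, v = (rng.standard_normal((seq_len, dim)) for _ in range(3))
mask = local_window_mask(seq_len, window=1)
out, weights = masked_attention(q, k, v, mask)
```

With a mask of all `True` values this reduces to ordinary fully connected attention, so the same function can be used to compare dense and sparse variants.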


Analysis of Fake Face Classifiers
Synthetic images of faces have become more commonplace with recent advances in software, but the detection of these forgeries has proved challenging for machine learning systems. We construct several systems for this task and then use contemporary analysis techniques to present an intuitive explanation of the behavior of these systems. We use these analysis techniques to explore the difference between a single-task and a multi-task classifier operating on the same dataset.


  • Languages: Simplified & Traditional Chinese (Native), English, Dutch

  • Programming Languages: C++, Python/R, JavaScript, MATLAB, Java, Swift

  • Tools: Spark, Torch, TensorFlow, Git, LaTeX


  • Sport: Hiking, Marathon, Brazilian Jiu-Jitsu

  • Book: Religion, European Politics, World War II, Sleep Architecture, Surrealism, Thai & Indian Culture

  • Drink: Black Coffee, Chai & Matcha Latte, Piña Colada, Mint Citrus Water

  • Others: Stargazing, Bartending, DJing, EDM Composition, Design