CS25: Transformers United V6

CS25 has become one of Stanford's hottest seminar courses, featuring top researchers at the forefront of Transformers research such as Geoffrey Hinton, Ashish Vaswani, and Andrej Karpathy. The class is popular both within and outside Stanford, with millions of total views on YouTube. Each week, we dive into the latest breakthroughs in AI, from large language models like GPT to applications in art, biology, and robotics. Now in the sixth iteration of the course, we are excited to bring you fresh perspectives on where Transformer research is heading next.

The only homework for students is weekly attendance at the talks/lectures. Anybody is free to audit in person or join our Zoom livestreams - you don't have to sign up or be affiliated with Stanford! (Please do not contact us about this.) We also have a lively Discord community (over 5000 members) - feel free to join and chat with hundreds of others about Transformers!

Instructors

Time and Location

Spring Quarter (March 30 - June 3)
Thursdays 4:30 - 5:50 pm PDT
Skilling Auditorium   |   Zoom Link   |   Slido

Date | Title | Description
April 2nd | Overview of Transformers [In-Person]

Speakers: Instructors
Brief intro and overview of the history of ML/NLP, Transformers and how they work, and their impact. Discussion about recent trends, breakthroughs, applications, and current challenges. Link to slides. Papers discussed:
Feng et al., Baby Scale: Investigating Models Trained on Individual Children's Language Input, arXiv:2603.29522

Zeng et al., Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models, arXiv:2603.29552

Singh et al., To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining, arXiv:2604.00715

Singh et al., Curriculum-Guided Layer Scaling for Language Model Pretraining, arXiv:2506.11389

Singh et al., Interpretable Cross-Network Attention for Resting-State fMRI Representation Learning, arXiv:2603.00786

Liu et al., A Unified Definition of Hallucination: It's The World Model, Stupid!, arXiv:2512.21577

April 9th | JEPA [In-Person]

Speakers: Hazel Nam & Lucas Maes (Brown University)
 
April 16th | SSMs [In-Person]

Speaker: Albert Gu (CMU)
 
April 23rd | Speaker: Nouamane Tazi (Hugging Face)
April 30th | TBA
May 7th | Speaker: Andrew Lampinen (Anthropic)
May 14th | Speaker: Vivek Natarajan (DeepMind)
May 21st | TBA
May 28th | Speaker: Charles Frye (Modal)