Since their introduction in 2017, transformers have revolutionized Natural Language Processing (NLP). They are now finding applications across Deep Learning, be it computer vision (CV), reinforcement learning (RL), Generative Adversarial Networks (GANs), speech, or even biology. Among other things, transformers have enabled the creation of powerful language models like GPT-3 and were instrumental in DeepMind's recent AlphaFold2, which tackles protein folding.
In this seminar, we examine the details of how transformers work, and dive deep into the different kinds of transformers and how they're applied in different fields. We do this through a combination of instructor lectures, guest lectures, and classroom discussions. We will invite people at the forefront of transformers research across different domains for guest lectures.
Prerequisites: Basic knowledge of Deep Learning (you must understand attention), or having taken CS224N / CS231N / CS230.
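As a refresher on the attention prerequisite, scaled dot-product attention (the core operation of the transformer) can be sketched in a few lines of NumPy. The function name and array shapes below are illustrative, not from any course material:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D query/key/value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (n_queries, n_keys) similarity scores
    # Numerically stable row-wise softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a convex combination of value rows

# Toy example: 3 query vectors attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

If this computation is unfamiliar, reviewing "Attention Is All You Need" or the CS224N attention lectures before the seminar is recommended.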
The bulk of this class will consist of talks from researchers discussing the latest breakthroughs with transformers and explaining how they apply them to their fields of research. The objective of the course is to bring together ideas on transformers from the ML, NLP, CV, biology, and other communities, understand their broad implications, and spark cross-collaborative research.
The current class schedule is below (subject to change).
Date | Description | Course Materials
---|---|---
Mon Sep 20 | Introduction to Transformers | Recommended Readings; Additional Readings
Mon Sep 27 | Transformers in Language: GPT-3, Codex. Speaker: Mark Chen (OpenAI) | Recommended Readings; Additional Readings
Mon Oct 4 | Applications in Vision. Speaker: Lucas Beyer (Google Brain) | Recommended Readings; Additional Readings
Mon Oct 11 | Transformers in RL & Universal Compute Engines. Speaker: Aditya Grover (FAIR) | Recommended Readings
Mon Oct 18 | Scaling Transformers. Speaker: Barret Zoph (Google Brain) with Irwan Bello and Liam Fedus | Recommended Readings
Mon Oct 25 | Perceiver: Arbitrary IO with Transformers. Speaker: Andrew Jaegle (DeepMind) | Recommended Readings
Mon Nov 1 | Self-Attention & Non-Parametric Transformers. Speaker: Aidan Gomez (University of Oxford) | Recommended Readings; Additional Readings
Mon Nov 8 | GLOM: Representing Part-Whole Hierarchies in a Neural Network. Speaker: Geoffrey Hinton (University of Toronto) | Recommended Readings; Additional Readings
Mon Nov 15 | Interpretability with Transformers. Speaker: Chris Olah (Anthropic) | Recommended Readings; Additional Readings
Mon Nov 29 | Transformers for Applications in Audio, Speech, and Music: From Language Modeling to Understanding to Synthesis. Speaker: Prateek Verma (Stanford) |