Machine learning from human preferences investigates mechanisms for capturing human and societal preferences and values in artificial intelligence (AI) systems and applications. It is especially relevant to socio-technical applications such as algorithmic fairness, and to many language and robotics tasks where reward functions are challenging to specify quantitatively. Learning from human preferences has emerged as an increasingly important component of modern AI and is credited with advancing the state of the art in language modeling and reinforcement learning. However, existing approaches are largely reinvented independently in each subfield, with limited connections drawn among them.
This course will cover the foundations of learning from human preferences from first principles and outline connections to the growing literature on the topic. This includes but is not limited to:
This is a graduate-level course. By the end of the course, students should be able to understand and implement state-of-the-art methods for learning from human feedback and be ready to conduct research on these topics. Because this area is growing quickly, the course will consist of weekly lectures, presentations, and discussions of papers led by students. Students will also complete a final course project. If you are a CS PhD student at Stanford, this course counts toward the breadth requirement for "Learning and Modeling" or "Human and Society".
The current class schedule is below (subject to change). A tentative reading list can be found here.
Date | Theme | Topics | Deadline |
---|---|---|---|
Sep 23 | - | Introduction | |
Sep 25 | Preference Models | Lecture: Individual Preference Models | |
Sep 30 | | Lecture: Individual Preference Models | |
Oct 2 | | Student-led Discussion: Preference Models | |
Oct 7 | | Lecture: Aggregated Preference Models via Social Choice Theory & Game Theory | |
Oct 9 | | (Tentative) Guest lecture, Tan Zhi Xuan | |
Oct 14 | Preference Measurement & Optimization | Lecture: Model-based Preferences Optimization via Active Learning | |
Oct 16 | | Lecture: Model-based Preferences Optimization via Preference Elicitation | |
Oct 21 | | Lecture: Model-free Preference Optimization via Dueling Bandits | |
Oct 23 | | Lecture: Model-free Preference Optimization via Bayesian Optimization | |
Oct 28 | | Lecture: Aggregated Preference Optimization via Mechanism Design | |
Oct 30 | | Guest lecture: Joseph Jay Williams (U of T) | |
Nov 4 | | Guest lecture: Eytan Bashy | |
Nov 6 | | Student-led Discussion: Model-based and Model-free Preference Optimization | |
Nov 11 | Value, Alignment, & Human-centered Design | Lecture: Value & Alignment | |
Nov 13 | | Lecture: Human-centered Design | |
Nov 18 | | Guest lecture: Colin Megill | |
Nov 20 | | Student-led Discussion: Value, Alignment, & Human-centered Design | |
Nov 25 | | Thanksgiving | |
Nov 27 | | Thanksgiving | |
Dec 2 | | Catch-up day (details TBD) | |
Dec 4 | | Project Expo | |
Dec 9 | | - | |
Dec 14 | | - | |