Content

Machine learning from human preferences investigates mechanisms for capturing human and societal preferences and values in artificial intelligence (AI) systems and applications, e.g., for socio-technical applications such as algorithmic fairness and many language and robotics tasks when reward functions are otherwise challenging to specify quantitatively. While learning from human preferences has emerged as an increasingly important component of modern AI, e.g., credited with advancing the state of the art in language modeling and reinforcement learning, existing approaches are largely reinvented independently in each subfield, with limited connections drawn among them.

This course will cover the foundations of learning from human preferences from first principles and outline connections to the growing literature on the topic. This includes but is not limited to:

This is a graduate-level course. By the end of the course, students should be able to understand and implement state-of-the-art learning from human feedback and be ready to research these topics. Given how fast this area is growing, this course will consist of weekly lectures, presentations, and discussions of papers led by students. Students will also complete a final course project. If you are a CS PhD student at Stanford, this course is counted toward the breadth requirement for "Learning and Modeling" or "Human and Society".

Instructor

Teaching Fellow

Course Assistant

Logistics


Schedule

The current class schedule is below (subject to change). A tentative reading list can be found here.


Date Theme Topics Deadline
Sep 23 - Introduction
Sep 25 Preference Models Lecture: Individual Preference Models
Sep 30 Lecture: Individual Preference Models
Oct 2 Student-led Discussion: Preference Models
Oct 7 Lecture: Aggregated Preference Models via Social Choice Theory & Game Theory
Oct 9 (Tentative) Guest lecture, Tan Zhi Xuan
  • Homework 1 release
Oct 14 Preference Measurement & Optimization Lecture: Model-based Preferences Optimization via Active Learning
Oct 16 Lecture: Model-based Preferences Optimization via Preference Elicitation
Oct 21 Lecture: Model-free Preference Optimization via Dueling Bandits
Oct 23 Lecture: Model-free Preference Optimization via Bayesian Optimization
Oct 28 Lecture: Aggregated Preference Optimization via Mechanism Design
  • Homework 1 due
  • Homework 2 release
Oct 30 Guest lecture: Joseph Jay Williams (U of T)
Nov 4 Guest lecture: Eytan Bashy
Nov 6 Student-led Discussion: Model-based and Model-free Preference Optimization
Nov 11 Value, Alignment, & Human-centered Design Lecture: Value & Alignment
  • Homework 2 due
  • Homework 3 release
Nov 13 Lecture: Human-centered Design
Nov 18 Guest lecture: Colin Megill
Nov 20 Student-led Discussion: Value, Alignment, & Human-centered Design
Nov 25 Thanksgiving
  • Homework 3 due
  • Homework 4 release
Nov 27 Thanksgiving
Dec 2 Catch-up day (details TBD)
Dec 4 Project Expo
Dec 9 -
  • Homework 4 due
Dec 14 -
  • Final project due

Grading