Syllabus

Unless otherwise specified, the course lectures and meeting times are:

Monday, Wednesday 1:30-2:50
Location: Cubberley Auditorium

The schedule below is preliminary and subject to change. Please check for updates!
Event Date Description Course Materials
Lecture Apr 3 Introduction to Reinforcement Learning Suggested Readings:
  1. [Linear Algebra Review]
  2. [Probability Review]
  3. Russell and Norvig, Sections 17.1-17.3
[python tutorial]
[slides]
Lecture Apr 5 Policy Iteration, TD Learning, Q-Learning Suggested Readings:
  1. Russell and Norvig, Sections 21.1-21.3
[slides]
[Lecture Video (Stanford only)]
A1 Apr 7 Assignment #1 Released [Assignment 1]
[Solution 1]
Lecture Apr 10 MC and linear value function approximation Suggested Readings:
  1. Russell and Norvig, Section 21.4
  2. Chp 5 in Sutton and Barto [link]
[slides]
Lecture Apr 12 MC and linear value function approximation Suggested Readings:
  1. Chp 9 in Sutton and Barto [link]
[slides]
[Lecture Video (Stanford only)]
Lecture Apr 17 Function approximation and deep learning Suggested Readings:
  1. Goodfellow, Bengio & Courville, Deep Learning, Chp 6
Backpropagation
  1. [CS231N video]
  2. [CS231N review]
  3. [Karpathy blog]
Vanishing gradients
  1. [WildML blog]
  2. [CS224D example]
[slides]
[Lecture Video (Stanford only)]
Lecture Apr 19 Deep reinforcement learning [slides]
[Lecture Video (Stanford only)]
[Introduction to TensorFlow (from CS224N)]
[Cartpole demo]
[Cartpole demo code]
A2 Assignment #2: Generalization in RL / Deep RL [Assignment 2]
[Solution 2]
Lecture Apr 24 Deep RL and MC vs TD Suggested Readings:
  1. Sutton and Barto Chapters 6, 9.4, 11 and 5 [link]
  2. Tsitsiklis and Van Roy 1997 [link]
[slides]
[Lecture Video (Stanford only)]
Lecture Apr 26 Sample efficient RL Suggested Readings:
  1. Alex Strehl's PhD thesis 2007 Chp 4 [link]
[slides]
[Lecture Video (Stanford only)]
Lecture May 1 Guest Lecture: UCRL [Near-optimal Regret Bounds for Reinforcement Learning]
[Lecture Video (Stanford only) - last 10 min missing]
Lecture May 3 Sample Efficient RL Suggested Readings:
  1. Reinforcement Learning in Finite MDPs [link]
[slides]
[Lecture notes]
[Lecture Video (Stanford only)]
Lecture May 8 Midterm Review [Review notes]
[Lecture Video (Stanford only)]
A3 Assignment #3: Sample efficient RL [Assignment 3]
Lecture May 10 Midterm
Lecture May 15 Guest Lecture: Pieter Abbeel: policy gradients [slides]
[Lecture Video (Stanford only)]
Lecture May 17 Policy Search 2 [slides]
[Lecture Video (Stanford only) - Couple minute interruption at 30:00]
Project Project milestone due
Lecture May 22 Guest Lecture: Chelsea Finn and Sergey Levine [slides]
[Lecture Video (Stanford only) - Part 1 - Start at 10:45]
[Lecture Video (Stanford only) - Part 2]
[Lecture Video (Stanford only) - Part 3]
Lecture May 24 Guest Lecture: Philip Thomas [slides]
[Lecture Video (Stanford only)]
Lecture May 31 Human in the loop
Guest Lecture: Dylan Hadfield-Menell
[slides]
[(guest speaker) slides]
[Lecture Video (Stanford only)]
Suggested Readings:
  1. Where to Add Actions in Human-in-the-Loop Reinforcement Learning [link]
Lecture Jun 5 Challenges of specifying rewards
Project Jun 7 Poster Session Poster Session