| Topic | Course Materials |
|
| Introduction to Reinforcement Learning |
|
- Lecture 1 Draft Slides [Post class version]
- Additional Materials:
|
| Tabular MDP planning |
|
- Lecture 2 Slides (pre-class) [Post class, annotated]
- Additional Materials:
- SB (Sutton and Barto) Chp 3, 4.1-4.4
|
| Tabular RL policy evaluation |
|
- Lecture 3 Slides (pre-class) [Post class, with annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 5.1, 5.5, 6.1-6.3
|
| Q-learning |
|
- Lecture 4 Slides (pre-class) [Post class with annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 5.2, 5.4, 6.4-6.5, 6.7
|
| Policy Gradient |
|
- Lecture 5 Slides [Post lecture with annotations]
- Lecture 6 Slides [Post class annotations]
- Lecture 7 Slides [Post class annotations]
- Additional Materials:
- SB (Sutton and Barto) Chp 13
|
| Imitation Learning and Learning from Human Input and Batch RL |
|
- Lecture 7 Slides [Post class annotations]
- Lecture 8 Slides (preclass) [Post class with annotations]
|
| Data Efficient RL |
|
- Lecture 9 Slides [Post class annotations]
- Lecture 10 Slides (no preclass) [Post class annotations]
- Lecture 11 Slides [Post class annotations]
- Lecture 12 Slides [Post class annotations]
- Additional Materials:
|
| Ethics and Society Guest Lecture |
|
- Lecture 10 Guest Slides (see the latter half of the slides)
|
| Monte Carlo Tree Search and Conquering Go |
|
- Lecture 13 Draft slides [Post class with annotations]
|
| RL Guest Lecture |
|
- Shane Gu: World of World Modeling
|