Course Description

To realize the dreams and impact of AI requires autonomous systems that learn to make good decisions. Reinforcement learning is one powerful paradigm for doing so, and it is relevant to an enormous range of tasks, including robotics, game playing, consumer modeling and healthcare. This class will provide a solid introduction to the field of reinforcement learning, and students will learn about the core challenges and approaches, including generalization and exploration. Through a combination of lectures and written and coding assignments, students will become well versed in key ideas and techniques for RL. Assignments will include the basics of reinforcement learning as well as deep reinforcement learning, an extremely promising new area that combines deep learning techniques with reinforcement learning. In addition, students will advance their understanding and the field of RL through an open-ended project.

By the end of the class students should be able to:

Class Time and Location

Winter quarter (January 08 - March 16, 2018)
Lecture: Monday, Wednesday 11:30 AM - 12:50 PM
Location: NVIDIA Auditorium

Emma's office hours will be held in Gates 218. Alex and Xinkun will hold office hours in the Lathrop Learning Hub. Other CA office hours will be held in the Huang Basement. See Calendar for times.

For both in-person and online SCPD office hours, you will need to register an account on QueueStatus. When you wish to join the queue, click "Sign Up" at the CS234 queue. Be sure to enter your email when you "Sign Up"; this is a way for the CA to contact you. Look for announcements on the left panel for more information. For online office hours, you will need to install Zoom (instructions below) to video call with the CA: the CA will contact you via Zoom when he/she reaches you in the queue.

Instructions for installing Zoom:

Attendance is not required but is encouraged. Sometimes we may do in-class exercises or discussions, and these are harder to do, and to benefit from, by yourself. However, if you are not able to attend class, the class is recorded. It has previously been shown that watching lecture videos in small groups, with one person pausing to facilitate discussion, can yield student performance as high as attending lectures live. We have heard of students getting together to watch videos in small groups in the past, so we encourage you to consider this if you are unable to attend a particular lecture or if you’re participating in the class as an SCPD student. I am always excited to hear about new ways students find to effectively learn the material, so sharing such tips is always appreciated.


We believe students often learn an enormous amount from each other as well as from us, the course staff. Therefore, to facilitate discussion and peer learning, we request that you please use Piazza for all questions related to lectures, homeworks and projects.

For SCPD students, if you have general SCPD-specific questions, please email or call 650-741-1542. In case you have specific questions related to being an SCPD student for this particular class, please contact us at

For exceptional circumstances that require us to make special arrangements, please email the course CA Anchit at For example, such a situation may arise if a student requires extra days to submit a homework due to a medical emergency, or if a student needs to schedule an alternative midterm date due to events such as conference travel. Such requests will be considered and approved on a case-by-case basis.

You will be awarded up to 2% extra credit if you answer other students' questions in a substantial and helpful way on Piazza.


See Piazza. NOTE: If you enrolled in this class on Axess, you should be added to the Piazza group automatically within a few hours. You can also register independently — there is no access code required to join the group.

I care about academic collaboration and misconduct both because it is important that we are able to evaluate your own work (independent of your peers’) and because not claiming others’ work as your own is an important part of integrity in your future career. I understand that different institutions and locations can have different definitions of what forms of collaborative behavior are considered acceptable. In this class, for written homework problems, you are welcome to discuss ideas with others, but you are expected to write up your own solutions independently (without referring to another’s solutions). For coding, you are allowed to do projects in groups of 2, but for any other collaborations, you may only share the input-output behavior of your programs. This encourages you to work separately but share ideas on how to test your implementation. Please remember that if you share your solution with another student, even if you did not copy from another, you are still violating the honor code. In terms of the final project, you are welcome to combine this project with another class, assuming that the project is relevant to both classes and provided that you obtain prior permission from the class instructors. If your project is an extension of a previous class project, you are expected to make significant additional contributions to the project.

We periodically run similarity-detection software over all submitted student programs, including programs from past quarters and any solutions found online on public websites. Anyone violating the Stanford University Honor Code will be referred to the Office of Judicial Affairs. If you think you made a mistake (it can happen, especially under stress or when time is short!), please reach out to Emma or the head CA; the consequences will be much less severe than if we approach you.

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. The OAE is located at 563 Salvatierra Walk (650-723-1066,

If you're enrolled in the class on credit/no credit status, you will be graded on work as usual per standard Stanford rules. The only difference from those taking the class for a letter grade is that you must earn a grade of C- (C minus) or higher in the class to be marked as CR.