Course Project


The course project is a chance to explore RL in more depth through focused work on a single topic. Novel research ideas are welcome but are neither expected nor required to receive full credit. In addition, projects do not always work: in such cases, we encourage a careful illustration (using theoretical proofs and/or experimental results, plus a discussion) of why the proposed idea did not work and/or was substantially more work than anticipated. Not having done enough coding will not be considered a compelling reason.

Project Ideas

To give you some project ideas, we are sharing some of the projects from previous years below:

  1. Using Transfer Learning Between Games to Improve Deep Reinforcement Learning Performance and Stability, Chaitanya Asawa, Christopher Elamri, David Pan. [Poster] [Paper]
  2. Mastering the game of Go from scratch, Michael Painter, Luke Johnston. [Poster] [Paper]
  3. Comparison of Control Methods: Learning Robotics Manipulation with Contact Dynamics, Keven Wang, Bruce Li. [Paper]
  4. Information Directed Reinforcement Learning, Andrea Zanette, Rahul Sarkar. [Poster] [Paper]
  5. Reward Backpropagation Prioritized Experience Replay, Yangxin Zhong, Borui Wang, Yuanfang Wang. [Poster] [Paper]
  6. Online Learning for Causal Bandits, Vinayak Sachidananda, Prof. Emma Brunskill. [Poster] [Paper]
  7. DeepShuai: Deep Reinforcement Learning based Chinese Chess Player, Chengshu Li, Kedao Wang, Zihua Liu. [Poster] [Paper]
  8. EteRNA-RL: Using reinforcement learning to design RNA secondary structures, Isaac Kauvar, Ethan Richman, William E Allen. [Poster] [Paper]
  9. Adversarially Robust Policy Learning through Active Construction of Physically-Plausible Perturbations, Ajay Mandlekar, Yuke Zhu, Animesh Garg, Li Fei-Fei, Silvio Savarese. [Poster] [Paper]

The following guest lecture slides from last year's class offering may also help you in generating good project ideas.

  1. Cooperative Inverse Reinforcement Learning, Dylan Hadfield-Menell. [Slides]
  2. Maximum Entropy Framework: Inverse RL, Soft Optimality, and More, Chelsea Finn and Sergey Levine. [Slides]
  3. Reinforcement Learning – Policy Optimization, Pieter Abbeel. [Slides]
  4. Safe Reinforcement Learning, Philip S. Thomas. [Slides]

You may also consider browsing through the RL publications listed below, to get more ideas.

We also encourage course projects that try to reproduce recent results from an RL paper, for example as promoted in the ICLR 2018 Reproducibility Challenge. Research reproducibility is an important issue in machine learning, and the goal of a reproducibility project should be to provide detailed feedback to the authors of an RL paper about how reproducible their results are. See the challenge page for more information (although the deadline for submission to the challenge has passed, this is still a great project idea).

Important Dates and Times

Date     Time                 Event                      Late Day Policy
Feb 5    11:00 PM (23:00)     Initial project proposal   2 late days allowed. See Late Day Policy.
Feb 26   11:00 PM (23:00)     Project milestone          2 late days allowed. See Late Day Policy.
Mar 14   11:50 AM - 2:50 PM   Poster Session             No late days allowed. See Late Day Policy.
Mar 19   11:00 PM (23:00)     Final report               No late days allowed. See Late Day Policy.

Grading Policy

For grading policy, see the section on Grade Breakdown.

Project Proposal

The project proposal should be about 200-400 words and include the names of the project team members and the project mentor (someone who agrees to give you feedback). The mentor can be one of the course staff or someone external to the class. There is a list of staff interests at the bottom of the page to help you find a mentor. The proposal should also include a brief overview of the proposed project and a project plan that includes the following:

Submit your project proposal by following the Submission Instructions. For late submissions, see the Late Day Policy.

Project Milestone

Your project milestone should be 2 to 3 pages long and use the ICML template. The following is a suggested structure for your report:

Submit your project milestone by following the Submission Instructions. For late submissions, see the Late Day Policy.

Final Report Submission

Your final report should be 6 to 8 pages long and use the ICML template. After the class, we will post all the final reports online so that you can read about each other's work. If you do not want your final report to be posted online, please let us know when you submit your writeup.
You should include a brief statement on the contributions of different members of the team in the report. Team members will normally get the same grade, but we reserve the right to differentiate in egregious cases.

Submit your final report by following the Submission Instructions. For late submissions, see the Late Day Policy.

Report. The following is a suggested structure for the report:
Supplementary Material is not counted towards your 6-8 page limit.
Examples of things to put in your supplementary material:
Examples of things to not put in your supplementary material:

Collaboration Policy and Honor Code

Projects should be done in groups of 3. In very rare cases we will allow groups of 2. We very strongly encourage you to work in groups of 3: we have a limited number of staff, and doing projects in groups of 3 will allow us to give you and your classmates higher quality feedback on your projects!

If you are doing this project jointly with another class, you must inform us and the other instructors, specify whether you are working with any partners who are not in CS 234, and be able to describe the aspects of the project that are relevant to CS 234.

You may use any existing code, libraries, etc. and consult any papers, books, online references, etc. for your project. However, you must cite your sources in your writeup and clearly indicate which parts of the project are your contributions and which parts were implemented by others. Under no circumstances may you look at another group’s code or incorporate their code into your project.

Also read the section on Academic Collaboration and Misconduct for an overview of the collaboration policy and academic integrity standards expected in general.

Staff Interests

Below is a list of the course staff's RL research interests. Feel free to ask any of us to be the mentor to your project.