Assignment 2: Playing Pong with Deep Q Learning
Part 1 (Questions 0-4): due 2/5 (Fri) at 6:00 PM (18:00) PST.
Part 2 (Questions 5-6): due 2/12 (Fri) at 6:00 PM (18:00) PST.
See course webpage for the late day policy.
In this assignment, you will use Q-learning with function approximators (a deep Q-network) to play games! We will use OpenAI Gym for the Atari environment. Please note that training on Pong takes more than 8 hours, so please start early.
- understand the basic Q-learning method.
- understand the use of experience replay, a target network, and hyperparameter selection.
- implement and apply a linear approximator on the Pong game.
- implement and apply the DQN on the Pong game.
- understand the differences and tradeoffs between these two approximators.
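To make the ideas in the list above concrete, here is a minimal sketch of experience replay, a target network, and a Q-learning update with a linear approximator. It is not the starter code and not a solution to the assignment: the names ReplayBuffer, q_values, td_target, and q_learning_step are hypothetical, states are assumed to be plain lists of floats, and the linear approximator is assumed to keep one weight vector per action.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_values(weights, state):
    """Linear approximator (assumed form): Q(s, a) = w_a . s for each action a."""
    return [sum(w_i * s_i for w_i, s_i in zip(w, state)) for w in weights]


def td_target(reward, next_state, done, target_weights, gamma):
    """Q-learning target: r + gamma * max_a' Q_target(s', a'), or r at episode end."""
    if done:
        return reward
    return reward + gamma * max(q_values(target_weights, next_state))


def q_learning_step(weights, target_weights, batch, gamma=0.99, lr=0.1):
    """One semi-gradient update of the online weights on a sampled batch."""
    for state, action, reward, next_state, done in batch:
        target = td_target(reward, next_state, done, target_weights, gamma)
        error = target - q_values(weights, state)[action]
        # The gradient of w_a . s with respect to w_a is s.
        weights[action] = [w + lr * error * s for w, s in zip(weights[action], state)]
    return weights
```

In this sketch the target network is just a second copy of the weights that is held fixed while computing targets and refreshed occasionally, for example with target_weights = [list(w) for w in weights]. A DQN replaces the linear q_values with a neural network but keeps the same buffer and target structure.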
Tasks
This assignment has two components: written and coding.
The coding assignment, including some starter code, is available here.
The written homework questions are available here.
If you would like to typeset your solution, we have provided our assignment2 LaTeX file here. PLEASE DO NOT SHARE THIS FILE WITH ANYONE OUTSIDE THE CLASS!
Setup
Working remotely on Azure
(highly recommended for training on Pong)
As part of this course, you can use Azure for your assignments. We recommend this route if you are having trouble with installation and set-up, or if you would like to use better CPU/GPU resources than you have locally. Please see the set-up tutorial here for more details.
Working locally
While you are waiting for your Azure subscription to be set up, you can get started on all problems except q5 and q6 (Pong).
Note: Please be sure you have Anaconda or Miniconda installed. The following instructions should work on all platforms. If you have any trouble getting set up, please come to office hours and the TAs will be happy to help.
Create the conda environment on your local system, replacing <your-system> with either mac or windows:
cd starter_code_torch
conda env create -f cs234-torch-<your-system>.yml
conda activate cs234-torch
Notes
NOTE 1: Please start early!
Submitting your work
Submit both your written assignment and coding assignment by following the Submission Instructions.
Reference papers
- Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
- Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.