This course is designed around lectures, assignments, and a course project to give students practical experience building spoken language systems. We will use modern software tools and algorithmic approaches. There are no exams. We aim for each student to build something they are proud of.
There are four homeworks. Homework topics:
- Introduction to audio analysis and spoken language tools
- Building a complete dialog system using Amazon Alexa Skills Kit
- Implementing end-to-end deep neural network approaches to speech recognition using PyTorch
- Working with advanced deep learning toolkits for speech recognition (SpeechBrain) and voice cloning
(Homeworks 3 and 4 will use a newly developed spoken dialog dataset, HarperValleyBank)
Course projects can range from algorithmic research with the goal of publishing academic papers to designing and demonstrating spoken language systems.
Lectures are Tuesdays and Thursdays, 5:30pm - 6:30pm Pacific time. The lecture venue is Sapp Center for Science Teaching & Learning room 111 (STLC 111). Lectures are held in person and are not recorded, which is intended to encourage more student interaction and candid discussion.
Please use Ed Discussion for all communication related to the course. We encourage you to keep posts public when possible in order to prevent duplication. For private matters, please either make a private post visible only to the course instructors or email email@example.com. For longer discussions, we strongly encourage you to use office hours.
Andrew Maas: Thu. 6:45-8 PM | In person. Lecture hall after class.
Gaurab Banerjee: Wed. 1:30-2:30pm, 3-4pm | calendly
Shreya Gupta: Thu. 2-3 PM, Fri. 11 AM - 12 noon | calendly
Alex Ke: Tue. 3-5 PM | calendly
- Homeworks: 45%
  - Homework 1: 10%
  - Homework 2: 12%
  - Homework 3: 12%
  - Homework 4: 11%
- Course Project: 50%. The point breakdown for the project will be provided as part of the course project handout. The final report and poster are the main components of the course project grade.
- Participation: 5%
  - Guest lectures: attend each of the 3 guest lectures in the course, or ask a question in advance on Ed if you are unable to attend. 1% per lecture.
  - Ed contributions: we will award 2% to the top 10 Ed contributors. All other students will receive a fraction of 2% based on their contributions relative to the 10th highest contributor (e.g., 0.5 * 2% = 1% for a student at 50% of the 10th highest contributor's contribution level).
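As an illustration of how the weights above combine, here is a minimal Python sketch. The function names, the representation of scores as fractions between 0 and 1, and the handling of a non-positive 10th-highest contribution count are assumptions made for this example; the course staff's actual grade computation may differ.

```python
# Weights are percentage points of the final grade, taken from the breakdown above.
HOMEWORK_WEIGHTS = {"hw1": 10.0, "hw2": 12.0, "hw3": 12.0, "hw4": 11.0}
PROJECT_WEIGHT = 50.0
GUEST_LECTURE_WEIGHT = 1.0  # per guest lecture
NUM_GUEST_LECTURES = 3
ED_WEIGHT = 2.0


def ed_credit(contributions, tenth_highest):
    """Ed credit in percentage points (hypothetical helper).

    Students at or above the 10th highest contributor get the full 2%;
    everyone else gets a proportional fraction of it.
    """
    if tenth_highest <= 0:
        # Assumption: the syllabus does not define this edge case.
        raise ValueError("tenth_highest must be positive")
    return ED_WEIGHT * min(1.0, contributions / tenth_highest)


def final_grade(hw_scores, project_score, lectures_credited,
                contributions, tenth_highest):
    """Final grade in percent; hw_scores and project_score are fractions in [0, 1]."""
    homework = sum(w * hw_scores[name] for name, w in HOMEWORK_WEIGHTS.items())
    project = PROJECT_WEIGHT * project_score
    lectures = GUEST_LECTURE_WEIGHT * min(NUM_GUEST_LECTURES, lectures_credited)
    return homework + project + lectures + ed_credit(contributions, tenth_highest)


if __name__ == "__main__":
    # A student at half the 10th highest contributor's level earns 0.5 * 2% = 1%.
    print(ed_credit(contributions=5, tenth_highest=10))
```

The homework weights sum to 45, so a student with full marks everywhere, all three guest lectures credited, and top-10 Ed contributions reaches 45 + 50 + 3 + 2 = 100.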
All assignments are to be submitted via our Gradescope. Each student will have a total of three free late (calendar) days to use for homeworks. Once these late days are exhausted, any assignments turned in late will be penalized 20% per late day. However, no assignment will be accepted more than three days after its due date. Each 24 hours or part thereof that a homework is late uses up one full late day. Please note that late days are applied individually. Submitting a project deliverable late costs each group member one late day per day.
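The late policy for a single homework can be sketched in Python as follows. This is an illustrative reading of the rules above, not the official implementation. It assumes that free late days are applied automatically before any penalty, and that the 20% penalty is deducted from the raw score for each late day not covered by a free day. The function names are hypothetical.

```python
import math
from datetime import datetime, timedelta

PENALTY_PER_DAY = 0.20  # deduction per late day not covered by a free late day
MAX_DAYS_LATE = 3       # nothing is accepted more than three days late


def late_days_used(due, submitted):
    """Each 24 hours, or part thereof, past the deadline counts as one full late day."""
    late = submitted - due
    if late <= timedelta(0):
        return 0
    return math.ceil(late / timedelta(days=1))


def apply_late_policy(raw_score, due, submitted, free_days_left):
    """Return (adjusted_score, free_days_left); adjusted_score is None if not accepted."""
    days = late_days_used(due, submitted)
    if days > MAX_DAYS_LATE:
        return None, free_days_left
    covered = min(days, free_days_left)  # assumption: free days are used first
    penalized = days - covered
    # Assumption: the penalty is a fraction of the raw score, floored at zero.
    multiplier = max(0.0, 1.0 - PENALTY_PER_DAY * penalized)
    return raw_score * multiplier, free_days_left - covered


if __name__ == "__main__":
    due = datetime(2024, 1, 10, 23, 59)
    # 25 hours late counts as two late days; with one free day left,
    # one day is covered and the other is penalized.
    print(apply_late_policy(100.0, due, due + timedelta(hours=25), free_days_left=1))
```

Under these assumptions, a submission exactly 72 hours late is still accepted as three late days, while anything later is rejected.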
Regrades will also be handled through Gradescope. We will begin to accept regrade requests for an assignment the day after grades are released for a window of three days. We will not accept regrades for an assignment outside of that window. Regrades are intended to remedy grading errors, so regrade requests must discuss why you believe your answer is correct in light of the deduction you received. When you submit a regrade request, the grader may review your entire assignment, in which case you may lose points on other questions. Your score on an assignment may decrease if you submit for a regrade.
Proficiency in Python. Homework assignments will use a mixture of Python with PyTorch, Jupyter Notebooks, Amazon Alexa Skills Kit, and other tools. We attempt to make the course accessible to students with a basic programming background, but ideally students will have some experience with machine learning or natural language tasks in Python.
Foundations of Machine Learning and Natural Language Processing (CS 124, CS 129, CS 221, CS 224N, CS 229 or equivalent). You should be comfortable with basic concepts of machine learning and natural language processing. We do not strictly enforce a particular set of previous courses, but students will have to fill in gaps on their own depending on their background.
Useful Reference Texts
- Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft)
- Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing
- Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press.
- CS224N Python Tutorial (notebook and slides)
- CS224N PyTorch Tutorial
We strongly encourage students to form study groups. Students may discuss and work on programming assignments and quizzes in groups. However, each student must write down the solutions independently, without referring to written notes from the joint session. In other words, each student must understand the solution well enough to reconstruct it on their own. In addition, each student should submit their own code and mention anyone they collaborated with. It is also an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as on a public git repo.