CS224S: Spoken Language Processing

Winter 2021

Introduction to spoken language technology with an emphasis on dialog and conversational systems. Deep learning and other methods for automatic speech recognition, speech synthesis, affect detection, dialogue management, and applications to digital assistants and spoken language understanding systems.

Course Information

This course is designed around lectures, assignments, and a course project to give students practical experience building spoken language systems. We will use modern software tools and algorithmic approaches. There are no exams. We aim for each student to build something they are proud of.

Homework topics:

  1. Introduction to audio analysis and spoken language tools
  2. Building a complete dialog system using Amazon Alexa Skills Kit
  3. Building a speech recognition system with the Kaldi Speech Recognition Toolkit
  4. Implementing end-to-end deep neural network approaches with PyTorch

(Homeworks 3 and 4 will use a newly developed spoken dialog dataset, HarperValleyBank.)

Course projects can range from algorithmic research with the goal of publishing academic papers to designing and demonstrating complete dialog systems.

Course Staff

Course Assistants

Office Hours

Andrew Maas: Mon. 6:45-8 PM | Wed. 6:45-8 PM
Mike Wu: Tue. 3-4 PM | Thu. 4-5 PM
Alex Bucquet: Tue. 1:30-2:30 PM | Wed. 8:30-9:30 AM
Sandra Ha: Thu. 5-6 PM | Fri. 5-6 PM
John Kamalu: Wed. 2:30-3:30 PM | Fri. 2:30-3:30 PM
Samuel Kwong: Sat. 10 AM - 12 PM

Logistics

Please use our class Piazza forum for all communication related to the course. We encourage you to keep posts public when possible in order to prevent duplication. For private matters, please either make a private post visible only to the course instructors or email cs224s-staff@lists.stanford.edu. For longer discussions, we strongly encourage you to come to office hours.

This course is remote only for the 2020-2021 academic year due to COVID-19. Lectures and office hours will be offered synchronously on Zoom. Lectures are Mondays and Wednesdays, 5:30pm - 6:30pm PST.

Grading

  • Homework: 40%
  • Course Project: 50%
  • Participation: 10%

All assignments are to be submitted via our Gradescope. Each student will have a total of three free late (calendar) days to use for homeworks. Once these late days are exhausted, any assignments turned in late will be penalized 20% per late day. However, no assignment will be accepted more than three days after its due date. Each 24 hours or part thereof that a homework is late uses up one full late day. Please note that late days are applied individually.
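The late-day arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the staff's grading code: the function names are hypothetical, and it assumes that free late days are applied automatically before any penalty and that the 20% penalty is taken off the raw score for each late day not covered by a free day.

```python
import math
from typing import Optional, Tuple

FREE_LATE_DAYS = 3      # total free late days per student (from the policy)
MAX_DAYS_LATE = 3       # submissions more than this many days late are not accepted
PENALTY_PER_DAY = 0.20  # penalty per late day once free days are exhausted


def late_days(hours_late: float) -> int:
    """Each 24 hours, or part thereof, counts as one full late day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(
    raw_score: float, hours_late: float, free_days_left: int
) -> Tuple[Optional[float], int]:
    """Return (final score, free late days remaining).

    The score is None when the submission is too late to be accepted.
    Assumption: free days are used first, then 20% of the raw score is
    deducted for each remaining late day.
    """
    days = late_days(hours_late)
    if days > MAX_DAYS_LATE:
        return None, free_days_left  # not accepted; no late days consumed
    free_used = min(days, free_days_left)
    penalized_days = days - free_used
    score = raw_score * max(0.0, 1.0 - PENALTY_PER_DAY * penalized_days)
    return score, free_days_left - free_used
```

For example, under these assumptions a homework submitted 30 hours late by a student with one free late day remaining counts as two late days: one is covered by the free day and the other incurs a single 20% penalty.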

Regrades will also be handled through Gradescope. We accept regrade requests for an assignment during a three-day window that opens the day after grades are released; requests outside that window will not be accepted. Regrades are intended to remedy grading errors, so a regrade request must explain why you believe your answer is correct in light of the deduction you received. When you submit a regrade request, the grader may review your entire assignment, so you may lose points on other questions and your overall score on the assignment may decrease.

Prerequisites

  • Proficiency in Python. Homework assignments use a mixture of Python with PyTorch, Jupyter Notebooks, the Amazon Alexa Skills Kit, and other tools. We try to make the course accessible to students with a basic programming background, but ideally students will have some experience with machine learning or natural language tasks in Python.

  • Foundations of Machine Learning and Natural Language Processing (CS 124, CS 129, CS 221, CS 224N, CS 229 or equivalent). You should be comfortable with basic concepts of machine learning and natural language processing. We do not strictly enforce a particular set of previous courses, but students will have to fill in gaps on their own depending on their background.

Useful Reference Texts

  • Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft) [link]
  • Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing [link]
  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press. [link]

Honor code

We strongly encourage students to form study groups. Students may discuss and work on programming assignments and quizzes in groups. However, each student must write down the solutions independently, without referring to written notes from the joint session. In other words, each student must understand the solution well enough to reconstruct it on their own. In addition, each student should submit their own code and mention anyone they collaborated with. It is also an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as in a public git repo.

The Stanford Honor Code

The Stanford Honor Code as it pertains to CS courses