CS224S: Spoken Language Processing

Spring 2024

Introduction to spoken language technology with an emphasis on dialog and conversational systems. Deep learning and other methods for automatic speech recognition, speech synthesis, affect detection, dialogue management, and applications to digital assistants and spoken language understanding systems.


Poster Session

Please join us in person for the final project poster session!

Anyone with a Stanford affiliation and members of the spoken language research/industry community are welcome to join us on Wednesday, June 5, for a final project poster session. In Spoken Language Processing this year we have about 65 student groups with project topics ranging from speech synthesis with style transfer to exploring foundation model features for spoken language tasks, and even building speech datasets for new languages! Each group will present a poster and be available for questions/discussion as guests circulate.

When: Wednesday June 5, 2024. 12:30pm - 2:00pm

Where: Mackenzie Room. Jen-Hsun Huang Engineering Center. Stanford Campus

What: Spoken Language Processing Class Project Poster Session

Who: We welcome members of the Stanford and Speech/NLP communities

Course Information

This course is designed around lectures, assignments, and a course project to give students practical experience building spoken language systems. We will use modern software tools and algorithmic approaches. There are no exams. We aim for each student to build something they are proud of.

There are three homeworks. Homework topics:

  1. Introduction to audio analysis and speech synthesis tools
  2. Working with speech recognition toolkits and APIs
  3. Leveraging audio foundation models and working with non-English speech tasks

Course projects can range from algorithmic research with the goal of publishing academic papers to designing and demonstrating spoken language systems.

Logistics

Lectures are Mondays and Wednesdays, 12:30pm - 1:20pm Pacific time. The lecture venue is Jordan Hall room 040 (420-040), which is on the lower level of Jordan Hall and accessible via outside doors from the lower courtyard behind Jordan Hall. Lectures will be held in person and students are strongly encouraged to participate in person. We will record lectures using Zoom and make recordings available on Canvas after class (only available to enrolled students).

Please use Ed Discussion for all communication related to the course. We encourage you to keep posts public when possible in order to prevent duplication. For private matters, please either make a private post visible only to the course instructors or email cs224s-spr2324-staff@lists.stanford.edu. For longer discussions, we strongly encourage you to use office hours.

Course Staff

Course Assistants

Office Hours

Andrew Maas: Monday & Wednesday 1:20 pm - 2:00 pm. In person, outside the lecture hall after class.
Gautham Raghupathi: Monday 3:15 pm - 4:15 pm. Zoom link (password: 577468)
Fahad Nabi: Tuesday 5:45 pm - 7:00 pm. Zoom link
Abhinav Garg: Wednesday 9:00 am - 10:00 am. Zoom link
Tolúlọpẹ́ Ogunremi: Thursday 10:30 am - 11:30 am. Zoom link

Grading

  • Homeworks: 35%
    1. Homework 1: 11%
    2. Homework 2: 12%
    3. Homework 3: 12%
  • Course Project: 60%. The point breakdown for the project will be provided as part of the course project handout. The final report and poster are the main components of the course project grade.
  • Participation: 5%
    1. Guest lectures: 0.5% for each of the 6 guest lectures in the course. To earn credit, attend the lecture or, if you are unable to attend, ask a question in advance on Ed.
    2. Ed contributions: 2%. The top 10 Ed contributors receive the full 2%. All other students receive a fraction of 2% based on their contributions relative to the 10th highest contributor (e.g., 0.5 * 2% for a contribution level that is 50% of the 10th highest contributor's).
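
As a rough illustration of the participation arithmetic above, here is a minimal Python sketch. The function and constant names are hypothetical, and two details are assumptions rather than stated policy: how "contributions" are counted, and that a non-positive 10th-highest count is treated as an error. This is not an official grading script.

```python
GUEST_LECTURES = 6               # number of guest lectures in the course
CREDIT_PER_GUEST_LECTURE = 0.5   # percentage points per credited lecture
MAX_ED_CREDIT = 2.0              # percentage points for Ed contributions


def ed_credit(contributions, tenth_highest):
    """Ed credit in percentage points, scaled against the 10th highest contributor."""
    if tenth_highest <= 0:
        # Assumption: the syllabus does not define this case.
        raise ValueError("tenth_highest must be positive")
    # Anyone at or above the 10th highest contributor gets the full 2%.
    ratio = min(contributions / tenth_highest, 1.0)
    return MAX_ED_CREDIT * ratio


def participation_credit(guest_lectures_credited, contributions, tenth_highest):
    """Total participation credit in percentage points (at most 5)."""
    lectures = min(guest_lectures_credited, GUEST_LECTURES)
    return lectures * CREDIT_PER_GUEST_LECTURE + ed_credit(contributions, tenth_highest)


# The syllabus example: half the contributions of the 10th highest contributor.
print(ed_credit(5, 10))  # 1.0
```

With these assumptions, a student credited for all 6 guest lectures who is among the top 10 Ed contributors receives the full 5%.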

Late days

All assignments are to be submitted via our Gradescope. Each student will have a total of five (5) free late (calendar) days to use for homeworks. Once these late days are exhausted, any assignments turned in late will be penalized 20% per late day. However, no assignment will be accepted more than three (3) days after its due date. Each 24 hours or part thereof that a homework is late uses up one full late day. Please note that late days are applied individually. Submitting a project deliverable late costs each group member one late day per day.
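
The late-day rules for homeworks can be sketched in Python as follows. This is only an illustration under stated assumptions: the names are hypothetical, free late days are assumed to be consumed before any penalty applies, and the 20% per day is assumed to be additive percentage points rather than compounding. The course staff's actual bookkeeping may differ.

```python
import math
from typing import NamedTuple

MAX_DAYS_LATE = 3      # nothing is accepted more than three days late
PENALTY_PER_DAY = 20   # percentage points per late day once free days run out


class LateOutcome(NamedTuple):
    accepted: bool
    free_days_used: int
    penalty_percent: int


def late_days_charged(hours_late):
    """Each 24 hours, or part thereof, counts as one full late day."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(hours_late, free_days_remaining):
    """Return whether a submission is accepted and what it costs."""
    days = late_days_charged(hours_late)
    if days > MAX_DAYS_LATE:
        return LateOutcome(False, 0, 0)
    # Assumption: free late days are used first, then the penalty applies.
    free_used = min(days, free_days_remaining)
    penalty = PENALTY_PER_DAY * (days - free_used)
    return LateOutcome(True, free_used, penalty)


# 25 hours late with all five free days left: two late days, no penalty.
print(apply_late_policy(25, 5))
```

Under these assumptions, a homework that is 49 hours late from a student with one free late day left would use that day and lose 40 percentage points.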

Regrades

Regrades will also be handled through Gradescope. We accept regrade requests for an assignment during a three-day window that begins the day after its grades are released. We will not accept regrade requests outside of that window. Regrades are intended to remedy grading errors, so a regrade request must explain why you believe your answer is correct in light of the deduction you received. When you submit a regrade request, the grader may review your entire assignment, so you may lose points on other questions and your overall score on the assignment may decrease.

Prerequisites

  • Proficiency in Python. Homework assignments will use a mixture of Python with PyTorch, Jupyter Notebooks, the Amazon Alexa Skills Kit, and other tools. We attempt to make the course accessible to students with a basic programming background, but ideally students will have some experience with machine learning or natural language tasks in Python.

  • Foundations of Machine Learning and Natural Language Processing (CS 124, CS 129, CS 221, CS 224N, CS 229 or equivalent). You should be comfortable with basic concepts of machine learning and natural language processing. We do not strictly enforce a particular set of previous courses, but students will have to fill in gaps on their own depending on their background.

Useful Reference Texts

  • Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft) [link]
  • Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing [link]
  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press. [link]
  • CS224N Python Tutorial [Notebook link] [Slides link]
  • CS224N PyTorch Tutorial [link]

Honor code

We encourage students to form study groups. Students may discuss and work on programming assignments and quizzes in groups. However, each student must write down the solutions independently, without referring to written notes from the joint session. In other words, each student must understand the solution well enough to reconstruct it on their own. In addition, each student should submit their own code and mention anyone they collaborated with. It is also an honor code violation to copy, refer to, or look at written or code solutions from a previous year, including but not limited to: official solutions from a previous year, solutions posted online, and solutions you or someone else may have written up in a previous year. Furthermore, it is an honor code violation to post your assignment solutions online, such as on a public git repo.

AI Tools Policy

Students are required to submit their own solutions for homework assignments. Working with generative AI tools such as GitHub Copilot and ChatGPT is allowed, provided you treat them as collaborators in the problem-solving process. However, directly soliciting answers or copying solutions, whether from peers or external sources, is strictly prohibited. If you use such tools to help complete the homework, please cite them in your report.

Employing AI tools to substantially complete assignments or the project is considered a violation of the Honor Code. For additional details, please refer to the Generative AI Policy Guidance here.

The Stanford Honor Code

The Stanford Honor Code as it pertains to CS courses