Course Project

Winter 2021

Project Overview

One of the main goals of this course is to prepare you to develop spoken language processing systems of practical use. If you are interested in research, CS224S should also leave you well-qualified to do speech recognition and language understanding in an academic setting. The final project in this course will offer you an opportunity to do exactly this.

Important Dates

  • Proposal: Due at 11:59 PM PST on Wednesday, February 17

  • Milestone: Due at 11:59 PM PST on Wednesday, March 3

  • Presentation: During lecture times on Monday, March 15 and Wednesday, March 17

  • Report: Due at 11:59 PM PST on Monday, March 22. You may not use late days for this extended deadline.

Project Topics

Your first task is to pick a project topic. If you are looking for project ideas, please see the course project lecture and check Piazza for ideas from the staff and other teams. You can also go to either Andrew Maas’ or Mike Wu’s office hours for help generating and refining project ideas.

Task Project

Many fantastic class projects come from students picking either an application or dataset that they are interested in and applying topics from the class to that task. Alternatively, if you are interested in a specific set of techniques from the class, we can help find a dataset or task where you can tractably explore those techniques.

Building a functioning dialog system for a specific task can also be a great project. While this may not be as research-oriented in the sense of running experiments, an excellent course project could build a useful dialog system (e.g., an Alexa Skill) for a task and demo it. This demo system needs to be paired with some empirical evaluation, training a component, or otherwise designing and testing the system in a thoughtful way.

Research Project

If you are already working on a research project related to topics in class, we encourage you to apply what you learned in class as a project. An excellent CS224S project will comprise a publishable or nearly publishable piece of work. In previous years, some students have continued working on their projects after completing CS224S and submitted their work to a conference or journal. You can also look through the course projects from 2017.

Recent Works

For inspiration, you might also look at some recent spoken language understanding research papers. Topics covered in class span several conferences, but you can look at the recent proceedings of Interspeech, ASRU, SigDial, EMNLP, ACL, NAACL, NeurIPS, ICLR, and ICML for research papers in this area.

Datasets

This quarter, we have introduced a new dataset: the HarperValleyBank corpus. We encourage students to explore this in their projects. Stanford also has many datasets which you might use as a benchmark task for your project. You can browse the available datasets from LDC and similar in the NLP group inventory here. We also compiled a supplemental list here. Please post on Piazza if you need help getting access to a dataset.

Project Logistics

Notes on Forming Projects

  • Team size. To facilitate overlap with other courses, students may do final projects solo or in teams of up to 3 people. We strongly recommend you do the final project in a team, as we expect significant experiment time vs. just setting up tools/data and running a baseline. We expect larger teams to undertake larger projects or more experiments.

  • Contribution. We expect each team member to make a significant contribution to the overall project. In the final report, include a Contribution section that describes what pieces each person contributed to the final project. We typically assign the same project grade to all team members, but we might differentiate in rare cases. You can contact us in confidence in the event of unequal contribution.

  • External collaborators. You can work on a project that has external (non CS224S student) collaborators, but you must make it clear in your final report which parts of the project were your work.

  • Sharing projects. You can share a single project between CS224S and another class, but we expect the project to be accordingly bigger, and you must declare that you are sharing the project in your project proposal.

  • Using external resources. You may use external tools/frameworks for building deep learning systems, doing speech processing, or building something in a framework like Alexa Skills Kit. Simply stitching together tools for a demo is not a sufficient project. Instead, use tools as a way to focus your effort on the meaningful research question or most critical capabilities for your project.

Project Evaluation

Projects will be evaluated based on:

  • Technical quality of the work. (Does the technical material make sense? Are the things tried reasonable? Are the proposed algorithms or applications clever and interesting? Do the authors convey novel insight about the problem and/or algorithms? Are any system demos compelling in their function/scope?)

  • Significance. (Did the authors find a dataset to appropriately test their hypothesis? Is this work likely to be useful and/or have impact? Could a demo system / tool be used in practice?)

  • Novelty of the work. (Does this solve a new problem/domain/task/dataset? Does it introduce a new approach? Is it a justifiable next step given previous work in this area?)

  • Clarity of the write-up. (Your write-up should clearly describe your task, dataset, solution approach, relevant related work, and any experiments/tests you tried, along with your conclusions.)

Try not to overthink these criteria or worry too much if you’re not sure that you can do well on all of them. Just think of this as an “ideal” that you should aspire to (especially if your goal is to do publishable work).

Lastly, a few words of advice: Many of the best class projects come from students working on topics that they are excited about. So, pick something that you can get excited and passionate about! Be brave rather than timid, and do feel free to propose ambitious things that you are excited about. Finally, if you are not sure what would or would not make a good project, we encourage you to either post on Piazza or come to office hours to talk about project ideas.

Project Office Hours

Please use Andrew’s and Mike’s office hours only for project-related questions. For homework questions you can visit any other TA’s office hours. In addition to office hours, Mike will be opening 30-minute slots from 10 AM to 3 PM every Friday to chat with teams about project-related questions. Make a booking here.

Deliverables

All project groups will submit each of the following via Gradescope.

Project Proposal (10%)

Proposals are due at 11:59 PM PST on Wednesday, February 17. A proposal should be a maximum of 500 words and include the following:

  • The task you plan to work on.
  • The dataset you plan to use.
  • A sketch of your proposed approach/model.
  • How you plan to evaluate your approach.

If your proposed project will be done jointly with a different class project (with the consent of the other class instructor), your proposal must clearly say so. The teaching staff will review your proposal and contact you if we foresee any issues with your project.

Milestone (10%)

Milestones are due at 11:59 PM PST on Wednesday, March 3. They should be submitted through Gradescope. A milestone report should include the following:

  • Literature review of 2-5 relevant papers.
  • If your approach requires collecting data, describe your data collection set up.
  • If you are working with an existing dataset, show basic data exploration for the dataset and describe how you are training with or otherwise using the dataset, or how you plan to.
  • Baseline results for comparison to your ongoing work.

Project Presentation (10%)

Project presentations happen during lecture time. Each project team will submit a short video presenting the major results of their work. Your video can include a system demo, slides with voiceover, or a screen capture of how your system works. A recorded Zoom session is sufficient for this video. During class time, we will play your video and people can ask questions in chat for you to discuss/answer. We will announce details of the final presentation sessions later in the quarter.

Final Report (70%)

Final project write-ups are due at 11:59 PM PST on Monday, March 22. Final project write-ups should be 5 pages of text and may include additional pages for appendices, figures, references, and everything else you choose to submit. The following is a suggested structure:

  • Title, Author(s)

  • Abstract: This is an overview of the story of your project (the motivation, the findings, the impact). It should not be more than 300 words.

  • Introduction: This section introduces your problem and the overall plan for approaching your problem. Motivate why the problem your project is solving is important.

  • Related Works: This section discusses relevant literature for your project. What other approaches have tried to solve the same problem as your project? What are the differences in methodology?

  • Approach: This section details the framework of your project. Be specific, which means you might want to include equations, figures, plots, etc. Do not mention any experiments yet; this is your opportunity to present the models, algorithms, and new technical contributions in abstract.

  • Experiments: This section begins with what kind of experiments you are doing, what kind of dataset(s) you are using, and how you measure or evaluate your results. It then shows in detail the results of your experiments. By detail, we mean both quantitative evaluations (show numbers, figures, tables, etc.) as well as qualitative results (show images, example results, etc.).

  • Conclusion: What have you learned? Summarize takeaways and suggest future directions.

  • References: This is absolutely necessary. Please follow a consistent citation scheme for your report template and include references to previous work related to your task, dataset, and/or modeling approaches.

Please use the ACL 2020 template for write-ups. The template can be downloaded here. If you do this work in collaboration with someone else, or if someone else (such as another professor) advises you on this work, your write-up must fully acknowledge their contributions.

After the class, we will post all the final write-ups online so that you can read about each other’s work. If you do not want your write-up to be posted online, please specifically mention it in the final paragraph of your write-up.

Previous Course Projects

Spring 2017

List of Projects

What's Up, Doc? A Medical Diagnosis Bot
Monica Agrawal, Janette Cheng, Caelin Tran

Alternative Political Speech Classification "Facts"
Tyler Dammann, Regina Nguyen

Improving Forced Alignments
Christina Ramsey, Frank Zheng

Predicting Assertiveness in Conversation Using Deep Learning
Isabella Cai, Catherina Xu, Grace B. Young

Dialogue Acts in Design Conversations
Ethan Chan, Aaron Loh, Connie Zeng

Style Transfer for Prosodic Speech
Anthony Perez, Chris Proctor, Archa Jain

Native Language Identification of Spoken Language Using Recurrent
Kai-Chieh Huang, Jennifer Lu, Wayne Lu

Reading Emotions from Speech using Deep Neural Networks
Anusha Balakrishnan, Alisha Rege

Automatic Lyrics Transcription by Separating Vocals from Background Music
Diveesh Singh, Helen Jiang, Mindy Yang

Applying a Recurrent Neural Network using Connectionist Temporal Classification to Automatic Recognition of Lyrics in Singing
Maneesh Apte, Matthew Chen, Teddy Morris-Knower, Shalom Rottman-Yang

Deep RNN Speech Recognition with Sub-Labels
Jiayu Wu, Yangxin Zhong, Qixiang Zhang

NATLID: Native Language Identification
Ankita Bihani, Anupriya Gagneja, Mohana Prasad Sathya Moorthy

Detecting Personality Traits in Conversational Speech
Liam Kinney, Anna Wang, Jessica Zhao

Compression of Deep Speech Recognition Networks
Stephen Koo, Priyanka Nigam, Darren Baker

Generating Adversarial Examples for Speech Recognition
Dan Iter, Jade Huang, Mike Jermann

Rappify: Adding Rhythm to Speech
Ian Torres, Jacob Conrad Trinidad

Identifying Confidence in Speech
Grady Williams, Bryan McLellan, Grant Sivesind

Battleship as a Dialog System
Gerry Meixiong, , Tony Tan-Torres, Jeffrey Yu

The Effect of Speech Disfluencies on Turn-Taking
Lucy Li, Kartik Sawhney, Divya Sain

Native Language Identification from Speech Transcriptions
Kent Blake, Greg Ramel, Matthew Volk

A Neural Network Approach to the Native Language Inference Task
Roger Chen, Kenny Leung

Improving Conversational Forced Alignment with Lexicon Expansion
Christopher Liu, , Stephanie Mallard, Ryan Silva

Detecting Lies via Speech Patterns
Amanda Chow, John Louie

pWAVE: A Novel Dataset for Emotional Confidence Detection
Sanjay Kannan, Rooz Mahdavian

Infer Your First Language from Your English
Qiwen Fu, Wei-ting Hsu, Yundong Zhang

Native Language Identification from i-vectors and Speech Transcriptions
Ben Ulmer, Aojia Zhao, Nolan Walsh

Monaural Source Separation Using Neural Networks
Simon Kim, Mark Kwon, Sunmi Lee

Neural Lie Detection with the CSC Deceptive Speech Dataset
Shloka Desai, Maxwell Siegelman, Zachary Maurer

Native Language Identification through Speech
Delenn Chin, Kevin Chen, David Morales

Dialogue System for Restaurant Reservations using Hybrid Code Network
Charles Akin-David, David Xue, Evelyn Mei

VitiBot: A Dialog System Sommelier
Stephanie Tang, Ivan Suarez, Jim Andress

Applying Artistic Style Transfer to Natural Language
Thaminda Edirisooriya, Morgan Tenney

End-to-end neural networks for subvocal speech recognition
Pol Rosello, Pamela Toman, Nipun Agarwala

Latent Sentiment
Frank Cipollone, Hugo Clifford Kitano, Mila Faye Schultz

Pitch Perfect: Predicting Startup Funding Success Based on Shark Tank Audio
Shubha Raghvendra, Jeremy Wood, Minna Xiao

Improving Acoustic Models for Enriched Lexicons
Vivian Hsu, Addison Leong, Antariksh Mahajan

Text-to-speech Synthesis System based on Wavenet
Yuan Li, Xiaoshi Wang, Shutong Zhang

Classification and Recognition of Stuttered Speech
Manu Chopra, Kevin Khieu, Thomas Liu

Convolutional Neural Networks and Grammar Rules Analysis for Speech-based Native Language Identification
Dilsher Ahmed, Long-Huei Chen, Ayooluwakunmi Jeje

Statistical Methods for Native Language Identification
Tony Bruess, Frank Fan, Brexton Pham

End-to-End Neural Speech Synthesis
Alex Barron

Storytime - End to end neural networks for audiobooks
Pierce Freeman, Ethson Villegas, John Kamalu

Accent Conversion Using Artificial Neural Networks
Amy Bearman, Kelsey Josund, Gawan Fiore

Learning to Recognize Speech From Chaotically Synthesized Data
Faraz Bonab, Samuel Ginn

Applying Backoff to Concatenative Speech Synthesis
Lily Liu, Luladay Price, Andrew Zhang

Deep Learning Approaches for Online Speaker Diarization
Chaitanya Asawa, Nikhil Bhattasali, Allan Jiang

Auditory Deep Q Networks
Austin Ray, Do-Hyoung Park, Vignesh Venkataraman

Detecting and Artistically Representing Romantic Compatibility in Human Dialogue
Chris Salguero, Anna Teixeira, Ramin Ahmar

Modeling Intonation in Text-to-Speech Synthesis with a Bidirectional Long Short-Term Memory Recurrent Neural Network
Kevin Garbe, Aleksander Glowkal

Mark My Words! End-to-End Memory-Enhanced Neural Architectures for Automatic Speech Recognition
Amani Peddada, Lindsey Kostas