Announcements

5/4/08 Quiz 4 was posted last Thursday and is due tonight, Sunday, at midnight, as with previous quizzes.
4/30/08 PA1 is graded and will be handed back in class today. Any PA1s not picked up will be placed in the slots outside Professor Manning's office. Because PA1 is being returned, Quiz 4 will be delayed and will be released starting Wednesday night.
4/23/08 Quiz 3 is up and due at midnight next Sunday, 4/27. Otherwise, the format and rules are the same as before.
4/15/08 Quiz 2 is up and due at midnight next Sunday, 4/20. The format and rules are the same as before.
4/7/08 Quiz 1 is up and due at 5:00 pm this Friday, 4/11. There is no time limit but you may only take the quiz once. The format is multiple choice.
4/6/08 Quiz 0 is up for everyone to make sure they can log into and take the quizzes.
3/28/08 Looking for a programming partner? Try posting to the class newsgroup.
3/28/08 This quarter, cs224n will be broadcast by SCPD for the first time ever. Welcome!
3/28/08 We've started updating the website for Spring 2008, but there's still work to do. Don't rely on the info here before Monday night.


Course Description

This course is designed to introduce students to the fundamental concepts and ideas in natural language processing (NLP), and to get them up to speed with current research in the area. It develops an in-depth understanding of both the algorithms available for the processing of linguistic information and the underlying computational properties of natural languages. Word-level, syntactic, and semantic processing are considered from both a linguistic and an algorithmic perspective. The focus is on modern quantitative techniques in NLP: using large corpora, statistical models for acquisition, disambiguation, and parsing. The course also examines and constructs representative systems.

Prerequisites

  • Adequate experience with programming and formal structures (e.g., CS106B/X and CS103B/X).
  • Programming projects will be written in Java 1.5, so knowledge of Java (or a willingness to learn on your own) is required.
  • Knowledge of standard concepts in artificial intelligence and/or computational linguistics (e.g., CS121/221 or Ling 180).
  • Basic familiarity with logic, vector spaces, and probability.

Intended Audience

Graduate students and advanced undergraduates specializing in computer science, linguistics, or symbolic systems.

Textbook and Readings

This year, the required text will be:

  • Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition. Second Edition. Prentice Hall.
    The book won't be available in time for the class. (June 2008 update: it's now available for purchase!) We will use a reader containing parts of the second edition. The reader can be ordered from University Readers: you order it online and they ship it to you. The cost is $40.58. [Detailed purchasing instructions.] Once you've ordered it, you get free online access to the first couple of chapters that we'll use. If you have any difficulties, please email orders@universityreaders.com or call 800.200.3908, and also email the class email list. It's referred to as J&M in the syllabus. [Book website]

Of course, I'm also fond of:

  • Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press.
    Buy it at the Stanford Bookstore (recommended class text) or Amazon ($64 new).
    You can read the text online on a Stanford network computer! It's referred to as M&S in the syllabus. While a bit older, it also has good and often distinct coverage of many topics. All NLP researchers should have an autographed copy. Please see http://nlp.stanford.edu/fsnlp/ for supplementary information about the text, including errata, and pointers to online resources.

Other useful reference texts for NLP are:

  • James Allen. 1995. Natural Language Understanding. Benjamin/Cummings, 2nd ed.
  • Gerald Gazdar and Chris Mellish. 1989. Natural Language Processing in X. Addison-Wesley. [Where X = Prolog, Lisp, or, I think, Snobol.]
  • Frederick Jelinek. 1998. Statistical Methods for Speech Recognition. MIT Press.

Other papers with relevant material will occasionally be posted or distributed for appropriate class lectures.

Copies of in-class hand-outs, such as readings and programming assignments, will be posted on the syllabus, and hard copies will also be available outside Gates 158 (in front of Prof. Manning's office) while supplies last.

Assignments and Grading

There will be three substantial programming assignments, each exploring a core NLP task. They are a chance to see real, close to state-of-the-art tools and techniques in action, and they are where students learn much of the material of the class.

There will be a final programming project on a topic of your own choosing.

Finally, there will be simple weekly online quizzes, which aim to check that you are thinking about what you hear and read.

Course grades will be based 60% on programming assignments (20% each), 8% on the quizzes, and 32% on the final project.

Be sure to read the policies on late days and collaboration.

Section

Sections will be held most weeks to go over background material, or to address issues related to the programming assignments. Sections are optional, but students are encouraged to attend for a better understanding of background material and the assignments.

Course Information


Lectures: MW 11:00-12:15
Location: Terman Aud
Section: F 11:00-11:50
Location: Skilling 193
Professor: Chris Manning

Electronic Communications

Web: http://cs224n.stanford.edu/

Newsgroup: su.class.cs224n
-- Post general assignment questions, etc. here.

Staff mailing list:
cs224n-spr0708-staff@lists.stanford.edu

Announcements mailing list:
cs224n-spr0708-students@lists.stanford.edu

Enrolled students are automatically subscribed.
Others wishing to receive announcements should
go to mailman.stanford.edu, and subscribe to
"cs224n-spr0708-guests".

Assignments

Quizzes (due weekly)
Assignment 1 (due 4/16/08)
Assignment 2 (due 4/30/08)
Assignment 3 (due 5/14/08)
Final project

Collaboration Policy
Late Day Policy
Regrading Policy

Links

The Stanford NLP Group
Linguistic Corpora at Stanford
Statistical NLP links
Probabilistic parser links
Java 1.5 Overview
Java 1.5 New Features