STANFORD CS 224N -- Ling 237
Natural Language Processing 
Spring 2002

Course Information

Lecture: 4/3 units, MW 12:50-2:05 Gates B12
Section: 0 units, W 5:00-6:00 380-380Y


  • The final project deadline has been extended to Wednesday, June 5. However, because final grades must be submitted on time, no more than 5 late days may be used on the final project, so the absolute final deadline is Monday, June 10. To calculate the late days available to your group, take the floor (the integer lower bound) of the mean number of late days remaining across the people in your group. For example, if your group has 2 people, one with 4 late days left and the other with 1, then the mean is 2.5 and you may use 2 late days for your final project.

  • The data and the script that were used for the checkpoint can be found at /afs/ir/class/cs224n/wsdp/bin/check_cases. If you received a message about "too many lines in STDIN" or "not enough lines in STDIN", then your script is making an error, and you should fix it before the final project deadline. Most likely, if you receive "not enough lines in STDIN", you are assuming that there is an extra line at the top of the .test files -- this may be because you are using InstanceSet directly with the test files, even though it is designed to handle training files, which have this extra line at the top. Before you submit your final project, please test it using this script to ensure that you have not made any such errors.

  • Some of you have been asking about a Java interface to WordNet. I have installed JWordNet in the class directory, at /afs/ir/class/cs224n/bin/JWordnet. There are some example programs in the directory. To run them, you must first set the WNHOME environment variable (setenv WNHOME /afs/ir/data/linguistic-data/lib/wordnet-1.7). When you run a program, you must also pass the environment variable explicitly on the command line. To run the "hyponym" program, for example, type (from the JWordNet directory) "java -DWNHOME=$WNHOME hyponym dog". This will give you all of the hyponyms of the word "dog".

  • The first homework assignment is due at 5pm on Monday, April 16. If you do not hand in the assignment in class, you should give it to Chris Manning (Gates 418) or Sara Weden (Gates 419). If neither of them is in the office, please write the time that you submitted it at the top of the assignment and slide it under Professor Manning's door.

  • The section will be held on Wednesdays from 5-6pm. The location will be determined soon and posted here early next week. Our apologies to the few people who are not available at this time. We chose the time with the fewest conflicts, but unfortunately, we could not accommodate everyone's schedule. If you are unable to attend section, please feel free to make use of TA office hours to catch up on material covered in section.

  • The section will be held in building 380, room 380Y.
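
  The late-day rule for the final project announced above can be sketched as follows. This is a minimal illustration, not an official course tool: the function names are hypothetical, and it assumes that late days are counted as calendar days (consistent with June 5 plus 5 late days giving Monday, June 10).

```python
from datetime import date, timedelta

# Values taken from the announcement above.
FINAL_PROJECT_DUE = date(2002, 6, 5)   # Wednesday, June 5
MAX_LATE_DAYS = 5                      # cap on late days for the final project


def group_late_days(remaining):
    """Late days a group may use: floor of the mean, capped at MAX_LATE_DAYS.

    `remaining` lists the late days each group member has left.
    """
    if not remaining:
        raise ValueError("a group needs at least one member")
    if any(days < 0 for days in remaining):
        raise ValueError("late days cannot be negative")
    # Integer division of non-negative integers is the floor of the mean.
    return min(sum(remaining) // len(remaining), MAX_LATE_DAYS)


def latest_submission(remaining):
    """Latest allowed submission date, assuming late days are calendar days."""
    return FINAL_PROJECT_DUE + timedelta(days=group_late_days(remaining))


if __name__ == "__main__":
    # The example from the announcement: members with 4 and 1 late days left.
    print(group_late_days([4, 1]))
    print(latest_submission([4, 1]))
```

  The cap is applied after the mean is taken, so a group whose members all have many late days left is still limited to the June 10 deadline.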


    Useful Information and Handouts

    Course Description

    This course is designed to introduce students to the fundamental concepts and ideas in natural language processing (NLP), and to get them up to speed with current research in the area. It develops an in-depth understanding of both the algorithms available for the processing of linguistic information and the underlying computational properties of natural languages. Word-level, syntactic, and semantic processing are considered from both a linguistic and an algorithmic perspective. The focus is on modern quantitative techniques in NLP: using large corpora and statistical models for acquisition, disambiguation, and parsing. The course also examines and constructs representative systems.


    Intended Audience

    Graduate students and advanced undergraduates specializing in computer science, linguistics, or symbolic systems.


    Sections

    Sections will be held most weeks to go over background material, or to work through problems of the sort found in written and programming assignments. Students are strongly encouraged to attend sections for a better understanding of background material and the assignments.

    Required Materials

    Textbook and Readings

    The required text is Please see for supplementary information about the text, including errata, and pointers to online resources.

    As an additional optional text, you will also find in the bookstore

    Additional papers will occasionally be distributed and discussed during the course.

    Copies of in-class handouts, such as homework assignments and problem set solutions, will be posted here, and hard copies will also be available in the "handout hangout" in Gates 1B while supplies last.