Corpus-related classes & projects

Corpora@Stanford

Classes

See also the department's site, which contains descriptions of all linguistics courses.

  • LIN 12Q - You Can't Say That! Usage and Prescriptive Grammar
      Stanford Introductory Dialogue. Preference to sophomores. Prescriptions about language, both spoken and written; opinions about which choices are best or standard, from sometimes conflicting authorities. Case studies in modern English, using dictionaries, usage manuals, popular writing on language, and research on actual usage.
  • LIN 128/228 - Real English: The Syntax of Language Use
      Hands-on experience with modern corpus methods, and natural spoken and written syntactic data. Introduce and develop syntax through the syntactic analysis of spontaneous spoken conversations as well as newspaper reportage, using tagged and parsed corpora such as the Penn Treebank. Topics include standard subject matter suitable for a syntax introduction, but each of the core topics is investigated empirically in natural English.
  • LIN 203 - Research Methods in Linguistics
      Introduction to current research methods in linguistics through presentations and in-class, hands-on exercises. Topics include use of corpus data, extraction of suitable data from corpora, use of human subjects, experimental design, and elicitation and observation in the field and laboratory. Primarily for first-year Ph.D. students in Linguistics; also open to M.A. students.
  • LIN 237/CS 224N - Natural Language Processing
      Algorithms for processing linguistic information and the underlying computational properties of natural languages. Morphological, syntactic, and semantic processing from a linguistic and an algorithmic perspective. Focus is on modern quantitative techniques in NLP: using large corpora, statistical models for acquisition, representative systems. Prerequisites: 138/238 or CS 121/221, and programming experience. Recommended: basic familiarity with logic and probability.
  • LIN 237D - Readings in Natural Language Processing

Projects

This list is far from complete, but it gives some pointers to corpus-related projects at Stanford.

  • LinGO has several corpus-related strands of research. See http://lingo.stanford.edu/ for detailed information. From the LinGO project site:

      "The CSLI LinGO Lab is committed to the development of linguistically precise grammars based on the HPSG framework, and general-purpose tools for use in grammar engineering, profiling, parsing and generation. Early work in the CSLI LinGO Lab focused on the construction of a general-purpose grammar of English in the form of the English Resource Grammar (or ERG), and on further development of the LKB grammar engineering system. The LKB was also used at CSLI as the testbed for a number of teaching grammars and smaller-scale grammars for other languages including Japanese and Spanish."
    The relevant contact people are Tim Baldwin (tbaldwin@csli), Stephan Oepen (oe@csli), and Dan Flickinger (danf@csli).
  • Infomap
    The contact person is Dominic Widdows at CSLI.
  • PARC has several ongoing grammar-design projects.