
Top 10 info-sources

Corpora@Stanford


Our absolutely biased Top 10 Info Sources

Well, what to say? There is LOTS of material on corpora out there, and more keeps appearing. This is an attempt to give you about 10 really good portals into the world of corpora beyond Stanford. Note also that some of our local pages contain further links to corpora and to corpora tools and software (local as well as non-local). OK, here we go:

General portals

  • Gateway to corpus linguistics on the internet
      By Florian Jaeger: Yeah, maybe I am being patriotic here ;-), but this is a nice, well-structured page containing tutorials, a bibliography, nice summaries of corpora, tool overviews, etc. It even explains how to save 16 days of work by using it as a reference source. The sitemap provides an easy way to navigate through the different parts of the site, and the database seems to be up to date (05/22/04). Finally, let me mention that this site offers not only introductions for the beginner but also coverage of a wide range of more advanced topics in corpus linguistics.
  • David Lee's Database of Corpora and Tools
      By Florian Jaeger: This is definitely the biggest collection of corpora, corpora tools, tutorials, helpers, email lists, etc. (over 1,000 annotated links the last time I checked). Well organized and easy to navigate. Nevertheless, even this one is not complete =).
  • The Linguist List Texts page
  • University of Tübingen Corpora page (German) - corpora and corpora tools
  • Essex Corpus Linguistics
  • Chris Manning's CL Resources page
      By Florian Jaeger: A large collection of annotated links (focusing on computational corpus linguistics): a lot of links to (free) software, dictionaries, corpora, treebanks, corpora mark-up languages, and some organizations. This is a great site to start a search for a particular (type of) software.
  • Michael Barlow's Corpus page
  • Emily Bender's Corpora page
  • Knut Hofland's list of Corpora and Tools

Dictionary for corpus linguistics

Annotation formats

  • An overview of different annotation formats and conventions is provided by the LDC Annotations page