CSLI Publications
Linguistic Databases

edited by John Nerbonne

Linguistic Databases explains the increasing use of databases in linguistics. The enormous potential in linguistic data—billions of utterances and messages daily—has been difficult to exploit. Data must be archived and organized. Many linguists have had to concentrate on introspective data with its inevitable blinders toward frequency, variation, and naturalness. Applications of linguistics have been handicapped. This volume explores the potential advantages of database applications to linguistics.

Databases not only store large amounts of data but also impose an organization on the data, which facilitates access for researchers and application developers. Linguistic Databases reports on database activities in phonetics, phonology, lexicography and syntax, comparative grammar, second-language acquisition, linguistic fieldwork, and language pathology. The book presents the specialized problems of multimedia (especially audio) and multilingual texts, including those in exotic writing systems. Implemented solutions are discussed, and the opportunities to use existing, minimally structured text repositories are presented.

John Nerbonne is Professor of Computational Linguistics and Chair of Humanities Computing at Groningen.


  • 1 Introduction John Nerbonne
  • 2 TSNLP — Test Suites for Natural Language Processing Stephan Oepen, Klaus Netter, & Judith Klein
  • 3 From Annotated Corpora to Databases: the SgmlQL Language Jacques Le Maitre, Elisabeth Murisasco, & Monique Rolbert
  • 4 Markup of a Test Suite with SGML Martin Volk
  • 5 An Open Systems Approach for an Acoustic-Phonetic Continuous Speech Database: The S_Tools Database-Management System (STDBMS) Werner A. Deutsch, Ralf Vollmann, Anton Noll, & Sylvia Moosmüller
  • 6 The Reading Database of Syllable Structure Erik Fudge & Linda Shockey
  • 7 A Database Application for the Generation of Phonetic Atlas Maps Edgar Haimerl
  • 8 Swiss French PolyPhone and PolyVar: Telephone Speech Databases to Model Inter- and Intra-speaker Variability Gerard Chollet, Jean-Luc Cochard, Andrei Constantinescu, Cedric Jaboulet, & Philippe Langlais
  • 9 Investigating Argument Structure: The Russian Nominalization Database Andrew Bredenkamp, Louisa Sadler, & Andrew Spencer
  • 10 The Use of a Psycholinguistic Database in the Simplification of Text for Aphasic Readers Siobhan Devlin & John Tait
  • 11 The Computer Learner Corpus: A Testbed for Electronic EFL Tools Sylviane Granger
  • 12 Linking WordNet to a Corpus Query System Oliver Christ
  • 13 Multilingual Data Processing in the CELLAR Environment Gary F. Simons & John V. Thomson
  • Name Index
  • Subject Index


ISBN (Paperback): 1575860929 (9781575860923)
ISBN (Cloth): 1575860937 (9781575860930)
ISBN (Electronic): 157586892X (9781575868929)

Subject: Linguistics; Linguistic Analysis; Computational Linguistics

Distributed by the University of Chicago Press

pubs @ csli.stanford.edu