Stefan Th. Gries Colloquium
Stanford Linguistics is pleased to announce the following colloquium:
Stefan Th. Gries
University of California, Santa Barbara
Friday, January 15, 3:30 pm, Margaret Jacks 126
Reception immediately afterwards in the department lounge.
Abstract
Corpus linguistics is inherently a distributional discipline: corpora contain nothing but things to count, namely frequencies of occurrence (of morphemes, words, lemmas, n-grams, utterances, texts, etc.), frequencies of co-occurrence (of words, words and patterns, patterns and patterns, etc.), and distributions of elements (within and across files/texts/registers). Thus, any subject studied corpus-linguistically must be operationalized in terms of counts and dispersions.
However, choosing a particular operationalization requires potentially treacherous decisions regarding the desired/required level of granularity. In many contemporary corpus-linguistic studies, such decisions are made arbitrarily and top-down/a priori. In this talk, I will argue in favor of (i) a more widespread use of different kinds of bottom-up approaches and (ii) more frequent and more thorough exploration as well as combination of different levels of granularity in corpus-linguistic studies. To exemplify these arguments, I will use studies of register variation as well as diachronic change.