What about Optimality Theory?
Q: Your book is focused on old-fashioned rewrite
rules. It espouses a serialist view of morphology that is rejected by
most current phonologists. The prevailing nonsequential approach to
phonology, Optimality Theory (OT), is based not on rewrite rules but
on constraints. Constraints are universal, but each language may rank
them differently, and they are violable. A function called Gen maps each lexical form into all
possible output candidates. The winning candidate or candidates are
selected by checking the number of constraint violations and their
severity, as determined by the ranking. The winners may violate
some constraints, but they come out better than any alternative in this
evaluation. Is there a way to use
finite-state techniques to implement OT?
A: There is a large literature on this topic.
For example, you could start with Karttunen's paper The
Proper Treatment of Optimality in Computational Phonology (in
the Proceedings of FSMNLP'98, International Workshop
on Finite-State Methods in Natural Language Processing, June 29-July 1,
pages 1-12, Bilkent University, Ankara, Turkey). Karttunen argues that
Gen is most likely a regular relation and that many if not all OT
constraints represent regular languages. But because OT in
principle requires unlimited counting, even the classical OT
of Prince and Smolensky (1993) cannot be modeled by a finite-state
device. The same is true, even without the counting issue, of some
other variants of OT such as John McCarthy's Sympathy Theory and
Benua's Output-Output constraints.
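To make the counting issue concrete, here is a minimal Python sketch of classical evaluation over a finite candidate list. It is an illustration under stated assumptions, not code from the Book or from Karttunen's paper: the names violation_profile, classical_eval, star_x and star_y are hypothetical, and the toy constraints operate on abstract strings rather than real phonological forms.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A counting constraint returns how many times a candidate violates it.
CountingConstraint = Callable[[str], int]


def violation_profile(candidate: str,
                      ranking: Sequence[CountingConstraint]) -> Tuple[int, ...]:
    """Violation counts for one candidate, highest-ranked constraint first."""
    return tuple(constraint(candidate) for constraint in ranking)


def classical_eval(candidates: Iterable[str],
                   ranking: Sequence[CountingConstraint]) -> List[str]:
    """Return the candidates whose violation profile is lexicographically least."""
    pool = list(candidates)
    if not pool:
        return []
    best = min(violation_profile(c, ranking) for c in pool)
    return [c for c in pool if violation_profile(c, ranking) == best]


# Hypothetical toy constraints over abstract strings (not real phonology).
def star_x(candidate: str) -> int:
    """One violation per occurrence of 'x'."""
    return candidate.count("x")


def star_y(candidate: str) -> int:
    """One violation per occurrence of 'y'."""
    return candidate.count("y")


toy_candidates = ["xxy", "xyyy", "xxxx"]
# Same candidates, two different rankings of the same two constraints.
winners_x_first = classical_eval(toy_candidates, [star_x, star_y])
winners_y_first = classical_eval(toy_candidates, [star_y, star_x])
```

The violation counts here are ordinary integers with no upper bound, and the comparison has to tell any count n apart from n+1. That is the unlimited counting that a finite-state device cannot do in general. Note also that reranking the same constraints changes the winner, and that each winner still violates the lower-ranked constraint.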
Nevertheless, finite-state tools can be very useful for developing OT
descriptions in the classical style. The script Finnish OT Prosody
implements Paul Kiparsky's OT account of basic Finnish prosody
("Finnish Noun Inflection". In Generative Approaches to Finnic and
Saami Linguistics, Diane Nelson & Satu Manninen (eds.), pp 109-161,
CSLI Publications, 2003). The script demonstrates how to build a Gen
function that maps an underlying Finnish word into a prosodic
description that represents syllabification, primary and secondary
stress, and metrical structure. For example, the input opiskelija 'student' generates
candidate outputs such as (ó.pis).(kè.li).ja,
where periods represent syllable boundaries, the acute accent marks
primary stress, the grave accent secondary stress, and the two trochaic
feet are enclosed in parentheses. This happens to be the best output
but there are 10450 other candidates to consider. For longer words, the
number of output candidates is counted in the millions.
The script also shows how to encode prosodic constraints
such as Clash, Align-Left, Lapse, Stress-to-Weight, and All-Feet-First as regular
expressions and how to evaluate them using an operation called lenient composition. Lenient
composition is described in Karttunen's 1998 paper and is
implemented in xfst, although
the operator is not mentioned in the Book. We explain the idea here.
Let us assume that R is a
mapping from the input form or forms to the current set of output
candidates and C
is the next constraint to be applied. The evaluation of the output
candidates with respect to C can be
encoded as the regular expression
R .O. C
where .O. is the lenient composition operator (N.B. capital O to distinguish
between lenient and ordinary composition). For each input form, if
there is at least one output candidate that meets the constraint C, all the
ones that violate the constraint are eliminated. Otherwise there is no
change in R
and the set of output candidates remains the same. Lenient composition
guarantees that each input form always has at least one output, no
matter how suboptimal.
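The behavior just described can be sketched in Python over finite sets. This is only an illustration: xfst computes lenient composition on transducers, whereas the sketch below represents R as a dictionary from each input to its set of candidates and C as a yes/no test. The names lenient_compose, apply_ranking, no_final_period and has_period, and the toy data, are all hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, Sequence, Set

# R: each input form mapped to its current set of output candidates.
Relation = Dict[str, Set[str]]
# C: returns True when a candidate satisfies the constraint.
Constraint = Callable[[str], bool]


def lenient_compose(relation: Relation, constraint: Constraint) -> Relation:
    """Finite-set analogue of R .O. C.

    For each input, keep only the candidates that satisfy the constraint,
    unless none of them do, in which case the candidate set is unchanged.
    """
    result: Relation = {}
    for input_form, outputs in relation.items():
        survivors = {o for o in outputs if constraint(o)}
        result[input_form] = survivors if survivors else set(outputs)
    return result


def apply_ranking(relation: Relation, ranking: Sequence[Constraint]) -> Relation:
    """Apply constraints one after another, highest-ranked first."""
    return reduce(lenient_compose, ranking, relation)


# Hypothetical toy data: abstract strings, not real Finnish candidates.
toy_r: Relation = {"ab": {"a.b", "ab", "a.b."}, "c": {"c."}}


def no_final_period(candidate: str) -> bool:
    return not candidate.endswith(".")


def has_period(candidate: str) -> bool:
    return "." in candidate


after_first = lenient_compose(toy_r, no_final_period)
after_both = apply_ranking(toy_r, [no_final_period, has_period])
```

For the input "ab", the first constraint removes the candidate ending in a period because other candidates satisfy it. For the input "c", the only candidate violates the constraint, so it survives. Every input therefore keeps at least one output at every step of the ranking.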
The example script produces a transducer that maps 25 Finnish words
into their prosodic representations, and vice versa. Being able to do
computations of this sort can be very useful for a theoretical
phonologist because it makes it possible to work with huge sets of
output candidates without overlooking anything and because it takes
away the drudgery of manual constraint checking. In fact, without such
finite-state tools OT is a very difficult art to practice. The
implementation of Kiparsky's system revealed a bug, not in the
implementation but in the constraint system itself. In some cases, the
desired winner loses to a candidate that should have been eliminated.
For example, for the input kalasteleminen,
the constraints choose a candidate other than the desired winner.
You may find it interesting to compare the OT implementation of
Finnish prosody with a non-OT
account of the same descriptive generalizations.
Finally, as to the charge of serialism, constraint ranking in OT is quite
similar to rule ordering in the older rewriting paradigm.
We recommend the list of Computational
OT Papers for further study.