Restriction Operator Error: Technical Note

The implementation of the restriction operator, =>,  in the Book Version of xfst is not correct for restrictions that involve alternative contexts. For example, the expression

a => b _ c , d _ e

describes a language in which any instance of a occurs either in the context b_c or in the context d_e. The section on Restriction says correctly on page 65 that the above expression is equivalent to

[ [a => b _ c ] | [ a => d _ e ] ]*

These two expressions are indeed equivalent in the Book version of xfst and the text suggests that the same would be true for more complex cases. But the equivalence fails in more complicated cases where the language being constrained partially overlaps with a context. For example,  in the Book version of xfst

(a) b (c) => x (a) _ , _ (c) y

is not equivalent to

[ [(a) b (c) => x (a) _ ]  |  [(a) b (c) =>  _ (c) y] ]*

due to an error in the compilation algorithm for the restriction operator. The mistake was pointed out by Anssi Yli-Jyrä of the University of Helsinki, who generously provided us his own algorithm for compiling restriction expressions with multiple contexts. Yli-Jyrä's elegant and correct compilation algorithm for => expressions is implemented in xfst-8.3.3 and subsequent versions. The last two expressions are in fact equivalent and now compile into identical networks. But this is a fortuitous coincidence. The equivalence is due to the fact that a, b, c, d, e, x, and y in these examples are single symbols. In the general case, where the symbols may denote arbitrary languages with longer strings, the corresponding pairs of expressions are not equivalent. Correct handling of restrictions with multiple contexts requires a more complicated algorithm that xfst uses internally. See "Compiling Contextual Restrictions on Strings into Finite-State Automata" by Anssi Yli-Jyrä and Kimmo Koskenniemi, in the Post-proceedings of Eindhoven Fastar Days, September 3-4, 2004 (forthcoming).

Last Modified: Thursday, 28-Oct-2004 18:01:51 PDT