Optional Replacement
The Beesley and Karttunen Book gives an incorrect explanation of the
semantics of the optional replace opearator (>). It
says, on p. 66,
[A (>) B]
denotes an optional replacement, that is, the union of [A > B]
with the identity relation on A.
That is incorrect, as you can see by comparing the networks compiled
with [a
(>) b] and [[a > b]  a]:
Regular expression

Network

[a
(>) b] 
fs0:
? > fs0, a > fs0, b > fs0, a:b > fs0. 
[[a
> b]  a] 
fs0:
? > fs1, a > fs2, b > fs1, a:b > fs1.
fs1: ? > fs1, b > fs1, a:b > fs1.
fs2: (no arcs) 
The correct statement is
[A (>) B]
denotes an optional replacement. It is equivalent to [[A > B]  ?*]*.
The right hand side of the union must be ?*, here the
universal identity relation that maps any string into itself.
Furthermore, we need the zeroplus (Kleenestar) operator around the
union to ensure that any instance of "a" in the upper side string
corresponds both to a "b" and to an "a". For example, the
input "aa" must produce four output strings:
xfst[0]: read regex
[[a > b]  ?*]*;
360 bytes. 1 state, 4 arcs, Circular.
xfst[1]:
apply
down aa
aa
ab
ba
bb
Without the outmost *
operator, we would only get two outputs, "aa" and "bb".
Thanks to Gary A. Coen for spotting the error!
Last Modified: Tuesday, 10Jul2007 18:39:49 PDT