Q: The 'define' operator behaves in some undocumented way that can make it annoyingly hard to find bugs. The reason, I think, is that the interactive xfst shell remembers the definitions even after 'clear' is used. Perhaps a command to clear the definitions should also be used?!
A: 'clear' is short for 'clear stack', which pops off and throws away any networks sitting on the stack.
When you use 'define', a network value is bound to a variable and entered in the symbol table, which is quite separate from the stack. Defined values are remembered until they are explicitly 'undefine'd. Try
xfst: help undefine
The help text tells you that you can type
xfst: undefine ALL
to undefine all the entries in the symbol table.
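The difference between the stack and the symbol table can be checked with a short session like the sketch below. It assumes a fresh xfst session and uses Foo purely as an illustrative variable name; 'print defined' lists the current entries in the symbol table.

```
xfst: define Foo x y z ;
xfst: clear stack
xfst: print defined
xfst: undefine Foo
xfst: print defined
```

The first 'print defined' should still list Foo, because 'clear stack' touches only the stack. After 'undefine Foo' the entry should be gone.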
Q: Consider the following real-life example where a user defines A and then defines B, but uses B in A (meaning, B is used before it is defined). So now when the user invokes xfst and 'source's the file containing the 'define' statements, A will not be properly defined and thus some things won't work. Now the user, instinctively, reloads the file, and suddenly things work. But wait -- before the second loading, the user might have changed B. Now the results are something really REALLY hard to debug, because the code you have does not and cannot explain the mess caused by the wrong order of definitions.
A: Oops. In xfst there is no possibility of forward reference: variables cannot be used before they are defined.
In contrast, in lexc, which is a kind of right-recursive phrase-structure grammar, each LEXICON is a production, and you can write and refer to LEXICONs in any order. xfst is quite different.
Take this example:
xfst: define A d o g B ;
xfst: define B f o o ;
In the first 'define' above, it is _not_ the case that "B is used before it is defined". Rather, what will happen in this case is that B is interpreted as a simple letter, an ordinary symbol like d, o and g.
Here's a similar sequence of 'define' statements:
xfst: define Foo x y z Fum ;
xfst: define Fum m o o ;
Here in the definition of Foo, it is not the case that "Fum is used before it is defined." Rather Fum will be treated as a multichar symbol. The second statement defines Fum as a variable.
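A minimal sketch of the working order, assuming a fresh session with an empty symbol table, defines Fum first and then checks what Foo actually contains:

```
xfst: define Fum m o o ;
xfst: define Foo x y z Fum ;
xfst: regex Foo ;
xfst: print words
xfst: print sigma
```

With this order, 'print words' should show a single word spelled x y z m o o, and 'print sigma' should list only single letters. If the two 'define' statements are swapped, Fum shows up in the sigma as a multichar symbol instead, which is a quick way to spot the problem.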
There are a couple of key concepts here:
xfst is an interactive "greedy interpreter". It compiles each regular expression immediately, using previously defined variables (if any) whose values are currently stored in the symbol table. It has no way of performing forward references to variables that are not yet defined.
In a significant design flaw, xfst regular expressions make no syntactic distinction between variable names and the names of multichar symbols. Thus if "Foo" is defined as a variable, then it will be treated as a variable in subsequently typed and interpreted regular expressions. But if "Foo" is not currently defined as a variable, then it will be parsed and treated as a multichar symbol.
In your example, typing the statements, and then reloading the script file, has the effect of first treating B as a normal alphabetic letter, and later treating B as a defined variable. To avoid this problem in stand-alone scripts, put
undefine ALL
at the beginning, so that every run starts from an empty symbol table.
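As a rough sketch, a stand-alone script for the A and B example might be laid out as follows. It resets both the stack and the symbol table first, then defines B before A. The layout is only an illustration, not a required form.

```
clear stack
undefine ALL
define B f o o ;
define A d o g B ;
```

Because the script resets the symbol table itself, 'source'ing it a second time should give the same result as the first time, whatever was defined earlier in the session.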
Q: Can't xfst issue a warning (after it makes a pass on the file to be 'source'ed) that there are definitions which are in a suspicious order?
A: xfst makes no such passes. It has no eye to the future. Rather it is a carefree, hedonistic interpreter that lives for the moment. It greedily interprets each regular expression as it is parsed, based on the state of the symbol table at that very moment.
If you want to write grammars with forward references and free orders, then you need to use lexc.
If you want to define regular expressions with self-reference, then see xfst and self-referential expressions.