Questions on Knight section 31

25 April 2006

These notes address some questions about the formulas in section 31 of Kevin Knight's tutorial. He begins the section as follows:

P(a | e, f)   =   P(a, f | e) / P(f | e)   =   P(a, f | e) / Σ_a P(a, f | e)

Consider the computation of the denominator above. Using Model 1, we can write it as:

Σ_a Π_j t(f_j | e_{a_j})

Knight is being imprecise here, and it's quite confusing. The latter expression is not really equal to the denominator above, because it leaves out the factors that account for the length of the French sentence and for the choice of alignment.

Here's a more precise account. Let's ignore the summation over a for the moment, and just focus on P(a, f | e). Since len(f) is a deterministic function of f, P(a, f | e) = P(a, f, len(f) | e). Now we repeatedly apply the chain rule for conditional probability:

P(a, f, len(f) | e)   =   P(len(f) | e)  ×  P(a | len(f), e)  ×  P(f | a, len(f), e)

So it's really the product of three probabilities. Knight left out the first two, which determine how long the French sentence will be and what the alignment will be. He was really focused on the third factor, and it is true that

P(f | a, len(f), e)   =   Π_j t(f_j | e_{a_j})
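To make the three-factor decomposition concrete, here is a minimal Python sketch. It rests on assumptions that are not in the text above: the translation table T is made up, EPSILON is an arbitrary stand-in for P(len(f) | e), and the alignment probability is taken to be uniform over all (l+1)^m alignments, as in the usual formulation of Model 1. The function names joint and posterior are hypothetical. Position 0 of the English sentence is the NULL word.

```python
from itertools import product
from math import prod, isclose

# Toy translation table t(f | e). All values are invented for illustration.
T = {
    ("la", "NULL"): 0.2, ("la", "the"): 0.6, ("la", "house"): 0.1,
    ("maison", "NULL"): 0.1, ("maison", "the"): 0.1, ("maison", "house"): 0.7,
}
EPSILON = 0.5  # assumed value of P(len(f) | e); any positive constant works


def joint(a, f, e, t=T, epsilon=EPSILON):
    """P(a, f | e) as the product of the three chain-rule factors."""
    p_len = epsilon                    # P(len(f) | e)
    p_align = 1.0 / len(e) ** len(f)   # P(a | len(f), e), assumed uniform
    p_words = prod(t[(fj, e[aj])] for fj, aj in zip(f, a))  # P(f | a, len(f), e)
    return p_len * p_align * p_words


def posterior(f, e):
    """P(a | e, f): the joint for each alignment, divided by the sum over all alignments."""
    alignments = list(product(range(len(e)), repeat=len(f)))
    joints = {a: joint(a, f, e) for a in alignments}
    z = sum(joints.values())           # P(f | e)
    return {a: p / z for a, p in joints.items()}


e = ["NULL", "the", "house"]           # e[0] is the NULL word, so len(e) = l + 1
f = ["la", "maison"]
post = posterior(f, e)
print(post[(1, 2)])                    # posterior of aligning la->the, maison->house
```

Under these assumptions the first two factors do not depend on a, so they cancel when the joint is divided by the sum over alignments. That is presumably why the shortened expression still yields the right value of P(a | e, f), even though it is not equal to P(f | e) itself.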

So Knight's slip is very confusing. I think he made it simply because he was focused on something else. The point he is trying to make in this section is quite different: that the sum and product in his expression can effectively be interchanged (read the section for details), saving a huge number of multiplications.
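The interchange can be checked numerically. The sketch below compares the brute-force sum over all alignments, Σ_a Π_j t(f_j | e_{a_j}), with the factored form Π_j Σ_i t(f_j | e_i). The translation table and the function names sum_over_alignments and product_of_sums are invented for illustration and do not come from Knight's tutorial.

```python
from itertools import product
from math import prod, isclose

# Toy translation table t(f | e). All values are invented for illustration.
T = {
    ("la", "NULL"): 0.2, ("la", "the"): 0.6, ("la", "house"): 0.1,
    ("maison", "NULL"): 0.1, ("maison", "the"): 0.1, ("maison", "house"): 0.7,
}


def sum_over_alignments(f, e, t):
    """Brute force: enumerate all (l+1)^m alignments and sum the products of t values."""
    return sum(
        prod(t[(fj, e[aj])] for fj, aj in zip(f, a))
        for a in product(range(len(e)), repeat=len(f))
    )


def product_of_sums(f, e, t):
    """Factored form: for each French word, sum t over all English positions, then multiply."""
    return prod(sum(t[(fj, ei)] for ei in e) for fj in f)


e = ["NULL", "the", "house"]   # e[0] is the NULL word
f = ["la", "maison"]
brute = sum_over_alignments(f, e, T)
fast = product_of_sums(f, e, T)
print(brute, fast, isclose(brute, fast))
```

The brute-force version visits (l+1)^m alignments and multiplies m terms for each, while the factored version needs only about m × (l+1) additions and m multiplications. For sentences of realistic length only the second is practical.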