Due: Friday, February 27 (by noon)
Submit assignments electronically to all three teachers
(ron.kaplan "at" microsoft.com, tracy.king "at" microsoft.com,
mforst "at" parc.com)
| Turn in: | 1. the final grammar you end up with. Please name your grammar with name-eng-week5.lfg |
| | 2. a report containing the answers to the questions of the exercises (mini-PCFG, estimated feature weights, evaluation results, ...) |
| Exercises on: | |
| PART 1: | C-structure pruning |
| PART 2: | Stochastic disambiguation |
| PART 3: | Evaluation |
Start from the grammar eng-week5.lfg and parse the sentences in eng-week5.txt.
If you put a file called xlerc in the directory with your grammar and in xlerc you put:
set normalize_chart_graphs 1
create-parser eng-week5.lfg
then whenever you start xle in that directory, it will automatically load the respective grammar. This will save a lot of time when making and testing changes, and you will not have to worry about problems with the normalization of packed parse representations in PART 3.
Parse sentences 1 through 3 in eng-week5.txt by entering commands like
parse-testfile eng-week5.txt 1
parse-testfile eng-week5.txt 2
...
These sentences all get several parses due to PP attachment ambiguities.
Sentences 4 through 6 in eng-week5.txt contain the same sentences, but brackets indicate the intended PP attachment in some (imaginary) context. Modify the grammar so that it can parse the sentences with the brackets; make sure that the brackets are exploited for the enforcement of a certain analysis. Hint: The most effective way to do this is to extend the METARULEMACRO. Note that there are already lexical entries for '[' and ']'.
Induce a mini-PCFG from the parses of these sentences by manually listing the context-free rules that give rise to the c-structures and determining their relative frequency. Ignore all local subtrees that exist only due to the brackets (LBR or RBR). Recall that the probabilities of the context-free rules are estimated as their relative frequencies in the parses of the "corpus".
What are the probabilities of the two readings of sentence 1 in eng-week5.txt according to your mini-PCFG? Which PP attachment (NP vs. VP) is more probable? Note: Email us if you are not sure about your solution of Exercise 1, so you can actually get the right parses for this exercise.
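The relative-frequency estimate and the probability of a reading can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rules in `toy_corpus` are hypothetical and are not the rules of eng-week5.lfg, and the helper names `estimate_pcfg` and `tree_probability` are made up for this sketch. Each rule probability is the count of the rule divided by the count of its left-hand side, and the probability of a c-structure is the product of the probabilities of the rules used in it.

```python
from collections import Counter
from fractions import Fraction


def estimate_pcfg(rule_occurrences):
    """Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(rule_occurrences)
    lhs_counts = Counter(lhs for lhs, _ in rule_occurrences)
    return {rule: Fraction(n, lhs_counts[rule[0]]) for rule, n in rule_counts.items()}


def tree_probability(tree_rules, pcfg):
    """Probability of one c-structure: product of the probabilities of its rules."""
    prob = Fraction(1)
    for rule in tree_rules:
        prob *= pcfg[rule]
    return prob


# Hypothetical toy "corpus" (NOT taken from eng-week5.lfg): one entry per rule
# occurrence read off the parse trees, written as (lhs, rhs-tuple).
toy_corpus = [
    ("VP", ("V", "NP")),
    ("VP", ("V", "NP")),
    ("VP", ("V", "NP", "PP")),
    ("NP", ("D", "N")),
    ("NP", ("D", "N")),
    ("NP", ("D", "N")),
    ("NP", ("NP", "PP")),
]

pcfg = estimate_pcfg(toy_corpus)

# Rules of one hypothetical reading in which the PP attaches inside the NP.
np_attachment = [
    ("VP", ("V", "NP")),
    ("NP", ("NP", "PP")),
    ("NP", ("D", "N")),
]
p_np_attachment = tree_probability(np_attachment, pcfg)

for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}: {p}")
print("P(reading) =", p_np_attachment)
```

Exact fractions are used instead of floats so that the figures can be compared directly with a hand calculation. For the exercise itself, the rule occurrences have to come from your own parses of the bracketed sentences.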
parse-testfile eng-week5.txt 7 -outputPrefix /tmp/
We will now extract feature forests from these (packed) parse representations. The learning features we will use are listed in eng-week5-features.txt. The feature forests can be obtained using commands like the following (in a shell, not in XLE):
print-feature-forest eng-week5-features.txt /tmp/S7.pl /tmp/ff7.txt
In the feature forests, the features and their values are recorded as <feature ID>:<value>. If there are multiple parses, the features will be in a packed format; the labelling of the packing will be somewhat arbitrary, but you should be able to count the number of times each feature occurs in each parse. Explain which feature values are extracted for which readings of the sentences. You are welcome to consult the XLE documentation for the exact definitions of the feature templates cs_conj_nonpar and cs_adjacent_label.
triples match --matchMode best --sourceMode fs_file --sourceFile /tmp/fs13.pl --targetMode fs_file --targetFile eng-fs13-week5.pl
Since the option --matchMode is set to 'best', the result indicates the quality of the reading that comes closest to the intended analysis.
Also try evaluating the 'average' quality of all the readings that the grammar produces for sentence 13 by entering the following command:
triples match --matchMode average --sourceMode fs_file --sourceFile /tmp/fs13.pl --targetMode fs_file --targetFile eng-fs13-week5.pl
Report the evaluation figures you obtain (simply by copy-and-paste). What would the 'best' precision, recall, and F-score figures be if the dependencies 'FIRST' and 'TOKEN' were ignored?
For your information: If S is the set of dependency triples in the (best) solution of the system and G is the set of dependency triples in the gold standard, the precision P is calculated as |S ∩ G|/|S| and the recall R is calculated as |S ∩ G|/|G|. The F-score is the harmonic mean of precision and recall, i.e. F = (2 * P * R)/(P + R).
You can find more details on the F-score on the Wikipedia F-score page and in the pages it links to, which discuss how precision and recall are calculated.
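The formulas for P, R, and F can be checked on a small example with the Python sketch below. It is not a reimplementation of `triples match`: the function name `evaluate` and the (relation, head, dependent) triples are hypothetical and chosen only to make the arithmetic visible. The sets S and G are modelled as Python sets, so |S ∩ G| is simply the size of their intersection.

```python
def evaluate(system, gold):
    """Return (precision, recall, f_score) for two collections of dependency triples."""
    system, gold = set(system), set(gold)
    matched = len(system & gold)                         # |S ∩ G|
    precision = matched / len(system) if system else 0.0  # |S ∩ G| / |S|
    recall = matched / len(gold) if gold else 0.0          # |S ∩ G| / |G|
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical triples (relation, head, dependent); NOT output of the grammar.
system_triples = {
    ("SUBJ", "see", "girl"),
    ("OBJ", "see", "monkey"),
    ("ADJUNCT", "see", "with"),
}
gold_triples = {
    ("SUBJ", "see", "girl"),
    ("OBJ", "see", "monkey"),
    ("ADJUNCT", "monkey", "with"),
    ("OBJ", "with", "telescope"),
}

p, r, f = evaluate(system_triples, gold_triples)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")  # prints "P=0.667 R=0.500 F=0.571"
```

Here two of the three system triples are also in the gold standard, which has four triples, so P = 2/3, R = 2/4, and F = 4/7. To see the effect of ignoring certain dependencies, one would filter the corresponding triples out of both sets before calling the function.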
If you have any questions, you can send us email (ron.kaplan "at" microsoft.com, tracy.king "at" microsoft.com, mforst "at" parc.com), call us (Ron: 650-245-6865, Tracy: 415-848-7276, Martin: 650-812-4788), or set up office hours with us.