Data set

The following role-labeled examples all involve the verb break as the target, but the argument structure is very different in each case. We'll make use of these examples throughout this assignment.

All the examples except (9) and (21) are from FrameNet (which happens not to have sentences of this form yet).

(The heading links go to the source FrameNet pages, which contain many more examples. You can restrict attention to the subset given here, though.)


  1. [Agent I] BROKE [Whole_patient the seal] [Instrument with my blunt thumb].
  2. She had laid the hammer there, after [Agent she] had tried to BREAK [Whole_patient the tower window].
  3. For many years there were rumours of [Agent Egyptian vultures] BREAKING [Result open] [Whole_patient ostrich eggs] [Means by throwing stones at them at close range].
  4. [Agent I] BROKE [Whole_patient the car windows] [Instrument with a stone] [Purpose to try to reach Mr Pickering].
  5. [Agent They] BROKE [Whole_patient a chair] [Resistant_surface over me].
  6. [Whole_patient The bone] is also [Manner more extensively] BROKEN [Agent by the diurnal species] [Time during feeding], and it is less broken by owls.
  7. [Whole_patient Ground] to be BROKEN [Place at Biodefense Center]
  8. [Whole_patient The seal] broke.


  1. Virtually [Protagonist all the largest American corporations] BROKE [Norm the law] [Manner in one way or another]; ...
  2. [Time Throughout her career] [Protagonist she] successfully BROKE [Norm the rules of haute couture] until her name became synonymous with French chic.
  3. This raises the question of who is responsible if [Protagonist the individual trader] BREAKS [Norm the rules].
  4. If [Norm any of these conditions] is BROKEN, then so is the contract between you and the supplier.
  5. Though there is clearly a potential conflict of interest, it is far from clear that [Norm the law against insider trading] would be BROKEN [Protagonist by a firm that took advantage of it].
  6. "But if [Protagonist he] wishes to BREAK [Norm the contract], then the contract will be broken."


  1. [Experiencer Josef Jakobs] landed in a potato field in North Stifford, Essex, falling heavily and BREAKING [Body_part his ankle].
  2. [Experiencer Amelie] fell and BROKE [Body_part her hip] [Time just two days before they were due to sail on the liner for France].
  3. [Experiencer Clary] tripped over a cable during filming and BROKE [Body_part his foot].
  4. And [Time last month] [Experiencer an 18-year-old student at Lady Margaret Hall college] BROKE [Body_part a leg] and injured her spine [Containing_event when she fell out of a window after a pub crawl].
  5. Their trial was slated for 23 November but it was postponed until 10 January 1949 because [Experiencer Geisler] BROKE [Body_part three ribs].
  6. [Body_part His foot] broke [Containing_event when he leapt off the roof].
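
The bracket notation used in the examples above can be read mechanically, which may be convenient when working through the questions. The following is a minimal sketch, not part of the assignment: the function names `parse_roles` and `find_target` are hypothetical, and the code assumes that labeled spans are never nested and that the first form of "break" in a sentence is the target (both hold for the examples given here).

```python
import re

# One labeled span: "[Role some text]". Assumes spans are not nested.
SPAN = re.compile(r"\[(\w+) ([^\[\]]+)\]")

# Inflected forms of the target verb, matched case-insensitively because
# some examples write the target in lowercase ("The seal broke.").
TARGET = re.compile(r"\b(breaking|breaks|break|broken|broke)\b", re.IGNORECASE)


def parse_roles(sentence):
    """Return (role, text) pairs for each bracketed span, in surface order."""
    return [(m.group(1), m.group(2)) for m in SPAN.finditer(sentence)]


def find_target(sentence):
    """Return the first form of 'break' in the sentence, or None."""
    plain = SPAN.sub(lambda m: m.group(2), sentence)  # strip role brackets
    m = TARGET.search(plain)
    return m.group(1) if m else None


example = "[Agent I] BROKE [Whole_patient the seal] [Instrument with my blunt thumb]."
print(parse_roles(example))
print(find_target(example))
```

Running this on the sentence shown yields three (role, text) pairs in surface order, plus the target form `BROKE`.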

Question 1 (2 points)

Provide the whole label sequence feature templates (in the terms of Toutanova et al. 2008) for examples (2) and (9). Use the version of those feature templates that includes lemma information. Your roles should be the FrameNet ones from the examples, not the PropBank ones that Toutanova et al. concentrate on.
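
As a rough illustration of the general shape of such a feature, here is a sketch for a different sentence from the data set, "I BROKE the seal with my blunt thumb". It assumes the template lists voice, then the predicate lemma, then the role labels in surface order with a `PRED` marker at the target's position. The function name `whole_label_sequence` and the exact string format are hypothetical, so check the details against the paper before relying on them.

```python
def whole_label_sequence(voice, lemma, left_roles, right_roles):
    """Build a label-sequence feature as a list of strings.

    left_roles / right_roles are the role labels that appear before and
    after the target predicate, in surface order. The layout here is an
    assumption made for illustration, not a definitive format.
    """
    return [f"voice:{voice}", f"lemma:{lemma}", *left_roles, "PRED", *right_roles]


feature = whole_label_sequence(
    voice="active",
    lemma="break",
    left_roles=["Agent"],
    right_roles=["Whole_patient", "Instrument"],
)
print(feature)
# ['voice:active', 'lemma:break', 'Agent', 'PRED', 'Whole_patient', 'Instrument']
```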

Question 2 (2 points)

Building on Gildea and Jurafsky's (2002) results, Toutanova et al. employ a feature VOICE. Use the above data set to motivate the value of such a feature, by explaining (in a few sentences) how it could help improve labeling predictions here.

Question 3 (4 points)

Consider the task of trying to predict the distribution of the Agent, Protagonist, and Norm roles in the above data set.

  1. Describe one potentially useful feature that would require an appeal to an outside lexical resource (e.g., a database, a lexicon). Say what that resource is, and describe the feature.
  2. Describe one potentially useful feature that would require imposing additional structure on the data. Say what that additional structure is, and describe the feature.

You should feel free to adopt ideas from the readings (or anywhere else). If you do this, then include the citations.

Question 4 (2 points)

In their section 4.1, Toutanova et al. highlight the special problems posed by verbs like "expect", "try", and "want", whose grammatical subjects are also semantic arguments of the embedded infinitival predicates. Informally, it is as though "Sam tried to win" is semantically "Sam tried Sam win".

Your task: Parse example (2) above using the Stanford parser demo and then explain how Stanford collapsed dependency representations could be used to address the displacement problem. (Your answer should be about three sentences long; you can write more, but three sentences should suffice.)