Question 1

The following table depicts (invented) results from an annotation project. The annotators labeled 10 examples (the rows). There were five annotators (the columns): A, B, C, D, and E. The possible labels for each example were +1 (positive), 0 (neutral), and -1 (negative). The data are given below, one example per row: the example id followed by the labels from annotators A, B, C, D, and E, in that order.

ex1 0 0 +1 0 0
ex2 -1 +1 -1 +1 -1
ex3 +1 -1 -1 +1 +1
ex4 0 +1 0 +1 +1
ex5 +1 0 +1 +1 -1
ex6 -1 -1 -1 -1 -1
ex7 -1 0 -1 -1 -1
ex8 -1 0 -1 -1 -1
ex9 +1 +1 +1 +1 +1
ex10 -1 +1 -1 -1 +1

Part A [2.5 points]

Some examples are simply more ambiguous than others. Hsueh et al. (2009) propose a method for identifying such cases using the annotation distributions. For this problem, we'll use a modification of their method that seems appropriate for our simpler annotation setting: the entropy of the response distributions. To use this measure for an example ex\(_{i}\), turn its annotation vector into a probability distribution \(P_{i}\) over the category set \(\{-1, 0, +1\}\) and calculate

\[ H(P_{i}) = -\left(\sum_{x \in \{-1, 0, +1\}} P_{i}(x)\log_{2}P_{i}(x)\right) \]

Your task: use this measure to find the most ambiguous example in the data set. Give this example (by its id ex\(_{i}\)) and its associated entropy value.
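
As a rough illustration of the entropy measure, here is a minimal Python sketch. It assumes the usual convention that a label with probability zero contributes nothing to the sum, since \(\log_{2} 0\) is undefined. The function name `response_entropy` and the annotation vector passed to it are hypothetical; the vector is not taken from the table above.

```python
import math
from collections import Counter

LABELS = (-1, 0, 1)  # the category set {-1, 0, +1}


def response_entropy(annotations, labels=LABELS):
    """Entropy (in bits) of the label distribution for one example."""
    counts = Counter(annotations)
    total = len(annotations)
    entropy = 0.0
    for label in labels:
        p = counts[label] / total  # relative frequency of this label
        if p > 0:  # assumed convention: zero-probability labels add 0
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical annotation vector from five annotators (not from the data set).
toy_example = [1, 1, 0, 0, -1]
print(round(response_entropy(toy_example), 4))
```

A unanimous vector such as `[1, 1, 1, 1, 1]` gives an entropy of 0, and the value grows as the labels spread more evenly across the three categories.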

Part B [2.5 points]

There is a troublemaker in our midst: one of our annotators seems to be significantly less reliable than the others. Find the annotator whose column vector of annotations has the largest mean Euclidean distance from the column vectors of all the other annotators (excluding self-distances). Provide the name of this annotator and their associated mean Euclidean distance from the others.

(Note: identifying the majority annotation for each example and checking differences from it is a good way to home in on our miscreant/oddball.)

(Note: it is worth considering other methods for making these comparisons, including correlation and measures of inter-annotator agreement. However, we're asking only that you use Euclidean distance.)
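
The mean-distance comparison described in Part B can be sketched as follows. This is only one way to organize the computation. The helper names `euclidean` and `mean_distance_to_others` are hypothetical, and so are the three toy annotators X, Y, and Z, which are not the annotators in the table.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length annotation vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_distance_to_others(columns):
    """Map each annotator to their mean distance from every other annotator."""
    means = {}
    for name, vec in columns.items():
        # Skip the annotator's own column so self-distances are excluded.
        dists = [
            euclidean(vec, other)
            for other_name, other in columns.items()
            if other_name != name
        ]
        means[name] = sum(dists) / len(dists)
    return means


# Hypothetical toy data: three annotators, four examples each.
toy_columns = {
    "X": [1, 0, -1, 1],
    "Y": [1, 0, -1, 1],
    "Z": [-1, 0, 1, -1],
}
toy_means = mean_distance_to_others(toy_columns)
print(max(toy_means, key=toy_means.get))  # annotator with the largest mean distance
```

In this toy setup X and Y agree on every example, so Z, who disagrees with both on three of the four examples, has the largest mean distance.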

Question 2

Domingos (2012) offers the memorable line “strong false assumptions can be better than weak true ones”.

Part A [2.5 points]

What problem is Domingos addressing when he says this? Give (i) the name of the problem and (ii) a brief, informal definition of it.

Part B [2.5 points]

Describe (1-2 sentences) two methods for addressing the problem. For each, give one reason why it is not a complete solution.

Note: In both Part A and Part B, we're looking for a few concise sentences in response to each question. No need to write a treatise.