

Stat 30: March 1st

Independence: some information is better than none.

We have seen that for two continuous variables such as the $lsat$ and $gpa$ scores we have a notion of correlation or dependence: knowing one tells us something about the other. Here we are going to look at another type of information about people: categorical variables. These are not measurements that can take on any value, but variables with only a few possible values.

Our first example will be colorblindness, which we study in conjunction with gender. But to begin with we need a new concept.

Conditional Probability

Suppose that you have to guess the suit of a playing card drawn at random from a pack of $52$; your probability of guessing correctly is $1/4$. Now suppose I tell you that the card is red. You then have a $1/2$ chance, because there are fewer choices: you can use the additional information to restrict the space of possible outcomes.

We write the probability that $A$ occurs given that $B$ has happened as the probability of $A$ given $B$: $P(A\vert B)$.

Definition of the conditional probability of $A$ given $B$:

\begin{displaymath}P(A\vert B)=\frac{P(A \;and\; B)}{P(B)}\end{displaymath}
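To make the card example concrete, here is a short Python sketch (not part of the original notes) that computes $P(heart)$ and $P(heart\vert red)$ by enumerating the deck:

```python
from fractions import Fraction

# Enumerate a 52-card deck as (rank, suit) pairs.
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(r, s) for r in range(1, 14) for s in suits]

def prob(event, space):
    """P(event) as the fraction of equally likely outcomes in `space`."""
    return Fraction(sum(1 for x in space if event(x)), len(space))

is_heart = lambda card: card[1] == "hearts"
is_red = lambda card: card[1] in ("hearts", "diamonds")

p_heart = prob(is_heart, deck)                       # 13/52 = 1/4
# P(heart | red) = P(heart and red) / P(red) = (1/4) / (1/2) = 1/2
p_heart_given_red = (prob(lambda c: is_heart(c) and is_red(c), deck)
                     / prob(is_red, deck))
```

Conditioning on "red" halves the outcome space, which is why the probability doubles.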

Intuitive definition of independence:
Knowing that another event $B$ occurs does not affect the probability of the event $A$ occurring; this means that $A$ and $B$ are independent.

As a formula: $P(A\;\; given\;\;B)=P(A\vert B)=P(A)$


Rearranging the definition of conditional probability gives the multiplication rule:

\begin{displaymath}P(A\; and\; B)= P(A) \times P(B\; given\; A)
\end{displaymath}

The only inequality that is always true is

\begin{displaymath}P(A \; and\; B) \leq P(A)\end{displaymath}

Example showing that conditional probabilities can be either bigger or smaller than the unconditional ones:

Colorblindness

Take the test: http://www.umist.ac.uk/UMIST_OVS/UES/COLOUR0.HTM

Classical one

Types and description and statistics

Study the world of the colorblind

We rounded the numbers for a large group of people tested in the US (results vary from region to region of the world; there is an island in Micronesia where most people are colorblind).

In considering colorblindness, suppose we take the binary random variables associated to colorblindness and gender (say 0 if male, 1 if female); these are called indicator variables. We can tabulate the counts for all 4 possible pairs of outcomes as:


\begin{displaymath}
\begin{array}{\vert ll\vert ll\vert l\vert}
\hline
&& Male & Female & Total\\
\hline
Colorblind && 16 & 2 & 18\\
Not\; colorblind && 240 & 254 & 494\\
\hline
Total && 256 & 256 & 512\\
\hline
\end{array}\end{displaymath}


\begin{displaymath}
\begin{array}{ll\vert ll\vert l}
&& Male & Female & Total\\
\hline
Colorblind && \frac{16}{512} & \frac{2}{512} & \frac{18}{512}\\
Not\; colorblind && \frac{240}{512} & \frac{254}{512} & \frac{494}{512}\\
\hline
Total &&\frac{1}{2}&\frac{1}{2}&1\\
\hline
\end{array}\end{displaymath}

So that from this table of joint distribution we read:

\begin{eqnarray*}
P(colorblind)&=&P(C)=\frac{18}{512}\\
P(man)&=&P(M)=\frac{1}{2}\\
P(C\vert M)&=&\frac{P(C\; and\; M)}{P(M)}=\frac{\frac{16}{512}}{\frac{1}{2}}=\frac{16}{256}=\frac{1}{16}\\
P(C\vert W)&=&\frac{P(C\; and\; W)}{P(W)}=\frac{\frac{2}{512}}{\frac{1}{2}}=\frac{2}{256}=\frac{1}{128}\\
\end{eqnarray*}



We see that for women $P(C\vert W)\leq P(C)$, and for men $P(C\vert M)\geq P(C)$.
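These conditional probabilities can be checked with a short Python sketch (not from the notes; the male colorblind count $16 = 18 - 2$ follows from the table's margins):

```python
from fractions import Fraction

# Joint counts from the rounded colorblindness table:
# (status, sex) with status C = colorblind, N = not; sex M = male, W = woman.
counts = {("C", "M"): 16, ("C", "W"): 2,
          ("N", "M"): 240, ("N", "W"): 254}
total = sum(counts.values())  # 512

def p(event):
    """P(event) from the joint counts; `event` filters (status, sex) cells."""
    return Fraction(sum(n for cell, n in counts.items() if event(cell)), total)

p_C = p(lambda cell: cell[0] == "C")  # 18/512
p_C_given_M = p(lambda cell: cell == ("C", "M")) / p(lambda cell: cell[1] == "M")
p_C_given_W = p(lambda cell: cell == ("C", "W")) / p(lambda cell: cell[1] == "W")

# Conditioning can raise or lower a probability:
# P(C|W) <= P(C) <= P(C|M).
```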

When $A$ and $B$ are not independent, sometimes we have $P(A \; given\; B) \leq P(A)$, and sometimes we have $P(A \; given\; B) \geq P(A)$.

When two events are independent the probability of them both occurring is just the product of their probabilities.

The probability of throwing a double three with two dice is the probability of throwing three with the first die and three with the second die. There are six equally likely outcomes for the first die and six for the second, so each of the $36$ pairs has probability $(1/6) \times (1/6) = 1/36$.

The two events are independent, since whatever happens to the first die cannot affect the throw of the second; the probabilities are therefore multiplied, giving $1/36$.
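The multiplication can be verified by enumerating all $36$ outcomes; here is a small Python check (not part of the original notes):

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes for a throw of two dice.
outcomes = list(product(range(1, 7), repeat=2))

p_double_three = Fraction(sum(1 for a, b in outcomes if a == 3 and b == 3),
                          len(outcomes))
p_first_three = Fraction(sum(1 for a, b in outcomes if a == 3), len(outcomes))
p_second_three = Fraction(sum(1 for a, b in outcomes if b == 3), len(outcomes))

# Independence: the joint probability equals the product of the marginals.
independent = (p_double_three == p_first_three * p_second_three)
```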

Definition:Two events $E$ and $F$ are said to be independent if

\begin{displaymath}P(E\vert F)=P(E)=P(E\vert F^c),\qquad P(F)>0\end{displaymath}

Examples:
We draw two cards one at a time from a shuffled deck of $52$ cards.

In class we looked at the following two events:

A
The first card is the 7 $\clubsuit$.

B
The second card is the queen $\heartsuit$.

We saw that

\begin{displaymath}P(B\vert A) \neq P(B)\end{displaymath}

These events are not independent.
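The in-class computation can be reproduced by enumerating ordered pairs of cards; this Python sketch (not from the notes) shows $P(B) = 1/52$ but $P(B\vert A) = 1/51$:

```python
from fractions import Fraction

# 52-card deck as (rank, suit); suit codes H, D, C, S. Queen = rank 12.
deck = [(r, s) for r in range(1, 14) for s in "HDCS"]
pairs = [(c1, c2) for c1 in deck for c2 in deck if c1 != c2]  # ordered draws

seven_clubs, queen_hearts = (7, "C"), (12, "H")

# P(B): second card is the queen of hearts, over all 52*51 ordered pairs.
p_B = Fraction(sum(1 for c1, c2 in pairs if c2 == queen_hearts), len(pairs))

# P(B | A): restrict to pairs whose first card is the 7 of clubs.
pairs_A = [(c1, c2) for c1, c2 in pairs if c1 == seven_clubs]
p_B_given_A = Fraction(sum(1 for c1, c2 in pairs_A if c2 == queen_hearts),
                       len(pairs_A))
```

Conditioning on $A$ removes one non-queen card from the remaining deck, which is why the probability rises slightly.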

It is usually quite useful when two variables are not independent, because then we can predict one from the other.

Bayes Rule and False Positives

Knowing there is dependence gives us an edge in computing a lot of probabilities.

Bayes' rule, which can also be represented as a probability tree:


\begin{displaymath}
P(B\vert A)=\frac{P(A\vert B)P(B)}{P(A)}
\end{displaymath}

False positives:
Suppose the probability of being ill (I) with some deadly but rare disease is $10^{-4}$.

I is a set of people ill with the disease.

(+) the set of people who test positive for the disease.

There is a test for this disease which has no false negatives: if you have the disease, you will test positive $(P(+\vert I) = 1)$. However, there are occasionally false positives; 1 person in 1000 who doesn't have the disease (is healthy, H) will test positive anyway $(P(+\vert not I) = 10^{-3})$.

We want to know the probability that someone who has a positive test is actually ill.

Let's replace B in Bayes' rule with ``is ill'' (I) and A with ``tests positive'' (+). Then

\begin{displaymath}P(I\vert+) = \frac{P(+\vert I)P(I)}{P(+)}\end{displaymath}

We know $P(+\vert I)=1$ and $P(I)= 10^{-4}$, but we have to figure out $P(+)$, the overall probability of testing positive. This is

\begin{displaymath}P(+) = P(ill \cap +) + P(not\; ill \cap +),\end{displaymath}

according to the rule that if A and B are mutually exclusive (you can't be both ill and not ill), then $P(A \cup B) = P(A)+P(B)$. We can then say

\begin{displaymath}P(+) = P(I) P(+\vert I) + (1-P(I)) P(+\vert not I)\end{displaymath}

by the rule that $P(A \cap B) = P(A) P(B\vert A)$. Putting it all together,

\begin{eqnarray*}
P(I\vert+)&=&\frac{P(+\vert I)P(I)}{P(+\vert I)P(I)+(1-P(I))P(+\vert not\; I)}\\
&\approx& \frac{10^{-4}}{10^{-3}}=\frac{1}{10}
\end{eqnarray*}
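The exact value can be computed directly; this Python sketch (not in the original notes) uses the numbers above and gives roughly $0.091$, close to the $1/10$ approximation:

```python
# False-positive calculation with the numbers from the notes.
p_ill = 1e-4            # P(I): prevalence of the disease
p_pos_given_ill = 1.0   # P(+|I): no false negatives
p_pos_given_healthy = 1e-3  # P(+|not I): false positive rate

# Total probability of testing positive.
p_pos = p_pos_given_ill * p_ill + (1 - p_ill) * p_pos_given_healthy

# Bayes' rule: probability of actually being ill given a positive test.
p_ill_given_pos = p_pos_given_ill * p_ill / p_pos
```

Even with a perfect detection rate, a rare disease means most positives are false ones.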



Venn Diagrams

Truth/Error Table

                                True state
                       Ill (Disease)            Not Ill
Prediction   (+)       True Positive            Error = False Positive
             (-)       Error = False Negative   True Negative

Real Data

In the case of detecting HIV in blood to be used for transfusion, the tests need to be very ``conservative'' in the sense of having no false negatives; the false positive rate can then be quite high. You have the abstract of a JAMA paper where the computed false positive rate is around $4.8\%$ for a certain test called the Western blot. http://www.ama-assn.org/special/hiv/library/scan/sep98/sep98a.htm

Testing for Independence

Eye color/ Hair Color


Table 1: Hair color and eye color

                          Eyes
Hair          Brown   Blue   Hazel   Green
Black           68     20      15       5
Brunette       119     84      54      29
Red             26     17      14      14
Blonde           7     94      10      16


The simple $\chi^2$ test for independence of this table computes

\begin{displaymath}{\cal X}^2 = 138.3, \qquad df = 9\end{displaymath}

This is a very extreme point in the $\chi^2$ distribution.

The probability of seeing such a high value for the $\chi^2$ statistic is tiny (much smaller than $10^{-4}$), so we can safely reject the null hypothesis of independence between hair color and eye color.
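The test statistic can be recomputed from the table by hand; here is a pure-Python sketch (not from the notes) that forms the expected counts under independence, $e_{ij} = r_i c_j / n$, and sums $(o_{ij}-e_{ij})^2/e_{ij}$:

```python
# Observed counts: rows are hair colors (Black, Brunette, Red, Blonde),
# columns are eye colors (Brown, Blue, Hazel, Green), as in Table 1.
table = [[68, 20, 15, 5],
         [119, 84, 54, 29],
         [26, 17, 14, 14],
         [7, 94, 10, 16]]

row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
total = sum(row_tot)  # 592 people

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected = row total * column total / grand total.
chi2 = sum((obs - row_tot[i] * col_tot[j] / total) ** 2
           / (row_tot[i] * col_tot[j] / total)
           for i, row in enumerate(table) for j, obs in enumerate(row))

df = (len(table) - 1) * (len(table[0]) - 1)  # (4-1)*(4-1) = 9
```

This reproduces the $\chi^2 = 138.3$ with $9$ degrees of freedom quoted above.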

Macabre associations


+-------------------------------------------------------------------+
|                                                                   |
|  Sex  Age    POISON     GAS    HANG   DROWN     GUN    JUMP       |
|                                                                   |
|   M  10-20     1160     335    1524      67     512     189       |
|   M  25-35     2823     883    2751     213     852     366       |
|   M  40-50     2465     625    3936     247     875     244       |
|   M  55-65     1531     201    3581     207     477     273       |
|   M  70-90      938      45    2948     212     229     268       |
|                                                                   |
|   F  10-20      921      40     212      30      25     131       |
|   F  25-35     1672     113     575     139      64     276       |
|   F  40-50     2224      91    1481     354      52     327       |
|   F  55-65     2283      45    2014     679      29     388       |
|   F  70-90     1548      29    1355     501       3     383       |
|                                                                   |
+-------------------------------------------------------------------+

Other uses of prior information

The famous Monty Hall problem:

Monty Hall
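A quick way to convince yourself of the answer is simulation. In this Python sketch (not part of the notes), the player always first picks door 0; switching wins about $2/3$ of the time and staying about $1/3$:

```python
import random

def monty_trial(switch, rng):
    """One round: prize behind a random door, player picks door 0,
    host opens a door that hides no prize and was not picked."""
    prize = rng.randrange(3)
    pick = 0
    opened = next(d for d in range(3) if d != pick and d != prize)
    if switch:
        # Move to the one remaining closed door.
        pick = next(d for d in range(3) if d != pick and d != opened)
    return pick == prize

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000
wins_switch = sum(monty_trial(True, rng) for _ in range(n)) / n
wins_stay = sum(monty_trial(False, rng) for _ in range(n)) / n
```

Switching wins exactly when the first pick was wrong, which happens with probability $2/3$.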

More explorations

Definitions

Lists of examples

Monty Hall

Bergeron's Monty Hall

Explanation for Monty Hall



Susan Holmes
2001-03-01