Lecture 6: Conditional Independence and Random Variables

Jan 25th, 2021

Lecture Materials

Learning Goals

By the end of lecture, you should understand how events can be conditionally independent given other events, and you should be familiar with the notion of a random variable, how to express its associated probability mass function, and how to compute its expectation. You should also know the name of Chris's baby girl!!!

Reading

Random Variables, Probability Mass Functions

Concept Check

https://www.gradescope.com/courses/226051/assignments/964194

Questions & Answers

Q: wait, isn’t the chain rule P(B | A) * P(A)?

A1: fixed

A2: Yup

Q: In this case, aren’t A and B conditionally independent given E, since P(A|E) = P(AB|E) = 0?

A1: We would expect P(AB|E) to decompose into P(A|E)P(B|E) if A and B are conditionally independent given E
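
A short check of that factorization for the case raised in the question, taking P(A|E) = 0 as the question states it (the slide itself is not reproduced here, so this is a sketch under that assumption). Since AB is a subset of A, its conditional probability can be no larger than that of A:

```latex
\begin{align*}
0 \le P(AB \mid E) &\le P(A \mid E) = 0 \quad\Rightarrow\quad P(AB \mid E) = 0, \\
P(A \mid E)\,P(B \mid E) &= 0 \cdot P(B \mid E) = 0 = P(AB \mid E).
\end{align*}
```

Under that assumption the product form holds, so A and B would be conditionally independent given E.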

Q: can two events be dependent normally but independent conditionally?

A1: Yes
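
One way to see this is a small hypothetical example (not from the lecture): pick a fair coin or a biased coin with equal probability, then flip the chosen coin twice. The coin biases and the names joint, prob, and cond are illustrative assumptions. A is "first flip is heads", B is "second flip is heads", and E is "the fair coin was picked".

```python
from fractions import Fraction
from itertools import product

# Assumed setup: each coin is chosen with probability 1/2.
P_HEADS = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

# Joint distribution over outcomes (coin, flip1, flip2).
joint = {}
for coin, f1, f2 in product(P_HEADS, "HT", "HT"):
    p = Fraction(1, 2)  # probability of choosing this coin
    for flip in (f1, f2):
        p *= P_HEADS[coin] if flip == "H" else 1 - P_HEADS[coin]
    joint[(coin, f1, f2)] = p


def prob(event):
    """P(event), where event is a predicate on outcomes."""
    return sum(p for outcome, p in joint.items() if event(outcome))


def cond(event, given):
    """P(event | given) from the definition of conditional probability."""
    return prob(lambda o: event(o) and given(o)) / prob(given)


A = lambda o: o[1] == "H"     # first flip is heads
B = lambda o: o[2] == "H"     # second flip is heads
E = lambda o: o[0] == "fair"  # the fair coin was picked
AB = lambda o: A(o) and B(o)

marginally_independent = prob(AB) == prob(A) * prob(B)
conditionally_independent = cond(AB, E) == cond(A, E) * cond(B, E)
print(marginally_independent, conditionally_independent)
```

Without knowing which coin was picked, seeing heads on the first flip makes the biased coin more likely, which changes the probability of heads on the second flip. Once the coin is known, the flips no longer inform each other.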

Q: If we know we have to find P(B|A), how do we decide between using Bayes' theorem (with a normalizing constant) vs. the definition of conditional probability?

A1: It sounds like you are asking whether we should decompose that term into P(A|B)P(B)/P(A) or whether we should use P(B,A)/P(A). The answer is "what are the givens of your problem?" If P(A|B)P(B) shows up as a term, you should use the former.
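
For reference, the two forms are the same quantity written with different givens; the expansion of the denominator below is the standard law of total probability, written with B^C for the complement of B:

```latex
\begin{align*}
P(B \mid A) &= \frac{P(A, B)}{P(A)}
  && \text{(definition of conditional probability)} \\
            &= \frac{P(A \mid B)\,P(B)}{P(A)}
  && \text{(chain rule in the numerator: Bayes' theorem)} \\
P(A) &= P(A \mid B)\,P(B) + P(A \mid B^{C})\,P(B^{C})
  && \text{(normalizing constant)}
\end{align*}
```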

Q: how did he get 8 different ways?

A1: Was answering another question and missed this - are you referring to the fact there are 3 events under consideration, E1,E2,E3? It probably came from 2^3. Think about bits - we are considering whether or not to do each event.

Q: does K represent the category of likes international emotional comedies?

A1: Yes, slide 10: http://web.stanford.edu/class/cs109/lectures/6-RandomVariables/6-RandomVariables.pdf

Q: where is the 2^3 coming from? watch or didn't watch and then three events?

A1: You can think of whether E1 occurs or doesn't occur as a "bit", then counting the configurations becomes analogous to counting bit strings. Alternatively, if we are thinking about putting 3 distinct items into 2 buckets, the buckets are "happened" and "didn't happen", and you count the ways you can put each of the events E1,E2,E3 into those buckets.

A2: Yes, if we have three "bits" and each can be T or F (2 options), then there are 2^3 configurations. You could also derive this from putting 3 distinct items into 2 buckets
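
The counting argument above can be sketched by listing every configuration directly. The variable names here are illustrative only.

```python
from itertools import product

events = ["E1", "E2", "E3"]

# Each event either happened (True) or didn't (False): 2 options per event.
configurations = list(product([True, False], repeat=len(events)))

for config in configurations:
    print(dict(zip(events, config)))

num_configurations = len(configurations)
print(num_configurations)  # 2 ** 3
```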

Q: how did we get P(E4|E1E2E3K) = P(E4|K) on slide 10?

A1: It occurs if we assume E4 is conditionally independent of E1,E2,E3 given K
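
Written out, the chain rule conditioned on K gives four factors, and the conditional independence assumption is what simplifies the last one. This is a sketch of that single step, not a reproduction of the slide:

```latex
\begin{align*}
P(E_1 E_2 E_3 E_4 \mid K)
  &= P(E_1 \mid K)\,P(E_2 \mid E_1 K)\,P(E_3 \mid E_1 E_2 K)\,P(E_4 \mid E_1 E_2 E_3 K) \\
P(E_4 \mid E_1 E_2 E_3 K) &= P(E_4 \mid K)
  \quad \text{if } E_4 \text{ is conditionally independent of } E_1, E_2, E_3 \text{ given } K.
\end{align*}
```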

Q: Just to clarify, “countably many values” includes infinity?

A1: To be more precise, we NEVER ask "what is the probability that X = infinity". We CAN ask "what is the probability that X = arbitrarily large number" and there is no limit to how large.

Q: where is the two in 2(1/6) coming from and the three in 3(1/6)?

A1: If we are just summing probabilities, then it is Σ p(x). However, if we are computing expected value, it is defined as Σ x * p(x). The "x" is what you are referring to, and it is the output of the random variable. We are basically saying "multiply the output of the random variable by the probability of observing it".
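
The difference between the two sums can be seen in a short sketch, assuming the example is a single roll of a fair six-sided die (which the 1/6 terms suggest):

```python
from fractions import Fraction

# PMF of a fair six-sided die: p(x) = 1/6 for x = 1, ..., 6 (assumed example).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Summing probabilities alone always gives 1.
total_probability = sum(pmf.values())

# Expectation weights each value x by its probability:
# 1(1/6) + 2(1/6) + 3(1/6) + ... + 6(1/6)
expectation = sum(x * p for x, p in pmf.items())

print(total_probability, expectation)
```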

Q: could we talk about how the support of y would differ from the event space?

A1: Are you referring to the sample space? The sample space is the set of possible outcomes (the elements of the set are literally the faces of two dice). The random variable Y maps from outcomes of the sample space to the number line.

A2: You can have different random variables that map from the same sample space but output differently to the number line. E.g. the sample space is "outcome of 3 coin flips". You could define a random variable X as the number of heads you see. You could define another random variable Y as the number of heads you see multiplied by 2. Both X and Y would have the same sample space {HHH, HTH, HHT ...}, but the support of X would be {0,1,2,3} and the support of Y would be {0,2,4,6}
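
The coin-flip example above can be sketched as follows; X and Y are ordinary functions from outcomes to numbers, and the support is the set of values they can take:

```python
from itertools import product

# Sample space: all outcomes of 3 coin flips, e.g. "HHT".
sample_space = ["".join(flips) for flips in product("HT", repeat=3)]


def X(outcome):
    """Number of heads in the outcome."""
    return outcome.count("H")


def Y(outcome):
    """Number of heads multiplied by 2."""
    return 2 * X(outcome)


# Same sample space, different supports.
support_X = sorted({X(o) for o in sample_space})
support_Y = sorted({Y(o) for o in sample_space})

print(sample_space)
print(support_X, support_Y)
```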

Q: are we going to go over how to generate a PMF for a given example?

A1: If your sample space was "the upward facing side of each of 5 coin flips", then a random variable is just a way of mapping each of the possible outcomes to a number. We'll cover during discussion section!
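
As a minimal sketch of building a PMF by enumeration, assume the coin is fair (so all 32 outcomes are equally likely) and take the random variable to be the number of heads in 5 flips. Both choices are assumptions made for illustration:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**5 equally likely outcomes of 5 fair coin flips (assumed fair).
outcomes = list(product("HT", repeat=5))

# Map each outcome to a number (here: its count of heads) and tally.
counts = Counter(flips.count("H") for flips in outcomes)

# PMF: P(X = k) = (# outcomes mapping to k) / (# outcomes).
pmf = {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

for k, p in pmf.items():
    print(f"P(X = {k}) = {p}")
```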

Q: What does LOTUS stand for?

A1: Law of the Unconscious Statistician. It refers to the useful rule we learned for computing the expected value of a FUNCTION of X. It is not that different from computing the expected value of X itself, but it is not trivial mathematically why that is.
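
A small sketch of the rule, E[g(X)] = Σ g(x) * p(x), using a fair six-sided die and g(x) = x^2 as an assumed example. The helper name expectation_of is hypothetical:

```python
from fractions import Fraction

# PMF of a fair six-sided die (assumed example).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def expectation_of(g, pmf):
    """E[g(X)] = sum over x of g(x) * p(x)."""
    return sum(g(x) * p for x, p in pmf.items())


e_x = expectation_of(lambda x: x, pmf)            # E[X]
e_x_squared = expectation_of(lambda x: x**2, pmf)  # E[X^2]

print(e_x, e_x_squared)
```

Note that E[X^2] is not the same as (E[X])^2, which is why the function has to be applied inside the sum.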

Q: In the second case should you be willing to pay $16 or something else?

A1: If you’re constrained to pay a perfect power of 2, then it’s reasonable to say you’d be willing to spend as much as 16 dollars to play, yes.