Jan 27th, 2021
By the end of lecture, you should understand variance, how to compute it, what a Bernoulli trial is, and what a Binomial distribution is.
Variance, Bernoulli, Binomial
https://www.gradescope.com/courses/226051/assignments/969374
Q: Wait, so it's the average of all the squared distances between the mean and …?
A1: And every value that X can take on! It is a weighted average: each squared distance is weighted by the probability of that value.
Q: What is the (x - mu)^2 in the table?
A1: In this case the table doesn’t actually display/encode the Variance itself. We derive the variance from the pmf, which is what the table is displaying :)
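As a rough illustration of deriving the variance from a pmf table, here is a minimal Python sketch. The pmf below is made up for the example (it is not the table from the slide), and the helper names `mean` and `variance` are just illustrative.

```python
# Hypothetical pmf table (NOT the one from the slide): value -> probability.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

def mean(pmf):
    """E[X]: each value weighted by its probability."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2]: each squared distance to the mean,
    weighted by the probability of that value."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

print(f"mean = {mean(pmf):.2f}, variance = {variance(pmf):.2f}")
```

The (x - mu)^2 terms never appear in the table itself; they are computed from the values and probabilities that the table does list.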
Q: how does property 1 make intuitive sense?
A1: Property 1 says that the variance equals the average of the squared values of X, E[X^2], minus the square of the average of X, (E[X])^2. Intuitively, E[X^2] measures how far the values sit from 0 on average (in squared terms), and subtracting (E[X])^2 removes the part of that distance that is due to the mean itself. What is left is the spread of the distribution around its mean.
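For anyone who wants to see where property 1 comes from, here is one standard derivation, written with mu = E[X] and using only linearity of expectation. The notation is an assumption and may differ from the slides.

```latex
\begin{align*}
\mathrm{Var}(X) &= E\big[(X - \mu)^2\big] \\
                &= E\big[X^2 - 2\mu X + \mu^2\big] \\
                &= E[X^2] - 2\mu E[X] + \mu^2 \\
                &= E[X^2] - 2\mu^2 + \mu^2 \\
                &= E[X^2] - (E[X])^2
\end{align*}
```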
Q: why is expectation linear?
A1: Expectation is linear because it is just a weighted sum of each value with its probability, and sums are linear. Variance contains a squared term, so it is not linear.
Q: I don't understand the significance of Var(aX), could this be translated into more tangible terms?
A1: Var(aX + b) = a^2 Var(X). Multiplying X by a constant a stretches the distribution by a factor of a, which scales the spread, as measured by the variance, by a^2. We can drop the constant +b term because it just shifts the entire distribution without affecting its spread.
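To make this more tangible, here is a small Python sketch that checks the scaling rule numerically. The pmf and the constants a and b are arbitrary choices for illustration, and `variance` and `transform` are hypothetical helper names.

```python
def variance(pmf):
    """Var(X) computed from a pmf given as {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def transform(pmf, a, b):
    """pmf of a*X + b: same probabilities, values relabelled."""
    out = {}
    for x, p in pmf.items():
        y = a * x + b
        out[y] = out.get(y, 0.0) + p
    return out

pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # hypothetical pmf
a, b = 3, 10                        # arbitrary scale and shift

var_x = variance(pmf)
var_scaled = variance(transform(pmf, a, b))    # scale and shift
var_shifted = variance(transform(pmf, 1, b))   # shift only

print(var_x, var_scaled, a ** 2 * var_x, var_shifted)
```

The shift-only variance matches Var(X), while the scaled one is a^2 times larger.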
Q: got it thank you!
A1: Of course!
Q: what does he mean by second moment?
A1: E[(X - E[X])^2] is the second central moment, whereas E[X^2] is known as the second raw moment. In this class, when we say second moment, we mean the second central moment, i.e. the variance.
Q: is variance the same as the second moment? i just joined.
A1: For our purposes in this class, yes!
Q: Why isn't the variance for Bernoulli p(1-p)^2, following the definition of variance?
A1: Hard to answer through Q&A, but here is a link to a few proofs deriving the variance of a Bernoulli: https://proofwiki.org/wiki/Variance_of_Bernoulli_Distribution If you have questions about the steps, feel free to stop by Jerry's or one of the TAs' office hours :)
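In the meantime, here is a short sketch of one such derivation, assuming X ~ Ber(p) takes the value 1 with probability p and 0 with probability 1-p, so E[X] = p. The expression p(1-p)^2 is only the x = 1 term of the sum; the x = 0 term is needed as well.

```latex
\begin{align*}
\mathrm{Var}(X) &= \sum_{x \in \{0,1\}} (x - p)^2 \, P(X = x) \\
                &= (0 - p)^2 (1 - p) + (1 - p)^2 \, p \\
                &= p(1 - p)\big[p + (1 - p)\big] \\
                &= p(1 - p)
\end{align*}
```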
Q: What does the “~” in, for example, X~Bin(n,p) stand for/mean?
A1: You can read “~” as “is distributed as”. So X ~ Bin(n,p) means: we have some random variable X, where X follows a Binomial distribution with n independent trials, each of which has probability p of success.
Q: How would we read X~Bin(X)? Is it “A binomial random variable of X”?
Q: I think the slide should say Bin(p) random variables? First sentence on this slide.
A1: Ber(p) is correct in this case. What we mean is that a Binomial random variable is the result of n independent Ber(p) trials occurring one after the other: it counts how many of them succeed. Ex: when I flip a fair coin I have a 50% probability of H or T. This is a Bernoulli trial, since it is either a success or a failure. If I ask, what is the distribution of the number of heads in 5 independent coin flips, that is the sum of 5 independent Bernoulli random variables, which is a Binomial distribution.
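One way to convince yourself of this is to enumerate every possible sequence of flips and compare against the Binomial pmf formula. This is a minimal Python sketch under the assumption that the flips are independent with the same p; the function names are made up for the example.

```python
from itertools import product
from math import comb

def pmf_by_enumeration(n, p):
    """Add up the probability of every length-n sequence of 0/1 flips,
    grouped by the number of successes (1s) in the sequence."""
    pmf = [0.0] * (n + 1)
    for flips in product([0, 1], repeat=n):
        k = sum(flips)
        # Independence: a sequence's probability is the product of its flips.
        pmf[k] += p ** k * (1 - p) ** (n - k)
    return pmf

def binomial_pmf(n, p):
    """P(X = k) = (n choose k) p^k (1-p)^(n-k) for k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

n, p = 5, 0.5   # five fair coin flips, as in the example above
enumerated = pmf_by_enumeration(n, p)
formula = binomial_pmf(n, p)
for k in range(n + 1):
    print(k, enumerated[k], formula[k])
```

The two columns should agree for every k, since (n choose k) is exactly the number of sequences with k successes.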
Q: what are p^0, p^1, …? (from the 3 coin flips slide)
A1: p0=p1=p2=0.5. Aka we are flipping the same fair coin which has a 50/50 probability, three times!
Q: What does w.p mean?
A1: w.p. = with probability!
Q: Variance is not bounded below or above (ie it can be higher than 1 or lower than 0)?
A1: 1.) A variance cannot be negative, since we square terms in the definition. 2.) It is not bounded above: variance measures the spread of a random variable, and the support of that RV could be any set of numbers. For example, let X = avg. number of minutes a person sleeps and Y = avg. number of seconds a person sleeps. Var(X) and Var(Y) can both be greater than 1. Additionally, since Y = 60X, Var(Y) = 60^2 Var(X) = 3600 Var(X), so changing our units from minutes to seconds makes the variance larger.
Q: how would you model shaking the board
A1: By shaking the board, do you mean physically shaking it left/right? If so, this would be hard to model, since you're adding much more uncertainty to the system.
Q: Why did we do 5 choose k for the plinko game?
A1: We chose 5 for the (5 choose k) since the total height of the pyramid is equal to 5. No matter what, our journey will always look like L,R,R,L,L or R,R,L,L,L etc. (length 5).
Q: yeah! for the board
A1: The RV is Binomial since each bounce off a pin is a Bernoulli trial (left or right). So hitting multiple pins in succession can be modelled as a Binomial. Does that answer your question?
Q: I understand the 5 but why k? I don’t get why that formula will actually lead to bucket k. Does it have to do with the count of lefts and rights?
A1: Yes, exactly. You can think of it as counting all the possible ways to choose which k of the 5 bounces go right (the rest go left).
Q: Oh, I was asking why we made the success “right”. Was that just because we can decide which one is a success?
A1: In other words, choosing which bounces are rights is the same as choosing which bounces are lefts, i.e. (n choose k) = (n choose n-k).
A2: Exactly! The problem is very similar if we decide left is a success. We just chose right in this case, but the other choice is an analogous problem.
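To see the counting argument concretely, here is a small Python sketch that lists every left/right path down a board of height 5. It assumes that bucket k is the one reached by paths with exactly k rights (buckets numbered from 0); that numbering and the name `paths_to_bucket` are assumptions for illustration.

```python
from itertools import product
from math import comb

HEIGHT = 5  # number of pin rows, matching the height-5 board discussed above

def paths_to_bucket(k, height=HEIGHT):
    """Every L/R path of the given length that contains exactly k rights."""
    return [
        "".join(path)
        for path in product("LR", repeat=height)
        if path.count("R") == k
    ]

for k in range(HEIGHT + 1):
    n_paths = len(paths_to_bucket(k))
    # Number of paths, (5 choose k), and (5 choose 5-k) should all match.
    print(k, n_paths, comb(HEIGHT, k), comb(HEIGHT, HEIGHT - k))
```

Swapping which direction counts as a success just relabels bucket k as bucket 5-k, which is why the choice of "right" is arbitrary.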
Q: for the instance that we are doing nCn, will (1 - p) be raised to the 0 power?
A1: Yep!
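Written out, assuming X ~ Bin(n,p) with the usual pmf, the k = n case looks like this:

```latex
\[
P(X = n) = \binom{n}{n} \, p^{n} (1 - p)^{0} = 1 \cdot p^{n} \cdot 1 = p^{n}
\]
```

So the probability of all n trials succeeding is just p multiplied by itself n times.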