Lecture Materials


Learning Goals

Know how to compute conditional probabilities for continuous random variables

Reading

None!

Concept Check

https://www.gradescope.com/courses/226051/assignments/1037600

Questions & Answers


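Q: If you can’t factorize, is it definitely dependent?

A1:  Yes, it’s an if-and-only-if thing: two random variables are independent exactly when their joint density factors into the product of the marginals. If you can’t factor it, then the two random variables are mathematically coupled and therefore dependent.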
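A rough numerical check of the factoring criterion, as a sketch only. The two densities on the unit square (4xy and x + y) and the helper names are illustrative assumptions, not examples from the lecture. The code compares each joint density with the product of its numerically integrated marginals.

```python
# Compare a joint density with the product of its marginals on a grid.
# Both example densities live on the unit square and are illustrative choices.

def marginals(joint, n=200):
    """Approximate f_X and f_Y at the midpoints of an n-by-n grid on [0, 1]^2."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    f_x = [sum(joint(x, y) for y in pts) * h for x in pts]
    f_y = [sum(joint(x, y) for x in pts) * h for y in pts]
    return pts, f_x, f_y

def max_factor_gap(joint, n=200):
    """Largest |f(x, y) - f_X(x) * f_Y(y)| over the grid; near 0 means it factors."""
    pts, f_x, f_y = marginals(joint, n)
    return max(
        abs(joint(x, y) - f_x[i] * f_y[j])
        for i, x in enumerate(pts)
        for j, y in enumerate(pts)
    )

gap_product = max_factor_gap(lambda x, y: 4.0 * x * y)  # factors as (2x)(2y)
gap_sum = max_factor_gap(lambda x, y: x + y)            # does not factor
print(gap_product, gap_sum)
```

The first gap should be at floating-point noise level, while the second stays well away from zero, which is the signature of dependence.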


Q: Is there a reason why we say P(a1 < X <= a2, b1 < Y <= b2) instead of P(a1 <= X <= a2) etc?

A1:  It’s arbitrary, and I don’t think there’s a standard. For continuous random variables, < and <= are interchangeable because the area under a point is zero, the volume under a line is zero, etc.
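One place where the half-open form is convenient is that it lines up with the joint CDF F(x, y) = P(X <= x, Y <= y), so the rectangle probability comes out by inclusion-exclusion. The sketch below assumes, purely for illustration, that X and Y are independent standard normals; the function names are hypothetical.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def joint_cdf(x, y):
    """Joint CDF of two independent standard normals (an assumed example)."""
    return phi(x) * phi(y)

def rect_prob(a1, a2, b1, b2):
    """P(a1 < X <= a2, b1 < Y <= b2) by inclusion-exclusion on the joint CDF."""
    return (joint_cdf(a2, b2) - joint_cdf(a1, b2)
            - joint_cdf(a2, b1) + joint_cdf(a1, b1))

p = rect_prob(-1.0, 1.0, 0.0, 2.0)
print(p)
```

Because the variables are continuous, swapping any < for <= in the event leaves p unchanged.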


Q: so the total area under these graphs is 1?

A1:  Yes, very good (strictly speaking, in two dimensions it’s the total volume under the surface that is 1). They are all designed to be valid joint PDFs in two dimensions.


Q: It looks like the terms in the covariance matrix that involve correlation cancel out to just be covariance. How come they’re written that way instead of just covariance?

A1:  I’ve actually seen both forms, and this is the first time I’ve seen this form used, so it must be Chris’s preference. But you could put the covariance notation there and it would be equally valid.
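Written out, the two forms agree term by term. The symbols below (sigma_X, sigma_Y, rho) are the usual ones and are assumed to match the slide's notation.

```latex
\Sigma =
\begin{bmatrix}
\sigma_X^2 & \rho\,\sigma_X \sigma_Y \\
\rho\,\sigma_X \sigma_Y & \sigma_Y^2
\end{bmatrix},
\qquad
\rho = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}
\;\Longrightarrow\;
\rho\,\sigma_X \sigma_Y = \operatorname{Cov}(X, Y).
```

So each off-diagonal entry is just Cov(X, Y); the rho form keeps the correlation visible as its own parameter.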


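Q: Does 0 correlation ensure independence? Or is there any consistent relationship there?

A1:  No, that’s one of those things that isn’t true in general. Independence implies zero correlation, but not the other way around. With X symmetric about zero (e.g. a standard normal), Y = X^2 is an example of two variables that are dependent but uncorrelated.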
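A small simulation of the Y = X^2 example, as a sketch under the assumption that X is a standard normal (the sample size and seed are arbitrary choices). It estimates the correlation and then shows the dependence by conditioning on |X| > 1.

```python
import math
import random

random.seed(0)
n = 200_000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]  # Y is a deterministic function of X

def corr(u, v):
    """Sample Pearson correlation of two equal-length lists."""
    mu_u = sum(u) / len(u)
    mu_v = sum(v) / len(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

r = corr(xs, ys)

# Dependence check: knowing |X| > 1 changes the probability that Y > 1.
p_y_gt_1 = sum(y > 1.0 for y in ys) / n
tail = [y for x, y in zip(xs, ys) if abs(x) > 1.0]
p_y_gt_1_given_tail = sum(y > 1.0 for y in tail) / len(tail)

print(r, p_y_gt_1, p_y_gt_1_given_tail)
```

The correlation estimate should sit near zero, while the conditional probability jumps to 1, so the two variables are clearly not independent.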


Q: If correlation is nonzero is there some way to rotate it so that it is zero? Like maybe write X’ and Y’ in terms of X and Y which are independent Gaussians?

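A1:  Very cool question. The idea amounts to redefining the axes with a linear change of variables. For a bivariate normal the answer is yes: rotating the axes onto the eigenvectors of the covariance matrix gives X’ and Y’ with zero correlation, and because X’ and Y’ are still jointly normal, zero correlation makes them independent Gaussians.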
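For the jointly Gaussian case, the rotation idea can be checked with a few lines of arithmetic. This is a sketch with made-up parameter values (sigma_x = 2, sigma_y = 1, rho = 0.6): it picks the rotation angle from the covariance entries and computes the covariance of the rotated pair X’ = cX + sY, Y’ = -sX + cY.

```python
import math

# Assumed example parameters for a bivariate normal (not from the lecture).
sigma_x, sigma_y, rho = 2.0, 1.0, 0.6
var_x, var_y = sigma_x ** 2, sigma_y ** 2
cov_xy = rho * sigma_x * sigma_y

# Angle that zeroes the off-diagonal term: tan(2 * theta) = 2 Cov / (Var X - Var Y).
theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
c, s = math.cos(theta), math.sin(theta)

# Second moments of X' = c X + s Y and Y' = -s X + c Y.
var_xp = c * c * var_x + 2.0 * c * s * cov_xy + s * s * var_y
var_yp = s * s * var_x - 2.0 * c * s * cov_xy + c * c * var_y
cov_p = c * s * (var_y - var_x) + (c * c - s * s) * cov_xy

print(theta, var_xp, var_yp, cov_p)
```

The rotated covariance comes out as zero up to rounding. Since a rotation of jointly normal variables is still jointly normal, uncorrelated X’ and Y’ are then independent.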


Q: So 0 correlation only implies independence if the two (or more) random vars are normal?

A1:  Erase the word “only” and yes, as long as the normals are jointly normal. Nothing says the two have to be normal for rho = 0 to mean independence. There might be other RVs for which this happens to be true.


Q: I’m not sure what weight means in this context. Is this how blurry a pixel is?

A1:  live answered


Q: So the image is symmetric because both X and Y are normal, and therefore have symmetric probability distributions?

A1:  X and Y are normal with the same spread -> symmetric weights drawn from both directions.
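A minimal sketch of why equal spreads give symmetric weights. It assumes the weights come from a product of two independent normal densities evaluated at integer pixel offsets; the function name, grid radius, and sigma values are hypothetical and not taken from the lecture demo.

```python
import math

def gaussian_weights(radius, sigma_x, sigma_y):
    """Normalized weights on a (2r+1) x (2r+1) grid; rows index dy, columns dx."""
    offsets = range(-radius, radius + 1)
    raw = [
        [math.exp(-(dx * dx) / (2.0 * sigma_x ** 2) - (dy * dy) / (2.0 * sigma_y ** 2))
         for dx in offsets]
        for dy in offsets
    ]
    total = sum(sum(row) for row in raw)
    return [[w / total for w in row] for row in raw]

same = gaussian_weights(2, 1.0, 1.0)       # same spread in both directions
different = gaussian_weights(2, 2.0, 1.0)  # wider spread horizontally

# Weight two pixels above the center vs. two pixels to the left of it.
print(same[0][2], same[2][0])
print(different[0][2], different[2][0])
```

With equal spreads the two weights match; with unequal spreads they differ, so the weights are no longer the same in both directions.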


Q: I’m a little confused on what “a density” is and why we want to use it?

A1:  In the case of continuous probability distributions, the probabilities don’t come at discrete values, but are rather smeared across the full range of possible values. Probabilities are areas under curves, so the height of the curve at any one point is a density.
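To make the height-versus-area distinction concrete, here is a small sketch with an assumed normal distribution (mean 0, standard deviation 0.1). The density at the mean is larger than 1, which a probability could never be, while the area over an interval is a proper probability.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Area under the normal density curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

mu, sigma = 0.0, 0.1  # assumed example parameters

height = normal_pdf(0.0, mu, sigma)                               # a density, about 3.99
prob = normal_cdf(0.05, mu, sigma) - normal_cdf(-0.05, mu, sigma) # an area, about 0.38

print(height, prob)
```

A rough way to read the density: for a narrow interval of width dx around a point x, the probability is approximately height times dx.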


Q: so when you find the conditional probability of two cont. random vars is it like a line integral because you’re finding the exact probability of a single point and then looking at what the other var does?

A1:  I’m not 100% following the line integral metaphor, so I’ll need to engage you on this after class.


Q: I’m still a bit confused on why for the bayes theorem conditioning on a continuous results in a probability but conditioning on a discrete variable results in a density?

A1:  When the affected variable (that is, the A of A | B) is a discrete random variable, then the probability function is still on A, which is discrete, whether B is continuous or discrete. So Bayes’ Theorem in that case gives us probabilities instead of probability densities. When the affected variable (i.e. A) is continuous, then anything where A is the variable has to be expressed as a density, regardless of whether it’s unconditioned, conditioned on a discrete B, or conditioned on a continuous B. Bottom line: if A is discrete, BT gives us probabilities. If A is continuous, BT gives us probability densities.
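The cases can be written side by side. The symbols here are assumptions for illustration: N is a discrete random variable with PMF p, and X and Y are continuous random variables with densities f.

```latex
% Discrete N given continuous X: the left-hand side is a probability.
P(N = n \mid X = x) = \frac{f_{X \mid N}(x \mid n)\, p_N(n)}{f_X(x)}

% Continuous X given discrete N: the left-hand side is a density.
f_{X \mid N}(x \mid n) = \frac{P(N = n \mid X = x)\, f_X(x)}{p_N(n)}

% Continuous X given continuous Y: the left-hand side is a density.
f_{X \mid Y}(x \mid y) = \frac{f_{Y \mid X}(y \mid x)\, f_X(x)}{f_Y(y)}
```

In each line, whether the left-hand side is a probability or a density is decided by the variable in front of the bar, not by the one being conditioned on.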


Q: when do you know to use the bivariate normal?

A1:  That’s just the model that’s chosen by the problem. We often assume that noise is modeled as either a Gaussian or something related to a Gaussian (e.g. it’s log normal as opposed to normal).