March 3rd, 2021
To understand how to treat a distribution's parameters as unknowns, and then determine what values those parameters should be assigned so that the distribution optimally matches the data samples.
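Written out, the goal above is usually stated as maximum likelihood estimation. The following is a sketch that assumes the samples x_1, ..., x_n are independent and identically distributed; the symbols (θ for the unknown parameters, f for the density or mass function) are notation chosen here, not taken from the session.

```latex
\hat{\theta}_{\text{MLE}}
  = \arg\max_{\theta} \prod_{i=1}^{n} f(x_i \mid \theta)
  = \arg\max_{\theta} \sum_{i=1}^{n} \log f(x_i \mid \theta)
```

The second equality holds because the logarithm is increasing, so the product and the sum of logs peak at the same θ.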
A1: live answered
Q: Does the beta distribution getting tighter after more samples have to do with the Chernoff bound?
Q: Hi Chris this is a test question
Q: Since spring quarter course registration opens up this weekend, I was wondering if you have any general advice for what courses would be good follow-ups to CS109 (particularly ones with a lot of statistics). Thanks!
A1: Great question! Generally CS221 is the most direct successor, when you are ready. CS161 is another fantastic class. Then! There is… CS228: probabilistic graphical models. It’s filled with profound math which takes the best ideas in CS109 to the next level. I haven’t checked if those are offered in Spring.
Q: usually we look at negative log likelihoods right since then we can use gradient descent?
A1: coming right up in a probability course near you! (this one by monday)
Q: so we would check because we really just know that’s an extremum and don’t know whether it’s a min or max?
Q: how do we know there aren’t any local optima or are we ok with the idea that we might not be at the global optimum?
Q: Can we compute the Hessian to avoid maximums?
A1: you can! In a few days we will learn “gradient descent” which is really the bread and butter algorithm for AI. It never ends up in a maximum :)
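As a rough sketch of where this is heading (not the course's own code), here is gradient descent on the negative log-likelihood of a coin-flip (Bernoulli) model. The counts, step size, iteration count, and function names are all assumptions chosen for illustration.

```python
def neg_log_likelihood_grad(p, heads, tails):
    # Derivative with respect to p of -(heads*log(p) + tails*log(1 - p)).
    return -(heads / p) + tails / (1 - p)


def fit_bernoulli(heads, tails, step=0.001, iters=5000):
    p = 0.5  # arbitrary starting guess
    for _ in range(iters):
        # Step downhill on the negative log-likelihood.
        p -= step * neg_log_likelihood_grad(p, heads, tails)
        # Keep p strictly inside (0, 1) so the logs stay defined.
        p = min(max(p, 1e-6), 1 - 1e-6)
    return p


# Hypothetical data: 7 heads and 3 tails.
p_hat = fit_bernoulli(heads=7, tails=3)
print(p_hat)  # close to heads / (heads + tails)
```

For this particular model the second derivative of the negative log-likelihood, heads/p^2 + tails/(1-p)^2, is positive for every p in (0, 1), so there is a single minimum and the Hessian check comes out the same everywhere. That is not guaranteed for other models.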
Q: have the panelists watched the classic "lion king 1.5?"
A1: I’ve seen Lion King 1 and 2… what is this 1.5? That sounds like some Harry Potter 9 and 3/4 sort of magic… :)
Q: can you go over what unbiased vs biased means in probability?
A1: Good question. In stats theory, “unbiased” means that your guess, thought of as a random variable, is correct in expectation (it could be wrong, but in expectation it’s right!)
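One way to see "correct in expectation" concretely is to average an estimator over every possible sample. The sketch below assumes a fair six-sided die and samples of size 2 (both choices are illustrative, not from the session) and compares two variance estimates: dividing by n, which is the form the maximum likelihood estimate of a Gaussian's variance takes, and dividing by n - 1.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]  # assumed population: one fair die
n = 2                       # assumed sample size


def variance_estimate(sample, divisor):
    # Sum of squared deviations from the sample mean, over the chosen divisor.
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor


# True variance of a single roll.
true_mean = Fraction(sum(faces), len(faces))
true_var = sum((x - true_mean) ** 2 for x in faces) / len(faces)

# All 36 equally likely samples of size 2, so a plain average is the expectation.
samples = list(product(faces, repeat=n))
expected_biased = sum(variance_estimate(s, n) for s in samples) / len(samples)
expected_unbiased = sum(variance_estimate(s, n - 1) for s in samples) / len(samples)

print(true_var, expected_biased, expected_unbiased)
```

Dividing by n - 1 averages out to exactly the true variance, while dividing by n averages out to (n - 1)/n of it, so the second estimator is the biased one even though any single guess from either can be off.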
Q: Is this an example of “overfitting” or is it just not having enough data?
A1: exactly :D
Q: Is the fact that the uniform is not nicely differentiated related to the fact that the uniform distribution is not in the exponential family of distributions?
A1: I think it’s because of the discontinuity (which is what Jerry just said). But what you say isn’t wrong, the exponential family is generally nicely differentiated.
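A small sketch of that discontinuity, using made-up samples and a hypothetical helper name: the likelihood of Uniform(a, b) is (1/(b - a))^n when every sample lies in [a, b] and zero otherwise.

```python
def uniform_likelihood(samples, a, b):
    # Zero as soon as any sample falls outside [a, b]; this is the jump.
    if a >= b or min(samples) < a or max(samples) > b:
        return 0.0
    return (1.0 / (b - a)) ** len(samples)


# Hypothetical samples, for illustration only.
samples = [0.15, 0.20, 0.30, 0.45, 0.65, 0.70, 0.75]

# The tightest interval that still covers all the data.
a_hat, b_hat = min(samples), max(samples)

tight = uniform_likelihood(samples, a_hat, b_hat)
wider = uniform_likelihood(samples, a_hat - 0.1, b_hat + 0.1)
too_narrow = uniform_likelihood(samples, a_hat + 0.01, b_hat)
print(tight, wider, too_narrow)
```

Shrinking the interval raises the likelihood right up until it excludes a sample, at which point the likelihood drops straight to zero. The maximum sits on that edge, so setting a derivative to zero does not find it.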