Lecture Materials


Learning Goals

To understand how to treat a distribution's parameters as unknowns, and then determine which values those parameters should take so that the distribution best matches the data samples.

Reading

None!

Concept Check

https://www.gradescope.com/courses/226051/assignments/1070164

Questions & Answers


Q: Does the beta distribution getting tighter after more samples have to do with the chernoff bound?

A1:  live answered


Q: Hi Chris this is a test question

A1:  yay!


Q: Since spring quarter course registration opens up this weekend, I was wondering if you have any general advice for what courses would be good follow-ups to CS109 (particularly ones with a lot of statistics). Thanks!

A1:  Great question! Generally CS221 is the most direct successor, when you are ready. CS161 is another fantastic class. Then there is CS228: probabilistic graphical models. It's filled with profound math which takes the best ideas in CS109 to the next level. I haven’t checked whether those are offered in Spring.


Q: usually we look at negative log likelihoods, right, since then we can use gradient descent?

A1:  coming right up in a probability course near you! (this one by monday)


Q: so we would check because we really just know that’s an extremum and don’t know whether it's a min or max?

A1:  exactly
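As a worked illustration of that check, here is a sketch for a Bernoulli parameter p, assuming n samples of which k are 1 (this particular example and the symbols n, k, and LL are assumptions for illustration, not necessarily the case discussed in lecture). Setting the first derivative of the log-likelihood to zero finds the extremum, and the sign of the second derivative confirms it is a maximum:

```latex
\begin{align*}
LL(p) &= k \log p + (n - k) \log (1 - p) \\
\frac{\partial LL}{\partial p} &= \frac{k}{p} - \frac{n - k}{1 - p} = 0
  \quad\Longrightarrow\quad \hat{p} = \frac{k}{n} \\
\frac{\partial^2 LL}{\partial p^2} &= -\frac{k}{p^2} - \frac{n - k}{(1 - p)^2} < 0
  \quad \text{for } 0 < p < 1
\end{align*}
```

Because the second derivative is negative everywhere on (0, 1), the extremum at k/n (for 0 < k < n) is a maximum rather than a minimum.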


Q: how do we know there aren’t any local optima or are we ok with the idea that we might not be at the global optimum?

A1:  live answered


Q: Can we compute the Hessian to avoid maximums?

A1:  you can! In a few days we will learn “gradient descent” which is really the bread and butter algorithm for AI. It never ends up in a maximum :)
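A minimal sketch of that idea, assuming a Bernoulli parameter fit by gradient descent on the average negative log-likelihood. The function names, step size, and iteration count are hypothetical choices for illustration, not the version taught in the course:

```python
def avg_neg_log_likelihood_grad(p, heads, n):
    """Derivative with respect to p of the average negative log-likelihood,
    -(heads * log(p) + (n - heads) * log(1 - p)) / n."""
    return (-(heads / p) + (n - heads) / (1 - p)) / n


def fit_bernoulli(heads, n, step=0.01, iters=2000, p=0.5):
    """Gradient descent on the negative log-likelihood (assumed settings)."""
    for _ in range(iters):
        # Step downhill on the negative log-likelihood.
        p -= step * avg_neg_log_likelihood_grad(p, heads, n)
        # Keep p strictly inside (0, 1) so the gradient stays finite.
        p = min(max(p, 1e-6), 1 - 1e-6)
    return p


p_hat = fit_bernoulli(heads=7, n=10)
print(p_hat)  # should approach the closed-form answer heads / n
```

Because each step moves downhill on the negative log-likelihood, the iterate moves away from that function's maxima and toward a minimum, which is a maximum of the likelihood itself.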


Q: have the panelists watched the classic "lion king 1.5?"

A1:  I've seen Lion King 1 and 2… what is this 1.5? That sounds like some Harry Potter 9 and 3/4 sort of magic… :)


Q: can you go over what unbiased vs biased means in probability?

A1:  Good question. In stats theory, “unbiased” means that your guess, thought of as a random variable, is correct in expectation (any single guess could be wrong, but in expectation it's right!)
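A small simulation can make "correct in expectation" concrete. The sketch below assumes Gaussian samples with true variance 1 and compares two variance estimators averaged over many repeated experiments: dividing by n (biased) and dividing by n - 1 (unbiased). The function name, sample size, and trial count are hypothetical choices:

```python
import random


def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average two variance estimators over many repeated experiments."""
    rng = random.Random(seed)
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        mean = sum(xs) / n
        squared_dev = sum((x - mean) ** 2 for x in xs)
        biased_total += squared_dev / n          # divides by n
        unbiased_total += squared_dev / (n - 1)  # divides by n - 1
    return biased_total / trials, unbiased_total / trials


biased_avg, unbiased_avg = average_variance_estimates()
# Theory: the divide-by-n estimator has expectation (n - 1) / n = 0.8 here,
# while the divide-by-(n - 1) estimator has expectation 1.
print(biased_avg, unbiased_avg)
```

Any single estimate from either formula can be far off; the difference is that only the second one averages out to the true value.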


Q: Is this an example of “overfitting” or is it just not having enough data?

A1:  exactly :D


Q: Is the fact that the uniform is not nicely differentiated related to the fact that the uniform distribution is not in the exponential family of distributions?

A1:  I think it's because of the discontinuity (which is what Jerry just said). But what you say isn’t wrong: the exponential family is generally nicely differentiable.
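To see the discontinuity concretely, here is a rough sketch assuming a Uniform(lo, hi) model. The sample values and the function name are made up for illustration. The likelihood grows as the interval shrinks, then drops abruptly to zero once the interval excludes any sample, so the maximum sits at the smallest and largest samples rather than at a point where a derivative equals zero:

```python
def uniform_likelihood(samples, lo, hi):
    """Likelihood of i.i.d. samples under Uniform(lo, hi)."""
    if hi <= lo:
        return 0.0
    if min(samples) < lo or max(samples) > hi:
        return 0.0  # any sample outside [lo, hi] has density zero
    return (1.0 / (hi - lo)) ** len(samples)


samples = [0.15, 0.20, 0.30, 0.40, 0.65, 0.70, 0.75]  # hypothetical data

# The tightest interval that still contains every sample maximizes the likelihood.
mle_lo, mle_hi = min(samples), max(samples)

print(uniform_likelihood(samples, 0.0, 1.0))        # wider interval, lower likelihood
print(uniform_likelihood(samples, mle_lo, mle_hi))  # tightest valid interval
print(uniform_likelihood(samples, 0.16, 0.75))      # excludes a sample
```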