David Kewei Lin

Welcome to my personal site!

Recent Posts

BTS III: Some loose ends

Oct 22, 2024

X-posted from the SIMO X-Men blog.

This is the last installment of the series sparked by the SMO Open Q4 problem, which led us to introduce some basic learning theory.

If you’re deciding whether to read this, the post splits roughly into two parts:

  • the first part deals with a certain flavor of inequality, where you average a maximum over all $2^n$ combinations of $\pm$ signs (see the display after this list);
  • the second part is more conventional ML theory content, tying generalization error to the Rademacher complexity of loss functions.
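For concreteness, the quantity in question looks like the following (my notation here, not necessarily the post’s):

$$\mathbb{E}_{\epsilon}\left[\max_{j}\sum_{i=1}^{n}\epsilon_i a_{ij}\right] = \frac{1}{2^n}\sum_{\epsilon\in\{\pm 1\}^n}\max_{j}\sum_{i=1}^{n}\epsilon_i a_{ij},$$

where the $\epsilon_i$ are independent uniform $\pm 1$ signs: an average, over all $2^n$ sign patterns, of a maximum.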

I would suggest reading just the first part unless you are really dying to see where Rademacher complexity comes from.

Read More…

BTS II: An introduction to learning theory

Oct 5, 2024

X-posted from the SIMO X-Men blog.

In this blogpost, we’ll go off on a tangent and explore what it means to learn from data. In the process, we will get slightly closer (but not quite there) to the context where Rademacher complexity emerges.

This is the second blogpost in the sequence, continuing from BTS I.


Read More…

Behind the Scenes of SMO Q4, Part 1: The Proposal Backstory

Sep 28, 2024

X-posted from the SIMO X-Men blog.

Recently, a problem of mine came out as Q4 on SMO Open 2024. I’ll go over how I proposed the problem and the theory behind it, which involves some machine learning ideas (!).

This will take a small sequence of posts:

  • Part 1: a complete backstory for how this problem came about

  • Part 1.5: trying to understand Rademacher complexity through examples

  • Part 2: an introduction to learning theory

Sampling from a Gibbs distribution

May 28, 2024


Here’s a folklore fact from statistical physics: given a random variable $X$ with density $p$, sampling from the tilted distribution with density $\propto p(x)e^{\beta x}$ is exactly the same as:

(1) sampling a large number $N$ of independent copies $X_1, \dots, X_N$ of $X$,

(2) conditioning on the sum satisfying $\sum_i X_i = N(u + o(1))$, where $$u = \frac{\mathbb{E}\left[X e^{\beta X}\right]}{\mathbb{E}\left[e^{\beta X}\right]}$$ (i.e. conditioning the empirical mean to match the mean of the target distribution), and then

(3) picking one of the samples at random.

In this post, we’ll try to state it rigorously and prove it.
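Before the rigorous treatment, here is a minimal Monte Carlo sketch of the claim in Python (my own illustration, not from the post: the standard Gaussian choice of $X$, the acceptance window, and all constants are assumptions for demonstration). For $X \sim \mathcal N(0,1)$, the tilted density $\propto p(x)e^{\beta x}$ is that of $\mathcal N(\beta, 1)$, so $u = \beta$:

```python
import numpy as np

# Illustration with X ~ N(0,1) and beta = 0.3 (constants chosen for demonstration).
# For a standard Gaussian, the tilted distribution ~ p(x) e^{beta x}
# is N(beta, 1), and u = E[X e^{beta X}] / E[e^{beta X}] = beta.
rng = np.random.default_rng(0)
beta, N, trials = 0.3, 20, 200_000
u = beta

picked = []
for _ in range(trials):
    xs = rng.standard_normal(N)  # (1) many independent copies of X
    # (2) condition on the sum being N(u + o(1)); here, a crude +-0.5 window
    if abs(xs.sum() - N * u) < 0.5:
        picked.append(rng.choice(xs))  # (3) pick one of the samples at random

picked = np.array(picked)
print(f"accepted batches: {len(picked)}")
print(f"picked mean     : {picked.mean():.3f}  (tilted mean u = {u})")
print(f"picked variance : {picked.var():.3f}  (tilted variance = 1)")
```

In the Gaussian case the conditioning can be checked exactly: given $\sum_i X_i = s$, each coordinate is distributed as $\mathcal N(s/N,\, 1 - 1/N)$, which converges to the tilted $\mathcal N(\beta, 1)$ as $N \to \infty$; so the printed variance should come out slightly below 1 at $N = 20$.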

Read More…