David Kewei Lin

Welcome to my personal site!

Recent Posts

BTS III: Some loose ends

Oct 22, 2024

X-posted from the SIMO X-Men blog.

This is the last installment of the series sparked by the SMO Open Q4 problem, which led us to introduce some basic learning theory.

If you’re deciding whether to read this, the post splits roughly into two parts:

  • the first part deals with a certain flavor of inequality, where you average a maximum over all $2^n$ combinations of $\pm$ signs (see the display after this list);
  • the second part is more conventional ML theory content, tying generalization error to the Rademacher complexity of loss functions.
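For concreteness, the quantity in question looks like the following (my notation here, not necessarily the post’s):

$$\mathbb{E}_{\epsilon}\left[\max_{j}\sum_{i=1}^{n}\epsilon_i a_{ij}\right] = \frac{1}{2^n}\sum_{\epsilon\in\{\pm 1\}^n}\max_{j}\sum_{i=1}^{n}\epsilon_i a_{ij},$$

where the $\epsilon_i$ are independent uniform $\pm 1$ signs: an average, over all $2^n$ sign patterns, of a maximum.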

I would suggest reading just the first part unless you are really dying to see where Rademacher complexity comes from.

Read More…

BTS II: An introduction to learning theory

Oct 5, 2024

X-posted from the SIMO X-Men blog.

In this blogpost, we’ll go off on a tangent and explore what it means to learn from data. In the process, we will get slightly closer (but not quite there) to the context where Rademacher complexity emerges.

This is the second blogpost in the sequence, continuing from BTS I.


Read More…

Behind the Scenes of SMO Q4, Part 1: The Proposal Backstory

Sep 28, 2024

X-posted from the SIMO X-Men blog.

Recently, a problem of mine came out as Q4 on SMO Open 2024. I’ll go over how I proposed the problem and the theory behind it, which involves some machine learning ideas (!).

This will take a small sequence of posts:

  • Part 1: a complete backstory for how this problem came about

  • Part 1.5: trying to understand Rademacher complexity through examples

  • Part 2: an introduction to learning theory

Sampling from a Gibbs distribution

May 28, 2024


Here’s a folklore fact from statistical physics: given a random variable $X$ with density $p$, sampling from the tilted distribution with density $\propto p(x)e^{\beta x}$ is exactly the same as:

(1) sampling a large number $N$ of independent copies $X_1, \dots, X_N$ of $X$,

(2) conditioning on the sum satisfying $\sum_i X_i = N(u + o(1))$, where $$u = \frac{\mathbb{E}\left[X e^{\beta X}\right]}{\mathbb{E}\left[e^{\beta X}\right]}$$ (i.e. conditioning the empirical mean to match the mean of the target distribution), and then

(3) picking one of the samples at random.

In this post, we’ll try to state it rigorously and prove it.
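Before the rigorous treatment, here is a minimal Monte Carlo sketch of the claim in Python (my own illustration, not from the post: the standard Gaussian choice of $X$, the acceptance window, and all constants are assumptions for demonstration). For $X \sim \mathcal N(0,1)$, the tilted density $\propto p(x)e^{\beta x}$ is that of $\mathcal N(\beta, 1)$, so $u = \beta$:

```python
import numpy as np

# Illustration with X ~ N(0,1) and beta = 0.3 (constants chosen for demonstration).
# For a standard Gaussian, the tilted distribution ~ p(x) e^{beta x}
# is N(beta, 1), and u = E[X e^{beta X}] / E[e^{beta X}] = beta.
rng = np.random.default_rng(0)
beta, N, trials = 0.3, 20, 200_000
u = beta

picked = []
for _ in range(trials):
    xs = rng.standard_normal(N)  # (1) many independent copies of X
    # (2) condition on the sum being N(u + o(1)); here, a crude +-0.5 window
    if abs(xs.sum() - N * u) < 0.5:
        picked.append(rng.choice(xs))  # (3) pick one of the samples at random

picked = np.array(picked)
print(f"accepted batches: {len(picked)}")
print(f"picked mean     : {picked.mean():.3f}  (tilted mean u = {u})")
print(f"picked variance : {picked.var():.3f}  (tilted variance = 1)")
```

In the Gaussian case the conditioning can be checked exactly: given $\sum_i X_i = s$, each coordinate is distributed as $\mathcal N(s/N,\, 1 - 1/N)$, which converges to the tilted $\mathcal N(\beta, 1)$ as $N \to \infty$; so the printed variance should come out slightly below 1 at $N = 20$.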

Read More…