David Kewei Lin

Welcome to my personal site!

Matroids!

Independent subsets of vectors inspire the definition of a matroid. We also introduce graphical matroids, along with several equivalent axiomatizations. Finally, we look at the connection to problems solvable by the greedy algorithm.

Reference: Oxley Chapter 1.

$\newcommand{\intersect}{\cap} \newcommand{\union}{\cup}$ $\newcommand{\cl}{\mathrm{cl}} % closure$

Two examples

The very first paper defining what a matroid was has the title “On the abstract properties of linear dependence” (Whitney). So there is no better starting place than to look at an object that it was trying to generalize.

Suppose we have a bunch of vectors, e.g.

$$E = \left\lbrace \binom{1}{0}, \binom{0}{1}, \binom{1}{1}, \binom{2}{0} \right\rbrace$$ and we can label these “$1,2,3,4$” respectively. We know what it means for a set of vectors to be linearly independent, so we can list down the linearly independent subsets $$\mathcal I = \lbrace \varnothing, \lbrace 1 \rbrace, \lbrace 2 \rbrace, \lbrace 3 \rbrace, \lbrace 4 \rbrace, \lbrace 1,2 \rbrace, \lbrace 1,3 \rbrace, \lbrace 2,3 \rbrace, \lbrace 2,4 \rbrace, \lbrace 3,4 \rbrace \rbrace$$
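As a sanity check, here is a small Python sketch (labels and vectors as above, names of my choosing) that enumerates the independent subsets by testing whether each subset has rank equal to its size; the `rank` helper is a minimal Gaussian elimination over the rationals:

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of vectors, via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# The running example: labels 1..4 for the four column vectors.
E = {1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0)}

# A subset is independent iff its rank equals its size.
independent_sets = [
    set(S)
    for k in range(len(E) + 1)
    for S in combinations(E, k)
    if rank([E[i] for i in S]) == len(S)
]
print(independent_sets)  # 10 sets; note {1, 4} is absent since (2,0) = 2*(1,0)
```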

Now imagine if I didn’t tell you what the vectors were (or even the underlying space), and all you have is the family $\mathcal I$. How might you check if I constructed $\mathcal I$ legitimately?

We would probably want to check the following:

- $\varnothing \in \mathcal I$: the empty set of vectors is (vacuously) linearly independent.
- If $I \in \mathcal I$ and $I' \subseteq I$, then $I' \in \mathcal I$: any subset of an independent set is independent.
- If $I_1, I_2 \in \mathcal I$ with $|I_1| < |I_2|$, then there is some $e \in I_2 \setminus I_1$ such that $I_1 \cup \lbrace e \rbrace \in \mathcal I$: a smaller independent set can always be augmented from a larger one (in linear algebra, this is the Steinitz exchange lemma).

A pair $(E, \mathcal I)$ satisfying these three properties is called a matroid. Ta-da, you know what a matroid is now.

Here is a totally different example: let $E$ be the edges of a graph, and let $$\mathcal I = \lbrace X \subseteq E: \text{$X$ does not contain a cycle} \rbrace$$ and we can check the same properties:

- The empty edge set contains no cycle.
- Any subset of an acyclic edge set is acyclic.
- If $I_1, I_2$ are acyclic with $|I_1| < |I_2|$, then the forest $I_1$ has more connected components than $I_2$, so some edge of $I_2$ joins two different components of $I_1$; adding that edge to $I_1$ keeps it acyclic.

This is called the graphical matroid of the graph.
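A quick way to test membership in this $\mathcal I$ is a union-find cycle check. The sketch below is one minimal way to do it, tried on a hypothetical triangle graph:

```python
def acyclic(edges):
    """Return True iff the edge set contains no cycle (union-find check)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # u and v already connected, so (u, v) closes a cycle
        parent[ru] = rv
    return True

# Hypothetical example: a triangle on vertices a, b, c.
triangle = [("a", "b"), ("b", "c"), ("a", "c")]
print(acyclic(triangle[:2]))  # True: two edges of a triangle form a path
print(acyclic(triangle))      # False: all three edges form a cycle
```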

Many different definitions!

The really cool thing is that we can take pretty much any feature we know from linear algebra, axiomatize it, and we always end up with a matroid.

(I) On a graph, we can consider $\mathcal C = \lbrace \text{cycles} \rbrace$. We expect:

- $\varnothing \notin \mathcal C$: the empty edge set is not a cycle.
- No cycle properly contains another cycle.
- If $C_1 \neq C_2$ are cycles sharing an edge $e$, then $(C_1 \cup C_2) \setminus \lbrace e \rbrace$ still contains a cycle.

The independent sets are precisely those not containing any member of $\mathcal C$. (In an abstract matroid, the members of $\mathcal C$ are called circuits rather than cycles.)

(II) For vectors, we can talk of bases - independent sets spanning the entire space (which we take to be the span of $E$, not necessarily $\mathbb R^n$). We only need the following:

- There is at least one basis.
- (Exchange property) If $B_1, B_2$ are bases and $x \in B_1 \setminus B_2$, then there is some $y \in B_2 \setminus B_1$ such that $(B_1 \setminus \lbrace x \rbrace) \cup \lbrace y \rbrace$ is again a basis.

The independent sets are precisely those which are subsets of some basis.
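Continuing the running vector example, we can enumerate the bases as the maximal independent sets, using the same small rational-rank helper:

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of vectors, via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

E = {1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0)}

def is_independent(S):
    return rank([E[i] for i in S]) == len(S)

# A basis is a maximal independent set: independent, and no element can be added.
bases = [
    set(S)
    for k in range(len(E) + 1)
    for S in combinations(E, k)
    if is_independent(S)
    and not any(is_independent(set(S) | {e}) for e in E if e not in S)
]
print(bases)  # [{1, 2}, {1, 3}, {2, 3}, {2, 4}, {3, 4}]
```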

(III) For vectors, the dimension of the span allows us to attach a number to each subset $X$, which we call the rank $r(X)$. (If we imagine the vectors as the columns of a matrix, then $r(X)$ is precisely the dimension of the space spanned by the columns in $X$.)

Of course, a set $X$ is independent precisely when $r(X) = |X|$.

(IV) For vectors, given a subset $X$, we can take its span and then collect all the vectors of $E$ which lie in that span; we call this the closure of $X$, written $\cl(X)$.

If a subset $X$ is closed (i.e. $\cl(X) = X$), then we also say that it is a flat. In the vector setting, flats are uniquely identified with subspaces spanned by vectors.

So where are the independent sets? They are precisely the sets $X$ which are not contained in the closure of any proper subset of $X$.
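In the running example, the closure can be computed from the rank function: $\cl(X)$ consists of all elements whose vector does not increase the rank of $X$. A minimal sketch:

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

E = {1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0)}

def closure(X):
    """cl(X): all labels whose vector does not raise the rank of X."""
    span = [E[i] for i in X]
    rX = rank(span)
    return {e for e in E if rank(span + [E[e]]) == rX}

print(closure({1}))     # cl({1}) = {1, 4}: (2,0) lies on the line spanned by (1,0)
print(closure({1, 2}))  # cl({1, 2}) = {1, 2, 3, 4}: two independent vectors span the plane
```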

(V) We can also define flats directly (drawing analogy to subspaces):

- $E$ itself is a flat.
- The intersection of two flats is a flat.
- If $F$ is a flat and $F_1, \dots, F_k$ are the flats minimally containing $F$ (i.e., no flat sits strictly between $F$ and $F_i$), then the sets $F_1 \setminus F, \dots, F_k \setminus F$ partition $E \setminus F$.
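For the running vector example, we can enumerate the flats directly as the closed sets ($\cl(X) = X$); the sketch below reuses the rank-based closure:

```python
from fractions import Fraction
from itertools import combinations

def rank(vectors):
    """Rank of a list of vectors, via Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

E = {1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0)}

def closure(X):
    """cl(X): all labels whose vector does not raise the rank of X."""
    span = [E[i] for i in X]
    rX = rank(span)
    return {e for e in E if rank(span + [E[e]]) == rX}

# A flat is a closed set: cl(X) == X.
flats = [
    set(S)
    for k in range(len(E) + 1)
    for S in combinations(E, k)
    if closure(set(S)) == set(S)
]
print(flats)  # [set(), {2}, {3}, {1, 4}, {1, 2, 3, 4}]
```

Note that $\lbrace 1 \rbrace$ and $\lbrace 4 \rbrace$ are not flats on their own: the line they span picks up both labels, matching the idea that flats correspond to subspaces.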

Matroids and the greedy algorithm

Here’s one connection that makes CS folk care about matroids.

Suppose we have a graph $G$, and for each edge $e$ there is a weight $w(e)$. We would like to find the minimum spanning tree, i.e., a spanning tree connecting all the vertices with minimum total weight.

Kruskal’s algorithm tells us that the optimal answer is produced by the greedy algorithm:

- Sort the edges in increasing order of weight.
- Go through the edges in this order, adding an edge to our answer whenever it does not form a cycle with the edges already chosen.

The amazing thing is that we can now simply generalize from spanning trees (equivalently, maximal acyclic edge sets) to the independent sets of a matroid: the greedy algorithm finds an optimal basis of any matroid. The even wilder fact is the converse: if the greedy algorithm returns an optimal answer for every choice of weights, then the family of sets we are selecting from must be the independent sets of some matroid.
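For concreteness, here is a minimal Kruskal sketch on a hypothetical weighted graph of my own choosing; the independence test “adding this edge keeps the set acyclic” is exactly a union-find check:

```python
def kruskal(edges, weight):
    """Kruskal = matroid greedy on the graphical matroid: scan edges by
    increasing weight, keeping each edge that does not close a cycle."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for e in sorted(edges, key=weight):
        u, v = e
        ru, rv = find(u), find(v)
        if ru != rv:          # independent: adding e keeps the chosen set acyclic
            parent[ru] = rv
            tree.append(e)
    return tree

# Hypothetical example: a 4-cycle a-b-c-d plus a heavy diagonal a-c.
w = {("a", "b"): 1, ("b", "c"): 2, ("c", "d"): 3, ("d", "a"): 4, ("a", "c"): 5}
print(kruskal(w, w.get))  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

Replacing the union-find test with an arbitrary matroid independence oracle gives the general matroid greedy algorithm; nothing else in the loop changes.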