Discrete Structures Proofwriting Checklist


Now that we’re transitioning to writing proofs about discrete structures like functions, graphs, and the like, there are a few specific points we’d like to you to check for in your proofs in addition to the general rules set out in the earlier Proofwriting Checklist. As you’re working through the problems on Problem Set Three and beyond, take a few minutes per proof to read over what you’ve written and check to make sure that your proofs obey the following criteria:

  • Appropriately call back to first-order definitions.
  • Don’t use quantifiers or connectives in your proofs.
  • Don’t repeat definitions; use them instead.
  • Distinguish between assuming and proving.
  • Type-check your proofs.

As before, we will be applying this checklist to the proofs that you submit, starting with Problem Set Three, as part of how we grade, so catching these sorts of issues before you submit will help us give you even better targeted feedback on your proofs. The remainder of this handout goes into more detail about what each of these rules mean.

Appropriately Call Back to First-Order Definitions

One of the main differences between the proofs that you’ll be writing on Problem Set Three and the proofs you did on Problem Set One and Problem Set Two is that we now have symbolic definitions of key terms and definitions expressed in the language of first-order logic. From a proofwriting perspective, this is quite useful – it means that you know exactly what it is that you’ll need to prove at each point in time. At the same time, this means that you need to be extremely careful when setting up your proofs to make sure that you’re making the right assumptions and ultimately trying to prove the right results.

For example, let’s suppose that you’re trying to prove that some function $f : A \to B$ is surjective. This means that you need to prove that

\[\forall b \in B. \exists a \in A. f(a) = b\text.\]

Given that this is the statement you want to prove, you should not start your proof off like this:

⚠ Incorrect! ⚠ Proof: Consider any $x_1, x_2 \in A$ where $x_1 \ne x_2$. Since $x_1 \ne x_2$, we know that (
) ■

The issue with this proof is that the structure of what’s written above doesn’t match the structure of the first-order statement in question. Specifically, since in this proof we’ve picked an arbitrary $x_1$ and $x_2$ from $A$ and assumed that $x_1 \ne x_2$, we’re essentially writing a proof that matches the following first-order logic fragment:

\[\forall x_1 \in A. \forall x_2 \in A. (x_1 \ne x_2 \to (\dots))\]

Take a minute to confirm that you see exactly why this is the case.

As you’re working through problems on discrete structures, we strongly recommend that, before you spend any mental energy trying to actually tackle the problem, you make sure that you have clearly identified what it is that you’re assuming and what it is that you need to prove. The Guide to Proofs on Discrete Structures contains a number of example proof templates that you can use for some common cases, but more generally you’ll want to get as much practice as possible going back and forth between first-order logic and proof setup.

There’s another way that we often see proofs fail to engage with first-order definitions, and that’s to operate at far too high of a level. As an example, if you look on Problem Set Three, you’ll see a problem about function powers. You’re asked to prove that if $f^3$ is a bijection, then $f$ is a bijection. Here’s an (incorrect!) proof of this.

⚠ Incorrect! ⚠ Proof: The function $f^3$ is defined in terms of the function $f$. Therefore, properties of $f^3$ will transmit to $f$. And since $f^3$ is a bijection, this means that $f$ must be as well. ■

This proof doesn’t establish that $f$ is injective and surjective, and instead relies on a high-level appeal to intuition that if you start with a function and hit it with the Math Hammer to form a new function, you are guaranteed to end up with a function that has the same properties as the one you started with. The problem here is that this argument essentially takes a much narrower statement (“$f$ is a a bijection”), which we aren’t sure is true, and tries to prove it by appealing to a much broader statement (“if any transformed version of any function has some property, then the original function has that property as well.”) The catch is that we haven’t proved that broader statement, and in fact, since it’s so broad, we should be even more skeptical of it than we are of the claim that the single function is injective and surjective. In fact, you can show that the statement given above isn’t true, and it’s worth finding some examples about why this is the case.

There are two lessons to take away from this example. First, be wary about going off-script in the course of writing a proof. If you’re asked to prove that something is an injection, or a bijection, etc., you have a formal definition you can call back to, and chances are that it’s probably best to prove that the object has the given type by explicitly appealing to each of the parts of this definition. If you decide not to do that and to prove the result through some other mechanism, you should be very suspicious about the line of reasoning you’re following.

This brings us to the second lesson here – exercise skepticism about broad claims in your proofs. If you’re making some claim about how all injections behave, or what every function looks like, etc., your immediate response should be to ask whether what you’re saying is actually true. If so, why? Go and prove it. If not, why not? Go disprove it. In the course of doing so, you’ll either discover a flaw in your original line of reasoning, which is great (it means that you’ve spotted an error in your proof and can go and try to address it), or you’ll end up with a more robust proof because you’ll have justified a critical and non-obvious claim you’ve made.

Don’t Use Quantifiers or Connectives in Your Proofs

The title of this section says it all – the convention in proofwriting is to not use quantifiers ($\forall$ or $\exists$) or connectives ($\land$, $\lor$, $\to$, etc.) in the course of writing up a mathematical proof. There’s no fundamental reason why proofs couldn’t be written using these symbols. It’s just the way that the mathematical community decided to do things. Since part of the goal of this class is to get you to write proofs in way that matches established guidelines, we’ll be enforcing this rule pretty strictly.

The good news is that many violations of this rule are quite easy to fix. For example, suppose that we’re trying to prove that some function $f : A \to B$ is surjective. Here’s one possible not-so-great way to start off a proof to that effect:

⚠ Incorrect! ⚠ Proof: We will prove that the function $f$ is surjective. To do so, we need to prove that $\forall b \in B. \exists a \in A. f(a) = b$. So consider $\forall b \in B$. We will prove that $\exists a \in A$ where $f(a) = b$. (
) ■

Here, the first-order logic notation that’s present in the proof is just a restatement of the formal definition of surjectivity, and from context it looks like the author of the proof may have included it to make clear that they know what the definition of surjectivity is and to remind the author of that definition.

As you saw in the original Proofwriting Checklist, it’s almost never a good idea to restate definitions in the abstract (“Don’t restate definitions; use them instead”). You can assume that whoever is reading your proof knows what a surjective function is, so restating the definition of a surjective function doesn’t actually accomplish anything here. We can safely delete that entire sentence from the proof, leaving us with this much simpler proof setup:

Proof: We will prove that the function $f$ is surjective. To do so, consider an arbitrary $b \in B$. We need to prove that there is some $a \in A$ where $f(a) = b$. (
) ■

This proof contains all of the important details of the original proof, but without any quantifiers or connectives.

If you’re writing a proof and find a spot where you feel like “the only way” to write out what you want to write is to use first-order logic notation, feel free to reach out to us over Ed or to come talk to us in office hours! We’re happy to weigh in and offer advice.

Don’t Repeat Definitions; Use Them Instead

“Hey!,” you’re probably saying. “Isn’t this something covered in the original Proofwriting Checklist?” Yes. Yes it is. As we transition to proofs on discrete structures, this rule will become even more relevant, and so we thought it was worth revisiting.

On Problem Set One and Problem Set Two, we gave you the advice to avoid restating definitions purely in the abstract and to instead use them in a way that demonstrates how that definition is applied. For example, consider these three statements:

⚠ We know that $x \in A$. Since $A \subseteq B$, we know that every element of $A$ is an element of $B$. Thus we see that $x \in B$. ⚠

⚠ We know that $x \in A$. Since $A \subseteq B$, we know that every $x \in A$ satisfies $x \in B$. Therefore, we see that $x \in B$. ⚠

⚠ We know that $x \in A$. Since $A \subseteq B$, we know that every $z \in A$ satisfies $z \in B$. Therefore, we see that $x ∈ B$. ⚠

All of these statements repeat a definition in the abstract before applying it. They're better off rewritten like this:

Since $x \in A$ and $A \subseteq B$, we see that $x \in B$.

There are a few reasons why it’s wise to avoid repeating definitions in the abstract. First, you can assume that the reader knows all of the relevant terms and definitions that are needed in your proofs. Your job as a proofwriter is not to convince the reader of what the definitions are, but to show how those definitions interact with one another to build into some result. In that sense, repeating a definition in the abstract, like what’s done above and to the left, doesn’t actually contribute anything to the argument you’re laying out. The reader already knows the definition, so that sentence is fully redundant.

Second, restating definitions in the abstract risks violating other checklist items. Let’s go one at a time through the three options above that we advise against. The first one is far too general (“every element of $A$ is an element of $B$”) and therefore breaks our advice of making specific claims about specific variables. The second one (“every $x \in A$ satisfies $x \in B$”) is a variable scoping error – is $x$ the specific value referred to in the first sentence, or is it a placeholder? The third one is making specific claims about the variable $z$ and doesn’t have a scoping error, but in that case $z$ is purely a placeholder – it doesn’t refer to any value. In each of those cases, you can safely delete things.

And finally, restating definitions in the abstract just makes things longer. Compare the three options above to the more polished version. All three of those proof fragments are significantly longer than the more concise and direct version shown to the right.

As an example, consider the following proof that contains a style error:

Theorem: If $f : A \to A$ is an involution, then $f$ is injective. ⚠ Incorrect! ⚠ Proof: Let $f : A \to A$ be an involution. We need to show that $f$ is injective. To see this, pick any $a_1, a_2 \in A$ where $f(a_1) = f(a_2)$. We need to show that $a_1 = a_2$.

Since $f(a_1) = f(a_2)$, we know that $f(f(a_2)) = f(f(a_2))$. We know that $f$ is an involution, so for any $a_1 \in A$, we know that $f(f(a_1)) = a_1$. Similarly, for any $a_2$, we know that $f(f(a_2)) = a_2$. This means that

\[a_1 = f(f(a_1)) = f(f(a_2)) = a_2\text,\]

so $a_1 = a_2, as required. ■

The core logic of this proof is perfectly sound. However, there’s an issue with how it’s written. Specifically, look at this sentence:

⚠ We know that $f$ is an involution, so for any $a_1 \in A$, we know that $f(f(a_1)) = a_1$. ⚠

Earlier in the proof, we asked the reader to pick their favorite choice of $a_1$ from $A$. Imagine that they’ve picked $a_1 = 137$. At this point in the proof, we’re now talking about “for any $a_1 \in A$.” From the perspective of the person reading the proof, this is like saying “for any $137$.” It’s not a meaningful statement.

The intent of this sentence, though, is clear – we want to say “all involutions do this for all values in their domain, and here’s a specific value in the domain, so here’s what we’ll conclude.” And that’s fine! But to write that, just say something like this:

We know that $f$ is an involution, so $f(f(a_1)) = a_1$.

Some people feel uncomfortable writing this. “But how is the reader supposed to know why this is legal?,” you might ask. It’s a fair question! The answer is that the convention is that whoever is reading the proof knows all the relevant terms and definitions, so when they see you saying this, they can look up the definition of an involution, see that it indeed lets you make this claim, and say “ah, okay, that makes sense.”

Distinguish Between Assuming and Proving

One of the trickier nuances to pick up when working with first-order logic definitions is that statements behave differently based on whether you’re assuming them or proving them. Let’s begin with a simple one. The definition of an even integer can be expressed as follows:

\[n\text{ is even} \qquad \text{if} \qquad \exists k \in \mathbb{Z}. n = 2k\text.\]

If you are proving that a number $n$ is even, then you need to find some integer $k$ where $n = 2k$. If you are assuming that a number $n$ is even, then you can introduce a new variable $k$ such that $n = 2k$. Notice that these are similar to one another – both involve some variable and $n$ being twice that variable – but that they’re fundamentally different concepts. One is “go on a scavenger hunt to find the choice of $k$ you want.” The other is “oh, how convenient! I get a new variable named $k$ with some nice properties.”

This highlights that assuming a statement is true means something totally different than proving that it’s true. In one case, you get a new variable for free. In the other, you have to go find one.

This gets all the more important when you’re working with more complex definitions. For example, consider the definition of injectivity, shown here:

\[\forall x_1 \in A. \forall x_2 \in A. (f(x_1) = f(x_2) → x_1 = x_2)\text.\]

If you are proving that a function $f$ is injective, then you

  • ask the reader to pick arbitrary $x_1$ and $x_2$, which must be elements of $A$;
  • assume that $f(x_1) = f(x_2)$; then
  • aim to prove that $x_1 = x_2$.

This process introduces two new variables $x_1$ and $x_2$ that the reader has picked for you. All you know about them is that $f(x_1) = f(x_2)$, and your goal is to prove that $x_1 = x_2$.

On the other hand, if you are assuming that a function $f$ is injective, then you

  • do not introduce any variables, and
  • instead wait to see if you find any values $y_1$ and $y_2$ where $f(y_1) = f(y_2)$. If so, you can then conclude that $y_1 = y_2$.

Notice how different this is! No new variables get introduced, and instead you’re left waiting for some sort of opportunity to present itself where you can draw some new conclusions.

As you transition from proofs about odd and even numbers – which are mostly proofs on existentially-quantified definitions – to proofs about functions and discrete structures – which are often proofs on universally-quantified statements – it is important to keep track of which statements you’re assuming, which statements you’re proving,

For example, consider the following incorrect proof about involutions and injections:

Theorem: For any function $f : A \to A$, if $f$ is an involution, then $f$ is injective.

⚠ Incorrect! ⚠ Proof: Let $f : A \to A$ be an involution. We will prove that $f$ is injective.

Since $f$ is an involution, for any $x_1 \in A$, we know that $f(f(x_1)) = x_1$. We also know that for any $x_2 \in A$ that $f(f(x_2)) = x_2$. (
) ■

The claim being proved here is indeed true, and the first paragraph is on the right track – we should assume f is an involution and should prove that f is injective.

Things go off-script, though, when we hit the second paragraph. The first sentence of the second paragraph talks about how $f(f(x_1)) = x_1$ for any $x_1 \in A$. This is true, but it breaks a few other checklist items. First, we haven’t properly introduced $x_1$ here – we didn’t ask the reader to pick it, we didn’t pick it ourselves, and it’s not a name for something already known how to exist. Also, here we’re restating the definition of involutions in the abstract, which doesn’t contribute anything to the proof.

Fundamentally, the reason we got into trouble here is we aren’t supposed to be introducing any variables or talking about involutions yet. Let’s look at the definition of involutions again:

\[\forall a \in A. f(f(a)) = a\text.\]

We are assuming that this statement is true. What does that mean? Well, if we look at the tables from the first lectures on functions or in the Guide to Proofs on Discrete Structures, we’ll see that, since we’re assuming a universally-quantified statement, we aren’t supposed to do anything yet! Instead, we’re supposed to keep an eye out for some value $x \in A$ where we can claim that $f(f(x)) = x$.

This mistake – introducing a variable when assuming a universally-quantified statement – is hard to recover from. In fact, in the above proof, I don’t think there’s any way to proceed with the current line of reasoning and get to a valid proof. The only correct strategy here is to delete the second paragraph and start over again.

There are a couple of indicators that you can keep an eye out for that are helpful for sensing that you’re mixing up assuming and proving something:

  1. If you find yourself talking about how something is true “for all $x$,” it might indicate that you’re trying to use a universally-quantified assumption before you’re ready to do so. If so, remove the sentence where you’re talking about “for all $x$” and see if you can discover where to apply it in a more concrete, specific sense.
  2. If you had the reader pick an arbitrary value and later on discover that it should have some specific, concrete value, it might mean that you’ve started off with a universally-quantified assumption, introduced a variable for it, and then later discovered which specific value you were supposed to work with. If so, double-check to make sure you’re starting in the right place.

And, as always, if you’re unsure whether what you’re doing is going in the right direction, just ask us! We’re happy to help out.

Type-Check Your Proofs

Type checking has been a recurring theme in this class – you’ve seen this in the set theory translations from Problem Set One and the first-order logic translations from Problem Set Two – and that trend continues forward as you’re talking about discrete structures. With the introduction of new terminology from discrete structures, there are more opportunities to make type errors, and so we thought we’d quickly survey some of the common mistakes.

Let’s imagine that $f : A \to B$ is a function and that $x$ is an element of $A$. There’s a difference between the function $f$ (a general transformation that can be applied to elements of $A$) and the object $f(x)$, which is a specific object contained in the set $B$. In other words:

  • $f$ is of type “function.”
  • $f(x)$ is of type “element of $B$”

This means, for example, that you could say something like “$f$ is injective” or “$f$ is a bijection,” but you shouldn’t write something like “$f(x)$ is injective” or “$f(x)$ is a bijection,” since $f(x)$ refers specifically to what you get when you plug $x$ into $f$. As a note, in practice, mathematicians tend to sometimes be a lot looser with the distinction between $f$ and $f(x)$ and often use one in place of the other, though for the purposes of CS103 we’ll expect you to keep the notation distinct to convince us that you understand why they don’t represent the same thing.