At the ballot box

Assignment written by Julie Zelenski

In 1887, French mathematician Joseph Bertrand pondered an idealized election conducted between two candidates, A and B. Each voter casts a ballot for a single candidate and adds it to a sealed box. When the polls close, the box is shaken to mix up the order and ballots are removed one by one and tallied. At the end of the tally, candidate A has more votes than B and wins the election. Bertrand's question was: What is the likelihood that the winning candidate A is strictly ahead of candidate B throughout the entire tally?

Although Bertrand posed the question in terms of probability, his answer uses an argument based on counting. An ordering of the ballots where the eventual winner is always in the lead is considered a good ordering. If you count the number of good orderings and divide by the count of all possible orderings, this ratio is the likelihood of a random ordering being a good one.

Consider an election with three ballots to tally: two for A and one for B. The ballots can be tallied in three possible orders: AAB, ABA, and BAA:

AAB: A takes the lead as the first ballot is tallied and never loses it (good)
ABA: A has an early lead, briefly loses lead before regaining it (not good)
BAA: A gains the lead only when tallying the last ballot (not good)

Given that all three orderings are equally likely and only one is good, the likelihood of a good ordering is 1/3.

Recursion can be a useful technique for counting tasks such as this one!

Count all orderings

Starting by writing the recursive function

int countAllOrderings(int a, int b)

that returns the number of possible orderings for tallying an election with ballot counts a and b.

Here are some hints on recursively breaking down the task:

The simplest case is when all remaining ballots are for the same candidate. There is only one possible ordering, e.g. AAAAAA. This sounds like a base case!
Otherwise both candidates have ballots remaining, meaning there are two possibilities for the next ballot to be tallied: either it will be a ballot for A or one for B. Your counting will consider both.
Tallying a single ballot leaves you with a slightly simpler form of the same problem to solve. Self-similarity is the hallmark of a recursive case! Consider how to make recursive call(s) to count the orderings for the smaller set of remaining ballots and how to combine those results to solve the original problem.
In lecture, recursion was used to generate/print all possible sequences (coin flips, dice rolls, …). Those examples have a similar structure to what is needed here, but a notable difference is that you do not construct the actual sequences, instead only count them, which makes your job a bit simpler.
The whole point of this exercise is to solve the problem recursively. So even if you know another way to derive the result, put that aside, and focus on recursive counting. There should be no loops, no multiplication, no factorial, no combinatorics, no generating functions, … No fancy math whatsoever, just addition. Also do not use additional data structures (no Stacks, Queues, Vectors, strings, etc.). Remember that you do not need to generate, print, or store the orderings, just count them.

Testing

To test your function, choose a few small values of a and b and list out the orderings by hand. Construct student test cases to confirm the results from countAllOrderings matches your hand counts.

Count good orderings

Now write the recursive function

int countGoodOrderings(int a, int b)

that returns the number of good orderings for tallying an election with ballot counts a and b. A good ordering is one in which candidate A was in the lead throughout.

Our thoughts on how to approach this:

The general idea is to copy/paste your countAllOrderings function and modify it to restrict exploration to only those orderings that meet the criteria for being good.
Add an additional argument to the recursive calls to keep track of the current value of A's' lead. As long as the value of lead is positive, the current ordering is good and you keep counting. Conversely, as soon the lead becomes zero or negative, you'll know there are no orderings to be counted on this path.
Because of the additional argument, you will need to use the original countGoodOrderings function to call to the inner recursive function. In other words, countGoodOrderings won’t be recursive, but your three-argument function will be.

Testing

Review the list of orderings you earlier made when constructing test cases for countAllOrderings. Identify which orderings meet the criteria for good. Construct student test cases that confirm the results from countGoodOrderings matches your hand counts.

Bertrand's Theorem

The magical pièce de résistance of this story is that Bertrand was able to derive a closed form for the likelihood for all values of a and b where a > b:

                                  (a - b)
Likelihood of good ordering   =  --------
                                  (a + b)

Such a neat (and surprising) result 🤯! We don't ask you to derive his proof, you may take the formula on faith. (But those of you in CS103 and/or CS109 or just curious should check out the references to learn more!)

Bertrand's nifty theorem gives a new way to confirm the correctness of your results. Previously you were hand-computing results in order to know what value to expect when writing a test case. This kind of testing is tedious, error-prone, and doesn't scale. But now you know that the ratio countGood(a, b)/countAll(a, b) should be equal to (a - b)/(a + b) for all valid values of a and b. This allows you to write test cases that provide coverage for both your functions in one go, neat!

Write a final STUDENT_TEST that uses a for loop over several different values for a and b, comparing Bertrand's formula to the ratio of your two functions. Because the counts grows extremely quickly O(n!) oh my indeed, you will need to limit your testing to values of a and b that are less than 15. These smaller values will permit the counting to complete in a reasonable amount of time and will not overflow the range of the int type. Having confirmed correctness on all of these small values, the power of induction (which is just recursion is disguise) allows you to conclude they also would work correctly on ever larger values, if only our system had the necessary time/memory to permit it.

Notes

Valid values for a and b require that a >= 0 and b >= 0 and a > b. You can also assume that a and b will be <= 15 for practical limitations. We will not test on calls with invalid argument values.
In C++, the division operator applied to two integers performs integer division and produces an integer result, truncating any fraction. To perform real-value division, one or both of the operands must be double type. You can use a C++ typecast to convert from one type to another. Take care where you apply the typecast, you want to convert the operand before division, not after.
```
int result = 1/4;               // integer division, result is 0
double result = double(1)/4;    // real-value division, result is 0.25
double result = double(1/4);    // typecast applied after integer division, result is 0.0
```

References

You don't need to delve into the mathematics behind Bertrand's theorem, but it is quite fascinating if you are curious to learn more:

Wikipedia page on Bertrand's Ballot Theorem
Mark Renault, Four Proofs of the Ballot Theorem