Lab 1 Extra Problems

Written by Julie Zelenski

Go back to this week's lab writeup here.

These optional problems are an opportunity to further exercise your bitwise understanding.

## Midpoint

For this exercise, start by reading Google researcher Joshua Bloch's newsflash about a ubiquitous bug from integer overflow. Who knew that simply computing the midpoint could be perilous?

We want a function that safely and correctly computes the midpoint of two integers. If the true midpoint is not an exact integral value, we are not fussy about how the result is rounded. As long as the function returns either integer neighbor, we're happy.

The file `mid.c` contains four different formulations to compute the midpoint in a simple program you can experiment with.

The `midpoint_original` function mostly works, but exhibits the bug called out in Bloch's article:

``````int midpoint_original(int x, int y) {
return (x + y)/2;
}
``````

If the sum `x + y` overflows, the result is erroneous. For example, the call `midpoint_original(INT_MAX-2, INT_MAX)` should return `INT_MAX-1`, but actually returns -2. Oops! In the original context, the two inputs to midpoint were array indexes, which perhaps explains how this bug was able to lurk for so long with no visible symptoms (i.e. arrays of yore rarely had such large dimensions).

Bloch's article proposes some fixes for calculating the midpoint. First he offers `midpoint_A` and then champions `midpoint_B` as a faster alternative.

``````int midpoint_A(int x, int y) {
return x + ((y - x) / 2);
}

int midpoint_B(int x, int y) {
return ((unsigned int)x + (unsigned int)y) >> 1;
}
``````
• Q: Consider how `midpoint_A` and `midpoint_B` have rearranged the original calculation. What has `midpoint_A` done to avoid overflow? What has `midpoint_B` done?

Both `midpoint_A` and `midpoint_B` work correctly as long as both inputs are non-negative. This constraint is never actually stated, it is merely implicit in the original context that the inputs are array indexes. So if one or both inputs is negative, what then? Let's investigate!

• Given a negative input, `midpoint_A` is susceptible to a different overflow than the original, in this case during subtraction. Q: What must be true about the two operands to subtract in order to cause overflow? What is the result of a subtraction operation that has overflowed? Work out what you think is an input that causes a failure for midpoint_A, then test out your theory by editing the `mid.c` program to try it. Build and run the program to verify your understanding.
• `midpoint_B` has its own distinct problem with negative inputs. To expose the flaw in `midpoint_B` it helps to work backward. Consider how the expression within `midpoint_B` will never evaluate to a negative result -- why not? Given this fact, any inputs for which the midpoint is negative must fail on `midpoint_B`. Q: What value will be returned instead of the expected in these cases? Edit the `mid.c` program to add an input that demonstrates the problem and verify that your theory matches the observed behavior. If you cast the sum back to a signed value before shifting right, you fix this particular problem, but at the expense of causing a different case to now fail. Experiment with the code to observe this. It feels like we just can't win!

The final version of midpoint we have for you is `midpoint_C`. This gem comes from the marvelous fascicle on bits written by the extraordinary Don Knuth. (fascicle??).

``````int midpoint_C(int x, int y) {
return (x & y) + ((x ^ y) >> 1);
}
``````

Knuth's `midpoint_C` is the real deal, computing a correct midpoint for all inputs, with no possibility of overflow! How this approach works is not obvious at first glance, but with some careful study you can work through how it all fits together.

• Start by considering the bitwise representation of a number and how its "on" bits correspond to the powers of 2 in the number's binary polynomial. For instance, `0000...01011`, which is the base-10 number 11, can be written as `1*2^0 + 1*2^1 + 0*2^2 + 1*2^3 = 11`.
• Now trace the effect of comparing the powers of 2 contained in x and y. The bitwise `&` identifies which powers of 2 the two inputs have in common and the bitwise `^` pulls out those powers which differ.
• How are those two results combined so as to gather the needed powers of 2 in the midpoint?
• The code above joins with `+`. Why must we use `+` here? Try substituting `|` and work out how the resulting expression now operates. The result is nearly equivalent, but not quite -- what has changed?

And what exactly is a fascicle? It took us several hops through the dictionary to find out!

## Zero Bytes

The `zero_byte.c` file in lab1 contains the function below that was taken from musl and lightly paraphrased. The task is to determine whether any of the 8 bytes in a long is a zero byte. The simple and naive approach would be to iterate over the 8 bytes and individually extract each byte and compare to zero. The function below manages to simultaneously test all bytes. What?! Puzzling out the operation of this extraordinary function is going to take some effort, but we promise it will be worth it!

Note: be careful with gdb here; it excludes leading 0s, so remember that there may be more zeros than it prints out!

``````bool has_zero_byte(unsigned long val) {
unsigned long ones = ~0UL/UCHAR_MAX;
unsigned long highs = ones << (CHAR_BIT - 1);
return (((val - ones) & highs) & ~val) != 0;
}
``````

Here is our guide for how to trace through the code.

• Work out the pattern of the bitmask constructed by the expression `~0UL/UCHAR_MAX`. Run the challenge program under `gdb` and print out the various sub-expressions, such as the numerator by itself, the denominator, and so on. Using `p/t` shows the binary representation which may be helpful. Both suffixes on the constant `0UL` are critical. How does the result change if the `L` were removed? What about if the `U` is removed?
• The original version of the code expressed the second line as:

``````unsigned long highs = (ones * (UCHAR_MAX / 2 + 1))
``````

The provided code uses a bitshift in place of the above multiplication/division. The two compute an equivalent result. Why is this so?

• Before moving on to the rest, be sure you have correctly constructed the bit patterns assigned to the `ones` and `highs` variables. Verify your understanding by printing the bits in gdb.

• The `highs` variable is being used as bitmask in the expression `(val - ones) & highs`. This masking operation extracts those bytes of val that exhibit a certain numeric property. What is that property?
• The final masking operation `& ~val` is filtering out bytes identified in the previous step that are not zero bytes. How did those false positives come about and how does this last step filter them out?

If you are getting mixed up on how it operates on all 8 bytes simultaneously, try the expressions on a single byte. In other words, what is `ones` for a single byte? What is `highs` for a single byte? How does the expression `((val - ones) & highs) & ~val` work for a single byte? After you work that out, then generalize to the multi-byte `unsigned long` type.

The function demonstrates a form of "mini-parallelism"-- a single operation on the entire long is effectively applying that operation to all 8 bytes in parallel. While the naive loop racks up 3-6 operations per each byte (a total of some 40 ops), this version tests all bytes in just 4 total operations (assuming constant operations are precomputed at compile-time). Neat! This is an exceedingly clever piece of code and we hope a satisfying accomplishment for you to be able to understand and appreciate such an action-packed sequence of bitwise wizardry!