Written by Julie Zelenski
This optional challenge problem is an opportunity to further exercise your bitwise understanding.
The challenge.c file in lab1 contains the function below that was taken from musl and lightly paraphrased. The task is to determine whether any of the 8 bytes in a long is a zero byte. The simple and naive approach would be to iterate over the 8 bytes and individually extract each byte and compare to zero. The function below manages to simultaneously test all bytes. What?! Puzzling out the operation of this extraordinary function is going to take some effort, but we promise it will be worth it!
Note: be careful with gdb here; it excludes leading 0s, so remember that there may be more zeros than it prints out!
bool has_zero_byte(unsigned long val)
{
unsigned long ones = ~0UL/UCHAR_MAX;
unsigned long highs = ones << (CHAR_BIT - 1);
return (((val - ones) & highs) & ~val) != 0;
}
Here is our guide for how to trace through the code.
- Work out the pattern of the bitmask constructed by the expression
~0UL/UCHAR_MAX. Run the challenge program undergdband print out the various sub-expressions, such as the numerator by itself, the denominator, and so on. Usingp/tshows the binary representation which may be helpful. Both suffixes on the constant0ULare critical. How does the result change if theLwere removed? What about if theUis removed? -
The original version of the code expressed the second line as:
unsigned long highs = (ones * (UCHAR_MAX/2+1))The code in challenge.c uses a bitshift in place of the above multiplication/division. The two compute an equivalent result. Why is this so?
-
Before moving on to the rest, be sure you have correctly constructed the bit patterns assigned to the
onesandhighsvariables. Verify your understanding by printing the bits in gdb. - The
highsvariable is being used as bitmask in the expression(val - ones) & highs. This masking operation extracts those bytes of val that exhibit a certain numeric property. What is that property? - The final masking operation
& ~valis filtering out bytes identified in the previous step that are not zero bytes. How did those false positives come about and how does this last step filter them out?
If you are getting mixed up on how it operates on all 8 bytes simultaneously, try the expressions on a single byte. In other words, what is ones for a single byte? What is highs for a single byte? How does the expression ((val - ones) & highs) & ~val work for a single byte? After you work that out, then generalize to the multi-byte unsigned long type.
The function demonstrates a form of "mini-parallelism"-- a single operation on the entire long is effectively applying that operation to all 8 bytes in parallel. While the naive loop racks up 3-6 operations per each byte (a total of some 40 ops), this version tests all bytes in just 4 total operations (assuming constant operations are precomputed at compile-time). Neat! This is an exceedingly clever piece of code and we hope a satisfying accomplishment for you to be able to understand and appreciate such an action-packed sequence of bitwise wizardry!