Assignment 1 advice

Written by Julie Zelenski

The assignment writeup is the place to go for the formal specification; this is the companion advice page where we offer hints and recommendations and answer common question. You may want to skim this page now to get a sense of what's covered and return here on an as-needed basis for further review when/if these issues arise. We will also update this page in real-time with any late-breaking issues that come up while students are working on the assignment.

Advice for any/all assignments

Please start by reading our best practices for completing cs107 assignments for broadly applicable advice for tackling any assignment!

In the samples directory, we provide a sample solution program. Your goal is to make your program match the behavior of our sample. If ever in doubt, run the sample solution to see the correct output or observe the expected handling.

Get in the habit of using sanitycheck early and often. The handy sanity check tool automates comparison of your program's output to the sample. The default sanity check runs a few simple cases that verify basic functionality. You can use custom sanity check with your list of custom cases to run more comprehensive tests and further validate your program's correctness.

Working with bits

Embrace hexadecimal. The base-16 system very conveniently translates to and from binary -- each byte is expressed in two hex digits, one hex digit for the 4 high-order bits, the other for the 4 low bits. Masks in hex or expressed via their construction are more evocative of the binary representation and help you avoid mistakes. For example, a mask to extract the sign bit from a 32-bit value can be cleanly expressed as 1 << 31 or 0x80000000 and both are quite readable. The mask also happens to be decimal 2147483648U, but who can tell? Would you notice if you had accidentally mistyped 2147486348U? Say no to decimal, hex is your pal!

Learn your bitwise operators. It is wasteful to invoke the (expensive, floating-point) pow function to compute integer powers of two: 1 << 10 does a perfect job of computing pow(2, 10) and requires way fewer compute cycles. The bitwise operations are also the right tools to rearrange/isolate/change bits as opposed to the more indirect use of integer multiply/divide/mod with powers of two.

Aim for directness. Students who aren't yet comfortable with bits often end up expressing operations in roundabout/convoluted ways. Some common pitfalls are shown in the code below.

No left-shift right-shift. In the code below, the first version right-shifts then immediately left-shifts back. This seems to just cycle the bits back to where they were originally -- what is even the point? Trace carefully, though, the right-shift will discard the lsb and the left-shift will then zero-fill. This is roundabout way to wipe the lsb, but the better approach is to just directly mask it off as in second version:

    result = (val >> 1) << 1;     // shift right then left, why?

     // Better version:
     result = val & ~1;             // directly turn off lsb

No redundant/unnecessary ops. Imagine your goal is to extract the 4 high-order bits of a byte and shift them down to the lower half-byte. You could do it with two steps (mask and shift), but in fact, the shift will already discard the low-order bits, so no need to mask them off first:

    result = (val & 0xf0) >> 4;  // mask off lower 4, then shift down

    // Better version:
    result = val >> 4;            // directly shift, low-order bits gone!

Know your identities/relationships. When putting together a bitwise construction, stop and consider whether there is a more direct way to accomplish the same thing, e.g.

    b = a | 0                   // this is no op! why bother?
    b = a ^ -1                  // more clearly written as b = ~a
    b = ~a + 1                  // more clearly written as b = -a
    b = (1U << 31) - 1          // INT_MAX from <limits.h> 
    b = INT_MIN >> 31           // how about just -1 instead?

At first you may find yourself using a more convoluted expression, but reviewing in a later pass, you can spot these issues and clean them up. Eventually you want to become facile enough with bit manipulations that you go straight for the streamlined, direct expression from the beginning.

Common questions about assign1


How do I display values in binary?

In gdb, p/t prints in the binary format. The /t format also works for the gdb x and display commands. The C printf function does not have a binary format, but %x will print in hex, which is the next best thing.

What is allowed/disallowed when writing the saturating functions?

Absolutely no use of the relational operations: < > <= >=. All other operators (equality, arithmetic, logical, bitwise, ...) are fine. There must be no conditional execution based on bitwidth (e.g. no switch/if that divides into different code paths depending on the value of bitwidth). Other conditional execution, such as for different handling for overflow or not, is allowed.

Are there any differences in the bitwise operations when applied to signed versus unsigned values?

The bitwise operations & | ^ ~ << operate identically on signed versus unsigned types. The right shift operator >> comes in two flavors: logical and arithmetic. The logical shift is used for unsigned types, arithmetic for signed. A logical right shift zero-fills the bits in the leftmost positions that were vacated by the shift to right. The arithmetic right shift copies the previously leftmost bit to fill the vacated bits (thus replicating the sign bit). (Side note: The C standard leaves unspecified whether a signed right shift is arithmetic or logical, our gcc compiler uses arithmetic and you may depend on that, but in the future may run into a different behavior on another compiler.)

I'm getting the error warning: left shift count >= width of type. What does this mean?

Code such as this will produce that warning:

long val = 1 << 32;

An integer constant defaults to type int, shifting a 32-bit int by 32 positions would effectively shift off all the bits. The compiler thinks this is a pretty silly thing to do, so warns about it. Assigning the result to a variable of type long (64 bits) doesn't help; you have to start with a 64-bit value if you intend to compute a 64-bit result. Read the next question!

What bitwidth and signedness does a numeric integer constant have?

An integer literal is signed int by default. Suffixes can be added to the constant to dictate its type. A suffix of L indicates the type is long and U (either by itself or with L) indicates it is unsigned.

What are the rules for integer promotion in C?

When assigning from narrower bitwidth (e.g. char, short) to a wider, the source bytes are copied to the low-order bytes of the destination, and the upper bytes are assigned based on the promotion rules. If the source type was unsigned, the upper bytes of destination are zero-filled. If the source type was signed, its sign bit (the msb) is replicated across the upper bytes of destination (this is called sign extension). It may help to trace through this and confirm for yourself how/why this is value-preserving. If an operation is performed on two operands of mixed type, integer promotion is implicitly applied to the narrower first. It also the case (and a bit surprising, too) that integer promotion is also applied to types narrower than int when an operation is performed on them, even if no other wider type is involved. For example, the bitshift and invert operations promote the operand to an int, so bit-shifting a char will promote to an int before shifting, bit-inverting of a char similarly promotes, and the default result type of these expressions will be int. If the bitwidth of the operand was larger than int, the bitwidth will be unchanged by these promotion rules.

My Unicode characters are not printing or appear garbled. How can I fix?

If the characters output by the UTF-8 program do not appear correctly on your terminal, but the encoded hex bytes are correct, you may need to set your terminal to expect UTF-8. Changing the configuration is OS/terminal-specific, but within 5 minutes of Googling we found utf8 advice for Putty, MobaXterm and Xterm. The default Mac OS X Terminal.app works by default. Note that sanity check should still pass even if your terminal is showing the wrong character.

Rabbit

                      A rabbit
                           bit
                  A little bit
                   An itty-bitty
                    Little bit of beet.
                      Then bit
                        By bit
                        He bit
                                Because he liked the taste of it.
               But when he bit
                     A wee bit more,
               It was more bitter than before.
             "This beet is bitter!"
                        Rabbit cried.
                 "I feel a bit unwell inside!"
               But when he bit
        Another bite, that bit of beet
Seemed quite all right.
                                Besides
When all is said and done,
                    Better bitter beet
                Than none.

                                              -- Mary Ann Hoberman