Assignment 4: A bit of fun

Due date: Tue May 03 11:59 pm - Hard deadline: Thu May 05 11:59 pm

Assignment written by Julie Zelenski
Thanks to John Regehr (Utah) for saturating arithmetic


Learning goals

During this assignment, you will gain practice in

  1. using the C bitwise operators to manipulate binary data
  2. dissecting the bitwise data representation of various types (ints, floats, instructions)
  3. working within/around the limits of integer and floating point representation

The assignment

Your next assignment consists of a programming "problem set". Our most recent material spans a diverse set of topics that don't easily lend themselves to being aggregated into one combined program, so instead this is a set of small exercises that individually explore topics such as bitwise operators, bit vectors, integer arithmetic, floating point representation, and instruction encoding.

For each exercise below, you are asked to write a function or two to accomplish a specific task or do a lab-like exploration on a particular topic. The code you're working with may be short, but it is mighty -- working through these exercises will really help solidify your understanding of bits, bytes, and numbers.

1. 2 bits, 4 bits, 8 bits, a short...

Your first task is to use bitwise operations to compare the count of "on" bits between two signed integers. The cmp_bits function is given two integer values a and b. It returns a negative result if the bitwise representation of a has fewer "on" bits than b, a positive result if a has more "on" bits than b, and zero if both have the same number of "on" bits.

int cmp_bits(int a, int b)

2. Sudoku sets

Treating an unsigned type as a bit vector is a compact way to represent and efficiently manipulate a set of values drawn from a small range. Bit vectors are used in time- and space-constrained situations to provide dense storage and allow fast manipulation of the entire group of bits at once. In a bit vector, each individual bit is designated to represent a particular value, and a bit is turned on to indicate that its corresponding value is contained in the set. In this problem, we have sets drawn from the range [1-9]. A set is represented using an unsigned short (16 bits). The bit at the nth position (counting from the least significant bit at the 0th position) is designated to represent the value n. The bits in positions 1 to 9 represent the set membership; the other seven bits of the short are ignored. Below are a few example sets and their corresponding bit vectors.

0000000000000000           // {}
0000001010100100           // {2 5 7 9}
0000000000001110           // {1 2 3}

Write the make_set function to create a bit vector set from an array. The arguments to the function are an array of integer values and the array count. The function returns an unsigned short that represents a bit vector set of the values from the array. The bits in positions 1-9 of the returned result mark the set membership; the remaining bits are zeros. You may assume that no value in the array will be outside the range 1-9.

unsigned short make_set(int values[], int nvalues)

Now write the is_single function to manipulate these bit vector sets in efficiently computing a result needed when solving a Sudoku puzzle. The goal of Sudoku is to assign every cell in the grid a digit from 1 to 9 subject to the constraint that each digit can be used only once in each row, column, and block. (You can read about Sudoku on Wikipedia if you're curious, but for the purposes of this problem, you don't need to know any more details.) The is_single function is given three well-formed bit vector sets representing the digits already used in the row, column, and block for a cell. ("Well-formed" means that the short has bits on only in positions 1-9; the remaining bits are zeros.) The possibilities for a cell consist of only those digits that are unused; any digit that is already used in the row, column, or block is not a possibility. The function is_single returns true if the already used digits force only one option for this cell (number of possibilities is exactly 1) and false otherwise (i.e. number of possibilities is 0 or 2 or more).

bool is_single(unsigned short used_in_row, unsigned short used_in_col, unsigned short used_in_block)

The ideal implementation of is_single builds on direct and efficient use of bitwise operators (~ & ^ | << >>) with very limited use of the integer arithmetic/relational operators and is expressed in straight-line code (i.e., no loops, no if/switch, no chain of essentially same operation repeated 8-9 times and joined with some connective operator). If you can't quite achieve the ideal, a working solution that is unnecessarily awkward/inefficient can still earn most of the credit.

3. Saturating arithmetic

Saturating arithmetic operations are those where results that overflow or underflow "stick" at the maximum or minimum representable result instead of wrapping around like the usual modular arithmetic operations. If you are using saturating addition and subtraction on signed integers, (INT_MAX + 1) == INT_MAX and (INT_MIN - 1) == INT_MIN. Saturating arithmetic is a common feature in 3D graphics and digital signal processing applications. A saturating addition function takes values a and b and returns their sum a+b if it doesn't overflow/underflow the representable range or otherwise returns the appropriate "sticky" value. You are to write two variants of saturating addition: one that operates on unsigned values, another on signed.

stype sat_add_signed(stype a, stype b)
utype sat_add_unsigned(utype a, utype b)

The functions are written in terms of stype/utype which are placeholders that can be filled by any integer family type. The bits.h header file has a #define that controls which type from the integer family (char, short, int, long, or long long) is chosen for stype/utype. Your implementation of the saturating operators must work independently of the bitwidth, using a single unified sequence of operations that works correctly for all. This portability requirement will lead you to think through how promotion/truncation operates between the integer-family types.

You are required to execute the exact same code for all bitwidths, which means you must not use any sort of if/switch that divides into cases for the different bitwidths. This doesn't mean that you can't use conditional logic (such as to separately handle overflow or non-overflow cases), but conditionals that dispatch based on the type/sizeof of operands are totally off-limits.

4. Get to the (floating) point already!

In order to reason about floating point behavior, it's critical to start with an understanding of the underlying representation. This next task is a lab-like exercise to help you get a handle on the mechanics of floats: how the float bits are used for sign/exponent/significand, which values are representable and which aren't, the gaps in the floating point number line, normalized versus denormalized, and so on. Start by adding a couple of terms to your vocabulary: binade and epsilon. A binade is the set of numbers that all have the same binary exponent. All representable float values in the interval [2^n, 2^(n+1)) for some integer n form one binade. The epsilon of a given float f is the distance from f to the next larger magnitude representable float (i.e. its neighbor on the floating point number line moving away from zero). All float values within one binade have the same epsilon.

Below is a list of interesting floats we selected for you to consider. From each description, work out what its 32-bit float is, diagram its float bits, compute its epsilon as a power of 2, express its value as a sum/difference of powers-of-2 and then show the value in decimal. As an example, consider the float f = 1.75. Its float bits are 0 01111111 11000000000000000000000, its value expressed as a sum of powers-of-2 is 2^0 + 2^-1 + 2^-2 (or as a difference, 2^1 - 2^-2), its epsilon is 2^-23, and its decimal value is 1.75. When writing a value using powers-of-2, either sum or difference of powers works; choose the one that results in the more compact sequence.

Here are the floats you are to consider: (Note that constant values below are expressed in decimal, i.e. -100.0 is negative one hundred, .01 is one one-hundredth)

The point of this exercise is to strengthen your facility with bits and how they are used for representing floats. We strongly recommend you first work out the bits/powers-of-2 using pencil-and-paper. Then verify your results by using code/gdb and get the decimal value via printf("%.9g"). Edit the template given in floats.txt to provide your answers and submit the completed file with your repo.

5. The wrath of Kahan

This problem concerns computing the average of a sequence of float values. The naive approach of summing the values and dividing by the count should do the job in theory, but in practice will not compute a correct answer for all inputs. Floating point math brings the possibility of overflow (magnitude too large for range), underflow (magnitude too small for range), loss of precision due to rounding error in addition, and loss of significant digits in subtraction, all of which can throw a wrench into even simple calculations.

In pure mathematics, a sequence of numbers has one true average. On a computer, the best we can hope for is the nearest representable float to that true average, so that's where we set the standard. If the average function produces that nearest representable float, we will call that a "correct" result. The true average of {1,1,0,0,0} is 0.4, but as noted in lab, 0.4 has a non-terminating binary expansion. Its nearest representable float is 0.400000006, so this is considered the correct result. Assuming the function is given a sequence of exactly represented floats, you might think it a straightforward task to return the nearest representable float to the average, but even seemingly valid approaches won't compute the correct float for some inputs.

An average function operates on a sequence of floats. We give you four pre-written variants of average: one "naive" version and three "improved" variants A, B, and C. Each of A, B, and C makes an intentional rearrangement of operations to avoid a specific floating point pitfall present in the naive version. The reformulations are equivalent according to the rules of pure math, but in a floating point world, these simple changes can and do affect the outcome for certain inputs. Each improved version succeeds somewhat on its good intentions by fixing some previously incorrect outcomes, yet each still has inputs for which it is incorrect.

This problem is formulated like a lab exercise. You won't write code, but instead will be observing and analyzing the code we give you. Read over the given naive average and compare its approach to averages A, B, and C. Each has changed the computation, you are to work out why those modifications were made -- what weakness in the naive version is it trying to address? For which inputs do you expect these changes produce an improved result?

The code in average.c sets up some different test sequences. Each sequence is passed to each of the four average functions. Work out the "correct" result for each sequence, then run the code and observe which (if any) of the average functions return that result. Where a result differs from expected, dig in to find out what is happening. The sequences we provide are intentionally composed to show certain effects. You might find it helpful to experiment with additional test sequences of your own to explore further until you have a clear idea of how each function operates.

Edit the readme.txt file to summarize your findings in a short report (3 paragraphs). You should identify the strengths of averageA/B/C relative to the others, as well as what vulnerabilities remain. Aim to be precise in your analysis and relate the observed behaviors to the underlying mechanics of floating point operations. The grading for this question will value the thoughtfulness and thoroughness of your explanation above all -- show us that you're starting to make some sense of how these funny things called floats work!

An important learning goal is arriving at an understanding of floating point's limitations. You should be able to reason about errors that arise in floating point calculations and recognize the inherent tradeoffs that come in trying to work around them. Fitting an infinite domain into a finite number of bits is inherently a compromise, but knowing the techniques that mitigate the worst of its effects -- and which approximations must simply be tolerated -- goes a long way. Who ever knew it could be so challenging to write robust floating point code!

6. Some disassembly required

Your final task is to pick apart a binary-encoded x86-64 machine instruction and print its human-readable assembly equivalent. This is the job of a disassembler (the inverse of the assembler), such as the objdump tool or the gdb disassemble command. The instructions you must decode are the five variants of the pushq instruction shown in the table below.

Disassembled instruction   Variant (what is pushed?)                     Binary-encoded machine instruction (byte count)     In hex
pushq $0x3f10              immediate constant                            01101000 00010000 00111111 00000000 00000000 (5)   68 10 3f 00 00
pushq %rbx                 register                                      01010011 (1)                                        53
pushq (%rdx)               indirect                                      11111111 00110010 (2)                               ff 32
pushq 0x8(%rax)            indirect with displacement                    11111111 01110000 00001000 (3)                      ff 70 08
pushq 0xff(%rbp,%rcx,4)    indirect with displacement and scaled-index   11111111 01110100 10001101 11111111 (4)              ff 74 8d ff

x86-64 uses a variable-length encoding for machine instructions; some instructions are encoded in a single byte, while others require more than a dozen bytes. The pushq instructions to be decoded vary from 1 to 5 bytes in length. The first byte of a machine instruction contains the opcode, which indicates the type of instruction. The opcode is followed by 0 or more additional bytes that encode details about the instruction such as the operands, register selector, displacement, and so on. These additional bytes vary based on the particular instruction variant.

The table above shows various pushq assembly instructions, each paired with its binary-encoded machine equivalent. Study the encodings to work out how the opcode and operand fields are laid out for each variant.

void disassemble(const unsigned char *raw_instr)

The disassemble function takes a pointer to a sequence of raw bytes that represent a single machine instruction. The number of bytes the instruction occupies is determined during disassembly: read the first byte(s) and use the opcode and variant to determine how many additional bytes need to be examined. Call our provided helper function print_hex_bytes to print the raw bytes, then use bitwise manipulation to extract and convert the operands and print the assembly language instruction. For constants (immediate or displacement), use the printf format %#x to show hex digits prefixed with 0x and no leading zeros.

As an example, if the disassemble function were passed a pointer to the bytes for the final instruction in the table above, it would print this line of output:

ff 74 8d ff    pushq 0xff(%rbp,%rcx,4)

We will only test your function on well-formed instructions of the five pushq variants listed above. There is no requirement that you gracefully handle any other inputs, such as other variants of push, other instruction types, or malformed inputs. For somewhat obscure reasons, there is an anomaly in the encoding used when %rbp or %rsp is the base register in the indirect/indirect-with-displacement forms, or when %rsp is the index register for scaled-index. You do not need to allow for this nor make a special case for it; just disassemble as though the standard encoding is used for all registers without exception.

Requirements

Grading

Read our page on how assignments are graded for background information on our grading process and policies. Here is a rough point breakdown.

Functionality (75 points)

Code quality (buckets weighted to contribute ~15 points)

We will prefer solutions that are direct, efficient, and readable to those that sacrifice one of these qualities for the others.

Submissions received by the assignment due date will receive an on-time bonus equal to 5% of the functionality points earned by the submission.

Getting started

Check out the starter project from your cs107 repo using the command

hg clone /afs/ir/class/cs107/repos/assign4/$USER assign4

Note that these exercises are similar in spirit to the partnered explorations you do in lab, but the collaboration rules for assignments are much more restrictive than for lab. It is allowable to discuss high-level concepts with your peers, but these discussions must not descend into algorithmic details and code-level specifics. When working on an assignment, you should be doing your own independent thinking, design, coding, and debugging.

Finishing

The sanity check makes some simple calls to each function and verifies that the output printed by your disassemble operations is in the proper format for automated testing. Be sure that all other print statements in the required functions are removed/disabled. Any extraneous output can interfere with the autotester and cause test failures.

When finished, use submit to turn in your code for grading.

Keep tabs on how your knowledge is progressing with our post-task self check.