Lab sessions Tue May 15 to Sat May 19
Lab written by Julie Zelenski
Learning goals
This lab is designed to give you a chance to:
- study the relationship between C source and its assembly translation
- use
objdumpandgdbto disassemble and trace assembly code
Find an open computer and somebody new to sit with. Share your joy, angst, and relief from last week's midterm.
Get started
Clone the repo by using the command below to create a lab6 directory containing the project files.
git clone /afs/ir/class/cs107/repos/lab6/shared lab6
Open the lab checkoff form.
Lab exercises
To study C->assembly translation, we recommend you try out Compiler Explorer, a nifty online tool that provides an "interactive compiler". This link https://godbolt.org/g/J8wSw5 is pre-configured for myth's version of GCC and compiler flags from the CS107 makefiles. (To manually configure yourself: select language C,compiler version x86-64 gcc 5.4 and enter flags -Og -std=gnu99).
In Compiler Explorer, you can enter a C function, see its generated assembly, then tweak the C source and observe how those changes are reflected in the assembly instructions. You could instead make these same observations on myth using gcc and gdb, but Compiler Explorer makes it easier to do those tasks in a convenient environment for exploration. Try it out!
A note on register use: From Monday's lecture, remember which registers are used for passing values in and out of functions: the first three arguments are passed in %rdi, %rsi, and %rdx respectively and the return value is written to %rax. A reference to the e-named register accesses the lower 32-bit sub-register, e.g. %edi is the subregister of %rdi. Operations on pointers and longs (64-bit types) will use the full r-named registers, and operations on ints will use the e-named subregisters.
1) Assembly code study (70 minutes)
Addressing modes
Review the two deref functions below.
void deref_one(char *ptr, long index)
{
*ptr = '\0';
}
void deref_two(int *ptr, long index)
{
*ptr = 0;
}
Open Compiler Explorer https://godbolt.org/g/J8wSw5. Copy the above two functions and paste them into the left pane and review the generated assembly shown in the right pane.
- There is one difference between the assembly instruction sequences of the two functions. What is the difference? Why it is different?
- Edit both functions to instead assign
ptr[7]to zero. How does this change the addressing mode using for the destination operand of themovinstruction? Do bothdereffunctions change in the same way? - Edit both functions to now assign
ptr[index]to zero. How does this change the addressing mode using for the destination operand of themovinstruction? Do bothdereffunctions change in the same way? - Change the assignment statement to
ptr[0] = ptr[1]. For both functions, the assembly sequence is one instruction longer. Previously, only onemovinstruction was needed. This assignment statement requires two movs. Why?
Now add the deref_three function below to Compiler Explorer:
struct coord { int x, y; };
void deref_three(struct coord *ptr)
{
ptr->x = ptr->y;
}
- The assembly for
deref_threeis identical toderef_two, but the C source for the two functions seem to have nothing to do with one another! How is possible that both can generate the same assembly instructions?
Signed/unsigned arithmetic
The following two functions perform the same arithmetic operation on their arguments, but those arguments differ in type (signedness).
int s_arith(int a, int b)
{
return (a - b) * a;
}
unsigned int u_arith(unsigned int a, unsigned int b)
{
return (a - b) * a;
}
Copy and paste the above two functions into Compiler Explorer.
- Both functions generate the same sequence of assembly instructions! The choice of two's complement representation allows the add/sub/imul instruction to work for both unsigned and signed types. Neat!
- Edit both functions to perform a right shift on one of its arguments. When doing a right-shift, does gcc emit an arithmetic (
sar) or logical (shr) shift? Does it matter whether the argument is signed or unsigned?
Load effective address
The lea instruction packs a lot of computation in one instruction: two adds and a multiply by constant 1, 2, 4 or 8. It was designed for address arithmetic, but the math is compatible with regular integer operations and it is often used by compiler to do an efficient add/multiply combo.
int combine(int x, int y)
{
return x + y;
}
Copy and paste the combine function into Compiler Explorer.
- In the generated assembly, you'll see a
leainstead of the expectedadd. Interesting!leacan be used foradd, but what else can it do? - Edit the
combinefunction toreturn x + 2*yorreturn x + 8*y -17and observe how a singleleainstruction can also compute these more complex expressions. - Edit to
return x + 47*yand the result will no longer fit the pattern for anlea. What sequence of assembly instructions is generated instead?
Multiply is one of the more expensive instructions and the compiler prefers cheaper alternatives where possible. Copy and paste this function into Compiler Explore:
int scale(int x)
{
return x * 4;
}
- The
scalefunction multiplies its argument by 4. Look at its generated assembly-- there is no multiply instruction. What has the compiler used instead? Edit thescalefunction toreturn x*16. What assembly instruction is used now? - It is perhaps unsurprising that the compiler treats multiplication by powers of 2 as a special case given an underlying representation in binary, but it's got a few more tricks up its sleeve than just that! Edit the
scalefunction to instead multiply its argument by a small constant that is not a power of 2,e.g.x*3,x*7,x*12,x*17, .... For each, look at the assembly and see what instructions are used. Gcc sure goes out of its way to avoid multiply! - Experiment to find a small integer constant C such that
return x*Cis expressed as an actualimulinstruction.
Division
A universal complaint of third graders is that doing long division is tedious; modern CPUs concur. The hard work of division is something the compiler is eager to avoid. Let's find out what wizardry it is willing to employ.
Copy this division function into Compiler Explorer:
unsigned int u_div(unsigned int x)
{
return x / 2;
}
- In the generated assembly for
u_div, is nodivinstruction in sight. How then did it implement unsigned division by 2? What is the interpretation of one operandshrinstruction? - Change the function to divide by a different power of 2 instead (e.g. 4 or 8 or 64). What changes in the generated assembly?
Restore u_div back to its original return x / 2 and paste in this additional function that does a signed division by 2:
int s_div(int x)
{
return x / 2;
}
Whereas addition, subtraction, and multiplication operate equivalently on signed and unsigned integers, it's not quite so with division. If the divisor does not evenly divide the dividend, the quotient must be rounded. In the case of dividing an odd dividend by 2, the remainder is 1. Discarding the remainder (which is what unsigned divide does by shifting away the lsb) has the effect of rounding down to a lesser value. However, the rule for integer division is that quotient must be rounded toward zero, e.g. dividing 3/2 = 1 and dividing -3/2 = -1. A positive quotient is rounded down, but a negative quotient is rounded up.
- What is the bit pattern for
3and then3 >> 1? What is the bit pattern for-3and then-3 >> 1? (Assume arithmetic right shift) Do you see why the right shift (discard lsb) will round a positive value toward zero and rounds a negative value away from zero? This difference in rounding is the crux of why right-shift alone is insufficient if the result is negative. - Compare the assembly for
u_divtos_div. The signed version has a pair of instructions (shr,add) inserted and usessarin place ofshr. First consider that last substitution. If the number being shifted is positive there is no difference, but arithmetic right shift versus logical on a a negative number has a profound impact. What is different? Why? - Now let's dig into the
shrandaddinstructions that were inserted in the signed version. Trace through their operation when the dividend is positive and confirm these instructions made no change in the quotient. But these instructions do have an effect when the dividend is negative. They adjust the negative dividend by a "fixup" amount before the divide (shift). This pushes the dividend to the next larger multiple of the divisor so as to get the proper rounding for a negative quotient. - The necessary fixup amount when dividing by 2 is 1, the fixup for dividing by 4 is 3, for 8 it is 7 and so on. This calculation should be reminiscent of the
roundupfunction we studied way back in lab1! Change thes_divfunction to divide by 4 instead. The fixup amount is now 3. The assembly instructions handle the fixup slightly differently than before. It computes the dividend plus fixup and uses a "conditional mov" instruction to select between using the fixed or unmodified dividend based on whether the dividend is negative. Trace through the assembly to see how this is done. - There is a family of conditional move instructions that parallel the branch instructions (
jne->cmovne,jle->cmovle, and so on). The branch reads the condition codes to decide whether to take the branch or not, a conditional mov reads the condition codes to decide whether to complete the mov or ignore it.
Option further explorations for the curious: Division by powers of two get special treatment, sure, but what about constants that are not powers of two?
Add the u_div_10 function below to Compiler Explorer:
unsigned int u_div10(unsigned int x)
{
return x/10;
}
Look at the C source and generated assembly for udiv_by_ten. The assembly still doesn't use a div instruction, by there is a multiply by a bizarre number: -858993459 (in hex 0xcccccccd). What sorcery is this? This mysterious transformation is effectively multiplying by 1/10 as a substitute for divide by 10. The 1/10 value is being represented as a "fixed point fraction" which is constructed similarly to the way floats are. Enter 0.1 into the float visualizer tool and look to see where that same funny 0xcccccccd number shows up in the significand bits of the float. This technique is known as reciprocal multiplication and gcc will generally convert division by a constant in this way. The math behind this is somewhat complex and we'll won't make a PhD thesis out of it here, but if you'd like to know more, check out Doug Jones' tutorial on reciprocal multiplication.
2) Tools (40 minutes)
Deadlisting with objdump
As part of the compilation process, the assembler takes in assembly instructions and encodes them into binary machine form. Disassembly is the reverse process that converts binary-encoded instructions back into human-readable assembly text. objdump is a tool that operates on object files (i.e. files containing machine instructions in binary). It can dig out all sorts of information from the object file, but one of the more common uses is as a disassembler. Let's try it out!
- Invoking
objdump -dextracts the instructions from an object file and outputs the sequence of binary-encoded machine instructions alongside the assembly equivalent. This dump is called a deadlist ("dead" to distinguish from the study of "live" assembly as it executes). Usemaketo build the lab programs and thenobjdump -d codeto get a sample deadlist. - The
./countops.pypython script reports the most heavily used assembly instructions in a given object file. Try outcountops.py codefor an example. The script usesobjdumpto disassemble the file, tallies instructions by opcode, and reports the top 10 most frequent. Usewhichto get the path to a system executable (e.g.which vimor emacs or gcc) and then usecountops.pyon that path. Try this for a few executables. Does the mix of assembly instructions seem to vary much by program?
GDB commands for live assembly debugging
Below we introduce a few of the gdb commands that allow you to work with code at the assembly level.
The gdb command disassemble with no argument will print the disassembled instructions for the currently executing function. You can also give an optional argument of what to disassemble, a function name or code address.
(gdb) disass myfn
Dump of assembler code for function myfn:
0x000000000040051b <+0>: movl $0x1,-0x20(%rsp)
0x0000000000400523 <+8>: movl $0x2,-0x1c(%rsp)
...
The hex number in the leftmost column is the address in memory for that instruction and in angle brackets is the offset of that instruction relative to the start of the function.
You can set a breakpoint at a specific assembly instruction by specifying its address b *address or an offset within a function b * myfn+8. Note that the latter is not 8 instructions into main, but 8 bytes worth of instructions into main. Given the variable-length encoding of instructions, 7 bytes can correspond to one or several instructions.
(gdb) b *0x400570 break at specified address
(gdb) b *myfn+8 break at instruction 8 bytes into myfn()
The gdb commands stepi and nexti allow you to single-step through assembly instructions. These are the assembly-level equivalents of the source-level step and next commands. They can be abbreviated si and ni.
(gdb) stepi executes next single instruction
(gdb) nexti executes next instruction (step over fn calls)
The gdb command info reg will print all integer registers. You can print or set a register's value by name. Within gdb, a register name is prefixed with $ instead of the usual %.
(gdb) info reg
rax 0x4005c1 4195777
rbx 0x0 0
....
(gdb) p $rax show current value in %rax register
(gdb) set $rax = 9 change current value in %rax register
The tui (text user interface) demoed in lecture splits your session into panes for simultaneously viewing the C source, assembly translation, and/or current register state. The gdb command layout <argument> starts tui mode. The argument specifies which pane you want (src, asm, regs, split or next).
(gdb) layout split
Tui mode is great for tracing execution and observing what is happening with code/registers as you stepi. Occasionally, tui trips itself and garbles the display. The gdb command refresh sometimes works to clean it up, or you can try ctrl-l. If things get really out of hand, ctrl-x a will exit tui mode and return you to ordinary non-graphical gdb.
Reading and tracing assembly in GDB
Let's try out all these new gdb commands! Read over the program code.c. Compile the program and load into gdb.
- Compile the program and load into gdb. Disassemble
myfn. - Use the disassembly to figure out where
arris being stored. How are the values inarrinitialized? What happened to thestrlencall on the string constant to init the last array element? - What instructions were emitted to compute the value assigned to
count? What does this tell you about thesizeofoperator? - Set a breakpoint at
myfnand run the program. Use the gdb commandinfo localsto show the local variables. Compare this list to the declarations in the C source. You'll see some variables are shown with values ("live"), some are<optimized out>, but others don't show up at all. Look at the disassembly to figure out what happened to these entirely missing variables. Step through the function repeating theinfo localscommand to observe which variables are live at each step. Examine the disassembly to explain why there is no step at which bothtotalandsquaredare live.
Check off with TA
Before you leave, be sure to submit your checkoff sheet and have your lab TA come by and confirm so you will be properly credited. If you don't finish everything before lab is over, we strongly encourage you to finish the remainder on your own. Double-check your progress with self check.