Lab sessions Tue May 01 to Fri May 04
Lab written by Julie Zelenski
Find another world where freedom waits, past the the stars in fields of ancient void.
—Black Sabbath, Into the Void
Learning goals
During this lab, you will:
- explore how C
void*
/function pointers support generic operations - study the implementation of generic operations and client callback functions
- debug
void*
pitfalls
First things first! Find an open computer to share with a partner. Introduce yourself and share your favorite unix pipeline trick.
Get started
Clone the repo by using the command below to create a lab4 directory containing the project files.
git clone /afs/ir/class/cs107/repos/lab4/shared lab4
Open the lab checkoff form.
Lab exercises
1) Code study: callbacks (20 minutes)
The generic sort/search functions in the C library (qsort
, bsearch
, lfind
...) all use the same form of standard comparison function:
int comparison_fn_t (const void *, const void *)
A comparison function receives pointers to the two values to compare and returns an integer that indicates their order.
The most critical issue to understand is that all generic operations work with data via pointers to values, never the values directly. Referring to data by address is fundamental to how C supports generic functions. Sending or receiving an actual value is not workable because the values vary in type/size. Instead what is exchanged are pointers to values. All pointers, regardless of pointee, are 8-byte addresses that are type-compatible with void*
.
A comparison function follows a similar pattern:
- Cast the
void*
argument and assign to a pointer of known pointee type. - Dereference the typed pointer to access the value. (Steps 1 and 2 are often combined to cast and dereference in one expression.)
- Compare values to determine the result to return.
The actual comparison logic (Step 3) is usually straightforward; it is mishandling the void*
in Steps 1 and 2 that you have to watch out for.
Review the example comparison functions in the callbacks.c
file to see how each fits the above pattern. Work through the following issues with your partner:
- These callbacks immediately cast the
void*
arguments and store into a properly typed local variable. What is the advantage of doing this up front rather than instead cast/dereference the argument each time it is used? - Supplying a different comparison function to
qsort
allows you to sort the array using a different criteria. For example, the two callbacks that operate on value ofstruct city
type each sort by a different field. How can a comparison function sort in reverse order? How can a comparison function compare by one field as a primary sort order, breaking ties by comparing a secondary field? - The comparison function that orders cities by zip code returns the difference the values as a quick way to compute a positive, negative, or zero comparison result. Do you see how it works? This is a common shortcut for comparing integer values. (Note: the subtraction overflows if the difference exceeds
INT_MAX
, so if you need to support such extreme values, stick with the longer if/else that breaks into different cases.) - All comparison functions fit the same prototype. That means any comparison function can be applied to any type of array, there is no type-matching. What would be the consequence of sorting an array of int using a character comparison function? What about sorting an array of struct cities using a string comparison function?
- Look carefully to identify the subtle differences between the
letter_cmpfn
andstring_cmpfn
callbacks. Could you use theletter_cmpfn
to sort an array of strings by the first letter? Why or why not? - The GNU libc documentation gives an example
compare_doubles
comparison function. Its logic is a bit inscrutable at first, so let's figure out what it is up to. What is the result of evaluating an inequality such as5 > 2
? (Try it in gdb to confirm!) How does the subtraction of the two inequalities compute a result of the correct sign in this case?
2) Code study: gfind_max (45 minutes)
gfind_max
is a generic function to find the array element that is the largest according to the client's comparison function:
void *gfind_max(void *arr, int n, size_t elemsz,
int (*compar)(const void *, const void *))
{
1 void *pmax = arr;
2 for (int i = 1; i < n; i++) {
3 void *ith = (char *)arr + i*elemsz;
4 if (compar(ith, pmax) > 0)
5 pmax = ith;
6 }
7 return pmax;
}
Look over this code to see how a generic function is implemented. Here are some questions to talk over with your partner:
- Line 3 shows the idiomatic access to the ith position in a generic array. Be sure you understand the expression's purpose/operation. What is the intention of the typecast to
(char *)
? What would be the consequence of removing that cast? - The website cdecl.org can convert a declaration from "C gibberish into English". This is handy when trying to unravel an inscrutable declaration. Copy the parameter declaration for
compar
above and paste into decl to gets its explanation in English. - Note that invoking the client callback via a function pointer looks pretty much the same as making an ordinary function call. What happens if you attempt to call the function pointer passing the wrong number or wrong type of arguments?
gfind_max
returns avoid *
. What does that pointer represent? Why does the function return a pointer to a value rather than the value itself?- How could a client use
gfind_max
to find the smallest element instead of the largest?
The necessarily permissive nature of a void*
interface makes for a treacherous client experience. There are many ways to misuse a generic function, and the compiler often does not warn you about these transgressions. Let's explore this situation further.
The test_max
function makes four calls to gfind_max
. The first call is completely correct and prints the expected result. Each of the subsequent three calls is incorrect in some way. First, try to work out what you believe will be printed and then verify that your understanding is correct by running the program. Drawing memory diagrams and/or tracing in gdb may be very helpful in understanding the behavior.
- Incorrect call #1 passes a char comparison function to be used on an array of int elements. Yikes! Why is there no compile or runtime error from our mistake? Does running Valgrind report anything helpful? How does find_max behave in this case?
- What is the error in call #2? Where did the reported "max" value even come from?
- What is the error in call #3? Why will this call always return a pointer to the last element?
Now examine the function test_bsearch
also in generic.c
. As a rule, for bsearch
to be able to work properly, the array must be sorted according to the same comparison function that the search is using. The programmer who wrote this function is confounded by why they couldn't get their code to work using the same comparison function. They eventually got it working by resorting to using a different comparison for search than sort. They know this can't be good, but were unable to identify the correct fix.
You and your partner will investigate!
- Compile the program as-is and run it to observe that it does seem to work despite the mismatch in comparison functions. Change the code to use
cmp_first_char
as the comparison function for both sort and search. Run this version and it crashes. On what operation does in the code does it crash? - The original author's workaround was to add a different comparison function to be used for search. It is a big red flag that this comparison function typecasts its two
void*
arguments to different pointee types. The fact that it manages to "work" at all is sketchy and depends on a precise detail of howbsearch
is implemented. What detail is that? If you very carefully read the man page forbsearch
, you will see that this detail is guaranteed to be true for a conforming implementation, but that still doesn't make it a good idea to depend on it in this way. - Identify the proper fix to the code that makes the program work correctly and sort and search both use the same comparison function
cmp_first_char
, as they should.
The point of this exercise is to highlight the necessity of maintaining vigilance as a client of a void*
interface. It also foreshadows the futility of trying to get the correct code via trial and error. While randomly permuting *
&
and typecasts might eventually land on a correct combination, this approach does absolutely nothing for your understanding. Instead if you take the time to work through the operation on paper, draw diagrams, and trace execution in gdb, you can become confident about what level of indirection is appropriate in what context and why. Ask questions about what you don't understand!
3) Code study: memmove (30 minutes)
The C library provides a handful of raw memory routines (e.g. memcpy
, memset
, ...) that operate on data of unspecified type. Let's take a look inside memmove
(version below from musl) to better understand how these kind of functions are implemented.
1 void *musl_memmove(void *dest, const void *src, size_t n)
2 {
3 char *d = dest;
4 const char *s = src;
5
6 if (d==s) return d;
7 if (s+n <= d || d+n <= s) return memcpy(d, s, n);
8
9 if (d<s) {
10 for (; n; n--)
11 *d++ = *s++;
12 } else {
13 while (n) {
14 n--;
15 d[n] = s[n];
16 }
17 }
18 return dest;
19 }
Go over the code with your partner and discuss these questions:
- The function's interface declares its parameters as
void*
pointers, but internally it manipulates these pointers aschar*
. Why the inconsistency? What would be the consequence of trying to reconcile the discrepancy by declaring the interface aschar*
or changing the implementation to usevoid*
? - Note that there is no typecast on lines 3 and 4 when assigning from an untyped pointer to a typed pointer. A
void*
is the universal donor/recipient and can be freely exchanged with other pointer types, no cast necessary. I find the author's decision to not cast pleasing, given my preference for casting only where you absolutely must. The internets argue about this endlessly if you want to hear more about why it is contentious. - What special case is being handled on line 6?
- Review the man pages for
memcpy
andmemmove
to understand the differences between these two functions. What special case is being handled on line 7? - What two cases are being divided by the if/else on Lines 9/12? Why are both cases necessary?
- Despite lines 10-11 and lines 13-15 both accomplishing a similar function (i.e. copy bytes from source to destination), they are oddly dissimilar in construction:
for
instead ofwhile
, pointer arithmetic/dereference versus array indexing. Which version do you find easier to follow? Which one is easier for you to verify correctness? My preference would be to have picked one style and used for both. Can you come up with a theory to justify the change in style? (I myself have no clue!) - Trace the call
musl_memmove(NULL, "cs107", 0)
. Will it result in a segmentation fault from trying to read/write an invalid pointer? Why or why not? What about the callmusl_memmove(NULL, "cs107", -1)
? Verify your understanding by running thecode.c
program. - The man page for
memmove
states that the operation takes place as though it copies the data twice (src->temp, temp->dst), which implies that call tomemmove
might take twice as long asmemcpy
. However, the musl implementation doesn't operate in this literal manner. It does correctly handle overlap,but not by copying twice. What does it do instead? In this implementation, what then is the expected added cost of usingmemmove
overmemcpy
?
The implementation of memmove
should remind you of the strncpy
function we saw in lecture. The memxxx
functions have much in common with their strxxx
equivalents, just without the special case to stop at a null byte. In fact, the memxxx
functions are declared as part of the <string.h>
module and quite possibly written by the same author.
4) Protip: gdb print array (10 minutes)
A gdb feature we want to get into your repertoire is how to print arrays. If you print a stack array from within the function it is declared, gdb will show the array and its contents. In that context, gdb has access to both the element type and the count of elements, and uses it to print a nice representation of the entire array. However it cannot automatically do the same in other contexts, such as for a heap array or for an array/pointer passed into a function. It is possible to print the entire array in those contexts, but you have to provide more information to gdb. Let's see how!
- Run gdb on the program
callbacks
. Set a breakpoint onmain
and step into the function past the variable declaration/initializations. - Try
p scores
. Here in the context of its declaration, gdb knows that it is a stack array of a certain size and can show the entire stack array. Great! - Now try
p argv
. All gdb knows about argv is that it is a pointer. Bummer. - Try
p argv[0]@argc
and gdb will now print the entire contents of the argv array. Hooray! - The syntax to learn is
p ELEM@COUNT
whereELEM
is the 0th element andCOUNT
is the count of elements to print. ELEM and COUNT are C expressions and can refer to any variables in current scope. Tryp scores[1]@2
to show a 2-element portion in the middle of the array. - You can also add in a typecast if needed. For example, given a parameter
ptr
of typevoid*
that you know is the base address of an array ofnelems
elements of typechar*
, you could print the entire array asp *(char **)ptr@nelems
.
Check off with TA
Before you leave, complete your checkoff form and ask your lab TA to approve it so you are properly credited. If you don't complete all the exercises during the lab period, we encourage you to followup and finish the remainder on your own. Try our self-check to reflect on what you've done and how it's going.