Lab 4: void * and Function Pointers

Lab sessions Tue Apr 26 to Fri Apr 29

Lab written by Julie Zelenski, with modifications by Nick Troccoli

Learning Goals

During this lab, you will:

  1. explore how C void*/function pointers support generic operations
  2. study the implementation of generic operations and client callback functions
  3. debug void* pitfalls

Get Started

Clone the lab starter code by using the command below.

git clone /afs/ir/class/cs107/repos/lab4/shared lab4

Next, pull up the online lab checkoff and have it open in a browser so you can jot things down as you go. Only one checkoff needs to submitted for both you and your partner.

Review - Comparison Functions

Comparison functions are a common use case of generics + function pointers. For example, the generic sort/search functions in the C library (qsort, bsearch, lfind ...) are functions that can sort or search any type of data. In order to do that, they need the caller to provide a comparison function that specifies how to order two of their values. All these functions use the same form of standard comparison function:

int comparison_fn_t (const void *, const void *)

Any version of the comparison function that you may implement for your data receives pointers to the two values to compare and returns an integer that indicates their order, using the same +/-/0 return values as strcmp.

The most critical issue to understand is that all generic operations work with data via pointers to values, never the values directly. Referring to data by address is fundamental to how C supports generic functions. Sending or receiving an actual value is not possible because the values vary in type/size. Instead what is exchanged are pointers to values. All pointers, regardless of pointee, are 8-byte addresses that are type-compatible with void*.

Implementing a comparison function follows a similar pattern:

  1. Cast the void* argument and set a pointer of known pointee type equal to it.
  2. Dereference the typed pointer to access the value. (Steps 1 and 2 are often combined to cast and dereference in one expression.)
  3. Compare values to determine the result to return.

The actual comparison logic (Step 3) is usually straightforward; it is mishandling the void* in Steps 1 and 2 that you have to watch out for. You will get lots of practice with that in this lab!

Lab Exercises

(Note: for these exercises, some bytes may be in a different order than you expect. This is because of something called "endian-ness", and it turns out the myth machines are "little endian". You can read more endianness here, but you don't have to worry about it for CS107. The important thing to know for these exercises is just what the general bytes are that you are examining).

1) void * and Pitfalls (30 minutes + 10min discussion)

Learning Goals: Better understand the usage of comparison functions and the levels of indirection with generic comparison functions. See the various ways void * and comparison functions are prone to error. You will be writing custom comparison functions and writing code that uses comparison functions on assign4 - ask questions about anything you don't understand!

Starter code file: generic.c

The provided generic.c file shows examples of using qsort to sort various types of data. First, turn your attention to sort_city_names. Notice how supplying a different comparison function to qsort allows you to specify a different element ordering. For example, the two callbacks that operate on the list of city names each sort by a different criteria. Review the example comparison functions to see how each fits the standard comparison function pattern.

compare_strings_by_descending_length returns the difference between the values as a quick way to compute a positive, negative, or zero comparison result. This is a common shortcut for comparing integer values. (Note: the subtraction overflows if the difference exceeds INT_MAX, so if you need to support such extreme values, stick with the longer if/else with different cases.)

Note also that we try to preserve the "const-ness" of the parameters when casting. So, because a is const void *, we cast a to a const char ** and when we dereference it, we store it in a const char *.

Q1: Look carefully to identify the subtle differences between the compare_strings_alphabetically and compare_letters_alphabetically callbacks. Could you use compare_letters_alphabetically to sort an array of strings by the first letter? Why or why not?

Q2: Now it's your turn; try implementing the comparison function compare_strings_by_first_character to sort strings in alphabetical order by their first character. Hint: remember that characters can be compared numerically e.g. A is less than B.

The necessarily permissive nature of a void* interface makes for a treacherous client experience. There are many ways to misuse a generic function, and the compiler often does not warn you about these transgressions. Let's explore this situation further in the sort_integers function. There, the first call to qsort is completely correct and prints the expected result. The second call, however, is incorrect; it passes a char comparison function to be used on an array of int elements. Yikes!

Q3: Why is there no compile or runtime error from our mistake? How does qsort behave in this case? Drawing memory diagrams and/or tracing in gdb may be very helpful in understanding the behavior.

2) Debugging: bsearch Bug (20 minutes + 10min discussion)

Learning Goals: Better understand the usage of pointers and levels of indirection with generics functions like bsearch. You will use bsearch on assign4, and the bug you investigate below is an extremely common assign4 bug, so getting to the bottom of it now will save you time later on!

Starter code file: bsearch_bug.c

bsearch is a C standard library function to search for an element in a sorted array using binary search. As a rule, for bsearch to be able to work properly, the array must be sorted according to the same comparison function that the search is using. Take a look at bsearch_bug.c - the programmer who wrote the main function is confounded by why they couldn't get their code to work using the same comparison function. They eventually got it working by resorting to using a different comparison for search than sort. They know this can't be good, but were unable to identify the correct fix.

Your group will investigate!

Before starting: remember the core principle that all generic operations work with data via pointers to values, never the values directly. Keep this in mind as you explore this provided code. Draw memory diagrams to keep track of the pointers and levels of indirection you are working with.

Compile the program as-is and run it to observe that it does seem to work despite the mismatch in comparison functions. Change the code to use compare_first_characters as the comparison function for both sort and search. Run this version and it crashes.

Q1: On what operation in the code does it crash? Why? (Hint: try printing out parts of the expression it crashes on in GDB and see what errors out)

The original author's workaround was to add a different comparison function to be used for search. It is a big red flag that this comparison function typecasts its two void* arguments to different pointee types. The fact that it manages to "work" at all is sketchy and depends on a precise detail of how bsearch is implemented which may not be true for other generic function implementations. If you very carefully read the man page for bsearch, you will see that this detail is guaranteed to be true for a conforming implementation, but that still doesn't make it a good idea to depend on it in this way.

Q2: What bsearch detail does this code depend on to work? (Hint: what does the first parameter of the comparison function represent? How about the second? Check out the second paragraph of the manual page).

Identify the proper fix to the code that makes the program work correctly and use the same comparison function compare_first_characters for both sort and search, as they should.

Q3: How can we fix this bug? (Hint: take a closer look at the values of the parameters passed into bsearch when calling it).

The point of this exercise is to highlight the necessity of maintaining vigilance as a client of a void* interface. It also foreshadows the futility of trying to get the correct code via trial and error. While randomly permuting * & and typecasts might eventually land on a correct combination, this approach does absolutely nothing for your understanding. Instead, if you take the time to work through the operation on paper, draw diagrams, and trace execution in gdb, you can become confident about what level of indirection is appropriate in what context and why. Ask questions about what you don't understand!

3) GDB Tip: Printing Arrays (10 minutes)

We'll periodically try to introduce you to new helpful gdb commands or features to aid in your debugging. This week, we introduce how to print arrays.

If you print a stack array from within the function in which it is declared, gdb will show the array and its contents. In that context, gdb has access to both the element type and the count of elements, and uses it to print a nice representation of the entire array. (Try this in the main function by printing out nums!) However, it cannot automatically do the same in other contexts, such as for a heap array or for an array/pointer passed into a function (try this in my_function to see what it does).

However, it is possible to print the entire array in those contexts, but you have to provide more information to gdb. Let's see how!

  1. Run gdb on the gdb_practice program. Set a breakpoint on main and step into the function past the declaration/initialization of the array nums.
  2. Try p nums. Here in the context of its declaration, gdb knows that it is a stack array of a certain size and can show the entire stack array. Great!
  3. Now try p argv. All gdb knows about argv is that it is a pointer. Bummer.
  4. Try p argv[0]@argc and gdb will now print the entire contents of the argv array. Hooray!
  5. The syntax to learn is p ELEM@COUNT where ELEM is the first element to print and COUNT is the count of elements to print. ELEM and COUNT are C expressions and can refer to any variables in the current scope. Try p nums[1]@2 to show a 2-element portion in the middle of the array.
  6. Try using this feature while paused in the my_function function to print out the contents of the passed-in array.
  7. You can also add in a typecast if needed. For example, given a parameter ptr of type void* that you know is the base address of an array of nelems elements of type int*, you could print the entire array as p *(int *)ptr@nelems or p ((int *)ptr)[0]@nelems (note parentheses for operator precedence). Try this while paused in the my_function_generic function to print out the contents of the passed-in array.

For more practice, try setting breakpoints and printing out other values in the gdb_practice.c file, and feel free to edit the file as well.

[Optional] Extra Problems

Finished with lab and itching to further exercise your generics and function pointer skills? Check out our extra problems!


Nice work on the lab! It's okay if you don't completely finish all of the exercises during lab; your sincere participation for the full lab period is sufficient for credit. However, we highly encourage you to finish on your own whatever is need to solidify your knowledge. Also take a chance to reflect on what you got what from this lab and whether you feel ready for what comes next! The takeaway from lab4 should be getting your bearings in the world of raw memory. The goals are for you to be able to write and use functions passed as parameters (including the cryptic syntax), and know how to use the type system to your advantage wherever you can, but also how to work without it where you must. You should know how to make a proper call to memcpy/memmove, exactly where and why you need a typecast, and have increased vigilance about using the correct level of indirection. Here are some questions to verify your understanding and get you thinking further about these concepts:

  • Why must you typecast a void* in a pointer-arithmetic expression?
  • An asymmetric comparison function is one that casts its two void* arguments to different pointee types. Why is passing an asymmetric comparison function to bsearch almost certainly an indication of programmer error?
  • C search functions commonly return a pointer to the found element (if any), not the found element itself. Why is this?

screaming into the void