Lab 4 Extra Problems

Go back to this week's lab writeup here.

These optional problems are an opportunity to further exercise your understanding of generics and function pointers.

1) Code Study: memmove

About 25 minutes

The C library provides a handful of raw memory routines (e.g. memcpy, memset, ...) that operate on data of an unspecified type. Let's take a look inside memmove (version below based on code from musl) to better understand how these kind of functions are implemented. This code is included in the memmove.c file as well for you to experiment with.

 1    void *musl_memmove(void *dest, const void *src, size_t nbytes) {
 2        char *dest_copy = dest;
 3        const char *src_copy = src;
 4
 5        if (dest_copy == src_copy) {
 6            return dest_copy;
 7        }
 8
 9        if (src_copy + nbytes <= dest_copy || dest_copy + nbytes <= src_copy) {
10            return memcpy(dest_copy, src_copy, nbytes);
11        }
12
13        if (dest_copy < src_copy) {
14            for (int i = 0; i < nbytes; i++) {
15                dest_copy[i] = src_copy[i];
16            }
17        } else {
18            for (int i = nbytes - 1; i >= 0; i--) {
19                dest_copy[i] = src_copy[i];
20            }
21        }
22
23        return dest;
24    }

Go over the code and discuss these questions:

  • The function's interface declares its parameters as void* pointers, but internally it manipulates these pointers as char*. Q: Why the inconsistency? What would be the consequence of trying to reconcile the discrepancy by declaring the interface as char* or changing the implementation to use void*?
  • Note that there is no typecast on lines 2 and 3 when assigning from an untyped pointer to a typed pointer. A void* is the universal donor/recipient and can be freely exchanged with other pointer types, no cast necessary. That being said, there is a forever-ongoing online discussion about the contentious issue of whether or not to cast here; the perspective employed here is to not cast, as it is not required.
  • Q: What special case is being handled on line 5?
  • Review the man pages for memcpy and memmove to understand the differences between these two functions. Q: What special case is being handled on line 9?
  • Q: What two cases are being divided by the if/else on Lines 13/17? Why are both cases necessary?
  • The man page for memmove states that the operation takes place as though it copies the data twice (src->temp, temp->dst), which implies that call to memmove might take twice as long as memcpy. However, the musl implementation doesn't operate in this literal manner. It does correctly handle overlap, but not by copying twice. What does it do instead? Q: Take a look at lines 14-16 and lines 18-20 in particular - what is going on in each of these loops? In this implementation, what then is the expected added cost of using memmove over memcpy?
  • Trace the call musl_memmove(NULL, "cs107", 0). Q1: Will it result in a segmentation fault from trying to read/write an invalid pointer? Why or why not? Q2: What about the call musl_memmove(NULL, "cs107", -1)? Verify your understanding by running the memmove.c program.
  • Why not always use memmove? The man page seems to imply that some implementations (though not musl) do suffer a performance hit when using memmove as opposed to memcpy. Moreover, appropriately using memmove and memcpy can communicate to code readers when data may or may not overlap. For these reasons, we default to memcpy, and use memmove only when necessary.

The implementation of memmove may remind you of the strncpy function we saw in lecture. The memxxx functions have much in common with their strxxx equivalents, just without the special case to stop at a null byte. In fact, the memxxx functions are declared as part of the <string.h> module and quite possibly written by the same author.