These optional problems are an opportunity to further exercise your understanding of generics and function pointers.
1) Code Study: memmove
About 25 minutes
The C library provides a handful of raw memory routines (e.g. memcpy, memset, ...) that operate on data of an unspecified type. Let's take a look inside memmove (version below based on code from musl) to better understand how these kinds of functions are implemented. This code is also included in the memmove.c file for you to experiment with.
 1  void *musl_memmove(void *dest, const void *src, size_t nbytes) {
 2      char *dest_copy = dest;
 3      const char *src_copy = src;
 4
 5      if (dest_copy == src_copy) {
 6          return dest_copy;
 7      }
 8
 9      if (src_copy + nbytes <= dest_copy || dest_copy + nbytes <= src_copy) {
10          return memcpy(dest_copy, src_copy, nbytes);
11      }
12
13      if (dest_copy < src_copy) {
14          for (int i = 0; i < nbytes; i++) {
15              dest_copy[i] = src_copy[i];
16          }
17      } else {
18          for (int i = nbytes - 1; i >= 0; i--) {
19              dest_copy[i] = src_copy[i];
20          }
21      }
22
23      return dest;
24  }
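If you want to experiment beyond reading, one option is a small harness that compares a memmove-style function against the library memmove on overlapping regions. The sketch below is one possible setup, not part of memmove.c: the type name move_fn and the helper agrees_with_memmove are hypothetical, and the 16-byte test pattern is an arbitrary choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name: any function with memmove's signature. */
typedef void *(*move_fn)(void *dest, const void *src, size_t nbytes);

/*
 * Hypothetical helper: moves nbytes from offset src_off to offset dest_off
 * inside two identical buffers, once with the library memmove and once with
 * `candidate`, then reports whether the two buffers ended up the same.
 */
bool agrees_with_memmove(move_fn candidate, size_t dest_off,
                         size_t src_off, size_t nbytes) {
    char expected[16] = "0123456789abcde";
    char actual[16]   = "0123456789abcde";

    /* Both regions must fit inside the buffers. */
    assert(dest_off + nbytes <= sizeof expected);
    assert(src_off + nbytes <= sizeof expected);

    memmove(expected + dest_off, expected + src_off, nbytes);
    candidate(actual + dest_off, actual + src_off, nbytes);

    return memcmp(expected, actual, sizeof expected) == 0;
}
```

Assuming musl_memmove from the listing is in scope, agrees_with_memmove(musl_memmove, 2, 0, 8) checks an overlapping move where the destination sits above the source, and swapping the two offsets checks the other direction.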
Go over the musl_memmove listing and discuss these questions:
- The function's interface declares its parameters as void* pointers, but internally it manipulates these pointers as char*. Q: Why the inconsistency? What would be the consequence of trying to reconcile the discrepancy by declaring the interface as char* or changing the implementation to use void*?
- Note that there is no typecast on lines 2 and 3 when assigning from an untyped pointer to a typed pointer. A void* is the universal donor/recipient and can be freely exchanged with other pointer types, no cast necessary. That being said, there is a forever-ongoing online discussion about the contentious issue of whether or not to cast here; the perspective employed here is to not cast, as it is not required.
- Q: What special case is being handled on line 5?
- Review the man pages for memcpy and memmove to understand the differences between these two functions. Q: What special case is being handled on line 9?
- Q: What two cases are being divided by the if/else on lines 13/17? Why are both cases necessary?
- The man page for memmove states that the operation takes place as though it copies the data twice (src->temp, temp->dst), which implies that a call to memmove might take twice as long as memcpy. However, the musl implementation doesn't operate in this literal manner. It does correctly handle overlap, but not by copying twice. What does it do instead? Q: Take a look at lines 14-16 and lines 18-20 in particular - what is going on in each of these loops? In this implementation, what then is the expected added cost of using memmove over memcpy?
- Trace the call musl_memmove(NULL, "cs107", 0). Q1: Will it result in a segmentation fault from trying to read/write an invalid pointer? Why or why not? Q2: What about the call musl_memmove(NULL, "cs107", -1)? Verify your understanding by running the memmove.c program.
- Why not always use memmove? The man page seems to imply that some implementations (though not musl) do suffer a performance hit when using memmove as opposed to memcpy. Moreover, appropriately using memmove and memcpy can communicate to code readers when data may or may not overlap. For these reasons, we default to memcpy, and use memmove only when necessary.
The implementation of memmove may remind you of the strncpy function we saw in lecture. The memxxx functions have much in common with their strxxx equivalents, just without the special case to stop at a null byte. In fact, the memxxx functions are declared in the same <string.h> header and were quite possibly written by the same author.
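The null-byte difference can be observed directly. The sketch below is an illustration under stated assumptions rather than lab code: compare_copies is a hypothetical helper that copies the same n source bytes once with strncpy and once with memcpy, after pre-filling both destinations with '#' so untouched bytes would be visible.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy n bytes of src into out_str with strncpy and
 * into out_mem with memcpy. Both outputs must have room for n bytes.
 */
void compare_copies(const char *src, size_t n, char *out_str, char *out_mem) {
    memset(out_str, '#', n);
    memset(out_mem, '#', n);

    /* Stops reading src at the first null byte, then pads with null bytes. */
    strncpy(out_str, src, n);

    /* Copies exactly n bytes, null bytes included. */
    memcpy(out_mem, src, n);
}
```

With the six source bytes 'c', 's', '\0', '1', '0', '7' and n = 6, out_str ends up as 'c', 's' followed by four null bytes, whereas out_mem reproduces all six bytes unchanged.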