Learning Goals

During this section, you will:

get practice with the idea of virtual memory and virtual vs. physical addresses
become more familiar with the "base and bound" and "multiple segments" approaches to virtual memory
write your own code to translate a virtual address to a physical one using the "multiple segments" design

Get Started

Clone the section starter code by using the command below. This command creates a section6 directory containing the project files.

git clone /afs/ir/class/cs111/repos/lab6/shared section6

Next, pull up the online section checkoff and have it open in a browser so you can jot things down as you go.

In lecture, we saw the first two approaches we will discuss for implementing virtual memory: "base and bound" and "multiple segments". Both rely on the idea of distinct virtual and physical address spaces, with the OS acting as the intermediary that intercepts every memory reference and translates it.

We'll start by reviewing the mechanisms for base and bound and multiple segments, and then we'll implement the translation code for multiple segments so you can see what it could look like!

1) Virtual Memory Review

Review how base and bound and multiple segments work, and answer the following questions:

Q1: In a base and bound implementation, let's say a process has base = 4000 and bound = 200. For each of the following memory accesses, is the access valid? If so, what physical address would it actually access?

accessing virtual address 0
accessing virtual address 50
accessing virtual address 300

Q2: Let's say the OS decides to move that process's reserved physical memory somewhere else; specifically, the OS copies it from base 4000 to base 2000. The OS also gives the process more memory, increasing its bound to 500. How would the outcome of the memory accesses above change? (Cool note: the process itself has no idea its physical memory was moved, and all it needs to do to access the additional memory we've given it is refer to those larger virtual addresses. Pretty neat!)
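
If it helps to see the base and bound check as code, here is a minimal sketch. It is not part of the starter code: the BaseAndBound struct and translate_base_bound function are hypothetical names made up for illustration, physical addresses are modeled as plain numbers, and std::optional (C++17) stands in for "this access is invalid".

```cpp
#include <cstddef>
#include <optional>

// Hypothetical description of one process's region of physical memory.
struct BaseAndBound {
    std::size_t base;   // physical address where the region starts
    std::size_t bound;  // size of the region in bytes
};

// Translate a virtual address under base and bound.
// Valid virtual addresses are 0 through bound - 1; anything else
// yields std::nullopt instead of a physical address.
std::optional<std::size_t> translate_base_bound(const BaseAndBound& region,
                                                std::size_t virtual_addr) {
    if (virtual_addr >= region.bound) {
        return std::nullopt;  // out of range: the access is invalid
    }
    return region.base + virtual_addr;
}
```

For instance, with base = 7000 and bound = 100, virtual address 20 translates to physical address 7020, while virtual address 100 is rejected. If the region is later moved to base = 3000 with bound = 400, the same virtual address 20 now maps to 3020 and virtual address 100 becomes valid, all without the process changing the addresses it uses.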

The multiple segments implementation independently tracks info for multiple segments of memory, each with its own base and bound. This info is stored in the segment map, a map from segment number to segment information (like its base and bound). When there is a memory access, the OS knows which segment it's in by looking at the virtual address's most significant bits, which secretly encode the segment number. Then it looks at the virtual address's less significant bits, which secretly encode the offset within that segment (how the bits are partitioned is up to the particular design).
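
To make the bit-splitting idea concrete, here is a small sketch under an assumed layout: a 16-bit virtual address whose top 4 bits are the segment number and whose low 12 bits are the offset. That layout, and the names kOffsetBits, kOffsetMask, DecodedAddress, and decode, are all assumptions chosen for illustration; a real design could partition the bits differently.

```cpp
#include <cstdint>

// Assumed layout (illustration only): 16-bit virtual address,
// top 4 bits = segment number, low 12 bits = offset within the segment.
constexpr unsigned kOffsetBits = 12;
constexpr std::uint16_t kOffsetMask = (1u << kOffsetBits) - 1;  // 0x0FFF

struct DecodedAddress {
    std::uint16_t segment;
    std::uint16_t offset;
};

// Split a virtual address into its segment number and offset.
DecodedAddress decode(std::uint16_t virtual_addr) {
    DecodedAddress d;
    d.segment = virtual_addr >> kOffsetBits;  // keep the most significant bits
    d.offset = virtual_addr & kOffsetMask;    // keep the least significant bits
    return d;
}
```

Under this layout, decode(0x2ABC) gives segment 2 and offset 0xABC.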

Q3: In a multiple segments implementation, let's say a process has 2 segments: segment 1 with base 1000 and bound 500, and segment 2 with base 2000 and bound 300. For each of the following memory accesses, is the access valid? If so, what physical address would it actually access?

accessing segment 1, offset 400
accessing segment 2, offset 400
accessing segment 2, offset 50

2) Implementing Multiple Segments Translation

Now you'll get a chance to implement the code for the "multiple segments" design. We can't fiddle with the actual virtual memory system, so instead we've designed a simulation program that mimics how virtual memory is implemented.

Specifically, we've provided a class VirtualMemory that represents a program's virtual address space, managed using the "multiple segments" design. You create one by specifying how many segments you want, and how big each of them is (for simplification, this implementation makes all the segments the same size):

// Initialize our virtual address space with 3 
// segments of 1000 bytes each
VirtualMemory v_mem(3, 1000);

VirtualMemory has a segment_map instance variable which maps from segment numbers (starting at 0) to structs:

typedef struct {
    char *base;
    size_t bound;
} SegmentInfo;

...

std::map<size_t, SegmentInfo> segment_map;

Thus, each segment has its own stored base (which is a physical address) and bound.

How do we get a pointer to this fake virtual address space? You can get a pointer to the start of a segment by specifying the segment number:

/* Get a virtual pointer to the start of the second segment
 * (0-indexed). A VirtualPointer can be used just like a 
 * regular char * pointer.
 */
VirtualPointer ptr = v_mem.get_segment_start_ptr(1);

A VirtualPointer is a simulated pointer type that we've defined that hooks into VirtualMemory. We need the ability to intercept every memory access, and we can't easily do that with a real program's pointers unless we're the OS :) So instead, we defined a custom type VirtualPointer set up so that every time someone dereferences one, it will call a function translate that we will write the code for. A VirtualPointer behaves just like a regular char * pointer, though - we have implemented the functionality to allow you to dereference it (which calls translate), do pointer arithmetic with it, and print it out:

*ptr = 'h';
*(ptr + 5) = 'z';
cout << ptr + 5 << endl;

When we print a VirtualPointer, it prints a hex number that is the concatenation of its segment number and its offset (the offset is shown in hex). For example, the code above prints the following (1 for segment 1, followed by 005 for offset 5):

0x1005
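
The printed format can be mimicked with a few lines of standard C++. This is only a guess at the formatting, inferred from the sample outputs in this handout (0x1005 here, and 0x13e7 and 0x17d0 in the test program's output); the starter code may do it differently, and format_virtual_address is a hypothetical helper name.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: build a string like "0x1005" from a segment
// number and an offset. Assumes the offset is shown as at least three
// hex digits, which matches the sample outputs in this handout.
std::string format_virtual_address(std::size_t segment, std::size_t offset) {
    std::ostringstream out;
    out << "0x" << std::hex << segment           // segment number in hex
        << std::setw(3) << std::setfill('0')     // pad the offset to 3 digits
        << offset;                               // offset in hex
    return out.str();
}
```

With this sketch, segment 1 and offset 5 produce 0x1005, and segment 1 with offset 999 produces 0x13e7, since 999 is 3e7 in hex.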

Your task is to implement the translate method within VirtualMemory. It takes in a VirtualPointer (a virtual address) and should use the segment_map to translate it to a physical address (represented as a char *), which it returns. You may assume that the segment number is valid. If the virtual address's offset is invalid (i.e. outside the segment's bounds), you should print an error message, trigger a segmentation fault by calling raise(SIGSEGV), and then return nullptr (that return will never be reached, but without it the C++ compiler will complain about a missing return value).

char *translate(const VirtualPointer& p);

Remember the steps for multiple segments:

Look up info in the segment map for the segment that the address is in
Compare the offset to that segment’s bound; it is an error if the offset is >= the bound
Otherwise, add the segment’s base to the virtual address's offset to produce the physical address
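
As a warm-up before touching the starter code, the three steps above can be sketched in a standalone form. This deliberately does not use the VirtualMemory or VirtualPointer classes: the Segment struct and translate_segmented function are hypothetical, physical addresses are modeled as plain numbers rather than char * pointers, and an invalid access returns std::nullopt (C++17) instead of raising SIGSEGV. Your translate method will need to adapt the same idea to the real types and error handling.

```cpp
#include <cstddef>
#include <map>
#include <optional>

// Hypothetical stand-in for the per-segment info kept in a segment map.
struct Segment {
    std::size_t base;   // physical address where the segment starts
    std::size_t bound;  // size of the segment in bytes
};

// Translate (segment, offset) to a physical address, or std::nullopt
// if the access is invalid.
std::optional<std::size_t> translate_segmented(
        const std::map<std::size_t, Segment>& segments,
        std::size_t segment, std::size_t offset) {
    // Step 1: look up info for the segment the address is in.
    auto it = segments.find(segment);
    if (it == segments.end()) {
        return std::nullopt;  // unknown segment number
    }
    // Step 2: compare the offset to the segment's bound.
    if (offset >= it->second.bound) {
        return std::nullopt;  // offset falls outside the segment
    }
    // Step 3: add the segment's base to the offset.
    return it->second.base + offset;
}
```

For example, if segment 0 has base 5000 and bound 100, and segment 1 has base 9000 and bound 250, then (segment 1, offset 200) translates to 9200, while (segment 0, offset 200) is rejected because 200 is not less than 100.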

Note: for a real virtual address, we might need to isolate certain bits to pull out the segment number and offset. But for this simulation, internally, a VirtualPointer has the following instance variables that might be useful:

class VirtualPointer {
public:
    ...

private:
    size_t segment;
    size_t offset;
    ...
};

Q4: Implement the translate method and test it by running the provided test program, which attempts to access 3 addresses. The first two are valid; the third is invalid and should cause a crash.

int main(int argc, char *argv[]) {
  VirtualMemory v_mem(3, kSegmentSize);
  VirtualPointer ptr = v_mem.get_segment_start_ptr(1);

  *ptr = 'h';
  cout << "Data at virtual address " << ptr << " is: " 
       << *ptr << endl;

  VirtualPointer newPtr = ptr + (kSegmentSize - 1);

  *newPtr = 'e';
  cout << "Data at virtual address " << newPtr << " is: " 
       << *newPtr << endl;

  // If we write beyond the segment size, we should crash
  VirtualPointer badPtr = ptr + kSegmentSize * 2;
  cout << "Trying to write to virtual address " 
       << badPtr << endl;
  *badPtr = 'l';

  return 0;
}

It should output the following. (The "memory violation" line is the error message printed by translate. The "Segmentation fault" line is printed automatically by the shell after this program receives a segmentation fault.)

Data at virtual address 0x1000 is: h
Data at virtual address 0x13e7 is: e
Trying to write to virtual address 0x17d0
memory violation - accessing invalid offset 2000 in segment 1
Segmentation fault (core dumped)

NOTE: in a real virtual memory system, only the OS can see physical addresses; a user program is only aware of virtual addresses. You might imagine that test.cc is a user program, and translate is in the OS.