Section 6: Preemption and Virtual Memory

NOTE: this website is out of date. This is the course web site from a past quarter, Fall 2023. If you are a current student taking the course, you should visit the current class web site instead. If the current website is not yet visible by going to cs111.stanford.edu, it may be accessible by visiting this link until the new page is mounted at this address. Please be advised that courses' policies change with each new quarter and instructor, and any information on this out-of-date page may not apply to you.

Sections Wed Nov 15 to Fri Nov 17

Handout written by Nick Troccoli. Edits by Pratyush Agarwal.

Learning Goals

During this section, you will:

review interrupts and how we enable/disable them as needed to implement preemptive scheduling
get practice with the idea of virtual memory and virtual vs. physical addresses
become more familiar with the "base and bound" approach to virtual memory

Get Started

Clone the section starter code by using the command below. This command creates a section6 directory containing the project files.

git clone /afs/ir/class/cs111/repos/lab6/shared section6

Next, pull up the online section checkoff and have it open in a browser so you can jot things down as you go.

1. Preemption

thread-cycle-preemption.cc is the same thread cycling program as last week's section, but using preemptive scheduling - meaning that every 0.5 seconds, it automatically yields the current thread to run the next one every time the timer fires, instead of the threads manually calling yield. The code has been updated so that all threads print forever, and are switched between whenever the timer fires. The main thread now does the following:

// Fire the timer every 500,000 microseconds to context switch
timer_init(500000, timer_interrupt_handler);

while (true) {
    cout << "Hello, I am the main thread" << endl;
}

The other threads now do the following:

void thread_run() {
    intr_enable(true);
    while (true) {
        cout << "Hello, I am thread " << current->name << endl;
    }
}

And the interrupt handler just calls yield:

void timer_interrupt_handler() {
    yield();
}

All other code is the same as from the prior section. Compile and run the program, and trace through the code to answer the following questions:

Q1: When the program switches to executing thread one for the first time, where does it begin running?

Q2: When the program first switches back to executing the main thread, where does it resume?

The timer implementation we provide disables interrupts before calling your handler, and re-enables them after. You can imagine that the code within the timer that calls your handler looks something like the following:

IntrGuard guard;
timer_handler();
...

Q3: Why is it important that interrupts be disabled when the handler is run?

Q4: The program calls intr_enable(true) at the start of the other threads' function to re-enable interrupts. Why is that needed? What happens if that line is removed? Why? Give it a try to test your hypothesis.

2) Virtual Memory Review

In lecture, we saw the first approach we will discuss for implementing virtual memory: base and bound. It relies on the idea of distinct virtual and physical address spaces, and the OS is the middle-person that intercepts any memory references and translates them.

Review how base and bound works, and answer the following questions:

Q5: With virtual memory, we say that there are now two views of memory. What is meant by that, and how do those views differ?

Q6: What are the benefits of the OS translating from virtual to physical addresses, as opposed to just having physical addresses?

Q7: In a base and bound implementation, let's say a process has base = 4000 and bound = 200. For each of the following memory accesses, are they valid? And if so, what physical address would they really access?

accessing virtual address 0
accessing virtual address 50
accessing virtual address 300

Q8: Let's say the OS decides to move that process's reserved physical memory somewhere else; specifically, the OS copies it from base 4000 to base 2000. The OS also gives the process more memory, updating its bound to bound 500. How would the outcome of the memory accesses above change? (Cool note: the process itself has no idea its physical memory was moved, and all it needs to do to access the additional memory we've given it is to now refer to those larger virtual addresses. Pretty neat!).

Extra: Implementing Base-and-bound Translation

In this extra optional problem, you can implement the code for the "base and bound" design. We can't fiddle with the actual virtual memory system, so instead we've designed a simulation program that enacts how virtual memory is implemented in a simulated setting.

Specifically, we've provided a class VirtualMemory that represents a program's virtual address space, managed using the "base and bound" design. You create one by specifying how big the virtual memory space should be

// Initialize our virtual address space with 1000 bytes
VirtualMemory v_mem(1000);

VirtualMemory has base and bound instance variables to store the base and bound for the memory region:

char *base;
size_t bound;

How do we get a pointer to this fake virtual address space? You can get a pointer to the start like this:

/* Get a virtual pointer to the start of the memory space.
 * A VirtualPointer can be used just like a 
 * regular char * pointer.
 */
VirtualPointer ptr = v_mem.get_start_ptr();

A VirtualPointer is a simulated pointer type that we've defined that hooks into VirtualMemory. We need the ability to intercept every memory access, and we can't easily do that with a real program's pointers unless we're the OS :) So instead, we defined a custom type VirtualPointer set up so that every time someone dereferences one, it will call a function translate that we will write the code for. A VirtualPointer behaves just like a regular char * pointer, though - we have implemented the functionality to allow you to dereference it (which calls translate), do pointer arithmetic with it, and print it out:

*ptr = 'h';
*(ptr + 5) = 'z';
cout << ptr + 5 << endl;

When we print it, it prints a hex number that is the offset. E.g. the above would print:

0x5

Your task is to implement the translate method within VirtualMemory; it takes in a VirtualPointer (a virtual address) and should use the base and bound to translate it to a physical address (represented as a char *) and return that. If the virtual address's offset is invalid (i.e. outside segment bounds) you should print an error message, then cause a segmentation fault like this: raise(SIGSEGV), and then return nullptr (that line will never be reached, but otherwise C++ will complain about no return value).

char *translate(const VirtualPointer& p);

Remember the steps for base and bound:

Compare offset to the bound, error if >=
Otherwise, add the base to virtual address offset to produce physical address

You can get the offset of a VirtualPointer by doing p.offset.

Task: implement the translate method and test it by running the provided test program, which attempts to access 3 addresses, the first two of which are valid, and the third of which is invalid and should cause a crash.

int main(int argc, char *argv[]) {
  VirtualMemory v_mem(kMemorySize);
  VirtualPointer ptr = v_mem.get_start_ptr();

  *ptr = 'h';
  cout << "Data at virtual address " << ptr << " is: " << *ptr << endl;

  VirtualPointer newPtr = ptr + (kMemorySize - 1);

  *newPtr = 'e';
  cout << "Data at virtual address " << newPtr << " is: " << *newPtr << endl;

  VirtualPointer badPtr = ptr + kMemorySize * 2;
  cout << "Trying to write to virtual address " << badPtr << endl;
  *badPtr = 'l';

  return 0;
}

It should output the following: (the "memory violation" line is the error message printed by translate. The "Segmentation fault" line is printed automatically by the shell after this program receives a segmentation fault).

Data at virtual address 0x0 is: h
Data at virtual address 0x3e7 is: e
Trying to write to virtual address 0x7d0
memory violation - accessing invalid offset 2000
Segmentation fault

NOTE: in a real virtual memory system, only the OS can see physical addresses; a user program is only aware of virtual addresses. You might imagine that test.cc is a user program, and translate is in the OS.