Virtual Machine Monitors

Lecture Notes for CS 140
Winter 2013
John Ousterhout

  • Readings for this topic from Operating Systems: Principles and Practice: Section 10.2.
  • What is the abstraction provided by an OS to a process?
    • (Virtual) memory
    • A subset of the instruction set of the underlying machine
    • Most (but not all) of the hardware registers
    • A set of kernel calls with particular arguments for file I/O, etc.
    • Overall: a subset of the facilities of the underlying machine, augmented with extra mechanisms implemented by the operating system.
  • What if we implemented a different abstraction for a process, which looks exactly like the underlying hardware:
    • The complete instruction set of the underlying machine
    • Physical memory
    • Memory management unit (page tables, etc.)
    • I/O devices
    • Traps and interrupts
    • No predefined system calls
  • This abstraction is called a virtual machine:
    • To a "process", it appears that it has its own private machine.
    • Multiple "processes" can share a single machine, each thinking it's running on its own private machine.
    • The operating system for this is called a virtual machine monitor.
    • Can run a complete operating system inside a virtual machine: called a guest operating system.
    • Each virtual machine can run a different guest operating system.

Implementing virtual machine monitors

  • One approach: simulation
    • Write a program that simulates instruction execution (e.g. Bochs).
    • Simulate memory, I/O devices also.
    • Examples:
      • Use one large file to hold contents of a "disk"
      • Simulate kernel/user bit, interrupt vectors, etc.
    • Problem: too slow
      • 100x slowdown for CPU/memory
      • 2x slowdown for I/O
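A minimal sketch of the simulation approach, assuming a hypothetical four-instruction machine (LOAD, ADD, STORE, HALT are invented for illustration and are not a real instruction set). It shows why simulation is slow: every guest instruction costs many host instructions to fetch, decode, and dispatch.

```python
# Toy simulator: each guest instruction is decoded and executed in software.
# The instruction set and register names here are hypothetical.

def simulate(program, memory):
    """Run `program` (a list of tuples) against simulated `memory` (a list)."""
    regs = {"r0": 0, "r1": 0}   # simulated registers
    pc = 0                      # simulated program counter
    steps = 0                   # number of guest instructions executed
    while True:
        op, *args = program[pc]     # fetch and decode
        pc += 1
        steps += 1
        if op == "LOAD":            # LOAD reg, addr
            regs[args[0]] = memory[args[1]]
        elif op == "ADD":           # ADD dst, src
            regs[args[0]] += regs[args[1]]
        elif op == "STORE":         # STORE reg, addr
            memory[args[1]] = regs[args[0]]
        elif op == "HALT":
            return steps
        else:
            raise ValueError(f"unknown opcode {op!r}")

memory = [2, 3, 0]
program = [
    ("LOAD", "r0", 0),
    ("LOAD", "r1", 1),
    ("ADD", "r0", "r1"),
    ("STORE", "r0", 2),
    ("HALT",),
]
steps = simulate(program, memory)
```

A real simulator such as Bochs also has to model the kernel/user bit, interrupt vectors, and devices, which this sketch omits.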
  • Better approach: use CPU to simulate itself.
    • Run the virtual machine (guest OS and its applications) like a user process (in unprivileged mode).
    • Most instructions executed at the full speed of the CPU.
    • Anything "unusual" causes a trap into the virtual machine monitor, which simulates the appropriate behavior.
  • Special cases:
    • Privileged instructions (e.g. HALT):
      • Since virtual machine runs in user mode, these cause "illegal instruction" traps into VMM.
      • VMM catches these traps, simulates appropriate behavior.
    • Kernel calls in guest OS:
      • User program running under guest OS issues kernel call instruction.
      • Traps always go to VMM (not guest OS).
      • VMM analyzes trapping instruction, simulates system call to guest OS:
        • Move trap info from VMM stack to stack of guest OS
        • Find interrupt vector in memory of guest OS
        • Switch simulated mode to "privileged"
        • Return out of VMM to interrupt handler in guest OS.
      • When guest OS returns from system call, this traps to VMM also (illegal instruction in user mode); VMM simulates return to guest user level.
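The kernel-call steps above can be sketched as follows. This is an illustrative model only: `GuestState`, `vmm_on_trap`, the trap names, and the addresses are all hypothetical, and a real VMM manipulates hardware trap frames rather than Python objects. The point is that the guest's "privileged" mode exists only in the VMM's bookkeeping; the real CPU stays in user mode throughout.

```python
class GuestState:
    """The VMM's record of one virtual machine's simulated CPU state."""
    def __init__(self, syscall_vector):
        self.mode = "user"              # simulated kernel/user bit
        self.pc = 0                     # simulated program counter
        self.kernel_stack = []          # stands in for the guest OS stack
        self.vectors = {"syscall": syscall_vector}  # guest's interrupt vectors

def vmm_on_trap(guest, kind, resume_pc=None):
    """Called when the real hardware traps into the VMM while a guest runs."""
    if kind == "syscall" and guest.mode == "user":
        # Move trap info onto the guest OS stack.
        guest.kernel_stack.append((resume_pc, guest.mode))
        # Find the handler through the guest's own interrupt vector.
        guest.pc = guest.vectors["syscall"]
        # Only the simulated mode changes.
        guest.mode = "privileged"
    elif kind == "return_from_trap" and guest.mode == "privileged":
        # The guest OS's return instruction trapped; simulate the return.
        guest.pc, guest.mode = guest.kernel_stack.pop()
    else:
        raise RuntimeError(f"unhandled trap {kind!r} in simulated mode {guest.mode}")
    return guest.pc  # where the VMM resumes the guest

guest = GuestState(syscall_vector=0x8000)
entry_pc = vmm_on_trap(guest, "syscall", resume_pc=0x1004)
mode_in_handler = guest.mode
return_pc = vmm_on_trap(guest, "return_from_trap")
```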
    • I/O devices:
      • Guest OS writes to I/O device register
      • VMM has arranged for the containing page to fault
      • VMM takes page fault, recognizes address as I/O device register
      • VMM simulates instruction and its impact on the simulated I/O device
      • When actual I/O operation completes, VMM simulates interrupt into the guest OS
      • For better performance, write new guest device drivers that call directly into the VMM (using a system-call-like trap) instead of writing to simulated device registers.
    • Virtual memory: VMM uses page tables to simulate virtual memory mapping in guest OS.
      • Three levels of memory:
        • Guest virtual address space
        • Guest physical address space
        • VMM physical memory
      • Guest OS creates page tables, but these aren't used by actual hardware.
      • VMM manages the real page tables, one set per virtual machine. These are called shadow page tables.
      • VMM manages physical memory
      • Initially all (shadow) page table entries have the present bit set to 0.
      • When page fault occurs, VMM finds physical page and corresponding guest page table entry. Two possibilities:
        • present is 0 in the guest page table entry: this fault must be reflected to the guest OS:
          • Simulate page fault for guest OS (similar to kernel call).
          • Guest OS invokes I/O to load page into guest physical memory.
          • Guest OS sets present to 1 in guest page table entry.
          • Guest OS returns from page fault, which traps into VMM again (like returning from kernel call).
          • VMM sees that present is 1 in guest page table entry, finds corresponding physical page, creates entry in shadow page table.
          • VMM returns from the original page fault, causing guest application to retry the reference.
        • present is 1 in the guest page table entry: guest OS thinks page is present in guest physical memory (but VMM may have swapped it out anyway).
          • VMM locates the corresponding physical page, loading it in memory if needed.
          • VMM creates entry in shadow page table.
          • VMM returns from the original page fault, causing guest application to retry the reference.
          • In this situation the page fault is invisible to the guest OS.
      • If the guest OS modifies its page tables, the write causes a page fault (the VMM write-protects the pages that hold them), and the VMM updates the shadow page tables to match.
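The shadow page table logic above can be sketched as a small model. All names here (`ToyVMM`, `on_page_fault`, `on_guest_pt_write`, the dictionary layouts) are assumptions made for illustration; real page tables are multi-level hardware structures, and swapping is not modeled.

```python
class ToyVMM:
    def __init__(self, guest_page_table):
        # Guest virtual page -> (present, guest physical page); owned by the guest OS.
        self.guest_pt = guest_page_table
        # Guest virtual page -> machine page; the only table the hardware uses.
        self.shadow_pt = {}
        # Guest physical page -> machine page; private to the VMM.
        self.guest_to_machine = {}
        self.next_free = 0
        self.reflected = []     # faults passed on to the guest OS

    def machine_page_for(self, guest_phys):
        # Allocate a machine page on first use (stands in for "load if needed").
        if guest_phys not in self.guest_to_machine:
            self.guest_to_machine[guest_phys] = self.next_free
            self.next_free += 1
        return self.guest_to_machine[guest_phys]

    def on_page_fault(self, vpage):
        present, guest_phys = self.guest_pt.get(vpage, (False, None))
        if not present:
            # Case 1: the guest OS must handle this fault itself.
            self.reflected.append(vpage)
            return "reflect"
        # Case 2: fill in the shadow entry; the guest OS never sees the fault.
        self.shadow_pt[vpage] = self.machine_page_for(guest_phys)
        return "retry"

    def on_guest_pt_write(self, vpage, entry):
        # The guest changed a page table entry: apply it, drop the stale shadow entry.
        self.guest_pt[vpage] = entry
        self.shadow_pt.pop(vpage, None)

vmm = ToyVMM({0: (True, 7), 1: (False, None)})
first = vmm.on_page_fault(0)          # present in guest table: invisible fault
second = vmm.on_page_fault(1)         # not present: reflected to guest OS
vmm.on_guest_pt_write(1, (True, 9))   # guest OS loads the page, sets present
third = vmm.on_page_fault(1)          # retried reference now succeeds
```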
  • Potential problem:
    • VMM must trap any behavior that requires simulation.
      • Special memory locations? Use page faults.
      • Special instructions? Must trap
    • Pathological case:
      • Instruction that is valid in both user mode and kernel mode
      • But behaves differently in user mode, without trapping, so the VMM never gets a chance to simulate it
      • Example: "read processor status" (where kernel/user mode bit is in the status word)
    • Virtualizable: a machine with no such special cases
    • Until recently, very few machines were completely virtualizable (x86, for example, was not).
  • Dynamic binary translation: solution for machines that are not virtualizable:
    • VMM analyzes all code executed in virtual machine
    • Replaces non-virtualizable instructions with traps
    • Very tricky: how to find all code?
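A highly simplified sketch of the rewriting step, assuming guest code is already available as a list of assembly-like strings and that `READ_STATUS` is the one non-virtualizable instruction (both are hypothetical simplifications). A real translator works on machine code and has to discover code as it executes, which is the hard part noted above and is not shown here.

```python
# Hypothetical set of instructions that behave differently in user mode
# without trapping.
NON_VIRTUALIZABLE = {"READ_STATUS"}

def translate(block):
    """Return a copy of `block` with non-virtualizable instructions replaced by traps."""
    out = []
    for instr in block:
        opcode = instr.split()[0]
        if opcode in NON_VIRTUALIZABLE:
            # Force a trap so the VMM can emulate the original instruction.
            out.append(f"TRAP {instr}")
        else:
            # Safe instructions are kept as-is and run at full speed.
            out.append(instr)
    return out

translated = translate(["ADD r0, r1", "READ_STATUS r2", "STORE r0, 16"])
```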
  • In practice, how much overhead do VMMs add?
    • CPU-bound applications: < 5%
    • I/O-bound applications: ~30%

History/usage of virtual machines

  • Invented by IBM in late 1960's
  • Original usage:
    • One VM per user
    • Each user ran a different guest OS
    • Single shared hardware platform
  • Interest died out in the 1980's and 1990's:
    • Each user had a private machine, so there was less need to share hardware
  • Reinvented and made practical by Mendel Rosenblum and graduate students at Stanford, who went on to form VMware.
  • Software development:
    • Need to test software on different OS versions.
    • Keep one VM for each OS version.
    • Use a single machine to test all versions.
  • Datacenters:
    • Problem: many machines, each running a single application
      • Need separate machines for isolation: application crash could bring down the entire machine
      • Most applications only need a fraction of machine's resources.
    • Solution: datacenter consolidation
      • One VM per application
      • Run several VM's on a single machine
      • Reduce # of machines
  • Encapsulation:
    • VMM can encapsulate entire state of a VM in a file.
    • Can save, continue, restore old state.
    • Datacenter example:
      • Can migrate VM's between machines to balance load
    • Software development:
      • Tests may corrupt the state of the machine
      • Solution:
        • Run tests in a VM
        • Always start tests from a saved VM configuration
        • Discard VM state after tests
        • Results: reproducible tests
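The test workflow above can be sketched with a toy model in which a VM's entire state is one dictionary. `ToyVM`, `run_test`, and `destructive_test` are hypothetical names; a real VMM saves memory, CPU, and disk state to a file, and the API for doing so depends on the product.

```python
import copy

class ToyVM:
    """Stand-in for a VM whose whole state can be captured and restored."""
    def __init__(self):
        self.state = {"config": "clean", "files": []}

    def snapshot(self):
        return copy.deepcopy(self.state)

    def restore(self, snap):
        self.state = copy.deepcopy(snap)

def run_test(vm, snap, test):
    vm.restore(snap)        # always start from the saved configuration
    try:
        return test(vm)
    finally:
        vm.restore(snap)    # discard whatever the test did to the VM

def destructive_test(vm):
    vm.state["files"].append("junk")
    vm.state["config"] = "corrupted"
    return len(vm.state["files"])

vm = ToyVM()
baseline = vm.snapshot()
first_result = run_test(vm, baseline, destructive_test)
second_result = run_test(vm, baseline, destructive_test)
```

Both runs see the same starting state, so they produce the same result even though the test corrupts the VM each time.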
  • Many other uses:
    • Run MacOS and Windows on the same machine
    • Security: can monitor all communication into and out of VM.