COMP 630: Lab 6b: Paravirtual VMM

Due Wednesday, April 27, 2022, 11:59 PM

Introduction

This lab will guide you through writing a basic paravirtual hypervisor. We will continue using the qemu x86 emulator, which includes support for Intel's VT-x hardware support. The main topics this lab will cover are: bootstrapping a guest OS, programming extended page tables, emulating privileged instructions, and using hypercalls to implement hard drive emulation over a disk image file.

Getting Started

Use Git to commit your Lab 5 source, fetch the latest version of the course repository, and then create a local branch called lab6b based on our lab6b branch, origin/lab6b:

kermit% cd ~/COMP630/lab
kermit% add git
kermit% git commit -am 'my solution to lab5'
Created commit 734fab7: my solution to lab5
 4 files changed, 42 insertions(+), 9 deletions(-)
kermit% git pull
Already up-to-date.

In lab 2, you should have configured your local git repository to pull updates from the starter code. Here, we assume this remote is called starter. You will need to fetch the new starter code from git repository starter as follows:

kermit% git remote
origin
starter
kermit% git fetch starter
remote: Enumerating objects: 236, done.
remote: Counting objects: 100% (236/236), done.
remote: Compressing objects: 100% (158/158), done.
remote: Total 210 (delta 47), reused 210 (delta 47), pack-reused 0
Receiving objects: 100% (210/210), 477.62 KiB | 933.00 KiB/s, done.
Resolving deltas: 100% (47/47), completed with 24 local objects.
From https://github.com/comp630-s22/jos
 * [new branch]      lab6b      -> starter/lab6b
kermit% git branch
  lab2
  lab3
  lab4
* lab5
  master
kermit% git branch -r
  origin/HEAD -> origin/master
  origin/lab2
  origin/lab3
  origin/lab4
  origin/lab5
  origin/master
  starter/lab2
  starter/lab3
  starter/lab4
  starter/lab5
  starter/lab6b
  starter/master

At this point, you now have a remote label "origin" that points to your private github, and "starter" that points to the starter code. Here, you can see that origin has lab5, as does starter, but only starter has lab 6b (on branch lab6b).

Now, we will push the lab6b starter code to our private repository (origin):

kermit% git checkout -b lab6b starter/lab6b
Branch 'lab6b' set up to track remote branch 'lab6b' from 'starter'.
Switched to a new branch 'lab6b'
kermit% git merge lab5
Merge made by recursive.
 kern/pmap.c |   42 +++++++++++++++++++
 1 files changed, 42 insertions(+), 0 deletions(-)
kermit% git push origin
Counting objects: 61, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (60/60), done.
Writing objects: 100% (61/61), 26.90 KiB | 1.08 MiB/s, done.
Total 61 (delta 21), reused 0 (delta 0)
remote: Resolving deltas: 100% (21/21), completed with 10 local objects.
remote:
remote: Create a pull request for 'lab6b' on GitHub by visiting:
remote:      https://github.com/comp630-s22/jos-dontest/pull/new/lab6b
remote:
To github.com:comp630-s22/jos-dontest
 * [new branch]      lab6b -> lab6b

Part 1: VMM Bootstrap

The JOS VMM is launched by a fairly simple program in user/vmm.c. This application calls a new system call to create an environment (similar to a process) that runs in guest mode instead of ring 3 (sys_env_mkguest).

Once the guest is created, the VMM then copies the bootloader and kernel into the guest's physical address space, marks the environment as runnable, and waits until the guest exits.

You will need to implement key pieces of the supporting system calls for the VMM, as well as some of the copying functionality.

Note: The starter code will fail during boot in test_ept_map(); this should pass when you have completed Exercise 2. You may temporarily disable this test until then.

You can try running the vmm from the shell in your guest by typing:

$ vmm

This will currently panic the kernel because the code to detect vmx and extended page table support is not implemented, but as you complete the lab, you will see this launch a JOS-in-JOS environment.

Making a guest environment

The JOS bookkeeping for sys_env_mkguest is already provided for you in kern/syscall.c. Especially if you did not work the JOS labs, you may wish to skim this code, as well as the code in kern/env.c to understand how environments are managed. A major difference between a guest and a regular environment is that a guest has its type set to ENV_TYPE_GUEST as well as a VmxGuestInfo structureand a vmcs structure associated with it.

The vmm directory includes the kernel-level support needed for the VMM---primarily extended page table support.

Your first task will be to implement detection that the CPU supports vmx and extended paging. do this by checking the output of the cpuid instruction and reading the values in certain model specific registers (MSRs).

Exercise 1. Read Chapters 23.6, 24.6.2, and Appendices A.3.2-3 from the Intel manual to learn how to discover if the CPU supports vmx and extended paging.

Once you have read these sections, implement the vmx_check_support() and vmx_check_ept() functions in vmm/vmx.c. You will also need to add support to sched_yield() to call vmxon() when launcing a guest environment.

If these functions are properly implemented, an attempt to start the VMM will not panic the kernel, but will fail because the vmm can't map guest bootloader and kernel into the VM.

Mapping in the guest bootloder and kernel

In user/vmm.c we have provided the structure of the code to set up the guest and bootloader. However, you must implement the memory manipulation code to copy the guest kernel and bootloader into the VM.

Like any other user application in JOS, the vmm has the ability to open files, read pages, and map pages into other environments via IPC. One difference is that we've added a new system call sys_ept_map, which you must implement. The high-level difference between sys_ept_map and sys_page_map is whether the page is added using extended page tables or regular page tables.

Exercise 2. Skim Chapter 28.2 of the Intel manual to familiarize yourself with low-level EPT programming. Several helpful definitions have been provided in vmm/ept.h.

Implement sys_ept_map in kern/syscall.c, as well as ept_lookup_gpa and ept_page_insert in vmm/ept.c. Once this is complete, you should have complete support for nested paging.

Note: If you disabled test_ept_map(), please reenable it at this point. It should pass now, and provides important unit tests for this functionality.

Note: you should write additional test cases for these functions to verify that they work properly.

At this point, you have enough host-level support function to map the guest bootloader and kernel into the guest VM. You will need to read the kernel's ELF headers and copy the segments into the guest.

Exercise 3. Implement copy_guest_kern_gpa() and map_in_guest() in user/vmm.c. For the bootloader, we use map_in_guest directly, since the bootloader is only 512 bytes, whereas the kernel's ELF header must be read by copy_guest_kern_gpa, which should then call map_in_guest for each segment.

Once this is complete, the kernel will attempt to run the guest, and will panic because asm_vmrun is incomplete.

Challenge (20 bonus points, or part of the final project) One way this VMM paravirtualizes JOS is to map the kernel into the guest and use a simpler, modified bootloader. Replace our modified bootloader in vmm/jos_boot.S with the unmodified bootloader in boot/boot.S, and eliminate the need to map the kernel for the guest in the VMM.

This will require trapping accesses to the I/O ports to detect disk reads, and possibly trapping other operations.

Implementing vmlaunch and vmresume.

In this exercise, you will need to write some assembly to launch the VM. Although much of the VMCS setup is completed for you, this exercise will require you to use the vmwrite instruction to set the host stack pointer, as well as the vmlaunch and vmresume instructions to start the VM.

In order to facilitate interaction between the guest and the JOS host kernel, we copy the guest register state into the environment's Trapframe structure. Thus, you will also write assembly to copy the relevant guest registers to and from this trapframe struct.

Exercise 4. Skim Chapter 26 of the Intel manual to familiarize yourself with the vmlaunch and vmresume instructions. Complete the assembly code in asm_vmrun in vmm/vmx.c, and extend env_run in kern/env.c to call vmx_vmrun() instead of env_pop_tf in the host kernel when called with an environment of type ENV_TYPE_GUEST.

Once this is complete, you should be able to run the VM until the guest attempts a vmcall instruction, which traps to the host kernel. Because the host isn't handling traps from the guest yet, the VM will be terminated.

Part 2: Handling VM exits

The equivalent event to a trap from an application to the operating system is called a VM exit. We have provided some skeleton code to dispatch the major types of exits we expect our guest to provide in the vmm/vmx.c function vmexit(). You will need to identify the reason for the exit from the VMCS, as well as implement handler functions for certain events in vmm/vmexits.c.

Similar to issuing a system call (e.g., using the int or syscall instruction), a guest can programmatically trap to the host using the vmcall instruction (sometimes called hypercalls). The current JOS guest uses three hypercalls: one to read the e820 map, which specifies the physical memory layout to the OS; and two to use host-level IPC, discussed below.

Multi-boot map (aka e820)

JOS is "told" the amount of physical memory it has by the bootloader. JOS's bootloader passes the kernel a multiboot info structure which possibly contains the physical memory map of the system. The memory map may exclude regions of memory that are in use for reasons including IO mappings for devices (e.g., the "memory hole"), space reserved for the BIOS, or physically damaged memory. For more details on how this structure looks and what it contains, refer to the specification. A typical physical memory map for a PC with 10 GB of memory looks like below.

        e820 MEMORY MAP
            address: 0x0000000000000000, length: 0x000000000009f400, type: USABLE
            address: 0x000000000009f400, length: 0x0000000000000c00, type: RESERVED
            address: 0x00000000000f0000, length: 0x0000000000010000, type: RESERVED
            address: 0x0000000000100000, length: 0x00000000dfefd000, type: USABLE
            address: 0x00000000dfffd000, length: 0x0000000000003000, type: RESERVED
            address: 0x00000000fffc0000, length: 0x0000000000040000, type: RESERVED
            address: 0x0000000100000000, length: 0x00000001a0000000, type: USABLE
    

For the JOS guest, rather than emulate a BIOS, we will simply use a vmcall to request a "fake" memory map. Complete emulation of this feature would be an excellent bonus task, or part of a final project.

Exercise 5. Complete the implementation of vmexit() by identifying the reason for the exit from the VMCS. You may need to search Chapter 27 of the Intel manual to solve this part of the exercise.

Implement the VMX_VMCALL_MBMAP case of the function handle_vmcall() in vmm/vmexits.c. Also, be sure to advance the instruction pointer so that the guest doesn't get in an infinite loop.

Once the guest gets a little further in boot, it will attempt to discover whether the CPU supports long mode, using the cpuid instruction. Our VMCS is configured to trap on this instruction, so that we can emulate it---hiding the presence of vmx, since we have not implemented emulation of vmx in software.

Exercise 6. Once this is complete, implement handle_cpuid() in vmm/vmexits.c.

When the host can emulate the cpuid instruction, your guest should run until it attempts to perform disk I/O.

Recall that JOS has a user-level file system server daemon, similar to a microkernel. We place the guest's disk image as a file on the host file system server. When the guest file system daemon requests disk reads, rather than issuing ide-level commands, we will instead use vmcalls to ask the host file system daemon for regions of the disk image file. This is depicted in the image below.

Exercise 7. Modify bc_pgfault amd flush_block in fs/bc.c to issue I/O requests using the host_read and host_write hypercalls. Use the macro VMM_GUEST to select different behavior for the guest and host OS.

Once this is complete, you will need to implement the IPC send and receive hypercalls in handle_vmcall, as well as the client code to issue ipc_host_send and ipc_host_recv vmcalls in lib/ipc.c.

Finally, you will need to extend the sys_ipc_try_send and sys_ipc_recv calls to detect whether the environment is of type ENV_TYPE_GUEST or not, and replace the pmap functions with ept calls.

Once these steps are complete, you should have a fully running JOS-on-JOS.

Hand-In Procedure

We will be grading your solutions with a grading program. You can run make grade to test your solutions with the grading program.

Handins in this class will be through a combination of submitting a branch of your code, via gradescope.

First, you should update gitinfo.txt, updating the branch to the appropriate branch for lab 6b.

Second, you will actually go to gradescope for the Lab 6b assignment, and submit via github. After a few minutes, you should be able to see the results of the autograder and confirm it matches what you expect. Let course staff know ASAP if there is an issue.

In this and all other labs, you may complete challenge problems for extra credit. If you do this, please add include details in the submission email, including a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem and how to test it. If you implement more than one challenge problem, you must describe each one. Be sure to list the challenge problem number.

If you submit multiple times, we will take the latest submission and count late hours accordingly. If you submit late, please email the TAs so that we check for your latest code.

You do not need to turn in answers to any of the questions in the text of the lab. (Do answer them for yourself though! They will help with the rest of the lab.)

This completes the lab.


Last updated: 2022-05-25 08:15:55 -0400 [validate xhtml]