Due 5:00 PM, Monday, April 28, 2025
In this lab you will implement the basic kernel facilities required to get a protected user-mode environment (i.e., "process") running. You will enhance the JOS kernel to set up the data structures to keep track of user environments, create a single user environment, load a program image into it, and start it running. You will also make the JOS kernel capable of handling any system calls the user environment makes and handling any other exceptions it causes.
Note: In this lab, the terms environment and process are interchangeable - they have roughly the same meaning. We introduce the term "environment" instead of the traditional term "process" in order to stress the point that JOS environments do not provide the same semantics as UNIX processes, even though they are roughly comparable.
In this and future labs you will progressively build up your kernel.
Use Git to commit your Lab 2 source, fetch the latest version of the course repository, and then create a local branch called lab3 based on our lab3 branch, origin/lab3:
kermit% cd ~/COMP730/lab
kermit% git commit -am 'my solution to lab2'
Created commit 734fab7: my solution to lab2
4 files changed, 42 insertions(+), 9 deletions(-)
kermit% git pull
Already up-to-date.
kermit% git checkout -b lab3 origin/lab3
Branch lab3 set up to track remote branch refs/remotes/starter/lab3.
Switched to a new branch "lab3"
kermit% git merge lab2
Merge made by recursive.
kern/pmap.c | 42 +++++++++++++++++++
1 files changed, 42 insertions(+), 0 deletions(-)
Lab 3 contains a number of new source files, which you should browse:
inc/
    env.h         Public definitions for user-mode environments
    trap.h        Public definitions for trap handling
    syscall.h     Public definitions for system calls from user environments to the kernel
    lib.h         Public definitions for the user-mode support library
kern/
    env.h         Kernel-private definitions for user-mode environments
    env.c         Kernel code implementing user-mode environments
    trap.h        Kernel-private trap handling definitions
    trap.c        Trap handling code
    trapentry.S   Assembly-language trap handler entry-points
    syscall.h     Kernel-private definitions for system call handling
    syscall.c     System call implementation code
lib/
    Makefrag      Makefile fragment to build user-mode library, obj/lib/libuser.a
    entry.S       Assembly-language entry-point for user environments
    libmain.c     User-mode library setup code called from entry.S
    syscall.c     User-mode system call stub functions
    console.c     User-mode implementations of putchar and getchar, providing console I/O
    exit.c        User-mode implementation of exit
    panic.c       User-mode implementation of panic
user/
    *             Various test programs to check kernel lab 3 code
In addition, a number of the source files we handed out for lab2 are modified in lab3. To see the differences, you can type:
$ git diff lab2 | more
You may also want to take another look at the lab tools guide, as it includes information on debugging user code that becomes relevant in this lab.
This lab is divided into two parts, A and B.
As in lab 2, you may do challenge problems for extra credit. If you do this, please create entries in the file called challenge3.txt, each with a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem and how to test it. If you implement more than one challenge problem, you must describe each one. Be sure to list the challenge problem number. If you complete challenges from previous labs, please list them in this file (not a previous lab challenge file), with both the lab and problem number.
Passing all the make grade tests does not mean your code is perfect. It may have subtle bugs that will only be tickled by future labs. All your kernel code is running in the same address space with no protection. If you get weird crashes that don't seem to be explainable by a bug in the crashing code, it's likely that they're due to a bug somewhere else that is modifying memory used by the crashing code.
In this lab you may find GCC's inline assembly language feature useful, although it is also possible to complete the lab without using it. At the very least, you will need to be able to understand the fragments of inline assembly language ("asm" statements) that already exist in the source code we gave you. You can find several sources of information on GCC inline assembly language on the class reference materials page.
The new include file inc/env.h contains basic definitions for user environments in JOS; read it now. The kernel uses the Env data structure to keep track of each user environment. In this lab you will initially create just one environment, but you will need to design the JOS kernel to support multiple environments; lab 4 will take advantage of this feature by allowing a user environment to fork other environments.
As you can see in kern/env.c, the kernel maintains three main global variables pertaining to environments:
struct Env *envs = NULL; /* All environments */
struct Env *curenv = NULL; /* the current env */
static struct Env_list env_free_list; /* Free list */
Once JOS gets up and running, the envs pointer points to an array of Env structures representing all the environments in the system. In our design, the JOS kernel will support a maximum of NENV simultaneously active environments, although there will typically be far fewer running environments at any given time. (NENV is a constant #define'd in inc/env.h.) Once it is allocated, the envs array will contain a single instance of the Env data structure for each of the NENV possible environments.
The JOS kernel keeps all of the inactive Env structures on the env_free_list. This design allows easy allocation and deallocation of environments, as they merely have to be added to or removed from the free list.
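To make the free-list idea concrete, here is a minimal sketch in C. It is not the JOS implementation: the Env structure is cut down to three fields, NENV is set to a small illustrative value, the free list is a bare head pointer rather than the struct Env_list used in kern/env.c, and the function names free_list_init, free_list_pop, and free_list_push are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NENV 8   /* illustrative value only; JOS defines its own NENV in inc/env.h */

enum { ENV_FREE = 0, ENV_RUNNABLE };   /* simplified subset of the status values */

/* Cut-down stand-in for struct Env (assumption: only the fields needed here). */
struct Env {
    struct Env *env_link;    /* next free Env */
    int32_t     env_id;
    unsigned    env_status;
};

static struct Env envs[NENV];
static struct Env *env_free_list;   /* simplified: head pointer of the free list */

/* Put every slot on the free list; iterating backwards leaves envs[0] at the head. */
void free_list_init(void)
{
    env_free_list = NULL;
    for (int i = NENV - 1; i >= 0; i--) {
        envs[i].env_id = 0;
        envs[i].env_status = ENV_FREE;
        envs[i].env_link = env_free_list;
        env_free_list = &envs[i];
    }
}

/* Allocation: pop the head of the free list, or return NULL if no slot is free. */
struct Env *free_list_pop(void)
{
    struct Env *e = env_free_list;
    if (e != NULL) {
        env_free_list = e->env_link;
        e->env_link = NULL;
        e->env_status = ENV_RUNNABLE;
    }
    return e;
}

/* Deallocation: push the slot back onto the head of the free list. */
void free_list_push(struct Env *e)
{
    e->env_status = ENV_FREE;
    e->env_link = env_free_list;
    env_free_list = e;
}
```

Both allocation and deallocation are constant-time pointer updates, which is the property the paragraph above relies on. After free_list_init(), two calls to free_list_pop() in this sketch hand out envs[0] and then envs[1].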
The kernel uses the curenv variable to keep track of the currently executing environment at any given time. During boot up, before the first environment is run, curenv is initially set to NULL.
The Env structure is defined in inc/env.h as follows (although more fields will be added in future labs):
struct Env {
	struct Trapframe env_tf;	// Saved registers
	struct Env *env_link;		// Next free Env
	envid_t env_id;			// Unique environment identifier
	envid_t env_parent_id;		// env_id of this env's parent
	enum EnvType env_type;		// Indicates special system environments
	unsigned env_status;		// Status of the environment
	uint32_t env_runs;		// Number of times environment has run

	// Address space
	pml4e_t *env_pml4e;		// Kernel virtual address of page map level-4
};
Here's what the Env fields are for:
env_tf:
    This structure, defined in inc/trap.h, holds the saved register values for the environment while that environment is not running: i.e., when the kernel or a different environment is running. The kernel saves these when switching from user to kernel mode, so that the environment can later be resumed where it left off.
env_link:
    This is a link to the next Env on the env_free_list. env_free_list points to the first free environment on the list.
env_id:
    The kernel stores here a value that uniquely identifies the environment currently using this Env structure (i.e., using this particular slot in the envs array). After a user environment terminates, the kernel may re-allocate the same Env structure to a different environment - but the new environment will have a different env_id from the old one even though the new environment is re-using the same slot in the envs array.
env_parent_id:
    The kernel stores here the env_id of the environment that created this environment. In this way the environments can form a "family tree," which will be useful for making security decisions about which environments are allowed to do what to whom.
env_type:
    This is used to distinguish special environments. For most environments, it will be ENV_TYPE_USER. The idle environment is ENV_TYPE_IDLE, and we'll introduce a few more special types for special system service environments in later labs.
env_status:
    This variable holds one of the following values:
    ENV_FREE:
        Indicates that the Env structure is inactive, and therefore on the env_free_list.
    ENV_RUNNABLE:
        Indicates that the Env structure represents a currently active environment, and the environment is waiting to run on the processor.
    ENV_RUNNING:
        Indicates that the Env structure represents the currently running environment.
    ENV_NOT_RUNNABLE:
        Indicates that the Env structure represents a currently active environment, but it is not currently ready to run: for example, because it is waiting for an interprocess communication (IPC) from another environment.
    ENV_DYING:
        Indicates that the Env structure represents a zombie environment. A zombie environment will be freed the next time it traps to the kernel. We will not use this flag until Lab 4.
Like a Unix process, a JOS environment couples the concepts of "thread" and "address space". The thread is defined primarily by the saved registers (the env_tf field), and the address space is defined by the PML4, page directory pointer, page directory, and page tables pointed to by env_pml4e and env_cr3. To run an environment, the kernel must set up the CPU with both the saved registers and the appropriate address space.
Note that in Unix-like systems, individual environments have their own kernel stacks. In JOS, however, only one environment can be active in the kernel at once, so JOS needs only a single kernel stack.
In lab 2, you allocated memory in x64_vm_init() for the pages[] array, which is a table the kernel uses to keep track of which pages are free and which are not. You will now need to modify x64_vm_init() further to allocate a similar array of Env structures, called envs.
Exercise 1. Modify x64_vm_init() in kern/pmap.c to allocate and map the envs array. This array consists of exactly NENV instances of the Env structure, allocated much like the pages array. Also like the pages array, the memory backing envs should be mapped user read-only at UENVS (defined in inc/memlayout.h) so user processes can read from this array.
You should run your code and make sure check_boot_pml4e() succeeds.
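One detail worth keeping in mind for an array like envs is that mappings are made in whole pages, so the region reserved for the array is normally rounded up to a page multiple. The sketch below shows only that arithmetic. It assumes a 4096-byte page, and the helper names round_up_to_page and array_region_size are hypothetical, not JOS functions.

```c
#include <stddef.h>

#define PGSIZE 4096u   /* assumed page size in bytes */

/* Round n up to the next multiple of PGSIZE. */
static inline size_t round_up_to_page(size_t n)
{
    return (n + PGSIZE - 1) / PGSIZE * PGSIZE;
}

/* Bytes to reserve (and later map) for an array of `count` entries,
 * each `entry_size` bytes long. */
size_t array_region_size(size_t count, size_t entry_size)
{
    return round_up_to_page(count * entry_size);
}
```

For example, array_region_size(3, 100) is 4096 in this sketch, because 300 bytes still occupy one full page.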
You will now write the code in kern/env.c necessary to run a user environment. Because we do not yet have a filesystem, we will set up the kernel to load a static binary image that is embedded within the kernel itself. JOS embeds this binary in the kernel as an ELF executable image.
The Lab 3 GNUmakefile generates a number of binary images in the obj/user/ directory. If you look at kern/Makefrag, you will notice some magic that "links" these binaries directly into the kernel executable as if they were .o files. The -b binary option on the linker command line causes these files to be linked in as "raw" uninterpreted binary files rather than as regular .o files produced by the compiler. (As far as the linker is concerned, these files do not have to be ELF images at all - they could be anything, such as text files or pictures!) If you look at obj/kern/kernel.sym after building the kernel, you will notice that the linker has "magically" produced a number of funny symbols with obscure names like _binary_obj_user_hello_start, _binary_obj_user_hello_end, and _binary_obj_user_hello_size. The linker generates these symbol names by mangling the file names of the binary files; the symbols provide the regular kernel code with a way to reference the embedded binary files.
In i386_init() in kern/init.c you'll see code to run one of these binary images in an environment. However, the critical functions to set up user environments are not complete; you will need to fill them in.
Exercise 2. In the file env.c, finish coding the following functions:
env_init():
    Initialize all of the Env structures in the envs array and add them to the env_free_list. Also calls env_init_percpu, which configures the segmentation hardware with separate segments for privilege level 0 (kernel) and privilege level 3 (user).
env_setup_vm()
region_alloc()
load_icode()
env_create():
    Allocate an environment with env_alloc and call load_icode to load an ELF binary into it.
env_run()
As you write these functions, you might find the new cprintf verb %e useful -- it prints a description corresponding to an error code. For example,
	r = -E_NO_MEM;
	panic("env_alloc: %e", r);
will panic with the message "env_alloc: out of memory".
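The idea behind a verb like %e can be sketched as a lookup from error code to message. The following is only an illustration, not the JOS formatting code: the numeric values of E_INVAL and E_NO_MEM, the "invalid parameter" wording, and the function name error_description are assumptions. Only the pairing of E_NO_MEM with "out of memory" comes from the example above.

```c
#include <stddef.h>

/* Assumed numeric values; the real ones are defined in the JOS headers. */
enum { E_INVAL = 3, E_NO_MEM = 4, MAXERROR };

static const char *const error_strings[MAXERROR] = {
    [E_INVAL]  = "invalid parameter",   /* assumed wording */
    [E_NO_MEM] = "out of memory",       /* wording taken from the example above */
};

/* Map an error code (passed as either E_xxx or -E_xxx) to a description. */
const char *error_description(int err)
{
    if (err < 0)
        err = -err;
    if (err >= MAXERROR || error_strings[err] == NULL)
        return "unknown error";
    return error_strings[err];
}
```

Accepting the negated code mirrors the convention in the example, where functions return -E_NO_MEM rather than E_NO_MEM.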
Below is a call graph of the code up to the point where the user code is invoked. Make sure you understand the purpose of each step.
start (kern/entry.S)
i386_init
    cons_init
    x64_vm_init
    env_init
    trap_init (still incomplete at this point)
    env_create
    env_run
        env_pop_tf
Once you are done you should compile your kernel and run it under QEMU. If all goes well, your system should enter user space and execute the hello binary until it makes a system call with the int instruction. At that point there will be trouble, since JOS has not set up the hardware to allow any kind of transition from user space into the kernel. When the CPU discovers that it is not set up to handle this system call interrupt, it will generate a general protection exception, find that it can't handle that, generate a double fault exception, find that it can't handle that either, and finally give up, reset, and reboot in what's known as a "triple fault". Usually, you would then see the CPU reset and the system reboot. While this is important for legacy applications (see this blog post for an explanation of why), it's a pain for kernel development, so with the patched QEMU you'll instead see a register dump and a "Triple fault." message.
We'll address this problem shortly, but for now we can use the debugger to check that we're entering user mode. Use make qemu-gdb (or make qemu-nox-gdb) and set a GDB breakpoint at env_pop_tf, which should be the last function you hit before actually entering user mode. Single step through this function using si; the processor should enter user mode after the iret instruction. You should then see the first instruction in the user environment's executable, which is the cmpl instruction at the label start in lib/entry.S. Now use b *0x... to set a breakpoint at the int $0x30 in sys_cputs() in hello (see obj/user/hello.asm for the user-space address). This int is the system call to display a character to the console. If you cannot execute as far as the int, then something is wrong with your address space setup or program loading code; go back and fix it before continuing.
At this point, the first int $0x30 system call instruction in user space is a dead end: once the processor gets into user mode, there is no way to get back out. You will now need to implement basic exception and system call handling, so that it is possible for the kernel to recover control of the processor from user-mode code. The first thing you should do is thoroughly familiarize yourself with the x86 interrupt and exception mechanism.
Exercise 3. Read Chapter 9, Exceptions and Interrupts in the 80386 Programmer's Manual (or Chapter 5 of the IA-32 Developer's Manual), if you haven't already.
In this lab we generally follow Intel's terminology for interrupts, exceptions, and the like. However, terms such as exception, trap, interrupt, fault and abort have no standard meaning across architectures or operating systems, and are often used without regard to the subtle distinctions between them on a particular architecture such as the x86. When you see these terms outside of this lab, the meanings might be slightly different.
Exceptions and interrupts are both "protected control transfers," which cause the processor to switch from user to kernel mode (CPL=0) without giving the user-mode code any opportunity to interfere with the functioning of the kernel or other environments. In Intel's terminology, an interrupt is a protected control transfer that is caused by an asynchronous event usually external to the processor, such as notification of external device I/O activity. An exception, in contrast, is a protected control transfer caused synchronously by the currently running code, for example due to a divide by zero or an invalid memory access.
In order to ensure that these protected control transfers are actually protected, the processor's interrupt/exception mechanism is designed so that the code currently running when the interrupt or exception occurs does not get to choose arbitrarily where the kernel is entered or how. Instead, the processor ensures that the kernel can be entered only under carefully controlled conditions. On the x86, two mechanisms provide this protection:
The Interrupt Descriptor Table. The processor ensures that interrupts and exceptions can only cause the kernel to be entered at a few specific, well-defined entry-points determined by the kernel itself, and not by the code running when the interrupt or exception is taken.
The x86 allows up to 256 different interrupt or exception entry points into the kernel, each with a different interrupt vector. A vector is a number between 0 and 255. An interrupt's vector is determined by the source of the interrupt: different devices, error conditions, and application requests to the kernel generate interrupts with different vectors. The CPU uses the vector as an index into the processor's interrupt descriptor table (IDT), which the kernel sets up in kernel-private memory, much like the GDT. From the appropriate entry in this table the processor loads:
    the value to load into the instruction pointer (RIP) register, pointing to the kernel code designated to handle that type of exception.
    the value to load into the code segment (CS) register, which includes in bits 0-1 the privilege level at which the exception handler is to run. (In JOS, all exceptions are handled in kernel mode, privilege level 0.)
The Task State Segment. The processor needs a place to save the old processor state before the interrupt or exception occurred, such as the original values of RIP and CS before the processor invoked the exception handler, so that the exception handler can later restore that old state and resume the interrupted code from where it left off. But this save area for the old processor state must in turn be protected from unprivileged user-mode code; otherwise buggy or malicious user code could compromise the kernel.
For this reason, when an x86 processor takes an interrupt or trap that causes a privilege level change from user to kernel mode, it also switches to a stack in the kernel's memory. A structure called the task state segment (TSS) specifies the segment selector and address where this stack lives. The processor pushes (on this new stack) SS, RSP, EFLAGS, CS, RIP, and an optional error code. Then it loads the CS and RIP from the interrupt descriptor, and sets the RSP and SS to refer to the new stack.
Although the TSS is large and can potentially serve a variety of purposes, JOS only uses it to define the kernel stack that the processor should switch to when it transfers from user to kernel mode. Since "kernel mode" in JOS is privilege level 0 on the x86, the processor uses the RSP0 field of the TSS to define the kernel stack when entering kernel mode. JOS doesn't use any other TSS fields.
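As a rough illustration of what an IDT entry carries in 64-bit mode, the sketch below packs a handler address, a code segment selector, a gate type, and a descriptor privilege level into the 16-byte gate layout used by x86-64. It is not the JOS SETGATE macro: the names Gate64, make_gate, gate_target_rip, and selector_rpl are hypothetical, and the selector value 0x08 used in the usage note is only an assumed kernel code selector.

```c
#include <assert.h>
#include <stdint.h>

/* One IDT entry (gate descriptor) as laid out in 64-bit mode. */
struct Gate64 {
    uint16_t off_15_0;    /* handler address bits 15..0 */
    uint16_t selector;    /* code segment selector loaded into CS */
    uint8_t  ist;         /* low 3 bits: interrupt stack table index (0 = unused) */
    uint8_t  type_attr;   /* P (bit 7) | DPL (bits 6-5) | 0 (bit 4) | type (bits 3-0) */
    uint16_t off_31_16;   /* handler address bits 31..16 */
    uint32_t off_63_32;   /* handler address bits 63..32 */
    uint32_t reserved;
};

static_assert(sizeof(struct Gate64) == 16, "a 64-bit IDT entry is 16 bytes");

enum { GATE_INTERRUPT = 0xE, GATE_TRAP = 0xF };

/* Build a present gate pointing at `handler`. */
struct Gate64 make_gate(uint64_t handler, uint16_t selector,
                        unsigned type, unsigned dpl)
{
    struct Gate64 g = {0};
    g.off_15_0  = (uint16_t)(handler & 0xFFFF);
    g.selector  = selector;
    g.ist       = 0;
    g.type_attr = (uint8_t)(0x80 | ((dpl & 3) << 5) | (type & 0xF));
    g.off_31_16 = (uint16_t)((handler >> 16) & 0xFFFF);
    g.off_63_32 = (uint32_t)(handler >> 32);
    return g;
}

/* The RIP value the processor would load from this gate. */
uint64_t gate_target_rip(const struct Gate64 *g)
{
    return ((uint64_t)g->off_63_32 << 32) |
           ((uint64_t)g->off_31_16 << 16) |
           (uint64_t)g->off_15_0;
}

/* Bits 0-1 of a selector give the privilege level the handler runs at. */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 3;
}
```

With an assumed kernel code selector of 0x08, make_gate(addr, 0x08, GATE_INTERRUPT, 0) yields a gate whose selector_rpl() is 0, matching the statement that all JOS exception handlers run at privilege level 0.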
All of the synchronous exceptions that the x86 processor can generate internally use interrupt vectors between 0 and 31, and therefore map to IDT entries 0-31. For example, a page fault always causes an exception through vector 14. Interrupt vectors greater than 31 are only used by software interrupts, which can be generated by the int instruction, or asynchronous hardware interrupts, caused by external devices when they need attention.
In this section we will extend JOS to handle the internally generated x86 exceptions in vectors 0-31. In the next section we will make JOS handle software interrupt vector 48 (0x30), which JOS (fairly arbitrarily) uses as its system call interrupt vector. In Lab 4 we will extend JOS to handle externally generated hardware interrupts such as the clock interrupt.
Let's put these pieces together and trace through an example. Let's say the processor is executing code in a user environment and encounters a divide instruction that attempts to divide by zero.
1. The processor switches to the stack defined by the RSP0 field of the TSS, which in JOS will hold the value of KSTACKTOP.
2. The processor pushes the exception parameters on the kernel stack, starting at address KSTACKTOP:

                     +--------------------+ KSTACKTOP
                     | 0x00000 | old SS   |     " - 8
                     |      old RSP       |     " - 16
                     |     old EFLAGS     |     " - 24
                     | 0x00000 | old CS   |     " - 32
                     |      old RIP       |     " - 40 <---- RSP
                     +--------------------+

3. Because we're handling a divide error, which is interrupt vector 0 on the x86, the processor reads IDT entry 0 and sets CS:RIP to point to the handler function defined there.

For certain types of x86 exceptions, in addition to the "standard" five words above, the processor pushes onto the stack another word containing an error code. The page fault exception, number 14, is an important example. See the 80386 manual to determine for which exception numbers the processor pushes an error code, and what the error code means in that case. When the processor pushes an error code, the stack would look as follows at the beginning of the exception handler when coming in from user mode:

                     +--------------------+ KSTACKTOP
                     | 0x00000 | old SS   |     " - 8
                     |      old RSP       |     " - 16
                     |     old EFLAGS     |     " - 24
                     | 0x00000 | old CS   |     " - 32
                     |      old RIP       |     " - 40
                     |     error code     |     " - 48 <---- RSP
                     +--------------------+
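The two stack pictures above can be written down as C structures, listed from the lowest address (where RSP points on entry) to the highest. This is a sketch for checking offsets, not the JOS struct Trapframe; the names HwFrame, HwFrameErr, and rsp_after_push are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Words pushed by the processor when no error code is involved,
 * in memory order (lowest address first). */
struct HwFrame {
    uint64_t rip;      /* old RIP,    at stack top - 40 */
    uint64_t cs;       /* old CS,     at stack top - 32 (selector, zero-padded) */
    uint64_t rflags;   /* old EFLAGS, at stack top - 24 */
    uint64_t rsp;      /* old RSP,    at stack top - 16 */
    uint64_t ss;       /* old SS,     at stack top - 8  (selector, zero-padded) */
};

/* Same frame with the error code pushed last, i.e. at the lowest address. */
struct HwFrameErr {
    uint64_t err;
    struct HwFrame frame;
};

static_assert(sizeof(struct HwFrame) == 40, "five 8-byte words");
static_assert(sizeof(struct HwFrameErr) == 48, "six 8-byte words");
static_assert(offsetof(struct HwFrame, ss) == 32, "SS is the highest word");

/* Value of RSP on entry to the handler, given the top of the kernel stack. */
uint64_t rsp_after_push(uint64_t stack_top, int has_error_code)
{
    return stack_top - (has_error_code ? sizeof(struct HwFrameErr)
                                       : sizeof(struct HwFrame));
}
```

The static assertions confirm the 40-byte and 48-byte totals shown in the diagrams, so rsp_after_push(KSTACKTOP, 0) corresponds to KSTACKTOP - 40 and rsp_after_push(KSTACKTOP, 1) to KSTACKTOP - 48.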
The processor can take exceptions and interrupts both from kernel and user mode. It is only when entering the kernel from user mode, however, that the x86 processor automatically switches stacks before pushing its old register state onto the stack and invoking the appropriate exception handler through the IDT. If the processor is already in kernel mode when the interrupt or exception occurs (the low 2 bits of the CS register are already zero), then the processor just pushes more values on the same kernel stack. In this way, the kernel can gracefully handle nested exceptions caused by code within the kernel itself. This capability is an important tool in implementing protection, as we will see later in the section on system calls.
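The test on the low two bits of CS mentioned above is easy to express in code. The helpers below are a hypothetical sketch (the names selector_privilege and trapped_from_user are not from JOS), showing how a trap handler could tell whether the saved CS belongs to user-mode or kernel-mode code.

```c
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of a segment selector hold its privilege level. */
static inline unsigned selector_privilege(uint16_t selector)
{
    return selector & 3;
}

/* True if the interrupted code was running at privilege level 3 (user mode). */
bool trapped_from_user(uint16_t saved_cs)
{
    return selector_privilege(saved_cs) == 3;
}
```

For instance, a saved CS of 0x1B (an assumed user code selector with its low bits set to 3) is reported as user mode, while 0x08 (an assumed kernel code selector) is not.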
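If the processor is already in kernel mode and takes a nested exception, it does not need to switch stacks. On a 32-bit x86 the processor would therefore skip saving the old SS and ESP, but in 64-bit mode it pushes the old SS and RSP unconditionally, so the frame has the same layout as before. It simply starts at the interrupted code's stack pointer (which the processor first aligns down to a 16-byte boundary) rather than at KSTACKTOP. For exception types that do not push an error code, the kernel stack therefore looks like the following on entry to the exception handler:

                     +--------------------+ <---- old RSP (aligned)
                     | 0x00000 | old SS   |     " - 8
                     |      old RSP       |     " - 16
                     |     old EFLAGS     |     " - 24
                     | 0x00000 | old CS   |     " - 32
                     |      old RIP       |     " - 40
                     +--------------------+

For exception types that push an error code, the processor pushes the error code immediately after the old RIP, as before.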
There is one important caveat to the processor's nested exception capability. If the processor takes an exception while already in kernel mode, and cannot push its old state onto the kernel stack for any reason such as lack of stack space, then there is nothing the processor can do to recover, so it simply resets itself. Needless to say, the kernel should be designed so that this can't happen.
You should now have the basic information you need in order to set up the IDT and handle exceptions in JOS. For now, you will set up the IDT to handle interrupt vectors 0-31 (the processor exceptions) and interrupts 32-47 (the device IRQs).
The header files inc/trap.h and kern/trap.h contain important definitions related to interrupts and exceptions that you will need to become familiar with. The file kern/trap.h contains definitions that are strictly private to the kernel, while inc/trap.h contains definitions that may also be useful to user-level programs and libraries.
Note: Some of the exceptions in the range 0-31 are defined by Intel to be reserved. Since they will never be generated by the processor, it doesn't really matter how you handle them. Do whatever you think is cleanest.
The overall flow of control that you should achieve is depicted below:
      IDT                   trapentry.S         trap.c

+----------------+
|   &handler1    |---------> handler1:          trap (struct Trapframe *tf)
|                |             // do stuff      {
|                |             call trap          // handle the exception/interrupt
|                |             // undo stuff    }
+----------------+
|   &handler2    |--------> handler2:
|                |            // do stuff
|                |            call trap
|                |            // undo stuff
+----------------+
       .
       .
       .
+----------------+
|   &handlerX    |--------> handlerX:
|                |             // do stuff
|                |             call trap
|                |             // undo stuff
+----------------+
Each exception or interrupt should have its own handler in trapentry.S and trap_init() should initialize the IDT with the addresses of these handlers. Each of the handlers should build a struct Trapframe (see inc/trap.h) on the stack and call trap() (in trap.c) with a pointer to the Trapframe. trap() handles the exception/interrupt or dispatches to a specific handler function. If and when trap() returns, the code in trapentry.S restores the old CPU state saved in the Trapframe and then uses the iret instruction to return from the exception.
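The "dispatches to a specific handler function" step can be pictured as a lookup keyed by the trap number recorded in the trap frame. The sketch below uses a table of function pointers purely for illustration; it is not how JOS structures trap.c, and TrapframeSketch, register_handler, dispatch, and count_page_fault are hypothetical names. The vector numbers 0, 3, and 14 are the x86 divide error, breakpoint, and page fault vectors.

```c
#include <stddef.h>
#include <stdint.h>

#define T_DIVIDE  0     /* divide error */
#define T_BRKPT   3     /* breakpoint */
#define T_PGFLT   14    /* page fault */
#define NVECTORS  256

/* Cut-down stand-in for struct Trapframe (assumption: only three fields). */
struct TrapframeSketch {
    uint64_t tf_trapno;
    uint64_t tf_err;
    uint64_t tf_rip;
};

typedef int (*trap_handler_t)(struct TrapframeSketch *tf);

static trap_handler_t handlers[NVECTORS];
static unsigned page_faults_seen;

/* Example handler: just counts page faults. */
int count_page_fault(struct TrapframeSketch *tf)
{
    (void)tf;
    page_faults_seen++;
    return 0;
}

/* Install a handler for one vector; out-of-range vectors are ignored. */
void register_handler(unsigned vector, trap_handler_t h)
{
    if (vector < NVECTORS)
        handlers[vector] = h;
}

/* Call the handler registered for this trap, or return -1 if there is none. */
int dispatch(struct TrapframeSketch *tf)
{
    if (tf->tf_trapno < NVECTORS && handlers[tf->tf_trapno] != NULL)
        return handlers[tf->tf_trapno](tf);
    return -1;
}
```

A caller would register count_page_fault for T_PGFLT and then pass each incoming trap frame to dispatch(); a result of -1 signals an unexpected trap that the caller must deal with some other way.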
Exercise 4. Edit trapentry.S and trap.c and implement the features described above. The macros TRAPHANDLER and TRAPHANDLER_NOEC in trapentry.S should help you, as well as the T_* defines in inc/trap.h. You will need to add an entry point in trapentry.S (using those macros) for each trap defined in inc/trap.h, and you'll have to provide _alltraps, which the TRAPHANDLER macros refer to. You will also need to modify trap_init() to initialize the idt to point to each of these entry points defined in trapentry.S; the SETGATE macro will be helpful here.
Hint: your _alltraps should:
    - load GD_KD into %ds and %es
    - call trap (can trap ever return?)
Consider using the PUSHA and POPA macros; they fit nicely with the layout of the struct Trapframe.
Be sure to initialize every possible trap entry to a default handler (T_DEFAULT).
Test your trap handling code using some of the test programs in the user directory that cause exceptions before making any system calls, such as user/divzero. You should be able to get make grade to succeed on the divzero, softint, and badsegment tests at this point.
Important note: Be very careful not to declare your trap handling functions (defined by a TRAPHANDLER or TRAPHANDLER_NOEC macro) with a name already in use in the code. In your C code, these will likely be declared as extern, which disables the compiler's checking for conflicting symbol names; the compiler instead just picks the first instance of a symbol it finds. For instance, the function breakpoint is already defined in inc/x86.h, and the compiler will not warn you about this if you also call your breakpoint trap handler breakpoint; rather, the compiler will likely pick the wrong function. For this reason, one strategy might be to pick a random prefix for all traphandlers, like XTRPX_divzero, XTRPX_pgfault, etc.
Challenge 1! (2 bonus points) You probably have a lot of very similar code right now, between the lists of TRAPHANDLER in trapentry.S and their installations in trap.c. Clean this up. Change the macros in trapentry.S to automatically generate a table for trap.c to use. Note that you can switch between laying down code and data in the assembler by using the directives .text and .data.
Questions
Answer the following questions:
    - Did you have to do anything to make the user/softint program behave correctly? The grade script expects it to produce a general protection fault (trap 13), but softint's code says int $14. Why should this produce interrupt vector 13? What happens if the kernel actually allows softint's int $14 instruction to invoke the kernel's page fault handler (which is interrupt vector 14)?
This completes part A.
Now that your kernel has basic exception handling capabilities, you will refine it to provide important operating system primitives that depend on exception handling.
The page fault exception, interrupt vector 14 (T_PGFLT), is a particularly important one that we will exercise heavily throughout this lab and the next. When the processor takes a page fault, it stores the linear address that caused the fault in a special processor control register, CR2. In trap.c we have provided the beginnings of a special function, page_fault_handler(), to handle page fault exceptions.
Exercise 5. Modify trap_dispatch() to dispatch page fault exceptions to page_fault_handler(). You should now be able to get make grade to succeed on the faultread, faultreadkernel, faultwrite, and faultwritekernel tests. If any of them don't work, figure out why and fix them. Remember that you can boot JOS into a particular user program using make run-x or make run-x-nox.
You will further refine the kernel's page fault handling below, as you implement system calls.
The breakpoint exception, interrupt vector 3 (T_BRKPT), is normally used to allow debuggers to insert breakpoints in a program's code by temporarily replacing the relevant program instruction with the special 1-byte int3 software interrupt instruction. In JOS we will abuse this exception slightly by turning it into a primitive pseudo-system call that any user environment can use to invoke the JOS kernel monitor. This usage is actually somewhat appropriate if we think of the JOS kernel monitor as a primitive debugger. The user-mode implementation of panic() in lib/panic.c, for example, performs an int3 after displaying its panic message.
Exercise 6. Modify trap_dispatch() to make breakpoint exceptions invoke the kernel monitor. You should now be able to get make grade to succeed on the breakpoint test.
Challenge 2! (5 bonus points; mega-bragging rights for a good disassembler) Modify the JOS kernel monitor so that you can 'continue' execution from the current location (e.g., after the int3, if the kernel monitor was invoked via the breakpoint exception), and so that you can single-step one instruction at a time. You will need to understand certain bits of the EFLAGS register in order to implement single-stepping.
Optional: If you're feeling really adventurous, find some x86 disassembler source code - e.g., by ripping it out of QEMU, or out of GNU binutils, or just write it yourself - and extend the JOS kernel monitor to be able to disassemble and display instructions as you are stepping through them. Combined with the symbol table loading from lab 2, this is the stuff of which real kernel debuggers are made.
Questions
    - The breakpoint test case will generate either a breakpoint exception or a general protection fault depending on how you initialized the breakpoint entry in the IDT (i.e., your call to SETGATE from trap_init). Why? How do you need to set it up in order to get the breakpoint exception to work as specified above, and what incorrect setup would cause it to trigger a general protection fault?
    - What do you think is the point of these mechanisms, particularly in light of what the user/softint test program does?
User processes ask the kernel to do things for them by invoking system calls. When the user process invokes a system call, the processor enters kernel mode, the processor and the kernel cooperate to save the user process's state, the kernel executes appropriate code in order to carry out the system call, and then resumes the user process. The exact details of how the user process gets the kernel's attention and how it specifies which call it wants to execute vary from system to system.
In the JOS kernel, we will use the int instruction, which causes a processor interrupt. In particular, we will use int $0x30 as the system call interrupt. We have defined the constant T_SYSCALL to 48 (0x30) for you. You will have to set up the interrupt descriptor to allow user processes to cause that interrupt. Note that interrupt 0x30 cannot be generated by hardware, so there is no ambiguity caused by allowing user code to generate it.
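What "allow user processes to cause that interrupt" means in hardware terms is a privilege check: when code executes an int instruction, the processor compares the current privilege level with the descriptor privilege level (DPL) stored in the IDT gate, and raises a general protection fault (vector 13) instead if the caller is less privileged than the gate permits. The function below is a simplified model of that one check, assuming the gate is present; delivered_vector is a hypothetical name.

```c
#define T_GPFLT    13   /* general protection fault */
#define T_PGFLT    14   /* page fault */
#define T_SYSCALL  48   /* JOS system call vector (0x30) */

/* Vector actually delivered when code running at privilege level `cpl`
 * executes `int $vector` and the gate for that vector has privilege
 * level `gate_dpl`. Lower numbers mean more privilege. */
unsigned delivered_vector(unsigned vector, unsigned gate_dpl, unsigned cpl)
{
    if (cpl > gate_dpl)
        return T_GPFLT;     /* caller is not privileged enough for this gate */
    return vector;
}
```

In this model, user code (privilege level 3) reaches vector 48 only if the system call gate was installed with DPL 3, while the same code executing int $14 against a DPL 0 gate ends up at vector 13.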
The application will pass the system call number and the system call arguments in registers. This way, the kernel won't need to grub around in the user environment's stack or instruction stream. The system call number will go in %rax, and the arguments (up to five of them) will go in %rdx, %rcx, %rbx, %rdi, and %rsi, respectively. The kernel passes the return value back in %rax. The assembly code to invoke a system call has been written for you, in syscall() in lib/syscall.c. You should read through it and make sure you understand what is going on.
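Seen from the kernel side, the register convention above amounts to reading the call number and arguments out of the saved registers and writing the result back into the saved %rax (whose low 32 bits are %eax). The sketch below models just that data flow. It is not the JOS syscall(): RegsSketch, syscall_sketch, handle_syscall_trap, and the made-up call SYS_ADD_DEMO are hypothetical, and the numeric value of E_INVAL is an assumption.

```c
#include <stdint.h>

#define E_INVAL 3   /* assumed numeric value; the real one is in the JOS headers */

enum { SYS_ADD_DEMO = 0 };   /* made-up call number, for illustration only */

/* Only the saved registers that take part in the calling convention. */
struct RegsSketch {
    uint64_t rax;   /* in: system call number, out: return value */
    uint64_t rdx;   /* argument 1 */
    uint64_t rcx;   /* argument 2 */
    uint64_t rbx;   /* argument 3 */
    uint64_t rdi;   /* argument 4 */
    uint64_t rsi;   /* argument 5 */
};

/* Select the implementation by number; unknown numbers yield -E_INVAL. */
int64_t syscall_sketch(uint64_t num, uint64_t a1, uint64_t a2,
                       uint64_t a3, uint64_t a4, uint64_t a5)
{
    (void)a3; (void)a4; (void)a5;   /* unused by the demo call */
    switch (num) {
    case SYS_ADD_DEMO:
        return (int64_t)(a1 + a2);
    default:
        return -E_INVAL;
    }
}

/* Unpack the saved registers, make the call, and store the result in rax. */
void handle_syscall_trap(struct RegsSketch *r)
{
    r->rax = (uint64_t)syscall_sketch(r->rax, r->rdx, r->rcx,
                                      r->rbx, r->rdi, r->rsi);
}
```

Because the result is stored in the saved register set rather than returned to the caller of the handler, the user environment sees it in its own register once the kernel restores its state and resumes it.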
Exercise 7.
Add a handler in the kernel for interrupt vector T_SYSCALL. You will have to edit kern/trapentry.S and kern/trap.c's trap_init(). You also need to change trap_dispatch() to handle the system call interrupt by calling syscall() (defined in kern/syscall.c) with the appropriate arguments, and then arranging for the return value to be passed back to the user process in %rax. Finally, you need to implement syscall() in kern/syscall.c. Make sure syscall() returns -E_INVAL if the system call number is invalid. You should read and understand lib/syscall.c (especially the inline assembly routine) in order to confirm your understanding of the system call interface. You may also find it helpful to read inc/syscall.h.
Run the user/hello program under your kernel. It should print "hello, world" on the console and then cause a page fault in user mode. If this does not happen, it probably means your system call handler isn't quite right. You should also now be able to get make grade to succeed on the testbss test.
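The control flow that Exercise 7 describes can be sketched on the host as follows. This is a toy model under stated assumptions, not the JOS code: the trimmed structs, the call numbers, and every name ending in _sketch are hypothetical, and the value of E_INVAL is assumed (use the one from your headers). It shows the dispatcher pulling the number and arguments out of the saved registers, rejecting unknown numbers with -E_INVAL, and writing the result into the saved rax (whose low half is %eax) so that the user sees it after the trap returns.

```c
#include <assert.h>
#include <stdint.h>

#define T_SYSCALL 0x30
#define E_INVAL   3      /* assumed value; use the one in your headers */

/* Hypothetical call numbers for this sketch only. */
enum { SYS_getenvid_sketch = 0, SYS_add_sketch = 1 };

/* Trimmed-down stand-ins for the saved register state. */
struct PushRegsSketch { uint64_t rax, rdx, rcx, rbx, rdi, rsi; };
struct TrapframeSketch {
	struct PushRegsSketch regs;
	uint64_t trapno;
};

/* Kernel-side dispatcher: unknown numbers yield -E_INVAL. */
static int64_t
syscall_sketch(uint64_t num, uint64_t a1, uint64_t a2, uint64_t a3,
	       uint64_t a4, uint64_t a5)
{
	(void)a3; (void)a4; (void)a5;   /* unused by the toy calls below */
	switch (num) {
	case SYS_getenvid_sketch:
		return 0x1000;              /* pretend environment id */
	case SYS_add_sketch:
		return (int64_t)(a1 + a2);  /* toy call that uses two arguments */
	default:
		return -E_INVAL;
	}
}

/* On a T_SYSCALL trap, unpack the registers, call the dispatcher, and
 * store the result in the saved rax. Returns 1 if the trap was handled
 * here, 0 if it is some other trap. */
static int
trap_dispatch_sketch(struct TrapframeSketch *tf)
{
	assert(tf != 0);
	if (tf->trapno != T_SYSCALL)
		return 0;
	struct PushRegsSketch *r = &tf->regs;
	r->rax = (uint64_t)syscall_sketch(r->rax, r->rdx, r->rcx,
					  r->rbx, r->rdi, r->rsi);
	return 1;
}
```

Writing the result into the saved register state, rather than returning it from the trap handler, matters because the user's registers are restored from that saved state when the kernel resumes the environment.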
Challenge 3! (10 bonus points)
Implement system calls using the sysenter and sysexit instructions instead of using int $0x30 and iret.
The sysenter/sysexit instructions were designed by Intel to be faster than int/iret. They do this by using registers instead of the stack and by making assumptions about how the segmentation registers are used. The exact details of these instructions can be found in Volume 2B of the Intel reference manuals.
The easiest way to add support for these instructions in JOS is to add a sysenter_handler in kern/trapentry.S that saves enough information about the user environment to return to it, sets up the kernel environment, pushes the arguments to syscall() and calls syscall() directly. Once syscall() returns, set everything up for and execute the sysexit instruction. You will also need to add code to kern/init.c to set up the necessary model specific registers (MSRs). Section 6.1.2 in Volume 2 of the AMD Architecture Programmer's Manual and the reference on SYSENTER in Volume 2B of the Intel reference manuals give good descriptions of the relevant MSRs. You can find an implementation of wrmsr to add to inc/x86.h for writing to these MSRs here.
Finally, lib/syscall.c must be changed to support making a system call with sysenter. Here is a possible register layout for the sysenter instruction:
rax                - syscall number
rdx, rcx, rbx, rdi - arg1, arg2, arg3, arg4
rsi                - return pc
rbp                - return rsp
rsp                - trashed by sysenter
GCC's inline assembler will automatically save registers that you tell it to load values directly into. Don't forget to either save (push) and restore (pop) other registers that you clobber, or tell the inline assembler that you're clobbering them. The inline assembler doesn't support saving %rbp, so you will need to add code to save and restore it yourself. The return address can be put into %esi by using an instruction like leal after_sysenter_label, %%esi.
Note that this only supports 4 arguments, so you will need to leave the old method of doing system calls around to support 5 argument system calls. Furthermore, because this fast path doesn't update the current environment's trap frame, it won't be suitable for some of the system calls we add in later labs.
You may have to revisit your code once we enable asynchronous interrupts in the next lab. Specifically, you'll need to enable interrupts when returning to the user process, which sysexit doesn't do for you.
A user program starts running at the top of lib/entry.S. After some setup, this code calls libmain(), in lib/libmain.c. You should modify libmain() to initialize the global pointer env to point at this environment's struct Env in the envs[] array. (Note that lib/entry.S has already defined envs to point at the UENVS mapping you set up in Part A.) Hint: look in inc/env.h and use sys_getenvid.
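The lookup that the hint points toward can be sketched on the host like this. The sketch assumes the environment index is stored in the low bits of the environment id and that inc/env.h exposes it through a macro shaped like ENVX below; check your copy of inc/env.h rather than trusting these values. The array, the mock system call, and every name ending in _sketch are hypothetical stand-ins for the real envs, env, and sys_getenvid.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t envid_t;

/* Assumed to mirror inc/env.h: the low bits of an id index envs[]. */
#define LOG2NENV 10
#define NENV     (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))

struct EnvSketch { envid_t env_id; };      /* trimmed stand-in for struct Env */

static struct EnvSketch envs_sketch[NENV]; /* stand-in for the UENVS mapping */
static const struct EnvSketch *env_sketch; /* stand-in for the global env */

/* Mock system call: pretend the kernel says we are environment 0x1000. */
static envid_t
sys_getenvid_sketch(void)
{
	return 0x1000;
}

/* The lookup libmain() needs: id -> slot index -> pointer into the array. */
static void
libmain_init_sketch(void)
{
	envid_t id = sys_getenvid_sketch();
	assert(id >= 0);   /* a negative value would be an error code */
	env_sketch = &envs_sketch[ENVX(id)];
}
```

The id is not used as an array index directly because it also carries bits above the index; masking with ENVX strips them so the result always falls inside the array.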
libmain() then calls umain, which, in the case of the hello program, is in user/hello.c. Note that after printing "hello, world", it tries to access env->env_id. This is why it faulted earlier. Now that you've initialized env properly, it should not fault. If it still faults, you probably haven't mapped the UENVS area user-readable (back in Part A in pmap.c; this is the first time we've actually used the UENVS area).
Exercise 8.
Add the required code to the user library, then boot your kernel. You should see user/hello print "hello, world" and then print "i am environment 00001000". user/hello then attempts to "exit" by calling sys_env_destroy() (see lib/libmain.c and lib/exit.c). Since the kernel currently only supports one user environment, it should report that it has destroyed the only environment and then drop into the kernel monitor. You should be able to get make grade to succeed on the hello test.
Memory protection is a crucial feature of an operating system, ensuring that bugs in one program cannot corrupt other programs or corrupt the operating system itself.
Operating systems usually rely on hardware support to implement memory protection. The OS keeps the hardware informed about which virtual addresses are valid and which are not. When a program tries to access an invalid address or one for which it has no permissions, the processor stops the program at the instruction causing the fault and then traps into the kernel with information about the attempted operation. If the fault is fixable, the kernel can fix it and let the program continue running. If the fault is not fixable, then the program cannot continue, since it will never get past the instruction causing the fault.
As an example of a fixable fault, consider an automatically extended stack. In many systems the kernel initially allocates a single stack page, and then if a program faults accessing pages further down the stack, the kernel will allocate those pages automatically and let the program continue. By doing this, the kernel only allocates as much stack memory as the program needs, but the program can work under the illusion that it has an arbitrarily large stack.
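To make the stack example concrete, here is a minimal sketch of the decision a fault handler might make. JOS does not implement automatic stack growth in this lab, and the StackSketch bookkeeping and the try_grow_stack name are hypothetical. The sketch only models the address arithmetic: a fault just below the mapped part of the stack, but still above a fixed limit, is treated as fixable, and the mapped region is extended down to the page containing the faulting address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PGSIZE_SKETCH 4096u

/* Hypothetical per-process stack bookkeeping. The stack grows downward
 * from `top`; pages in [lowest_mapped, top) are already allocated, and
 * the kernel refuses to grow the stack below `limit` (page-aligned). */
struct StackSketch {
	uint64_t top;
	uint64_t lowest_mapped;
	uint64_t limit;
};

/* Decide whether a faulting address is a fixable stack access. If so,
 * record that every page down to the one containing the address is now
 * mapped and return true so the faulting instruction can be retried. */
static bool
try_grow_stack(struct StackSketch *s, uint64_t fault_va)
{
	assert(s->limit <= s->lowest_mapped && s->lowest_mapped <= s->top);
	if (fault_va >= s->lowest_mapped || fault_va < s->limit)
		return false;   /* already mapped, or outside the growable region */
	s->lowest_mapped = fault_va & ~(uint64_t)(PGSIZE_SKETCH - 1);
	return true;
}
```

A real kernel would allocate and map physical pages at the point where this sketch only updates lowest_mapped, and it would treat a false result as an unfixable fault for that program.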
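System calls present an interesting problem for memory protection. Most system call interfaces let user programs pass pointers to the kernel. These pointers point at user buffers to be read or written. The kernel then dereferences these pointers while carrying out the system call. There are two problems with this. First, a page fault in the kernel is potentially far more serious than one in a user program: a fault on the kernel's own data is a kernel bug, so the kernel needs a way to tell that a fault caused by dereferencing a user-supplied pointer is really on behalf of the user program. Second, the kernel typically has more memory permissions than the user program, so a user program could pass a pointer to memory that the kernel can read or write but the program cannot; the kernel must not be tricked into dereferencing such a pointer, since doing so could leak private information or corrupt the kernel.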
For both of these reasons the kernel must be extremely careful when handling pointers presented by user programs.
You will now solve these two problems with a single mechanism that scrutinizes all pointers passed from userspace into the kernel. When a program passes the kernel a pointer, the kernel will check that the address is in the user part of the address space, and that the page table would allow the memory operation.
Thus, the kernel will never suffer a page fault due to dereferencing a user-supplied pointer. If the kernel does page fault, it should panic and terminate.
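The check described above can be sketched against a toy, single-level "page table" on the host. This is an illustration of the idea, not the JOS implementation: the table, the ULIM_SKETCH boundary, the value of E_FAULT, and all names ending in _sketch are assumptions made for this example. Only the three permission bits follow the standard x86 page table entry layout. The function walks the range one page at a time, requires each page to lie below the user limit and to carry the requested permissions plus present and user, and records the first offending address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE_SKETCH 4096u
#define NPAGES_SKETCH 16u
#define ULIM_SKETCH   (12u * PGSIZE_SKETCH) /* pages >= 12 are kernel-only */
#define E_FAULT       6                     /* assumed value */

#define PTE_P 0x1u  /* present */
#define PTE_W 0x2u  /* writable */
#define PTE_U 0x4u  /* user-accessible */

static uint32_t page_perm_sketch[NPAGES_SKETCH]; /* toy one-level page table */
static uintptr_t fault_va_sketch;                /* first offending address */

/* Return 0 if user code may access [va, va+len) with permissions
 * perm | PTE_P | PTE_U, otherwise -E_FAULT with fault_va_sketch set. */
static int
user_mem_check_sketch(uintptr_t va, size_t len, uint32_t perm)
{
	uintptr_t end = va + len;
	uint32_t need = perm | PTE_P | PTE_U;

	assert(end >= va);  /* this sketch does not handle wraparound */
	for (uintptr_t a = va; a < end; ) {
		uintptr_t page = a / PGSIZE_SKETCH;
		if (a >= ULIM_SKETCH || page >= NPAGES_SKETCH ||
		    (page_perm_sketch[page] & need) != need) {
			fault_va_sketch = a;
			return -E_FAULT;
		}
		/* Advance to the start of the next page. */
		a = (page + 1) * PGSIZE_SKETCH;
	}
	return 0;
}
```

Reporting the first bad address rather than the start of its page means a failing range that begins mid-page reports the pointer the user actually passed, which makes the diagnostic easier to match against the offending program.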
Exercise 9.
Change kern/trap.c to panic if a page fault happens in kernel mode.
Hint: to determine whether a fault happened in user mode or in kernel mode, check the low bits of the tf_cs.
Read user_mem_assert in kern/pmap.c and implement user_mem_check in that same file.
Change kern/syscall.c to sanity check arguments to system calls.
Change kern/init.c to run user/buggyhello instead of user/hello. Compile your kernel and boot it. The environment should be destroyed, and the kernel should not panic. You should see:
[00001000] user_mem_check assertion failure for va 00000001
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
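The hint in Exercise 9 about the low bits of tf_cs rests on a general x86 fact: the low two bits of a code segment selector hold the privilege level, 0 for the kernel and 3 for user code. A minimal sketch of that test follows. The TrapframeCsSketch struct and the function name are hypothetical, and the selector values used to exercise it are examples rather than the JOS segment layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Trimmed stand-in for a trap frame: only the saved code segment. */
struct TrapframeCsSketch {
	uint16_t tf_cs;
};

/* True if the trap was taken while running at privilege level 0. */
static bool
trap_from_kernel(const struct TrapframeCsSketch *tf)
{
	assert(tf != 0);
	return (tf->tf_cs & 3) == 0;
}
```

A page fault handler built on this test would panic when it returns true and otherwise handle the fault on behalf of the user environment.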
Note that the same mechanism you just implemented also works for malicious user applications (such as user/evilhello).
Exercise 10.
Change kern/init.c to run user/evilhello. Compile your kernel and boot it. The environment should be destroyed, and the kernel should not panic. You should see:
[00000000] new env 00001000
[00001000] user_mem_check assertion failure for va f010000c
[00001000] free env 00001000
This completes the lab. Make sure you pass all the make grade tests.
We will be grading your solutions with a grading program. You can run make grade to test your solutions with the grading program.
Handins in this class are done by submitting a branch of your code via gradescope.
You will go to gradescope for the Lab3 assignment, and submit via github. After a few minutes, you should be able to see the results of the autograder and confirm it matches what you expect. Let course staff know ASAP if there is an issue.
In this and all other labs, you may complete challenge problems for extra credit.
If you complete challenge problems, please submit a copy of the lab to the "Challenge" assignment in gradescope for this lab. Note that challenge problems may be submitted until the last day of classes for no penalty (hence the separate assignment).
Please edit challenge.txt to list each challenge problem number you completed, and, for each problem, write a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem and how to test it. If you implement more than one challenge problem, you must describe each one.
If you submit multiple times, we will take the latest submission.
You do not need to turn in answers to any of the questions in the text of the lab. (Do answer them for yourself though! They will help with the rest of the lab.)
Last updated: 2025-03-18 14:43:44 -0400