CSE 306: Lab 1: System calls

This lab is split into three parts. The first part concentrates on getting familiarized with C and the x86 assembly language. The second part will walk you through how a system call is issued and the Linux code which handles system calls. In the third part, you will implement a text file processing utility without using libc (i.e., issuing system calls directly from your application).

Software Setup

The files you will need for this and subsequent lab assignments in this course are distributed using the Git version control system.

Git is a powerful, but tricky, version control system. We highly recommend taking time to understand git so that you will be comfortable using it during the labs. We recommend the following resources to learn more about git:

VM Setup

Each student (enrolled and on the waiting list) will be given a virtual machine on the OS teaching cluster with the basic required software installed. You will have root access to this machine, and be able to install additional tools (editors, debuggers, etc) as you see fit.

You are also welcome to install the needed software on your own laptop. The course staff is not available to help you debug your personal laptop configuration. The tools page has directions on how to set up and use gcc and other course software on your own machine.

Each student currently enrolled or on the waiting list will be assigned a VM, IP, and an initial password. Newly enrolled students will be assigned a VM within a few days of enrolling.

To get started with your VM, you must do some system setup. You must initially set up your VM via the vSphere client; after setup is complete you can use ssh to access your VM.

There are some helpful notes on using vSphere here. NOTE: DO NOT INSTALL A NEW OS (CentOS or otherwise)---only use these notes for tips on using vSphere.

Use a remote desktop client to connect to ts1.cs.stonybrook.edu:22. Use CS\userid to log in, using the same username and password you use for department email.

Start the vSphere client. Select the VM server assigned to you, e.g., esx3sc.cs.stonybrook.edu or esx4sc.cs.stonybrook.edu, and again use your CS department credentials. Your VM assignment should indicate which esx server to log into (but it will be either 3 or 4 if you lost the email). Ignore the SSL warning.

Once you have logged in, click on "Inventory", then select your VM from the list. Right click on it, and select "Open Console." Click the green "Play" button to start the VM. You can log in with the provided account and initial password. Immediately change your password! You can change your password using the passwd command on the console.

Use the ifconfig command to identify the IP address of your VM. You will use this to connect directly to the VM in the future.

After booting, the VM will be available via an ssh client on port 130 (use the -p option to specify a port, or add an entry to your .ssh/config file). You will use the cse306 account and your newly changed password to log in over ssh.

Warning: It is highly advisable that you take a checkpoint of your VM at this point, especially if you intend to modify packages on the VM. This allows you to roll back the VM to a working state without CS department administrator help (which is not available on the weekends or late at night) if something goes horribly wrong. The button to take a checkpoint in vSphere is next to the button you used to power on the VM.

Good citizenship. You have administrator access to this VM and can install anything you like. That said, to keep load down, DO NOT INSTALL A GRAPHICAL DESKTOP on the VM. You may tunnel the X protocol (ssh -X) to your local machine and display your editor in a window.

Getting started with git

Each student will be assigned a private git repository initially populated with the lab 1 skeleton code. You will use this repository to turn in your code (see below). To download the files into your development environment, you need to clone the course repository, by running the commands below. Substitute your CS netid for USER below.

Notice that each student will have his or her own git repository hosted on scm; your CS user ID should be substituted in the URL above (no CS\), and you will also use your CS department account to log into scm.

For students that do not have a private git repository yet, you may use the read-only git repository on the course webpage, using the alternate command below:

Git allows you to keep track of the changes you make to the code. For example, if you are finished with one of the exercises, and want to checkpoint your progress, you can commit your changes by running:

You can keep track of your changes by using the git diff command. Running git diff will display the changes to your code since your last commit, and git diff origin/lab1 will display the changes relative to the initial code supplied for this lab. Here, origin/lab1 is the name of the git branch with the initial code you downloaded from our server for this assignment.

We have set up the appropriate compilers and simulators for you on the CS lab machines.

Hand-In Procedure

Labs will be handed in using the make handin command. This creates a tag in git and pushes the tag and changes to the source repository. You must commit all changes you want included in the handin.

When you are ready to hand in your lab, create a file slack.txt noting how many late hours you have used both for this assignment and in total. (This is to help us agree on the number that you have used.) This file should contain a single line formatted as follows (where n is the number of late hours):

Then run make handin in the lab1 directory. If you submit multiple times, we will take the latest submission and count late hours accordingly.

In this and all other labs, you may complete challenge problems for extra credit. If you do this, please create a file called challenge.txt, which includes a short (e.g., one or two paragraph) description of what you did to solve your chosen challenge problem and how to test it. If you implement more than one challenge problem, you must describe each one.

For answering questions in the labs, you will also have a file called answers.txt, in which you will write answers to the assigned questions.

Migrating to your private repository

If you initially cloned the read-only (http) repository, you must update your git configuration. First, do a git pull to pull any changes from the read-only source. Once you have a private repository for handing in the labs, you should issue this command to add the handin repository (substituting your user ID appropriately):

You can double check the .git/config file to verify the update was successful. Initially, the config will look something like this:

Part 1: Assembly and C

The purpose of the first exercise is to introduce you to x86 assembly language and C.

Getting Started with C

Exercise 1. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as 'K&R'). We recommend that students purchase this book (here is an Amazon Link). There are several copies on reserve in the Science and Engineering library as well.

Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in lines 1 and 6 come from, how all the values in lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.

There are other references on pointers in C, though not as strongly recommended. A tutorial by Ted Jensen that cites K&R heavily is available in the course readings.

We also recommend reading the Ksplice pointer challenge as a way to test that you understand how pointer arithmetic and arrays work in C.
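
As a quick self-check on the reading, here is a minimal sketch of the two pointer facts that cause the most confusion: adding n to an int pointer moves it n * sizeof(int) bytes, and a[i] is the same object as *(a + i). The function names are made up for this illustration and are not part of pointers.c.

```c
#include <stddef.h>

/* Number of bytes between two int pointers. Casting to char * makes the
 * subtraction count bytes instead of ints. */
ptrdiff_t byte_distance(const int *from, const int *to)
{
    return (const char *)to - (const char *)from;
}

/* Returns 1 when indexing and pointer arithmetic reach the same element. */
int index_matches_arithmetic(const int *a, size_t i)
{
    return &a[i] == a + i && a[i] == *(a + i);
}
```

For an array int a[4], byte_distance(a, a + 3) is 3 * sizeof(int), not 3.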

Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don't want to find out what "the hard way" is.

Compiling a C program

Make invokes the rules specified in the file 'Makefile'. These rules specify which binaries should be generated, and make only re-compiles these binaries if your source files are newer than the binary (or it does not exist).

Internally, make uses gcc to compile a C program. To learn more about the various command line options for gcc, type man gcc at the command line.

Important: Be sure your code compiles with the included makefile. This lab has some carefully selected gcc flags that prevent you from accidentally using interfaces you should not on this assignment. Do not use gcc on the command line, but learn to use make.

Debugging C with gdb

To debug a C application, you can type gdb foo, where foo is the binary name. You will typically then type run at the command line to start the program.

You can set address breakpoints in GDB with the b command. You may use function names, like main, or you can use virtual addresses. Hex numbers must start with 0x; for example, b *0x7c00 sets a breakpoint at address 0x7C00. Once at a breakpoint, you can continue execution using the c and si commands: c causes the program to continue execution until the next breakpoint (or until you press Ctrl-C in GDB), and si N steps through the instructions N at a time.

To examine instructions in memory (besides the immediate next one to be executed, which GDB prints automatically), you use the x/i command. This command has the syntax x/Ni ADDR, where N is the number of consecutive instructions to disassemble and ADDR is the memory address at which to start disassembling.

Exercise 2. Take a look at the lab tools guide, especially the section on GDB commands. Even if you're familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.

Set a breakpoint at the main routine of pointers and step through the code. Print the value of each pointer as it changes and learn to step into the t function.

Getting Started with x86 assembly

If you are not already familiar with x86 assembly language, you will quickly become familiar with it during this course! The PC Assembly Language Book is an excellent place to start. Hopefully, the book contains a mixture of new and old material for you.

Warning: Unfortunately the examples in the book are written for the NASM assembler, whereas we will be using the GNU assembler. NASM uses the so-called Intel syntax while GNU uses the AT&T syntax. While semantically equivalent, an assembly file will differ quite a lot, at least superficially, depending on which syntax is used. Luckily the conversion between the two is pretty simple, and is covered in Brennan's Guide to Inline Assembly.

Exercise 3. (10 points) Skim the PC Assembly Language book enough that you feel comfortable writing basic x86 assembly, know the basic facilities assembly gives you, and know where to find help as needed. You should skip all sections after 1.3.5 in chapter 1, which talk about features of the NASM assembler that do not apply directly to the GNU assembler. You may also skip chapters 5 and 6, and all sections under 7.2, which deal with processor and language features we won't use. This reading is useful when trying to understand assembly in Linux, and writing your own assembly. If you have never seen assembly before, read this book carefully.

Also read the section "The Syntax" in Brennan's Guide to Inline Assembly to familiarize yourself with the most important features of GNU assembler syntax. Linux uses the GNU assembler.

Become familiar with inline assembly by writing a simple program. Modify the program ex1.c to include inline assembly that increments the value of x by 1. Add this file to your lab directory so that it is turned in for grading with the rest of your code.
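
To make the GNU inline assembly syntax concrete, here is a small sketch that adds two integers with a single instruction. It is an illustration only, not the contents of ex1.c: the function name add_with_asm is invented here, and the plain C fallback exists only so the file still builds on a non-x86 machine.

```c
/* Adds b to a using one x86 instruction written in AT&T syntax. */
int add_with_asm(int a, int b)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ ("addl %1, %0"   /* AT&T order: source first, destination second */
             : "+r" (a)      /* operand 0: read and written, any register    */
             : "r" (b)       /* operand 1: read only, any register           */
             : "cc");        /* addl changes the condition flags             */
#else
    a += b;                  /* fallback for other architectures */
#endif
    return a;
}
```

The three colon-separated sections after the instruction string are the output operands, the input operands, and the clobber list, in that order; Brennan's guide explains the constraint letters.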

Certainly the definitive reference for x86 assembly language programming is Intel's instruction set architecture reference, which you can find on the CSE 306 reference page in two flavors: an HTML edition of the old 80386 Programmer's Reference Manual, which is much shorter and easier to navigate than more recent manuals but describes all of the x86 processor features that we will make use of in CSE 306; and the full, latest and greatest Intel 64 and IA-32 Combined Software Developer's Manuals from Intel, covering all the features of the most recent processors that we won't need in class but you may be interested in learning about. An equivalent (but even longer) set of manuals is available from AMD, which also covers the new 64-bit extensions now appearing in both AMD and Intel processors.

You should read the recommended chapters of the PC Assembly book, and "The Syntax" section in Brennan's Guide now. Save the Intel/AMD architecture manuals for later or use them for reference when you want to look up the definitive explanation of a particular processor feature or instruction.

Part 2: System calls in Linux

At this point, we will describe the basic mechanism by which programs issue system calls to the OS. First, we will give a bit of background on how system calls work, and then walk through the Linux source code that services system call requests.

Where is the OS kernel?

In most operating systems, the OS kernel is loaded into the virtual address space of every running program. For instance, on the 32-bit x86 architecture, the Linux kernel is mapped into the upper gigabyte of the address space, starting at address 0xc0000000.

Note that the virtual address space of a 32-bit processor is 2^32 = 4GB, so this gives an effective virtual address space of 3 GB to the application itself, and 1 GB for the kernel.
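
The arithmetic behind the 3 GB / 1 GB split can be written out as a short sketch. KERNEL_BASE here is an assumption for illustration: it is the address at which the top gigabyte of a 32-bit address space begins, and the function names are invented.

```c
#include <stdint.h>

#define KERNEL_BASE 0xC0000000u   /* assumed start of the kernel's 1 GB region */

/* Size of the whole 32-bit virtual address space, in bytes (2^32). */
uint64_t address_space_bytes(void) { return (uint64_t)1 << 32; }

/* Bytes below KERNEL_BASE, available to the application. */
uint64_t user_bytes(void) { return KERNEL_BASE; }

/* Bytes from KERNEL_BASE to the top of the address space. */
uint64_t kernel_bytes(void) { return address_space_bytes() - KERNEL_BASE; }

/* Returns 1 if addr falls in the kernel's portion of the address space. */
int is_kernel_address(uint32_t addr) { return addr >= KERNEL_BASE; }
```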

So how does the kernel protect itself from an application simply writing over its data structures, reading sensitive data, or calling kernel functions directly? Well, the memory mapping mechanism also allows the OS to specify which ring the CPU must be running in to access the memory region.

Protection rings

The x86 CPU includes four rings, or privilege levels. Most systems, however, only use two rings: ring zero (supervisor, or kernel mode) and ring three (user, or application mode). Higher numbered rings are more restricted; this means that they are not allowed to issue certain privileged instructions, such as instructions that will interact directly with the hardware. Similarly, the page protection mechanisms, which we will learn more about later in the course, can differentiate access permissions by what ring the CPU is currently running in.

Switching between rings

In general, once the CPU has transitioned into ring 3, or user mode, the only way to return to kernel mode is through an interrupt. An interrupt can be a hardware event, such as the disk signaling a read or write completion; or it can be an exception, such as a divide by zero; or a trap, where the software manually raises an interrupt.

On the x86 architecture, interrupts are assigned a specific 8-bit value (remember 2^8==256). For instance, a divide-by-zero exception is given interrupt number 0. This value serves as an index into the interrupt descriptor table, or IDT, where the kernel can register a handler function that is called when an interrupt fires. This is similar in some respects to registering an event handler in an event-driven program.

The IDT also specifies which ring the handler runs in; in general, the ring will be zero. So, any software that can cause a specific interrupt will cause the CPU to switch to ring zero and start executing a specific function.
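
The table lookup described above can be modeled in ordinary C. This is a toy sketch, not the hardware descriptor format: a real IDT entry holds a segment selector, an offset, and privilege bits, while the struct below keeps only a function pointer and a ring number. All names starting with toy_ are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define IDT_ENTRIES 256            /* one slot per 8-bit interrupt number */

typedef int (*handler_fn)(void);

struct idt_entry {
    handler_fn handler;            /* function to run when the interrupt fires */
    int        ring;               /* ring the handler runs in (0 = kernel)    */
};

struct idt_entry toy_idt[IDT_ENTRIES];

/* Register a handler, the way a kernel fills in the table at boot. */
void toy_register(uint8_t vector, handler_fn fn)
{
    toy_idt[vector].handler = fn;
    toy_idt[vector].ring = 0;
}

/* Deliver an interrupt: index the table and call whatever is registered.
 * Returns -1 if no handler was registered for this vector. */
int toy_raise(uint8_t vector)
{
    if (toy_idt[vector].handler == NULL)
        return -1;
    return toy_idt[vector].handler();
}

int toy_divide_error(void) { return 0; }     /* stands in for vector 0 */
int toy_syscall(void)      { return 0x80; }  /* stands in for a trap handler */
```

Because the vector is an 8-bit value and the table has 256 slots, the index can never fall outside the table.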

Some interrupt numbers are assigned by the hardware manufacturer. Intel reserves interrupts 0--31 for exceptions, and by convention, the next 16 or so are typically used for device interrupts.

But what about the remaining 208 or so interrupt descriptors? These are under the control of the kernel. The most common use of an interrupt handler is to service traps or system calls from an application. For instance, Linux picked index $0x80, or 128, for its system call interrupt. Windows, in contrast, uses $0x2e, or 46. Which interrupt to use is up to the OS---there is nothing magical about 128 vs. 46.

So, what does this look like in code? Well, if you disassemble a binary that makes a system call, you may see a line that looks like this:

The int instruction raises a specific interrupt in software, rather than requiring hardware to raise the interrupt. In other words, this code causes control to jump to the function specified for interrupt handler $0x80, running in ring zero.

The kernel can return control back to the application using the iret, or return from interrupt, instruction. This instruction reloads the application's register state and returns to ring 3.

System calls

A system call, then, is just a public function that the kernel provides to the application, and executes on the application's behalf. Common examples include reading and writing to a file, allocating memory, or terminating the program.

System calls are generally implemented using an interrupt handler, and pass arguments using specific registers. This is called a calling convention, and we will discuss this more later.

At this point, you may be asking yourself, why not just issue a function call directly into the kernel? The answer is primarily so that the kernel can protect itself from a buggy or malicious program. For instance, the system call function may validate that the input arguments make sense, and then call an internal function that does the work. If an application were able to call the internal function directly, the application could circumvent these input parameter checks, and potentially damage the kernel.

Linux system call entry point

A very helpful resource in browsing the Linux source code is the Linux Cross-Reference (LXR), located at http://lxr.linux.no/. This site includes a number of useful features that can help you find your way through the source code.

Another helpful reference for this code is Chapter 10 of "Understanding the Linux Kernel", which describes the system call mechanism of Linux.

Linux places all architecture-specific code in the arch directory. So for x86-specific code, look under arch/x86. Note that older kernels treated 32-bit and 64-bit x86 as different architectures (i386 and x86_64, respectively); in more recent versions these have been combined. Under the x86 directory, however, there may be two versions of a file, one with a _32 suffix and another with a _64 suffix.

So, the entry point for a system call issued by the int instruction is in arch/x86/kernel/entry_32.S, function system_call. Note that the .S suffix means this file is written in assembly.

In the next exercise, you will walk through this function and try to understand what the assembly code is doing. But let's jump ahead to the key line here:

This code starts with the value in the eax register, multiplies it by four (the size in bytes of each table entry), adds the result to the address of sys_call_table, reads the 32-bit word at that address, and then calls the function whose address is stored in that word. Put differently, Linux constructs a table called sys_call_table (generated from the file arch/x86/syscalls/syscall_32.tbl). When converted to assembly, the table looks more like this:

When compiled to machine code, this forms a list of 4-byte function addresses. The value in the eax register, then, essentially serves as an index into this list. So, in the code above, if eax is set to 5 when the int $0x80 instruction is issued, the kernel will call the sys_open function. If eax is set to 2, the kernel will call sys_fork, etc.
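
The same indexing can be sketched in C as an array of function pointers. This is a model for intuition, not kernel code: the toy_ functions are stand-ins that simply return their own slot number, and only the first six slots of the 32-bit table are shown.

```c
#include <stddef.h>

typedef long (*syscall_fn)(void);

/* Stand-ins for kernel functions; each returns its own number so the
 * dispatch is easy to observe. */
long toy_restart(void) { return 0; }
long toy_exit(void)    { return 1; }
long toy_fork(void)    { return 2; }
long toy_read(void)    { return 3; }
long toy_write(void)   { return 4; }
long toy_open(void)    { return 5; }

/* An array of function addresses: slot n holds the handler for call n. */
syscall_fn toy_call_table[] = {
    toy_restart, toy_exit, toy_fork, toy_read, toy_write, toy_open,
};

#define TOY_NR_SYSCALLS (sizeof(toy_call_table) / sizeof(toy_call_table[0]))

/* Use the "eax" value as an index and call through the table.
 * Out-of-range numbers are rejected instead of reading past the table. */
long toy_dispatch(unsigned long eax)
{
    if (eax >= TOY_NR_SYSCALLS)
        return -38;               /* -ENOSYS on Linux */
    return toy_call_table[eax]();
}
```

In C the compiler scales the index by the element size automatically; in the assembly version that scaling is the explicit multiply by four.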

Exercise 4. (10 points) Carefully read the system call assembly code. Do your best to understand what this assembly code is doing---it will help you with the coding part of this assignment.
Hint: Read the comments at the top of the file, and note that, by convention, functions in all capitals are macros that are often defined in the same file (if not, LXR can help you find them).
Specifically, answer these questions in answers.txt in your hand-in:
1) What is happening at line 515:

	cmpl $(NR_syscalls), %eax

2) When an interrupt comes in, the CPU pushes a small amount of state (such as the return address and flags) onto the kernel stack, and the kernel's entry code then pushes the general purpose registers. These can be identified based on the offset from the stack pointer (stored in esp). Note that stacks "grow down", so every time something is pushed onto the stack, esp is decremented by four.
In your write-up, identify the six registers closest to esp and which system call argument they store. These are the (up-to) six input values to a system call. System calls which take fewer arguments simply ignore the values in the other registers.

So, to summarize, the eax register stores the system call number, and six registers are used to pass in the system call arguments, which are then pushed onto the stack. If you find the declaration of a system call, such as sys_open, you will see it declared with the special term asmlinkage. This means that the input arguments are pushed on the stack, as above. (Note that the implementation of most system calls uses a SYSCALL_DEFINE() macro, as here.)

The way that input values are passed to a function is called a calling convention. As the name implies, this is something developers agree on by convention. The decision about where each input parameter is placed is somewhat arbitrary (although there are some good technical reasons for these choices). In general, C uses a different calling convention, where some arguments are passed in registers.

So what about the return value from a system call? This is returned in the eax register. By convention, zero or a positive value usually means success, and a negative value indicates an error number (defined in one of several errno header files, such as this one).
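
Since you will be reading raw return values yourself, a small helper that turns them into the familiar libc convention may be useful. This is a sketch under one assumption worth checking against the kernel headers: Linux reserves the range -1 through -4095 for error codes, so larger "negative" values (such as high addresses) are not errors. The names toy_errno and toy_check_return are invented.

```c
#define TOY_MAX_ERRNO 4095   /* assumed: -1..-4095 are error codes */

int toy_errno;               /* stand-in for libc's errno */

/* Convert a raw kernel return value into the libc-style convention:
 * on error, store the positive error number and return -1. */
long toy_check_return(long raw)
{
    if (raw < 0 && raw >= -TOY_MAX_ERRNO) {
        toy_errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

For example, a raw return of -2 from open becomes -1 with toy_errno set to 2 (ENOENT on Linux).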

Writing your own system calls

Now that you should understand the basics of system calls, we are going to write our own system call function, using inline assembly.

Exercise 5. (25 points) Implement the inline assembly macros in mysyscall.h, which call the syscall indicated by NUM. You will need to figure out how to pass arguments to and from assembly (see Brennan's guide, above), and which registers to use.
Then, use these macros to implement your own getpid() system call in the my_getpid.c file. The code will compare the result with the libc system call, and indicate whether your call succeeded. You will need to search through the Linux source code for the appropriate system call number and number of arguments.
Advice: You will be using this header later, so as you discover the signature of a system call, you might create a macro for it in the mysyscall.h header.

Part 3: Using system calls

In the last part of the lab, you will write a simple file manipulation tool. The interesting part of this exercise is that you will not use libc. Rather, you will implement libc-like functionality using system calls directly.

To get you started, we have written some skeleton code in fileutil.c which will start you in a main function, and give you input arguments. We have also called the exit() system call stub at the end (although this must first be implemented in mysyscall.h, or else the program will crash or hang).

To step through the program in gdb and see what it is doing (without printf), do this:

Exercise 6. (55 points) Complete the fileutil program, as described below. Do not use libc, but rather, expand the use of your own system call macros, implemented in mysyscall.h.
The program should build on the course VMs, using the 'make' command. The program should take arguments as follows:

	 fileutil [-cdhru] infile|- outfile|-

The program will read infile and write outfile, or write to standard output if given '-'. The program will also take the following options (any, all, or none):

-c: Count the newlines in the file and output this on stderr (file handle 2)
-d: Convert the output to use DOS-style newlines (carriage-return, line feed). Display an error message and help if used with -u.
-h: Print a help message and exit.
-r: Reverse the contents of the file, on a line-by-line basis. For instance: "I love CSE 306.\nCSE 306 is fun!\n" should appear as "CSE 306 is fun!\nI love CSE 306.\n".
-u: Convert the output to use Unix-style newlines (line feed only). Display an error message and help if used with -d.
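
One possible way to approach the -r option, once the whole input is in memory, is to walk the buffer backwards one line at a time. The sketch below is an illustration and not a required design; the name reverse_lines is invented, it uses no libc functions, and it assumes every line (including the last) ends with a newline.

```c
#include <stddef.h>

/* Copy the lines of in[0..len) into out in reverse order.
 * A line is a run of bytes up to and including its '\n'.
 * out must have room for len bytes; returns the number of bytes written. */
size_t reverse_lines(const char *in, size_t len, char *out)
{
    size_t written = 0;
    size_t end = len;                 /* one past the end of the current line */

    while (end > 0) {
        size_t start = end - 1;       /* step over this line's own terminator */
        while (start > 0 && in[start - 1] != '\n')
            start--;                  /* back up to just after the previous '\n' */
        for (size_t i = start; i < end; i++)
            out[written++] = in[i];
        end = start;
    }
    return written;
}
```

If the final line has no trailing newline, this version copies it as-is, so it would run together with the line that follows it in the output; a full solution needs to decide how to handle that case.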

The file names given as input can be any kind of path name, relative or absolute. The input file must exist before the program can succeed. The output file may or may not exist: if it exists, it's OK to overwrite it, but only if doing so won't damage the input file (for example, infile and outfile may turn out to be the same file).
You should support both syntax like

    fileutil -cdr in.txt out.txt

as well as

    fileutil -c -d -r in.txt out.txt

although this is a lower-priority feature.
As special names, if infile is "-", you should read from stdin. If outfile is "-", you should output to stdout. That way, you can use the fileutil program as a "Unix filter" such as

$ cat foo.in | ./fileutil -d - foo.out
or
$ ./fileutil foo.in - > foo.out
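
Both option styles shown above can be handled by folding each argument into a bitmask. The following is a sketch of one way to do it without libc; the OPT_ constants and function names are invented for this example. Note that a lone "-" is rejected as an option so that the caller can treat it as a file name.

```c
#define OPT_C 0x01   /* -c: count newlines */
#define OPT_D 0x02   /* -d: DOS newlines   */
#define OPT_H 0x04   /* -h: help           */
#define OPT_R 0x08   /* -r: reverse lines  */
#define OPT_U 0x10   /* -u: Unix newlines  */

/* Fold one argument such as "-cdr" into *flags.
 * Returns 0 on success, -1 if arg is not an option (including a bare "-")
 * or contains an unknown letter. */
int parse_option(const char *arg, unsigned *flags)
{
    if (arg[0] != '-' || arg[1] == '\0')
        return -1;
    for (const char *p = arg + 1; *p != '\0'; p++) {
        switch (*p) {
        case 'c': *flags |= OPT_C; break;
        case 'd': *flags |= OPT_D; break;
        case 'h': *flags |= OPT_H; break;
        case 'r': *flags |= OPT_R; break;
        case 'u': *flags |= OPT_U; break;
        default:  return -1;
        }
    }
    return 0;
}

/* -d and -u contradict each other. */
int options_conflict(unsigned flags)
{
    return (flags & OPT_D) && (flags & OPT_U);
}
```

Calling parse_option once with "-cdr" or three times with "-c", "-d", and "-r" produces the same flags value.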

The program should check for EVERY POSSIBLE ERROR that could occur BEFORE opening the input and output files and while changing the file data. In the case of a possible error, the program should print a detailed error message on stderr and exit with a non-zero status code. This means that you should, for example, check that the input files are readable, that you have permission to read and/or write them (as needed), that neither is a directory, character/block special device, etc. Other errors could occur, and it is your job to write the code that checks for all of those conditions. (A good way to think about it is to inspect the manual pages for the system calls that you use, and ask yourself "what else could go wrong when this function runs?") Some of the more complex error conditions for which you should check are running out of disk space or quota, and the two file arguments using different names but actually pointing to the same file (do not try to modify a file in place -- this may corrupt the file).
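
Two of the checks above, "same file under different names" and "not a directory or device", can be expressed in terms of the fields of struct stat. This sketch only shows the comparisons: it assumes you have already filled in the structures with your own stat or fstat system call macro, and the function names are invented.

```c
#include <sys/stat.h>

/* Two names refer to the same file exactly when both the device number
 * and the inode number match. */
int same_file(const struct stat *a, const struct stat *b)
{
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Returns nonzero for a regular file; zero for directories, character or
 * block devices, and other special files. */
int is_regular_file(const struct stat *st)
{
    return S_ISREG(st->st_mode);
}
```

Comparing path strings is not enough, because hard links, symbolic links, and relative paths can all give one file several names.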

Many of the possible errors are listed on the analogous libc manual page. For instance, you might look at man 2 open to learn all possible return values (however, the error will be returned directly as a negative value, not placed in errno).

You are welcome to include the definitions in standard header files, but you may not link against libc, or include .c files from the libc source.

NEWLINE EXAMPLES:

The files test-input-unix.txt and test-input-dos.txt give examples of both Unix and DOS-style newlines, which you may use to check your newline conversion code.
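
For reference, the two conversions can be sketched as simple byte loops over a buffer. This is one possible approach, not a required design; the function names are invented and no libc calls are used.

```c
#include <stddef.h>

/* Convert to DOS newlines: every '\n' not already preceded by '\r'
 * becomes "\r\n". out needs room for up to 2 * len bytes. */
size_t to_dos(const char *in, size_t len, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\n' && (i == 0 || in[i - 1] != '\r'))
            out[n++] = '\r';
        out[n++] = in[i];
    }
    return n;
}

/* Convert to Unix newlines: drop the '\r' of every "\r\n" pair.
 * out needs room for len bytes. */
size_t to_unix(const char *in, size_t len, char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '\r' && i + 1 < len && in[i + 1] == '\n')
            continue;
        out[n++] = in[i];
    }
    return n;
}
```

Both functions look only within a single buffer. If you read the input in chunks, a "\r\n" pair can be split across two reads, and you will need to carry that state from one chunk to the next.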

Hints and common pitfalls:

When you use the open system call to create a new file (using the O_CREAT option), be sure to specify a valid mode, such as 0755. If you find yourself using sudo to read or delete the file, the likely culprit is a bad mode argument.
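
The mode you pass is not always the mode the file ends up with, because the kernel clears any permission bit that is set in the process umask. A one-line sketch of that calculation follows; the function name is invented, and it ignores details such as default ACLs.

```c
/* Permission bits a newly created file receives: the requested mode with
 * every bit in the umask cleared. Modes are written in octal. */
unsigned effective_mode(unsigned requested, unsigned umask_bits)
{
    return requested & ~umask_bits & 07777;
}
```

With the common umask of 022, a requested mode of 0666 yields 0644 and 0755 stays 0755. When issuing the system call directly, remember that the mode is the third argument; if you leave it unset, the kernel uses whatever value happens to be in that register.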

STYLE AND MORE:

Aside from testing the proper functionality of your code, we will also evaluate the quality of your code. Be sure to use a consistent style, document your code well, and break it into separate functions and/or source files where that makes sense.

To be sure your code is very clean, it must compile with "gcc -Wall -Werror" without any errors or warnings!

If the various sources you use require common definitions, then do not duplicate the definitions. Make use of C's code-sharing facilities.

You must include a README file with this and any assignment. The README file should describe what you did, what approach you took, results of any measurements you made, which files are included in your submission and what they are for, etc. Feel free to include any other information you think is helpful to us in this README; it can only help your grade.

STRATEGIES:

If you are not a confident C programmer, one strategy is to temporarily allow libc so that you can figure out the high-level program logic first. This can be done by removing the -nostdlib flag from your Makefile. Once this version of the code works, you could then disable libc and use the system call definitions in mysyscall.h. Nota bene: Submissions using libc will not receive full credit.

Another strategy is to implement unit test cases for every system call you think you need, so you can debug the system call independently of your larger assignment.

This completes the lab. Type make handin in the lab1 directory. After successful submission, you should receive a confirmation email (although you may need to try twice to get the email). If submission fails, double check that you have committed all of your changes, followed the instructions to add the handin repository, and read any error messages carefully before emailing the TAs for help.