COMP 530H: Lab 0H: Practice with Kernel C Programming

Due 11:59 PM, Wednesday, August 29, 2018

Introduction

For this assignment you will write a C function that will be capable of performing a variety of operations on some prototypical Linux kernel structures. To test your function you will also write a main program that will create instances of these prototypical structures and then call your function to manipulate the structures. You should develop all of your code on the Linux server walter.cs.unc.edu.

This assignment is designed to provide practice using the C skills that are going to be important for subsequent assignments that involve writing code for the Linux kernel. The C language elements that are emphasized are typedefs, structures, pointers, character strings, and enumeration types.

A note on 530 and 530H

This assignment serves as an "audition" for COMP 530H. After you complete this assignment, the course staff will review your code, along with your responses on the survey, and decide whether to let you into the section.

If you complete this assignment and are not selected for COMP 530H, this lab may replace Lab 0, or be submitted as extra credit. For fairness, any COMP 530 student may submit this lab for extra credit (up to 10 points on or before the due date, or up to 60% credit, i.e. 6 points, until the last day of classes).

Github classroom

The files you will need for this and subsequent lab assignments in this course are distributed using the Git version control system.

This semester, we will use github classroom to hand out and submit assignments. This means you will need to create an account at github.com (if you don't have one already) and send the instructor your github user ID ASAP. If you create a new account you will need to verify your email address. Once you are invited to join the class, you will need to authorize github classroom to access your account.

You will also need to generate ssh keys on any system you are using, if you do not already have them:

mkdir .ssh
chmod 700 .ssh
cd .ssh
ssh-keygen -t rsa

Once you have generated an id_rsa.pub file, you will need to upload it to github (click on your user icon in the upper right corner, go to "Settings", then "SSH and GPG Keys", and upload your key(s)).

Getting familiar with git

Git is a powerful, but tricky, version control system. We highly recommend taking time to understand git so that you will be comfortable using it during the labs. We recommend the following resources to learn more about git:

  1. Understanding git conceptually. This is a MUST READ if you want to work with git smoothly. (You may skip the last part, Rebasing, for now.)
  2. Quick 15-20 mins online exercise to get to know git.
  3. Git user's manual
  4. If you are already familiar with other version control systems, you may find this CS-oriented overview of Git useful.

Working in a group

You may complete this assignment alone or with as many teammates as you like, per the course collaboration policy. However, you are strongly encouraged to do this assignment alone, as it will be difficult to get all needed points if you do all assignments in a group, and subsequent assignments will be much harder than this one. Moreover, this assignment will also help you warm up on C programming, and is an important building block for future labs. You may form different groups for subsequent assignments.

Once you join the group, you will be asked to create a team, or join a team. Once a team is created, you can add members to your team.

Important: In your code, you should have a file called authors.txt with the onyens of each member of the team on a separate line. For example:

onyen1
onyen2

Getting the starter code

You will need to click on this link to create a private repository for your code.

Once you have a repository, you will need to clone it (see the URL under the green "Clone or Download" button, after selecting "Use SSH"). For instance, if your private repo is called lab0h-team-don:

walter% git clone git@github.com:comp530-f18/lab0h-team-don.git

System for this course

For COMP 530H, we will be using walter.cs.unc.edu, which you can access via ssh. This system has all needed software for these labs. We will grade your assignments on walter, so ensure that they work on walter. We will not give partial credit for assignments that work on a different system, but do not work on walter.

You are also welcome to install the needed software on your own laptop. The course staff is not available to help you debug your personal laptop configuration. If you do this, you should still check that your code works on walter. The tools page has directions on how to set up qemu, gdb and gcc for use with JOS.

Background

The basic unit of work in the Linux operating system is called a task. (In class, because we will be considering all sorts of operating systems, we'll use the more generic word "process" instead of task.) Inside the kernel, a task is defined by a data structure, a C struct, that contains all the information the kernel needs to know about the task. In practice, this structure is quite large and is really a set of structures linked together by pointers.

For this assignment, we will use the following highly simplified version of a task structure:

typedef struct {
  long pid;
  VM *vm_ptr;
  FS *fs_ptr;
} task;

In this simplified example, a task is defined by its identifier, a number called pid ("process identifier"), and pointers to two additional data structures that hold information about the task's virtual memory state (vm_ptr) and files in use by the task (fs_ptr). Importantly, you don't have to know anything about files or virtual memory at this point. We will cover these concepts later in the course. For now assume these are just pointers to other structures.

The definition of the virtual memory structure VM is:

typedef struct {
  paged  *paged_ptr;
  pinned *pinned_ptr;
} VM;

This structure relies on two additional structures for "paged" virtual memory and "pinned" virtual memory. Again, you don't have to know anything about these virtual memory concepts at this point. For now, assume these are just pointers to other structures that are defined as follows:

typedef struct {
  void *paged_start;
  void *paged_end;
} paged;

typedef struct {
  void *pinned_start;
  void *pinned_end;
} pinned;

The definition of the file system structure FS is:

typedef struct {
  long inode_start;
  long inode_end;
} FS;

This structure stores information about file system objects known as "i-nodes" (or just "inodes"). Once again, you don't have to know anything about files or inodes to complete this assignment!

Note that not all task structures will contain VM (or paged or pinned) or FS structures. One or more of the pointers to these subordinate structures may have the value NULL.

Pointer Refresher

One of the hardest parts of moving from programming in a higher-level language, like Java, to C is dealing with pointers. We will start with a few ungraded, but highly recommended, exercises to refresh your understanding of pointers.

Exercise 1. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as 'K&R'). We recommend that students purchase this book (here is an Amazon Link). There is a copy on reserve in the library.

Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in lines 1 and 6 come from, how all the values in lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.

There are other references on pointers in C, though not as strongly recommended. A tutorial by Ted Jensen that cites K&R heavily is available in the course readings.

We also recommend reading the Ksplice pointer challenge as a way to test that you understand how pointer arithmetic and arrays work in C.

Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don't want to find out what "the hard way" is.

The Function task_store

For this assignment you are to write a function task_store:

void *task_store(operation op, char *parm, task *ptr);

that will perform one of three operations on a task structure: store a copy of a task structure, return a pointer to an element of a previously stored copy of a task structure, or initialize data structures associated with the task_store function. The operation that is to be performed is specified by the op parameter. This parameter will be an instance of the enumerated type operation. This type is defined as follows:

typedef enum {
  INIT, 
  STORE, 
  LOCATE
} operation;

Thus, depending on the value of the op parameter, the call to task_store will either initialize local data structures (INIT), store a local copy (STORE), or return a pointer to an element of a previously stored structure (LOCATE). In all cases the function task_store returns an un-typed pointer (void *) with a non-NULL value in the case of a successful call and a NULL value in the case that any error occurs during the call.

More specifically, if the value of the op parameter is STORE, then the task_store function is to store a persistent copy of all structures used to represent the task pointed to by the parameter ptr. To do this your function will need to allocate sufficient memory to store the task structure as well as any of the subordinate structures pointed to by elements of the task structure. Note that because you are making copies of the structures, you will have to create the appropriate linkages between the pointers in the structures and the copies of the structures to which they point.

The string parameter parm will contain the character representation of an integer value that specifies an identifier for the task that the calling program will use in subsequent LOCATE operations to access a value from the stored copy (Note: this identifier is not the same as the pid field of the task struct). For example the call:

ret = task_store(STORE, "100", &my_task);

will create a persistent copy of the task representation found at pointer &my_task and associate the copy with the numeric identifier 100.

Your implementation of task_store should set a limit on the number of task structures that can be stored (say 50). If this number of task structures is ever exceeded then your function should return an error indication.

If the value of the op parameter is LOCATE, then you are to return a pointer to a field in the stored local copy of a task structure. The calling program should be able to dereference the returned pointer (with appropriate cast) to access the value in the field of the local copy. When the value of op is LOCATE, the string parameter parm specifies the character representation of a numeric task identifier and the string name of a field in a struct (pid, paged_start, paged_end, pinned_start, pinned_end, inode_start, inode_end). struct fields that provide pointer linkages among the structures (vm_ptr, fs_ptr, paged_ptr, and pinned_ptr) should not be "located".

For example the call:

ret = task_store(LOCATE, "100 inode_start", NULL);

should return a pointer to the inode_start field in the FS struct for the task structure previously stored with the numeric identifier 100. If any error is encountered during the call, the NULL value should be returned. You can assume that parameters encoded in parm are separated by a space (by the character ' ') and that parm is a C null-terminated string.

If the value of the op parameter is INIT, then any persistent data structures and variables used by task_store should be initialized. This operation can return (void *) 1 on success, and NULL on failure. You can assume that the first call to task_store in any program will be an INIT operation. INIT should only be called once, and any calls to INIT after the first one should return an error.

Program Structure

You are to write a function task_store as a separately compiled function in a file you must name task_store.c. We will provide you with a header file task.h that contains the type definitions listed above that you can include in your task_store.c file. You should test your implementation by writing a separate main program (in a separate file) that calls your task_store implementation. (This is how we will test your implementation.) To test your implementation, note that your main program will have to create a bunch of dummy task structures. To get you started we will provide you with a template for your main program called store_test.c that you can (and should!) expand and customize.

To compile and test your program use the provided Makefile:
make

In addition to the C elements mentioned above, you should review the man pages for C library functions that you will likely want to use, including malloc, memcpy, atoi, sscanf, and string functions such as strcmp and strstr, as well as the C language sizeof operator.

Your code should be carefully commented, especially with file and function/block level comments and descriptions for all internal data structures.

Exercise 2. (10 points) Implement the task_store function, as described above. You will also need to implement in store_test.c a main routine that creates a bunch of dummy inputs and tests the correctness of your task_store function. You should write at least 3 solid test cases, ideally more, and your code should be clean and well-commented.

Hand-In Procedure

Type make handin in the lab directory. You may submit more than once; the most recent submission will be graded. Submission works by creating a tag in git, which is then pushed to github. If you look at your lab page on github, you will see a pulldown list with "Branch: master". If you drop this list down, you will see an option to view tags. If you choose the tag handin, you will be able to view your submitted code and confirm that it is correct.

All programs will be tested on walter.cs.unc.edu. All programs, unless otherwise specified, should be written to execute in the current working directory. Your correctness grade will be based solely on your program's performance on walter.cs.unc.edu. Make sure your programs work on walter!

Generally, unless the homework assignment specifies otherwise, you should compile your program using the provided Makefile (e.g., by just typing make on the console). Do not add any special command line arguments ("flags") or compiler options to the Makefile.

The program should be neatly formatted (i.e., easy to read) and well-documented. In general, 75% of your grade for a program will be for correctness, 25% for "programming style" (appropriate use of language features [constants, loops, conditionals, etc.], including variable/procedure/class names), and documentation (descriptions of functions, general comments [problem description, solution approach], use of invariants, pre-and post conditions where appropriate). For this first assignment, correctness and programming style/documentation will each count for 50% of your grade.

Make sure you put your name(s) in a header comment in every file you submit.

This completes the lab.


Last updated: 2018-09-06 09:24:06 -0400