Due 11:59 PM, Tuesday, Nov 17, 2020
In this lab you will develop a multi-threaded simulation of a Least Recently Used (LRU) cache of keys.
The course staff have provided you with a simple, sequential implementation of the code. The sequential implementation will need synchronization to work properly with multiple threads. Your job will be to create several parallel versions of the code, with increasing sophistication.
As before, this assignment does not involve writing that many lines of code (you will probably copy/paste the base code several times). The changes you make will probably be in the tens of lines. The hard part is figuring out the few lines of delicate code to write, and being sure your synchronization code is correct. We strongly recommend starting early and writing many test cases.
In this assignment, we will assume that our cache needs to track the reference count of a set of keys (here, integers from 0..MAX_KEY).
When a key is referenced, by calling reference(key), the key should be looked up in the cache. If it is not present, it is added with a reference count of 1; if it is already present, its reference count is incremented by one.
This cache is organized as a simple, singly-linked list.
Periodically, a thread will need to clean the cache (similar to the clock algorithm we saw in class).
The cleaner thread iterates through the list and decrements each node's reference count by 1. Any node that reaches a reference count of zero should be removed from the list and freed.
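The behavior described above can be sketched in sequential form as follows. This is a minimal illustration, not the starter code: the node layout, the names sketch_reference and sketch_clean, and the choice to insert new keys at the head of the list are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; the real definition lives in the starter code. */
struct node {
    int key;
    int refcount;
    struct node *next;
};

static struct node *head = NULL; /* first node, or NULL when the list is empty */
static int count = 0;            /* number of nodes currently in the list */

/* Increment the key's reference count if present; otherwise add it with count 1.
 * Returns 1 on success, 0 on allocation failure. */
int sketch_reference(int key)
{
    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->key == key) {
            n->refcount++;
            return 1;
        }
    }
    struct node *n = malloc(sizeof(*n));
    if (n == NULL)
        return 0;
    n->key = key;
    n->refcount = 1;
    n->next = head; /* assumption: insert at the head for simplicity */
    head = n;
    count++;
    return 1;
}

/* Decrement every reference count; unlink and free nodes that reach zero. */
void sketch_clean(void)
{
    struct node **link = &head; /* the pointer that currently points at the node */
    while (*link != NULL) {
        struct node *n = *link;
        n->refcount--;
        if (n->refcount == 0) {
            *link = n->next; /* unlink before freeing */
            free(n);
            count--;
        } else {
            link = &n->next;
        }
    }
    assert(count >= 0);
}
```

Nothing here is safe with more than one thread: every read and write of head, count, and the node fields is unsynchronized.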
You may do the lab alone, or in a group. You can complete this lab with a different team than the previous lab, if you prefer. If you work in a group, please submit one assignment to Gradescope, and list all group members in the code comments.
You will need to click on this link to create a private repository for your code.
Once you have a repository, you will need to clone your private repository (see the URL under the green "Clone or Download" button, after selecting "Use SSH"). For instance, if your private repo is called lab3-team-don:
git clone git@github.com:comp530-f20/lab3-team-don.git
We provided you with a single-threaded, sequential version of the code in sequential-lru.c.
This C file implements the functions defined in lru.h:
int init(int numthreads) | Initialize any global variables or synchronization primitives (hint: see pthread_mutex_init() and pthread_cond_init()). May not need to do anything. Returns 0 on success, -errno on failure.
int reference(int key) | Search the list, and, if found, increment the key's reference count by one. If not found, add to the list. Returns 1 on success, 0 on failure.
void clean(int check_water_mark) | Iterate through the list, and decrement the reference count. If the reference count drops to zero, delete the element. If check_water_mark is set, block until there are more than LOW_WATER_MARK elements in the list.
void shutdown_threads(void) | Wake up any blocked threads. May not need to do anything in each variant.
void print(void) | Print the contents of the list, primarily for debugging and testing.
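As a small illustration of the "0 on success, -errno on failure" convention for init(), here is one possible sketch. The names sketch_init, sketch_lock, and sketch_cond are hypothetical. Note that pthread_mutex_init() and pthread_cond_init() return the error number directly rather than setting errno, so the sketch negates the return value.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t sketch_lock; /* hypothetical global lock */
static pthread_cond_t sketch_cond;  /* hypothetical condition variable */

/* Returns 0 on success, or a negated error number on failure. */
int sketch_init(void)
{
    int rc = pthread_mutex_init(&sketch_lock, NULL);
    if (rc != 0)
        return -rc;
    rc = pthread_cond_init(&sketch_cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&sketch_lock); /* undo the partial setup */
        return -rc;
    }
    return 0;
}

/* Quick sanity check that a freshly initialized mutex can be taken and released. */
void sketch_check_lock(void)
{
    int rc = pthread_mutex_lock(&sketch_lock);
    assert(rc == 0);
    rc = pthread_mutex_unlock(&sketch_lock);
    assert(rc == 0);
    (void)rc;
}
```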
We provide you with a test harness in main.c. You will complete LRU implementations in mutex-lru.c and fine-lru.c. You may copy any code from sequential-lru.c into these files that you find useful, but it will require modification.
Note that typing make generates three different executables. Currently, only sequential-lru will work, and only with one thread. There will probably be compiler warnings for the other variants.
The code comes with the ability to turn debugging prints on and off at compile time. You can edit lru.h to uncomment the line below, and you will get debug output printing after the initial selftests, and at the end of a period of time (specified on the command line):
//#define DEBUG 1
Note that heavy use of printf is not a good idea beyond initial testing, because printf includes synchronization, which can inadvertently hide bugs.
That said, this style of macro-enabled debug messages can help you understand what is happening, and easily compile out the messages once you believe the code is working.
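For illustration, macro-enabled debug messages of this kind are often written along the following lines. The macro name SKETCH_DEBUG_PRINT and the helper bump are hypothetical; the macro actually used in lru.h may look different.

```c
#include <assert.h>
#include <stdio.h>

/* #define DEBUG 1 */ /* uncomment to compile the messages in */

#ifdef DEBUG
#define SKETCH_DEBUG_PRINT(...) fprintf(stderr, __VA_ARGS__)
#else
#define SKETCH_DEBUG_PRINT(...) ((void)0) /* compiles away to nothing */
#endif

/* Increment a reference count, logging the new value only in debug builds. */
int bump(int *refcount)
{
    assert(refcount != NULL);
    ++*refcount;
    SKETCH_DEBUG_PRINT("refcount is now %d\n", *refcount);
    return *refcount;
}
```

Because the disabled form expands to nothing, the messages add no synchronization or timing side effects to a non-debug build.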
To build the code, along with a very, very simple test case, type make.
You can test the starter code as follows:
porter@comp530fa20:~/lab3$ ./lru-sequential
=== Starting list print ===
=== Total count is 64 ===
Key 0, Ref Count 1
Key 1, Ref Count 1
Key 2, Ref Count 1
Key 3, Ref Count 1
Key 4, Ref Count 1
Key 5, Ref Count 1
Key 6, Ref Count 1
Key 7, Ref Count 1
Key 8, Ref Count 1
Key 9, Ref Count 1
Key 10, Ref Count 1
Key 11, Ref Count 1
Key 12, Ref Count 1
Key 13, Ref Count 1
Key 14, Ref Count 1
Key 15, Ref Count 1
Key 16, Ref Count 1
Key 17, Ref Count 1
Key 18, Ref Count 1
Key 19, Ref Count 1
Key 20, Ref Count 1
Key 21, Ref Count 1
Key 22, Ref Count 1
Key 23, Ref Count 1
Key 24, Ref Count 1
Key 25, Ref Count 1
Key 26, Ref Count 1
Key 27, Ref Count 1
Key 28, Ref Count 1
Key 29, Ref Count 1
Key 30, Ref Count 1
Key 31, Ref Count 1
Key 32, Ref Count 1
Key 33, Ref Count 1
Key 34, Ref Count 1
Key 35, Ref Count 1
Key 36, Ref Count 1
Key 37, Ref Count 1
Key 38, Ref Count 1
Key 39, Ref Count 1
Key 40, Ref Count 1
Key 41, Ref Count 1
Key 42, Ref Count 1
Key 43, Ref Count 1
Key 44, Ref Count 1
Key 45, Ref Count 1
Key 46, Ref Count 1
Key 47, Ref Count 1
Key 48, Ref Count 1
Key 49, Ref Count 1
Key 50, Ref Count 1
Key 51, Ref Count 1
Key 52, Ref Count 1
Key 53, Ref Count 1
Key 54, Ref Count 1
Key 55, Ref Count 1
Key 56, Ref Count 1
Key 57, Ref Count 1
Key 58, Ref Count 1
Key 59, Ref Count 1
Key 60, Ref Count 1
Key 61, Ref Count 1
Key 62, Ref Count 1
Key 63, Ref Count 1
=== Ending list print ===
=== Starting list print ===
=== Total count is 64 ===
Key 0, Ref Count 2
Key 1, Ref Count 1
Key 2, Ref Count 2
Key 3, Ref Count 1
Key 4, Ref Count 2
Key 5, Ref Count 1
Key 6, Ref Count 2
Key 7, Ref Count 1
Key 8, Ref Count 2
Key 9, Ref Count 1
Key 10, Ref Count 2
Key 11, Ref Count 1
Key 12, Ref Count 2
Key 13, Ref Count 1
Key 14, Ref Count 2
Key 15, Ref Count 1
Key 16, Ref Count 2
Key 17, Ref Count 1
Key 18, Ref Count 2
Key 19, Ref Count 1
Key 20, Ref Count 2
Key 21, Ref Count 1
Key 22, Ref Count 2
Key 23, Ref Count 1
Key 24, Ref Count 2
Key 25, Ref Count 1
Key 26, Ref Count 2
Key 27, Ref Count 1
Key 28, Ref Count 2
Key 29, Ref Count 1
Key 30, Ref Count 2
Key 31, Ref Count 1
Key 32, Ref Count 2
Key 33, Ref Count 1
Key 34, Ref Count 2
Key 35, Ref Count 1
Key 36, Ref Count 2
Key 37, Ref Count 1
Key 38, Ref Count 2
Key 39, Ref Count 1
Key 40, Ref Count 2
Key 41, Ref Count 1
Key 42, Ref Count 2
Key 43, Ref Count 1
Key 44, Ref Count 2
Key 45, Ref Count 1
Key 46, Ref Count 2
Key 47, Ref Count 1
Key 48, Ref Count 2
Key 49, Ref Count 1
Key 50, Ref Count 2
Key 51, Ref Count 1
Key 52, Ref Count 2
Key 53, Ref Count 1
Key 54, Ref Count 2
Key 55, Ref Count 1
Key 56, Ref Count 2
Key 57, Ref Count 1
Key 58, Ref Count 2
Key 59, Ref Count 1
Key 60, Ref Count 2
Key 61, Ref Count 1
Key 62, Ref Count 2
Key 63, Ref Count 1
=== Ending list print ===
=== Starting list print ===
=== Total count is 32 ===
Key 0, Ref Count 1
Key 2, Ref Count 1
Key 4, Ref Count 1
Key 6, Ref Count 1
Key 8, Ref Count 1
Key 10, Ref Count 1
Key 12, Ref Count 1
Key 14, Ref Count 1
Key 16, Ref Count 1
Key 18, Ref Count 1
Key 20, Ref Count 1
Key 22, Ref Count 1
Key 24, Ref Count 1
Key 26, Ref Count 1
Key 28, Ref Count 1
Key 30, Ref Count 1
Key 32, Ref Count 1
Key 34, Ref Count 1
Key 36, Ref Count 1
Key 38, Ref Count 1
Key 40, Ref Count 1
Key 42, Ref Count 1
Key 44, Ref Count 1
Key 46, Ref Count 1
Key 48, Ref Count 1
Key 50, Ref Count 1
Key 52, Ref Count 1
Key 54, Ref Count 1
Key 56, Ref Count 1
Key 58, Ref Count 1
Key 60, Ref Count 1
Key 62, Ref Count 1
=== Ending list print ===
=== Starting list print ===
=== Total count is 0 ===
=== Ending list print ===
Salt is 1542750928
=== Starting list print ===
=== Total count is 55 ===
Key 1, Ref Count 2
Key 3, Ref Count 1
Key 6, Ref Count 3
Key 9, Ref Count 1
Key 10, Ref Count 1
Key 11, Ref Count 1
Key 13, Ref Count 1
Key 14, Ref Count 1
Key 15, Ref Count 1
Key 16, Ref Count 1
Key 18, Ref Count 1
Key 22, Ref Count 1
Key 25, Ref Count 1
Key 26, Ref Count 1
Key 28, Ref Count 2
Key 31, Ref Count 1
Key 34, Ref Count 1
Key 35, Ref Count 1
Key 38, Ref Count 1
Key 39, Ref Count 1
Key 41, Ref Count 2
Key 44, Ref Count 1
Key 47, Ref Count 1
Key 50, Ref Count 1
Key 51, Ref Count 1
Key 52, Ref Count 1
Key 54, Ref Count 1
Key 58, Ref Count 1
Key 61, Ref Count 2
Key 65, Ref Count 2
Key 68, Ref Count 1
Key 69, Ref Count 1
Key 70, Ref Count 2
Key 71, Ref Count 2
Key 73, Ref Count 2
Key 79, Ref Count 7
Key 81, Ref Count 2
Key 82, Ref Count 2
Key 88, Ref Count 2
Key 90, Ref Count 1
Key 92, Ref Count 1
Key 93, Ref Count 2
Key 94, Ref Count 1
Key 99, Ref Count 1
Key 100, Ref Count 1
Key 101, Ref Count 3
Key 106, Ref Count 2
Key 109, Ref Count 1
Key 111, Ref Count 1
Key 118, Ref Count 1
Key 120, Ref Count 2
Key 122, Ref Count 1
Key 124, Ref Count 1
Key 125, Ref Count 1
Key 126, Ref Count 1
=== Ending list print ===
All but the last list print should be deterministic, as these are controlled by the code in self_tests() (you may add additional tests if you like). The last print will vary from run to run, and is the output of running the simulation for 30 seconds.
This test harness takes several options to control the length of time it runs, and the number of threads. To get this information, try:
porter@comp530fa20:~/lab3$ ./sequential-lru -h
LRU Simulator. Usage: ./lru-[variant] [options]
Options:
-c numclients - Use numclients threads.
-h - Print this help.
-l length - Run clients for length seconds.
Do not use this implementation with more than one thread---it will break because it does not use any synchronization!
Before you begin the assignment, (re-)read Coding Standards for Programming with Threads. You are required to follow these standards for this project. Because it is impossible to determine the correctness of a multithreaded program via testing, grading on this project will primarily be based on reading your code, not on running tests. Your code must be clear and concise. If your code is not easy to understand, then your grade will be poor, even if the program seems to work. In the real world, unclear multi-threaded code is extremely dangerous -- even if it "works" when you write it, how will the programmer who comes after you debug it, maintain it, or add new features? Feel free to sit down with the TA or instructor during office hours for code inspections before you turn in your project.
Programming multithreaded programs requires extra care and more discipline than programming conventional programs. The reason is that debugging multithreaded programs remains an art rather than a science, despite more than 30 years of research. Generally, avoiding errors is likely to be more effective than debugging them. Over the years a culture has developed with the following guidelines in programming with threads. Adhering to these guidelines will ease the process of producing correct programs:
char a[1000];
void Modify(int m, int n)
{
    a[m + n] = a[m] + a[n]; // ignore bound checks
}
If two threads execute this function concurrently, there is no guarantee that both will perceive consistent values of the elements of the array a. Therefore, such a statement has to be protected by a mutex that ensures proper execution, as follows:
char a[1000];
void Modify(int m, int n)
{
    Lock(p);
    a[m + n] = a[m] + a[n]; // ignore bound checks
    Unlock(p);
}
where p is a synchronization variable.
The -pthread flag (included in your Makefile) is sufficient for libc using gcc. Other compilers may require the -D_POSIX_PTHREAD_SEMANTICS flag to select thread-safe versions.
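The Lock(p) and Unlock(p) pseudocode above maps onto pthreads roughly as follows. This is a sketch assuming a statically initialized mutex; the bounds assertion is an addition for the example, since the original deliberately ignores bound checks.

```c
#include <assert.h>
#include <pthread.h>

static char a[1000];
static pthread_mutex_t p = PTHREAD_MUTEX_INITIALIZER; /* protects every element of a */

void Modify(int m, int n)
{
    assert(m >= 0 && n >= 0 && m + n < 1000); /* added for the sketch */
    pthread_mutex_lock(&p);
    a[m + n] = a[m] + a[n]; /* the read-modify-write happens entirely under the lock */
    pthread_mutex_unlock(&p);
}
```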
One very helpful reference for implementing concurrency on a linked list is the book The Art of Multiprocessor Programming by Herlihy and Shavit. In particular, look at Chapter 9.4 and 9.5 for examples of how to do this sort of synchronization.
You are now ready to start implementing multi-threaded versions of the code.
Exercise 1. (7 points)
Implement a thread-safe list in mutex-lru.c, using coarse-grained locking.
In other words, for this exercise, it is sufficient to have one lock for the entire list.
To complete this exercise, we recommend using pthread mutex functions, including pthread_mutex_init, pthread_mutex_lock, and pthread_mutex_unlock.
Be sure to test your code by running with lru-mutex -c XXX
(where XXX is an integer greater than 1), in order to check that the code works properly.
Because multi-threaded code interleaves memory operations non-deterministically, it may work
once and fail a second time---so test very thoroughly and add a lot of assertions to your code.
Don't be alarmed if you end up with a relatively big or relatively empty list at this point, as long as the list is correct otherwise.
comp530fa20 is a 16-core machine. Be sure to test your code regularly on the class system, as some bugs may only manifest with multiple CPUs.
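One way to structure a multi-threaded test with assertions is sketched below. A shared counter stands in for the list state so that the expected result is known exactly; the names, thread count, and iteration count are all arbitrary choices for the example, and this is not the provided test harness.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 8
#define ITERATIONS 100000

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* one coarse lock */
static long shared_total = 0; /* stands in for the shared list state */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&big_lock);
        shared_total++; /* without the lock, updates could be lost */
        pthread_mutex_unlock(&big_lock);
    }
    return NULL;
}

/* Runs all workers to completion and returns the final total. */
long run_stress_test(void)
{
    pthread_t threads[NUM_THREADS];
    shared_total = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        (void)rc;
    }
    return shared_total;
}
```

A passing run does not prove the locking is correct, but a total that differs from NUM_THREADS * ITERATIONS proves it is not.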
Ideally, we want to maintain that the cache has a certain number of elements --- i.e., that the count variable stays between LOW_WATER_MARK and HIGH_WATER_MARK.
In the second exercise, you will need to use condition variables to ensure two things:
reference() blocks until count is below HIGH_WATER_MARK, as reference may add an element to the list.
clean() blocks until count is above LOW_WATER_MARK, ensuring the list doesn't get too small.
Both functions stop blocking once shutdown_threads() is called.
Although you could simply have a thread run in an infinite while loop, we can do better. In particular, we would like you to use condition variables (e.g., pthread_cond_wait() and pthread_cond_signal()), as well as the existing mutex synchronization, to allow the cleaner thread to sleep until there is work to do, and then to wake up once there is.
You will need to ensure that the resulting mutex LRU list implementation supports this behavior correctly. It is not necessary for this to work with the sequential version.
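The general shape of waiting on a condition variable is sketched below, using a bare counter instead of the list. The water-mark values, the function names, the shutting_down flag, and the use of two separate condition variables are assumptions for the example. The key points are that the predicate is re-checked in a while loop after every wakeup, and that shutdown broadcasts so that every sleeper gets a chance to leave.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define SKETCH_LOW 2  /* hypothetical stand-in for LOW_WATER_MARK */
#define SKETCH_HIGH 4 /* hypothetical stand-in for HIGH_WATER_MARK */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t below_high = PTHREAD_COND_INITIALIZER; /* count < SKETCH_HIGH */
static pthread_cond_t above_low = PTHREAD_COND_INITIALIZER;  /* count > SKETCH_LOW */
static int count = 0;
static bool shutting_down = false;

/* Add one element, sleeping while the count is at the high mark.
 * Returns false if shutdown was requested instead. */
bool add_element(void)
{
    pthread_mutex_lock(&lock);
    while (count >= SKETCH_HIGH && !shutting_down)
        pthread_cond_wait(&below_high, &lock); /* releases lock while asleep */
    if (shutting_down) {
        pthread_mutex_unlock(&lock);
        return false;
    }
    count++;
    if (count > SKETCH_LOW)
        pthread_cond_signal(&above_low);
    pthread_mutex_unlock(&lock);
    return true;
}

/* Remove one element, sleeping until the count is above the low mark. */
bool remove_element(void)
{
    pthread_mutex_lock(&lock);
    while (count <= SKETCH_LOW && !shutting_down)
        pthread_cond_wait(&above_low, &lock);
    if (shutting_down) {
        pthread_mutex_unlock(&lock);
        return false;
    }
    count--;
    assert(count >= SKETCH_LOW);
    if (count < SKETCH_HIGH)
        pthread_cond_signal(&below_high);
    pthread_mutex_unlock(&lock);
    return true;
}

/* Wake every sleeper so it can observe the flag and return. */
void request_shutdown(void)
{
    pthread_mutex_lock(&lock);
    shutting_down = true;
    pthread_cond_broadcast(&below_high);
    pthread_cond_broadcast(&above_low);
    pthread_mutex_unlock(&lock);
}
```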
Exercise 2. (8 points)
Add condition variable support to the mutex LRU list, using a condition variable to block reference and clean, as described above.
You also need to write at least two additional unit tests (e.g., under self_tests() or another test function) that ensure that the node count is being properly maintained. Be sure not to break the sequential version, and be sure this continues to work as you complete the next exercises.
You can improve concurrency by supporting a single lock per node, allowing threads to execute concurrently in different parts of the list.
Exercise 3. (10 points)
Implement a list that uses fine-grained locking in fine-lru.c.
In other words, every node in the list should have its own lock (note the change in the node definition in this file).
What makes fine-grained locking tricky is ensuring that you cannot deadlock while acquiring locks, and that a thread doesn't hold unnecessary locks. Be sure to document your locking protocol in your README.
Be sure to include support for condition variables checking the high and low water mark. Here, you can use one global lock to protect the count and to manage
the condition variables, but do NOT hold this lock for the entire list traversal, or you will not get credit for this exercise. Further note
that it is ok for count to be a bit stale --- e.g., you can drop the global lock, do an insert, and then reacquire the global lock to increment count and possibly signal
other threads; this window where count is out of sync with the list is ok, as long as the end result works out.
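One common fine-grained protocol is hand-over-hand locking (also called lock coupling), in the style covered by Herlihy and Shavit. The sketch below shows only a lookup, under several assumptions: the node layout and names are hypothetical, a sentinel head node is used so there is always a first lock to take, and removal, the count, and the water marks are left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct fnode { /* hypothetical node with a per-node lock */
    int key;
    int refcount;
    struct fnode *next;
    pthread_mutex_t lock; /* protects refcount and next of this node */
};

/* Sentinel head: never removed, so traversals always have a starting lock. */
static struct fnode sentinel = {
    .key = -1, .refcount = 0, .next = NULL,
    .lock = PTHREAD_MUTEX_INITIALIZER,
};

/* Insert right after the sentinel; only the predecessor's lock is needed. */
int push_front(int key)
{
    struct fnode *n = malloc(sizeof(*n));
    if (n == NULL)
        return 0;
    n->key = key;
    n->refcount = 1;
    pthread_mutex_init(&n->lock, NULL);
    pthread_mutex_lock(&sentinel.lock);
    n->next = sentinel.next;
    sentinel.next = n;
    pthread_mutex_unlock(&sentinel.lock);
    return 1;
}

/* Returns the new reference count if the key is present, 0 otherwise.
 * Locks are always taken in list order, so two traversals cannot deadlock. */
int bump_if_present(int key)
{
    struct fnode *pred = &sentinel;
    pthread_mutex_lock(&pred->lock);
    struct fnode *curr = pred->next;
    while (curr != NULL) {
        pthread_mutex_lock(&curr->lock); /* take the next lock first... */
        if (curr->key == key) {
            assert(curr->refcount > 0);
            int result = ++curr->refcount;
            pthread_mutex_unlock(&curr->lock);
            pthread_mutex_unlock(&pred->lock);
            return result;
        }
        pthread_mutex_unlock(&pred->lock); /* ...then let go of the previous one */
        pred = curr;
        curr = curr->next; /* safe: we hold the lock on the node that owns this link */
    }
    pthread_mutex_unlock(&pred->lock);
    return 0;
}
```

At most two locks are held at any time, and holding the predecessor's lock while examining a node is what would let a remover safely unlink that node.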
Challenge! (up to 5 points, depending on how elegant and correct your solution is)
Replace the list with a concurrent skiplist.
Read this article (or search the web) to learn more about skip lists.
To complete this challenge, implement a skiplist to replace the current next pointer in each list node.
Be careful to use the thread-safe pseudo-random number generator.
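As an illustration of why the generator matters, a skip list typically picks each new node's height by repeated coin flips. The sketch below assumes the POSIX rand_r() function with a per-thread seed is an acceptable thread-safe generator; MAX_LEVEL and random_level are hypothetical names, and the lab may intend a different generator.

```c
#define _POSIX_C_SOURCE 200809L /* exposes rand_r() under strict compiler modes */
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVEL 16 /* hypothetical cap on the height of a node */

/* Promote the node one level per "heads", stopping at the first "tails".
 * The seed is caller-owned state, so threads with separate seeds never race. */
int random_level(unsigned int *seed)
{
    int level = 1;
    while (level < MAX_LEVEL && rand_r(seed) < RAND_MAX / 2)
        level++;
    assert(level >= 1 && level <= MAX_LEVEL);
    return level;
}
```

By contrast, plain rand() keeps hidden global state and is not required to be thread-safe.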
Challenge! (up to 40 points, depending on how elegant and correct your solution is)
Important. This challenge is extremely hard, so do not start this before your lab is otherwise complete.
Getting all 40 points will require substantial documentation demonstrating the correctness of your implementation.
Read-copy update is an alternative to reader/writer locking
that can be more efficient for read-mostly data structures.
The key idea is that readers follow a carefully designed path through the data structure without holding any locks,
and writers are careful about how they update the data structure. Writers still use a lock to mutually exclude each other.
Read enough from the link above, or other sources, to learn how RCU works.
Task: Create a list variant (rcu-lru.c) which uses RCU instead of a reader-writer lock. You may use a 3rd party RCU implementation for the helper functions (e.g., rcu_read_lock), or write your own. You should write the rcu-lru yourself.
We highly recommend keeping this in a separate source file from your main assignment.
Note: RCU requires memory barriers in some cases, even for readers, so be sure you understand where these need to be placed.
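The publication half of the idea can be sketched with C11 atomics, as below. This is only an illustration of where ordering matters, not an RCU implementation: removal, grace periods, and memory reclamation are omitted entirely. All names are hypothetical, and acquire loads are used where a real RCU library would typically rely on a cheaper dependency-ordered read.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct rnode { /* hypothetical node; links are atomic so readers need no lock */
    int key;
    struct rnode *_Atomic next;
};

static struct rnode *_Atomic rcu_head = NULL;
static pthread_mutex_t writer_lock = PTHREAD_MUTEX_INITIALIZER; /* writers only */

/* Writer: fully initialize the node, then publish it with a release store,
 * so a reader that sees the pointer also sees the initialized fields. */
int rcu_insert(int key)
{
    struct rnode *n = malloc(sizeof(*n));
    if (n == NULL)
        return 0;
    n->key = key;
    pthread_mutex_lock(&writer_lock);
    atomic_init(&n->next, atomic_load_explicit(&rcu_head, memory_order_relaxed));
    atomic_store_explicit(&rcu_head, n, memory_order_release);
    assert(atomic_load_explicit(&rcu_head, memory_order_relaxed) == n);
    pthread_mutex_unlock(&writer_lock);
    return 1;
}

/* Reader: takes no lock; each acquire load pairs with a writer's release store. */
int rcu_contains(int key)
{
    struct rnode *n = atomic_load_explicit(&rcu_head, memory_order_acquire);
    while (n != NULL) {
        if (n->key == key)
            return 1;
        n = atomic_load_explicit(&n->next, memory_order_acquire);
    }
    return 0;
}
```

The hard part of the challenge is exactly what this sketch leaves out: deciding when a removed node can be freed without a reader still holding a pointer to it.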
To be sure your code is very clean, it must compile with make without any errors or warnings!
If the various sources you use require common definitions, then do not duplicate the definitions. Make use of C's code-sharing facilities.
You must include a README file with this and any assignment. The README file should describe what you did, what approach you took, results of any measurements you made, which files are included in your submission and what they are for, etc. Feel free to include any other information you think is helpful to us in this README; it can only help your grade.
You will be handing in the code via Gradescope. You should have been added to the class; if not, please contact the instructors as soon as possible. If you work in a team, you should only submit one copy of your code in Gradescope; you can add your teammates to the handin. We recommend handing in directly from your github repository to the assignment.
Note: No/limited autograding. Grading for this assignment will be primarily manual. We may add an autograder to the assignment for part of the points, as time allows. In general, because all interleavings must be correct for a multi-threaded program to be correct, this requires manual grading by more seasoned programmers (i.e., course staff).
You may hand in more than once and we will take either the most recent or the one you designate as your submission, applying lateness penalties as appropriate (out-of-band).
In the event of any discrepancies with the autograder environment, programs will be tested on comp530fa20.cs.unc.edu. All programs, unless otherwise specified, should be written to execute in the current working directory. Your correctness grade will be based solely on your program's performance on comp530fa20.cs.unc.edu. Make sure your programs work on comp530fa20!
Generally, unless the homework assignment specifies otherwise, you should compile your program using the provided Makefile (e.g., by just typing make on the console). Do not add any special command line arguments ("flags") or compiler options to the Makefile.
Note: We do not have an automated way to calculate late penalties. These will be applied manually at the end of the semester.
The program should be neatly formatted (i.e., easy to read) and well-documented. The Style Guide gives additional guidance on lab code style.
Make sure you put your name(s) in a header comment in every file you submit.
If you complete any challenge problems, please describe the solution and how to demonstrate it in challenge.txt.
Note that we have a separate submission option in Gradescope for submitting challenge problems, in order to accommodate later submissions without charging late hours.
Even if you are submitting on time, please submit your challenge problems a second time through the appropriate challenge assignment --- we will only manually
grade assignments handed in through the challenge option.
This completes the lab.
Portions of the thread programming guidelines are adapted from undergraduate OS course guidelines created by Mike Dahlin and Lorenzo Alvisi.
Last updated: 2025-04-21 12:06:41 -0400