COMP 530: Lab 4: Disk Performance Analysis

Due on Friday, December 7, 2016, 11:59 PM
Note: You may use all of your remaining late hours on this lab, including after the deadline.

Introduction

In this lab, you will measure the performance charactersitics of a hard disk drive (HDD) and a solid state drive (SSD). You will write a few simple micro-benchmarks, run them on a real test system, plot the results, and write a brief analysis of what you learned from the results. You will hand-in a write-up as a pdf---a few pages should be sufficient---as well as the code to run your tests. You are welcome to use existing unix tools, or write your own; if you write your own, they should only require a few tens or hundreds of lines of code. In either case, please hand in a script that runs all of your tests, so the TAs can understand precisely how you obtained the numbers you did.

Writing Microbenchmarks

The first task you will need to do is write several microbenchmarks. A microbenchmark is a simple measurement utility for different types of operations. Here, you will write one or more simple utilities (we recommend in C), that issue different I/O patterns to a disk. Note that Linux exposes disks to users as if they were simply a file (e.g., /dev/sda), so you can write this utility to use familiar file system interfaces (open, read, write, and friends). We do recommend you parameterize this utility to take different parameters and arguments (see Lab 3 for an example of how one can use getopt to simplify command-line parameter processing).

The primary purpose is to understand the performance sensitivity of both types of devices to different I/O patterns. Thus, you will also need to be able to create a range of different I/O types.

Exercise 1. (20 points) Your first task is to write one or more microbenchmark utilities that can measure the time to issue different I/O patterns, as follows:

I/O Size: Sequentially write a range of logical blocks up to 1 GB, in different granularities. Measure the throughput (i.e., 1 GB divided by the execution time of the writes). The variable to change here is the size of each I/O. The smallest sensible size is 4KB (one logical block). So, you might write all 1GB in a series of 4KB writes, a series of 8KB writes, etc., up to a maximum of 100 MB.
I/O Stride: Extend the I/O Size experiment to also leave a configurable amount of space between each I/O request in the sequential write. The stride should range between 4KB and 100 MB, and be independent of the I/O size.
Random I/Os: Your microbenchmark framework should support a mode that can issue random writes within a range of LBAs. This experiment should be able to set the range of acceptable offsets on the device, as well as a granularity, again ranging from 4KB to 100 MB.
Read vs. Write: For all of the tests above, you should also be able to issue the same I/O patterns as reads or writes.

Hint: You can test your framework for functional correctness by running these tests on a file on any system. You only need to use the disk to collect performance measurements. For easier testing, I would recommend that you pass the device name as a command-line parameter. I would also write some unit tests that ensure you are really writing the patterns you think you are writing, say by re-reading the file after a test case.

Hint: To reduce the effects of OS-level caching, open the device with O_DIRECT, and be sure to issue an fsync at the end of each test that involves a write (so you are actually measuring the time to write everything to disk, not just to cache).

Data collection logistics

You will use gwion.cs.unc.edu to run these tests. You will use the /dev/sdb1 partition for the HDD experiments, and /dev/sda2 for the SSD experiments. Both should be world writable and filled with random data.

In order to avoid interference with one another, we request that you use google calendar to reserve a slot for exclusive use of these disks. You can access the calendar using your UNC CS account, at this url (or just create an event in google calendar using your UNC CS account, and invite gwion). Before you run any experiments on gwion, please check the calendar to avoid interfering with anyone else's tests. If no one has a reservation, you may run tests in "non-exclusive" mode (i.e., with the understanding someone else may be doing things, but you can make sure everything works properly on gwion or get some preliminary data this way).

The fact that we are sharing a single machine means you will need to plan ahead a bit, and that you will need to reserve a slot enough in advance of the deadline that you can have time to analyze the results, possibly run more experiments, collect more data if needed, and have time to write up the results before the deadline.

Getting enough samples

One important piece of good scientific methodology is collecting enough samples that you have some confidence your measurements are representative of the real average. Even in computer systems, there is some degree of natural variation (often moreso in the real world). If you run only one experiment, you may have come across a rare outlier.

To keep this assignment tractible, we are going to require that each reported measurement is the mean of at least 5 runs. Be sure to keep data for each run, not just a mean, as you will also need to plot a confidence interval (more on this below).

In deciding which points to test for each of the above experiments, for each variable to test, we suggest you start with the minimum, midpoint, and maximum. From here, you can add midpoints (think binary search) until the "shape" of the graph becomes clear. You may use some judgment as to when you can skip points, but you should expect several of the lines to have a "knee" in the curve; points around the "knee" will be of particular interest. At a minimum, there should be at least 16 data points collected, evenly distributed over the variable space.

For your own edification, the correct way to determine how many samples you need is to use a statistical method like the Student's T-Test (NB: Check out the history of how this was developed), which uses an assumption about the expected distribution of samples and the measured variance to determine whether additional samples are needed.

Data collection and plots

This section will describe the specific experiments you should run, as well as how you should plot these data points. You may use any graphing software you like. Excel is probably simplest, although feel free to use other tools. Prof. Porter is a fan of Ploticus, but there is a significant learning curve that is only worthwhile if you plan to use these tools again in the future. Some of his students have found that the R language has some useful graphing tools.

Exercise 2. (30 points) Collect data and graph the following experiments. Draw line graphs, with each point indicating a mean, and including 95% confidence intervals as error bars for each point. Unless otherwise indicated, all graphs report throughput on the y-axis. You can calculate throughput as: total bytes read or written, divided by the completion time of the benchmark (excluding any setup costs, but definitely including the time to do an fsync on a write test).

For each of these tests, include a plot for reads and a plot for writes. There should be a complete set of all plots for the HDD and for the SSD.

I/O Size: The x-axis (variable under test) should be the I/O granularity, ranging from 4KB to 100MB.
I/O Stride: The x-axis should be the size of the stride (or "gap" between I/Os). You should plot lines for a few different I/O sizes (at least 5, ranging from 4KB to 10MB), to determine whether the trend is consistent or not.
Random I/Os: The x-axis should be the I/O granularity, again ranging from 4KB to 100 MB. You should plot this for a range of 1 GB worth of logical block addresses.

Please also create a script that executes your framework with all of these parameters, such that a single command (like ./myscript.sh will generate all of the expected data). This is useful for reproducing your results (in our case, for grading, but, if you continue doing any science in your future career, so that someone else can understand and replicate your experiment).

Analysis

The final step of this project is to do a short write up (1-2 pp of text, plus graphs) should be sufficient. The write-up is open-ended, but what we are looking for here is some analysis and interpretation of these graphs.

Exercise 3. (15 points) Write 1--2 pp of text analyzing each graph and the trends, as discussed above.

Some questions to consider in your write-up: What can you learn from these experiments? Is there an "optimal" I/O size or pattern? If you were designing a file system (or an application that does a lot of file I/Os), what lessons can you draw from these graphs. How do these results vary for reads vs. writes, or for HDDs vs. SSDs?

Hand-In Procedure

For all programming assignments you will "turn in" your program for grading by placing it in a special directory on a Department of Computer Science Linux machine and sending mail to the TA alias (comp530ta-f16 at cs dot unc dot edu). To ensure that we can grade your assignments in an efficient and timely fashion, please follow the following guidelines precisely. Failure to do so will potentially result in your assignment receiving a failing score for this assignment.

As before, create a directory named lab4 (inside your ~/comp530/submissions directory). Note that Linux file names are case sensitive and hence case matters!

When you have completed your assignment you should put your program and any other necessary parts (header files, etc.) in the specified subdirectory and send mail to the TA (comp530ta-f16 at cs dot unc dot edu), indicating that the program is ready for grading. Be sure to use the following subject line "COMP530 Lab 4 your_cs_userid partner_cs_userid" In this email, indicate if you worked alone or with a partner (and both team members' names and CS login names, as many of you have emails different from your CS login); be sure to cc your partner on this email. If you used any late hours, please also indicate both how many late hours you used on this assignment, and in total, so that we can agree.

After you send this email, do not change any of your files for this assignment after sending this mail (unless you are re-turning in late, as prescribed by the lateness policy).! If the timestamps on the files change you wi ll be penalized for turning in a late assignment. If your program has a timestamp after the due date it will be considered late. If you wish to keep fiddling with your program after you submit it, you should make a copy of your program and work on the copy and should not modify the original.

All programs will be tested on gwion.cs.unc.edu. All programs, unless otherwise specified, should be written to execute in the current working directory. Your correctness grade will be based solely on your program's behavior on gwion.cs.unc.edu. Make sure your programs work and have performance consistent with your write-up on gwion!

Functional correctness will be based on your implementation's ability to correctly create the expected I/O patterns.

The program should be neatly formatted (i.e., easy to read) and well-documented. In general, 75% of your grade for a program will be for correctness, 25% for "programming style" (appropriate use of language features [constants, loops, conditionals, etc.], including variable/procedure/class names), and documentation (descriptions of functions, general comments [problem description, solution approach], use of invariants, pre-and post conditions where appropriate).

Make sure you put your name(s) in a header comment in every file you submit. Make sure you also put an Honor pledge in every file you submit.