Some General Concepts

Next: Unix File Management Up: Physical Representation Previous: Disks

Some General Concepts

File Sizes

An important property of files, used in the physical representation of files, is that most files are small while some are very large. A study done at Los Almos shows that half the files were within 40Kbytes, whereas many reached 10M bytes or so. Another study at UW-Madison showed that 42 percent of all files fit within one disk block, and 73 percent fit within 5 disk blocks.

File System Blocks and Fragments

We shall distinguish between file system blocks and disk blocks discussed earlier. A disk block is the data in one sector while a file block is data in one or more consecutive sectors. Thus a file system block size is a multiple of a disk block size.

Making a file system block different from a disk block gives the operating system some control on the unit of data transfer. A large (file system) block size allows greater utilization of the disk bandwidth, while a small block size allows lesser internal fragmentation. In view of the fact that there are both some very large files that may be read sequentially, and some very small files of 5 or more block sizes, it is important to optimize both.

To achieve the above purpose, some systems support `large' and small `blocks'. In particular, 4.2BSD divides a file system block into 2, 4, or 8 fragments, each of which is addressable. Each fragment in turn is made up of one or more disk blocks. The last part of a file that does not fit into a file block is stored in one or more fragments. This scheme reduces the internal fragmentation while allowing large throughput.

One problem with the above scheme is that it may involve excessive copying of data. As a file grows, data in fragments may need to be copied to blocks to ensure that only the last parts of the file are in fragments. This problem may be particularly severe when a process sequentially writes large amounts of data into a file. To reduce this problem, an operating system may make the file system block size available to applications, so that they can make units of data transfer multiples of these sizes.

In the rest of the discussion we shall use the term block to refer to a file system block. Moreover, unless necessary, we shall not distinguish between a block and a fragment.

Reducing Latency

It is important to allocate blocks in such a way that both rotational and seek latency are reduced. Rotational latency is reduced by keeping consecutive blocks of a file in either adjacent in a cylinder or a few sectors apart. The former alternative is chosen if a large number of blocks are transferred without processor intervention. The latter alternative is chosen if each transfer generates an interrupt. It allows the R/W head to be positioned on a block when that block is needed. The number of sectors between adjacent file blocks is based on the expected time between transfer requests and is a function of processor speed.

Seek latency is reduced through clustering, which keeps the data in a file confined to nearby cylinders. We shall study below techniques to do such allocation.

Logical Disks

A physical disk may be divided into several logical disks. These define independent file systems in the sense that a file cannot span different logical disks. Logical disks can be structured to promote clustering. They also allow administrators to segregate files pertaining to different projects and to limit the amount of disk space each project uses. Finally, they support different block sizes, thus allowing logical disks to be associated with applications with similar characteristics.

Cylinder Groups

A logical disk may be further divided into cylinder groups, which are adjacent cylinders. Unlike logical disks, cylinder groups are not visible to a user. Thus the files of a directory can span different cylinder groups.

The operating system tries to keep the blocks of a file in the same cylinder group. However, it is dangerous to do so consistently, However, files that grow may have their parts scattered in different cylinder groups. (why?) Thus in Unix 4.2BSD a file is assigned a new cylinder group when a file exceeds 48 kilobytes, and every 1 megabyte thereafter. The newly chosen cylinder group is selected from those cylinder groups that have a greater than average number of free blocks left. Also, the system keeps a certain percentage of a logical disk free to reduce the scattering of data.

Managing Free Space

The free space may be managed by keeping either a linked list of free blocks or a bit map containing a bit for each block which indicates if the bit is allocated or not. A bit map wastes space, but allows clustering.

Static Allocation vs Dynamic Allocation

So far we have assumed that files can grow dynamically. Some systems force a user to specify the size of the file when it is specified. Thus a file may be allocated a contiguous chunk. The disadvantages of static allocation is that it forces the user to provide an estimate and may cause external fragmentation/compaction. The advantage is that it is easy to implement.

Caching

Some of the information on disk is kept in memory through caching. Thus, when a process writes data to a file, the operating system may transfer the data to a buffer without writing to the disk. This reduces disk transfers if a process attempts to read or write the buffer before it is flushed to the disk. Caching is also useful when applied to directories that are looked-up often.

Unfortunately, caching causes data on disk to be out of step with data in memory. Thus it leads to loss of data if a system crash occurs.

Disk-Head Scheduling

The CPU can generate disk I/O requests much faster than the disk hardware can service them. Therefore, at any time there are several requests pending. We can impose an order on the outstanding disk requests so that they can be serviced efficiently. We consider below different disk-head scheduling policies.

The simplest policy is to order disk requests first come, first served (FCFS). This policy is fair; however, every request is likely to result in a seek. This method works well for light loads, but saturates quickly.

An alternative is to apply the shortest seek-latency first (SSF) policy, which next serves the request whose track is closest to the current one. This is the best policy for minimizing the total seek latency. However, it is less fair than the previous policy in that it makes some requests wait much longer than others. Thus the variance of wait time is large.

A compromise is to adopt the ELEVATOR policy, which uses the strategy adopted by an elevator in servicing requests. Under this policy, the disk head is either in an inward-seeking phase or an outward-seeking phase at any time. While moving inwards or outwards, the head picks up any requests that it passes. It changes direction when there are no more requests ahead.

Under ELEVATOR, a request may have to wait as long as two sweeps until it is serviced. Therefore a circular variant of it is used under high loads to improve the variation in wait times. Under this scheme the head always moves inwards. When the last seek has been performed in that direction, a single seek is performed to the outermost track with a pending request, without picking any requests on the way out. Thus a request never waits for more than one sweep.

Next: Unix File Management Up: Physical Representation Previous: Disks

Prasun Dewan
Wed Apr 21 11:44:11 EDT 1999