Naming

There are several issues in naming files. We discuss some of these below.

Flat vs Hierarchical

An operating system may provide a flat or a hierarchical name space. A flat name space is similar to a global scope in FORTRAN. Thus, just as two variables with the same name `i' cannot be created in FORTRAN, two files with the same name `foo' cannot be created in a flat name space. The Exec-8 system for the Univac provides such a name space. (It alleviates some of the problems of a flat name space by dividing a name into three components, and defaults the first component to the user's id.)

A hierarchical name space is similar to nested scopes in Pascal. A file system calls these nested scopes directories, which may themselves be named. (Can programming language scopes be named?) Now each file (or directory) has a local name, and a full name, which is the full name of the directory in which it is contained followed by the local name of the file. No two files in the same directory may have the same local name. However, files in different directories may have the same local name.
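The rule for forming full names can be sketched in a few lines of Python. This is only a toy in-memory model, not how any particular file system stores directories; the class names Directory and File and the sample names under /u2 are assumptions made for illustration.

```python
class Directory:
    """A named scope that maps local names to files or subdirectories."""

    def __init__(self, local_name, parent=None):
        self.local_name = local_name
        self.parent = parent          # None means this is the root
        self.entries = {}             # local name -> Directory or File

    def full_name(self):
        # The root is "/"; every other directory appends its local name
        # to the full name of the directory that contains it.
        if self.parent is None:
            return "/"
        return self.parent.full_name().rstrip("/") + "/" + self.local_name

    def add(self, entry):
        # Local names must be unique within one directory only.
        if entry.local_name in self.entries:
            raise FileExistsError(entry.local_name)
        entry.parent = self
        self.entries[entry.local_name] = entry
        return entry


class File:
    def __init__(self, local_name):
        self.local_name = local_name
        self.parent = None

    def full_name(self):
        return self.parent.full_name().rstrip("/") + "/" + self.local_name


root = Directory("")
u2 = root.add(Directory("u2"))
joe = u2.add(Directory("joe"))
ann = u2.add(Directory("ann"))
joe_foo = joe.add(File("foo"))
ann_foo = ann.add(File("foo"))   # same local name, different directory

print(joe_foo.full_name())
print(ann_foo.full_name())
```

Adding a second `foo` to `joe` raises FileExistsError, while the two `foo` files in different directories coexist because their full names differ. A flat name space corresponds to having only the root directory.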

A process may name a file by its full name or by a name relative to a special directory called the working directory of the process. For instance, in Unix, if the full name of the working directory is `/u2/joe', then the names `/u2/joe/foo' and `foo' refer to the same file. A leading `/' distinguishes full names from relative names. A special operation is provided to change the current working directory. In Unix the working directory has the special relative name "." and its ancestors have the names "..", "../.." and so on.
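The translation from a relative name to a full name can be sketched as follows. The function name full_name is hypothetical, and the sketch works purely on the text of the names: it assumes no component is an alias or indirect file, which a real system would have to check by consulting the directories.

```python
import posixpath


def full_name(name, working_directory):
    """Return the full name denoted by `name` for a process whose
    working directory has the full name `working_directory`."""
    if name.startswith("/"):
        candidate = name                      # already a full name
    else:
        candidate = posixpath.join(working_directory, name)
    # normpath folds "." and ".." components lexically.
    return posixpath.normpath(candidate)


print(full_name("foo", "/u2/joe"))
print(full_name("/u2/joe/foo", "/u2/joe"))
print(full_name("..", "/u2/joe"))
```

With the working directory `/u2/joe', the first two calls yield the same full name, and ".." yields the parent directory `/u2'.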

A directory is more than a naming tool. An operating system defines special operations on a directory that give the names and other attributes of the files contained in the directory. These attributes include information such as creation date, owner, size, and so on.
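As an illustration, the following sketch lists a directory together with a few attributes of each entry, using Python's standard os.scandir call on a Unix-like system. The helper name list_directory and the sample file are assumptions. Note that it reports the modification time, since traditional Unix file systems do not record a creation date.

```python
import os
import tempfile
import time


def list_directory(path):
    """Return (name, size in bytes, owner uid, modification time) per entry."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            info = entry.stat(follow_symlinks=False)
            rows.append((entry.name, info.st_size, info.st_uid, info.st_mtime))
    return sorted(rows)


# Build a scratch directory with one file so the listing is predictable.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "foo"), "w") as f:
        f.write("hello")
    listing = list_directory(d)

for name, size, uid, mtime in listing:
    print(name, size, uid, time.ctime(mtime))
```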

The decision to use directories both as parts of hierarchical names and as collections of attributed files is a convenient one. However, the two functions of providing hierarchical names and collections of files are separable. Assume that files of remote computers can be accessed. Then a name of the form "jeeves.cs.unc.edu:/foo" is hierarchical even though "jeeves" is a machine name and not a directory name. Similarly, systems that provide flat name spaces also provide a mechanism to examine the name and other attributes of the files in the system.

Devices in Name Space

Should device names be parts of the name space of files? If files and devices share the same operations, then yes. In this situation certain file names are reserved, such as (in Unix):

/dev/console
/dev/floppy
/dev/ttyia
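A small sketch of what sharing operations means in practice, assuming a Unix-like system. It uses /dev/null (a device that discards whatever is written to it) rather than the names listed above, because that device is normally present and writable by any user.

```python
import os
import stat

# os.devnull is "/dev/null" on Unix.
with open(os.devnull, "w") as device:
    # The same open/write calls that are used for ordinary files.
    written = device.write("discarded\n")

# The attributes reveal that the name denotes a device, not a disk file.
is_char_device = stat.S_ISCHR(os.stat(os.devnull).st_mode)

print(written, is_char_device)
```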

Aliases

Operating systems such as Unix allow a file to be referred to by several aliases. Thus the names "/paper1/references" and "/paper2/references" may refer to the same file. Aliasing allows use of shorter relative names for files that do not belong to one single directory. For instance, while in "paper2" a user may use the shorter name "references", rather than the longer name "/paper1/references". To make a new alias of a file, a special call of the form

alias (old name, new name)

is provided. It is important to note that the call does not create a new copy of the file but only provides a new name for it.

Aliasing raises two interesting issues related to deletion and accounting:

Deletion: If the file is deleted under one alias, should it disappear under all aliases? If so, a search is required to find all aliases of a file. To make this search efficient, all aliases of the file could be stored in a central place, perhaps with the contents of the file. If not, then a reference count is stored with each file, and deleting an alias decrements the count and removes the file only if the count goes to zero.

Accounting: If two different users have access to the same file, who should be charged for the file space it occupies? We could charge the owner of the file, that is, the user who first created it. However, this is not fair, especially if the owner has deleted the file under his or her own alias and it survives only because of other users' aliases. A better alternative is to divide the charge among all directories that hold an alias of the file.
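Unix hard links follow the reference-count approach to deletion. The sketch below, which assumes a Unix-like system and a scratch directory, uses os.link in the role of the alias call and reads the count from the st_nlink attribute. The directory names mirror the example above; the file contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    paper1 = os.path.join(root, "paper1")
    paper2 = os.path.join(root, "paper2")
    os.mkdir(paper1)
    os.mkdir(paper2)

    old_name = os.path.join(paper1, "references")
    new_name = os.path.join(paper2, "references")
    with open(old_name, "w") as f:
        f.write("reference list\n")

    os.link(old_name, new_name)          # plays the role of alias(old, new)
    same_file = os.path.samefile(old_name, new_name)   # one file, two names
    count_with_alias = os.stat(old_name).st_nlink

    os.remove(old_name)                  # delete under one alias only
    count_after_delete = os.stat(new_name).st_nlink
    with open(new_name) as f:
        surviving_contents = f.read()    # the file is still there

print(same_file, count_with_alias, count_after_delete)
```

Removing the first name only decrements the count; the contents remain reachable through the second name until it too is removed.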

Indirect Files

Berkeley 4.2 Unix provides another way to have multiple (full) names for the same file through indirect files. An indirect file is a file that contains nothing but the name (full name or relative) of another file. A special call of the form

Indirect(F, I)

may be used to create a new indirect file for F. Any attempt to open the indirect file will open F instead.

Indirect files also raise several issues including deletion and accounting:

Deletion: Deleting an indirect file I does not delete F. Deleting F makes I useless. It is then illegal to open I, though it remains in the directory. If a file named F is created again, I is again usable. (Why is this feature useful?)

Accounting: If an indirect file I refers to F, then the owner of I pays only for the minuscule amount of storage needed to store the name F. The owner of F pays for all of the space occupied by F.

Multiple Indirection: Can indirect files name other indirect files? If so, then there is a danger of infinite recursion, as shown below:

I1 -> I2 -> I1

(Does such a danger exist with aliasing?) To guard against this recursion, the length of an indirect file chain may be limited to a small number such as 5.

Interpretation of Relative Names: With respect to what directory should relative names in indirect files be understood? Two alternatives are: 1) the working directory of the process, and 2) the directory in which the indirect file is stored. (Going back to the analogy between variables and files, the alternatives essentially correspond to `dynamic' binding and `static' binding, respectively.) Unix chooses the second alternative.
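The behaviour described in the items above can be observed with Unix symbolic links, which this sketch assumes are the realization of indirect files on a present-day Unix-like system. os.symlink plays the role of Indirect(F, I); the scratch directory `docs' and the file contents are made up for illustration.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    docs = os.path.join(root, "docs")
    os.mkdir(docs)
    F = os.path.join(docs, "F")
    I = os.path.join(docs, "I")
    with open(F, "w") as f:
        f.write("contents of F\n")

    os.symlink("F", I)            # I stores only the relative name "F"

    # The working directory is not docs, yet opening I reaches docs/F:
    # the stored name is interpreted relative to the directory holding I.
    with open(I) as f:
        via_indirect = f.read()

    os.remove(F)                              # deleting F ...
    entry_remains = os.path.lexists(I)        # ... leaves I in the directory
    opens_while_dangling = os.path.exists(I)  # ... but I can no longer be opened

    with open(F, "w") as f:                   # create a file named F again
        f.write("new F\n")
    usable_again = os.path.exists(I)

    # I1 -> I2 -> I1: the system bounds the chain and reports an error.
    I1 = os.path.join(docs, "I1")
    I2 = os.path.join(docs, "I2")
    os.symlink("I2", I1)
    os.symlink("I1", I2)
    try:
        open(I1).close()
        loop_error = None
    except OSError as e:
        loop_error = e.errno

print(entry_remains, opens_while_dangling, usable_again, loop_error == errno.ELOOP)
```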

Naming by Users vs Processes

The naming scheme discussed here is the one used by processes when they make service calls that specify file names. A user names a file via some process such as a command interpreter, which may embellish or replace the naming scheme provided to it by the operating system. For instance, a command interpreter might allow a `*' in the file name to match any string.
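A minimal sketch of such an embellishment, written with Python's standard fnmatch module. The function name expand and the sample file names are hypothetical; a real command interpreter would read the names from the directory and applies further rules that are omitted here.

```python
import fnmatch


def expand(pattern, names_in_directory):
    """Return, in sorted order, the names that `pattern` denotes.
    A `*' in the pattern matches any string, including the empty one."""
    return sorted(
        name for name in names_in_directory
        if fnmatch.fnmatchcase(name, pattern)
    )


names = ["main.c", "util.c", "notes.txt", "foo", "foobar"]
print(expand("*.c", names))
print(expand("foo*", names))
```

The operating system never sees the `*': the interpreter replaces the pattern with the matching names before making any service call.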

A danger with allowing embellishment of the naming scheme by different processes is that a user may not see a single naming scheme. It would be useful if an operating system provided a naming scheme that does not need any embellishment for users and can be offered unaltered by all processes. However, this is not always possible. Some systems provide a special program that allows a user to point at the name of a file in a directory listing instead of typing it. Such a program may allow two different files in a directory to have the same name, as shown below:

bar
foo
foo

There is no ambiguity since a user points at the desired file name. Such a naming scheme cannot be supported by the operating system since a service call cannot `point' at a file entry.

Prasun Dewan
Wed Apr 21 11:44:11 EDT 1999