Index file format

First, an example.

GISTIDX-01
head {
  pagesize 1000
}
00001AF0
00001CD1
00002008
foot {
}

The index file consists of a magic number, a header, a series of keys, and a footer. These four segments are delimited by one or more whitespace characters.

Magic Number

The first ten bytes of the file are "the magic number": the string GISTIDX-01. The numeral 01 will be incremented when the file format is changed substantially.

Header and Footer

The header and footer have the same structure as in the log file format. The only variable defined for the index file appears in the head: pagesize, the number of event records in a page. (In practice, the pagesize is likely to be a power of 2.)

Keys

Each line holds a single key: a hexadecimal timestamp in the same form as the third field of an event record in the log file.
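
As an illustration, a reader for this layout might be sketched in Python as follows. This is only a sketch under stated assumptions: the names parse_index and _read_block are hypothetical, and because the head and foot structure is defined in the log file format rather than here, the code assumes each block looks like the example above (the block name, an opening brace, whitespace-separated name/value pairs, and a closing brace) and that pagesize is written in decimal.

```python
MAGIC = "GISTIDX-01"


def _read_block(tokens, pos, name):
    """Read 'name { key value ... }' starting at tokens[pos].

    Assumption: blocks follow the layout of the example file only.
    Returns the fields as a dict and the position after the closing brace.
    """
    if tokens[pos:pos + 2] != [name, "{"]:
        raise ValueError(f"expected '{name} {{'")
    pos += 2
    fields = {}
    while pos < len(tokens) and tokens[pos] != "}":
        fields[tokens[pos]] = tokens[pos + 1]
        pos += 2
    if pos >= len(tokens):
        raise ValueError(f"unterminated {name} block")
    return fields, pos + 1


def parse_index(text):
    """Return (pagesize, keys) from the text of an index file."""
    # The four segments are separated by whitespace, so split on it.
    tokens = text.split()
    if not tokens or tokens[0] != MAGIC:
        raise ValueError("missing magic number " + MAGIC)
    head, pos = _read_block(tokens, 1, "head")
    keys = []
    # Everything between the head and the foot is a hexadecimal key.
    while pos < len(tokens) and tokens[pos] != "foot":
        keys.append(int(tokens[pos], 16))
        pos += 1
    _read_block(tokens, pos, "foot")
    return int(head["pagesize"]), keys
```

Converting each key to an integer keeps later comparisons independent of letter case and digit count.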

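The keys are the timestamps of the first event record in the log file and of every nth record after it, where n is the pagesize: that is, of records 1, n+1, 2n+1, and so on. Each key is therefore the timestamp of the first record in its page.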
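
The selection of keys can be sketched in a few lines of Python. The function name index_keys is hypothetical, the timestamps are assumed to be already available as integers in log-file order, and the eight-digit uppercase hexadecimal rendering is inferred from the example rather than stated by the format.

```python
def index_keys(timestamps, pagesize):
    """Return the index keys for event timestamps given in log-file order.

    Takes the first timestamp and every pagesize-th one after it, so each
    key is the timestamp of the first record in a page.
    """
    # Assumption: keys are written as 8 uppercase hex digits, as in the example.
    return [f"{ts:08X}" for ts in timestamps[::pagesize]]
```

A log of 2500 records with a pagesize of 1000 would yield three keys under this scheme, one for each of records 1, 1001, and 2001.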

The index file is used in conjunction with the log file to form an indexed-sequential database. The operation of the database is simple. First, the entire index file is loaded into memory. Then, to retrieve a record from the database, one begins by consulting the index to determine which pages could contain it. The example above is the index for a log containing at least 2001 records but no more than 3000: its three keys show that records 1, 1001, and 2001 exist, so the log is implicitly divided into three "pages" of up to 1000 records each.

The required pages are then searched in memory linearly or by bisection.
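
The lookup can be sketched in Python along the following lines. This is a minimal sketch, not a definitive implementation: the names candidate_pages and find_in_page are hypothetical, keys are assumed to have been parsed into integers, the log is assumed to be sorted by timestamp, and a page is represented here as a list of (timestamp, record) pairs purely for illustration. Because several records may share a timestamp, a target equal to a key may also appear at the end of the preceding page, so a range of pages is returned.

```python
from bisect import bisect_left, bisect_right


def candidate_pages(keys, target):
    """Return the 0-based page numbers that may hold records stamped `target`.

    keys[i] is the timestamp of the first record in page i.
    """
    # Last page whose first timestamp is <= target.
    last = bisect_right(keys, target) - 1
    if last < 0:
        return range(0)  # target precedes the first record in the log
    # A page before one that starts exactly at target may end with target too.
    first = max(bisect_left(keys, target) - 1, 0)
    return range(first, last + 1)


def find_in_page(page, target):
    """Search one in-memory page by bisection.

    `page` is a list of (timestamp, record) pairs sorted by timestamp
    (an illustrative representation, not part of the file format).
    """
    stamps = [ts for ts, _ in page]
    return page[bisect_left(stamps, target):bisect_right(stamps, target)]
```

With the keys from the example, a target timestamp of 00001D00 falls after the second key and before the third, so only the second page would need to be searched.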


Timothy Culver
Last modified: Thu Mar 18 15:00:22 EST 1999