Porting Libgist

1. Introduction

One of the main goals of Gist is portability. Parallel architectures change, and as new ones come along, using Gist should require a minimum of work.

The GistView component is written entirely in Java, and is as portable as any other Java application.

Libgist, however, is not portable--at least not in the strict sense of compiling and running as-is on many different architectures. This is an inherent limitation of Libgist as a parallel program. Yet there is good news--Libgist consists of only about 1000 lines of code, so porting it should be a manageable project for a skilled parallel programmer.

To review the basic operation of Gist:

Porting Libgist to any parallel machine involves addressing all of these issues. (The last issue can be avoided by making the buffers large enough that they never fill during a run.) Performing these operations at optimal speed requires a fairly thorough knowledge of the parallel programming model on your computer. Expect to rewrite a good portion of Libgist to address these issues.

It is not clear to us at this point whether it is even possible to compile, for any given platform, a single C library which will provide Libgist's functionality at good efficiency for all parallel programs. For example, Libgist-1.0 is written using the #pragma interface to parallelism. Linking Libgist against a program that uses the pthreads interface may require, essentially, a Libgist port.

On the other hand, a significant portion of Libgist is platform-dependent in straightforward ways. For example, Libgist needs access to a system clock, and the means to read various performance counters. These parts of the porting job, while not trivial, affect only clearly-marked, local sections of the Gist code.

2. Parallel aspects

In Libgist-1.0, the buffers are described by the three variables
struct Gist_event**  Gist_event_buffers       = NULL;
Gist_counter_t**     Gist_counter_buffers     = NULL;
int*                 Gist_num_buffered_events = NULL;
There are actually two buffers per processor: one for the event structures, and one for the counter values. The reason for this separation is that the event structure is a constant size, while the number of counters is not determined until runtime. So each event buffer contains Gist_events_per_buffer events; each counter buffer contains Gist_events_per_buffer*Gist_num_counters counter values.
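As a starting point for a redesign, the two-buffers-per-processor layout described above might be allocated roughly as follows. This is only a sketch: the fields of struct Gist_event, the underlying type of Gist_counter_t, the default values, and the helper names alloc_buffers() and counters_for() are assumptions made for illustration, not the definitions found in Gist.h.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real definitions live in Gist.h. */
typedef long long Gist_counter_t;
struct Gist_event { int type; long long timestamp; };

static int Gist_events_per_buffer = 1024; /* assumed value */
static int Gist_num_counters = 2;         /* known only at runtime in Libgist */

struct Gist_event **Gist_event_buffers       = NULL;
Gist_counter_t    **Gist_counter_buffers     = NULL;
int                *Gist_num_buffered_events = NULL;

/* Allocate one event buffer and one counter buffer per processor.
   Returns 0 on success, -1 if any allocation fails. */
int alloc_buffers(int nprocs)
{
    assert(nprocs > 0);
    Gist_event_buffers       = calloc(nprocs, sizeof *Gist_event_buffers);
    Gist_counter_buffers     = calloc(nprocs, sizeof *Gist_counter_buffers);
    Gist_num_buffered_events = calloc(nprocs, sizeof *Gist_num_buffered_events);
    if (!Gist_event_buffers || !Gist_counter_buffers || !Gist_num_buffered_events)
        return -1;

    for (int p = 0; p < nprocs; p++) {
        /* Fixed-size event records. */
        Gist_event_buffers[p] =
            calloc(Gist_events_per_buffer, sizeof(struct Gist_event));
        /* Gist_num_counters values for each event, stored contiguously. */
        Gist_counter_buffers[p] =
            calloc((size_t)Gist_events_per_buffer * Gist_num_counters,
                   sizeof(Gist_counter_t));
        if (!Gist_event_buffers[p] || !Gist_counter_buffers[p])
            return -1;
    }
    return 0;
}

/* Counter values belonging to event e on processor p start here. */
Gist_counter_t *counters_for(int p, int e)
{
    return Gist_counter_buffers[p] + (size_t)e * Gist_num_counters;
}
```

Keeping the counters in a separate flat array is what lets the event record stay a fixed size even though the number of counters is chosen at runtime.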

The first thing you need to do is design a good replacement for this scheme for your machine. This task will probably be the hardest part of porting Libgist.

Once you've designed the buffer data structure, replace the corresponding declarations in Gist.h and globals.c, then update all of the other source files. Make sure that everywhere these variables are used, your new buffer access method is used instead.

Remember that Gist_init() will be called in a "sequential" section, that is, outside any #pragma block or before any threads are forked. By contrast, Gist_log() may be called by different processors, even at the same time. Avoid all synchronization inside Gist_log().

Notice also that Gist_dump() is unique in that it is called both from an individual processor (when one buffer fills) and from the synchronous routine Gist_finish().
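One way to keep the logging path free of synchronization is to have each processor touch only its own buffer and its own fill count. The sketch below illustrates that idea with simplified, hypothetical names (log_event(), dump_one(), a fixed NPROCS); it is not the Libgist implementation, and how a processor learns its own index depends on your programming model.

```c
#include <assert.h>

#define NPROCS            4  /* hypothetical processor count */
#define EVENTS_PER_BUFFER 8  /* hypothetical buffer size */

struct event { int type; };  /* hypothetical, much simpler than Gist's */

static struct event buffers[NPROCS][EVENTS_PER_BUFFER];
static int num_buffered[NPROCS];
static int dumps[NPROCS];    /* how many times each buffer was dumped */

/* Stand-in for dumping a single processor's buffer when it fills. */
static void dump_one(int proc)
{
    dumps[proc]++;
    num_buffered[proc] = 0;
}

/* Called concurrently by different processors. Each call reads and
   writes only the entries indexed by proc, so no locks are needed. */
void log_event(int proc, int type)
{
    assert(proc >= 0 && proc < NPROCS);
    if (num_buffered[proc] == EVENTS_PER_BUFFER)
        dump_one(proc);
    buffers[proc][num_buffered[proc]++].type = type;
}
```

In this sketch the per-processor counts sit next to each other in memory, so a real port might pad them to separate cache lines to avoid false sharing.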

3. Performance measures

The recording of performance measures is necessarily system-dependent. However, the code that calls getrusage() should run with little modification on any UNIX platform. You should only need to:

Brief note: the trick with rucode.c, which generates rucode.inc, exists simply to avoid maintaining by hand a statement of the form enum { RU_MAXRSS, ... } that must exactly match Gist_ru_strings[] = { "ru_maxrss", ... }.
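If running a generator program is inconvenient on your platform, the same goal (one list that yields both the enum and the matching strings) can be reached with a different technique, the preprocessor "X-macro" idiom. The sketch below is not how rucode.c works. The field list is an illustrative subset rather than Libgist's actual one, and RU_FIELDS, RU_NUM_FIELDS, ru_name() and ru_code_for() are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Illustrative subset only; extend the list in this one place. */
#define RU_FIELDS(X)          \
    X(RU_MAXRSS, "ru_maxrss") \
    X(RU_MINFLT, "ru_minflt") \
    X(RU_MAJFLT, "ru_majflt")

#define AS_ENUM(id, name)   id,
#define AS_STRING(id, name) name,

/* Both definitions expand from the same list, so they cannot drift apart. */
enum { RU_FIELDS(AS_ENUM) RU_NUM_FIELDS };
static const char *const Gist_ru_strings[] = { RU_FIELDS(AS_STRING) };

/* Name of a counter code. */
const char *ru_name(int code)
{
    assert(code >= 0 && code < RU_NUM_FIELDS);
    return Gist_ru_strings[code];
}

/* Code for a counter name, or -1 if the name is unknown. */
int ru_code_for(const char *name)
{
    for (int i = 0; i < RU_NUM_FIELDS; i++)
        if (strcmp(Gist_ru_strings[i], name) == 0)
            return i;
    return -1;
}
```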

On a non-UNIX platform, simply remove all the code that relates to the getrusage call; these blocks are clearly marked.

You have to figure out how to obtain hardware performance measures for your machine. But it should be fairly clear where to put this code. Note that the getrusage() code is an example of a single system call returning the values of all of a family of counters, with a boolean vector describing which ones are stored. By contrast, the SGI hardware-counter code is an example of a system call returning only the values of a few (two, in this case) pre-chosen counters. Your performance measures will probably be like one or the other of these, so start with that code.
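The "one call returns the whole family, a boolean vector selects what is stored" pattern can be sketched as follows. getrusage() and the struct rusage fields are standard on UNIX. The enum values, the underlying type of Gist_counter_t, and the function read_rusage_counters() are assumptions made for this example and do not reproduce the Libgist code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

typedef long long Gist_counter_t;  /* assumed; see Gist.h for the real type */

/* Illustrative subset of the getrusage() family. */
enum { RU_MAXRSS, RU_MINFLT, RU_MAJFLT, RU_NUM_FIELDS };

/* Read every counter with a single system call, then store only those
   whose entry in wanted[] is nonzero, in enum order.
   Returns the number of values stored, or -1 if getrusage() fails. */
int read_rusage_counters(const int wanted[RU_NUM_FIELDS], Gist_counter_t *out)
{
    struct rusage ru;
    Gist_counter_t all[RU_NUM_FIELDS];
    int n = 0;

    assert(wanted != NULL && out != NULL);
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    all[RU_MAXRSS] = ru.ru_maxrss;
    all[RU_MINFLT] = ru.ru_minflt;
    all[RU_MAJFLT] = ru.ru_majflt;

    for (int i = 0; i < RU_NUM_FIELDS; i++)
        if (wanted[i])
            out[n++] = all[i];
    return n;
}
```

A hardware-counter interface that returns only a few pre-chosen values would skip the selection loop and copy its results straight into the output array.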

A current limitation of Libgist is that there must be a single datatype, Gist_counter_t, that is large enough to hold every type of performance counter. If you change this type, you need to find all of the printf() statements in dump.c where counters are printed and update their format specifiers to match.

4. Timers

In an event record, Libgist stores the current time in whatever format the system prefers. Currently, this is
typedef struct timespec Gist_timestamp_t;
But Libgist needs to write the time as an integer (this is what the file format requires). Also Libgist sorts records by timestamp. So another datatype is used, which is an integer type, currently
typedef long long Gist_timestamp_integer_t;
The type long long is provided as an extension by the SGI compiler (it is also a standard type as of C99). If the timestamp can fit in 32 bits, or you're willing to squeeze it, then you can change the line to unsigned int. There is a macro in Gist.h which converts Gist_timestamp_t into Gist_timestamp_integer_t. Also note that you need an appropriate hexadecimal printf format specifier for Gist_timestamp_integer_t. This may be defined in Gist.h by the time you read this, but in case we didn't get around to it, you can simply grep llX *.c.
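
For orientation, a conversion macro and a matching hexadecimal format might look like the sketch below. The two typedefs are taken from the text above. The macro names, the choice of nanoseconds as the integer unit, and the helper format_timestamp() are assumptions made for this example, so check Gist.h for the actual definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

typedef struct timespec Gist_timestamp_t;
typedef long long       Gist_timestamp_integer_t;

/* Hypothetical names; Gist.h may spell these differently. */
#define GIST_TIMESTAMP_HEX_FMT "%llX"
#define GIST_TIMESTAMP_TO_INTEGER(ts) \
    ((Gist_timestamp_integer_t)(ts).tv_sec * 1000000000LL + (ts).tv_nsec)

/* Write the timestamp into buf as a hexadecimal integer.
   Returns the number of characters snprintf() would have written. */
int format_timestamp(char *buf, size_t size, Gist_timestamp_t ts)
{
    assert(buf != NULL);
    return snprintf(buf, size, GIST_TIMESTAMP_HEX_FMT,
                    (unsigned long long)GIST_TIMESTAMP_TO_INTEGER(ts));
}
```

Keeping the format string in a single macro means a change of integer type touches one definition instead of every printf() call.
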
Tim Culver
Last modified: Thu Apr 29 00:29:24 EDT 1999