GPUSync is a synchronization-based method for incorporating GPUs into real-time operating systems. Real-time operating systems prioritize pending work to improve responsiveness, reduce jitter, minimize the hardware computing capacity required, and ensure that computing resources go to the most critical applications without starving the others. Many important GPGPU applications can take advantage of these benefits, including automated vehicles, computer vision (such as augmented reality), broadcasting, and financial trading. Indeed, real-time guarantees are often necessary for applications that require governmental certification, such as automated vehicles.

Access to GPUs under GPUSync is arbitrated through advanced k-exclusion and nested locking protocols. I evaluate GPUSync on real-world computer vision algorithms. This work is in progress.
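
To illustrate the idea behind k-exclusion locking, here is a minimal user-space sketch in Python. It is not GPUSync's protocol, which runs inside a real-time operating system and has analyzable blocking bounds; the class name `KExclusionLock`, the token pool, and the priority-ordered wait queue are all assumptions made for illustration. The lock admits at most k holders at once (for example, one per GPU) and hands a free token to the highest-priority waiter.

```python
import heapq
import itertools
import threading


class KExclusionLock:
    """Admit at most k holders at once; waiters are served in priority order.

    Illustrative only: names and structure are hypothetical, not GPUSync's.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self._free = list(range(k))      # free token ids (e.g. GPU indices)
        self._waiters = []               # min-heap of (priority, sequence)
        self._seq = itertools.count()    # FIFO tie-break for equal priorities
        self._cond = threading.Condition()

    def acquire(self, priority):
        """Block until a token is free and the caller heads the wait queue.

        Lower numbers mean higher priority. Returns the token id.
        """
        with self._cond:
            ticket = (priority, next(self._seq))
            heapq.heappush(self._waiters, ticket)
            while not self._free or self._waiters[0] != ticket:
                self._cond.wait()
            heapq.heappop(self._waiters)
            token = self._free.pop()
            # Another token may still be free for the next waiter in line.
            self._cond.notify_all()
            return token

    def release(self, token):
        """Return a token to the pool and wake the waiters."""
        with self._cond:
            self._free.append(token)
            self._cond.notify_all()
```

With `k` set to the number of GPUs, a task would call `acquire` before issuing GPU work and `release` afterwards; the returned token tells it which GPU it was granted.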

Major features of GPUSync include:

  1. Support for multi-GPU platforms.

  2. Real-time deterministic peer-to-peer GPU data transmissions.

  3. Real-time deterministic overlapping of GPU compute kernel execution and data transfers, including support for high-end multi-copy engine GPUs.

  4. Architecture-aware heuristics that maintain task GPU affinity.

  5. Global management of multiple GPUs, allowing computations to migrate between GPUs (at compute kernel boundaries).

  6. Real-time theoretical analysis, enabling timing constraints to be guaranteed.
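
The affinity and migration features above (items 4 and 5) can be pictured with a small cost-based selection sketch. This is not the heuristic used in GPUSync; the function `choose_gpu`, its parameters, and the example topology and costs are hypothetical. The sketch assumes each GPU has an estimate of its queued work and that moving a task's state between GPUs has a cost that depends on the hardware topology.

```python
def choose_gpu(pending_work, last_gpu, migration_cost):
    """Return the index of the GPU with the lowest estimated cost.

    pending_work[g]  -- estimated work already queued on GPU g (any time unit)
    last_gpu         -- GPU the task used previously, or None if it has none
    migration_cost   -- function (src, dst) -> cost of moving the task's state

    All names here are hypothetical and exist only for this illustration.
    """
    def cost(g):
        if last_gpu is None or g == last_gpu:
            move = 0
        else:
            move = migration_cost(last_gpu, g)
        return pending_work[g] + move

    # Ties go to the task's previous GPU first, then to the lowest index.
    return min(range(len(pending_work)),
               key=lambda g: (cost(g), g != last_gpu, g))


def example_cost(src, dst):
    # Assumed topology: GPUs 0-1 and GPUs 2-3 each share a PCIe switch,
    # so migrating within a pair is cheaper than migrating across pairs.
    return 2 if src // 2 == dst // 2 else 5


best = choose_gpu([10, 4, 3, 9], last_gpu=0, migration_cost=example_cost)
```

In this example the task leaves its previous GPU because a nearby GPU has a much shorter queue, yet it avoids the least-loaded GPU because migrating across the assumed switch boundary costs more than the wait it would save.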

I have two papers on this work in progress:

  1. GPUSync: A Framework for Real-Time GPU Management

  2. GPUSync: Architecture-Aware Management of GPUs for Predictable Multi-GPU Real-Time Systems

The first paper focuses primarily on the theoretical side of GPUSync, which allows timing constraints to be guaranteed. It includes measurements quantifying the cost of I/O and system memory bus contention on real-time schedulability. The second paper represents earlier work on GPUSync. It differs from the first in that it discusses architecture-aware GPU assignment and migration heuristics based upon feedback controllers and evaluates various trade-offs (note: I have since devised heuristics superior to those discussed in this paper). I am in the process of blending these two papers and updating their material for a journal paper.

UPDATE: Some of this work will be presented at the upcoming 34th Real-Time Systems Symposium in Vancouver this December (2013).