Overload Tolerance in Safety-Critical Systems

The Challenge. A primary objective in scheduling safety-critical real-time systems is that all deadlines be met. To achieve this goal, system architects typically try to anticipate every eventuality and design the system to handle each one. Under ideal circumstances, such a system would never miss a deadline and would behave as its designers expect. In reality, however, unanticipated emergency conditions may occur in which the processing required to handle the emergency exceeds the system's capacity, resulting in missed deadlines. The system is then said to be in overload. When this happens, it is important that the performance of the system degrade gracefully (if at all). A system that panics and suffers a drastic fall in performance in an emergency is likely to contribute to the emergency rather than help resolve it. This research project investigates approaches for dealing with the performance degradation that results from transient overloads in time-critical applications.
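
As an illustration of the overload condition described above, the following minimal sketch checks whether the total processing demand of a task set exceeds the capacity of a single processor. The task model (periodic tasks given as hypothetical (execution time, period) pairs), the function names, and the example numbers are assumptions made for illustration only; they are not taken from the project itself.

```python
from fractions import Fraction

def utilization(tasks):
    """Total processor demand of periodic tasks given as (wcet, period) pairs.

    Assumed model: each task needs `wcet` time units of processing
    every `period` time units. Fractions avoid rounding error.
    """
    return sum(Fraction(wcet, period) for wcet, period in tasks)

def is_overloaded(tasks, capacity=1):
    """True when demand exceeds capacity, so some deadline must be missed."""
    return utilization(tasks) > capacity

# Hypothetical workload under normal operation: 1/4 + 2/8 + 1/10 = 3/5.
normal = [(1, 4), (2, 8), (1, 10)]

# An unanticipated emergency adds a task demanding 3/5 of the processor,
# pushing total demand to 6/5 of capacity.
emergency = normal + [(3, 5)]

print(utilization(normal), is_overloaded(normal))
print(utilization(emergency), is_overloaded(emergency))
```

Once demand exceeds capacity, no scheduling algorithm can meet every deadline; the question of interest is which deadlines are missed and how gracefully performance degrades.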

The Approach. Our approach to understanding the behavior of resource-allocation algorithms under overload conditions has focused on the following issues:

Last updated on 2001/11/28 by SkB.