The Computer Science department of the University of North Carolina at Chapel Hill is involved in a federally funded initiative to study the use of computer technology to support collaboration among scientists and professionals. Our project is interdisciplinary, involving computer scientists, cognitive psychologists, and an anthropologist.
One component of this project is the construction of a distributed, real-time, video conferencing system on a network of high-performance workstations. This note concerns the technological issues involved in realizing a workstation-based conferencing system. The emphasis of our research is on the development of operating system and network support for real-time communication of digital audio and video streams.
From a technological standpoint, the goal is to support multiple, concurrent streams of digital audio and video in a distributed network of computer workstations. These streams may either form disjoint conferences within the network or involve one node (workstation) in multiple simultaneous conferences. Our approach is to view the problem as one of real-time resource allocation and control. With respect to real-time performance, our system should be indistinguishable from the more traditional analog (i.e., non-computer-based) realization of a conferencing system. This will be assessed not only by feedback from users but also through analytical modeling. In the latter case we plan to demonstrate by rigorous techniques that the system adheres, in all cases, to a set of well-defined performance parameters. For example, one parameter will be a specification of worst-case response time for a packet of compressed audio/video data delivered end-to-end.
To construct and analyze such a system we have adopted a real-time system design discipline based on a paradigm of process interaction called the real-time producer/consumer (RTP/C) paradigm. The RTP/C paradigm defines a semantics of inter-process communication that provides a framework for reasoning about the real-time behavior of programs. This semantics is realized through an application of some recent results in the theory of deterministic scheduling and resource allocation.
2.1. Collaboration and conferencing
The driving problem that sustains our research is computer and communication support for collaboration within groups, one part of the broader field of computer-supported cooperative work (CSCW). Although collaborators often work independently, many activities require direct person-to-person interaction: exploring, questioning, proposing, reviewing, negotiating. These conferences will be most productive if all participants simultaneously have full access to their shared computer-stored materials. Furthermore, it should be convenient to begin and sustain conferences without elaborate scheduling and, better yet, without leaving one’s office. Consequently, we believe that “shared windows” implemented on networked, graphics-based workstations and augmented by voice and video communications are promising technologies for collaboration support. Our facilities for conferencing (including shared windows, sight, and sound) are designed for integration with the X Window System. While we don’t expect workstation-based conferencing to replace all face-to-face meetings, better tools should make distributed collaboration an effective alternative.
2.2. Why digital media?
All requirements for media in collaboration systems could be met using a combination of digital and analog technology. For example, one could equip a workstation with a LAN adapter and adapters for analog video and voice; in buildings wired for data, voice, and CATV (separate technologies), conferencing could then be extended to office workstations. One important motivation for all-digital media is the cost advantage of a single technology that can support communications for both people and computers: an “information receptacle” in the office into which one can plug a workstation suitably configured for multi-media computing and conferencing.
Further, if audio and video are represented in digital form, the data can be handled in new ways. Software can replace specialized hardware used in teleconferencing systems (e.g., voice-activated displays, or “quad-split” images to show four participants). With all-digital media, a software conferencing system can allow each user to customize communications (e.g., images displayed, size and arrangement of windows, update rates). Other interesting possibilities include the ability to “cut” a segment of video or audio from a conference and “paste” it into a hypermedia database.
2.3. System design for digital media
Digital media have two properties that present challenging system design problems: bandwidth consumption and continuous, real-time demands. Recent advances in compression technology have brought bandwidth demands for digital video within the capabilities of workstations. Intel’s Digital Video Interactive (DVI) system can compress color, full-motion video in real time with VCR-like quality. The bandwidth required for transmitting the compressed stream over a LAN is approximately 1.1 Mbps (4.5 Kbytes per video image). In a workstation used for collaboration, many competing demands for bandwidth (e.g., from the window system or file system) must be satisfied concurrently with demands from multiple streams of video and/or audio.
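The 1.1 Mbps figure follows directly from the frame size and frame rate; a quick back-of-the-envelope check (assuming 30 frames per second, as DVI’s 33 ms frame period implies):

```python
# Back-of-the-envelope check of the DVI stream bandwidth cited above.
# Assumes 30 frames/s (one frame every 33 ms) and 4.5 Kbytes per
# compressed video image.
frame_bytes = 4.5 * 1024        # compressed frame size in bytes
frames_per_sec = 30             # DVI frame rate
bits_per_sec = frame_bytes * 8 * frames_per_sec
mbps = bits_per_sec / 1_000_000
print(f"{mbps:.2f} Mbps")       # approximately 1.1 Mbps
```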
Digital audio and video are called “continuous media” because the information is continuously changed, usually at regular intervals. A new video image must be captured, transmitted, and displayed every 33 milliseconds to achieve the maximum fidelity of motion possible with DVI technology. Some relaxation of this requirement is possible, however, because people have a degree of tolerance: a rare discontinuity will usually not be noticed. Because a person is present at both ends of a continuous-media stream in conferences, we must approximate the “look and feel” of human conversations. Further, because the participants will be discussing actions in their shared windows, both the graphics windows and the continuous-media windows must be updated concurrently. These requirements naturally lead to real-time scheduling in workstation operating systems.
We limit our system to supporting a single local-area network (which may include multiple LANs interconnected by MAC-layer bridges). LAN systems well suited to the conferencing application include the 16-Mbps token ring and the 100-Mbps FDDI ring. In addition to bandwidth, two particularly desirable functions of these networks are priority service and group addressing. Our research also anticipates wide-area networks with gigabit speeds that will make continuous media widely distributable. The problems of managing continuous media in these networks are being addressed as part of the DASH project at the University of California at Berkeley.
3. Programming Model
3.1. The real-time producer/consumer paradigm
A continuous media system is viewed as a directed graph where vertices represent processes and edges represent unidirectional communication channels. Processes exchange messages along communication channels. Processes may be either sequential programs that execute on a single processor or physical processes in the external environment that communicate with internal processes via interrupts.
Each channel in a graph defines a producer/consumer relationship between two processes. Each channel is characterized by a rate indicating the worst-case (i.e., minimum) inter-arrival time between any two messages sent on the channel. The RTP/C paradigm stipulates that over a well-defined interval, a consumer of messages on a channel must consume messages at precisely the rate at which they are produced. Logically, a message sent (produced) on a channel must be received (consumed) before the next message is sent. Conceptually, a producer defines a discrete time domain for a channel. The emission of messages on this channel corresponds to the “ticks” of a discrete time clock. If a pair of interconnected processes adheres to the RTP/C paradigm, then, relative to the channel’s discrete time clock, a producer cannot tell the difference between a consumer that is infinitely fast and one that simply obeys the RTP/C paradigm. In this manner the RTP/C paradigm allows one to ignore implementation details such as processor speeds.
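The channel discipline can be illustrated as a single-slot handoff: each message must be consumed before the next tick of the channel’s discrete clock. The following sketch is our own illustration of this rule (the class and exception names are hypothetical, not part of the system):

```python
# Illustrative sketch of the RTP/C channel discipline: a single-slot
# channel whose producer defines a discrete time clock. Each message
# must be consumed before the next one is produced; a full slot at
# production time means the consumer missed a "tick".
class RTPCViolation(Exception):
    """Raised when a consumer fails to keep up with the channel rate."""

class Channel:
    def __init__(self):
        self.slot = None              # at most one outstanding message

    def produce(self, msg):
        if self.slot is not None:     # previous message not yet consumed
            raise RTPCViolation("consumer missed a tick")
        self.slot = msg

    def consume(self):
        msg, self.slot = self.slot, None
        return msg

ch = Channel()
ch.produce("frame-0")
print(ch.consume())                   # frame-0
ch.produce("frame-1")                 # legal: slot was emptied in time
```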
Given a specification of the processing resources available in the system, the key challenge is to implement a process graph such that all interconnected pairs of processes are guaranteed to adhere to the RTP/C paradigm. We have studied the theoretical foundations of this problem extensively. A process graph is modeled as a set of tasks that make repetitive requests for execution and have a deadline for the completion of each execution request. Moreover, tasks may have constraints on their execution due to the presence of critical sections. With respect to the formal model, we have developed an optimal algorithm for sequencing such tasks on a single processor. This work has resulted in a decision procedure for determining whether a given set of processes can be executed such that all interconnected pairs of processes are guaranteed to adhere to the RTP/C paradigm. The parameters to this decision procedure are the maximum amounts of computation time required to consume a message on each channel and the worst-case inter-arrival times of messages on channels. These results are applied to the problem of designing a distributed real-time system by mapping appropriate subgraphs of a process graph onto physical processors or the network.
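To make the shape of such a decision procedure concrete, here is a simple necessary condition on the same parameters: each channel contributes its consume time divided by its minimum inter-arrival time to processor utilization, and the total must not exceed 1. This is our own illustrative check, not the authors’ full decision procedure (which also accounts for critical sections):

```python
# Necessary feasibility condition on the channel parameters described
# above: for each channel i, c_i is the worst-case computation time to
# consume one message and p_i is the minimum message inter-arrival
# time. Total processor utilization sum(c_i / p_i) must not exceed 1.
def utilization_feasible(channels):
    """channels: list of (consume_time, min_interarrival) pairs,
    in the same time units. Returns True if utilization <= 1."""
    return sum(c / p for c, p in channels) <= 1.0

# Hypothetical example: two video channels (10 ms of work every 33 ms)
# and one audio channel (2 ms of work every 20 ms) on one processor.
print(utilization_feasible([(10, 33), (10, 33), (2, 20)]))  # True
```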
The RTP/C paradigm provides a framework both for expressing processor-time-dependent computations and for reasoning about the real-time behavior of programs. For example, for an arbitrary path through a graph one can compute bounds on the time required for a message, or sequence of messages, to propagate between the source and sink processes. Other traditional performance metrics such as throughput are implicit in the model. The adoption of the RTP/C paradigm as a programming abstraction for real-time systems also allows one to assess design tradeoffs such as the introduction of a faster processor, or the effect of restructuring a computation.
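One crude way to see how such path bounds arise: if every message on a channel is consumed within one channel period of its production, then the latency along a path is bounded by the sum of the periods of the channels on that path. The function below is a coarse sketch of this reasoning under that assumption, not the paper’s exact analysis:

```python
# Coarse end-to-end latency bound along a path through a process
# graph, assuming each message is consumed within one channel period
# of its production. Illustrative only; the actual RTP/C analysis
# derives tighter bounds.
def path_latency_bound(periods_ms):
    """periods_ms: per-channel periods along the path, in ms."""
    return sum(periods_ms)

# A three-stage pipeline (capture -> network -> display), each stage
# running on a 33 ms period, yields a 99 ms worst-case bound.
print(path_latency_bound([33, 33, 33]))  # 99
```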
3.2. Supporting digital audio and video with the RTP/C paradigm
The programming model we have described can be viewed as a type of data-flow model. As such it is well suited to applications whose real-time constraints arise from the need to process continuous streams of data that enter the system. For a workstation-based conferencing application, the entire system, including software processes, I/O devices, and the interconnection network, is represented as a directed process graph. Within this framework we can, for example, determine bounds on the latency for a compressed video frame (or sequence of frames) to traverse a path from a compression source to a decompression sink. Moreover, we will be able to derive limits on the number of real-time streams of continuous media that our system can handle simultaneously.
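As a rough illustration of one such limit (our own arithmetic, using the 1.1 Mbps per-stream figure cited earlier and considering raw LAN bandwidth only):

```python
# Rough upper limit on concurrent DVI video streams imposed by raw
# LAN bandwidth alone. Ignores protocol overhead, audio, and other
# traffic; the real analysis must also account for processor time at
# each workstation, so the true limit is lower.
ring_mbps = 16.0      # 16-Mbps token ring
stream_mbps = 1.1     # one compressed DVI video stream
max_streams = int(ring_mbps // stream_mbps)
print(max_streams)    # 14
```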
Our prototype implementation uses IBM PS/2 machines (Intel 80386 processors) with Intel ActionMedia 750 capture and delivery adapters (DVI technology). Each workstation is also equipped with a video camera and microphone for input to the DVI capture adapter, along with a VGA graphics display and speakers for output from the DVI delivery adapter. The PS/2 machines are interconnected by a 16-Mbit token ring. This configuration allows us to conduct experiments involving multiple concurrent streams of digital continuous media together with other types of data (e.g., remote file access). The multiple concurrent streams of continuous media are required to support multiple simultaneous conferences. (Current generation DVI cards may not be able to handle multiple conferences on a single workstation; future versions may provide more capability.) Initially, the base operating system for these machines will be IBM PC-DOS. Our longer-term plans call for implementations on other operating systems for which an X11 server is available (e.g., Unix).
At the lower levels of the system we will be constructing operating systems support for the RTP/C paradigm as extensions to the base PC-DOS. This involves the implementation of a tasking model, scheduling algorithms, and resource allocation policies. The specific kernel functions we are constructing have two novel features:
• the kernel uses detailed knowledge about the application executed, such as process graph topology, to optimize response times for the propagation of messages, and
• the kernel exploits properties of the scheduling policies we employ to implement tasks in an extremely “lightweight” manner (e.g., without the usual overhead of maintaining separate run-time stacks for each task).
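The second feature can be illustrated as follows: if every task runs to completion on each message rather than blocking mid-execution, tasks reduce to ordinary function calls dispatched from a single scheduler loop, all sharing one run-time stack. This sketch is our own, with hypothetical names, and is not the kernel’s actual implementation:

```python
# Sketch of run-to-completion "lightweight" tasks: an earliest-
# deadline-first dispatch loop invokes each task as a plain function
# call, so no per-task run-time stack is ever needed.
import heapq

def run(ready):
    """ready: heap of (deadline, seq, task_fn, msg) entries.
    Dispatches in deadline order; each task runs to completion."""
    results = []
    while ready:
        deadline, _, task, msg = heapq.heappop(ready)
        results.append(task(msg))   # ordinary call: shared stack
    return results

tasks = []
heapq.heappush(tasks, (33, 0, lambda m: f"video:{m}", "frame-0"))
heapq.heappush(tasks, (20, 1, lambda m: f"audio:{m}", "sample-0"))
print(run(tasks))  # ['audio:sample-0', 'video:frame-0']
```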
A prototype of this kernel has been constructed on top of Unix.
We are currently (August 1990) installing and testing the hardware for the prototype configuration. A model and analysis for continuous media support based on the real-time producer/consumer model are being developed concurrently with an implementation of a run-time system (a real-time scheduler extension to PC-DOS) that uses topological information to support the analysis used in the model.