Research Summary

I have worked in three closely related areas of software systems: system support for distributed collaboration, user-interface generation, and object-oriented database systems. This document summarizes my work in these areas and contains links to some of my papers that are online. I created this writeup as part of my tenure package at UNC and am publishing it here since it gives potential students an idea of the research topics in which I am interested.

System Support for Distributed Collaboration

Computer-supported collaboration allows people to work together without leaving their offices and labs. The experience of researchers in this area has shown that distributed collaborative software can give the benefits of face-to-face collaboration. However, it has also shown that the design and implementation of this software are extremely difficult because of the large number of complex functions, such as synchronization, undo/redo, concurrency control, access control, and merging, that the software must perform flexibly and efficiently. The key to exploring the tremendous potential of collaborative software is to devise an infrastructure that addresses the fundamental issues in collaboration. I have initiated a line of research to explore and determine the concepts required to realize such an infrastructure.

I have designed a framework for supporting collaborative software based on a new software abstraction, called a ``shared active variable,'' which allows multiple users to modify application data structures and view results. It integrates and extends properties of program variables, database entities, and screen images. The basis for this abstraction is an ``active variable,'' which is an abstraction I designed as part of my research in user-interface generation summarized later. The framework has been successfully implemented by my research group in a software system called multiuser Suite. (Suite is a user-interface generator that resulted from my research in user interfaces.) Experience with it at Purdue, UNC, University of Minnesota, and elsewhere shows that the system makes it easy to efficiently implement a large class of collaborative applications. A detailed case study of my experience with using the framework to implement a complex collaborative application is given in this paper. A related paper surveys the general area of multiuser frameworks and another identifies principles that are fundamental to these systems.

The potential of collaborative applications cannot be fully exploited if they enforce a tight and rigid coupling among the collaborators. I have designed a new flexible coupling model, which defines a wide range of sharing and synchronization levels among the collaborators and allows them to dynamically tailor these levels to their task. In particular, it blurs the traditional distinction between synchronous and asynchronous collaboration, which are just two extremes in a spectrum of coupling levels defined by the model. I have also been working with Ph.D. students at Purdue and UNC on other aspects of collaboration including access control, undo, merging, concurrency control, and migration.

I am also working with John Riedl at the University of Minnesota on a collaborative software engineering environment called FLECSE (FLExible Collaborative Software Engineering). We have identified several scenarios that show the system can increase concurrency in several phases of the software lifecycle. Moreover, John Riedl's group has done pilot studies that show that computer-supported software inspection can lead to distributed and semi-synchronous software inspections that are as effective as face-to-face synchronous software inspections.

I am currently involved in a joint project between UNC and the Armed Forces Institute of Pathology (AFIP) to develop a system allowing distributed pathologists to collaboratively browse, manipulate, and discuss forensic data. This work is still preliminary, but it has brought together many of the related research efforts at UNC and given us a complex real-world application to work together on. As part of our work on interoperability, Suite has been integrated with Hussein Abdel-Wahab's XTV and Dave Stotts' Trellis. We hope this work will lead to a general software infrastructure, called a collaboration bus, for integrating collaboration systems. I have recently started looking at the relationship between CSCW and software process research. Finally, Kevin Jeffay and I are studying how shared window systems can be made more flexible.

User-Interface Generation

Users increasingly consider the user interface of an interactive program to be the primary determinant of their satisfaction with it. Implementors likewise consider it the primary determinant of the cost of implementing the program. I noticed this problem in 1982 as a graduate student while developing the user interface of an experimental operating system called Charlotte. Other researchers and I independently observed that many interactive programs offer a common set of editing commands and differ mainly in the data structures they display, how they present the data structures, and how they respond to invocation of editing commands. The key to user-interface generation, then, was to automatically support the display and editing of data structures in the language, thereby relieving application programmers of the difficult task of implementing these functions.

In my dissertation, I extended the Mesa programming language with an abstraction for automating the construction of user interfaces. The abstraction, called an active variable, is a program variable associated with a set of ``program operations'' used by a program to manipulate it, a set of ``user operations'' used by users to edit it, and a set of ``attributes'' representing its customizable properties. Programs that create active variables are responsible only for defining attribute values and not for implementing the user interface to edit them, which is automatically generated by the language based on the types and attributes of the variables.
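The three parts of an active variable can be illustrated with a small sketch. This is not the Mesa extension or Dost itself; it is a minimal Python approximation, and every name in it (ActiveVariable, user_edit, render, the "label" and "read_only" attributes) is a hypothetical choice made for illustration. It assumes the variable's value is an int, float, or str so that user input can be parsed with the value's own type.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ActiveVariable:
    """Hypothetical stand-in for an active variable: a value plus attributes."""
    name: str
    value: Any
    attributes: Dict[str, Any] = field(default_factory=dict)
    listeners: List[Callable[[Any], None]] = field(default_factory=list)

    # Program operations: called by the application code.
    def get(self) -> Any:
        return self.value

    def set(self, new_value: Any) -> None:
        self.value = new_value

    # User operation: an edit arriving from the generated interface.
    def user_edit(self, text: str) -> None:
        if self.attributes.get("read_only", False):
            raise PermissionError(f"{self.name} is read-only")
        # Parse the text with the variable's own type (int, float, or str).
        self.value = type(self.value)(text)
        for notify in self.listeners:
            notify(self.value)

    # Generated presentation: derived from the type and attributes alone.
    def render(self) -> str:
        label = self.attributes.get("label", self.name)
        return f"{label}: {self.value}"


# The program only declares the variable and its attributes.
temperature = ActiveVariable("temperature", 20, {"label": "Temperature (C)"})
seen: List[Any] = []
temperature.listeners.append(seen.append)

temperature.user_edit("25")   # simulated user edit
rendered = temperature.render()
```

The application never writes display or editing code here; it supplies attribute values and reacts to edits, which is the division of labor the abstraction is meant to provide.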

The difficulty of implementing user interfaces has led to a variety of works. While these works address the same general problem, the relationships among them are not well understood. In my dissertation work, I made one of the few contributions to this problem by analyzing a set of traditional and recent works that had not been compared before, identifying the factors they optimize, and comparing the approaches they offer. An important conclusion of this analysis was that, in comparison to other user-interface abstractions, the abstraction of an active variable is a carefully balanced tradeoff between automation and flexibility. Other abstractions either put far more constraints on the kind of user interfaces that can be implemented or require far more effort to implement the user interfaces. I did a ``proof of concept'' implementation of the abstraction, called Dost, and used it to develop a small set of experimental applications. My preliminary experimental results validated my analysis by demonstrating that the abstraction indeed offered substantially more flexibility and/or automation than related works.

I have extended my thesis work on user-interface generation in several important ways. I have designed and implemented a significant variation of the active variable for the Smalltalk programming language. A problem with the idea of associating customizable attributes with each active variable is that the user must tediously specify a large number of attributes. I addressed this problem by devising a new inheritance model that allowed users to specify an attribute value once for a group of variables that share the value. At least two further problems had to be overcome before the concept of an active variable could become truly practical. First, its definition had been language-specific and thus did not apply to several of the standard programming languages in use. Second, the interactive and computational operations on an active variable were tightly bound in that they had to be performed as part of a single process and on a single computer. I addressed these problems by devising a method for partitioning application programs into separate computational and interactive processes and defining the latter in a language-independent manner. I developed a distributed, multilanguage object model to facilitate this separation and a generic interactive component that supports multiple procedural programming languages.
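The idea of specifying an attribute value once for a group of variables can be sketched as a lookup through a chain of scopes. The sketch below is an assumption about one way such inheritance could work, not a description of the model implemented in Suite; the class AttributeScope and the attribute names are hypothetical.

```python
from typing import Any, Dict, Optional


class AttributeScope:
    """Hypothetical attribute holder that falls back to a parent group."""

    def __init__(self, parent: Optional["AttributeScope"] = None, **attrs: Any) -> None:
        self.parent = parent
        self.local: Dict[str, Any] = dict(attrs)

    def lookup(self, key: str, default: Any = None) -> Any:
        # Walk from the variable up through its enclosing groups.
        scope: Optional[AttributeScope] = self
        while scope is not None:
            if key in scope.local:
                return scope.local[key]
            scope = scope.parent
        return default


# The shared value is given once, on the group.
record_group = AttributeScope(font="Courier", read_only=False)

# Individual variables inherit it and override only what differs.
name_field = AttributeScope(record_group)
salary_field = AttributeScope(record_group, read_only=True)
```

Changing the font on record_group would then change it for every variable that has not overridden it, which is what removes the tedium of per-variable specification.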

These extensions to my thesis work are closely related to each other and were implemented in an experimental system called (single-user) Suite. Several students at Purdue and other sites have experimented with these concepts. Their experience has shown that only a few (ten to a hundred) lines of code are required to implement a user interface using Suite, compared with the thousands of lines of code required using other tools.

Object-oriented Database Systems

Working with the O-RAID group at Purdue, I have developed a new database model that integrates the relational and object models. The model has been implemented in a system called O-RAID, which is an extension of an existing distributed relational system called RAID. Previous research efforts that have explored the hybrid approach have been biased towards either the object model or the relational model. The object-relation model of O-RAID supports relations, objects, and relations among objects, thereby allowing the users to choose either of the two models or a powerful combination of the two. It uses a simple and novel implementation technique which allows the object-relation model to be implemented using implementations of existing relational systems and object-oriented programming languages. One of the key ideas in this technique is to store objects as tuples, thereby allowing distributed transactions on objects to be performed using the existing implementation of distributed transactions on tuples. This technique made it possible for the O-RAID group to successfully implement a distributed object-oriented database system in a relatively short time with a small number of students. O-RAID is currently being used for experiments comparing its performance with RAID.
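The idea of storing objects as tuples, so that transactions on objects reuse the transaction machinery that already exists for tuples, can be illustrated with a small sketch. This is not O-RAID or RAID: it assumes a single in-memory SQLite database from the Python standard library in place of a distributed relational system, and the Employee class and the store and load helpers are hypothetical.

```python
import sqlite3
from dataclasses import astuple, dataclass


@dataclass
class Employee:
    emp_id: int
    name: str
    salary: float


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)


def store(obj: Employee) -> None:
    # The object is flattened into one tuple of the relation.
    conn.execute("INSERT OR REPLACE INTO employee VALUES (?, ?, ?)", astuple(obj))


def load(emp_id: int) -> Employee:
    row = conn.execute(
        "SELECT emp_id, name, salary FROM employee WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    return Employee(*row)


# A committed transaction on an object is just a committed tuple write.
with conn:
    store(Employee(1, "Ada", 100.0))

# A failed transaction is rolled back by the relational engine,
# so the object reverts without any object-specific recovery code.
try:
    with conn:
        store(Employee(1, "Ada", 999.0))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

restored = load(1)
```

After the simulated failure, the object read back still carries the originally committed salary, because atomicity is inherited from the tuple-level transaction.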