Rover: Disconnected Replicated Objects

Next: Argus: Distributed Transactions Up: Distributed Communication Previous: Sumatra: Mobile Objects

Rover: Disconnected Replicated Objects

While we did see how a repository can offer disconnected or weakly connected operation, we have yet seen the same for distributed communication. Rover [], illustrates a solution for systems supporting communication of objects. Rover can be considered a cross between Obliq and Coda: like Obliq it supports user-defined objects and allows them to migrate or ''rove'' among computers; and like Coda, it supports strongly-connected, weakly-connected, and disconnected operation.

Rover supports the notion of a relocatable dynamic object - an object that is managed by some server, called the home server of the object, from where it can be dynamically relocated to the address space of a client of the server that has established a session with it. The relocation occurs when the client explicitly imports a copy of the object. The client can now invoke one or more operations on the local copy, and at some later time, explicitly export the changes back to the server copy. Thus, this model is based on the explicit check-in/check-out concept of version control systems rather than the automatic caching scheme of Coda. As in Coda, the client can explicitly prefetch objects it expects to access (import).

The prefetch, import, and export are executed as remote procedure calls (RPCs) invoked by the client in the server. Unlike traditional systems, Rover does not invoke an RPC synchronously, blocking the invoker until the procedure is executed at the remote site and returned results. Instead, it supports Queued RPC, that is, unblocks the client, queues the RPC in a stable log at the client site, asynchronously invokes the operation in the server, and communicates the results back to the client by invoking a callback.

As in Coda, the scheduled used to process the client operations depends on the connectivity to the server. In the strongly-connected case, each QRPC can be sent immediately to the server, in the disconnected case, the RPCs are sent when connection is next established, and in the weakly-connected case, the cost of communication and the nature of the queued operation influences the schedule. Queued RPCs are not necessarily executed in the order in which they are queued. They can be reordered to let the more urgent operations be executed earlier. A client associates a QRPC with an urgency or QOS (Quality of Service) parameter which is used to determine how soon that call is invoked. The authors of QRPC compare this parameter to the different degrees of urgency we associate with postal mail - ordinary, two-day express, one-day express, and so on. Thus, an import or fetch RPC can be given higher priority than an export RPC to simulate the Coda weak-connection policy.

Despite its name, an RDO, does not truly relocate or migrate to the client when it is imported. In fact, it is shared with the client - the import operation creates a copy of the object, leaving the original copy to be concurrently modified by the server or imported in other address spaces.

Rover provides several policies to support consistency among the copies of the object. It allows, copies of the object to be modified concurrently, using a type-specific optimistic concurrency control mechanism to check for conflicts. We will look at type-specific CC in more detail later; to give an example -- a Queue-specific CC can allow concurrent queuing operations. In case of conflicts, a Coda-like conflict resolution procedure can be used to resolve the conflicts - instead of working on files, it would work on objects. To eliminate such conflicts, a shared object can be locked by a client, thereby preventing the server and other clients from modifying it. However, as we saw earlier, this policy is suitable only if the the lock holder is strongly connected. A disconnected or weakly-connected client can reduce the chance of conflicts by specifying, for a local object, the following consistency options:

Uncacheable: do not allow the object to be imported. We might associate this option with an exported object that has not reached the server.
Immutable: the object is guaranteed to be immutable, so write methods must not be allowed on the object. Rover, thus, must distinguish between read and write methods on an object.
Verify-before-use: Before invoking a method on a object, check if the object is still upto date. (What is the difference between locking an object and verifying before use?)
Verify-if-service-is-accessible: Same as before except that the verification is not done if the cost of communicating with the server is high (more than some specified threshold).
Expires-after-date: No operations can be performed on the object after a certain date.
Service callback: Notify the client if the server object changes. Not sure what happens if the client is disconnected - doubt that server keeps track of callbacks it must invoke.

Some of these options are orthogonal, and users must be able to specify multiple options simultaneously.

As mentioned before, callbacks are also

used to inform applications about the completion import and export operations issued by them. When an application invokes an import operation, it specifies the session id, the object id, QOS specification, and a callback with the operation. Rover queues an RPC for the operation and sends it to the server when an appropriate level of connectivity is established with the server. The server fetches the object and sends it to the client, which deletes the queued RPC and invokes the application callback.

Similarly, when a client exports an object, it specifies the session and object id, the QOS priority, and a callback. All operations invoked on the object since it was imported are sent as QRPCs. The server executes them if there are no conflicts or if conflicts can be resolved automatically (based on an application-provided procedure), sends back to the client an indication of whether the object was updated or an unresolvable conflict was detected. The client removes the QRPCs from the log, and invokes the application callback with the values returned by the server. When a method is invoked on a cached object, it is marked as tentatively committed, and once it is successfully changed by the server, it is marked as committed. Applications can inform users about the commitment status of objects so that they know which values they can depend on. The RPC request and response are split in that they can be sent on communication links with different properties.

As in Obliq, the importing of an object can trigger the creation of a new, ''agent'', thread for executing operations on the object. As in Coda, logged QRPCs can be compressed, but Rover excepts application programmers to provide the compression procedures. Rover supports both the interactive TCP/IP and batch SMTP protocols for communication among clients and servers, the choice between them is based on the QOS desired. Moreover, instead of using TCP/IP directly, a client can instead use the higher level HTTP protocol. An object is identified by a URN (Uniform Resource Name), which is built on top of a URL.

Rover has been used to built several applications, including versions of a mail reader, calendar scheduler, and a Web browser that do not require strong connection to the data they access. (The current Web browsers allow users to access cached data when they are disconnected, what additional facilities can a Rover-based Web browser provide?).

How well does Rover perform compared to the traditional communication model, wherein no data are cached in the local client, and it manipulates objects by invoking operations in the server. Rover can be compared with the traditional model only when there is some connectivity between the client and server, in case of disconnection, it clearly offers access where the traditional model would deny it. In the connected case, the cost of a QRPC is much more than cost of traditional communication because of the cost of logging the RPC in stable log. A TCP/IP-based ''null'' RPC (290-byte request with a 5-byte reply) takes 47 milliseconds, over 10 Mbit/sec Ethernet, whereas using TCP/IP directly takes 8 milliseconds to communicate the same data. Writing a log entry takes 37 milliseconds, which explains the difference.

However, every RPC in the traditional model does not have to be correspond to a QRPC in the Rover model. If an object has been imported in an application, then it corresponds to a local operation in the application, which takes 3.2 msec for a null RPC. We need to also consider the time to import/export the object. The time to export the object from a client to a server machine is 59 msec. Once an object has been exported, it can remain in the client machine, in case the application imports it again. The cost of retrieving a null object into the address space of the client from the address space of the system is 7.5 msec, which is the cost of a local null RPC. LRPCs are not implemented efficiently in Rover - hence the high cost. (All numbers above are for the ethernet case. ) The cost of exporting and importing objects is, of course, amortized over multiple object invocations.

So how effective is the amortization in real applications. The authors of Rover did several additional experiments with the applications they build to see overall interaction times for some user-level actions. For instance, they considered the cost of reading eight messages in a folder whose total size was 65.4 Kbytes. They considered several cases, including:

Strongly-connected remote exmh: remote exmh connected to local X server through 10 MB/sec Ethernet. Time taken: 0:55 (minutes:seconds).
Weakly-connected (14.4 Kbit/sec ) remote exmh: remote exmh connected to local X server through 14.4 Kbit/sec connection. Time taken: 9:08.
Weakly-connected, local exmh: local exmh connected to remote file system through NFS running over 14.4 Kbit/sec SLIP connection. Time taken: 2:36.
Weakly-connected, local Rover exmh: Rover exmh, assuming that folder is not cached, connected to server through 14.4 Kbit/sec connection: Time taken: 1:34.
Weakly-connected, local Rover exmh: Rover exmh, assuming that folder is cached, connected to server through 14.4 Kbit/sec connection: Time taken: 1:06.
Disconnect, local Rover exmh: Rover exmh assuming that folder is cached. Time taken: 1:02.

As can be seen, the cost of caching data is indeed amortized in this case but the strongly-connected exmh gives the best performance.

Example-Replicated: An initialization program stores the greeting in some server and associates it with a URN to be used as the session name. Each user runs an application that uses the URN to import the object, registers a service callback to receive updates, allows the user to modify the string, exports it after each change or when the user saves it - depending on the connectivity, and immediately imports it after exporting it. The service callback is used to update the string with the new value of the object. If the application is guaranteed to be strongly connected to the server, it can lock/unlock the string when the user starts/finishes editing it. Otherwise it can provide a procedure to resolve the conflicts or let the user do so interactively.

Unlike the other distributed communication schemes we saw above, Rover provides support for consistency management in the form of concurrency control and merging. This is because it combines features of shared repositories and distributed communication, like the former it has a notion of shared data (RDOs), and like the latter, it provides a mechanism for communicating information among different processes (QRPC). Currently, it does not offer several repository functions such as access lists and version trees, but these can be implemented in an object-based repository, as we shall see when we consider object-oriented databases.

Rover's check-in/check-out model also provides support for isolation: The changes to imported objects are not propagated to other users until the object is exported. Moreover, its QRPCs provides some fault tolerance. An RPCs is queued in a stable log, so if the operation to send it to the server fails (because the client was disconnected, for instance) the client will try again until it gets an ack from the server that the RPC was committed.

However, the support for concurrency control, isolation, and fault tolerance are somewhat limited. An application must explicitly import, export, and lock objects, it cannot atomically lock a set of distributed objects, it cannot make certain changes to an object visible before others, and there is no general mechanism for recovering from failures in arbitrary operations.

Next: Argus: Distributed Transactions Up: Distributed Communication Previous: Sumatra: Mobile Objects

Prasun Dewan
Sun Mar 16 14:09:55 EST 1997