
Lotus Notes: Rarely-Connected Document Replicas

Lotus Notes [] provides a storage structure, called a document, that is a mix between database records and files. Like a database record, it can contain fine-grained, fixed-length fields of predefined types, and like a file, it can store variable-length data of predefined and programmer-defined types. In particular, it can contain variable-length strings for storing document text. Thus, Notes documents, like mail messages in Information Lens, are semi-structured. In fact, one of the most popular applications of Notes is Notes Mail, which stores mail messages as semi-structured documents and provides various schemes for sorting them. Notes provides a unified framework for processing mail messages and other documents. For instance, the same program can be used to browse through messages and other documents, and any document can have a ``response'' linked to it.

Notes is an example of a replicated storage system, that is, a system that creates multiple replicas of a data structure on different sites and ensures consistency of these replicas. Creating a replica at a site promotes fault tolerance, and more importantly, allows read operations to be resolved locally without requiring any distributed communication.

Write operations are more complicated, since they need to be propagated to all replicas to ensure consistency among them. Traditional replicated file and database systems ensure a strong immediate consistency model, that is, they update all (or at least a quorum of) the replicas when a write operation commits at the local site. Notes, like the News application we saw earlier, supports a weaker eventual consistency model, that is, it ensures that if all update activities stop, the replicas will converge after a finite amount of time.

The main motivation for supporting eventual consistency in News was the number of sites involved: news postings would become unbearably slow if we had to commit each posting to all replicas. This is also a motivation in document databases, but perhaps a stronger reason is the rarely-connected computer, that is, a computer such as a laptop that is more often than not disconnected from other computers.

To allow the user of such a computer to access data in the disconnected state, we would like to create local replicas of shared document databases but cannot ensure immediate consistency of a write operation.

Instead, we would like to allow the user to proceed with other operations, and propagate the results of these later when the computer connects to others.

Like News, Notes supports a decentralized consistency scheme wherein pairs of computers exchange changes to their documents. Each computer knows about one or more other computers, or peers, that store replicas of the local databases. At some point, it can ``replicate'' with a peer, that is, do a one-way pull of changes from the peer. (The peer can simultaneously pull changes from the local computer.) A user can manually ask for immediate replication with a peer. Moreover, a replication schedule (specified by the system administrator) can trigger automatic replications.
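The pairwise exchange can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Notes implementation: the `Replica` class, its `pull_from` and `replicate` methods, and the integer timestamps are all hypothetical, and the sketch simply keeps whichever copy of a document carries the larger timestamp.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica: maps document id -> (timestamp, text)."""
    name: str
    docs: dict = field(default_factory=dict)
    # Peers are excluded from repr/compare because peers refer to each other.
    peers: list = field(default_factory=list, repr=False, compare=False)

    def write(self, doc_id, text, timestamp):
        # A local write commits immediately, without contacting any peer.
        self.docs[doc_id] = (timestamp, text)

    def pull_from(self, peer):
        """One-way pull: copy every document the peer holds a newer copy of."""
        pulled = 0
        for doc_id, (ts, text) in peer.docs.items():
            local = self.docs.get(doc_id)
            if local is None or local[0] < ts:
                self.docs[doc_id] = (ts, text)
                pulled += 1
        return pulled

    def replicate(self):
        """Manual or scheduled replication: pull from every known peer."""
        return sum(self.pull_from(p) for p in self.peers)


office = Replica("office")
laptop = Replica("laptop", peers=[office])
office.peers.append(laptop)

office.write("greeting", "hello", timestamp=1)
laptop.write("todo", "buy milk", timestamp=2)  # written while disconnected

laptop.replicate()  # laptop pulls changes from office
office.replicate()  # office pulls changes from laptop
```

Once both pulls have run and no further writes occur, the two replicas hold the same documents, which is the convergence that eventual consistency promises.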

The source and destination computers have autonomy over what parts of their replicas are sent and received, respectively.

The source computer sends only those document parts to which the destination computer has read access. Moreover, it determines whether deletions and old changes are shared with the destination. Conversely, the destination receives only those document parts to which the peer has write access. Moreover, it can specify the types of data it should receive from the source. It can also specify queries that dynamically select the database parts it wants replicated. Finally, it determines whether changes to access-control lists and replication parameters associated with a database should also be received from the source.
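The two-sided filtering can be pictured as a pair of selection functions, one run by the source and one by the destination. The following is a minimal sketch with hypothetical names throughout (`Doc`, `select_for_send`, `select_for_receive`); the sets `dest_permitted` and `source_permitted` stand in for whatever access-control checks each side applies, and the `kind` tag and `query` predicate stand in for the data types and selection queries mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    kind: str              # hypothetical type tag, e.g. "mail" or "memo"
    text: str
    deleted: bool = False  # marks a deletion to be propagated


def select_for_send(docs, dest_permitted, share_deletions):
    """Source side: keep documents that pass the access check for the
    destination, optionally withholding deletions."""
    return [d for d in docs
            if d.doc_id in dest_permitted
            and (share_deletions or not d.deleted)]


def select_for_receive(docs, source_permitted, wanted_kinds, query):
    """Destination side: keep documents that pass the access check for the
    source, have a wanted type, and satisfy the selection query."""
    return [d for d in docs
            if d.doc_id in source_permitted
            and d.kind in wanted_kinds
            and query(d)]


docs = [
    Doc("a", "mail", "hello world"),
    Doc("b", "memo", "budget"),
    Doc("c", "mail", "old", deleted=True),
    Doc("d", "mail", "secret"),
]

sent = select_for_send(docs, dest_permitted={"a", "b", "c"},
                       share_deletions=False)
received = select_for_receive(sent, source_permitted={"a", "b"},
                              wanted_kinds={"mail"},
                              query=lambda d: "hello" in d.text)
```

Because each side applies its own filter, a document reaches the destination replica only if both computers agree to exchange it.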

Notes provides an automatic scheme for merging the destination replica with the source replica. It associates objects with timestamps, and replaces an older version of an object in the destination with the newer version from the source. Like News, it merges simultaneous additions to the database by performing both operations. Concurrent updates to the same document are hard to resolve, and Notes simply keeps both alternatives as parallel versions, leaving the user to choose one of them. One of these versions is made the primary document and the other is made a secondary ``response'' to it.
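A rough sketch of such a merge follows. It rests on several assumptions that the text does not confirm: a concurrent update is detected by checking that both copies changed after a hypothetical `last_sync` time, the newer copy is chosen as the primary document, and the names `Doc` and `merge` are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    text: str
    timestamp: int
    responses: List["Doc"] = field(default_factory=list)


def merge(dest: Dict[str, Doc], source: Dict[str, Doc], last_sync: int) -> None:
    """Merge the source replica into the destination replica, in place."""
    for doc_id, src in source.items():
        dst = dest.get(doc_id)
        if dst is None:
            # Addition at the source: simply perform it here too.
            dest[doc_id] = Doc(src.text, src.timestamp)
        elif (src.timestamp > last_sync and dst.timestamp > last_sync
              and src.text != dst.text):
            # Both copies changed since the last replication: keep both.
            # Assumption: the newer copy becomes the primary document.
            if src.timestamp > dst.timestamp:
                primary, secondary = src, dst
            else:
                primary, secondary = dst, src
            merged = Doc(primary.text, primary.timestamp, list(dst.responses))
            merged.responses.append(Doc(secondary.text, secondary.timestamp))
            dest[doc_id] = merged
        elif src.timestamp > dst.timestamp:
            # Only the source changed: the newer version replaces the older.
            dest[doc_id] = Doc(src.text, src.timestamp, list(dst.responses))


dest = {"greeting": Doc("hello", 5), "note": Doc("v1", 1)}
source = {"greeting": Doc("hi there", 7), "note": Doc("v2", 4),
          "new": Doc("added", 6)}
merge(dest, source, last_sync=3)
```

In this run the greeting was changed on both sides after the last replication, so the older text survives as a response to the newer one, while the note is simply overwritten and the new document is added.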

News does not have to address the concurrent update problem, since users cannot simultaneously update a News message, though they can simultaneously add two messages to a bulletin board. As we shall see soon, the merge problem is addressed in several other systems, and we shall address it in depth later.

Example: Same as the database case, except that we create a document with a variable-length text field instead of a database record. We can also create a local replica of the document and operate on it in disconnected mode; in particular, we can modify it in this mode. If we want automatic notification of changes to a greeting, then a system administrator must define a replication schedule; otherwise, we must explicitly replicate to exchange changes.

If both computers have simultaneously changed the greeting, then one of the greetings will be made a response to the other. If we want to receive others' changes to the greeting but not transmit our own, then we give others write access but not read access to our replica.

Notes is useful not only for merging replicas modified by different users but also for merging those owned by the same user as the user moves among workstations (e.g., office and home computers). In fact, the merge problem is much easier in this case, since concurrent changes are never made to the replicas.






Prasun Dewan
Sun Mar 16 14:09:55 EST 1997