Semantics


Reference Parameters

An important difference between local and remote procedure calls concerns address parameters. Since addresses in one process have no meaning in another (unless we have shared memory), some systems disallow address parameters, while others create an isomorphic copy of the data structure pointed to by the actual parameter at the receiving process and point the formal parameter at this copy. The latter systems support transparency only for languages that do not allow address arithmetic, since arithmetic on an address could reach memory outside the copied structure.
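
The copying approach can be illustrated with a minimal sketch. It assumes that Python's copy.deepcopy stands in for marshalling, transmission, and unmarshalling; the names Node, rpc_call, and server_double_head are hypothetical and are not taken from any RPC system. The copy has the same shape as the original (here, a two-node cycle), but it is a distinct structure, so the callee's update does not reach the caller's data unless the stub also copies results back.

```python
import copy


class Node:
    """A list cell; the `next` field plays the role of an address."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def server_double_head(head):
    """Hypothetical remote procedure: updates the structure it was given."""
    head.value *= 2
    return head


def rpc_call(procedure, arg):
    """Hypothetical stub: deepcopy stands in for marshal/send/unmarshal."""
    remote_copy = copy.deepcopy(arg)  # isomorphic copy at the receiver
    return procedure(remote_copy)     # the formal parameter points at the copy


# Build a two-node cycle: a -> b -> a
a = Node(1)
b = Node(2, a)
a.next = b

result = rpc_call(server_double_head, a)

print(a.value)                      # the caller's structure is unchanged
print(result.value)                 # the receiver's copy was updated
print(result.next.next is result)   # the cycle is preserved in the copy
```

A local call would have changed a.value as well, which is exactly the difference in semantics that a transparent RPC system has to hide or forbid.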

Binding

How does a client machine identify the server? We can associate each interface definition with a global type name, T. Each server that implements the interface creates an instance, I, of T. An instance is implemented by some process P on some host H. A spectrum of binding strategies is possible, depending on how much binding awareness the client program has:

One extreme approach is for the client program to simply identify which interface types it is using. The server RPC system publishes the interfaces implemented by each server, giving the location (host, process, and interface id) of the implementation. The client RPC system chooses one of these published implementations for the interfaces used by the client. This is the minimum amount of binding awareness possible since the interface type is necessary to link the client to appropriate generated routines.

The other extreme is for the client to indicate not only the type of the interface but also the complete location of the implementation in which it is interested (H, P, I). This is essentially the approach adopted in Java RMI.

Intermediate choices are to specify some but not all the details of the location, letting the system figure out the unspecified details. In particular, a client program can specify H or (H, P).
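
The spectrum above can be sketched as a single lookup whose optional arguments correspond to the amount of binding awareness in the client. This is only an illustration under assumed names: Binding, Registry, publish, and lookup are hypothetical, and the registry is an in-memory list rather than the centralised or replicated service a real system would use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Binding:
    interface_type: str  # global type name T
    host: str            # H
    process: str         # P
    instance: str        # I


class Registry:
    """Hypothetical store of published interface implementations."""

    def __init__(self) -> None:
        self._published: List[Binding] = []

    def publish(self, binding: Binding) -> None:
        self._published.append(binding)

    def lookup(self, interface_type: str,
               host: Optional[str] = None,
               process: Optional[str] = None,
               instance: Optional[str] = None) -> Binding:
        """Return the first published binding matching every given detail.

        Details left as None are chosen by the system.
        """
        for b in self._published:
            if (b.interface_type == interface_type
                    and host in (None, b.host)
                    and process in (None, b.process)
                    and instance in (None, b.instance)):
                return b
        raise LookupError("no published implementation of " + interface_type)


registry = Registry()
registry.publish(Binding("Printer", "hostA", "p1", "i1"))
registry.publish(Binding("Printer", "hostB", "p7", "i2"))

any_printer = registry.lookup("Printer")                      # type only
on_host_b = registry.lookup("Printer", host="hostB")          # T and H
exact = registry.lookup("Printer", "hostB", "p7", "i2")       # T and (H, P, I)
```

The search inside lookup is the cost that the less binding-aware strategies pay; a fully specified location needs no search in principle.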

Less binding awareness makes the program more portable and easier to write. On the other hand, it gives the client programmer less control and also makes the binding less efficient since the system must maintain and search (centralised or replicated) information about published interfaces.

Usually, there is no well-known way to name processes; normally a port number or string is used, to which processes bind. Java RMI uses a host-wide unique string.

Number of Invocations

How many times has the remote procedure call executed when it returns to the invoker? Ideally, we would want to maintain the semantics of local procedure call, which is guaranteed to have executed exactly once when it returns. However, these semantics are difficult to maintain in a distributed environment since messages may be lost and remote machines may crash. Different semantics have been proposed for the number of remote invocations, based on how much work the RPC system is willing to do:

At-least-once: The call executes at least once as long as the server machine does not fail. These semantics require very little overhead and are easy to implement. The client machine continues to send call requests to the server machine until it gets an acknowledgement. If one or more acknowledgements are lost, the server may execute the call multiple times. This approach works only if the requested operation is idempotent, that is, multiple invocations of it return the same result. Servers that implement only idempotent operations must be stateless, that is, must not change global state in response to client requests. Thus, RPC systems that support these semantics rely on the design of stateless servers.

At-most-once: The call executes at most once - either it does not execute at all or it executes exactly once depending on whether the server machine goes down. Unlike the previous semantics, these semantics require the detection of duplicate packets, but work for non-idempotent operations.

Exactly once: The system guarantees the local semantics assuming that a server machine that crashes will eventually restart. It keeps track of orphan calls, that is, calls on server machines that have crashed, and allows them to be later adopted by a new server. These semantics and their implementation were proposed in Bruce Nelson's thesis, but because of the complexity of the implementation, were never implemented as far as I know. Nelson joined Xerox where he implemented the weaker at-most-once semantics in the Cedar environment.
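
The difference between the first two semantics can be seen in a small simulation. It is a sketch under stated assumptions, not a description of any real RPC system: CounterServer and call_with_retries are hypothetical names, the network is simulated by a count of lost acknowledgements, and the server does not crash. The server operation is deliberately non-idempotent (it increments a counter), and duplicate detection is modelled by caching the reply under a request id.

```python
class CounterServer:
    """Hypothetical server with a non-idempotent operation."""

    def __init__(self, detect_duplicates):
        self.detect_duplicates = detect_duplicates
        self.count = 0      # global state changed by each execution
        self.replies = {}   # request id -> reply already sent

    def handle(self, request_id):
        if self.detect_duplicates and request_id in self.replies:
            # Duplicate request: replay the old reply, do not execute again.
            return self.replies[request_id]
        self.count += 1
        self.replies[request_id] = self.count
        return self.count


def call_with_retries(server, request_id, lost_acks):
    """Resend the request until an acknowledgement gets through.

    lost_acks is the number of initial replies the simulated network drops.
    """
    while True:
        reply = server.handle(request_id)  # the request reaches the server
        if lost_acks == 0:
            return reply                   # this acknowledgement arrives
        lost_acks -= 1                     # acknowledgement lost, so resend


at_least_once = CounterServer(detect_duplicates=False)
reply_a = call_with_retries(at_least_once, request_id=1, lost_acks=2)

at_most_once = CounterServer(detect_duplicates=True)
reply_b = call_with_retries(at_most_once, request_id=1, lost_acks=2)

print(at_least_once.count, reply_a)  # executed once per transmission
print(at_most_once.count, reply_b)   # executed once despite the resends
```

With two lost acknowledgements the first server executes the call three times, while the second executes it once and answers the resends from its reply cache, which is the extra work that duplicate detection requires.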

How should the caller be told of RPC failures in the case of at-least-once or at-most-once semantics? We cannot return a special status in the case of transparent RPC since local procedure calls do not return such values. One approach, used in Cedar, is to raise a host failure exception, which makes the client program network aware even though the call syntax is transparent.
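
The exception approach might look as follows in a language with exceptions. This is a hedged sketch and not Cedar's actual interface: HostFailure, make_stub, and the transport function passed to it are hypothetical, and a timeout from the transport is assumed to be the only sign that the host has failed.

```python
class HostFailure(Exception):
    """Hypothetical exception raised when the server host cannot be reached."""


def make_stub(send, max_attempts=3):
    """Wrap a transport function so that calls look like local calls.

    `send` is assumed to raise TimeoutError when no reply arrives.
    """
    def stub(*args):
        for _ in range(max_attempts):
            try:
                return send(*args)   # normal return value, no status code
            except TimeoutError:
                continue             # no reply, so try again
        raise HostFailure("no reply after %d attempts" % max_attempts)
    return stub


def unreachable_add(x, y):
    """Simulated transport to a host that never replies."""
    raise TimeoutError("no reply")


add = make_stub(unreachable_add)

# The call syntax is that of a local call, but the client must still
# be prepared for a failure that a local call could never produce.
try:
    total = add(2, 3)
except HostFailure:
    total = None

working_add = make_stub(lambda x, y: x + y)
print(total, working_add(2, 3))
```

The handler around add(2, 3) is where the network awareness shows up: the signature of add is unchanged, but a robust client has to know that the call is remote.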


Prasun Dewan 2006-02-02