OS Interface: Sockets


We have seen earlier the abstract ideas behind OS-supported message passing, which can be implemented on the network layers mentioned above. To show how an OS layer sits on top of these network layers, let us consider the concrete example of Unix sockets. Unix sockets provide an interface to UDP, TCP/IP, and other protocols at this level. Unlike IP ports, sockets support a combination of free, input, and bound ports. Unlike the Xinu IPC mechanism you have seen so far, they allow processes in different address spaces and on different hosts to communicate with each other. Socket declarations are given in the file <sys/socket.h>. The following code fragments, executed by the server and client, illustrate the use of sockets. The server executes:

service_sock = socket (af, type, protocol)
bind (service_sock, local_addr, local_addr_len)
listen (service_sock, qlength) - stream only
server_end = accept (service_sock, remote_addr, remote_addr_len) - stream only

write (server_end, msg, len)
or
send (server_end, msg, len, flags)
or
sendto (server_end, msg, len, flags, dest_addr, addr_len) - datagram only

read (server_end, msg, len)
or
recv (server_end, msg, len, flags)
or
recvfrom (server_end, msg, len, flags, &from_addr, &from_addr_len) - datagram only

and the client can similarly execute

client_connection_sock = socket (af, type, protocol)

connect (client_connection_sock, dest_addr, dest_addr_len)
read, write, send, recv, sendto, or recvfrom

A server providing a service first uses the socket call to create a socket that serves as an input port for creating new sockets. The af parameter indicates the address family or name space (AF_INET: (host, port) pairs; AF_UNIX: Unix file system names; AF_APPLETALK), type indicates the type of connection (SOCK_DGRAM, SOCK_STREAM), and protocol indicates the protocol to be used (IPPROTO_IP: the system picks; IPPROTO_TCP; ...).

After this call the server does the following:

The bind call binds the socket to a local address. A separate external name space is needed for the local address since communicating processes do not share memory. It is the structure of this address that the name space argument of the previous call specifies. The structure of local IP addresses is given in the file <netinet/in.h>.

The address indicates an internet address and a port number. The port number is a 16-bit number the server chooses. Port numbers 0..1023 are reserved for system servers such as ftp and telnet. Look at the file /etc/services for the port numbers of system servers. A user-defined server must choose an unbound port number greater than 1023. You can explicitly pick one of these values and hope no other process is using the port. Alternatively, a value of 0 in the port field indicates that the system should choose the next unbound port number, which the server can then read back using getsockname.

When communicating internet addresses, port numbers, and other data among machines with different byte orders, you should use routines (such as htons, htonl, ntohs, and ntohl) that convert shorts and longs between network and host byte orders. A bind call can use a special constant (htonl (INADDR_ANY)) for the internet address to indicate that the socket will accept traffic on any of the internet addresses of the local host.
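As a minimal sketch of the binding step, the following C function creates a stream socket, binds it to any local address with port 0 so that the system picks an unbound port, and reads the chosen port back with getsockname. The function name open_service_socket and its error convention are assumptions made for illustration; they are not part of the socket interface.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create a TCP socket bound to a system-chosen port.
 * On success returns the socket descriptor and stores the port (in host
 * byte order) in *port_out; on failure returns -1. */
int open_service_socket(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    int sock = socket(AF_INET, SOCK_STREAM, 0);   /* 0: system picks protocol */

    if (sock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);     /* any local address */
    addr.sin_port = htons(0);                     /* 0: next unbound port */

    if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(sock, (struct sockaddr *)&addr, &addr_len) < 0) {
        close(sock);
        return -1;
    }

    *port_out = ntohs(addr.sin_port);             /* network to host order */
    return sock;
}
```

A server would typically follow this with listen and then advertise the returned port to its clients by some external means.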

To talk to a server, a client needs to create a connection endpoint through its own socket call. Then it can use connect to link its socket with the server socket. If it does not have the internet address of the host, it can determine this address from the host name using gethostbyname.

A socket bound by the server serves as an "input port" to which multiple clients can connect. The connect call creates a new "bound port" for the server-client connection. The client end of the bound port is the socket connected by the client. The other end is returned to the server by the accept call when the client has successfully established a connection. The server uses the listen call to set the maximum number of connection requests that can be queued on a bound socket while waiting to be accepted. For UDP, no connections need to be established through accept and connect; the connect call can be invoked, but it is a local operation that simply stores the intended remote address with the socket. In the stream case, the client usually does not bind its end of the socket to a local port; the system automatically binds the socket to an anonymous port. Sockets are not strictly input or bound ports since they can be inherited and accessed by children of the process that created them (through the mechanism of file descriptor inheritance we shall see later).
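To make the roles of the input port and the two ends of the bound port concrete, here is a sketch in which a single process plays both server and client over the loopback interface. The function name stream_round_trip is hypothetical, and the sketch assumes (as is the case on typical systems) that connect returns once the request has been queued on the listening socket, before accept is called.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical demo: send msg from the client end to the server end of a
 * stream connection. Returns the number of bytes the server end read
 * into buf, or -1 on any failure. */
int stream_round_trip(const char *msg, char *buf, size_t buf_len)
{
    struct sockaddr_in addr;
    socklen_t addr_len = sizeof(addr);
    int service_sock = -1, client_sock = -1, server_end = -1;
    int received = -1;

    /* Server: create the input port on a system-chosen loopback port. */
    service_sock = socket(AF_INET, SOCK_STREAM, 0);
    if (service_sock < 0) goto done;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(service_sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (listen(service_sock, 5) < 0) goto done;   /* queue up to 5 requests */
    if (getsockname(service_sock, (struct sockaddr *)&addr, &addr_len) < 0) goto done;

    /* Client: its socket becomes the client end of the bound port. */
    client_sock = socket(AF_INET, SOCK_STREAM, 0);
    if (client_sock < 0) goto done;
    if (connect(client_sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Server: accept returns the server end of the bound port. */
    server_end = accept(service_sock, NULL, NULL);
    if (server_end < 0) goto done;

    if (write(client_sock, msg, strlen(msg)) < 0) goto done;
    received = (int)read(server_end, buf, buf_len);

done:
    if (server_end >= 0) close(server_end);
    if (client_sock >= 0) close(client_sock);
    if (service_sock >= 0) close(service_sock);
    return received;
}
```

Note that service_sock stays open for further accept calls in a real server; only server_end is specific to this one client.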

Data can be sent and received using either the regular read and write calls or the special send and recv calls, which take additional message-specific arguments such as a flag for sending "out of band" data. The sendto and recvfrom calls are intended for UDP datagram communication and are necessary if the client has not "connected" the socket to a remote address.
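The datagram case can be sketched in the same style. Here no connection is established: the sender names the destination in every sendto call, and the receiver learns the sender's address from recvfrom. As before, the function name datagram_round_trip is a hypothetical choice, and both sockets live in one process on the loopback interface purely for illustration.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical demo: send msg as one UDP datagram and receive it.
 * Returns the number of bytes received into buf, or -1 on failure. */
int datagram_round_trip(const char *msg, char *buf, size_t buf_len)
{
    struct sockaddr_in addr, from_addr;
    socklen_t addr_len = sizeof(addr), from_len = sizeof(from_addr);
    int recv_sock = -1, send_sock = -1;
    int received = -1;

    /* Receiver: bind to a system-chosen loopback port. */
    recv_sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (recv_sock < 0) goto done;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(recv_sock, (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;
    if (getsockname(recv_sock, (struct sockaddr *)&addr, &addr_len) < 0) goto done;

    /* Sender: unbound and unconnected, so the destination is given here. */
    send_sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (send_sock < 0) goto done;
    if (sendto(send_sock, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0) goto done;

    /* Receiver: from_addr is filled in with the sender's address. */
    received = (int)recvfrom(recv_sock, buf, buf_len, 0,
                             (struct sockaddr *)&from_addr, &from_len);

done:
    if (send_sock >= 0) close(send_sock);
    if (recv_sock >= 0) close(recv_sock);
    return received;
}
```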

If a process is willing to receive data on more than one I/O descriptor (socket, standard input), then it should use the select call:

int select (width, readfds, writefds, exceptfds, timeout)
     int width;
     fd_set *readfds, *writefds, *exceptfds;
     struct timeval *timeout;

The call blocks until activity occurs on the file descriptors specified in the arguments or the timeout expires. The activity could be the completion of a read or write or the occurrence of an exceptional condition. The file descriptors are specified by setting bits in the fd_set bitmasks. Only bits 0..width-1 are examined. On return, the bitmasks indicate the descriptors on which activity occurred, and the program can then use them in subsequent read/write calls.
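The following sketch shows one way to use select to wait for input on two descriptors at once. The helper name wait_for_input and its return convention are assumptions for illustration; the FD_ZERO, FD_SET, and FD_ISSET macros are the standard way to manipulate the fd_set bitmasks.

```c
#include <stddef.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>

/* Hypothetical helper: block until fd_a or fd_b is readable or
 * timeout_sec seconds pass. Returns the readable descriptor (fd_a is
 * preferred if both are ready), or -1 on timeout or error. */
int wait_for_input(int fd_a, int fd_b, long timeout_sec)
{
    fd_set readfds;
    struct timeval timeout;
    int width = (fd_a > fd_b ? fd_a : fd_b) + 1;  /* bits 0..width-1 examined */

    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);
    timeout.tv_sec = timeout_sec;
    timeout.tv_usec = 0;

    /* select returns 0 on timeout and -1 on error. */
    if (select(width, &readfds, NULL, NULL, &timeout) <= 0)
        return -1;

    /* The mask now holds only the descriptors that are ready. */
    if (FD_ISSET(fd_a, &readfds))
        return fd_a;
    if (FD_ISSET(fd_b, &readfds))
        return fd_b;
    return -1;
}
```

A server could call this with a socket and standard input (descriptor 0), then issue read on whichever descriptor is returned. Because select overwrites the masks, they must be rebuilt before every call.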


Sun RPC

Both UDP and TCP require a client to encode its request and parameters in the data area of the message, and then wait for a reply on the reply port of the message. The server in turn needs to decode the request, perform the service, and then encode the reply. It would be more useful if the client could directly call a remote procedure which the server executes to service the request.
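To see the burden this places on the programmer, consider a sketch of the hand-written encoding and decoding for a hypothetical "add two integers" service. The message layout (a 32-bit operation code followed by two 32-bit operands, all in network byte order) and the names OP_ADD, encode_request, and decode_request are assumptions for illustration; they are not defined by UDP, TCP, or Sun RPC.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

#define OP_ADD 1u          /* hypothetical operation code */
#define REQUEST_SIZE 12    /* three 32-bit fields */

/* Client side: pack the request into the data area of a message. */
void encode_request(unsigned char *data, uint32_t op, int32_t x, int32_t y)
{
    uint32_t fields[3];

    fields[0] = htonl(op);
    fields[1] = htonl((uint32_t)x);
    fields[2] = htonl((uint32_t)y);
    memcpy(data, fields, REQUEST_SIZE);
}

/* Server side: unpack the request. Returns 0 on success, or -1 if the
 * operation code is not the one this server implements. */
int decode_request(const unsigned char *data, int32_t *x, int32_t *y)
{
    uint32_t fields[3];

    memcpy(fields, data, REQUEST_SIZE);
    if (ntohl(fields[0]) != OP_ADD)
        return -1;
    *x = (int32_t)ntohl(fields[1]);
    *y = (int32_t)ntohl(fields[2]);
    return 0;
}
```

Every new procedure or parameter type needs another pair of routines like these on both sides, plus matching code for the reply. This is the bookkeeping that a remote procedure call facility aims to hide.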

Sun RPC demonstrates how a remote procedure call paradigm can be supported on top of UDP or TCP via library routines rather than compiler support. Each server registers a procedure, identified by its program, version, and procedure numbers, via the registerrpc call:

registerrpc (prognum, versnum, procnum, procaddr, inproc, outproc)

A client can then invoke callrpc to call the remote procedure:

callrpc (host, prognum, versnum, procnum, inproc, in, outproc, out)

The procedure callrpc takes as arguments the host name of the server, the procedure identifier (program, version, and procedure numbers), and the input and output parameters together with the XDR routines that convert them. It sends an appropriate UDP or TCP message to the destination machine, waits for a reply, extracts the results, and returns them in the output parameters.

The XDR (eXternal Data Representation) routines are responsible for marshalling/unmarshalling procedure parameters onto/from XDR streams. An example of an XDR procedure, usable for both input and output:

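#include <rpc/rpc.h>

typedef struct {
    int f1;
    float f2;
} S;

/* Returns TRUE only if both fields were converted successfully. */
bool_t
xdr_s (xdrsp, arg)
    XDR *xdrsp;
    S *arg;
{
    return xdr_int (xdrsp, &arg->f1) && xdr_float (xdrsp, &arg->f2);
}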

The same procedure is called in both the marshalling and unmarshalling modes by the client and server RPC systems; the mode is a property of the stream passed to it. In the marshalling mode, it writes the argument onto the stream, and in the unmarshalling mode it extracts data from the stream into the argument.
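The dual-mode idea can be illustrated without the Sun RPC library. The sketch below is a hand-rolled stand-in, not the real XDR implementation: the Stream type, the ENCODE and DECODE modes, and the stream_word, stream_int, stream_float, and stream_s routines are all hypothetical names, and the code assumes a 32-bit float. As with an XDR procedure, a single routine either writes its argument to the stream or fills it in from the stream, depending on the stream's mode.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

typedef enum { ENCODE, DECODE } Mode;   /* hypothetical stream modes */

typedef struct {
    Mode mode;
    unsigned char *buf;   /* caller-supplied buffer */
    size_t pos, size;     /* next offset and total capacity */
} Stream;

/* Move one 32-bit word between *word and the stream, using network byte
 * order on the stream. Returns 1 on success, 0 if the stream is exhausted. */
static int stream_word(Stream *s, uint32_t *word)
{
    uint32_t net;

    if (s->size - s->pos < sizeof(net))
        return 0;
    if (s->mode == ENCODE) {
        net = htonl(*word);
        memcpy(s->buf + s->pos, &net, sizeof(net));
    } else {
        memcpy(&net, s->buf + s->pos, sizeof(net));
        *word = ntohl(net);
    }
    s->pos += sizeof(net);
    return 1;
}

int stream_int(Stream *s, int *ip)
{
    uint32_t w = 0;

    if (s->mode == ENCODE) w = (uint32_t)*ip;
    if (!stream_word(s, &w)) return 0;
    if (s->mode == DECODE) *ip = (int)(int32_t)w;
    return 1;
}

int stream_float(Stream *s, float *fp)   /* assumes sizeof(float) == 4 */
{
    uint32_t w = 0;

    if (s->mode == ENCODE) memcpy(&w, fp, sizeof(w));
    if (!stream_word(s, &w)) return 0;
    if (s->mode == DECODE) memcpy(fp, &w, sizeof(w));
    return 1;
}

typedef struct { int f1; float f2; } S;

/* One routine serves both directions, in the manner of xdr_s. */
int stream_s(Stream *s, S *arg)
{
    return stream_int(s, &arg->f1) && stream_float(s, &arg->f2);
}
```

The client would call stream_s on an ENCODE stream before sending and the server on a DECODE stream after receiving, so the field order is written down only once and cannot drift between the two sides.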


Prasun Dewan
Thu Sep 18 10:55:57 EDT 1997