
Introduction

Pipes are a simple solution for communicating between a parent and child or between child processes. What if we want communication between processes that have no common ancestor to set it up for them? Unix provides an IPC mechanism called sockets for this purpose. Two processes can separately create sockets and then have messages sent between them. These two processes may reside on the same machine or on separate machines interconnected by a computer network. In Berkeley UNIX 4.3BSD, one can create individual sockets, give them names, and send messages between them.

Sockets created by different programs use names to refer to one another; names generally must be translated into addresses for use. The space from which an address is drawn is referred to as a domain. There are several domains for sockets. We will use a domain called the Internet domain (AF_INET) which is the UNIX implementation of the DARPA Internet standard protocols IP/TCP/UDP. DARPA Internet is a worldwide collection of several networks.

Communication follows some particular style. Currently, communication is either through a stream or by datagram. Stream communication implies several things. Communication takes place across a connection between two sockets. Before beginning the communication, two processes execute a series of steps to ensure that both are ready to exchange messages. The result of this negotiation is a connection. The communication over a stream is reliable, error-free, and, as in pipes, no message boundaries are kept. Reading from a stream may result in reading the data sent from one or several calls to write() on the sender's side or only part of the data from a single call, if there is not enough room for the entire message, or if not all the data from a large message has been transferred.
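Because a stream keeps no message boundaries, a receiver that expects a fixed number of bytes usually has to loop over read(). The following is a minimal sketch of that idea, not a definitive implementation. The names read_fully() and stream_demo() are hypothetical, and the demonstration uses a UNIX-domain socketpair() (an assumption made only so the example runs on one machine without a server).

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of the sockets API): keep calling read()
 * until exactly len bytes have arrived. Returns len on success, a smaller
 * count if the peer closed the stream early, or -1 on error. */
ssize_t read_fully(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0)
            break;                  /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Demonstration: two write() calls on one end of a stream are collected
 * into one 5-byte run on the other end. The AF_UNIX socketpair() is an
 * assumption used only to keep the sketch on one machine.
 * Returns 0 on success, -1 on failure. */
int stream_demo(char *out, size_t outlen)
{
    int sv[2];
    ssize_t n;

    if (outlen < 6 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[0], "hel", 3) != 3 || write(sv[0], "lo", 2) != 2) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    n = read_fully(sv[1], out, 5);
    close(sv[0]);
    close(sv[1]);
    if (n != 5)
        return -1;
    out[5] = '\0';
    return 0;
}
```

A single read() on the receiving side might return all five bytes or only some of them; the loop makes the result independent of how the data happened to be delivered.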

Datagram communication does not use connections. Each message is addressed individually. If the address is correct, it will generally be received, although this is not guaranteed. Often datagrams are used for requests that require a response from the recipient. If no response arrives in a reasonable amount of time, the request is repeated. The individual datagrams will be kept separate when they are read, that is, message boundaries are preserved.
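The two properties described above, preserved message boundaries and repeating a request when no reply arrives, can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: request_with_retry() and datagram_demo() are hypothetical names, the timeout is implemented with select(), and a UNIX-domain datagram socketpair() stands in for a real network.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: send a request on a connected or paired datagram
 * socket, wait up to timeout_ms for a reply, and resend up to `tries`
 * times. Returns the reply length, or -1 if no reply ever arrived. */
ssize_t request_with_retry(int fd, const void *req, size_t reqlen,
                           void *reply, size_t replylen,
                           int tries, int timeout_ms)
{
    int attempt;

    assert(req != NULL && reply != NULL);
    for (attempt = 0; attempt < tries; attempt++) {
        fd_set rfds;
        struct timeval tv;

        if (send(fd, req, reqlen, 0) < 0)
            return -1;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(fd + 1, &rfds, NULL, NULL, &tv) > 0)
            return recv(fd, reply, replylen, 0);
        /* timed out (or interrupted): loop around and repeat the request */
    }
    return -1;
}

/* Demonstration: two datagrams sent back to back are received as two
 * separate messages of 3 and 5 bytes, even though the receive buffer
 * could hold both. Returns 0 on success, -1 on failure. */
int datagram_demo(ssize_t sizes[2])
{
    int sv[2];
    char buf[64];
    int ok;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    ok = send(sv[0], "one", 3, 0) == 3 && send(sv[0], "three", 5, 0) == 5;
    if (ok) {
        sizes[0] = recv(sv[1], buf, sizeof buf, 0);
        sizes[1] = recv(sv[1], buf, sizeof buf, 0);
    }
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

Contrast this with a stream, where the same two sends could be returned by a single read as eight undifferentiated bytes.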

Actual transfer of data takes place using a protocol. A protocol is a set of rules, data formats and conventions that regulate the transfer of data between participants in the communication. In general, there is one protocol for each socket type (stream, datagram, etc.) within each domain. The code that implements a protocol keeps track of the names of the sockets, sets up connections and transfers data between sockets, perhaps sending the data across a network.

One specifies the domain, style, and protocol of a socket when it is created. For example, the following call to the Unix system call socket() creates a stream socket in the Internet domain; the protocol argument 0 selects the default protocol for that domain and style.

socket(AF_INET, SOCK_STREAM, 0);
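In practice one keeps the descriptor that socket() returns and checks it for failure. The sketch below wraps the call for either style and reads the style back with getsockopt(SO_TYPE); the helper names open_inet_socket() and socket_style() are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an Internet-domain socket of the given style
 * (SOCK_STREAM or SOCK_DGRAM) with the default protocol.
 * Returns the descriptor, or -1 after printing the reason for failure. */
int open_inet_socket(int style)
{
    int fd;

    assert(style == SOCK_STREAM || style == SOCK_DGRAM);
    fd = socket(AF_INET, style, 0);
    if (fd < 0)
        perror("socket");
    return fd;
}

/* Hypothetical helper: ask the system which style a socket has.
 * Returns SOCK_STREAM, SOCK_DGRAM, etc., or -1 on error. */
int socket_style(int fd)
{
    int type = -1;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) < 0)
        return -1;
    return type;
}
```

A caller would typically write int s = open_inet_socket(SOCK_STREAM); and close(s) when the socket is no longer needed.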

The constants AF_INET and SOCK_STREAM are defined in <sys/socket.h>. After a socket is created, it must be given an appropriate name in its domain. In the Internet domain, sockets are identified using an 8-byte transport address consisting of two parts: the address of the host or machine on which the socket exists, and a port number, or delivery slot, on that machine. The following structure, defined in the file ``netinet/in.h'', describes the socket address.

struct sockaddr_in {                  /* Internet family socket address */
        short           sin_family;   /* address family (AF_INET) */
        u_short         sin_port;     /* 2-byte port number, network byte order */
        struct in_addr  sin_addr;     /* 4-byte machine address, network byte order */
        char            sin_zero[8];  /* unused padding */
};

When a message must be sent between machines, it is sent to the protocol routine on the destination machine, which interprets the address to determine to which socket the message should be delivered.
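As an illustration of how such an address is filled in, here is a minimal sketch. It assumes the declaration found in current system headers, where sin_addr is a struct in_addr and both the port and the machine address are stored in network byte order. The helper name fill_inet_address() is hypothetical, and the use of inet_pton() to parse dotted-decimal text is a choice made for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an Internet socket address from a
 * dotted-decimal host string and a port number in host byte order.
 * Returns 0 on success, -1 if the host string is not a valid address. */
int fill_inet_address(struct sockaddr_in *sa, const char *dotted,
                      unsigned short port)
{
    assert(sa != NULL && dotted != NULL);
    memset(sa, 0, sizeof *sa);          /* also clears the unused padding */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(port);         /* convert to network byte order */
    if (inet_pton(AF_INET, dotted, &sa->sin_addr) != 1)
        return -1;
    return 0;
}
```

For example, fill_inet_address(&sa, "127.0.0.1", 7) describes port 7 on the local machine.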

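For a given protocol, a socket is thus identified using the tuple

<protocol, local machine address, local port number>.

An association is a temporary or permanent specification of a pair of communicating sockets. An association is thus identified by the tuple

<protocol, local machine address, local port number, remote machine address, remote port number>.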

The protocol for a socket is chosen when the socket is created. The local machine address for a socket can be any valid network address of the machine, if it has more than one, or it can be the wildcard value INADDR_ANY.
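The wildcard address can be illustrated with a short sketch that names a socket with INADDR_ANY and then asks the system which port it ended up with. The helper name bind_wildcard() is hypothetical. Passing port 0, which asks the system to pick any free port, is not discussed in the text above and is used here only so the example does not depend on a particular port being free.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: bind fd to the wildcard address INADDR_ANY and the
 * given port (0 lets the system choose one). Returns the port actually
 * bound, in host byte order, or -1 on error. */
int bind_wildcard(int fd, unsigned short port)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof sa;

    assert(fd >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local address */
    sa.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&sa, sizeof sa) < 0)
        return -1;
    if (getsockname(fd, (struct sockaddr *)&sa, &len) < 0)
        return -1;
    return ntohs(sa.sin_port);
}
```

With the wildcard address, messages arriving at any of the machine's network addresses for that port are delivered to the socket.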

Figure 1 shows an example program for sending a datagram to a well-known server, the UDP echo server, residing on machine s.

To determine a network address to which it can send the message, the program looks up the host address with a call to gethostbyname(). The returned structure includes the host's network address, which is copied into the structure specifying the destination of the message.
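The lookup and copy described above might be sketched as follows. This is not the program from Figure 1; it is a small illustration with hypothetical helper names lookup_host() and send_to_host(). It uses gethostbyname() because that is the call discussed here; newer code often uses getaddrinfo() instead.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: look up a host by name and fill in the destination
 * address for the given port. Returns 0 on success, -1 if the lookup fails
 * or does not yield an Internet (IPv4) address. */
int lookup_host(const char *name, unsigned short port,
                struct sockaddr_in *dest)
{
    struct hostent *hp;

    assert(name != NULL && dest != NULL);
    hp = gethostbyname(name);
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_length != (int)sizeof dest->sin_addr)
        return -1;
    memset(dest, 0, sizeof *dest);
    dest->sin_family = AF_INET;
    dest->sin_port = htons(port);
    /* copy the host's network address into the destination structure */
    memcpy(&dest->sin_addr, hp->h_addr_list[0], (size_t)hp->h_length);
    return 0;
}

/* Hypothetical helper: send one datagram to the named host and port.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_to_host(const char *name, unsigned short port,
                     const void *msg, size_t len)
{
    struct sockaddr_in dest;
    ssize_t n;
    int fd;

    if (lookup_host(name, port, &dest) < 0)
        return -1;
    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    n = sendto(fd, msg, len, 0, (struct sockaddr *)&dest, sizeof dest);
    close(fd);
    return n;
}
```

The sender does not need to name its own socket first; the system assigns it a local address and port when the first datagram is sent.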

The port number can be thought of as the number of a mailbox, into which the protocol places one's messages. Certain daemons, offering certain advertised services, have reserved or ``well-known'' port numbers. These fall in the range from 1 to 1023. Higher numbers are available to general users. Only servers need to ask for a particular number.

The port number for a well-known service may be obtained with the library routine getservbyname(name, proto), for example getservbyname("echo", "udp").
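A minimal sketch of such a lookup follows. The helper name service_port() and its fallback parameter are hypothetical; the fallback is an assumption added because some systems lack a services database, in which case getservbyname() returns NULL.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netdb.h>
#include <stddef.h>

/* Hypothetical helper: return the port number (host byte order) registered
 * for a service name and protocol such as "echo" and "udp", or `fallback`
 * if the service is not listed on this system. */
int service_port(const char *name, const char *proto, int fallback)
{
    struct servent *sp;

    assert(name != NULL);
    sp = getservbyname(name, proto);
    if (sp == NULL)
        return fallback;
    /* s_port is stored in network byte order */
    return ntohs((unsigned short)sp->s_port);
}
```

For instance, service_port("echo", "udp", 7) looks up the UDP echo service and falls back to its well-known port 7 when no entry is found.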








Prasun Dewan
Tue Mar 9 11:04:08 EST 2004