Most queries currently made to WWW servers fetch static data stored in a portion of the file system associated with the server. The CGI interface provides a means for a client to request that an arbitrary program be executed by the server. That program may be run for its side effects, such as updating a database or sending e-mail to someone, but more often it is run in order to return data directly to the client/user in the form of an HTML document generated by the program. How CGI fits into the overall Web architecture is shown in the following figure:
[Figure: CGI Architecture]
To use the CGI interface, one must understand four things:
- where to place programs, and with what permissions, within the server's file space so that they can be executed as CGI programs
- how to send input from the user to a CGI program
- how to process user input within the CGI program
- how to generate and return information/documents to the user
Each of these topics will be discussed below. The version of CGI that will be discussed is CGI/1.1. More information regarding CGI can be found through the NCSA CGI overview page. Through it you can go to a basic introduction, a primer for the novice writer of CGI programs, and a more detailed interface specification.
In this and subsequent discussions of Web programming, Perl will be the language of choice. For a tutorial on CGI/Perl programming, see the Perl/CGI Tutorial. There are several archives of Perl programs. One that emphasizes CGI applications is provided by S. E. Brenner in England. A source for v5 Perl programs is the Whitehead Institute at MIT. You can also get all of the Perl programs included in O'Reilly's Learning Perl text.
1. Mechanics
For the server to locate and execute a program through the CGI, the program is normally placed in a special directory known by the server to contain executable programs. In most environments, this directory is named cgi-bin. Most users do not have write access to this directory because of the potential security exposure, and they must have a system administrator put their programs there for them.
To support experimentation and non-monitored use of CGI programs, some facilities run additional servers. At the author's department, that server is designated wwwx.cs.unc.edu. Unlike conventional configurations, wwwx allows users to place their cgi programs in any directory so long as the filename for a cgi program includes a .cgi suffix (for example, program1.cgi).
For purposes of this course, you should create two directories under your members subdirectory: a perl directory for your experimentation with perl and a cgi-bin directory for your actual cgi programs. When you create these directories, be sure that anyuser has read and list rights and that each program file is executable.
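The directory setup described above can be scripted. The following is a minimal sketch in Python, under stated assumptions: the function names prepare_cgi_dirs and make_executable are hypothetical, the path passed in stands for your own members subdirectory, and only the ordinary Unix execute bits are handled. The read and list rights for anyuser are granted with the file system's own access-control tools and are not covered here.

```python
import stat
from pathlib import Path


def prepare_cgi_dirs(members_dir):
    """Create the perl and cgi-bin directories under members_dir."""
    base = Path(members_dir)
    created = []
    for name in ("perl", "cgi-bin"):
        directory = base / name
        # exist_ok=True makes the call safe to repeat.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


def make_executable(program_path):
    """Add execute permission for owner, group, and others."""
    path = Path(program_path)
    mode = path.stat().st_mode
    path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path
```

A typical use would be to call prepare_cgi_dirs once on the members subdirectory and then call make_executable on each program copied into cgi-bin, such as program1.cgi.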
2. Input
There are two basic ways in which data are passed from a server (which, presumably, gets them from the client/user) to the cgi program. The first is through environment variables; the second is through the standard input file, STDIN.
Environment Variables
Environment variables are global variables set by the server that are then inherited by the cgi program. The names of these variables are fixed, and the cgi program must access them through those assigned names. The list of environment variables is the following:
- HTTP_USER_AGENT
- SERVER_NAME
- QUERY_STRING
- SERVER_PORT
- HTTP_ACCEPT
- SERVER_PROTOCOL
- PATH_INFO
- REMOTE_ADDR
- DOCUMENT_ROOT
- PATH
- PATH_TRANSLATED
- GATEWAY_INTERFACE
- REQUEST_METHOD
- SCRIPT_NAME
- SERVER_SOFTWARE
- REMOTE_HOST
Values are assigned to environment variables by the server before the cgi program begins execution and are thus available to it when it starts. Example values are the following:
Example values
- HTTP_USER_AGENT = Mozilla/1.1N (Macintosh; I; 68K)
- SERVER_NAME = wwwx.cs.unc.edu
- QUERY_STRING = query-string-added-to-end-of-URL
- SERVER_PORT = 80
- HTTP_ACCEPT = */*, image/gif, image/x-xbitmap, image/jpeg
- SERVER_PROTOCOL = HTTP/1.0
- PATH_INFO = /additional/info/added/to/path
- REMOTE_ADDR = 152.2.132.132
- DOCUMENT_ROOT = /afs/unc/proj/wwwc-f95
- PATH = /usr/ucb:/bin:/usr/bin:/usr/afsws/bin:/usr/bin/X11:/usr/local/bin/X11R5:/usr/local/bin/X11:/usr/etc:/usr/local/bin:/usr/5bin:/usr/local/contrib/mod/bin://bin
- PATH_TRANSLATED = /afs/unc/proj/wwwc-f95/additional/info/added/to/path
- GATEWAY_INTERFACE = CGI/1.1
- REQUEST_METHOD = GET
- SCRIPT_NAME = /wwwc-bin/smith/cgi_env_vars
- SERVER_SOFTWARE = NCSA/1.4.2
- REMOTE_HOST = mac-ara-port2
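A cgi program reads these variables by name from its process environment. The following is a minimal sketch in Python rather than part of the course materials; the function name collect_cgi_variables and the choice to report an unset variable as an empty string are assumptions made for illustration.

```python
import os

# Variable names taken from the list above (PATH and DOCUMENT_ROOT omitted).
CGI_VARIABLES = (
    "REQUEST_METHOD",
    "SCRIPT_NAME",
    "PATH_INFO",
    "PATH_TRANSLATED",
    "QUERY_STRING",
    "REMOTE_HOST",
    "REMOTE_ADDR",
    "SERVER_NAME",
    "SERVER_PORT",
    "SERVER_PROTOCOL",
    "SERVER_SOFTWARE",
    "GATEWAY_INTERFACE",
    "HTTP_USER_AGENT",
    "HTTP_ACCEPT",
)


def collect_cgi_variables(environ=None):
    """Return the CGI variables as a dict; unset variables map to ''."""
    if environ is None:
        # Default to the environment inherited from the server.
        environ = os.environ
    return {name: environ.get(name, "") for name in CGI_VARIABLES}
```

Passing a plain dictionary instead of the real environment makes it possible to try the function outside a server, for example collect_cgi_variables({"REQUEST_METHOD": "GET"}).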
Let me call your attention to two values in particular. The variable QUERY_STRING is set to whatever string follows a question mark (?) occurring at the end of the URL. Such values are typically sent as a result of a FORM that uses METHOD=GET; the value often represents a query string, such as a query to a database, depending on the function of the FORM. You can, of course, manually enter such a string directly in the URL, which is what I did to generate the sample values shown above.
The second variable to note is PATH_INFO. Like QUERY_STRING, it is additional information passed to the cgi program through the URL. In this case, it is additional information added to the URL immediately following the path to the cgi program (and before the QUERY_STRING, should one be present). You have seen this convention used with respect to imagemaps, where the first part of the URL is the path to the imagemap program and what follows is the path to a map file.
Both of these conventions are pretty klugey. A more flexible and cleaner approach is to use STDIN, described below. In the meantime, you can experiment with the different options through this example form.
STDIN
If the method used by a form is POST, instead of GET, the form data are passed to the cgi program through STDIN. From within your program, you can then read the data and process them accordingly.
The data you read from STDIN must be parsed into attribute/value pairs. This will require that you:
- split the data by ampersand (&) into attribute=value pairs
- if working in Perl, you may wish to place the attribute=value pairs into an associative array by splitting on =
- translate pluses (+) to spaces
- translate special characters in hex to their regular character form
This decoding and parsing process is discussed in detail in the Perl/CGI Tutorial.
You can experiment with input sent with METHOD=POST using this example form.
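The parsing steps listed above can be sketched as follows. This is an illustrative Python version, not the code from the Perl/CGI Tutorial, and it rests on several assumptions: the function names are hypothetical, a Python dict stands in for the Perl associative array, each %XX escape is decoded to the single character with that code, and a repeated attribute name keeps only its last value. The read_form_data helper also relies on CONTENT_LENGTH, a standard CGI variable that is not in the list given earlier, to know how much to read from STDIN.

```python
import re


def decode_component(text):
    """Translate '+' to space, then %XX hex escapes to characters."""
    # Pluses must be handled first so that a decoded %2B stays a plus.
    text = text.replace("+", " ")
    return re.sub(
        r"%([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


def parse_form_data(raw):
    """Split raw form data into a dict of attribute/value pairs."""
    pairs = {}
    for item in raw.split("&"):
        if not item:
            continue
        # Split on the first '=' only; the value may be empty.
        name, _, value = item.partition("=")
        pairs[decode_component(name)] = decode_component(value)
    return pairs


def read_form_data(environ, stream):
    """Fetch form data from STDIN for POST, else from QUERY_STRING."""
    if environ.get("REQUEST_METHOD", "GET").upper() == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stream.read(length)
    else:
        raw = environ.get("QUERY_STRING", "")
    return parse_form_data(raw)
```

Inside a real cgi program the call would be read_form_data(os.environ, sys.stdin). Splitting on & and = happens before decoding so that an escaped ampersand or equals sign inside a value is not mistaken for a separator. Python's standard library offers urllib.parse.parse_qs for the same job; the manual version is shown only to mirror the steps in the text.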
3. Processing the data
Once you have parsed any input from STDIN, you are ready to process it and any data received through an environment variable. At this point, you are in a conventional programming context and your program can do virtually anything a conventional program written in the language you are using can do.
4. Sending data back to the user/client
When you are ready to send data back to the user, the cgi program can do so by writing the data to STDOUT. Thus, you just generate the server header lines, followed by the data itself. The server header lines are the following:
- Status (just the numeric return code followed by the brief text explanation, since the server will add the server version)
- Content-type, in the usual MIME form
- Location (optional), a URL to be followed and returned to the client
- blank line (CRLF)
After the header lines, the cgi program generates its data, but with HTML tags interspersed, just as if you were writing conventional static HTML files. These lines of generated HTML data will then be returned by the server to the client/user, where they will be displayed just as if the data were a static page.
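As a rough illustration of this output format, the following Python sketch builds a complete response: header lines, the blank line that ends them, and then a small HTML document. The function name build_response and its parameters are hypothetical, the Location header is left out, and the page layout is only an example.

```python
import html
import sys


def build_response(title, paragraphs, status=None):
    """Return header lines, a blank line, and an HTML page as one string."""
    lines = []
    if status is not None:
        # Numeric code plus brief text, for example "404 Not Found".
        lines.append(f"Status: {status}")
    lines.append("Content-type: text/html")
    lines.append("")  # the blank line that ends the header section
    lines.append("<html>")
    lines.append(f"<head><title>{html.escape(title)}</title></head>")
    lines.append("<body>")
    lines.append(f"<h1>{html.escape(title)}</h1>")
    for text in paragraphs:
        # Escape user-supplied text so it cannot inject HTML tags.
        lines.append(f"<p>{html.escape(text)}</p>")
    lines.append("</body>")
    lines.append("</html>")
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    sys.stdout.write(build_response("Hello", ["Generated by a cgi program."]))
```

Building the response as a string before writing it to STDOUT keeps the header section and the blank line in one place, which makes it harder to emit HTML before the headers by accident.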