Perl

 Perl is relevant to the World Wide Web as the programming language most frequently used for CGI programs. That role may change in the next few years if Java servlets replace CGI as the preferred mechanism for adding dynamic processing. But for now, CGI and Perl remain the interface and language of choice.
The sections that follow give a brief history of Perl and then describe some of its basic characteristics. The emphasis is on what Perl is rather than on how to use it. How to use Perl for CGI programming is discussed in detail in several on-line lessons.



History

Perl is a couple of years older than the Web. The original version was written by Larry Wall while he was working at Unisys in Santa Monica, CA. It appeared on the network for distribution in December 1987. During the past decade, Perl has gone through five versions. Early versions ran only on Unix systems, but the language has now been ported to most major platforms.

From the beginning, Perl has been more than a language to Perl enthusiasts. Some would call it a cult. Those at the center of the Perl phenomenon prefer the term "culture": "I wanted the Perl community to function like a little bit of Heaven, where people are naturally helping each other. People want to help each other. They encourage each other, they give each other cool things..." [Larry Wall, from an on-line review published in Web Review, 2/28/97, at http://webreview.com/97/02/28/feature/perl.html].

Recently, O'Reilly & Associates has become a major sponsor of Perl. The company publishes the definitive books on Perl: Schwartz, R. L.; & Christiansen, T., (1997), Learning Perl. Sebastopol, CA: O'Reilly & Assocs. and Wall, L.; Christiansen, T.; & Schwartz, R. L., (1996), Programming Perl: 2nd Edition. Sebastopol, CA: O'Reilly & Assocs. It has employed Wall to continue work on the language and related products. It now distributes a commercial version of the language as a development kit. And it sponsors a Web site dedicated to facilitating Perl development (http://www.perl.com/).


The Language

Perl is an intermediate language that is most useful for tasks that are too large or too difficult for a shell program but that don't require a full-blown program in a conventional language, such as C. Perl has three primary strengths. First, it provides convenient access to system commands, such as file and directory operations. Second, it provides a comprehensive set of functions for working with regular expressions. Thus, it provides very powerful tools for searching and replacing patterns within strings. This property makes it particularly useful for generating or modifying lines of HTML. Third, it has convenient mechanisms for quickly producing formatted reports.
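As a small illustration of the second and third strengths, the sketch below uses a regular-expression substitution to rewrite a line of HTML and then lays out a fixed-width report line. The helper names bold_to_strong and report_line and the sample data are invented for this example, and sprintf is used here as a simple stand-in for Perl's report-formatting facilities.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every <b>...</b> pair into <strong>...</strong>.
sub bold_to_strong {
    my ($html) = @_;
    # g = replace every match, i = ignore case in the tag names
    $html =~ s{<b>(.*?)</b>}{<strong>$1</strong>}gi;
    return $html;
}

# Hypothetical helper: one report line with the name left-justified in
# 12 columns and the count right-justified in 6 columns.
sub report_line {
    my ($name, $count) = @_;
    return sprintf("%-12s%6d", $name, $count);
}

print bold_to_strong("<p>Perl is <b>terse</b>.</p>"), "\n";
print report_line("index.html", 42), "\n";
```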
Perl programs are stored as executable files of source statements. When a Perl program is executed, it is first compiled by the Perl interpreter and then run. Thus, no part of a program will execute if it includes compilation errors. However, there is no way to save a program in its compiled form so that it could be executed directly. Thus, each time a program is executed, it is recompiled.
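The compile-then-run behavior can be observed on a small scale with eval, which compiles a string of Perl source as a unit before running any of it. The following is only a rough sketch of the idea, not a description of how the interpreter handles whole program files; the variable names are invented for the example.

```perl
use strict;
use warnings;

my $ran = 0;

# The first statement in this string is valid; the second is deliberately
# malformed. Because the string is compiled as a whole before anything in
# it runs, the assignment to $ran is never executed.
my $code = '$ran = 1; if (';
eval $code;

print "ran = $ran\n";
print "compilation failed, nothing was executed\n" if $@;
```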
Perl is a procedural language. It includes common control structures, such as if-then-else constructs and while, until, and for loops. It also includes user-defined functions, providing basic support for modular design. Perl largely ignores whitespace between tokens, so programs can be written in free format, although an indented style is conventional.
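The control structures named above look much as they do in C. The sketch below exercises each of them once; the function classify and all variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical user-defined function using an if / elsif / else chain.
sub classify {
    my ($n) = @_;
    if ($n < 0) {
        return "negative";
    } elsif ($n == 0) {
        return "zero";
    } else {
        return "positive";
    }
}

# while loop: add up the integers 1 through 5.
my ($i, $sum) = (1, 0);
while ($i <= 5) {
    $sum += $i;
    $i++;
}

# until loop: runs as long as the condition is false.
my $countdown = 3;
until ($countdown == 0) {
    $countdown--;
}

# C-style for loop calling the user-defined function.
my @labels;
for (my $k = -1; $k <= 1; $k++) {
    push @labels, classify($k);
}

print "sum = $sum\n";
print join(", ", @labels), "\n";
```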
An unusual characteristic of Perl is that it includes only three types of variables: scalars, arrays, and associative arrays, or hashes. Perl does not distinguish among numeric types and strings. Any variable that is not some form of array is a scalar, indicated by a dollar sign ($) prefixed to the name. The language leaves it up to the Perl interpreter to figure out the data type of a particular value and handle it accordingly. Arrays, indicated by an "at" sign (@) prefixed to the name, are handled conventionally, except that a given element of an array is considered a scalar. Associative arrays, or hashes, indicated by a percent sign (%) prefixed to the name, are 2xn arrays of key and value pairs. Elements are indexed not by position in the array but by key. One particularly important use of associative arrays is the delivery of HTTP header information, as attribute/value pairs, by a Web server to a Perl CGI program.
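A short sketch of the three variable types follows. In an actual CGI program the server-supplied values arrive in Perl's built-in %ENV hash; the %header hash here is a stand-in filled with invented values so that the example runs anywhere.

```perl
use strict;
use warnings;

# Scalars: the same kind of variable holds numbers and strings.
my $count = 3;
my $name  = "index";
my $mixed = "10" + 5;      # the string "10" is treated as a number here

# Array: the whole array is @pages, but one element is a scalar,
# so it is written with a dollar sign.
my @pages = ("home", "about", "contact");
my $first = $pages[0];

# Hash (associative array): values are looked up by key, not position.
my %header = (
    REQUEST_METHOD => "GET",          # invented sample values
    QUERY_STRING   => "lang=perl",
);
my $method = $header{REQUEST_METHOD};

print "$name: count=$count mixed=$mixed\n";
print "first page: $first, total pages: ", scalar(@pages), "\n";
print "method: $method\n";
```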
While Perl has a devoted following and a great many features to recommend it, it also has its drawbacks. The two most important ones, in my opinion, are its extreme terseness and its numerous hidden side effects. Perl is somewhat reminiscent of APL in the number of operations that can be squeezed into a single statement. For example,

$key =~ s/%(..)/pack("c",hex($1))/ge;

converts all URL-encoded 3-character hex values for special characters found in a string back into their original single-character forms. But it takes a trained eye to decipher, let alone create, such compact expressions.
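For comparison, here is the same conversion spelled out in smaller steps. This is a sketch rather than a drop-in replacement: the function names are invented, the pattern is tightened to accept only hex digits after the percent sign, and chr is used in place of pack.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a two-digit hex string into one character.
sub hex_pair_to_char {
    my ($pair) = @_;
    my $code = hex($pair);    # e.g. "2F" becomes the number 47
    return chr($code);        # the number 47 becomes "/"
}

# Hypothetical helper: decode every %XX sequence in a string.
sub url_decode {
    my ($text) = @_;
    # g = replace every occurrence
    # e = treat the replacement as Perl code and use its result
    $text =~ s/%([0-9A-Fa-f]{2})/hex_pair_to_char($1)/ge;
    return $text;
}

print url_decode("Larry%20Wall%21"), "\n";
```

A percent sign that is not followed by two hex digits is left alone by this version, whereas the one-line form accepts any two characters.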

Perl is replete with "invisible" variables that are created by the system as side effects. The rationale is that these variables are often the very ones needed in a nearby statement, but to use them, the programmer must know they exist and the precise semantics of their creation and extent. For example, in the expression shown above, $1 is a hidden variable created as a side effect of the regular expression interpreter searching for occurrences of a percent sign (%) followed by two characters. Its value is set to the two characters that follow the percent sign, and it is those characters, treated as a 2-digit hex value, that are translated into the single corresponding character.
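The sketch below shows why the "precise semantics" matter. A successful match sets $1, but a later match that fails does not clear it, so the old value is still there. The sample strings and variable names are invented; copying $1 into a named variable immediately after the match is a common defensive habit, not something the language requires.

```perl
use strict;
use warnings;

my $input = "color=%23ff0000";

# Successful match: $1 now holds the two characters after the percent sign.
my $matched = $input =~ /%(..)/;
my $hex     = $1;            # copy it at once into a descriptive name

# Failed match: there is no percent sign, so the match returns false,
# but $1 is left exactly as it was.
my $matched_again = "no escapes here" =~ /%(..)/;
my $stale         = $1;

print "first match captured: $hex\n";
print "second match ", ($matched_again ? "succeeded" : "failed"),
      ", yet \$1 is still: $stale\n";
```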

A number of Perl libraries have been developed to provide additional support for particular tasks. For example, there are several Perl libraries that perform basic operations required of all CGI applications. Typically, these libraries are distributed freely, just as Perl is.
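To give a sense of the kind of basic operation such a library performs, the sketch below splits a URL-encoded query string into a hash of field names and values. It is a hand-written illustration, not the interface of any particular library; the function name parse_query and the sample query string are invented.

```perl
use strict;
use warnings;

# Hypothetical routine: break "a=1&b=2" into a hash of decoded pairs.
sub parse_query {
    my ($query) = @_;
    my %field;
    foreach my $pair (split /&/, $query) {
        next unless length $pair;                 # skip empty segments
        my ($key, $value) = split /=/, $pair, 2;
        $value = "" unless defined $value;        # "flag" with no "="
        foreach ($key, $value) {
            tr/+/ /;                              # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/pack("C", hex($1))/ge;   # %XX to character
        }
        $field{$key} = $value;
    }
    return %field;
}

my %form = parse_query("name=Larry+Wall&lang=Perl%205");
foreach my $key (sort keys %form) {
    print "$key = $form{$key}\n";
}
```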

When all is said and done, Perl is a hacker's language. This is particularly evident in its terseness and side effects. It does not encourage good programming practice. For experienced programmers who want to create simple programs quickly, it is a true time-saver. But for developing large applications, there are other languages that are better suited. It remains the language of choice for CGI applications, but I expect that Java servlets will soon replace CGI as the primary mechanism for accessing dynamic server function through the Web. When that happens, Java will replace Perl, as a side effect.


Additional Reading

If you would like to learn more about Perl, there are a number of sources you may wish to consult. The two standard hardcopy publications are Schwartz, R. L.; & Christiansen, T., (1997), Learning Perl. Sebastopol, CA: O'Reilly & Assocs. and Wall, L.; Christiansen, T.; & Schwartz, R. L., (1996), Programming Perl: 2nd Edition. Sebastopol, CA: O'Reilly & Assocs.
A primary source for on-line information is http://www.perl.com/. This site, sponsored by O'Reilly, provides references to a wide variety of information on Perl, from its history to sources for Perl libraries to newsgroups. You can use it as a jumping-off point to other on-line resources.