Lambda Notation

Lambda Notation

At a level of conceptualization, a function doesn't necessarily need a name. Executing a functional program is a process of function applications. An application of a function is about what a function does. What a function does is specified in its definition, the code in the body. Naming provides us a convenient mechanism to refer to that code. It is however not the only mechanism.

There are only two things we do in functional programming:

1. function definitions

2. function applications

Of course, the purpose of a definition is to apply it.

Though related, it's important to see that they refer to different things. You will get confused later on if you don't separate them conceptually.

In Lisp, one can write a function without giving it a name by using

           (lambda (x1 ... xn) BODY)
where (x1 ... xn) is the parameter list (usually called formal parameters), and BODY the function definition, i.e. what it does. We call this a lambda function (note that it is a function definition, not a function application.)

Important: A lambda function is not an application. An application using a lambda function is of the form

           ((lambda (x1 ... xn) BODY) a1 ... an)
where a1 ... an are the actual parameters for x1 ... xn.

From now on, we will generally call expressions involving lambda lambda expressions .

Examples in gcl:

>((lambda (x y) (+ (* x x) y)) 4 6)

>(mapcar '(lambda (x) (+ x 1)) '(1 2 3 4 5))
(2 3 4 5 6)
Or, you can use a special function called function instead of quote:
>(mapcar (function (lambda (x) (+ x 1))) '(1 2 3 4 5))
(2 3 4 5 6)

The builtin Lisp function function returns the functional definition (in a form used by the interpreter -- we will have more to say about this) associated with the lambda expression for runtime purposes.

Although function and quote make no difference for the above example, they are not exactly the same.

Try this in gcl:

>(function  (lambda (x) (+ x 1)))
(LAMBDA-CLOSURE () () () (X) (+ X 1))

What does the returned s-expression have anything to do with the definition of the function (lambda (x) (+ x 1)))? Well, it is the definition of the function but in a form used by the interpreter. We will have more to say later about this when we discuss how the interpreter works.

However, try the following in gcl:

>'(lambda (x) (+ x 1))
(LAMBDA (X) (+ X 1))

quote just returns its argument.

You may always use function in front of a lambda function.

       (function (lambda (x y) (+ (* x x) y))) 
       4 6

       '(lambda (x y) (+ (* x x) y)) 
       4 6

>(apply '(lambda (x) (+ x 1)) '(3))

>(apply (function (lambda (x) (+ x 1))) '(3))

Now we see that the Lisp functions apply and funcall apply to a function, not only by referring to its name, but also by referring to a lambda function.

Note how the function function differs from the Lisp functions funcall and apply; it is not about applying a function; instead, it is a function that takes a function and returns its definition in a form used by the interpreter.

Lambda Calculus

1. Introduction

Lambda calculus is a formal, abstract language where all functions are defined without giving a name. We can understand the foundations of functional programming by studying the properties of this formal language.

It should be noted that lambda notation is not frequently used in practical Lisp programming. The reason is simple: the use of a name is more convenient torefer to a function. Referencing to a function without name is difficult for practical programming if you want the function to be called many times. However, as we have said, our interest lies in the role that lambda calculus plays as a foundation of functional programming. Once we understand an abstract model of computation, we can always develop a language based on it.

2. Language

First, let us narrow down to the smallest language for talking about the model of functional programming.

Curried Functions

Once we have the notion of higher order functions, any n-ary function can be defined equivalently by a unary function.

Theoretical implication: To understand the model of computation for functional programming, all we need is to understand a strict subset where only unary functions are dealt with.

Example. Choosing the smaller one of two numbers is a function that takes two arguments. You would normally do it in Lisp as:

(defun smaller (x y) 
      (if (< x y) x y) 

Now look at the following definition using lambda notation:

(defun smaller (x y) 
   (lambda (x)
               (lambda (y) (if (< x y) x y))

Note that you don't need quote or function in front of a lambda function here, since the interpreter knows the definition of a function (headed by the word lambda ) is a function. Or, you can always use function to make sure nothing goes wrong.

(defun smaller (x y) 
   (function (lambda (x)
               (function (lambda (y) (if (< x y) x y)))

Now, look at the definition of smaller (in the discussion below, let's ignore function):

   (lambda (x)
               (lambda (y) (if (< x y) x y))

This is a nested lambda function -- the body of the function (the underlined below) with x as the parameter is another lambda function.

   (lambda (x)
               (lambda (y) (if (< x y) x y))

Here is what this expression says: the lambda function (of course, this refers to the outside function) takes an argument x and returns the function

  (lambda (y) (if (< x y) x y)) 
that given y yields the smaller number between x and y.

The outside lambda function is a unary function.

If you read this a couple of times you may realize that this lambda function is taking parameters in serial, one at a time. That is how it can be a unary function that does the same.

You can write down this unary lambda function in Lisp syntax and apply it:

   (((lambda (x) (lambda (y) (if (< x y) x y))) 4) 9)

Unfortunately, most lisp systems have difficulties understanding the intention of nested lambda functions. In gcl, you have to explicitly tell it how nesting is intended:

         (lambda (x) (function (lambda (y) (if (< x y) x y)))) 

This looks awful. In practical Lisp programming, one bearly writes code like this. Its interest really lies in lambda calculus serving as a foundation of functional programming.

Exercise: Write a lambda function that takes x as an argument and yields a function that takes y as an argument and appends x and y.

Syntax of lambda calculus

With the above in mind, the language of lambda calculus is simply defined as:

[function]      := (lambda (x) [expression])
[application]   := ([expression] [expression])
[expression]    := [identifiers] | [application] | [function]
[identifiers]   := a | b | ...

Though the language definition is simple and straightforward, expressions in this language could be quite complex and sometimes ugly because they can be deeply nested.

Example: the same expression we saw earlier:

   (lambda (x)
               (lambda (y) (if (< x y) x y))

This is a nested (lambda) function.

Another example: the following is a function application where the argument happens to be a lambda function

      ((lambda (x) (x 2))  (lambda (y) (+ y 1)))
       ^^^^^^^^^^^ ^^^^^   ^^^^^^^^^^^^^^^^^^^          
                    Body   arg (which itself is a function)

3. Reduction Rules

We study the operational semantics of lambda calculus, which is based on reducing expressions to their simplest possible forms. That is, a computation is simply a process of reduction from one expression to some other expressions.

It has been shown that the kind of primitive functions that we normally need elsewhere can all be defined in this language. That is, we don't need to have primitive functions builtin in order to define other functions. This is a deep result. You just have to believe it. Perhaps this is not so surprising. We know that a Turing Machine has no builtin functions either, but we can program all computable functions to run on it.


What types of reductions are allowed; whether a given expression possesses a simplest form. Is a "simplest form" unique, and how can it be found?

This is like making an abstract "machine" work in different ways and find out which ways provide meanful mechanisms for computation. Eventually, scientists figured out a characterization of the "machine" that gives us something equivalent to what a Turing machine can do.

Beta-Reduction: Among several that were studied, the most intuitive and important one is precisely the one that we have informally called function application. It is called beta-reduction in the literature: if you see an expression (could be a sub-expression in some complex expression)

      ((lambda (x) BODY) a)
then you can reduce this sub-expression to the expression
where all the occurrences of x are replaced by a (however, some complications may arise, which are to be discussed sortly).

For example,

      ((lambda (x) (x 2))  (lambda (y) (+ y 1)))
       ^^^^^^^^^^^ ^^^^^   ^^^^^^^^^^^^^^^^^^^          
                    Body   arg (which itself is a function)

reduces to

         ((lambda (y) (+ y 1)) 2)
          resulting from replacing the occurrence of x in (x 2)

which can further reduce to

         (+ 2 1)

Suppose + is defined as usual. This then reduces to 3.

Alpha-Reduction: Bound variables can be given different names without affecting the meaning of an expression. In C, when you declare a procedure, it does not really matter what names you use for the formal parameters, i.e., a procedure proc(x,y) where you could have used other variable names such as w and u for x and y. This is true in Lisp too. But one thing you cannot say is that proc(x,y) is the same as proc(u,u). The latter is more restrictive. In logic, when you say, for all x such that p(x), it is logically equivalent to say, for all y such that p(y).

Thus, it should not be a surprise that in lambda calculus, a bound variable can be replaced by another if the latter doesn't cause any conflict (particularly if it is a new name).


     (lambda (x) (+ x y))

Note that x is bound in the scope of (lambda (x) ...). But y is free.

This expression is equivalent to

     (lambda (z) (+ z y))

This is considered a type of reduction in lambda calculus, called alpha-reduction, and we write

     (lambda (x) (+ x y))   -->alpha   (lambda (z) (+ z y))

In what cases such a renaming could cause a name conflict?


     (lambda (x) (+ x y))   -->alpha   (lambda (y) (+ y y))

y was unbound in the left hand side (it's like a "global" variable) and became bound after the renaming. The meaning of y is changed. This is called a name conflict . The name y was used as a name whose scope is outside the given lambda expression, and after the reduction its scope is changed to a local one.

Maybe you are troubled by y being free. Well. y is free in that scope. If that expression is a sub-expression of a larger one, say

      (lambda (y) (lambda (x) (+ x y)))
then y is still free in that scope but bound in the scope of (lambda (y) ...).

Name conflicts can be avoided in alpha-reduction if you always use new variable names (so there is no possibility of conflicting).

Direct substitutions may not always work correctly!

The following example shows that direct substitution could lead to a wrong result.


     ((lambda (x) (lambda (y) (x y))) y)
                  ^^^^^^^^^^^^^^^^^^  ^
                      Body            arg

This expression is quivalent to

     ((lambda (x) (lambda (z) (x z))) y)
                  ^^^^^^^^^^^^^^^^^^  ^
                      Body            arg

This is obtained by a step of alpha-reduction; bound variable y is renamed by a new variable z.

However, by the direct substitution in beta-reduction, the former reduces to

     (lambda (y) (y y))

while the latter to

     (lambda (z) (y z))
They are not equivalent, since when they are applied to a common object, the results could differ
     ((lambda (y) (y y)) a) -->beta  (a a)
     ((lambda (z) (y z)) a) -->beta  (y a)  

What is wrong? The scope of y was changed accidentally in the direct substitution: it was free earlier and became bound after the substitution.

Thus, in beta-reduction, the scoping of a variable should be preserved by variable renaming to ensure that reduction is correct. In this case, we get

     ((lambda (x) (lambda (y) (x y))) y) -->beta  (lambda (z) (y z))

Note that correct beta reductions can always be achieved by renaming (alpha-reduction) followed by a beta-reduction using direct substitution.

     ((lambda (x) (lambda (y) (x y))) y) 
    -->alpha    ((lambda (x) (lambda (z) (x z))) y)
    -->beta     (lambda (z) (y z))

Normal Form:

A lambda expression that cannot be reduced further (by beta-reduction) is called a normal form.

If a lambda expression E can be reduced to a normal form, we then say that E has a normal form.

In general, a lambda expression may not have a normal form.

     ((lambda (x) (x x)) (lambda (y) (y y)))
A step of reduction of this expression yields the same expression (up to bound variable renaming).

Such lambda expressions are useful (actually indespensible) in encoding recursive functions. We will not discuss the details in this course. But we should note that no language is sufficiently powerful if it cannot express recursive functions.

Normal Order Reduction vs. Applicative Order Reduction

These are general concepts associated with functional evaluations.

A function application may be evaluated in different ways. E.g. suppose we have two functions which are defined as

      f(x) = x + x 
      g(x) = x + 1
Now consider the function application f(g(2))

Normal Order Reduction:

         f(g(2)) --> g(2) + g(2) --> 3 + g(2) --> 3 + 3 --> 6

That is, we apply the "outer most" function whenever it is possible.

Applicative Order Reduction:

         f(g(2)) --> f(3) --> 3 + 3 --> 6

That is, we apply the "inner most" function whenever it is possible.

If there is more than one outermost or innermost functions that are applicable, we should choose the leftmost one (for no other reason than just fixing the choice). E.g.

    f(g(2)) + f(g(4))
where suppose + requires its argument to be integers before the addition can be carried out. In this case, we should choose f(g(2)) for normal order reduction, and g(2) for applicative order reduction.

One notices that for the evaluation of expression f(g(2)) above, g(2) is evaluated twice in normal order reduction and only once in applicative order reduction. This seems to suggest that applicative order is in general more efficient. This is indeed the case. However, normal order has better termination property.

An Example where NOR terminates and AOR does not

   g(x) = cons(x, g(x+1))
   f(x) = a                 a constant function

   Consider f(g(0))

   NOR: f(g(0)) -> a
   AOR: f(g(0)) -> f(cons(0, g(1))) -> f(cons(0, cons(1, g(2)))) -> ...

Examples of Orders of Reductions in Lambda Calculus

Example A.

     ((lambda (x) (+ 1 x)) ((lambda (z) (+ 1 z)) 3)) 

Normal order reduction:

     ->  (+ 1 ((lambda (z) (+ 1 z)) 3)) 
     ->  (+ 1 (+ 1 3))
     ->  (+ 1 4)
     ->  5

Applicative order reduction:

     ->  ((lambda (x) (+ 1 x)) (+ 1 3))
     ->  ((lambda (x) (+ 1 x)) 4)
     ->  (+ 1 4)
     ->  5

Example B. 

      ((lambda (x) (+ x x))  ((lambda (y) (+ 3 y)) 2))

Normal order reduction:

   ((lambda (x) (+ x x))  ((lambda (y) (+ 3 y)) 2))
-> (+ ((lambda (y) (+ 3 y)) 2) ((lambda (y) (+ 3 y)) 2))
-> (+ (+ 3 2) ((lambda (y) (+ 3 y)) 2))
-> (+ 5 ((lambda (y) (+ 3 y)) 2))
-> (+ 5 (+ 3 2))
-> (+ 5 5)
-> 10

Applicative order reduction:

   ((lambda (x) (+ x x))  ((lambda (y) (+ 3 y)) 2))
-> ((lambda (x) (+ x x))  (+ 3 2))
-> ((lambda (x) (+ x x))  5)
-> (+ 5 5)
-> 10

Church Rosser Theorem

(In this theorem we use -> to denote zero or more reduction steps.)

1. If A -> B and A -> C then there exists an expression D such that B -> D and C -> D

2. If A has a normal form E, then there is a normal order reduction A -> E.

That is, no matter what reduction strategy is used initially, there is always a way to converge into the same expression. In addition, normal order reduction guarantees termination if the given expression has a normal form.

The abstract model of computation of lambda calculus lays down the foundations for a style of programming which we have called functional programming. One can develop a language based on this model, implement an interpreter for the language, and maybe design a special machine for it. These are the topics of the forthcoming lectures.