Today is a theory day. We will take a look at the mathematical precursor to Scheme, the lambda calculus.
The lambda calculus was invented by Alonzo Church in the 1930s to study the interaction of functional abstraction and function application from an abstract, purely mathematical point of view.
It was one of many related systems that were proposed in the late 1920s and 1930s. This was an exciting time for theoretical computer science, even though computers had not yet been invented. There were many great mathematicians such as Kurt Gödel, Alan Turing, and Emil Post who were trying to understand the notion of computation. They certainly understood intuitively what an algorithm was, but they were lacking a formal mathematical definition. They came up with a number of different systems for capturing the general idea of computation:
These systems all looked very different, but what was amazing was that they all turned out to be computationally equivalent in the sense that each could encode and simulate the others. Church thought that this was too striking to be mere coincidence, and declared that they had found the correct definition of the notion of computability: it was the common spirit embodied in a different way by each one of these formalisms. This declaration became known as Church's thesis. Nowadays we have a better understanding of computability, and one could add any modern general-purpose programming language such as Java, C, or Scheme (suitably abstracted) to this list, but Church and his colleagues did not have that luxury.
The lambda calculus is like Scheme except much simpler.
Lambda terms E, F, G, ... are defined inductively:
Notes:
(E
F)
in Scheme. Think of this as an expression with
an invisible operator "apply" between the E and F, like + or
*. Parentheses are used for readability only, so they do not have the
same significance as in Scheme. The apply operator is not associative,
so (EF)G is not the same as E(FG) in general. The expression EFG
written without parentheses
should be read as (EF)G.(lambda
(x) E)
in Scheme. The application operator binds tighter than abstraction
operator, so lx.lx.xy
should be read as lx.(lx.(xy)).Computation in the lambda calculus takes place by substitution. There are two main rules:
Examples in Scheme:
(lambda (x)
(+ x 3))
is equivalent to (lambda (y) (+ y 3))
.
((lambda
(x) (+ x 3)) 7) => (
+ 7 3)
.
An a- or b-reduction step can be performed at any time to any subterm of a lambda term. Write E => F if E goes to F in some finite number of a- or b-reduction steps.
In the lambda calculus, a reduction rule can be performed at any time to any subterm of a lambda term at which the rule is applicable. Scheme specifies an order of evaluation, but the lambda calculus does not. It turns out that it does not matter because of the Church-Rosser theorem: If E => F1 and E => F2, then there always exists a G such that F1 => G and F2 => G.
E
/ \
/ \
/ \
/ \
F1 F2
\ /
\ /
\ /
\ /
G
So if you diverge by applying rules to different parts of the term, you can always get back together again by applying more rules. Example:
((lambda (x) (+ x x)) (* 2 3))
/ \
lazy/ \eager
/ \
/ \
(+ (* 2 3) (* 2 3)) ((lambda (x) (+ x x)) 6)
\ /
\ /
\ /
\ /
(+ 6 6)
Terms are equivalent if they have a common reduct; that is, E1 == E2 if there exists an F such that E1 => F and E2 => F.
E1 E2
\ /
\ /
\ /
\ /
F
The relation == is an equivalence relation on terms. To show transitivity, we use the Church-Rosser property: if E1 == E2 and E2 == E3, then there exist F1 and F2 such that
E1 E2 E3
\ / \ /
\ / \ /
\ / \ /
\ / \ /
F1 F2
By the Church-Rosser property, there exists a G such that F1 => G and F2 => G:
E1 E2 E3
\ / \ /
\ / \ /
\ / \ /
\ / \ /
F1 F2
\ /
\ /
\ /
\ /
G
which proves that E1 == E3.
All functions are unary in the lambda calculus. Functions of higher arity are simulated by currying. A function of two arguments is simulated by a function of one argument that returns a function of one argument that returns the result. That is, we would simulate
(lambda (x y) E)
in curried form by
(lambda (x) (lambda (y) E)) = lx.ly.E
Instead of applying the function to two arguments, e.g.
(f A B)
we apply it to the first argument, then apply the result to the second argument:
((f A) B)
The lambda calculus is powerful enough to encode other data types and operations.
Define
#t =def=
lx.ly.x
#f =def= lx.ly.y
In other words, #t and #f are two-argument functions that just return the first and second argument, respectively. Then we can take "if" to be the identity function!
if =def= lx.x
(if #t
B C) = ((if #t) B C) =>
(#t B C) = ((lx.ly.x
B) C)
=> (ly.B C) =>
B
(if #f
B C) = ((if #f) B C) =>
(#f B C) = ((lx.ly.y
B) C)
=> (ly.y C) =>
C.
We can define other Boolean functions:
and =def=
lx.ly.xyx
or =def=
lx.ly.xxy
Then for example
(and #t #f) => (#t #f #t) => (lx.ly.x #f #t) = #f.
In Scheme, all cons does is package up two things in such a way that they can be extracted again by car and cdr. To code this in the lambda calculus, we can take two terms A and B and produce a function that will apply another function to them.
cons =def= lx.ly.lc.cxy
Thus if we apply this to A and B we get
(cons A B) => lc.(c A B)
Now we can extract A and B again by applying this function to #t and #f, respectively:
(lc.(c
A B) #t) => (#t A B) = (lx.ly.x
A B) => A
(lc.(c A B) #f) => (#f A B) = (lx.ly.y
A B) => B
so we should define
car =def=
ld.(d #t)
cdr =def=
ld.(d #f).
If we define the null list object appropriately, we can get a null? test:
null =def=
lx.#t
null? =def= ld.(d
lx.ly.#f).
Then
(null? null) = (ld.(d lx.ly.#f) lx.#t) => (lx.#t lx.ly.#f) => #t
(null? (cons A
B)) = (ld.(d lx.ly.#f)
lc.(c A B))
=> (lc.(c A B) lx.ly.#f)
=> (lx.ly.#f A B)
=>
#f.
In Scheme, numbers are a built-in primitive type. In the lambda calculus, we have to encode them as lambda terms. Church came up with a neat way to do this and his encoding is called the Church numerals. The idea is to encode the number n as a lambda term n representing a function that takes another function f as input and produces the n-fold composition of f with itself.
0
=def= lf.lx.x
1 =def= lf.lx.fx
2 =def= lf.lx.f(fx)
3 =def= lf.lx.f(f(fx))
...
n =def=
lf.lx.f(...(f(f(fx)))...)
Note that 0 is just #f after a-conversion. Let's write fnx as an abbreviation for the term f(...(f(f(fx)))...). Then
n =def= lf.lx.fnx.
This is nice, but in order to do anything with it, we have to encode arithmetic. Let's start with the simplest operation: successor. We would like to define a lambda term add1 that when applied to any n produces n+1. Note that
nfx => fnx,
so
f (nfx) => (f fnx) = fn+1x.
Lambda abstracting, we get
lf.lx.f(nfx) => lf.lx.fn+1x = n+1.
Therefore we want our successor function to produce lf.lx.f(nfx) from n, so we define
add1 =def= ln.lf.lx.f(nfx).
How do we test for 0?
zero? =def= ln.(n lz.#f #t).
Then
(zero? 0)
= (ln.(n lz.#f
#t) lf.lx.x)
=> (lf.lx.x
lz.#f #t) => (lx.x #t)
=> #t
(zero? n+1)
= (ln.(n lz.#f
#t) lf.lx.fn+1x)
=> (lf.lx.fn+1x
lz.#f #t) => (lx.(lz.#f)n+1x
#t)
=>
((lz.#f)n+1
#t) => #f.
Addition is just iterated successor. Note that
(n add1) => lx.(add1n x)
thus
(n add1 m) => (lx.(add1n x) m) => (add1n m) => m+n.
This suggests we define
+ =def= ln.lm.(n add1 m).
Multiplication can be defined by iterated addition and exponentiation by iterated multiplication in a similar fashion.
Subtraction is a little more difficult, but it can be done. Define inductively
[]
=def= null
[n
n-1 n-2
... 2 1
0] =def=
(cons n
[n-1 n-2
... 2 1
0])
So [...] represents a proper list in the usual Scheme sense under our encoding. Our approach is to come up with a term F such that
nF[0] => [n n-1 n-2 ... 2 1 0].
Then we will be able to take
sub1 =def= ln.(if (zero? n) 0 (car (cdr (nF[0])))).
Note
nF[0] => (Fn [0])
so this will work provided for all n we have
(F [n n-1 n-2 ... 2 1 0]) => [n+1 n n-1 n-2 ... 2 1 0].
This can be achieved by defining
F =def= lx.(cons (add1 (car x)) x).
A recursive definition in Scheme such as
(define fact
(lambda (x) (if (zero? x)
1
(* x (fact (sub1 x))))))
defines the
function fact
in terms of itself. This self-referential
capability that we can achieve in Scheme at first seems impossible to simulate
in the lambda calculus because of the lack of names. However it can be
done.
Note that we are really looking for a term F that is a solution of the equation
F == lx.(if (zero? x) 1 (* x (F (sub1 x)))).
Put another way, we are looking for a fixpoint of the function
T =def= lf.lx.(if (zero? x) 1 (* x (f (sub1 x)))),
that is, a term F such that TF == F.
Fixpoint theorems abound in mathematics. A whorl on your head where your hair grows straight up is a fixpoint. At any instant of time, there must be at least one spot on the globe where the wind is not blowing. For any continuous map f from the closed real unit interval [0,1] to itself, there is always a point x such that f(x) = x.
The lambda calculus is no exception. Any term W has a fixpoint
(lx.W(xx) lx.W(xx))
as can be seen by performing one b step:
(lx.W(xx) lx.W(xx)) => W(lx.W(xx) lx.W(xx)).
Moreover, there is a lambda term Y, called the fixpoint combinator, which applied to any W gives its fixpoint:
Y =def= lt.(lx.t(xx) lx.t(xx)).
Let's argue that the fixpoint of T above gives the factorial function. Define
fact =def= (lx.T(xx) lx.T(xx)).
We wish to show that for any n,
(fact n) => n!
We argue this by induction on n. Basis:
(fact 0)
=> ((lx.T(xx) lx.T(xx))
0) => (T(lx.T(xx)
lx.T(xx)) 0)
=> (lx.(if (zero? x) 1
(* x ((lx.T(xx) lx.T(xx))
(sub1 x)))) 0)
=> (if (zero? 0) 1
(* 0 ((lx.T(xx)
lx.T(xx)) (sub1 0))))
=> 1.
Induction step:
(fact n+1)
=> ((lx.T(xx) lx.T(xx))
n+1) => (T(lx.T(xx)
lx.T(xx)) n+1)
=> (lx.(if (zero? x) 1
(* x ((lx.T(xx) lx.T(xx))
(sub1 x)))) n+1)
=> (if (zero? n+1) 1
(* n+1 ((lx.T(xx)
lx.T(xx)) (sub1 n+1))))
=> (* n+1 ((lx.T(xx)
lx.T(xx)) (sub1 n+1)))
=> (* n+1 ((lx.T(xx)
lx.T(xx)) n))
= (* n+1 (fact n))
=> (* n+1 n!)
(by the induction hypothesis)
=> (n+1)!