# A Seriously Slow Fibonacci Function

I recently wrote an article which was ostensibly about the Fibonacci series but was really about optimization techniques. I wanted to follow up on its (extremely moderate) success by going in the exact opposite direction: by writing a Fibonacci function which is as *slow* as possible.

This is not as easy as it sounds: any program can *trivially* be made slower, but this is boring. How can we make it slow in a fair and interesting way? The answer is to use a model of computation which is not *deliberately* designed to be slow but which in practice is quite slow, usually because the designer had something quite different than performance in mind.

While there are several to choose from, I selected the \(\lambda\)-calculus (read “lambda calculus”) as being particularly easy to write an spectacularly inefficient implementation in.

Some of you no doubt will be having flashbacks at the mention of the name, while others have already started slowly edging their mouse towards the close tab icon. But don’t worry - if you done any programming before you’ve already seen all the “hard” ideas associated with it and what’s left is a “toy” language that can be learned in a few minutes. By the end of this article you will see clearly how you could write your own non-trivial programs directly in the \(\lambda\)-calculus.

In fact, it’s main problem is that it’s *too* simple: it is difficult at first to see how anyone could *do* anything with it. Luckily for us, there exists a set of macros which turn the \(\lambda\)-calculus into a much higher level language. While Alonzo Church was inventing the \(\lambda\)-calculus itself he also developed this set of macros in parallel so that he could convince himself and others that it really could compute. These macros provide a simple and concrete way of encoding numbers, mathematical operators, boolean logic, and even data structures like lists, maps, and trees. Today, this technique is called Church encoding. If \(\lambda\)-calculus is machine code, then the Church encoding is C.

## Goal

David Allen says to begin with the goal in mind, so let’s take a look at what we’re shooting for. As a blueprint, let’s first look at a performance-naive implementation of a function which finds the \(n\)-th Fibonacci number, implemented in vanilla Python:

```
def fibonacci(n):
if n <= 1:
return n
else:
return naive_fib(n-1) + naive_fib(n-2)
```

Let’s put together a shopping list of features we need to implement this function:

- natural numbers
- addition
- subtraction
- less than comparison
- if/else branch
- recursion

Well, we certainly have our work cut out for us, don’t we? The first four require a model of the Peano axioms, the if/else branch requires a model of Boolean algebra, and recursion is usually regarded as a non-trivial language feature.

For the purposes of this article, we’re going to take a non-standard approach and not use the original notation. Instead, we’ll use a subset of Python which has a one-to-one correspondence with the \(\lambda\)-calculus but which is hopefully both more familiar and more accessible to readers: all of the code in this article will run in any Python 3 interpreter so that readers may follow along and try their own experiments if they like. I shall call this subset the “Pythonic” \(\lambda\)-calculus when I want to specifically refer to the lambda calculus implemented as a subset of the Python grammar.

In the next section I will describe this subset more formally. In this section, I’ll just do a quick high-level overview so you have some idea of where we are heading.

Basically, we’ll be writing Python code, but restricting ourselves to *only* using the anonymous function syntax (e.g., `lambda x: ...`

) and function calls (e.g., `f(x)`

).

In addition, we will expand our language with *definitions*, which are basically macros with a human readable name that expand out to \(\lambda\)-calculus expressions. Here is an example of a definition, the contents of which we will return to later:

`plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))`

Definitions get substituted into other expressions very much like running “Find and Replace All”:

`s/plus/(lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)))/g`

By introducing definitions, we will gradually build up a surprisingly high-level and expressive language on top of the \(\lambda\)-calculus. We can then revert to proper \(\lambda\)-calculus at any time by simply expanding out all the definitions like macros.

With those preliminaries out of the way, let me show you are goal. Although we will the rest of this article to understand in detail how it works, we will end up with a Fibonacci function that looks something like this:

```
fibonacci = Y(lambda f: PIR(
# define function f(n) as:
lambda n: (
# if n < 2
less_than(n)(two))
# then n
(n)
# else f(n-1) + f(n-2)
(plus
(f(minus(n)(one)))
(f(minus(n)(two))))))
```

As always in Python, `#`

indicates an non-executable comment.

It shouldn’t be to hard to see the basic shape of the familiar, friendly Fibonacci function in there. If you have any LISP, it will even look vaguely familiar, although the parentheses are in slightly different places. (In LISP, function application looks like `(minus n 2)`

instead of `minus(n)(two)`

.)

## The Pythonic Lambda Calculus

In the \(\lambda\)-calculus there are only two operations: abstraction and application. These two can be composed to write any program (computable function) that has or ever will be written.

What do we mean by the term “abstraction?” Let’s say we have a concrete calculation, such as the sums of squares of a vector like `(3, 4)`

. We could type the following into a Python interpreter:

`3*3 + 4*4`

In an interactive programming session, this might suffice: you type something in and hit ENTER to see the answer. But in most cases values `3`

and `4`

are too hard-coded to be useful. So we can *abstract* this by replacing `3`

with a placeholder variable `x`

and introducing a function of `x`

:

```
def half_ss(x):
return x*x + 4*4
```

Now that we have an abstraction, a.k.a. a *function*, how do we use it? Before we can do any concrete computation, we need to know what `x`

is supposed to be. This is called *function application* or simply *application*, and we use the familiar `f(x)`

syntax common to math and almost all programming languages.

`half_ss(3)`

But this “half” function isn’t very satisfactory. The `4`

is still hard coded.

If we want to do the same thing again and abstract `4`

into `y`

, we have a couple of choices. We *could* greatly complicate our language and add another syntactic primitive `,`

:

```
def half_ss(x, y):
return x*x + y*y
ss(3, 4)
```

But if we want to keep things simple, why don’t we simply perform abstraction twice?

```
def ss(x):
def half_ss(y):
return x*x + y*y
return half_ss
```

Note that when we call `ss(3)`

, what is returned is again a function, so we can use a second application to pass the second argument:

`ss(3)(4)`

The two applications “cancel out” the two abstractions, and we are left with the concrete value `3*3 + 4*4`

, more commonly known as 25.

This works because we call `ss`

with the first argument and instead of resolving to a value, it instead returns a function that we can call with the second argument to get the result. We can repeat this operation as many times as necessary, but it get’s inconvenient to make up a silly name like `half_ss`

each time; instead we’ll use an *anonymous* lambda function:

```
ss = lambda x: lambda y: x*x + y*y
ss(3)(4)
```

So to recap, we implement the abstraction operation of \(\lambda\)-calculus in Python by using a `lambda`

expressions with a single argument each, and if we want to define a function that takes more than one argument we stack up lambda expressions like `lambda x: lambda y: lambda z :...`

.

Note that we can write our entire program without recourse to any names if we treat `ss`

as a definition and replace the string `ss`

with its right-hand side where ever it appears. The final program is:

`(lambda x: lambda y: x*x + y*y)(3)(4)`

Which you can run and verify that it gives the answer 25.

In the \(\lambda\)-calculus, application is the *only* way we can do *anything*, so we should re-write the binary expressions as more primitive functions:

`(lambda x: lambda y: plus(mult(x)(x))(mult(y)(y)))(3)(4)`

One specific trick deserves attention. We could define a constant function like so:

`lambda x: c`

When called, this function ignores the argument `x`

and always returns the constant `c`

. This expression however, has `c`

as a free variable: there is no `lambda c:`

above it. We can remedy that with another abstraction:

`lambda c: lambda x: c`

This gives us a way to construct constant functions out of any value. This kind of thing is called a closure. Closures have the property that they can “remember” values assigned to them at runtime. In fact, the function `ss`

defined above also returned a closure, which “remembered” that `x`

was supposed to equal `3`

until the time came when a concrete calculation could be carried out. Closures are extremely important in the \(\lambda\)-calculus because they are our only way of defining data structures and passing state around.

Setting aside closures, let’s next discuss how computation is actually carried out. You may well have noticed that the operation I’ve called “application” shows the *syntax* that says a function *should* be called with a certain argument, but I haven’t said anything about how the calling of a function should actually be carried out!

It turns out this is exactly the reverse of abstraction - we replace abstract variables with concrete values. This is called the \(\beta\)-reduction rule (read “beta reduction”.) For example, we may call the function `lambda x: x(x)(x)`

with concrete value `t`

by writing `(lambda x: x(x)(x))(t)`

. The \(\beta\)-reduction rules allows us to remove the `lambda x`

and go through and replace every `x`

in the body of the function with `t`

, which leaves us with `t(t)(t)`

. Since `t`

is a free variable, we cannot reduce this any further, so we stop. Let’s do the same thing to our sums-of-squares example:

```
(lambda x: lambda y: plus(mult(x)(x))(mult(y)(y)))(3)(4)
(lambda y: plus(mult(3)(3))(mult(y)(y)))(4)
plus(mult(3)(3))(mult(4)(4))
```

Since `plus`

, `mult`

, and even `3`

and `4`

are just definitions, we could expand those definitions and continue the beta-reduction process until only a single concrete value remains. Below, we will study the Church encoding for natural numbers and learn exactly how to do this.

So abstraction introduces lambda expressions, application tells us when we should call a function and which argument to pass in, and \(\beta\)-reduction tells us how to carry out that function call.

There is one other rule, called \(\alpha\)-replacement (read “alpha replacement”), which tells us that we can change the variable names of functions whenever we want, as long as we are consistent and change it everywhere inside the body of the function too while also avoiding conflicts with other variable names. For example, these two lambda expressions are the same, and we can use the \(\alpha\)-replacement rule to transform one into the other and vice versa:

```
lambda x: x === lambda y: y
lambda x: x !== lambda x: y
```

The lambda expression `lambda x: y`

on the other hand, would not be the same because we cannot replace `x`

with `y`

without a conflict. We could change `lambda x: y`

into `lambda z: y`

, but that would be a different function. This rule should be intuitively obvious. It’s also not particularly important because we equally well could have used an infinite sequence of variable names to avoid conflicts; this would eliminate the need for the \(\alpha\)-replacement rule. The \(\beta\)-reduction rule captures the very essence of what it means to carry out a computation; the \(\alpha\)-replacement rule is book-keeping.

The above gives the *flavor* of the \(\lambda\)-calculus: abstraction, application, \(\alpha\)-replacement and \(\beta\)-reduction. Since the \(\lambda\)-calculus is Turing complete, this can be interpreted as implying that all programming can be reduced to abstraction and application, and all computation can be reduced to the \(\beta\)-reduction rule; all else is vanity and grasping at wind.

But the examples I’ve used have share the weakness that if you go far enough down, you are relying on other operations. For example, `mult = lambda x: lambda y: x * y`

is ultimately defined in terms of the built-in multiplication operation `*`

, and `x`

and `y`

are are of type `int`

. This won’t do; indeed, this is a serious defect, because the whole point of the lambda calculus is to prove that these two operations suffice to define *any* computable function. Copping out halfway through and relying on native operations proves nothing.

To correct this defect, we need to start from scratch and scrupulously avoid using any operation *except* for abstraction and application. Happily, Church encoding provides a roadmap… it will however lead us to types and data structures very different than the native python `int`

and `bool`

!

## Formal Grammar

The subset of Python which constitutes the Pythonic \(\lambda\)-calculus can fully described by this BNF specification:

```
<expression> ::= <variable>
| "(" lambda <var> ":" <expression> ")"
| <expression> "(" <expression> ")"
```

In plain english, we build up expressions recursively out of variables, lambda functions, and function applications.

In some cases, where it causes no ambiguity, we will omit the parentheses around lambda functions, or add parentheses around function application e.g. `(f(x))`

. Note that the *placement* of parentheses is rather different than in the original \(\lambda\)-calculus syntax! Where we might have written \((f x y)\) in the convential notation we will write `f(x)(y)`

, and where we would have written \((f (x y))\) we will write `f(x(y))`

. Hopefully, the Pythonic notation for function application will actually be *more* familiar to those of you who have studied modern high level programming languages or pure mathematics.

As a convenience, we will also allow ourselves *definitions*, although we will later show that these definitions are merely a convenience and can be done away with whenever we choose through the simple process of string substitution. (The development of the Church encoding proceeds mainly by means of such definitions.) A definition looks like this:

`<definition> ::= <definition-name> "=" <expression>`

Where the `<expression>`

on the right is restrictions to have no free variables. That is to say, every variable used is in fact inside the body of a lambda function with that variable as a parameter. For example, `lambda x: x(x)(x(x))`

has no free variables, but `lambda x: y(x)`

has `y`

as a free variable. Furthermore, while definitions can contain other definitions, they cannot ever contain themselves, not even implicitly hidden away in some other definition. Otherwise, the simple string substitution macro expansion of a definition would never terminate! (Later, when we need recursion to implement the Fibonacci function, we will need to find a way to do this!)

These restrictions exist so that we can think of our extended grammar as nothing more than a set of lightweight macros on top of the \(\lambda\)-calculus. In principle, for any given program written in the extended grammar, we can simply substitute in the body of the definition wherever we find the name of the definition. This takes only a finite number of steps and is guaranteed to terminate. At the end of this process, we are left with an equivalent program with no definitions or definition names. Furthermore, we can complete this “macro preprocessing” step entirely before beginning to run the program. In this sense, it is ancillary to the real calculation.

BTW, these rules for what constitutes a valid *definition* (as opposed to an axiom) can be traced back to Frege, who needed it because he was in the process of adding quantifiers over bound variables to logic. It turned out be a very fruitful principle; modern mathematics is 99% definitions, with only a handful of axioms holding up the foundation. It’s also very broad - even though the \(\lambda\)-calculus is not a logic, the concept of “definition” remains much the same.

To wrap up, the full extended grammar looks like this:

```
<definition> ::= <definition-name> "=" <expression>
<expression> ::= <variable>
| "(" lambda <var> ":" <expression> ")"
| <expression> "(" <expression> ")"
| "(" <definition-name> ")"
```

Both the original grammar and the extended grammar are strict subsets of Python, and as such is runnable on a Python interpreter. This is, perhaps, the most extreme example of programming “into” a language instead of programming “in” a language, following McConnell’s distinction.

Note that if we run an expression in the extended grammar directly in Python, the interpreter does *not* do the macro expansions as described above… but the results of the calculations will always be identical if we’ve carefully followed the rules for introducing definitions! Later, we will show examples of running programs both ways.

## Church Booleans

We’ll start with the simplest item on our shopping list: Boolean logic. There are only two Boolean values:

```
# Church Booleans
true = lambda x: lambda y: x
false = lambda x: lambda y: y
```

Note that these are **not** the same as Python’s built-in `True`

and `False`

constants; our lowercase `true`

and `false`

appear to Python as callable objects of type `function`

, not objects of type `bool`

.

Some of you may object that these are not *values*, these are *functions*. Of course they are; the \(\lambda\)-calculus is made out of nothing *but* functions! But that doesn’t prevent us from thinking of some functions as values when it is convenient for us. Consider the “objects” of object-oriented programming - an object is nothing *but* a collection of functions, but we usually think of objects as values.

More generally, it is often convenient to talk about the “type” of different lambda expressions. However, because we are technically working in the “untyped” \(\lambda\)-calculus, we will have to keep the concept of “type” high-level and informal for now. (There are also “typed” versions but we don’t need it to actually compute stuff.)

Since we are keeping things informal, we can use an intuitive definition: the “type” of an expression is roughly the number and types of the parameter it would normally be called with. In high level languages this is often called the function signature. The two Church Booleans we defined both have the same type - they expect to be called with two arguments and will then return one or the other of those two arguments unchanged.

Consider the following function, which takes a Boolean argument:

`lambda p: p(a)(b)`

This piece of code will work equally well if `p`

is `true`

or `false`

- only the behavior will be different. If `p`

is `true`

, it will return `a`

and if `p`

if `false`

it will return `b`

. In other words, it has the same semantics as an if/else statement or the `?:`

ternary operator.

In general, if we have a predicate (an expression which evaluates to a Boolean), we can always write an if/else statement like:

```
((<predicate>)
(<if-expression>)
(<else-expression>))
```

While we will have to wait until after we’ve defined the Church numbers to get really useful predicates like `less_than`

, we can go ahead and define the usual Boolean operators purely in terms of our above definitions for `true`

and `false`

:

```
# Boolean logic
AND = lambda x: lambda y: x(y)(false)
OR = lambda x: lambda y: x(true)(y)
NOT = lambda x: x(false)(true)
XOR = lambda x: lambda y: x(NOT(y))(y)
```

Each of these expects to be passed Boolean values and then returns a single Boolean value. We can unpack these using the above equivalence to `if/else`

; for example, `AND`

reads as “if x is true, then return y, else return false”. This will return true only if `x`

and `y`

are both `true`

, so this function acts like “and”. You can read the other definitions in the same way.

To more easily interface with these functions, let’s write some bridge code:

```
from typing import Callable, Any
def church_to_bool(b: Callable) -> bool:
return b(True)(False)
def bool_to_church(b: bool) -> Callable:
return true if b else false
```

Now that we have these bridge functions, it’s fairly easy to write a test demonstrating that we’ve correctly implemented the truth tables for each of Boolean operation:

```
for x in (True, False):
for y in (True, False):
x_c = bool_to_church(x)
y_c = bool_to_church(y)
z_c = AND(x_c)(y_c)
z = church_to_bool(z_c)
print(x, "AND", y, "=", z)
```

True AND True = True True AND False = False False AND True = False False AND False = False

At this point I encourage you to try to test some of the other Boolean operators, and to write your own, such as `NAND`

or the `SUM`

and `CARRY`

of a full adder for more of a challenge.

This has been our first taste of computing the \(\lambda\)-calculus. The pattern (which we’ll soon see more of) is simple: use Church encoding to somehow translate your input values into lambda expressions. Pass those into a lambda expression which represent your program. Finally, reverse the Church encoding to recover meaningful values.

## Church Numerals

The Church encoding of the natural numbers, called Church numerals, defines the number \(n\) to be a binary function (here, “binary” means taking two arguments) which takes a function `f`

and an arbitrary value `x`

and applies `f`

to `x`

exactly \(n\) times:

```
zero = lambda f: lambda x: x
one = lambda f: lambda x: f(x)
two = lambda f: lambda x: f(f(x))
three = lambda f: lambda x: f(f(f(x)))
# ... and so on
```

How do you apply a function zero times? Well, you don’t; you just return the value `x`

right away. To call it once is `f(x)`

, twice is `f(f(x))`

, and so on. This means that Church numbers are **not** an arbitrary sequence of symbols that only gain semantics because of the relations defined on them (as they are in other models) but actually have behavior which is directly related to their meaning: the \(n\)-th Church number has the behavior of repeating a computation \(n\) times.

For example, suppose we have a function called `print`

and a message `"Hello, Lambda Calculus!"`

and we want to print this message 3 times. How would we implement the equivalent of a `for`

or `while`

loop in the Pythonic lambda calculus? Just so:

```
> three(print)("Hello, Lambda Calculus!")
Hello, Lambda Calculus!
Hello, Lambda Calculus!
Hello, Lambda Calculus!
```

Of all the high-level programming languages I am aware of, I think only Ruby comes close to the idea that numbers should literally be their own `for`

loops. Even LISP and Haskell require recursion, a separate map function, or a loop macro. (For code readability alone this is probably a good thing, though. In writing this article I’ve found the lack of traditional signpost statements more confusing than elegant, and have had to use comments to indicate where such control flow statements were being used.)

This one-to-one correspondence between Church numerals and behaviors makes it relatively easy to define mathematical operations on Church numerals. Following Peano, the first thing we need is a successor function which can increment a number by one:

`succ = lambda n: lambda f: lambda x: f(n(f)(x))`

Why does this work? Well, given an original number `n`

, it first *uses* `n`

to apply `f`

to `x`

\(n\) times. It then applies `f`

once more itself. Thus, the final value will be `f`

applied to `x`

\(n+1\) times, which is \(n+1\) by definition.

This successor function allows us to easily construct new numbers *ad infinitum*:

```
# 0-10 for convenience
zero = lambda f: lambda x: x
one = lambda f: lambda x: f(x)
two = succ(one)
three = succ(two)
four = succ(three)
five = succ(four)
six = succ(five)
seven = succ(six)
eight = succ(seven)
nine = succ(eight)
ten = succ(nine)
church_digits = [zero, one, two, three, four, five, six, seven, eight, nine]
```

We can now define other mathematical operators:

```
# church numerals
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mult = lambda m: lambda n: lambda f: lambda x: m(n(f))(x)
exp = lambda m: lambda n: n(m)
pred = (lambda n:
lambda f: lambda x:
n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u))
minus = lambda m: lambda n: n(pred)(m)
```

Of these, `plus`

and `mult`

are the easiest to understand. `plus`

first applies `f`

to `x`

\(m\) times, then applies `f`

to the result \(n\) times, for a total of \(n+m\) applications. The multiplication operator `mult`

is similar, but does things in a slightly different order: it *first* defines a new function `g = n(f)`

which applies `f`

to some value n times, and *then* applies `g`

to `x`

m times. Since each call to `g`

ends up calling `f`

\(n\) times, the result is that `f`

is applied to `x`

\(m \times n\) times.

Try to figure out `exp`

for yourself. It’s equivalent to \(m^n\). It’s not on the main line of functionality we need for the Fibonacci function, and it’s very clever!

`pred`

is the predecessor relation. It subtracts one from a number if possible. (Zero is just mapped to zero, as negative numbers are not defined.) It’s more complex than `succ`

but studying it is extremely rewarding, because it leads to understanding how data structures can be represented in the \(\lambda\)-calculus. The basic idea is that we are going to but the value `x`

in a box and replace `f`

with a different function, which I’ll call `skip_first`

. The first time `skip_first`

is called, it sees that the box has not been opened, so it opens that. After that, it sees that the box is already open, so it takes the value out of the box, applies `f`

to it once, and puts it back in the box. It does this \(n\) times. At the end, it takes the value of the box. The ultimate result is that `f`

is applied to `x`

\(n-1\) times, because nothing happened the first time. In this analogy, the initial closed “box” is `lambda u: x`

, the new box that is created after each step is `lambda h: h(g(f))`

, and the `lambda u: u`

at the end is the final act of taking the value out of the box.

`pred`

is tricky understand, especially in this elementary form. The Wikipedia article also has a pretty good explanation too. A good exercise to manually work out the \(\beta\)-reduction of `pred(two)`

to get a feel for it. If it still gives you trouble, I suggest you leave it aside and study the rest of theory until you learn how to encode the “pair” data structure. Then `pred`

much may be defined in a much more natural way. just to implement the Fibonacci function so decided to take the straight path through the mud. Nevertheless, there is a switchback trail with a very gentle slope right over there.

Defining `pred`

was the hard part. The definition of `minus`

in terms of `pred`

is much easier: `n`

applies the function `pred`

to `m`

\(n\) times, so we subtract one \(n\) times, which is the same as subtracting `n`

from `m`

. Easy, yes?

Now that we have a reasonable set of mathematical operations, let’s do some practical examples. We can do basic operations like \(2+2\) or \(6 \times 7\):

```
plus(two)(two)
mult(six)(seven)
```

The problem with these is that what they return is a Church numeral, which is a Python Callable. All I see on my screen when I run the above snippets is opaque and ambiguous output like this:

`<function __main__.<lambda>.<locals>.<lambda>.<locals>.<lambda>(x)>`

To translate this back into something we can understand, we need to write another bridge function. But how can we “unpack” a Church number? Recall that the Church encoding of \(n\) is a function taking two arguments, a function \(f\) and a value \(x\), and it applies \(f\) to \(x\) \(n\) times: \(f^n(x)\). In our Python environment, this works even if \(x\) and \(f\) are not written in the lambda calculus. Therefore we can follow the spy pattern often used in unit testing and pass in an function which will report how often it was called.

```
def church_to_int(n: Callable) -> int:
return n(lambda x: x+1)(0)
```

Going the other way requires no such tricks. If I want to encode the number 42 in the \(\lambda\)-calculus, I can use the first ten digits (defined above) and our mathematical operations to build up the Church number:

`plus(mult(four)(ten))(two)`

We can use the same strategy for any number. All we really need to do is parse the base 10 representation of a number and build up the Church number in stages:

```
def int_to_church(n: int) -> Callable:
church_number = church_digits[ int(str(n)[0]) ]
for digit in str(n)[1:]:
church_digit = church_digits[int(digit)]
church_number = plus(mult(church_number)(ten))(church_digit)
return church_number
```

We can now perform non-trivial calculations entirely in the \(\lambda\)-calculus:

```
> print("2 + 2 =", church_to_int(plus(two)(two)))
4
> print("6 * 7 =", church_to_int(mult(six)(seven)))
42
```

Here is a much larger (and slower) example:

```
> a = int_to_church(1001)
> b = int_to_church(999)
> ab = mult(a)(b)
> print("1001 * 999 =", church_to_int(ab))
999999
```

Now, finally, we can implement our sums-of-squares method:

```
> a = int_to_church(3)
> b = int_to_church(4)
> ss = plus(exp(a)(two))(exp(b)(two))
> print("3**2 + 4**2 =", church_to_int(ss))
25
```

This isn’t all of number theory of course, but its enough to implement our little Fibonacci function!

## Predicates Involving Numbers

The first and most basic predict test we need is a check for zero. This will form the foundation of all the other predicates:

`is_zero = lambda n: n(lambda x: false)(true)`

This works because if `lambda x: false`

is called even once, the result will be `false`

, and this can only be avoided if the function is never called, in which case the original value `true`

will be returned. But the only Church numeral which *never* calls its function argument is `zero`

, so the above function returns `true`

only for `zero`

, and `false`

for every other number.

By the way, a function which returns a value of type Church Boolean is the *definition* of a “predicate” in this context. The word carries no logical or semantic content here.

The fact that `pred`

stops at zero (i.e., `pred(zero) == zero`

) implies that `minus(x)(y) == zero`

if and only if `y`

is bigger than or equal to `x`

. We can use this fact to define various comparison tests:

```
leq = lambda m: lambda n: is_zero(minus(m)(n))
less_than = lambda m: lambda n: leq(succ(m))(n)
eq = lambda m: lambda n: AND(leq(m)(n))(leq(n)(m))
```

These functions are interesting because while the expect their arguments `n`

and `m`

to be Church numbers, their return value is a Church Boolean. The main thing we wanted was `less_than`

, which we will need for our Fibonacci function.

## Recursion

Rather than jumping straight into implementing recursion in the \(\lambda\)-calculus, let’s take it slow and develop the idea in stages. Let’s start with vanilla Python recursion:

```
def factorial(n):
if n <= 1:
return 1
else:
return n * factorial(n-1)
```

This only works because by the time the interpreter reaches the statement `factorial(n-1)`

the global symbol table already contains an entry for `factorial`

, so Python happily resolves `factorial`

to that function and calls it. In other words, it works because Python is doing all the heavy lifting for us!

In the \(\lambda\)-calculus, there is no global symbol table. Even if there were, lambda functions are all anonymous: they don’t have names, so what would you even query the symbol table for? The workaround is to pass the function into itself as an argument. This is totally legal; `x(x)`

is a perfectly cromulent expression in Pythonic \(\lambda\)-calculus. Continuing with our factorial example a little further, we have:

```
def _factorial(f, n):
if n <= 1:
return 1
else:
return n * f(f, n-1)
def factorial(n):
return _factorial(_factorial, n)
```

Neat. But it’s a little ugly that we have to make the recursive call as `f(f, n-1)`

and explicitly pass `f`

back into itself. Why not make `f`

a closure which Currys that first argument for us?

```
def _factorial(f, n):
if n <= 1:
return 1
else:
return n * f(n-1)
def factorial(n):
f = lambda m: _factorial(f, m)
return f(n)
```

We can make this more generic and reduce reliance on the global namespace by passing `_factorial`

in as an argument:

```
def call_recursively(f, n):
g = lambda m: f(g, m)
return g(n)
def _factorial(f, n):
if n <= 1:
return 1
else:
return n * f(n-1)
def factorial(n):
return call_recursively(_factorial, n)
```

Finally, the `call_recursively`

function can be abstracted entirely as a Python decorator. (If you’re not familiar with the decorator syntax, `@decorator/f`

is simply syntactic sugar for `f = decorator(f)`

and is a convenient way to apply a functor to a function.)

```
def recursive(f):
def recursive(n):
g = lambda m: f(g, m)
return g(n)
return recursive
@recursive
def factorial(f, n):
if n <= 1:
return 1
else:
return n * f(n-1)
```

So, in vanilla Python, we’ve implemented a reusable utility which enables us to conveniently do recursion while avoiding any reference to the global symbol table. As we make the jump to the final form purely in Pythonic \(\lambda\)-calculus, we will rename `recursive`

to `Y`

- it is indeed the famous Y-combinator, the higher-order function which makes recursion (and therefore also iteration) possible in the \(\lambda\)-calculus. As for *why* it is called `Y`

, I have no idea - it’s just the standard symbol.

`Y = lambda f: (lambda x: x(x))(lambda y: f(lambda z: y(y)(z)))`

We can also apply exactly one application of the \(\beta\)-reduction rule to bring this into an equivalent symmetrical form:

`Y = lambda f: (lambda y: f(lambda z: y(y)(z)))(lambda y: f(lambda z: y(y)(z)))`

While the symmetrical form is more often seen, I prefer the first version because I think it more clearly expresses the idea of currying a function with itself. However, both do exactly the same thing. Furthermore, neither have any free variables and so meet our requirements for a proper definition.

In an ideal world we could now define the recursive factorial function entirely in Pythonic \(\lambda\)-calculus like so:

```
factorial = Y(lambda f:
lambda n:
((leq)(n)(one)
(one)
(mult(n)(f(pred(n))))))
```

However, there is a problem: when we run the above function we get a stack overflow error. Why this so? Is the algorithm wrong? No: if you executed this lambda expression with true \(\beta\)-reduction, it would work fine. The problem is that our Church Boolean pseudo-`if-else`

statement is not quite a proper `if-else`

! In a language like C or Python, the code inside the selected branch will be executed, but code in the other branch will *not even be run.* However, in the Pythonic \(\lambda\)-calculus, if we write:

```
((some-predicate)
(if-branch)
(else-branch))
```

Then *both* the `if-brach`

and the `else-brach`

will need to be evaluated completely *before* `some-predicate`

can be called, regardless of the value of `some-predicate`

!

This is called “eager” evaluation and Python *always* eagerly evaluates *all* arguments of a function call *before* performing the function call. Therefore in Python, we will always compute *both* branches, and only at the very end will we discard one of the values. Normally, this wouldn’t cause serious problems because the answer would always be the same (it would just be a little slower as it takes time to the work which gets thrown away.) It becomes a serious problem in the case of recursion because the `else`

branch is *always* evaluated, which calls the function again, which calls the function again, and so for forever.

One response would be to give up on Python and go abuse some other language which has lazy evaluation. (It’s not a coincidence that functional languages like Haskell generally support lazy evaluation!) Alternatively, we could write an interpreter for the \(\lambda\)-calculus which implements \(\beta\)-reduction rule, side-stepping Python entirely.

These are good options, and for someone writing a very complete implementation of the \(\lambda\)-calculus they would be necessities. But today we’re only concerned to write one function, the Fibonacci function, so we can use a much simpler hack to prevent infinite recursion, which I will abbreviate `PIR`

:

```
# cheap hack because true and false don't short circuit.
def PIR(f):
def wrapper_function(n):
if church_to_bool(is_zero(n)):
return zero
else:
return f(n)
return wrapper_function
```

With our little hack in place, we can now implement a *working* version of the factorial function which does *not* cause a stack overflow:

```
factorial = Y(lambda f: PIR(
lambda n:
((leq)(n)(one)
(one)
(mult(n)(f(pred(n)))))))
```

Long stacks of parentheses appear to be an operational hazard when working with \(\lambda\)-calculus inspired languages.

In any case, the above function is a little inconvenient to call because it requires a Church numeral as input and returns an opaque Church numeral. Let’s wrap it in bridge code:

```
def slow_factorial(n):
n = int_to_church(n)
fac = factorial(n)
return church_to_int(fac)
```

This little function correctly computes factorials:

```
for n in range(1, 10):
print(n, slow_factorial(n))
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10 3628800
```

What about our main design objective? I am pleased to report that the above function is indeed*extremely* slow: it takes over a minute to calculate \(10!\) on a fairly new laptop. That works out to *36 million times slower* than the vanilla Python implementation.

We can profile the code to figure out why. Here, I’ve edited the profiler output from `%prun slow_factorial(9)`

to give names to the most common calls; in the original the function name was always just `(<lambda>)`

distinguished only by line number.

```
ncalls tottime percall cumtime percall function
1564014 0.849 0.000 0.849 0.000 succ
362889 0.354 0.000 0.542 0.000 plus
362880 0.188 0.000 0.188 0.000 church_to_int
260669 0.152 0.000 0.152 0.000 zero
79211 0.052 0.000 0.052 0.000 pred
9/1 0.000 0.000 0.003 0.003 factorial
```

So, we actually spend almost all of our time simply incrementing numbers by one. Church numbers are easy to define and work with, but they are gloriously inefficient, especially as the numbers grow. A more efficient encoding could be defined by using the Boolean algebra we developed earlier to define operations on binary strings, but that is not what we are about today.

## Final Fibonacci

Our shopping list is complete: we now have all the necessary tools to proceed to the endgame. All that remains is to implement the Fibonacci algorithm:

```
fibonacci = Y(lambda f: PIR(
lambda n:
less_than(n)(two)
(n)
(plus
(f(minus(n)(one)))
(f(minus(n)(two))))))
```

The `less_then(n)(two)`

is a predicate that resolves to a Church Boolean. This Boolean is then used as an `if/else`

statement returning `n`

for the `if`

branch or the recurance relation for the Fibonacci for the `else`

branch. The `else`

branch is simply the sum of \(F_{n-1}\) and \(F_{n-2}\). The `Y`

-combinator ensures that `f`

is in fact the same `fibonacci`

function so we can call `f(n-1)`

and `f(n-2)`

to calculuate \(F_{n-1}\) and \(F_{n-2}\).

Because we make two separate recursive calls and don’t do any caching, the number of calls to `fibonacci()`

will grow roughly as \(\mathcal{O}(2^n)\). This is of course a terrible algorithm; it’s the same one I called `naive_fib()`

in my earlier article on optimizing the Fibonacci function. However, this is entirely in keeping with our goal of writing the slowest possible version!

As before, we’ll wrap this in bridge code to handle the translation between native Python integers and Church numerals:

```
def slow_fibonacci(n: int) -> int:
n = int_to_church(n)
fib = fibonacci(n)
return church_to_int(fib)
```

We can test that it is correct by exhibiting the first 20 Fibonacci numbers:

```
for n in range(21):
print(n, slow_fibonacci(n))
0 0
1 1
2 1
3 2
4 3
5 5
6 8
7 13
8 21
9 34
10 55
11 89
12 144
13 233
14 377
15 610
16 987
17 1597
18 2584
19 4181
20 6765
```

## How Slow is Slow?

As expected, this is rather slow, over 6 seconds to calculate \(F_20\):

```
%time slow_fibonacci(20)
CPU times: user 6.59 s, sys: 20 ms, total: 6.61 s
Wall time: 6.6 s
```

This is one thousand times as slow as the same naive algorithm in vanilla Python, and about a million times slower than a good algorithm. Note however that this is still faster than a human could work it out on paper! As recently as 80 years ago this would have been state-of-the-art.

A straight line on a log-scale plot shows that the algorithm scales as \(\mathcal{O}(2^n)\).

The profiler shows where we were spending our time:

```
1922492 function calls (1833945 primitive calls) in 1.858 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall function
1133885 0.513 0.000 0.513 0.000 pred
17710/1 0.199 0.000 31.615 31.615 fibonacci
95316 0.173 0.000 3.507 0.000 two
59896 0.159 0.000 3.636 0.000 mult
53131 0.118 0.000 30.497 0.001 is_zero
53130 0.110 0.000 0.280 0.000 minus
35421/1 0.092 0.000 31.615 31.615 PIR-wrapper
35420/2 0.083 0.000 31.615 15.807 Y
119792 0.074 0.000 0.074 0.000 three
35421 0.072 0.000 0.114 0.000 church_to_bool
17710 0.057 0.000 10.598 0.001 less_than
35420 0.052 0.000 0.076 0.000 fibonacci-wrapper
64077 0.041 0.000 0.041 0.000 zero
35420 0.024 0.000 0.024 0.000 PIR
17710 0.022 0.000 0.033 0.000 one
28655 0.014 0.000 0.014 0.000 false
24476 0.012 0.000 0.012 0.000 true
17710 0.012 0.000 0.012 0.000 leq
17711 0.012 0.000 0.012 0.000 plus
17710 0.012 0.000 0.012 0.000 succ
6765 0.006 0.000 0.006 0.000 church_to_int
```

Unlike `slow_factorial()`

which deals with numbers that blow up very quickly, `slow_factorial()`

deals with relatively smaller numbers which means that we spent less time simply iterating through `succ`

and more time doing interesting things. Nevertheless, it spends a *lot* of time doing simple subtractions - this is one of the weak points of the Church numerals.

## Slower Than Slow

How could we make this even *slower*? Again, in a fair way, not just sprinkling no-ops and sleep statements throughout.

One interesting approach would be to implement what is sometimes called a meta-circular evaluator: an interpreter for the \(\lambda\)-calculus written entirely within the \(\lambda\)-calculus itself. We could then stack interpreters indefinitely, with each layer costing us another factor of a thousand. It would be reminiscent of this art project where a long gear chain is used to build a machine which will take 13.7 billion years for the final gear to complete one rotation:

The machine has a electric motor happily whirring away at one end (click the image for a video showing it action) and a solid block of concrete at the other. Normally that would be a recipe for disaster but because each gear steps the revolution speed down by a factor of 10, and because so many gears are chained together, the motion is infinitesimal by the end.

We’re not actually going to do that, of course. This article is already way too long. But we *could.*

## Macro Expansion

Perhaps you don’t believe that this is really a \(\lambda\)-calculus program; after all, it has all those “definitions” which look suspiciously like named functions and in fact are being treated as named functions by the Python interpreter! Very suspicious.

We can see this is not the case by doing a search for each definition’s name and replacing it with its body, and repeating until no named definitions are left. Below is the step-by-step expansion process, with one version per line; the last line contains no definitions (expect our `PIR`

hack) and instead is composed entirely of `lambda`

functions, function calls, and bound variable names.

```
fibonacci = Y(lambda f: PIR(lambda n: (less_than(m)(two))(n)(two)(n)(plus(f(minus(n)(one)))(f(minus(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: leq(succ(m))(n))(n)(two)(n)(plus(f(minus(n)(one)))(f(minus(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: is_zero(minus(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f(minus(n)(one)))(f(minus(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: false)(true))(minus(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f(minus(n)(one)))(f(minus(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))(minus(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f(minus(n)(one)))(f(minus(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n(pred)(m))(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f((lambda m: lambda n: n(pred)(m))(n)(one)))(f((lambda m: lambda n: n(pred)(m))(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n(pred)(m))(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f((lambda m: lambda n: n(pred)(m))(n)(one)))(f((lambda m: lambda n: n(pred)(m))(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(m)(n)))(succ(m))(n))(n)(two)(n)(plus(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(one)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(m)(n)))(succ(m))(n))(n)(two)(n)((lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(one)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(two))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(m)(n)))(succ(m))(n))(n)(succ(one))(n)((lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(lambda f: lambda x: f(x))))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(succ(one)))))))
fibonacci = Y(lambda f: PIR(lambda n: (lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(m)(n)))((lambda n: lambda f: lambda x: f(n(f)(x)))(m))(n))(n)((lambda n: lambda f: lambda x: f(n(f)(x)))(lambda f: lambda x: f(x)))(n)((lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(lambda f: lambda x: f(x))))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)((lambda n: lambda f: lambda x: f(n(f)(x)))(lambda f: lambda x: f(x))))))))
fibonacci = (lambda f: (lambda x: x(x))(lambda y: f(lambda z: y(y)(z))))(lambda f: PIR(lambda n:(lambda m: lambda n: (lambda m: lambda n: (lambda n: n(lambda x: (lambda x: lambda y: y))(lambda x: lambda y: x))((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(m)(n)))((lambda n: lambda f: lambda x: f(n(f)(x)))(m))(n))(n)((lambda n: lambda f: lambda x: f(n(f)(x)))(lambda f: lambda x: f(x)))(n)((lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x)))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)(lambda f: lambda x: f(x))))(f((lambda m: lambda n: n((lambda n: lambda f: lambda x: n(lambda g: lambda h: h(g(f)))(lambda u: x)(lambda u: u)))(m))(n)((lambda n: lambda f: lambda x: f(n(f)(x)))(lambda f: lambda x: f(x))))))))
```

The ultimate expanded version of this function works just as well and runs just as fast. Although it does nothing but create closures and call functions, somehow in the end it carries out the computation of a mathematical function. True, it is a little hard to read in this form, but so is machine code.

Here is a visualization of the full Fibonacci program (click for a larger image.) This shows every node (either a variable, function call, or lambda abstraction) as a binary tree.

It reminds me a lot of this XKCD comic strip about LISP where `cons`

is taken as atomic. Of course, the pair data structure can easily be defined in the \(\lambda\)-calculus as well, making lambda abstractions even more than fundamental `cons`

. The essential insight remains the same: all programs and data structures can be reduced to binary trees and all information about the program is somehow contained in the very structure of the tree itself.

## Conclusion

All models of computation are equal, but some are more equal than others. In theory, the \(\lambda\)-calculus is only a constant factor away from any other model of computation, but the same is not true for the Church encoding: Church numerals are useful because they are very easy to “bootstrap;” that is to say, to implement in terms of lower level primitives, while on the other hand it is very difficult to implement more efficient numbers without data structures and a ready supply of distinct symbols, which the Church numerals provide.

It took us only a few dozen definitions to go from something so spartan that it seemed to be missing every convenience of modern programming, to a useful language with recursion, for loops, if/else statements, and recursive function calls. Things like closures, Currying, functors/decorators, which are considered advanced features in other languages, we somehow got for free.

If we had carried on defining signed numbers, pairs, lists, associative maps, and so on, this parsimony would continue and after a few hundred definitions - rather smaller than the standard library of most languages - we would have a perfectly functional language. (Pun *absolutely* intended.)

Some languages like Forth and io in fact take this exact approach: the language is absolutely minimal and almost everything is moved to the standard library, including the definition of standard constructs like `for`

and `if/else`

.

If computation is so simple, why are modern languages so complicated? It mostly boils down to the impedance mismatch between the \(\lambda\)-calculus and the kinds of electronic computers we can actually build. We can’t actually make Church numerals fast, but we *can* make a machine which can perform mathematical operations on two 64 bit numbers in a single cycle. We can’t write a performant \(\beta\)-reducer, but we *can* write a block of machine code which calls functions by using a calling convention: pops a certain number arguments off the stack when called and returns a value on the stack or in a particular register. And so on.

For reasons that aren’t entirely clear to me, the \(\lambda\)-calculus has been a good model for thinking about high-level, expressive ways of computing, while the Turing machine has been a good model for designing actual computing machines. While in theory we can always switch from one model of computation to another, in practice this can involve a three or four order of magnitude reduction is performance. Practical programming languages have to walk the line between exposing the real capabilities of the machine while also providing useful high level abstractions.

Which isn’t to say there aren’t practical languages very much inspired by and very close to the \(\lambda\)-calculus. LISP and Haskell come to mind. After watching some of the SICP lectures on Youtube, I’m convinced the only reason Harold Abelson didn’t just teach the whole course in pure \(\lambda\)-calculus is because computers back then were still slow enough that it would have been just a little too painful, and LISP was chosen as a compromise. (Unfortunately, as we’ve seen today, this is still true. But maybe someday…) Many JavaScript programs too, particularly those that create lots of closures to handle asynchronous callbacks, seem to me to be much closer in spirit to the \(\lambda\)-calculus (as opposed to something like C, which uses a pointer machine model which is only slightly higher level than a Turing machine.) It’s never been more important to expose CS students to the idea and modes of thinking inspired by the \(\lambda\)-calculus and I think it’s importance will only grow as we can better afford the cost of abstractions.

As always, all the source code for this article is free and open source.

## Postscript

Reading the “macro expanded version” of the function out loud, the similarity between “lambda” and “llama” made me think of the children’s book “Llama llama red pajama, a poem about a baby llama and his mama. I translated the above function declaration into a made up agglutinative language to produce this nonsense bedtime rhyme… perhaps the only one in the world which is also a \(\lambda\)-calculus program.

Hey llama baby hey llama mama hey

Ma sleep sleep hey llama ba baby

Hey llama ta ba hey ba sleep hey

Ta sleep sleep sleep sleep hey

Llama baby llama na hey llama ga

Llama na hey llama ga llama na

Hey llama na na hey llama ma hey

Llama ma llama ba ba sleep sleep

Hey llama ma llama ba ma sleep

Sleep hey hey llama ga llama na

Na hey hey llama na llama baby

Llama ma na hey llama red pajama

Llama haha hey red pajama hey

Baby sleep sleep sleep hey llama

Drama sleep hey llama dra dra

Sleep sleep sleep hey ga sleep

Sleep hey ga sleep hey na sleep

Sleep sleep hey hey llama na

Llama baby llama ma baby hey na

Hey baby sleep hey ma sleep sleep

Sleep hey ga sleep sleep hey na

Sleep sleep hey na sleep hey hey

Llama na llama baby llama ma baby

Hey na hey baby sleep hey ma

Sleep sleep sleep hey llama baby

Llama ma baby hey ma sleep sleep

Sleep hey na sleep hey hey llama

Ga llama na llama baby llama ma

Ga hey baby sleep hey na hey baby

Sleep hey ma sleep sleep sleep

Hey baby hey hey llama ga llama

Na na hey hey llama na llama baby

Llama ma na hey llama red pajama

Llama haha hey red pajama hey

Baby sleep sleep sleep hey llama

Drama sleep hey llama dra dra

Sleep sleep sleep hey ga sleep

Sleep hey na sleep hey llama baby

Llama ma baby hey ma sleep sleep

Sleep sleep hey baby hey hey

Llama ga llama na na hey hey

Llama na llama baby llama ma na

Hey llama red pajama llama haha

Hey red pajama hey baby sleep

Sleep sleep hey llama drama sleep

Hey llama dra dra sleep sleep

Sleep hey ga sleep sleep hey na

Sleep hey hey llama na llama baby

Llama ma baby hey na hey baby

Sleep hey ma sleep sleep sleep

Hey llama baby llama ma baby hey ma

Sleep sleep sleep sleep sleep sleep sleep