(INPROGRESS) Comments on "A Genealogy of Functional Programming Languages"
Last-Edit: 2026-01-21 07:32:41
This page accompanies the slides I used at the Dutch Functional Programming Day 2026,
where I presented a brief overview of the history of functional programming.
It extends the slides with a few written comments and links that might
assist in reading them.
The high-level goal of my presentation is to give a critical
perspective on the understanding of functional programming by going
through the history of concepts associated with functional programming
languages.
The narrative presented here is based on my own research into the
available historical records, as well as private exchanges. If you
have any doubts about anything presented here,
please complain!
Conventional Summary of the History of Functional Programming
-
Summaries of and introductions to functional programming commonly
present a brief sketch of the history. Usually, it
proceeds in a few steps: λ-Calculus, Lisp, ML, Haskell.
-
Optionally, the list may be extended
by the mainstream arrival of features known from functional languages
(anonymous functions, higher-order functions, immutable data
structures, pattern matching, etc.) in object-oriented and imperative
programming languages. Prominent examples include Java 8's Stream
library, C++11's λ syntax, or pattern matching in Rust and Python.
-
As is to be expected, this presentation is an oversimplification:
It doesn't mention a number of important contributions, and it also
glosses over the development of concepts which in retrospect appear
to be obvious, but that actually constitute a real improvement in the
understanding of programming as a problem.
Background in Formal Systems
-
The history usually starts here, where the λ-Calculus is contrasted
with Turing Machines. The implication is that the dichotomy between
imperative and functional languages precedes even the first automatic
computers, and as such represents a fundamental divide in the approach
to programming (an exaggeration).
-
Yet at the same time we should remind ourselves that formal systems
(Turing machines, the λ-Calculus, Post systems, etc.) are not
programming languages and do not deal with the specific issues that
arose in the field of programming language implementation in the
1950s. These include questions of typesetting, character sets,
finite memory, bugs in interpreters or compilers, issues of
portability, dealing with dependencies, and issues stemming from the
historical development of a language: legacy code has to be dealt with,
and attempts to improve a language are constrained by the existence of
real-world code that would have to be adapted to any changes. In essence,
there is a significant difference in the scale of application between
formal systems and programming systems, which implies a different set of
issues that each has to deal with.
-
Nevertheless, the formal systems mentioned here ((un)typed
λ-Calculus and the system of combinators) have had
an influence on the design of programming languages.
Specifically, the question of whether or not to
bind variables had already been posed by 1936.
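To illustrate the contrast (an anachronistic sketch using OCaml rather than the formal systems themselves; the function names are mine), the same function can be written with explicitly bound variables, or in a variable-free style in which only combining forms appear, which is the essence of the combinator approach:
(* The same function, once with explicitly bound variables ... *)
let twice f x = f (f x)
(* ... and once in a variable-free ("point-free") style, where the
   argument x never appears and only a combining form, composition,
   is used as glue. *)
let compose f g x = f (g x)
let twice' f = compose f f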
A Sketch of the History of Programming Languages
-
The history of functional programming languages is situated within the
history of programming languages in general. I elide the details of
this history here, and only wish to highlight the guiding
questions of programming language design and research since the
1950s.
-
My framing starts with the 1950s, when the means to implement a
language, but also the understanding of what a (good) programming
language requires, were still lacking. These developments were
documented by Knuth et al. Examples include FORTRAN, LISP 1 and ALGOL 58/60.
-
The 1960s stand in the shadow of ALGOL 60 ("[…] improvement on
its predecessors, but also on nearly all its successors") and
much creativity goes into thinking about iterations on the premises
and limitations that ALGOL sets up. Examples include ALGOL 68, ALGOL
W and then Pascal, PL/1, and CPL (which begets BCPL (which begets B (which
eventually begets C (…)))). At the same time these languages still have
to deal with the diversity of computing systems and approaches
to HCI, and are
therefore frequently non-portable or limited to specific sites.
-
The 1970s exhibit an interest in actual
portability (recall that ALGOL 60 did not have a
prescribed physical appearance, significantly meaning that character sets or
how to designate keywords were not fixed). As a consequence,
languages with small cores and portable, self-hosted
standard libraries (if any), mostly written in the languages themselves,
become interesting. Examples here include C, Forth, Smalltalk and
Scheme.
-
Towards the 1980s more thought was being placed on the tools that a
language can provide to make itself useful: Sophisticated debugging
tools, performance tracing and comprehensive standard libraries are intended to
improve the productivity of programmers by providing them with more
power and expressivity. Think of C++, Common Lisp and Ada. Parallel
trends include iterations on BASIC-like languages that were popular
among microcomputer users, and the research on functional languages
mentioned later on (descendants of ML and KRC).
-
From the 1990s onward, when an increasing number of professional
(but not academically trained) computer programmers
coincides with an increasing demand for computer systems (including
but certainly not limited to the rise of the WWW), two trends stand in
contention: The time-to-market-driven mentality has an interest
in interpreted languages that allow for fast prototyping of systems
(e.g. Perl, Ruby, Python), while at the same time the problem of
organizing large teams and ensuring collaboration creates interest in
languages that can prevent mistakes using tools previously intended to
increase the programmer's power (this is reflected in the general
interest in statically typed and analyzable languages that ensure
safety properties).
On Lisp
-
Lisp was initially conceived as another formalism by McCarthy, and then
later implemented on the IBM 704
by Russell, after he noticed that the universal function
eval could serve as an interpreter.
-
While it is true that Lisp was inspired by a mathematical theory of
computation (Kleene's theory of recursive functions), as opposed to the
conventional approach of the time of building on the instruction
language of the computer architecture, the approach taken by McCarthy
is distinct both from later functional programming languages and from the
λ-Calculus prior to it. In fact, McCarthy reflects:
To use functions as arguments, one needs a notation for functions, and
it seemed natural to use the λ-notation of Church (1941). I
didn't understand the rest of his book, so I wasn't tempted
to try to implement his more general mechanism for defining
functions. Church used higher order functionals instead of using
conditional expressions. Conditional expressions are much more readily
implemented on computers.
-
While the essence of LISP is based on symbols and cons-cells (compared
to the λ-Calculus, where functional abstractions and bound
variables serve as the ontological foundation), the reality of
LISP as a programming language was that it was intended to be
much closer to the conventions of the time, for instance FORTRAN and
later ALGOL: Despite support for recursion, which was used in the
demonstration of the eval-apply
meta-interpreter (Alan Kay refers to this as the Maxwell equations
of computer science, where the analogy is that the interaction
between the two functions rhymes with how electric and magnetic fields
are intertwined), in practice imperative programming
using progn, prog
and go (GOTO) was heavily used.
-
Furthermore, the variable binding policy in LISP is unlike that
familiar to us from the λ-Calculus. The latter uses
lexical binding, which ensures that a variable is bound analogously to the
program structure, while dynamic (or special) binding only
tracks a stack of variable bindings as they occur during evaluation.
Consider the following example:
(defun bar (x) y)   ; y is not bound within BAR
(defun foo (y) (bar nil))
When invoking the function foo, which in turn
invokes bar, returning the value of the
variable y turns out not to be an issue, since the
variable was bound in foo as a parameter and that binding
is still visible during the call. So LISP did not implement closures.
-
It was not until 1975 that Sussman and Steele proposed Scheme,
a dialect of Lisp that was more explicitly modeled on Church's
λ-Calculus, meaning that lambda evaluates to a
closure and variables are lexically scoped. The context was that
whilst working on Hewitt's recently
proposed Actor model for concurrent computation, they realized that the
constructors for actors (α) and functions (λ) were
equivalent. In later revisions of the Scheme report the concurrency
discussions were entirely dropped.
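As a minimal sketch of what "lambda evaluates to a closure" means (using OCaml, a descendant of the ML discussed below, rather than Scheme itself; the names are illustrative): the returned function captures the lexical binding of its free variable, instead of consulting whatever binding happens to be on the call stack when it is eventually applied.
(* make_adder returns a closure: the binding of n is captured from the
   lexical environment in which the function expression was evaluated. *)
let make_adder n = fun x -> x + n
let add5 = make_adder 5
(* Even though make_adder has long returned and no n is in scope at the
   call site, add5 3 still evaluates to 8. *)
let () = print_int (add5 3)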
-
Emphasis on symbolic computation allowed LISP to excel in the
meta-programming space, with the introduction of macros in 1963.
This is related to the Language ≅ Encoding point: The encoding of a program for the
eval-apply meta-interpreter (universal
function) is structurally equivalent to the function itself. Note
that this is not the case anymore once Scheme and subsequent
implementations introduce closures, as this breaks intensional
equality of functions. Nevertheless, regardless of the default
variable binding approach, the ease of dealing with and manipulating
the syntax of the language within/using the language
turned out to be of great convenience when extending the language with
constructs that it previously did not support. LISP therefore was
able to adapt to new ideas, from structured programming
(the loop macro is a prominent example of a macro that takes inspiration from
the ALGOL 68-like DO construct) to object-oriented programming (CLOS).
If You See What I Mean
-
In a series of papers, among other things dealing with the relation
of ALGOL 60 to the λ-Calculus, Peter Landin developed and finally
published his proposal for a family of programming languages
named ISWIM.
-
Syntactically, ISWIM is derived from ALGOL but drops most keywords
such as BEGIN and END, using indentation
for scoping instead (Landin calls this the offside rule). It should be
noted that ISWIM is strictly speaking a family of languages that do not make
strong commitments to the actual presentation, similar to ALGOL 60.
The resulting languages appear strikingly similar to most of the later
programming languages in the functional tradition.
-
A particularly influential syntactic innovation is
the where notation, and the let
notation derived from it as syntax sugar. Without going into the
λ-Calculus, Landin gives a definition of where in
terms of substitution, interestingly extended with a weak form of
pattern matching that can destructure a list of bindings at once.
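As an illustration (a sketch in OCaml's let syntax rather than ISWIM's, with hypothetical names), both notations amount to introducing a local definition, optionally destructuring several bindings at once:
(* ISWIM-style  "f(x) + f(y) where f(z) = z*z + 1"
   corresponds to a local definition introduced with let: *)
let example x y =
  let f z = z * z + 1 in
  f x + f y
(* Landin's weak form of pattern matching over a list of bindings
   survives as destructuring a tuple within a single binding: *)
let example2 point =
  let (a, b) = point in
  a + b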
Pattern Matching
-
Structural pattern matching (SPM) as a control-flow construct in the form known to us
goes back to the research of Rodney Burstall. This constitutes a
significant departure from ALGOL's control structure: Instead of
controlling forking execution using IF and iterating
execution using WHILE (or equivalent), the programmer
generalizes the former using CASE
expressions on variables of algebraic data types (in fact, ALGOL's
IF simply becomes syntax sugar for a CASE expression
if we require that the condition be of a boolean type) and makes iteration
explicit using recursive invocations, as sketched below. Sequencing is not necessarily
affected by this perspective-shift. Imperative languages usually have
it as a primitive construct, while functional languages can
replicate sequencing using λ-expressions, depending on the
evaluation strategy.
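A short sketch of this perspective shift, written in OCaml (which inherits it via ML and HOPE; the definitions are illustrative, not historical): branching becomes a case analysis over the constructors of an algebraic data type, and iteration becomes explicit recursion.
(* A list as an algebraic data type with two constructors. *)
type 'a seq = Nil | Cons of 'a * 'a seq
(* Iteration is made explicit through recursion; the CASE analysis is
   exhaustive over the constructors of the type. *)
let rec length xs =
  match xs with
  | Nil -> 0
  | Cons (_, rest) -> 1 + length rest
(* IF as syntax sugar for a CASE analysis over the two constructors of
   bool (ignoring evaluation order, which a strict language would have
   to handle by delaying the branches). *)
let if_ condition then_branch else_branch =
  match condition with
  | true -> then_branch
  | false -> else_branch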
-
SPM was first used in this manner in the POP
family of ALGOL-like languages with functional aspects. The
motivation for introducing SPM was to aid reasoning about the
correctness of programs. These developments later led to the
language NPL and then HOPE, named after a Scottish agricultural reformer.
-
Burstall would later participate in the effort to standardize Standard
ML, which would extend Milner's ML with the case
expression we now associate with the language family.
-
The more general notion of pattern matching had already existed prior
to Burstall's work. The closest example appears to have been
Valentin Turchin's Refal
language, developed in the Soviet Union in 1968. Details on the
pattern matching behavior can be found in
the manual. Earlier still, at Bell Laboratories, the SNOBOL
language (see the so-called Green
Book) uses a form of pattern matching on input data that also
influences the control flow of the language, yet in a much more
imperative way than the previously discussed languages. Later on,
Colmerauer's Prolog (1972), which descended from
Hewitt's (Micro-)Planner
system, would also use pattern matching by means of unification to
control evaluation, yet contrary to the approach taken by
Burstall, Prolog's pattern matching has an open-world
approach and does not try to promise exhaustiveness of the case
analysis.
The Meta Language
-
Robin Milner, while at Stanford, worked on an implementation of Dana
Scott's Logic of Computable Functions (LCF)
named Logic for Computable Functions (also LCF), an early automated theorem
prover based on domain theory. An early form of ML was used as the
language to express proofs in this system. A summary of how this
proto-ML (also referred to as LCF/ML) differed from the
contemporary conception of an ML-ish language is that ML was a typed
Lisp, with the core contribution being that Milner implemented type
inference, which had previously already been independently described
by Hindley in 1969. Notably, ML at this
stage lacked pattern matching. Sum types had
projections, as though they were products, and discriminators
that would be used to determine which side of the sum was
inhabited. Further details on the history of LCF, which laid the
groundwork for later systems such as HOL, can be
found here.
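To make the difference concrete, here is a hedged sketch in modern OCaml (not actual LCF/ML syntax or vocabulary) of what programming against a sum type looks like with discriminators and projections, compared with the later case-analysis style:
type value = Num of int | Str of string
(* Discriminator: decide which side of the sum a value inhabits. *)
let is_num v = match v with Num _ -> true | Str _ -> false
(* Projections: partial functions that extract one side and fail on the
   other, analogous to projections out of a product. *)
let out_num v = match v with Num n -> n | Str _ -> failwith "not a Num"
let out_str v = match v with Str s -> s | Num _ -> failwith "not a Str"
(* LCF/ML-flavoured use: first discriminate, then project. *)
let describe v =
  if is_num v then string_of_int (out_num v) else out_str v
(* The style later adopted from HOPE: a single case analysis. *)
let describe' v =
  match v with
  | Num n -> string_of_int n
  | Str s -> s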
-
LCF was moved between institutions, first from Stanford to Edinburgh,
then to Cambridge where Gérard Huet was visiting. Along
with Lawrence Paulson they implemented a compiler for ML that would replace the
previous implementation, which had been written in a dialect of Lisp (on a
PDP-10 system), resulting in a significant speedup.
Huet would later go on to develop CAML. CAML would go on to have two
implementations, the first by Ascánder Suárez written in Lisp,
later re-implemented by Xavier Leroy and Damien Doligez as
Caml Light, which would in turn
become OCaml (1996). OCaml
would influence further languages such
as F# and be used to implement
the Rocq proof system.
-
Parallel to the French developments, Milner led an effort to harmonize
the diverging developments of ML at different sites, but also to integrate
ideas from HOPE (inductive data types and SPM as control flow).
Starting in 1982, multiple meetings and proposals would be made for
the language that would later be known as Standard ML.
For an authoritative history of ML and later Standard ML, please
consult The History of Standard ML, or Xavier Leroy's presentation
on 25 years of OCaml.
The Parallel Lineage of FP
-
John Backus, known primarily for his early work on FORTRAN and for
contributing the Backus-Naur Form notation for context-free
languages in the context of the ALGOL standardization effort, was
awarded the ACM Turing Award in 1977. In his acceptance lecture,
Can Programming Be Liberated from the von Neumann Style? A Functional
Style and Its Algebra of Programs, he introduced the language FP,
which represented the research of his group at IBM.
-
In contrast to the previously discussed examples of languages descending
from Lisp and ISWIM, and in turn having a more or less vague relation
to the λ-Calculus, FP was inspired
by Ken Iverson's APL.
This influence expressed itself in an emphasis on point-free
composition of functions, along with functional forms. These
include Insert (denoted by /, performing a fold
operation), ApplyToAll (denoted by α,
i.e. mapping) and Compose (denoted by ○).
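Roughly, and translated into OCaml rather than FP's own one-character notation (the names below are mine, not FP's), these forms correspond to folding, mapping and function composition:
(* Insert (/): combine a non-empty sequence with a binary function. *)
let insert f = function
  | [] -> failwith "Insert is undefined on the empty sequence"
  | x :: xs -> List.fold_left f x xs
(* ApplyToAll (α): apply a function to every element of a sequence. *)
let apply_to_all f xs = List.map f xs
(* Compose (○): ordinary function composition. *)
let compose f g x = f (g x)
(* A program in the point-free spirit of FP: the sum of the squares of
   a sequence of numbers; sum_of_squares [1; 2; 3] evaluates to 14. *)
let sum_of_squares = compose (insert ( + )) (apply_to_all (fun x -> x * x))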
-
A few comments on APL: Iverson
-
Backus would later publish FL
(1989), integrating influences from ML (types, exceptions, IO), but
both the languages and the paradigm failed to gain a foothold, despite
playing a role in nurturing a more general interest in functional
programming in the wider world of computer science and programming.
Contributions of David Turner
Towards Haskell
Further Languages Worthy of Mention
Closing Comments
Postscript: History vs. Genealogy
The title of this presentation intentionally uses the
term Genealogy
in two senses:
-
The literal interpretation of studying the development of genes, so to
speak, of the attributes that are commonly associated with functional
programming.
-
In reference to the philosophical
method commonly associated with Nietzsche and Foucault. The
intention here is to expose the conditions and coincidences that led
to the historical development of ideas that are conventionally
perceived to be timeless.
Philip Kaludercic