(INPROGRESS) Comments on "A Genealogy of Functional Programming Languages"
Last-Edit: 2026-01-21 07:32:41
This page accompanies the slides I used at the Dutch Functional Programming Day 2026,
where I presented a brief overview of the history of functional programming.
It extends the slides with a few written comments and links that might
assist in reading them.
The high-level goal of my presentation is to give a critical
perspective on the understanding of functional programming by going
through the history of concepts associated with functional programming
languages.
The narrative presented here is based on my own research into the
available historical records, as well as private exchanges. If you
have any doubts about anything presented here,
please complain!
Conventional Summary of the History of Functional Programming
-
Summaries of and introductions to functional programming commonly
present a brief sketch of the history. Usually, it
proceeds in a few steps: λ-Calculus, Lisp, ML, Haskell.
-
Optionally, the list may be extended
by the mainstream arrival of features known from functional languages
(anonymous functions, higher-order functions, immutable data
structures, pattern matching, etc.) in object-oriented and imperative
programming languages. Prominent examples include Java 8's Stream
library, C++11's λ syntax, or pattern matching in Rust and Python.
-
As is to be expected, this presentation is an oversimplification:
It doesn't mention a number of important contributions, and it also
glosses over the development of concepts which in retrospect appear
to be obvious, but that actually constitute a real improvement in the
understanding of programming as a problem.
Background in Formal Systems
-
The history usually starts here, where the λ-Calculus is contrasted
with Turing Machines. The implication is that the dichotomy between
imperative and functional languages precedes even the first automatic
computers, and as such represents a fundamental divide in the approach
to programming (an exaggeration).
-
Yet at the same time we should remind ourselves that formal systems
(Turing machines, the λ-Calculus, Post systems, etc.) are not
programming languages and do not deal with the specific issues that
arose in the field of programming language implementation in the
1950s. These include questions of typesetting, character sets,
finite memory, bugs in interpreters or compilers, issues of
portability, dealing with dependencies, and issues stemming from the
historical development of a language: legacy code has to be dealt with,
and attempts to improve a language are constrained by the existence of
real-world code that would have to be adapted to any changes. In essence,
there is a significant difference in the scale of application between
formal systems and programming systems, which implies a different set of
issues that each has to deal with.
-
Nevertheless, the formal systems mentioned here ((un)typed
λ-Calculus and the system of combinators) have had
an influence on the design of programming languages.
Specifically, the question of whether or not to
bind variables had already been posed by 1936.
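To illustrate the contrast (an anachronistic sketch using OCaml rather than the formal systems themselves; the function names are mine), the same function can be written with explicitly bound variables, or in a variable-free style in which only combining forms appear, which is the essence of the combinator approach:
(* The same function, once with explicitly bound variables ... *)
let twice f x = f (f x)
(* ... and once in a variable-free ("point-free") style, where the
   argument x never appears and only a combining form, composition,
   is used as glue. *)
let compose f g x = f (g x)
let twice' f = compose f f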
A Sketch of the History of Programming Languages
-
The history of functional programming languages is situated within the
history of programming languages in general. I elide the details of
this history here, and only wish to highlight the guiding
questions of programming language design and research since the
1950s.
-
My framing starts with the 1950s, when the means to implement a
language, but also the understanding of what a (good) programming
language requires, were still lacking. These developments were
documented by Knuth et al. Examples include FORTRAN, LISP 1 and ALGOL 58/60.
-
The 1960s stand in the shadow of ALGOL 60 ("[…] improvement on
its predecessors, but also on nearly all its successors") and
much creativity goes into thinking about iterations on the premises
and limitations that ALGOL sets up. Examples include ALGOL 68, ALGOL
W and then Pascal, PL/1, and CPL (which begets BCPL (which begets B (which
eventually begets C (…)))). At the same time these languages still have
to deal with the diversity of computing systems and approaches
to HCI, and are
therefore frequently non-portable or limited to specific sites.
-
The 1970s exhibit an interest in actual
portability (recall that ALGOL 60 did not have a
prescribed physical appearance, significantly meaning that character sets or
how to designate keywords were not fixed). As a consequence,
languages with small cores and portable, self-hosted
standard libraries (if any), mostly written in the languages themselves,
become interesting. Examples here include C, Forth, Smalltalk and
Scheme.
-
Towards the 1980s more thought was being placed on the tools that a
language can provide to make itself useful: Sophisticated debugging
tools, performance tracing and comprehensive standard libraries are intended to
improve the productivity of programmers by providing them with more
power and expressivity. Think of C++, Common Lisp and Ada. Parallel
trends include iterations on BASIC-like languages that were popular
among microcomputer users, and the research on functional languages
mentioned later on (descendants of ML and KRC).
-
From the 1990s onward, when an increasing number of professional
(but not academically trained) computer programmers
coincides with an increasing demand for computer systems (including
but certainly not limited to the rise of the WWW), two trends stand in
contention: The time-to-market-driven mentality has an interest
in interpreted languages that allow for fast prototyping of systems
(e.g. Perl, Ruby, Python), while at the same time the problem of
organizing large teams and ensuring collaboration creates interest in
languages that can prevent mistakes using tools previously intended to
increase the programmer's power (this is reflected in the general
interest in statically typed and analyzable languages that ensure
safety properties).
On Lisp
-
Lisp was initially conceived as another formalism by McCarthy, and then
later implemented on the IBM 704
by Russell, after he noticed that the universal function
eval could serve as an interpreter.
-
While it is true that Lisp was inspired by a mathematical theory of
computation (Kleene's theory of recursive functions), as opposed to the
conventional approach of the time of building on the instruction
language of the computer architecture, the approach taken by McCarthy
is distinct both from later functional programming languages and from the
λ-Calculus prior to it. In fact, McCarthy reflects:
To use functions as arguments, one needs a notation for functions, and
it seemed natural to use the λ-notation of Church (1941). I
didn't understand the rest of his book, so I wasn't tempted
to try to implement his more general mechanism for defining
functions. Church used higher order functionals instead of using
conditional expressions. Conditional expressions are much more readily
implemented on computers.
-
While the essence of LISP is based on symbols and cons-cells (compared
to the λ-Calculus, where functional abstractions and bound
variables serve as the ontological foundation), the reality of
LISP as a programming language was that it was intended to be
much closer to the conventions of the time, for instance FORTRAN and
later ALGOL: Despite support for recursion, which was used in the
demonstration of the eval-apply
meta-interpreter (Alan Kay refers to this as the Maxwell equations
of computer science, where the analogy is that the interaction
between the two functions rhymes with how electric and magnetic fields
are intertwined), in practice imperative programming
using progn, prog
and go (GOTO) was heavily used.
-
Furthermore, the variable binding policy in LISP is unlike that
familiar to us from the λ-Calculus. The latter uses
lexical binding, which ensures that a variable is bound analogously to the
program structure, while dynamic (or special) binding only
tracks a stack of variable bindings as they occur during evaluation.
Consider the following example:
(defun bar (x) y)   ; y is not bound within BAR
(defun foo (y) (bar nil))
When invoking the function foo, which in turn
invokes bar, returning the value of the
variable y turns out not to be an issue, since the
variable was bound in foo as a parameter and that binding
is still visible during the call. So LISP did not implement closures.
-
It was not until 1975 that Sussman and Steele proposed Scheme,
a dialect of Lisp that was more explicitly modeled on Church's
λ-Calculus, meaning that lambda evaluates to a
closure and variables are lexically scoped. The context was that
whilst working on Hewitt's recently
proposed Actor model for concurrent computation, they realized that the
constructors for actors (α) and functions (λ) were
equivalent. In later revisions of the Scheme report the concurrency
discussions were entirely dropped.
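As a minimal sketch of what "lambda evaluates to a closure" means (using OCaml, a descendant of the ML discussed below, rather than Scheme itself; the names are illustrative): the returned function captures the lexical binding of its free variable, instead of consulting whatever binding happens to be on the call stack when it is eventually applied.
(* make_adder returns a closure: the binding of n is captured from the
   lexical environment in which the function expression was evaluated. *)
let make_adder n = fun x -> x + n
let add5 = make_adder 5
(* Even though make_adder has long returned and no n is in scope at the
   call site, add5 3 still evaluates to 8. *)
let () = print_int (add5 3)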
-
Emphasis on symbolic computation allowed LISP to excel in the
meta-programming space, with the introduction of macros in 1963.
This is related to the Language ≅ Encoding point: The encoding of a program for the
eval-apply meta-interpreter (universal
function) is structurally equivalent to the function itself. Note
that this is not the case anymore once Scheme and subsequent
implementations introduce closures, as this breaks intensional
equality of functions. Nevertheless, regardless of the default
variable binding approach, the ease of dealing with and manipulating
the syntax of the language within/using the language
turned out to be of great convenience when extending the language with
constructs that it previously did not support. LISP therefore was
able to adapt to new ideas, from structured programming
(the loop macro is a prominent example of a macro that takes inspiration from
the ALGOL 68-like DO construct) to object-oriented programming (CLOS).
If You See What I Mean
-
In a series of papers, among other things dealing with the relation
of ALGOL 60 to the λ-Calculus, Peter Landin developed and finally
published his proposal for a family of programming languages
named ISWIM.
-
Syntactically, ISWIM is derived from ALGOL but drops most keywords
such as BEGIN and END, using indentation
for scoping instead (Landin calls this the offside rule). It should be
noted that ISWIM is strictly speaking a family of languages that do not make
strong commitments to the actual presentation, similar to ALGOL 60.
The resulting languages appear strikingly similar to most of the later
programming languages in the functional tradition.
-
A particularly influential syntactic innovation is
the where notation, and the let
notation derived from it as syntax sugar. Without going into the
λ-Calculus, Landin gives a definition of where in
terms of substitution, interestingly extended with a weak form of
pattern matching that can destructure a list of bindings at once.
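As an illustration (a sketch in OCaml's let syntax rather than ISWIM's, with hypothetical names), both notations amount to introducing a local definition, optionally destructuring several bindings at once:
(* ISWIM-style  "f(x) + f(y) where f(z) = z*z + 1"
   corresponds to a local definition introduced with let: *)
let example x y =
  let f z = z * z + 1 in
  f x + f y
(* Landin's weak form of pattern matching over a list of bindings
   survives as destructuring a tuple within a single binding: *)
let example2 point =
  let (a, b) = point in
  a + b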
Pattern Matching
-
Structural pattern matching (SPM) as a control-flow construct in the form known to us
goes back to the research of Rodney Burstall. This constitutes a
significant departure from ALGOL's control structure: Instead of
controlling forking execution using IF and iterating
execution using WHILE (or equivalent), the programmer
generalizes the former using CASE
expressions on variables of algebraic data types (in fact, ALGOL's
IF simply becomes syntax sugar for a CASE expression
if we require that the condition be of a boolean type) and makes iteration
explicit using recursive invocations, as sketched below. Sequencing is not necessarily
affected by this perspective-shift. Imperative languages usually have
it as a primitive construct, while functional languages can
replicate sequencing using λ-expressions, depending on the
evaluation strategy.
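A short sketch of this perspective shift, written in OCaml (which inherits it via ML and HOPE; the definitions are illustrative, not historical): branching becomes a case analysis over the constructors of an algebraic data type, and iteration becomes explicit recursion.
(* A list as an algebraic data type with two constructors. *)
type 'a seq = Nil | Cons of 'a * 'a seq
(* Iteration is made explicit through recursion; the CASE analysis is
   exhaustive over the constructors of the type. *)
let rec length xs =
  match xs with
  | Nil -> 0
  | Cons (_, rest) -> 1 + length rest
(* IF as syntax sugar for a CASE analysis over the two constructors of
   bool (ignoring evaluation order, which a strict language would have
   to handle by delaying the branches). *)
let if_ condition then_branch else_branch =
  match condition with
  | true -> then_branch
  | false -> else_branch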
-
SPM was first used in this manner in the POP
family of ALGOL-like languages with functional aspects. The
motivation for introducing SPM was to aid reasoning about the
correctness of programs. These developments later led to the
language NPL and then HOPE, named after a Scottish agricultural reformer.
-
Burstall would later participate in the effort to standardize Standard
ML, which would extend Milner's ML with the case
expression we now associate with the language family.
-
The more general notion of pattern matching had already existed prior
to Burstall's work. The closest example appears to have been
Valentin Turchin's Refal
language, developed in the Soviet Union in 1968. Details on the
pattern matching behavior can be found in
the manual. Earlier still, at Bell Laboratories, the SNOBOL
language (see the so-called Green
Book) uses a form of pattern matching on input data that also
influences the control flow of the language, yet in a much more
imperative way than the previously discussed languages. Later on,
Colmerauer's Prolog (1972), which descended from
Hewitt's (Micro-)Planner
system, would also use pattern matching by means of unification to
control evaluation, yet contrary to the approach taken by
Burstall, Prolog's pattern matching has an open-world
approach and does not try to promise exhaustiveness of the case
analysis.
The Meta Language
-
Robin Milner, while at Stanford, worked on an implementation of Dana
Scott's Logic of Computable Functions (LCF)
named Logic for Computable Functions (also LCF), an early automated theorem
prover based on domain theory. An early form of ML was used as the
language to express proofs in this system. A summary of how this
proto-ML (also referred to as LCF/ML) differed from the
contemporary conception of an ML-ish language is that ML was a typed
Lisp, with the core contribution being that Milner implemented type
inference, which had previously already been independently described
by Hindley in 1969. Notably, ML at this
stage lacked pattern matching. Sum types had
projections, as though they were products, and discriminators
that would be used to determine which side of the sum was
inhabited. Further details on the history of LCF, which laid the
groundwork for later systems such as HOL, can be
found here.
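To make the difference concrete, here is a hedged sketch in modern OCaml (not actual LCF/ML syntax or vocabulary) of what programming against a sum type looks like with discriminators and projections, compared with the later case-analysis style:
type value = Num of int | Str of string
(* Discriminator: decide which side of the sum a value inhabits. *)
let is_num v = match v with Num _ -> true | Str _ -> false
(* Projections: partial functions that extract one side and fail on the
   other, analogous to projections out of a product. *)
let out_num v = match v with Num n -> n | Str _ -> failwith "not a Num"
let out_str v = match v with Str s -> s | Num _ -> failwith "not a Str"
(* LCF/ML-flavoured use: first discriminate, then project. *)
let describe v =
  if is_num v then string_of_int (out_num v) else out_str v
(* The style later adopted from HOPE: a single case analysis. *)
let describe' v =
  match v with
  | Num n -> string_of_int n
  | Str s -> s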
-
LCF was moved between institutions, first from Stanford to Edinburgh,
then to Cambridge where Gérard Huet was visiting. Along
with Lawrence Paulson they implemented a compiler for ML that would replace the
previous implementation, which had been written in a dialect of Lisp (on a
PDP-10 system), resulting in a significant speedup.
Huet would later go on to develop CAML. CAML would go on to have two
implementations, the first by Ascánder Suárez written in Lisp,
later re-implemented by Xavier Leroy and Damien Doligez as
Caml Light, which would in turn
become OCaml (1996). OCaml
would influence further languages such
as F# and be used to implement
the Rocq proof system.
-
Parallel to the French developments, Milner led an effort to harmonize
the diverging developments of ML at different sites, but also to integrate
ideas from HOPE (inductive data types and SPM as control flow).
Starting in 1982, multiple meetings and proposals would be made for
the language that would later be known as Standard ML.
For an authoritative history of ML and later Standard ML, please
consult The History of Standard ML, or Xavier Leroy's presentation
on 25 years of OCaml.
The Parallel Lineage of FP
-
John Backus, known primarily for his early work on FORTRAN and for
contributing the Backus-Naur Form notation for context-free
languages in the context of the ALGOL standardization effort, was
awarded the ACM Turing Award in 1977. In his acceptance lecture,
Can Programming Be Liberated from the von Neumann Style? A Functional
Style and Its Algebra of Programs, he introduced the language FP,
which represented the research of his group at IBM.
-
In contrast to the previously discussed examples of languages descending
from Lisp and ISWIM, and in turn having a more or less vague relation
to the λ-Calculus, FP was inspired
by Ken Iverson's APL.
This influence expressed itself in an emphasis on point-free
composition of functions, along with functional forms. These
include Insert (denoted by /, performing a fold
operation), ApplyToAll (denoted by α,
i.e. mapping) and Compose (denoted by ○).
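Roughly, and translated into OCaml rather than FP's own one-character notation (the names below are mine, not FP's), these forms correspond to folding, mapping and function composition:
(* Insert (/): combine a non-empty sequence with a binary function. *)
let insert f = function
  | [] -> failwith "Insert is undefined on the empty sequence"
  | x :: xs -> List.fold_left f x xs
(* ApplyToAll (α): apply a function to every element of a sequence. *)
let apply_to_all f xs = List.map f xs
(* Compose (○): ordinary function composition. *)
let compose f g x = f (g x)
(* A program in the point-free spirit of FP: the sum of the squares of
   a sequence of numbers; sum_of_squares [1; 2; 3] evaluates to 14. *)
let sum_of_squares = compose (insert ( + )) (apply_to_all (fun x -> x * x))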
-
A few comments on APL: Iverson
-
Backus would later publish FL
(1989), integrating influences from ML (types, exceptions, IO), but
both the languages and the paradigm failed to gain a foothold, despite
playing a role in nurturing a more general interest in functional
programming in the wider world of computer science and programming.
Contributions of David Turner
Towards Haskell
Further Languages Worthy of Mention
Closing Comments
Postscript: History vs. Genealogy
The title of this presentation intentionally uses the
term Genealogy
in two senses:
-
The literal interpretation of studying the development of genes, so to
speak, of the attributes that are commonly associated with functional
programming.
-
In reference to the philosophical
method commonly associated with Nietzsche and Foucault. The
intention here is to expose the conditions and coincidences that led
to the historical development of ideas that are conventionally
perceived to be timeless.
Philip Kaludercic