\documentclass{llncs}
\usepackage{alltt}
\usepackage{amssymb}
\usepackage{graphicx}
\raggedbottom
\def\HOL{{\small HOL}}
\def\smalltt#1{{\small\texttt{#1}}}
\def\toolong{$\hspace{-100cm}$}
\def\legendre{\overwithdelims()}
\def\mod#1{\mathrel{(\mathop{\rm mod} {#1})}}
\begin{document}
\title{Formal proof -- getting started}
\author{Freek Wiedijk}
\institute{
Institute for Computing and Information Sciences \\
Radboud University Nijmegen \\
Toernooiveld 1, 6525 ED Nijmegen, The Netherlands}
\maketitle
\section{A list of 100 theorems}
Today highly non-trivial mathematics is routinely being
encoded in the computer, ensuring a reliability that is
orders of a magnitude larger than if one had just used human minds.
Such an encoding is called a \emph{formalization},
and a program that checks such a formalization for correctness
is called a \emph{proof assistant}.
Suppose you have proved a theorem and you want to make {certain}
that there are no mistakes in the proof.
Maybe already a couple of
times a mistake has been found and you want to make sure that
that will not happen again.
Maybe you fear that your intuition is misleading you and want
to make sure that this is not the case.
Or maybe you just want to bring
your proof into the most pure and complete form possible.
We will explain in this article how to go about this.
Although formalization has become a routine activity,
it still is labor intensive.
Using current technology, a formalization
will be roughly four times the size of a corresponding
informal {\LaTeX} proof (this ratio is called the \emph{de Bruijn factor}), and it will take almost a full week to
formalize a single page from an undergraduate mathematics
textbook.
The first step towards a formalization of a proof consists
of deciding which proof assistant to use.
For this it is useful to know which proof assistants have shown
to be practical for formalization.
On the web page \cite{web:100}
there is a list that keeps track of the formalization status of a hundred well-known theorems.
Here are the first few entries on that list:
%together with
%the number of systems in which the theorem has been formalized:
\begin{center}
\smallskip
\begin{tabular}{rlr}
\hbox to 0pt{\hss\rlap{\emph{theorem}}\phantom{0.}} && \llap{\emph{number of systems in which the theorem has been formalized}} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
1. & The Irrationality of $\sqrt{2}$ & $\ge 17$ \\
2. & Fundamental Theorem of Algebra & 4 \\
3. & The Denumerability of the Rational Numbers & 6 \\
4. & Pythagorean Theorem & 6 \\
5. & Prime Number Theorem & 2 \\
6. & G\"odel's Incompleteness Theorem & 3 \\
7. & Law of Quadratic Reciprocity & 7 \\
8. & The Impossibility of Trisecting the Angle and Doubling the Cube$\enskip$ & 1 \\
9. & The Area of a Circle & 1 \\
10. & Euler's Generalization of Fermat's Little Theorem & 4 \\
11. & The Infinitude of Primes & 6 \\
12. & The Independence of the Parallel Postulate & 0 \\
13. & Polyhedron Formula & 1 \\
%14. & Euler's Summation of $1 + {1\over 2}^2 + {1\over 3}^2 + \dots$ & 1 \\
%15. & Fundamental Theorem of Integral Calculus & 7 \\
\hbox to 0pt{\hss\rlap{\emph{\dots}}\phantom{0.}} && \dots
%\noalign{\smallskip}
%\hline
\end{tabular}
\smallskip
\end{center}
\noindent
On the web page \cite{web:100} only eight entries are listed
for the first theorem, but in \cite{17}
seventeen formalizations of the irrationality of $\sqrt{2}$ have been collected,
each with a short description of the proof assistant.
%Most are elementary like
%the Pythagorean theorem, but there also are some highly
%non-trivial ones mixed in like Fermat's Last Theorem.
When we analyze this list of theorems to see what systems occur most,
it turns out that there are five
proof assistants that have been significantly used for
formalization of mathematics.
These are:
\begin{center}
\smallskip
\begin{tabular}{lc}
\emph{proof assistant}$\quad$ & \emph{number of theorems formalized} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
{\HOL} Light & 69 \\
Mizar & 45 \\
ProofPower & 42 \\
Isabelle & 40 \\
Coq & 39 \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
\emph{all together} & 80
\end{tabular}
\smallskip
\end{center}
\noindent
Currently in all systems together 80 theorems from this list have been formalized.
We expect to get to 99 formalized theorems in the next few years, but Fermat's Last Theorem is
the 33rd entry of the list and therefore it will be some
time until we get to 100.
%All these systems have strong merits, and most have thriving
%user communities. There is no system among these five
%which is significantly `better' than the others, although
%they are quite different. To give a rough impression of
%the differences between these systems: {\HOL} Light and ProofPower are
%the most elegant and the easiest to extend by the user,
%while Mizar and Coq are the most expressive and the most
%`mathematical'. Isabelle, is the least idiosyncratic
%of this list, combining strengths from the other systems.
%The most impressive theorems that have been formalized
%in these systems are: the Four Color Theorem in Coq, the Prime
%Number Theorem in Isabelle (an elementary proof) and in
%{\HOL} Light (an analytic proof), the Jordan Curve Theorem
%in {\HOL} Light and in Mizar, G\"odel's First Incompleteness
%Theorem in Coq and in {\HOL} Light, and the Fundamental Theorem
%of Algebra in all but ProofPower.
If we do not look for quantity but for quality, the most impressive formalizations up to now are:
\begin{description}
\item
[G\"odel's First Incompleteness Theorem]
by Natarajan Shan\-kar using the proof assistant nqthm in 1986, by Russell O'Connor using Coq in 2003, and by John Harrison using {\HOL} Light in 2005.
\item
[Jordan Curve Theorem]
by Tom Hales using {\HOL} Light in 2005, and by Artur Korni{\l}owicz using Mizar in 2005.
\item
[Prime Number Theorem]
by Jeremy Avigad using Isabelle in 2004 (an elementary proof by Atle Selberg and Paul Erd\"os),
and by John Harrison using {\HOL} Light in 2008 (a proof using the Riemann zeta function).
\item
[Four Color Theorem]
by Georges Gonthier using Coq in 2004.
\end{description}
\noindent
All but one of the systems used for these four theorems
are among the five systems that we listed.
This again shows that currently these are the most interesting for
formalization of mathematics.
Here are the \emph{proof styles} that one finds in these systems:
\begin{center}
\smallskip
\begin{tabular}{lc}
\emph{proof assistant}$\quad$ & \emph{proof style of the system} \\
\noalign{\smallskip}
\hline
\noalign{\smallskip}
{\HOL} Light & {procedural} \\
Mizar & {declarative} \\
ProofPower & {procedural} \\
Isabelle & {both declarative and procedural possible} \\
Coq & {procedural} \\
\end{tabular}
\smallskip
\end{center}
\noindent
A \emph{declarative} system is one in which one writes a proof
in the normal way, although in a highly stylized language
and with very small steps. For this reason a declarative
formalization resembles program source code more than
ordinary mathematics. In a \emph{procedural} system one does
not write proofs at all. Instead one holds a dialogue with
the computer. In that dialogue the computer presents the
user with \emph{proof obligations} or \emph{goals}, and the user then
executes \emph{tactics}, which reduce a goal to zero or more new,
and hopefully simpler, \emph{subgoals}. Proof in a procedural
system is an interactive game.
In this paper we will show {\HOL} Light as the example of a procedural
system, and Mizar as the example of a declarative system.
%{\HOL} Light and ProofPower are similar, as they both
%are reimplementations of the {\HOL} system.
%Isabelle also
%started as a reimplementation of {\HOL}, but by now has grown different in
%many respects.
%Coq is related to {\HOL} as well (although less closely) through the {\small LCF} system.
The main
advantage of {\HOL} Light is its elegant architecture, which
makes it a very powerful and reliable system. A proof of
the correctness of the 394 line {\HOL} Light `logical core'
even has been formalized. On the other hand {\HOL} has the
disadvantage that it sometimes cannot express abstract
mathematics -- mostly when it involves algebraic structures -- in
an attractive way.
It \emph{can} essentially express all abstract mathematics though.
Another disadvantage of {\HOL} is that the proof parts of
the {\HOL} scripts are unreadable.
They can only be understood by executing them on the computer.
Mizar on the other hand allows one to write abstract
mathematics very elegantly, and its scripts are almost
readable like ordinary mathematics. Also Mizar has \emph{by
far} the largest library of already formalized mathematics
(currently it is over 2 million lines). However, Mizar
has the disadvantage that it is not possible for a user to
automate recurring proof patterns, and the proof automation
provided by the system itself is rather basic. Also, in
Mizar it is difficult to express the formulas of calculus in
a recognizable style.
It is not possible to `bind' variables,
which causes expressions for constructions like sums, limits,
derivatives and integrals to look unnatural.
\section{The example: Quadratic Reciprocity}
In this article we will look at two formalizations of a specific theorem.
For this we will take the \emph{Law of Quadratic
Reciprocity}, the seventh theorem from the list of hundred theorems.
This theorem has thus far been formalized in four systems:
by David Russinoff using nqthm
in 1990, by Jeremy Avigad using Isabelle in 2004,
by John Harrison using {\HOL} Light in 2006,
and by Li Yan, Xiquan Liang and Junjie Zhao using Mizar in 2007.
When I was a student, my algebra professor Hendrik Lenstra
always used to say that the Law of Quadratic Reciprocity is the first
non-trivial theorem that a student encounters in the mathematics curriculum.
Before this theorem,
most proofs can be found without too much trouble by expanding the definitions
and thinking hard.
In contrast the Law of Quadratic Reciprocity is the first theorem that is totally
unexpected.
%
It was already conjectured by Euler and Legendre, but was only proved by
the `Prince of Mathematicians' Gauss,
who called it the \emph{Golden Theorem} and during his lifetime gave eight different proofs of it.
The Law of Quadratic Reciprocity relates whether an odd prime $p$ is a square modulo an odd prime
$q$, to whether $q$ is a square modulo $p$.
The theorem says that these are equivalent
unless both $p$ and $q$ are 3 modulo 4, in which case they have opposite truth
values.
There also are two \emph{supplements} to the Law of Quadratic Reciprocity, which
say that
$-1$ is a square modulo an odd prime $p$ if and only if $p \equiv 1 \mod{4}$,
and that $2$ is a square modulo an odd prime $p$ if and only if $p \equiv \pm 1 \mod{8}$.
The Law of Quadratic Reciprocity is usually phrased using the \emph{Legendre
symbol}.
A number $a$ is called a \emph{quadratic residue} modulo $p$ if there exists an $x$ such
that $x^2 \equiv a \mod{p}$.
The Legendre symbol ${a\legendre p}$ for $a$ coprime to $p$ then is defined by
$${a\legendre p} = \cases{\phantom{-}1\quad\mbox{if $a$ is a quadratic residue modulo $p$}\cr
-1\quad\mbox{otherwise}}$$
Using the Legendre symbol, the Law of Quadratic Reciprocity can be written as:
$${p\legendre q}{q\legendre p} = (-1)^{{p - 1\over 2}{q - 1\over 2}}$$
The right hand side will be $-1$ if and only if both $p$
and $q$ are $3 \mod{4}$.
There are many proofs of the Law of Quadratic Reciprocity \cite{reciprocity:overview}.
The formalizations shown here both formalize elementary
counting proofs that go back to
Gauss' third proof.
We will not give details of these proofs here, but will just outline the
main steps, following \cite{reciprocity:proof}.
First by using that only half of the residues modulo $p$ can
be a square, one proves \emph{Euler's criterion}:
$${a\legendre p} \equiv a^{{1\over 2}(p - 1)} \mod{p}$$
Using this criterion, by calculating the product of $a$, $2a$, \dots, ${1\over 2}(p - 1)a$ in two different ways, one then proves \emph{Gauss' lemma} which says that
$${a\legendre p} = (-1)^l$$
in which $l$ is the number of $1 \le j \le {1\over 2}(p - 1)$
for which there is an $-{1\over 2}p < a' < 0$ such
that $a j \equiv a' \mod{p}$.
Finally one uses Gauss' lemma to derive the Law of Quadratic Reciprocity by
counting lattice points in the four regions of the following figure:
\begin{center}
\includegraphics[width=7.5cm]{qr.eps}
\end{center}
\section{{\normalsize\textbf{HOL}} Light}
Suppose that we select {\HOL} Light as our proof assistant.
The second step will be to download and install the system.
This does not take long. First download the ocaml compiler
from \cite{web:ocaml} and install it. Next download
the \smalltt{tar.gz} file with the current version of the {\HOL} Light sources from
\cite{web:hol_light} and unpack it.
Then follow the installation instructions in
the \smalltt{README} file.
If you use Linux or Mac {\small OS} X, all you will need to do is type `\smalltt{make}'.
Under Windows installation is a bit more involved: you will have to
copy the `\smalltt{pa\char`\_j\char`\_}\dots\smalltt{.ml}' file that corresponds to your version of ocaml as given by `\smalltt{ocamlc -version}' to a file called `\smalltt{pa\char`\_j.ml}',
and then compile that copy using one of the two `\smalltt{ocamlc -c}' commands
that are in the \smalltt{Makefile}.
When you
have installed the system, start the ocaml
interpreter by typing `\smalltt{ocaml}', and then enter the command
`\smalltt{\char`\#use "hol.ml";;}'.
This checks and loads the basic library of {\HOL}
Light, which takes a few minutes. After that you can load {\HOL}
files by typing for example `\smalltt{loadt "100/reciprocity.ml";;}'.
The third step will be to write a formalization of
the proof. For this you will have to learn the {\HOL}
proof language. To do this it is best to study two documents:
the {\HOL} Light manual \cite{hol_light:manual} and the {\HOL} Light tutorial \cite{hol_light:tutorial}.
\begin{figure}\label{hol:figure}
\begin{center}
\includegraphics[width=7.5cm]{example-hol.eps}
\end{center}
\caption{Fragment of the {\HOL} Light formalization of the Law of Quadratic Reciprocity.}
\end{figure}
Instead of taking this third step and describing how one writes a formalization,
here we will just look
at formalization that already exists.
It can be found in the file
`\smalltt{100/reci\-procity.ml}' (see Fig.~\ref{hol:figure})
and formalizes the proof from \cite{reciprocity:proof}.
This file can also be found on the web by itself as \cite{hol_light:formalization}.
It consists of 753 lines of {\HOL} Light code and proves
41 lemmas on top of the already existing {\HOL} Light mathematical
libraries.
The statement of the final lemma is:
\begin{quote}
\def\\{\char`\\}
\begin{alltt}\small
!p q. prime p /\\ prime q /\\ ODD p /\\ ODD q /\\ ~(p = q)
==> legendre(p,q) * legendre(q,p) =
--(&1) pow ((p - 1) DIV 2 * (q - 1) DIV 2)
\end{alltt}
\end{quote}
\noindent
This is the Law of Quadratic Reciprocity in {\HOL} syntax.
In this expression the exclamation mark is {\small ASCII} syntax
for the universal quantifier, the combination of slash and backslash is
supposed to resemble the $\land$ sign and represents conjunction,
the tilde is logical negation,
and the ampersand operator maps the natural numbers into
the real numbers.
The functions \smalltt{pow} and \smalltt{DIV} represent exponentiation
and division, and the term \smalltt{legendre(p,q)}
represents the Legendre symbol ${p\legendre q}$. This last function is defined in the
formalization by the {\HOL} syntax:
\begin{quote}
\begin{alltt}\small
let legendre = new_definition
`(legendre:num#num->int)(a,p) =
if ~(coprime(a,p)) then &0
else if a is_quadratic_residue (mod p) then &1
else --(&1)`;;
\end{alltt}
\end{quote}
\noindent
In this the expression \smalltt{num\char`\#num->int}
corresponds to functions of type ${\mathbb N}\times{\mathbb N}\to{\mathbb Z}$.
That is, \smalltt{num} and \smalltt{int} are the
natural numbers and the integers, the hash symbol represents the Cartesian
product, and the combination of a minus and a greater than sign
is supposed to look like an arrow.
A formalization, in any proof assistant, mainly consists
of a long chain of lemmas, where each lemma consists of a
label, a statement, and a proof.
In between these
lemmas there occasionally are a few definitions.
In \smalltt{reciprocity.ml} there are three: a definition
of the notion of quadratic residue, a definition of
the Legendre symbol, and a definition of the notion of
iterated product.
The formalization of the Law of Quadratic Reciprocity is too
large to explain in full here.
Therefore we will now zoom in on one of its smallest lemmas,
the third lemma in the file:
\begin{quote}
\begin{alltt}\small
let CONG_MINUS1_SQUARE = prove
(`2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)`,
SIMP_TAC[LE_EXISTS; LEFT_IMP_EXISTS_THM] THEN REPEAT STRIP_TAC THEN
REWRITE_TAC[cong; nat_mod; ARITH_RULE `(2 + x) - 1 = x + 1`] THEN
MAP_EVERY EXISTS_TAC [`0`; `d:num`] THEN ARITH_TAC);;
\end{alltt}
\end{quote}
\noindent
This {\small ASCII} text consists of the three parts mentioned above:
the first line gives the label under which the result
will be referred to later,
the second line states the statement of the lemma,
and the last three lines encode the proof.
In this proof, there are references to
four earlier lemmas from the {\HOL} Light library:
\begin{quote}
\begin{alltt}\small
LE_EXISTS !m n. m <= n <=> (?d. n = m + d)
LEFT_IMP_EXISTS_THM !P Q. (?x. P x) ==> Q <=> (!x. P x ==> Q)
cong !rel x y. (x == y) rel <=> rel x y
nat_mod !x y n. mod n x y
<=> (?q1 q2. x + n * q1 = y + n * q2)
\end{alltt}
\end{quote}
\noindent
The proof of this lemma encodes a dialog with
the system.
We can execute the proof all at once (this happens when we load the
file as a whole), but we can also
process the proof step by step, in an interactive fashion.
This is the way in which a {\HOL} Light proof is developed.
To do this, we enter the following command (where the \smalltt{\char`\#} character is the prompt of the system):
\begin{quote}
\begin{alltt}\small
# g `2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)`;;
\end{alltt}
\end{quote}
\noindent
The \smalltt{g} command asks the system to set the `goal' to the statement
between the backquotes. The system then replies with:
\begin{quote}
\begin{alltt}\small
Warning: Free variables in goal: p
val it : goalstack = 1 subgoal (1 total)\medskip
`2 <= p ==> ((p - 1) * (p - 1) == 1) (mod p)`
\end{alltt}
\end{quote}
\noindent
indicating that it understood us and that this statement
now is the current goal. Next we execute the first `tactic'
(a command to the system to reduce the goal) by using the \smalltt{e}
command:
\begin{quote}
\begin{alltt}\small
# e (SIMP_TAC[LE_EXISTS; LEFT_IMP_EXISTS_THM]);;
\end{alltt}
\end{quote}
\noindent
Note that this corresponds to the initial part of the third
line of the lemma in the way that it is written in the file.
The tactic \smalltt{SIMP\char`\_TAC} uses the theorems given in its
argument to simplify the goal.
It transforms the goal to:
\begin{quote}
\begin{alltt}\small
`!d. p = 2 + d ==> (((2 + d) - 1) * ((2 + d) - 1) == 1) (mod (2 + d))`\toolong
\end{alltt}
\end{quote}
\noindent
As already noted, the `\smalltt{!}' symbol is the universal
quantifier, which means that the statement that now is the goal is universally quantified
over all natural numbers \smalltt{d}. The existential quantifier,
which will occur below, is written as `\smalltt{?}'.
We now display the rest of the interactive session without further
explanations between the commands.
This is a dialogue between
the human user executing tactics and the computer presenting
the resulting proof obligations (`goals').
In square brackets are the \emph{assumptions} that
may be used when proving the goal.
\begin{quote}
\begin{alltt}\small
# e (REPEAT STRIP_TAC);;
val it : goalstack = 1 subgoal (1 total)\medskip
0 [`p = 2 + d`]\medskip
`(((2 + d) - 1) * ((2 + d) - 1) == 1) (mod (2 + d))`\medskip
# e (REWRITE_TAC[cong; nat_mod; ARITH_RULE `(2 + x) - 1 = x + 1`]);;
val it : goalstack = 1 subgoal (1 total)\medskip
0 [`p = 2 + d`]\medskip
`?q1 q2. (d + 1) * (d + 1) + (2 + d) * q1 = 1 + (2 + d) * q2`\medskip
# e (MAP_EVERY EXISTS_TAC [`0`; `d:num`]);;
val it : goalstack = 1 subgoal (1 total)\medskip
0 [`p = 2 + d`]\medskip
`(d + 1) * (d + 1) + (2 + d) * 0 = 1 + (2 + d) * d`\medskip
# e ARITH_TAC;;
val it : goalstack = No subgoals
\end{alltt}
\end{quote}
\noindent
At this point the proof is finished, as there are no unproved subgoals left.
To effectively use {\HOL} Light you will need to learn the
dozens of tactics available in the system.
This example uses the following tactics:
\begin{quote}
\def\r#1{\textrm{\normalsize #1}}
\def\m#1{$\mathbb #1$}
\begin{alltt}\small
SIMP_TAC \r{Simplify the goal by theorems}
REPEAT \r{Apply a tactic repeatedly until it fails}
STRIP_TAC \r{Break down the goal}
REWRITE_TAC \r{Rewrite conclusion of goal with equational theorems}
ARITH_RULE \r{Linear arithmetic prover over \m{N}}
MAP_EVERY \r{Map tactic over a list of arguments}
EXISTS_TAC \r{Provide a witness to an existential statement}
ARITH_TAC \r{Tactic to solve linear arithmetic over \m{N}}
\end{alltt}
\end{quote}
\noindent
A final note about this example.
The lemma talks about natural numbers,
which means that for $p = 0$ the difference $p - 1$ is defined to be 0.
This is called `truncated' subtraction.
This complicates the proof, and also explains the need for the condition
\smalltt{2 <= p} in the statement of the lemma.
If the lemma had been stated using integers, it would have been
provable without human input by the automated prover \smalltt{INTEGER\char`\_RULE}:
\begin{quote}
\begin{alltt}\small
# INTEGER_RULE `!p:int. ((p - &1) * (p - &1) == &1) (mod p)`;;
1 basis elements and 0 critical pairs
val it : thm = |- !p. ((p - &1) * (p - &1) == &1) (mod p)
\end{alltt}
\end{quote}
\noindent
In that case no explicit proof script would have been necessary.
\section{Mizar}
If instead of {\HOL} Light we choose Mizar as our proof assistant, again
the second step consists of downloading and installing
the system. Download the system from the Mizar web site
\cite{web:mizar}, unpack the \smalltt{tar} or \smalltt{exe} file, and follow the instructions
in the \smalltt{README}.
Mizar is distributed as compiled binaries,
which means that we do not need to install anything else first.
The third step then again is to write a formalization of
the proof.
The best way to learn the Mizar language is
to work through the Mizar tutorial \cite{mizar:tutorial}.
\begin{figure}\label{mizar:figure}
\begin{center}
\includegraphics[width=7.5cm]{example-mizar.eps}
\end{center}
\caption{Fragment of the Mizar formalization of the Law of Quadratic Reciprocity.}
\end{figure}
The Law of Quadratic Reciprocity is formalized in the file
`\smalltt{mml/int\char`\_5.miz}'
(part of this file is shown in Fig.~\ref{mizar:figure}; it also is on the web by
itself as \cite{mizar:formalization}).
This file again primarily consists of
a long chain of lemmas.
It consists of 4701 lines proving 51 lemmas.
It also has 3 definitions:
a definition of integer polynomials, a definition of
quadratic residues, and a definition of the
Legendre symbol.
To have Mizar check this file for correctness,
copy the file `\smalltt{mml/int\char`\_5.miz}' inside a fresh directory
called `\smalltt{text}', and \emph{outside} this directory type
\begin{quote}
\smalltt{mizf text/int\char`\_5.miz}
\end{quote}
\noindent
This will print something like:
\begin{quote}
\begin{alltt}\small
Processing: text/int_5.miz\medskip
Parser [4701] 0:02
Analyzer [4700] 0:13
Checker [4700] 1:14
Time of mizaring: 1:29
\end{alltt}
\end{quote}
\noindent
which means that the file was checked without errors.
Try modifying \smalltt{int\char`\_5.miz} and see whether the
checker will notice that the file now no longer is correct.
The Law of Quadratic Reciprocity is the 49th lemma from
the file. It reads:
\begin{quote}
\begin{alltt}\small
p>2 & q>2 & p<>q
implies Lege(p,q)*Lege(q,p)=(-1)|^(((p-'1) div 2)*((q-'1) div 2))
\end{alltt}
\end{quote}
\noindent
The proof of this statement takes 1268 lines of Mizar code!
Here is a smaller example of a Mizar proof.
This is the 11th lemma from the file:
\begin{quote}
\begin{alltt}\small
theorem Th11:
i gcd m = 1 & i is_quadratic_residue_mod m & i,j are_congruent_mod m\toolong
implies j is_quadratic_residue_mod m
proof
assume
A1: i gcd m = 1 & i is_quadratic_residue_mod m & i,j are_congruent_mod m;\toolong
then consider x being Integer such that
A2: (x^2 - i) mod m = 0 by Def2;
m divides (i - j) by A1,INT_2:19;
then
A3: (i - j) mod m = 0 by Lm1;
(x^2 - j) mod m = ((x^2 - i) + (i - j)) mod m
.= (((x^2 - i) mod m) + ((i - j) mod m)) mod m by INT_3:14
.= 0 by A2,A3,INT_3:13;
hence thesis by Def2;
end;
\end{alltt}
\end{quote}
\noindent
The lemmas from the Mizar library to which this proof refers are:
\begin{quote}
\begin{alltt}\small
INT_2:19 a,b are_congruent_mod c iff c divides (a-b)
INT_3:13 (a mod n) mod n = a mod n
INT_3:14 (a + b) mod n = ((a mod n) + (b mod n)) mod n
Lm1 (x divides y implies y mod x = 0) &
(x<>0 & y mod x = 0 implies x divides y)
Def2 a is_quadratic_residue_mod m iff ex x st (x^2 - a) mod m = 0\toolong
\end{alltt}
\end{quote}
\noindent
If you think that the condition `\smalltt{i gcd m = 1}' is not
used in this proof, you can try removing it, both from
the statement and from the `\smalltt{assume}' step, and see what
happens when you check the file again.
\section{The future of formal mathematics}
In mathematics there have been three main revolutions:
\begin{itemize}
\item
The introduction of \emph{proof} by the Greeks in the fourth century BC,
culminating in Euclid's \emph{Elements}.
\item
The introduction of \emph{rigor} in mathematics in the nineteenth
century.
During this time the non-rigorous calculus was made rigorous by Cauchy
and others.
This time also saw the development of mathematical logic by Frege and
the development of set theory by Cantor.
\item
The introduction of \emph{formal mathematics} in the late twentieth
and early twenty-first century.
\end{itemize}
\noindent
Most mathematicians are not aware that this third revolution already
has happened, and many probably will disagree that this revolution even
is needed.
However, in a few centuries mathematicians will look back at our time
as the time of this revolution.
In that future most mathematicians will not consider mathematics
to be definitive unless it has been fully formalized.
Although the revolution of formal mathematics already has happened and formalization
of mathematics has become a routine activity,
it is not yet ready for widespread use by all mathematicians.
For this it will have to be improved in two ways:
\begin{itemize}
\item
First of all, formalization is \emph{not close enough to existing mathematical
practice} yet to be attractive to most mathematicians.
For instance, both {\HOL} Light and Mizar define
$${1\over 0} = 0$$
because they do not have the possibility to have functions be undefined for
some values of the arguments.
This is just a trivial example, but in many other places the statements
of formalized mathematics are not close to their counterpart in everyday mathematics.
%To be able to use some proof assistants one even needs to be
%aware of how to reason intuitionistically, which certainly
%is an insurmountable barrier against the wide adoption of these systems.
Here
there exists room for significant progress.
However, it is \emph{not} important to have proof assistants be able to
process existing mathematical texts.
Writing text in a stylized formal language is easy.
The fact that proof assistant are not able to understand natural
language will not
be a barrier to having formalization be adopted by
the working mathematician.
\item
The second improvement that will be needed is on the side of \emph{automation}.
With this I do not mean that the computer should take steps
that a mathematician would need to think about.
Formalization of mathematics is about checking, and not about discovery.
However, currently steps in a proof that even a high school student
can easily take without much thought often take many minutes to formalize.
This lack of automation of `high school mathematics'
is the most important reason why formalization currently still
is a subject for a small group of computer scientists,
instead of it having
been discovered by all mathematicians.
\end{itemize}
\noindent
Still, there are no fundamental problems that block these improvements
from happening.
It is just a matter of good engineering.
In a few decades it will no longer take one week to formalize a page
from an undergraduate textbook.
Then that time will have dropped to a
few hours.
Also then the formalization will be quite close to what
one finds in such a textbook.
When this happens we will see a quantum leap, and suddenly all mathematicians
will start using formalization for their proofs.
When the part of refereeing a mathematical article that consists of checking its correctness
takes more time than formalizing the contents of the paper would
take, referees will insist on getting a formalized version before they want
to look at a paper.
However, having mathematics become utterly reliable might not be the primary reason
that eventually formal mathematics will be used by most mathematicians.
Formalization of mathematics can
be a very rewarding activity in its own right.
It combines the pleasure of
computer programming (craftsmanship, and the computer doing
things for you), with that of mathematics (pure mind,
and absolute certainty.) People who do not like programming
or who do not like mathematics probably will not like formalization.
However, for people who like both
formalization is the best thing there is.
%
%In his paper about the Compcert C compiler that was proved
%completely correct
%\cite{compcert}, Xavier Leroy writes about the activity of building Coq proof scripts:
%\begin{center}
%\emph{Building such scripts is surprisingly addictive, in a videogame kind of way.}
%\end{center}
%And this is exactly how it is.
\begin{thebibliography}{10}
\bibitem{web:100}
\smalltt{http://www.cs.ru.nl/\char`\~freek/100/}
\bibitem{17}
Freek Wiedijk (ed.), \emph{The Seventeen Provers of the World},
with a forword by Dana S.~Scott,
Lecture Notes in Artificial Intelligence 3600, Springer, 2006.
\bibitem{reciprocity:overview}
G.H. Hardy and E.M. Wright, \emph{An Introduction to the Theory
of Numbers}, 1938.
Fifth edition: Oxford University Press, 1980.
\bibitem{reciprocity:proof}
Alan Baker, \emph{A Concise Introduction to the Theory of Numbers},
Cambridge University Press, 1984.
\bibitem{web:ocaml}
\smalltt{http://caml.inria.fr/ocaml/}
\bibitem{web:hol_light}
\smalltt{http://www.cl.cam.ac.uk/\char`\~jrh13/hol-light/}
\bibitem{hol_light:manual}
\smalltt{http://www.cl.cam.ac.uk/\char`\~jrh13/hol-light/hol-light-manual-1.1.pdf} \\
John Harrison, \emph{The HOL Light manual} (1.1), 2000.
\bibitem{hol_light:tutorial}
\smalltt{http://www.cl.cam.ac.uk/\char`\~jrh13/hol-light/tutorial\char`\_220.pdf} \\
John Harrison, \emph{HOL Light Tutorial} (for version 2.20), 2006.
\bibitem{hol_light:formalization}
\smalltt{http://www.cs.ru.nl/\char`\~freek/notices/reciprocity.ml}
\bibitem{web:mizar}
\smalltt{http://www.mizar.org/}
\bibitem{mizar:tutorial}
\smalltt{http://www.cs.ru.nl/\char`\~freek/mizar/mizman.pdf} \\
Freek Wiedijk, \emph{Writing a Mizar article in nine easy steps}, 2007.
\bibitem{mizar:formalization}
\smalltt{http://www.cs.ru.nl/\char`\~freek/notices/int\char`\_5.miz}
%\bibitem{compcert}
%Xavier Leroy, `Formal Certification of a Compiler Back-end, or: Programming
%a Compiler with a Proof Assistant', POPL'06, Charleston, South Carolina, 2006.
\end{thebibliography}
\end{document}