Short explanation of
why
I'm coding a proof of the
Jordan curve theorem
The tower stood flat against the blackboard, a dead, crystalline, cutting blade of red and blue and orange.
William F. Orr, Euclid Alone
A fine red line encircled two forking arrows and included six square feet of deductions along each path. Between the two were a series of equations in red.
Isaac Asimov, Foundation
I want to put a proof of the Jordan curve theorem in the computer. The full proof, everything it takes, from as simple and clear and universal an axiom system as possible. Including every seemingly trivial detail. I want to code it inside the computer in such a way that that computer can check the correctness.
Let me explain why.
I think there's a revolution in mathematics around the corner.
I think that in later times people will look back on the fin the siecle of the twentieth century and say "then is when it happened" (just like we look back at the Greeks for inventing the concept of proof and at the nineteenth century for making analysis rigorous.) I really believe that. And it amazes me that no one seems to notice.
Until now mathematics has always been something inside a human head. Or in a mathematical talk, which consists of people making sounds. Or in a mathematical publication, which consists of ink-marks on paper (or the same marks in a computer, but it stays just marks, just talk.) More or less confused thoughts and babble, that's what mathematics always has been. Until now.
Somehow, we imagine a platonic mathematical realm, something we claim to talk about... but it's all just imagination. All there is, what exists in the material world, are the thoughts in our minds and the sounds/marks we make about it. And that is all rather vague, really.
Lakatos has written very nicely about this. (Lakatos thought this to be inevitable. I don't. But then Lakatos died before computers became common. And I didn't :-))
Suppose I have in my mind some mathematical construction, a proof say. Suppose I have proven some theorem. And suppose even further that I realize that my proof can be done inside some formal system. Then I "know" that there is some abstract, mathematical object in that formal system (a "proof") that corresponds in some way to the proof I'm thinking about, the proof I have found, the spirit of my proof.
Now the nice thing about formal systems is that they're very simple, discrete, finitary, systems. It's quite easy to code something from such a system as a finite string over some finite alphabet. Including the "formal proof" I was just talking about. So, there is some finite string that corresponds in some natural way to the proof in the formal system, a string that therefore "is" my proof. (Of course it will be a quite long and unclear string, not at all resembling the crystal clear image that the proof is in my mind... but in some sense it will be the "same" proof.)
Now it gets interesting. Because we all know that all there is to a computer, all that exists inside a computer is just strings of bits and bytes. Which is remarkably similar to the thing I just talked about: a finite string over some finite alphabet.
But of course it's not the same at all! The string of symbols coding the "proof" in the finitary formal system is still a platonic abstract object in mathematical never-never land. While the string of bits in a physical computer consists of physical things, electrons flowing through electronic circuits.
Still these two kinds of strings are so remarkably similar!
Never before have the platonic mathematical world and the physical world been this similar, this close. Is it strange that I expect leakage between these two worlds? That I think the proof strings will find their way to the computer memories?
So much for the philosophy :-) So what do I think will happen? What do I expect will change in the way that people do mathematics?
What I expect is that some kind of computer system will be created, a proof checker, that all mathematicians will start using to check their work, their proofs, their mathematics. I have no idea what shape such a system will be going to take. But I expect some system to come into being that is past some treshold so that it is practical enough for real work, and then quite suddenly some kind of "phase transition" will occur and everyone will be using that system.
One of the objections to this prediction that might be made is that mathematicians will never be going to agree about what such a system should look like and so that no system will be adopted on a large scale. I think this is nonsense: the same could be said for the LaTeX typesetting system, and everyone in the mathematical community is using that to write their texts. And really, mathematicians do agree about a lot of mathematical conventions. I mean: if I say "vector space" everyone knows what I'm refering to, right? Lots of axioms there and they're completely standardized.
Another objection might be that, well, maybe someday something like what I predict might happen, but it will take a long time for mathematicians to get used to the idea and to start using such a system. I don't believe this. I think one often underestimates how fast a new practice can be adopted by a large group of people. I think "proof checking" will go the way of "hypertext". Some years ago people were just talking about how nice the idea of hypertext was, without much happening. The group of people doing research in the hypertext field was rather small and their ideas seemed idealistic and maybe a bit irrelevant. And then suddenly there was Netscape (actually it was called Mosaic then.) And suddenly hypertext was everywhere. Only nobody called it hypertext anymore: they called it "Internet". So, I expect that someone will create a practical system to verify mathematics by computer, and suddenly all mathematicians will be using it: only they won't call it proof checking, they will call it after the name of the company selling the thing. (I want to be the Ted Nelson of proof checking. I hope Automath will be the Xanadu of proof checking.)
I think this analogy suggests that the quantum leap in proof checking will occur before the technology is really mature: that it isn't necessary to have the perfect system before it will be widely used. If one looks at the WWW then it's a bad implementation of hypertext (as Ted Nelson has pointed out.) Still it's good enough. And that's all it takes.
Another objection: it's too hard. Although in principle it is possible to code all details of a realistic mathematical proof, it will be very unpractical because of the enormous amount of detail that is required. This is the claim that Davies and Hirsch make in their The Mathematical Experience. I think three arguments can be given to counter this objection.
First, nowadays we have computers and computers are very good at helping people manage lots of boring detail: so I think that the possibility of "automation" takes quite a bit of the bite out of this objection. If computers can calculate fifty billion digits of pi (with the help of Yasumasa Kanada and Daisuke Takahashi), surely the amount of detail required is irrelevant. Second, exactly the same kind of complexity as is involved in the coding of a mathematical proof occurs in "ordinary" computer programs and there it doesn't seem to harm too much. For instance, when programming something like a word processor or web browser, every tiny detail of the behavior of the program has to be spelt out to the computer: and such a program has a behavior that's easily on the same order of complexity as all detail of a mathematical proof. (For instance, Donald Knuth's source code of TeX has about the same size as the source code of Natarajan Shankar's Boyer-Moore proof of Gödel's incompleteness theorem (1005K versus 1031K.) One reason for this might be that mathematicians like to make their mathematics as simple as possible, while computer programmer's don't care too much for simplicity... or so it seems.) Third, there are quite significant pieces of mathematics that have been completely formalized already (look for instance at the work done with Mizar, Otter and the Boyer-Moore system.) Which means that experience already has shown that the problems are not insurmountable. So maybe it's only a matter of making things just a bit easier to bring it into the reach of "ordinary" mathematicians.
Yet another possible objection: it won't happen because it's not worth the effort. When a mathematician has a proof, that's all that's needed: all the extra effort of "doing the computer thing" will be superfluous, because one has the proof already.
To me this is the most silly objection of all: it's like saying that proving something isn't necessary once one is convinced that it's true. I mean, mathematics is about certainty, right? This is a chance to get much closer to certainty than before: of course it's worth it!
Furthermore, I don't think the objection holds on a more practical level either. People make mistakes: and it's not nice to have a referee discover errors in your paper. So being sure everything fits together like it should, that does add something. Also, when working on a mathematical proof, one constantly keeps checking and rechecking partial results: there always is the nagging doubt that the earlier work isn't sound, that one is building on quicksand. To get rid of that doubt will be very nice. (Actually, when working with a proof checker it sometimes works the other way: one has finished the proof and then one is left with the urge to check the thing... but that is not necessary because the computer already did it: this gives a decidedly unpleasant and "lost" feeling, like being interrupted in the middle of a sentence.) Finally, when being stuck in a proof, anything to keep the mind busy is useful to get fresh ideas: doing the dishes is the common example. I think that putting the earlier work in the computer will do just as well...
So much for the futurism :-) So what do I want to do about it personally. What are my plans?
Basically, I want to be there. I want to help. I want to know about it. I want to try it, to find out where the bottlenecks are. I want to find out what the state of the art in this field currently is.
So. What I think is needed is to try to do the "mathematical" thing. What I think needs to be shown is that it is possible to check the work of the "working mathematician" by computer. (And I mean real work, twentieth (or -first :-)) century mathematics, not some toy problem or ancient mathematics.)
So what I really should do is go look for a mathematical friend, find out what her latest paper was, and then translate that into the computer.
But there are two problems with this plan: there is a lot of mathematics that a research paper makes use of, that it builds on, mathematics that needs to be put in the computer first, making this a very big effort (although inventing a proof is much harder than understanding it, understanding everything about it is quite hard as well.)
Also, I don't have a mathematical friend.
So I made a first order approximation. I selected a theorem that is non-trivial and from the twentieth century (well, maybe it is late nineteenth century, but at least it's fairly recent.) The Jordan curve theorem. And I'm working on that.
Now I'd like to explain what I don't want to do. Because the field of digital mathematics has lots of activity that's a bit besides the point as far as I'm concerned. It's worthwile research of course, but it isn't what I'm looking for. First: I don't want to do software correctness. I want to do mathematics. The big difference between the two things is that in software the theorems are horribly complicated (try imagining what the statement "TeX is correct" exactly means) but the proofs are, although lengthy, rather trivial. On the other hand in mathematics, the theorems are very simple and the proofs are where the real work is going on.
This is related to three sizes that can be associated with a digital piece of mathematics. The first size, A say (from "axioms"), is the size of the part of the development that's the "foundation" of the mathematical system that is being used: the axioms, so to speak. This "A" is a bit fuzzy, because axioms can be put in the "engine" doing the proof check, making A smaller. Still, I hope it will be clear what I mean: the part of the thing that if that is correct then you know that all the rest has some meaning, is free of contradictions, etc. The second size, call it D (from "definitions") is what's needed to state the proposition. It's the infrastructure needed to tell what the proof is about. Now because many definitions are theory laden, this D will generally be quite a bit larger than one expects, but still. Finally there's the third size, P (from "proof"). That's the size of the whole thing.
What I think is real mathematics, deep mathematics, interesting mathematics, and what I think should be put in the computer, is the mathematics where A is much smaller than D and where D is much smaller than P, so where:
A << D << P
(Maybe it's interesting to know that for the last theorem in Jutting's Grundlagen - part of "Satz 301" which says that each complex x is of the form z1+z1.i for exactly one pair of real z1 and z2 - the values are A = 89, D = 2999 and P = 3186. So much for D being much smaller than P :-), but then "Satz 301" is rather trivial, and I said I was looking for non trivial theorems. Incidentally, the whole Jutting development has size 11207.)
What else isn't what I'm looking for? Constructive mathematics. I'm definitely not interested in constructive mathematics. What I want to do is digitize the work of a "working mathematician" and the number of "working mathematicians" in this world that do their math constructively can be counted on the fingers of one hand (logicians do study constructive math, but even then they generally reason classically about their Kripke models, Heyting algebras and what have you.)
Another side to constructivism is that the mathematical objects that constructive mathematics talks about are, well, too "constructive." I think that the "real" mathematical world is Cantor's world of ZFC. Which means that I think one can't avoid basing one's systems on ZFC, like for example the Mizar system does. (Yes, I've been indoctrinated in my youth... :-) But most mathematicians believe that ZFC is the proper foundation for all of mathematics and I think that a consensus like that shouldn't be discarded too lightly.) My favorite mathematical objects are the free ultrafilters over the natural numbers: I can't very well imagine how to talk about those things without a set theoretical foundation. This means that for me just taking some logic with "inductively" defined objects built in (the common thing to do in type theoretical systems), that it won't do. It also means that I don't agree with De Bruijn that ZFC should be avoided because it misses the "intuition" behind the mathematics, because it adds an extra layer of coding that gets in the way. Of course it would be nice to be able to avoid that coding, but I just don't see how that is possible and still be sure of the meaning of one's mathematics. Yes, one can do everything axiomatically, but then how can one be sure that there are no mistakes in the axioms? That's the big advantage of building everything on top of ZFC: because in that case all that one writes down at least has some meaning (although it might not be the intended one.) In other words: if one first builds the ZFC world and then all the rest of mathematics inside that world, the persistence of the ZFC structure protects one against inconsistency. And that's a protection that one doesn't get if one introduces everything axiomatically.
Then there's theorem proving. Now this is quite interesting: I used to think I was against theorem proving. In the sense that I thought that theorem proving missed the point: it's not the computer that should invent the proof, but the user. I felt fairly strongly about this: the system I am envisioning is something like a word processor, and one doesn't want a word processor to write the text!
In the use of current proof checkers there often is the feeling of getting "lost": there are lots of things on the screen, the proof checker is in some complicated state, and one just doesn't understand it anymore. I think theorem proving tends to make this problem worse: when the computer produced something, the chance that the human doesn't understand it is bigger than when the human produced it. Also, making the computer too smart leads one to a fairly dangerous style of proving: one just "tries" to see whether the computer can do it on its own. That's not a way to get good proofs.
Still I've changed my mind about this: theorem proving is essential to proof checking. The reason for this is that when one is proving something many of the steps are below the treshold of consciousness. And exactly those steps, the steps one doesn't write down in a paper, that one even doesn't consciously register when doing the proof, those things should be produced by the computer and the human entering the proof shouldn't be bothered by them. I think that a formal proof is like an ice berg: most of it is below the surface of the ocean of consciousness. I think that the computer should do exactly such an amount of theorem proving that the "conscious" part of the proof comes from the user, and the "unconscious" part is produced by the computer, automatically.
(This means that I still think that a project in which a theorem prover "discovers interesting mathematics by itself" misses the point. It's nice demoware, it's impressive and it's not what I'm looking for.)
And finally there's computer algebra. Computer algebra is by far the most "mathematical" variant of computer mathematics, in the sense that real mathematicians are using it and that they're using it to produce real, new mathematics. It's used on the frontier of mathematics and not in the hinterland of toy problems. However computer algebra has one big problem: it doesn't mean anything.
It's hard to explain exactly what my problem with the semantics of computer algebra systems is, because the expressions of a computer algebra system do correspond very nicely to the kind of theorem one can find in a mathematical textbook (theorems that do have a very rigorous meaning.) For instance, suppose one asks a computer algebra system to calculate the integral of x^n and the system answers x^(n+1)/(n+1): what's wrong with that?
There are two problems, though. First, the logical component of computer algebra systems tends to be very weak: probably because the user is supposed to take care of that part of the game. So the fact that there is a "hidden condition" that n=-1 shouldn't hold in the example above is considered to be irrelevant (still, it does make it wrong.) Second, there's no clean foundation for the meaning of the "expressions" in a computer algebra system: they don't have a uniform meaning in some clean mathematical system. (The "integral" example shows this too: I really can't imagine a clean definition of an "indefinite integration" functional that "fixes" the constant of integration to zero... but that's what's happening in this example. Hence the "meaning" of the indefinite integration is not clean, semantically.)
Still, just like with theorem proving, computer algebra is essential for doing computer mathematics. There are two sides to the mathematical activity: calculation and proof. Those are really quite different activities, and both need to be present in a realistic system. (The difference is that with calculation one isn't so much interested in the truth, in the correctness of what one is doing, as well in the answer. Also, when calculating one often applies all kinds of algorithms without any understanding: it is important that the digital variant of mathematics is able to reflect that.)
What I would like to see is a system that's kept rather simple because it has "factored out" the theorem proving and computer algebra parts: moreover it would be nice to have this in such a modular way that there could be multiple variants of "plug-ins" (theorem proving plug-ins, computer algebra plug-ins) to choose from. The main problem with this vision is how the plug-ins should establish the "correctness" of what they do to the proof checker. (The approach in which the plug-ins are proven correct makes it rather hard to create a plug-in. But the approach where the plug-in is required to supply a proof object to the checker makes the "interface" between plug-in and proof checker less straightforward.)
So much for what I don't like to work on :-) And now: my current job.
Since a few months I'm working in a proof checking project: the project of Henk Barendregt. Henk has almost the same plan as I had (de Bruijn's original Automath plan really), only he selected a different theorem: the fundamental theorem of algebra ("every non-constant polynomial over the complex numbers has a zero.") It's wonderful to have found such a like-minded person and being able to work for him. But...
There are a few things that I have my doubts about in Henk's project:
- First and foremost: Henk (well, Herman really, because Herman is the one working on the proof, but it's Henk's vision) is working on a constructive version of the fundamental theorem. I think it's not practical. It will make the whole thing much more difficult (much more difficult) and also it will make it less recognizable to the average mathematician. It's a pity.
- They decided to use the Coq system. Since I met Christine Paulin and her Coq group, I rather like the Coq system. But still, for various reasons (primarily because of the rather low readability of the script files) I wouldn't have chosen it.
- Herman decided to go the "general" way and not just create a very specific development for this one theorem: for instance instead of just talking about the complex numbers, he gives a general framework for abstract algebra: groups, rings, fields, etc. This gives quite a bit of overhead which is besides the point as far as I'm concerned. And this overhead is felt too: he's still in the "Jutting" part of the development, talking about rather straightforward arithmetical identities. So there's not yet much visible in his Coq files of the fundamental theorem.
- The fundamental theorem of algebra isn't sexy enough for me: the (classical) proof is rather straightforward.
(Quickly then: when z->infinity then also |P(z)|->infinity, from which it follows that there is some z0 for which |P(z0)| is minimal. Suppose that that's unequal to zero, then because for some c and n we have that P(z0+epsilon) = P(z0) (1 + c.epsilon^n + O(epsilon^(n+1)) ), for a suitable epsilon we get |P(z0+epsilon)| less than |P(z0)| and a contradicition. Yes I'm cheating here :-)) So the fundamental theorem of algebra doesn't satisfy my "depth" requirement enough to suit my tastes.
Still, I very much like being in Henk's project. For one thing, proof checking is great fun to me, no matter what it's about. Fitting the pieces of a proof together in a computer is one of the best things I can think of. Even if it's a constructive proof coded in the Coq system.
And. Also. One other thing I really would like to do is to digitize Euclid I from Tarski's axioms. Doesn't that sound like a nice project? (I think the most difficult part will be to show that "if some circle C' has a point inside some other circle C, and if C' also has a point outside of C, then C' and C intersect." Currently I have no idea how to prove that from the E2'' axioms.)
So what exactly is this "Jordan curve theorem" that I'm talking about all the time? Here is one version:
Jordan Curve Theorem Light. One can't get from one side of a line to the other side without crossing the line.
Now this may sound like a silly theorem (and maybe it is) but the weird thing about it is that it is also a difficult theorem (and of course that's exactly what I'm looking for because I'm looking for a case study where D << P.) So it's hard to prove that you can't crawl through a line without hitting any of the points on it. But what exactly is it that one should prove? One will have to expain what a "line" is, what "crossing it" means, what "sides" are, etc. So here is a more explicit version:
Jordan Curve Theorem. A simple closed curve in the plane divides the plane in exactly two parts, one bounded (the inside) and one unbounded (the outside). Furthermore the curve is the complete frontier of both parts.
Still this is not very precise, because the concepts of "simple closed curve" and "divides" have not been explained. These can be avoided by using the more basic concepts of "homeomorphism" and "component":
Jordan Curve Theorem Pro (only the first part of the full theorem, but this time in a fairly precise version:) For each subset of the plane that is homeomorphic to the circle (which comes to the same thing as being the image of the circle under a continuous injection: a so-called "Jordan curve") it holds that the complement has exactly two components (and both of these components are arc-wise connected.)
This last version really appeals to me, mainly because of the references to "the" plane and "the" circle. Actually "the" plane is not a well-defined mathematical object at all, but instead has many variants: the plane as a vector space, the plane as a metric space, the plane as a topological space, etc. And when one looks closely, each of these variants has many representations too. So clearly "the plane" is a fairly "fuzzy" concept.
A similar thing holds for "the circle" (maybe even more, because there are three quite different, but each fairly "natural", ways of making a topological circle: on the one hand there's the set of points (x,y) in the plane satisfying x*x+y*y=1; on the other hand there's the set of real numbers modulo 1; and on the third hand one can create a "circle" from one or more "segments" by identification of their "end points".
I would prefer to be able to use the same "loose" style of talking about "the plane" and "the circle" in the digital version of the thing, but I don't know whether that is practical.
So how does one prove the Jordan curve theorem? Basically there seem to be two proofs: either one can study the homology of the complement of the Jordan curve, or one can prove the Jordan curve theorem for polygons first and then approximate the Jordan curve with polygons. Paradoxically, the first way, which appears to be more the indirect one, is the more efficient approach.
So what do I need to do to prove the Jordan curve theorem in the computer? I need to:
- select a proof of the Jordan curve theorem for digitizing
- write out the proof informally but in great detail
- select/create a system for proof development
- develop (in this system) the axioms I'll use as the foundation for my proof
- develop the "well known" math that I'll be needing
- develop the statement of the Jordan curve theorem
- develop the proof
Currently I'm somewhere between stages one and two. I have bought a small book about the topology of the plane that presents a fairly explicit homological proof. I guess I'll try to base my development on that proof. Whether it's a good idea to try to get the proof in a more abstract form than what is in the book is what I'm currently thinking about. Soon, soon I'll start working on stage two. And I fear I'll drown in stage five, or maybe already in stage three. Well, we'll see what happens when I get there :-)
Note added July 1999
I just found out that Andrzej Trybulec has had the same idea of making a project out of digitizing the Jordan curve theorem. In Mizar. Together with his Japanese friend Yatsuka Nakamura. And that they're almost finished. I have to think about this. I really have to think about this.
(last modification 1999-10-26)
|