Wednesday, December 7, 2016

Holy Grail

Today's colloquium was about using artificial intelligence to prove mathematical theorems. Automated Theorem Proving (ATP) has been one of those "Holy Grail" quests for the AI community since the dawn of the field. I don't use that term in the complimentary sense; more in the "fool's errand" connotation (though that's certainly too strong). It seems like the perfect combination of noble, difficult, and achievable (well, I don't know that looking for the Holy Grail in England qualified on that last one, but a legend as durable as Arthur's is allowed a little leeway). However, under closer inspection, it starts to look pointless, infeasible, and still really, really hard.

Pointless because theorem proving isn't really what drives mathematics. The applied folks (like me) really don't care that much. If something works empirically, we go with it. The theoretical folks do care but, as all of theoretical mathematics is predicated on which axioms you accept, what is and isn't provable rests more on your point of view than on any skill with logic.

Infeasible because mechanical rigor simply doesn't go as far as people assume in math (or anywhere else, for that matter). This comes as a surprise to many, but serious mathematicians are well aware that the emperor has no clothes. Whichever (sufficiently powerful) axioms you come up with, there will be unprovable assertions: statements that can be proven neither true nor false. Worse, the system can never prove its own consistency, so you can't rule out contradictions: statements that, while technically provable, are at direct odds with other statements that are provable.

Now, it's not like ATP has no value at all. The famous "four color" problem (you can color any map using only four colors so that no two adjacent regions share a color) was solved via computer, and no human-readable version of the proof has followed. OK, that's great. Still, Rand McNally wasn't exactly stressing over having to reset their plates. Nobody had come up with a counterexample in 6,000 years of mapped history. Chances were pretty good that their 4-color press was adequate.

And, it's this last bit that (almost) makes me want to take a swing at the problem. What if, for just a second, we relaxed the correctness criterion. That is, we didn't worry about whether the automatic proofs were correct, just whether they were "consistently correct". Maybe we even give a certainty rating. X=>Y. Y=>Z. Therefore, X=>Z. Yeah, you can take that one to the bank. No positive integer solutions to a^n + b^n = c^n where n > 2? Well, maybe we're not so sure about that, though it sure appears to be right.
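To make the certainty-rating idea concrete, here's a toy sketch (entirely my own illustration, not any real ATP system): each step in a chain of implications carries a confidence, and the conclusion inherits the product of them. Rock-solid syllogisms stay at 1.0; anything with a shaky link comes out appropriately discounted.

```python
def chain(steps):
    """Given [(claim, confidence), ...], return the combined confidence
    of the full chain, naively treating the steps as independent."""
    conf = 1.0
    for _claim, c in steps:
        conf *= c
    return conf

# X=>Y and Y=>Z are certain, so X=>Z comes out certain:
print(chain([("X=>Y", 1.0), ("Y=>Z", 1.0)]))   # 1.0

# A chain with doubtful middle steps inherits the doubt:
print(chain([("A=>B", 1.0), ("B=>C", 0.9), ("C=>D", 0.95)]))
```

Multiplying confidences is the crudest possible rule (it ignores dependence between steps), but it captures the spirit: the output is a graded "take it to the bank" score rather than a binary verdict.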

My point is that math isn't about logic. Sure, that's the tool you use, just as vibrations in the air are the tool a jazz musician uses. But it's not what makes the field great. What makes it great is the intuitive leap; the grab at something that doesn't really exist. No ATP algorithm will ever get that. To get that, you have to leave the world of algorithms and move to heuristics. Things that are probably true. Or even things that may well not be true but are worth a shot anyway. There's no reason a computer can't process things like this. But it means accepting that they might be wrong. The ATP True Believers don't want to accept that. They want the clarity of math in the 1700s (which was actually a big jumbled mess and had to be completely rewritten over the ensuing 200 years; people just hadn't realized the limits of logic yet).

Anyway, my BIG IDEA for all this (which I am NOT going to pursue, so anybody who wants to run with it is more than welcome to do so without crediting me in the least) is to try a 2-stage process. The first stage is the creator, an empirically-based logic engine that knows the general patterns of successful proof but is willing to hop across gaps in the logic if that's what it takes. The second is the peer reviewer, which does what all good peer reviewers do: pokes holes in the proof and sends it back for further analysis. The creator then tries to bridge these (presumably smaller) gaps in the logic. If it succeeds, at least by its loose standards, it re-submits to the peer reviewer. If it fails, it tries the whole thing over again. This continues until both the creator and peer reviewer agree. Sound familiar? It should. It's been working in real life for around 600 freakin' years! Seriously, why is nobody doing this?
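The loop above can be sketched in a few lines. Everything here is hypothetical scaffolding of my own invention: a "proof" is just a list of steps, a gap is an unjustified hop, and bridging a gap is simulated with a coin flip standing in for the creator's actual (hard!) work.

```python
import random

def creator(goal, known_gaps=()):
    """Produce a candidate proof, hopping over gaps when needed. When the
    reviewer has flagged gaps, try to bridge them (here: by coin flip)."""
    proof = [f"step toward {goal}"]
    for gap in known_gaps:
        if random.random() < 0.5:
            proof.append(f"bridge: {gap}")   # gap successfully closed
        else:
            proof.append(f"GAP: {gap}")      # still a hole in the logic
    if not known_gaps:
        proof.append("GAP: key lemma")       # the first draft always hops
    return proof

def reviewer(proof):
    """Poke holes: return every unjustified step for further analysis."""
    return [s[len("GAP: "):] for s in proof if s.startswith("GAP: ")]

def prove(goal, max_rounds=20):
    """Iterate creator and reviewer until they agree, or give up."""
    gaps = ()
    for _ in range(max_rounds):
        proof = creator(goal, gaps)
        gaps = tuple(reviewer(proof))
        if not gaps:                         # creator and reviewer agree
            return proof
    return None                              # gave up; maybe it's false

random.seed(0)
print(prove("X => Z"))
```

The interesting engineering would all be inside `creator` (learning the patterns of successful proof) and `reviewer` (locating the weakest inference), but the control flow really is just the centuries-old submit, referee, revise cycle.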
