Tuesday, February 13, 2007

Understanding Simplicity

"Understanding" is a word that is often used but hard to pin down in a technical sense. Dictionary.com offers no less than 13 definitions, the first of which reads "to perceive the meaning of; grasp the idea of; comprehend." Hmm. Interesting. Let's go to "comprehend" and see what comes up: "to understand the nature or meaning of; grasp with the mind; perceive." I see, so to understand is to comprehend, and to comprehend is to understand the nature or meaning of. You see the problem. If we instead choose to follow up on "perceive," it leads us down some more paths, none of which give us anything approaching a technical definition of understanding, one that would let us look at something that doesn't already understand what it means to understand and decide that it understands something else (if you'll forgive the bloated nature of that sentence).

Understanding is something we just do, something we know when we experience it. But this isn't English class, where everything is right, nor is it philosophy class, where everything is up for debate! There is an essence to the colloquial use of "understand" that should be possible to capture in a somewhat rigorous way - by which I merely mean defining it in terms of words of less complexity than itself.

Which brings me to a big point. We tend to think of an explanation of some fact or observation as a string of words that somehow reduces that fact or observation to a (perhaps longer) string of words that is in some way simpler, or more primitive, where each piece takes less explanation than the original fact. If I try to explain to you what a real number is and I say that it's any number on a continuous line, including numbers that can't be written as fractions, that makes sense. But if I look at you like an asshole for asking and tell you that it's a Dedekind cut on the rational numbers, things just get more confusing for you. Human comprehension almost always progresses towards the atomic nuggets of understanding that we all seem to magically possess and feel comfortable manipulating, rather than towards bigger ideas that are used to explain the smaller ones.

This is a point not taken lightly in science. When two different theories make the same predictions about the results of an experiment, whichever is deemed simpler is preferred. Now, deciding what "simple" means in the context of a complicated scientific theory is not always an easy chore; in fact, it's probably no easier than defining "understand." But in both cases, we know it when we see it, and if it looks simple, it's easier to understand. So the two are intimately connected. Many theories have in fact been overthrown on the grounds of relative simplicity, perhaps the most public case being the epicycle theories of pre-Newtonian astronomy, which for all intents and purposes amounted to Fourier series approximations of true elliptical orbits, but with a whole lot more mathematical baggage.
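To make the epicycle-as-Fourier-series point concrete, here's a minimal sketch (my own illustration, not anything from the historical models): trace an orbit as a path in the complex plane and rebuild it from a handful of uniformly rotating circles, which is exactly a truncated Fourier series.

```python
import numpy as np

# Sketch: an orbit traced in the complex plane, rebuilt from a handful of
# uniformly rotating circles ("epicycles"), i.e. a truncated Fourier series.
# For simplicity the ellipse is centered at the origin and traced at uniform
# speed; a real Kepler orbit is swept non-uniformly about a focus, which,
# roughly speaking, is where the extra epicycles (the "baggage") pile up.

N = 1024
t = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
a, b = 1.0, 0.6                                  # semi-major / semi-minor axes
orbit = a * np.cos(t) + 1j * b * np.sin(t)       # ellipse as complex positions

coeffs = np.fft.fft(orbit) / N                   # one coefficient per epicycle

def rebuild(num_epicycles):
    """Keep only the largest circles and re-trace the orbit from them."""
    kept = np.zeros_like(coeffs)
    biggest = np.argsort(np.abs(coeffs))[-num_epicycles:]
    kept[biggest] = coeffs[biggest]
    return np.fft.ifft(kept) * N

for k in (1, 2, 3):
    err = np.max(np.abs(orbit - rebuild(k)))
    print(f"{k} epicycle(s): worst position error = {err:.6f}")
# One circle gives a circle; two already reproduce this ellipse essentially exactly.
```

Two circles suffice for the uniformly traced ellipse itself; it's matching the actual non-uniform motion of the planets that historically demanded circle after circle.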

There are quite valid mathematical reasons to prefer simple models to complex ones. For example, suppose we have 20 points on a graph that lie more or less in a straight line (but with a bit of noise). A 19th degree polynomial, with one coefficient per point, can fit these points perfectly. But it will rocket off to who knows where in between them. On the other hand, a first degree polynomial, which is much simpler by any definition, can "mostly" fit the data, and appears to capture the true nature of the distribution even if it does not give an exact fit. To extrapolate this observation to all of science is a leap of faith; however, it is a leap of faith that has proven to be quite useful over the centuries.
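Here's a quick numerical sketch of that example (the specific numbers and the use of numpy are my own choices, purely for illustration): fit the same 20 noisy points with a degree-19 polynomial and with a straight line, then measure how far each fit wanders from the true line in between the samples.

```python
import numpy as np

# Sketch of the example above: 20 noisy points along the line y = 2x + 1,
# fit once with a degree-19 polynomial (which threads through every point,
# noise and all) and once with a straight line, then check how far each fit
# strays from the true line at points between the samples.

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)   # noisy observations

wiggly = np.polynomial.Polynomial.fit(x, y, deg=19)   # interpolates the noise
line = np.polynomial.Polynomial.fit(x, y, deg=1)      # "mostly" fits the data

x_fine = np.linspace(0.0, 10.0, 2001)                 # between-sample points too
truth = 2.0 * x_fine + 1.0
print("worst error, straight line:", np.max(np.abs(line(x_fine) - truth)))
print("worst error, degree 19 fit:", np.max(np.abs(wiggly(x_fine) - truth)))
# The straight line stays around the noise level; the degree-19 fit typically
# strays far further somewhere between the samples, especially near the ends.
```

Which fit you trust is exactly the simplicity judgment above: the line says less about the data, and that is precisely why it generalizes better.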

So where am I going with this? Simple: we need to be careful about defining the first sub-problems of strong AI. I think our failures run much deeper than just technical difficulties; we're literally studying the wrong problems at the moment, overfitting our methods to our goals. We're not going to achieve a Turing test machine instantly, as such a thing will necessarily be a conglomeration of many different breakthroughs in many different areas. Instead of focusing on pattern classifiers and stuff like that, we really need to figure out what the tough core of this problem is made up of, and state it as simply as we possibly can. If we can define the problem in a clear enough manner, we may find that it's not that tough after all. After that, we can worry about preprocessing, which is what most current "AI" research really amounts to (the fact that something takes place in the brain does not make that something a vital piece of intelligence! As fun and important to us as pattern recognition is, it's not even close to the essence of what qualifies us as sentient).

I know that the fundamental problem has something to do with understanding the process of understanding - what is going on, information-wise, when I "grok" a topic? But I have no idea how to get at that process, as it is not available to introspection. All I know is that when something is understood, it becomes simple, whether that's a function of its storage method, its contents, or its interpretation. So in my opinion, we need to take a real hard (and perhaps not so simple!) look at simplicity, and at what it means to reduce a set of observations to maximum reasonable simplicity, to see where to go next. That may be a simpler goal to start with.