Artificial Intelligence is not a science;
Artificial Intelligence is a rigorous part of an inter-theoretical approach to every scientific theory, what I call "the philosophy of theories."
By contrast, defining this subject matter is the very task I am undertaking here. It is not unusual for a scientist to postpone the definition of his/her scientific area. This attitude is frequently assumed by chemists, biologists, and physicists. But the main reason, in their case, belongs to the didactic order: one cannot explain one's work and one's interests to a beginner without first having introduced a sound and robust framework of basic concepts. In the case of AI, in my opinion, an additional structural reason makes a definition of the area as a science impossible.
As is well known, AI lies at the intersection of several fields, such as philosophy, psychology, linguistics, and computer science; I will argue that this is unavoidable.
Roughly speaking, my idea can be summarized as follows:
a) The basic molecular constituent of scientific knowledge is a mathematical structure of a universe, a language and a logic, which I call a "theory" in a technical sense;
b) A science may thus be defined as a series of "homogeneous" theories, where "homogeneous" is again a technical term.
Thus the single molecular constituent ensures the stability of science, whereas the series of constituents provides the internal dynamics, following a well-known philosophical paradigm. Think, for instance, of the "pluralists" who try to avoid the opposition between Heraclitus and Parmenides. Take the atomists: the atoms stay unchanged, to explain stability, while their permutations are always in progress, to account for change. Our understanding needs both to fix time and to let it run. Such an approach will be used to test whether or not AI can be thought of as a science.
Thus, I will now focus on a new definition of "theory" sound enough to bear the notion of science.
The notion of "theory." The term "theory" comes from the Greek word θεωρία (theoria), and is present with unimportant variations in the main western languages. Thus we have "théorie" (French), "Theorie" (German), "teoría" (Spanish), "teoria" (Italian). The Greek word comes from the verb θεωρεῖν (theorein), whose meaning is "to look all around," in the sense of looking from the vantage of a "higher point of view." The term "theoria" in the sense of a whole, comprehensive doctrine is regularly used by Plato and Aristotle. As I will show, there are two interesting features in the etymological explanation that will be useful for our goals. On the one hand, a theory is involved with a hypothetical form of reasoning; on the other, it offers a comprehensive solution to a given problem.
In order to avoid misunderstandings, I will list some other meanings of "theory" that are in some sense different from mine, and with which I cannot, therefore, completely agree. In other words, I shall begin by saying what a theory is not.
In common usage, the hypothetical feature of the term is prevalent: when you say that someone has a theory about something, you mean that she has a general, reasonable explanation--that is, a hypothesis. What I have in mind here, on the contrary, is the scientific use of the word as exemplified by "set theory" or "game theory." Every theory has, in a sense, a series of basic presuppositions, and so we can say that it has a hypothetical nature; but the character of hypothesis is external, not internal to the theory itself. It regards the theory as a whole. Inside the theory, its theorems are certain; outside, they have the same uncertainty as the theory itself.
Another use which differs from the one I propose here is that of mathematicians. I want to refer, here, not to what kind of theory a mathematical theory is itself, but to a misleading common use of the word "theory" by mathematicians. In this use, a theory is simply a set of propositions together with the set of their logical consequences. Against this approach I put forward the following two points: i) a theory must be a systematic and organic explanation, so the propositions involved in a theory must be in a sense "homogeneous"; one could hardly recognize as well-founded a theory of propositions beginning with a determinate letter, such as "b" for instance; and ii) such an explanation must be concerned with a specific domain. What I mean is that some ontic or at least ontological commitment is needed in order to have a theory. Both these problems are solved by introducing a third element besides logic and language, that is, something like the "universe of discourse"; I will call it the "universe" of the theory. The mathematician, in his use, takes care of the linguistic component of a theory, as he speaks of "propositions," and of the logical one, as he speaks of the set of "consequences"; but the ontic component, the universe of discourse, is missing.
I assume here, as a theory, a mathematical structure composed of a Universe, a Language, and a Logic.
More formally, a theory T is an ordered triple <U, L, l>, where
U is a (possibly structured) set called "universe of the theory T";
L is a (possibly formalized) language, whose semantics is in U. L is called "language of the theory T";
l is at least (in the simplest case) a first-order predicate calculus whose semantics is in L; l is called "logic of the theory T."
The above definition takes into account the simplest case. Of course the Universe, U, can be in turn something more structured than a simple set; for instance, a mathematical structure, that is to say a set together with operations defined on it; or, in a more sophisticated case, U can be in turn a full theory. In the same way, the language L and the logic l can be more complex. For instance, l can be a pair consisting of a full predicate calculus together with a set of meaning postulates, or postulates already linked to a specific universe, e.g. Peano's postulates for arithmetic or Euclid's postulates for geometry.
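The triple <U, L, l> can be pictured, very roughly, as a data structure. The following is only a minimal sketch under strong simplifying assumptions: the universe is a small finite set of integers, the language is a handful of one-place predicates interpreted in that set, and the logic is reduced to the two quantifiers. The names Theory, evaluate, and toy are hypothetical and serve illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Theory:
    """Toy rendering of a theory T = <U, L, l>; all field names are hypothetical."""
    universe: FrozenSet[int]                             # U: the objects of the theory
    language: Dict[str, Callable[[int], bool]]           # L: predicates interpreted in U
    logic: Dict[str, Callable[[Iterable[bool]], bool]]   # l: quantifiers applied to L


def evaluate(theory: Theory, quantifier: str, predicate: str) -> bool:
    """Evaluate a sentence of the form '<quantifier> x. <predicate>(x)' over U."""
    holds = theory.language[predicate]
    return theory.logic[quantifier](holds(x) for x in theory.universe)


# A deliberately tiny example: U = {1, ..., 5}, two predicates, two quantifiers.
toy = Theory(
    universe=frozenset(range(1, 6)),
    language={"positive": lambda x: x > 0, "even": lambda x: x % 2 == 0},
    logic={"forall": all, "exists": any},
)

print(evaluate(toy, "forall", "positive"))
print(evaluate(toy, "forall", "even"))
print(evaluate(toy, "exists", "even"))
```

In this toy case the three calls print True, False, and True. The point of the sketch is only that a sentence acquires a truth value when all three components are present: without the universe, the predicates have nothing to be interpreted in.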
Kinds of theories. Following Carnap's suggestion, we can distinguish between formal and real or empirical theories.
Both formal and empirical theories assume a Logic, in the sense defined above. But in the former case, the objects of the Universe are fully determined by it, while in the latter this is not so. In the case of an empirical theory, an object of the Universe is under-determined by the assumptions. Thus, experimental tools must be devised in order to discover some attributes of the theoretical objects. Note that this does not mean that the theory comes back from the theoretical Universe to the real world. In a formal theory, means to find attributes of the objects other than by axiomatic formal deduction do not exist. So, if an object is under-determined by the axioms with regard to some attributes, the theory itself becomes undecidable. This difference enables us to understand the different level of ontic assumptions of the two kinds of theories: in the case of a formal theory, the goal is to pre-determine experience--not to learn from it.
A very interesting problem arises as a direct consequence of this distinction. In fact, two opposite theses can be imagined:
a) There are sciences which are per se formal and others which are per se empirical;
b) All the sciences begin as empirical, and then tend to become formal over time.
The first thesis is very clear. From this point of view mathematical sciences, such as geometry or topology, are formal, while physics, chemistry, biology are empirical; it was so from the very beginning, and will always be so. On the contrary, the second point of view claims that even geometry originated as an empirical science and that only at a further stage was it axiomatized. In other words, only when a significant corpus of knowledge has been acquired does it become possible to state a set of principles on the basis of which experience becomes predictable. The most important (and first) example was Euclid's systematization of geometry, but no doubt the same will happen for all the sciences. This implies that a science can be seen as a series that has an initial segment of empirical theories, and starting from the point of "Euclidization," a segment of formal ones.
Mixed theories. I will now introduce one of the most interesting notions of this approach--that of mixed theories.
Knowledge progresses by building up theories. The first step passes from empirical to theoretical objects, which entails abstraction. Knowledge effects its strong need for internal soundness through abstraction, that is, by breaking down its objects into their attributes and taking into consideration only a few of them at a time. Platonism consists of claiming for such theoretical objects an existence ab aeterno. On the other hand, theoretical objects that try to approach real objects again become mixed objects, or objects of the Universe of a mixed theory.
A mixed object, which is thus not produced by simple abstraction, can be constructed in two different ways: on the one hand, it can be the outcome of a process of knowledge; for example, when the features that abstract theories have already certified through abstraction are applied to a real object. On the other hand, an object can be "mixed" as the output of a special kind of abstraction, which rearranges, in a purely theoretical way, objects belonging to different Universes and then puts them under a one-to-one mapping until they can be reduced to one unit. In the first case, we are dealing with a real theory, which takes advantage of formal theories that, from a logical point of view, precede it; in the second, we have a new theory, which in a sense stays at a level of abstraction that is even higher than the theories which have been mixed together. It is important to notice that the first case is not simply the "application" of some theories to concrete data. Finding some chemical as well as some geometric attributes in an empirical object is not enough to build a mixed theory based on chemistry and geometry. Qualities that pure theories have postulated for the objects of their Universes must not simply be all present together in a certain concrete object: they must be linked. A certain social theory and a certain economic theory can together build an economic policy if and only if social values are not independent of economic values and vice versa. The human "fact" under consideration is then no longer an empirical fact for which we consider attributes coming from pure theories, but rather it becomes an object that belongs to the Universe of a mixed theory.
Mixed theories can also be built by following the opposite process: rather than going back to the initial data, one performs a further step of abstraction. However, and paradoxically enough, a higher level of concreteness may be reached by following this path. Consider, for example, the theory of real numbers and Euclidean geometry. A Universe of mixed objects can be built by establishing a one-to-one mapping between geometric points and pairs of real numbers, between straight lines and first-degree equations, etc. The straight line r, determined by the two points P(1,4) and Q(2,7), and the equation y = 3x + 1, then become two different names for the same object. Under these conditions, expressions such as "the straight line y = 3x + 1" become meaningful. In this case, the objects of our Universe are more concrete: without losing any previous attribute, they now have new attributes, which come from the other theory. As we go on adding attributes, we are passing from a more abstract to a more concrete situation.
To complete this brief survey, I will mention in passing another important distinction between theories, though it will not be directly used in what follows. A theory can assume as an object of its Universe another theory. Thus the distinction between object-theory and meta-theory arises. It is very interesting, in my opinion, to study the possible relations between object-universe and meta-universe, object-language and meta-language, and object-logic and meta-logic inside this new conceptual frame, that is, as a particular instance of the comparison between a theory and its meta-theory. But all this lies outside our present topic.
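The identification of the line through P(1,4) and Q(2,7) with the equation y = 3x + 1 can be checked mechanically. The sketch below is only an illustration of the mapping between the two Universes; the function names line_through and on_line are hypothetical, and vertical lines are excluded because they have no equation of the form y = mx + c.

```python
from fractions import Fraction


def line_through(p, q):
    """Return (slope, intercept) of the non-vertical line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("vertical line: not expressible as y = m*x + c")
    slope = Fraction(y2 - y1, x2 - x1)   # exact arithmetic, no rounding
    intercept = y1 - slope * x1
    return slope, intercept


def on_line(point, slope, intercept):
    """Geometric incidence restated as an arithmetic identity."""
    x, y = point
    return y == slope * x + intercept


# The geometric object (two points) and the algebraic one (two coefficients).
m, c = line_through((1, 4), (2, 7))
print(m, c)
print(on_line((1, 4), m, c), on_line((2, 7), m, c))
```

The computed slope and intercept are 3 and 1, so the geometric description and the algebraic one pick out the same object, and both given points satisfy the equation.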
Through a growth process, a theory can (but not necessarily must) become complete or, as I prefer, "saturated." A theory is saturated when its logical apparatus does not introduce any further noun or object. In other words, we could say that its consequences can be fully predicted or determined, even if they are a denumerably infinite set. For instance, in the propositional calculus it is possible to obtain an infinite number of tautologies, i.e., of logically true sentences. However, it makes no sense for a scientist to work at adding a new theorem to the list, because for every given formula it is possible to determine, in a finite number of steps, whether the formula is a theorem or not. Thus, we can say that propositional calculus is a saturated theory. In a sense, we can say that a saturated theory loops inside itself. Let Cn(T) be the set of consequences of the theory T; then we can say that, if T is saturated, Cn(Cn(T)) is equal to Cn(T).
A theory can take the place of another theory. Faced with a fact we are not able to explain in a theory, we must modify our fundamental assumptions. We can perform this task in two ways: first, by enlarging the initial assumptions and putting them into a more general system; second, by arguing that they are wrong, and assuming quite new starting points. The relationship between the theory of Einstein and the physics of Newton can be considered an instance of the former case; as an instance of the latter, think of the relationship between the theory of Copernicus and the geocentric system of Ptolemy. In the first case, the scope of the original theory (Newtonian physics, in the preceding example) is dramatically enlarged, even if its axioms continue to be true. In the second case, on the contrary, the starting theory is invalidated as it is replaced. Let us note that also in the first case we do not have simply an enlargement in the sense that all the sentences of the preceding theory go on being true: enlarging the general assumptions introduces many new theorems, while some other statements stop being theorems or even being true in the theory.
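The finite decision procedure for the propositional calculus mentioned above is the familiar truth-table method. A minimal sketch follows, assuming formulas are represented as Python functions of Boolean arguments; the helper names implies and is_tautology, and the sample formulas, are hypothetical choices made for illustration.

```python
from itertools import product


def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b


def is_tautology(formula, variables):
    """Check the formula under every truth assignment (2**n rows for n variables)."""
    return all(
        formula(**dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )


# Three sample formulas, written as functions of their propositional variables.
excluded_middle = lambda p: p or not p
peirce = lambda p, q: implies(implies(implies(p, q), p), p)   # ((p -> q) -> p) -> p
plain_implication = lambda p, q: implies(p, q)                # fails for p true, q false

print(is_tautology(excluded_middle, ["p"]))
print(is_tautology(peirce, ["p", "q"]))
print(is_tautology(plain_implication, ["p", "q"]))
```

The first two formulas are reported as tautologies and the third is not. The table always has finitely many rows, which is what makes the question "is this formula a theorem?" answerable in a finite number of steps, although the number of rows doubles with each added variable.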
What kind of objects would make up the Universe of AI? Lisp and Prolog computer programs, or the metaphor computer/brain, or the way humans think? All these are involved, but any definition based on such concepts would be reductive. The main problem is that we do not have a well-defined Universe for the following two reasons.
A) Intelligence can deal with whatever process, with whatever "program" for whatever kind of human or non-human processor. I claim that we can find intelligence everywhere. As a consequence, making abstraction from the concrete processes yields a unique output.
B) AI is concerned not only with studying such processes, but also with realizing them.
A possible objection to A could be that the situation does not seem to be so different in the case of other sciences. Consider the case of geometry, for example. Even if any existing thing has a geometric shape, this does not preclude the Universe of geometric entities from being a whole, separate and self-consistent corpus. The problem is that the same argument does not hold for intelligence because we consider it a property, not an individual. To put it differently, intelligence does not tolerate reification because reification, starting from whatever thing, gives a unique output. If you are studying a triangular garden, you can have, as an abstraction, a triangle (leaving aside the question of whether the triangle is generated by us or is Platonically already existent). If your garden has a different shape, you can deal with another figure, such as a square, by abstraction. Starting with different intelligent processes, you reach the same abstraction, "intelligence," which, like Schopenhauer's Wille, makes sense only as a whole.
The second point, B, is of course connected. Intelligence, in ontological terms, even though not in grammatical terms, is a property and not a thing. Having intelligence is only a stylistic variation of being intelligent. In my opinion, only living beings and processes can be intelligent, not things. Artificial Intelligence, to be "artificial," must discard living beings. We remain with nothing else but actions and processes. We have lost things. We remain without a Universe in the sense explained above. It is for this ontic faint that, as students of AI, we are committed not only to studying, but to realizing and duplicating processes.
The ordered series of physical theories constitutes the science of physics. That of chemical theories constitutes chemical science. And so on. A chemical theory has its own objects, which are the chemical ones. Can we say that AI is a theory, in the sense used here, or an ordered series of theories? Clearly, this is not the case. The objects with which AI deals are either i) objects of the real world, studied from a particular point of view; or ii) artificial objects built by AI itself. Like philosophy, AI is concerned with all the instances of knowledge and of "intelligence": chemical objects, the behavior of a man playing chess, computer algorithms, the brain structure, etc. As a general methodology, AI approaches every theory horizontally and as a consequence every science, assuming the objects with their attributes, that is, with the attributes stated by their own universes. AI has no ontic commitment, and, as a consequence, has no strong descriptive power in a specific universe. Furthermore, it cannot be thought of as a mixed theory; for it would be the mixed theory of everything. Even though it looks at different Universes, AI does not build any ontological link between them. In order to have a mixed theory, two objects already defined by their theories must be recognized as being the same thing; in Cartesian geometry, for instance, a point on a plane and a pair of real numbers are fully determined by their theories (Euclidean geometry and the theory of real numbers, respectively) before being considered as different names for the same thing. However, Artificial Intelligence makes no claim that two "intelligent" processes, using the same strategy, are different names for the same object.
Suppose an AI student realizes that a man playing chess, and a program concerning the classic problem of job scheduling with penalties, use the same strategy of alternating depth-first and breadth-first search in order to avoid the exhaustive combinatorial search on a tree. She would probably not see any unique new object, unless she is so deeply Platonist as to think of the common search strategy as an eternal, ideal, robust object. It is in this sense that I argue that AI functions as a general methodology rather than a science.
There is another reason why AI cannot be reduced to the previously explained frame of "theory" and consequently of "science." The essential feature of a theory is its descriptive power. Its final output is a (at least denumerable) coherent set of (well formed, that is belonging to a specific language) sentences interpreted in (meaningful with respect to) a determinate Universe. Some practical acts can depend on a theory, but they are external to it. Even the most applied science (theory) has this characteristic: it sees its own task as that of describing, but not performing, some real processes. For AI this is not the case. AI is both theoretical and practical, in the sense that its main goal is not only describing, but really duplicating intelligence (which otherwise would not be "artificial"). As a consequence, AI has inside itself both theoretical and practical goals, and in such a situation a positive loop arises: some theoretical hypotheses can generate (and be tested by) applications, which in turn can suggest new hypotheses and explanations.
Not being a psychologist, I am not so concerned with defending the brain/computer analogy as being something stronger than a simple metaphor. But, as a philosopher, I hold the following points of view:
I) I have no problem in agreeing with the three points mentioned above by the critics of AI, that is to say:
I.a) common sense is, or can be under certain conditions, a feature of intelligence;
I.b) not every process can be reduced to a set of rules;
I.c) two processes, A and B, can have the same inputs and the same outputs, being nevertheless quite different from each other.
II) All these points are a direct consequence of, or are strictly connected with, what I have already explained above, that AI is not a science, as it lacks a well-defined Universe.
II.a) Polling the Universes of different theories, but remaining outside them, AI needs, at every contact point, a background, that is, the context of the other theories. A science can avoid common sense as it assumes a partial, theoretical Universe, built on non-existent objects. Biology deals with the abstract, non-existent, universal "dog," and not with my dog. So many features of my dog are lost; but biology has no need of common sense.
II.b) My agreement with thesis I.b, that not every process can be reduced to a set of rules, needs to be specified. I accept the sentence in a gnoseological sense, not in an ontological one. What I mean is that we are not able, now, on the basis of our present knowledge, to explain some processes by rules; but I do not believe that there is a process intrinsically without rules. This would be a miracle, and, in any case, not a "process." Given more space, we would perhaps need some terminological clarification on this point. However, the central argument can be saved in an independent way. For instance, a free (without rules) process cannot be "intelligent." So such a situation simply states that AI does not claim to explain everything in the real world. I myself could list many non-intelligent actions, processes, people...
Some authors insist on the important role played by meaning when human beings apply rules and underline, rightly, that meaning has no role in the case of a computer executing rules. It seems to me that these authors forget a very interesting type of thought and intelligence. With regard to this, Leibniz speaks of "symbolic or blind thought," as when we say that a "chiliagon" is a polygon of a thousand angles. Can we represent to ourselves a "chiliagon" in the same sense in which we can think of a hexagon? It is important to underline that the impossibility of such a representation does not indicate a defective feature of our intelligence. Rather, it constitutes one of the most powerful means our reason has for dealing with the complexity of the world. We are able to put meaning into brackets, and to use a symbolic apparatus such as an "algorithm," or a machine for thinking. If a mathematician, performing his calculus, were forced to give a meaning to every step of his work, he could not reach the most trivial goals.
Suppose you have 1,785 sheep, and you want to assign them to 15 shepherds. You will probably do something like the following:
    1785 : 15 = 119
     28
     135
       0
Of course, 1785 refers to the sheep, 15 refers to the shepherds, and 119 refers, again, to sheep. But what is the meaning of 28 and 135?
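The figures 28 and 135 are the partial dividends that appear when the division is carried out digit by digit. A minimal sketch of the schoolbook procedure makes this visible; the function name long_division is hypothetical, and the procedure assumes a non-negative dividend and a positive divisor.

```python
def long_division(dividend, divisor):
    """Digit-by-digit long division.

    Returns the quotient, the final remainder, and the list of partial
    dividends formed at each step (the numbers written under the dividend).
    """
    quotient_digits = []
    partials = []
    remainder = 0
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)      # bring down the next digit
        partials.append(remainder)
        quotient_digits.append(remainder // divisor)  # always a single digit
        remainder %= divisor
    quotient = int("".join(map(str, quotient_digits)))
    return quotient, remainder, partials


quotient, remainder, partials = long_division(1785, 15)
print(quotient, remainder, partials)
```

For 1785 and 15 the partial dividends are 1, 17, 28, and 135, the quotient is 119, and the remainder is 0. The numbers 28 and 135 are produced purely by manipulating digits; nothing in the procedure requires them to count sheep or shepherds.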
To understand this better, think of the opposite situation, where you have no algorithm. Take for instance the Roman notation for numbers, in use before the Indo-Arabic reform of numeric symbolism. You have MDCCLXXXV sheep, and you must assign them to XV shepherds. Try to perform the needed calculation. Go ahead and write down the steps.... As you can see, in some cases the solution process is ruled by signs, not by meanings! Searle may be right about the fact that the processes performed by humans are governed by meaning, that is, are semantically well founded, and that on the contrary the ones performed by computers are only based on signs, without a real meaning; but this is not always an advantage.
II.c) Two processes can obviously lead to the same result, though be very different from each other. They can be different at every step. Leaving out Searle's example on the style of parking,1 think of the following instance. You have to put a lot of books on a bookcase. The two rules "take the shortest and put it on the left" and "take the tallest and put it on the right" generate two processes which are at every stage quite different but have the same input and the same final result. This simply proves that a real brain and a computer could follow different processes and give the same output; as a consequence, realizing a process on a computer is not conclusive proof that we have the right and unique explanation of a certain mental process (note, by the way, that this is not proof of the contrary).
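The bookcase example can be made concrete. In the sketch below, which assumes books are represented simply by their heights and that all heights are distinct, one procedure fills the shelf from the left with the shortest remaining book and the other fills it from the right with the tallest remaining book. The function names and the sample heights are hypothetical.

```python
def shortest_to_left(books):
    """Repeatedly take the shortest remaining book; fill the shelf from the left."""
    remaining, shelf, states = list(books), [], []
    while remaining:
        book = min(remaining)
        remaining.remove(book)
        shelf.append(book)           # next free slot, counting from the left
        states.append(list(shelf))   # snapshot of the shelf after this step
    return shelf, states


def tallest_to_right(books):
    """Repeatedly take the tallest remaining book; fill the shelf from the right."""
    remaining, shelf, states = list(books), [], []
    while remaining:
        book = max(remaining)
        remaining.remove(book)
        shelf.insert(0, book)        # next free slot, counting from the right
        states.append(list(shelf))
    return shelf, states


heights = [30, 18, 25, 21]
final_a, states_a = shortest_to_left(heights)
final_b, states_b = tallest_to_right(heights)
print(final_a == final_b)
print(states_a)
print(states_b)
```

Both procedures end with the shelf ordered by increasing height, yet every intermediate snapshot differs between them. Observing only the input and the final shelf would not reveal which of the two rules had been followed.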
But, as AI does not have a specific Universe of its own, this fact is not a criticism of AI. I am interested in a good Prolog algorithm, even though no evidence exists that this is exactly the same way the human brain performs the same task. AI is not a theory of the brain from a functional point of view, and this could eventually be a problem for neurophysiologists or neurologists. But I suppose it leaves AI students completely indifferent.
Let us summarize the point, which is probably the crucial one of the present paper. A real application of an "artificial intelligence" needs references to the real world, or a "background," or a "common sense." But this simply depends on AI's lack of abstract objects of its own, or on its weak ontological assumptions, AI being a general methodology rather than a science. As a methodology, it can only apply its conceptions to universes already defined by theories, rather than build on a universe of its own.
The upper limit of Computer Science is Church's thesis--only recursive functions are computable. This thesis defines the Universe under consideration, which is that of all algorithms. AI's challenge is to go beyond Church's thesis. A problem can be interesting for AI, and then belong to its field of studies, because:
I) it is a non-computable problem, or
II) it is a problem which is not practically computable, or
III) it is not formalized enough to be tackled by an algorithm.
The difference between I and II is, in short, that for I we refer to non-recursive functions; for II we refer to functions that are recursive, but can be computed only by algorithms that have a complexity higher than polynomial, that is to say exponential; consequently, after a small number of steps, they run out of space and time (in a sense which can be made precise in mathematical-physical terms).
There is also the fact that AI builds computer algorithms. Is this not a contradiction? The paradox depends on a change of perspective. AI obviously cannot battle against Church's thesis, which is a mathematical truth. Consequently, it cannot solve anything unsolvable. But it can realize through a program the same strategies and the same behavior used by an intelligent being who is dealing with such a problem. So, strictly speaking, what AI obtains as a solution is not, in a mathematical sense, really an algorithm that solves the case: rather, it is a good strategy that could fail.
As Gestalt psychologists know well, a change of perspective can lead us to a quite different interpretation. As in the case of the famous picture--where an old and a young woman can alternately be seen--it is possible to recognize an algorithm or a non-deterministic procedure in an AI program, depending on the initial point of view. From the perspective of assembly code, we are dealing with an algorithm; from the point of view of the goals of the program, and its behavior regarding them (that is to say, at a higher level), we are dealing with an "intelligent," non-deterministic, non-algorithmic strategy. The situation is similar in the case of human freedom. If the attention is focused on the physiological aspects of a choice, it is impossible to see anything but deterministic processes (the analogue of assembly code); however, a free will is conceivable if considered from the point of view of the abstract reasons governing the choice.
From a robust, concrete, materialistic point of view, you could conclude that AI, and perhaps human freedom itself, are psychotropic drugs. Maybe. But is this enough to renounce our intellectual games?
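The contrast between an algorithm that solves a problem and "a good strategy that could fail" can be illustrated on a small case. The sketch below uses the subset-sum problem as a stand-in, which is an assumption of this illustration and not an example from the text: the problem is computable, so it belongs to case II (impractical exhaustive search) rather than case I. The function names are hypothetical.

```python
from itertools import combinations


def exhaustive_subset_sum(items, target):
    """Try every subset (2**len(items) of them); always correct, exponentially slow."""
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            if sum(subset) == target:
                return list(subset)
    return None


def greedy_subset_sum(items, target):
    """Heuristic: scan items from largest to smallest, keeping each one that fits.

    Runs in polynomial time, but may miss a solution that exists.
    """
    chosen, total = [], 0
    for item in sorted(items, reverse=True):
        if total + item <= target:
            chosen.append(item)
            total += item
    return chosen if total == target else None


items = [6, 5, 4]
print(greedy_subset_sum(items, 10), exhaustive_subset_sum(items, 10))
print(greedy_subset_sum(items, 9), exhaustive_subset_sum(items, 9))
```

For the target 10 both procedures find 6 + 4. For the target 9 the greedy strategy commits to 6 and then fails, while the exhaustive search finds 5 + 4. Seen line by line, the heuristic is of course a deterministic algorithm; seen with respect to the goal of solving the problem, it is a strategy with no guarantee of success.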
1 See John Searle, "Cognitive science and the computer metaphor," in Understanding the Artificial: On the Future Shape of Artificial Intelligence, ed. Massimo Negrotti (London: Springer, 1991).