The study of science and technology by social scientists has led some of us to develop a theory of the growth of socio-technical imbroglios in terms of associations.1 The word "social" in the expression "social science" would no longer refer to "society" but to the "associations" established between humans and non-humans. The problem encountered by such a theory is to decide whether or not one should qualify the associations beforehand. In studying a fragment of science, should we be able to sort the associations by types (for instance "student of," "instrument for," "stronger than," "interested in," "implies that," and so on) or simply stick to the mere occurrences of the associated elements? The empirical consequences of such a decision are important and seem to lead us into a quandary. If we follow the first line of enquiry we will have a rich narrative, but will be able to deal only with a very small amount of data. If we follow the second line, we might be able to handle a large amount of data, but we would lose the richness of the information and would have only a cloud of elements with no other relations than the fact that they occur together. We seem to be limited by the very same weakness that suspended the associationist research program in the eighteenth century. The use of computers and large data banks might help us out of this quandary by allowing "mere associations" to give us enough information to qualify the types of association as well. This is, at least, the attempt that we want to present in this article.
Despite contemporary progress in statistics, the social sciences are still too divided between quantitative and qualitative methods. To take a few examples, economics, electoral sociology, demography and "cliometry" (the French word for quantitative history) have at their disposal a number of mathematical tools and appropriate databases. The same cannot be said for anthropology, for numerous branches of history, for field studies in sociology and for symbolic interactionism. This difference of methods and tools is at once the cause and the consequence of several other divisions: between macroscopic analyses and analyses of individual interactions, and between explanations in terms of structure and explanations in terms of circumstances. Despite the presence of tools developed in part for sociologists, such as factorial analysis, the progressive passage from global to local analyses never seems to get any easier.
This division is particularly deleterious for those who use the notion of network in order to account for multiplicity, heterogeneity, and the variability of associations responsible for the solidity of a fact, a technical object, a cultural feature, or an economic strategy. Such studies are not at ease either with quantitative methods, which do not follow the network faithfully enough, or with simple ethnographic descriptions (case studies), which do not enable one to tie a given case study to any other. But at the same time, there is no way of charting a network by choosing a median solution and projecting it-by correspondence analysis, for example-onto a common statistical space. In so doing, one would lose the advance that constitutes the idea of networks: the possibility for the actors themselves to define their own reference frames as well as the metalanguages used within them.
This step forward, conjointly made by ethnomethodology, the new sociology of science, and semiotics, has not yet been operationalized by specially designed methods of data analysis.2 In the absence of methods adapted to it, those who are developing the network ideas are forced to hesitate between statistical groups that are too large-scale and detailed analyses that are too fine-grained-or to despair of ever finding suitable quantitative methods.3 It then becomes easy to accuse those using the idea of networks of making a slogan of it (the network is "a seamless web"4) which does not enable one to differentiate as effectively as traditional notions using groups of acceptable size, and which does not enable one to carry out a relativist program.
Thus we need to give qualitative workers a Computer Aided Sociology (CAS) tool that has the same degree of finesse as traditional qualitative studies but also has the same mobility, the same capacities of aggregation and synthesis as the quantitative methods employed by other social sciences. The limits to this tool are the same as those of any instrument from any scientific discipline. Firstly, it works from written documents or inscriptions and thus does not resolve the problem of how these documents are obtained. Secondly, it necessarily sheds some information in the act of representing the data. In the case of the social sciences, the tool or computer tool that we give the specifications for below thus presupposes the prior transformation of the terrain into texts and the accumulation of enormous masses of documents, in which the researcher risks getting drowned. It does not claim to replace detailed analysis of a terrain or text. It only endeavors to provide a means for dealing with large numbers of documents.5
The tool that we seek will begin by looking at texts, if possible full texts-whether these be archives, reports, open or closed interviews, or, finally, field notes. In any case, instead of then wondering how to treat this enormous mass of data by applying methods of automatic reasoning or of artificial intelligence (AI) to the documentation obtained, we intend to follow the inverse strategy, and use techniques for treating documents in order to help researchers to artificially produce intelligence about the terrain they are analyzing.
The advantage of this approach is that it constitutes a challenge at once for sociological theories themselves, for computer science, and for cognitive science. In effect, the question can be posed in three ways:
Does the idea of networks enable us to deconstruct the set of forms and vocabulary that the social sciences have used, every which way, up to the present?
Does it allow us to follow a class of problems with fluid definitions, something that computer science has not yet been able to deal with?
Is it possible to use the idea of networks to successfully reconstruct the logics that the concepts of forms and structures only give us very partial access to?
These three questions can, we think, be tackled by taking them all on at the same time in the form of a program written for a microcomputer. We have given our project the code name "the Hume-Condillac machine"6 in honor of the Scottish (1711-1776) and French (1714-1780) philosophers whose research programs we are partially reviving using computers, to which they obviously did not have access. This project for Sociology Assisted by Computer is also a form of CAS (Computer Assisted by Sociology). We agree with Hewitt that models for developing cognitive science rooted in the mind or brain are less useful for constructing computer tools than those borrowed from organizations, society, and networks.7
In this paper, we draw certain logical, cognitive, and information-science conclusions from work that has accumulated over the past ten years in the sociology of science and technology.8 What we have been able to show from studies of laboratories, theories, machines, and technology is that their robustness, their solidity, their truth, their efficiency, and their usefulness depend less on formal rules or on their own characteristics than on their local and historical context-independently of the various ways that there are of defining that context.
The principle we started from in constructing the Hume machine is a principle of calculability different from that of Turing machines, but one which occupies the same strategic position for our project as his did for his project.9 The reasoning is as follows:10
any form is a relationship of force;
any relationship of force is defined in a trial;
any trial may be expressed as a list of modifications of a network;
any network is resolvable into a list of associations of specific and contingent actants;11 and
this list is calculable.
Thus there is no formal concept richer in information than that of a simple list of specific and contingent actants. There is a tendency to believe that we are better off with formal categories than with circumstantial facts, but forms are merely a summary of a network: that is to say, of the number and distribution of associations.
The principle of calculability can, then, be summarized as follows. Any microtheory,12 or sequence of formal concepts, can be deployed in a network of associations that is not itself a microtheory. To put it another way: any closed system is a local and circumstantial part of an open system. Following Hewitt, we take an open system to be a system that cannot in principle be completed or closed, and that therefore has to negotiate between conflicting decisions made by parts of the system that are independent of one another.
This postulate seems to be paradoxical. Logical forms, mathematical rules, sociological laws, structural stabilities, and syntactical constructions do indeed seem much richer in content than mere association. P "implies" Q, P "is the cause of" Q, P "possesses" Q, P "is the father of" Q, P "is complementary to" Q, and P "transforms" Q seem to be more robust determinations than the simple statement P "is associated with" Q.
We postulate that these "rich" terms have no other content than that of summarizing, representing, gathering, condensing a network of "poor" terms whose sole link is that of association. Structured forms are synopses, clusters, digests, skeins of associations. Their size, force, robustness, necessity and solidity cannot be deduced from their formal qualities, but from the substance or matter of the network that they are capable of mobilizing. In other words, there is no formal notion that does not gain its substance from a (more or less pre-ordered) set of contingent circumstances. In conclusion: in following the network of contingent circumstances, we also gain access to the ultimate cause of the solidity of all structured forms.
Our postulate is only apparently reductionist. It is not a question of reducing the whole to its parts, as if we were saying that the human body is at root "only" hydrogen plus carbon plus water. On the contrary, we want to show that the whole-the network of contingent circumstances-is superior to its parts-the skeins or structures that summarize its associations.13 The postulate is thus literally irreductionist. It deploys all forms in its network, and all power in its relationships of force.
It is clearly pointless to make such a statement about the origin of robustness if one uses it to replace procedures for the calculation of a microtheory within the very interior of the field of application of a microtheory. Why take the trouble of using the networks of associations so nicely summarized by truth tables or by partial differential equations-even if this use is theoretically possible-when there are numerous tools in logic or mathematics capable of treating these microtheories without the slightest difficulty? The postulate only becomes valuable if it is applied to a class of problems that microtheories cannot incorporate: that is to say, to everything that is between microtheories-to open systems. The Hume machine will always be weaker than microtheories taken on their own terms. It only comes into its own when compared to the performances of microtheories14 not on their own terrain, but on its complement.
Microtheories form a more or less dense archipelago. The sea that links them is for the moment one that it is difficult and dangerous to navigate. Cognitive and computer scientists dream of covering over this sea by linking the set of microtheories. This, we now know, is an impossible dream15-or rather a nightmare. Faced with this situation, there are, it seems, three possible solutions. The first consists of closing ranks, ignoring basic problems, lowering sights and marching on, limiting computer prostheses to the simple cases of sets that are already well-defined and indeed predefined by metrology and by standardization procedures. The second involves criticizing the weakness of computers by explaining why they will never be capable of dealing with ambiguous, polysemic, reflexive, hermeneutic problems which necessitate a diffuse formalism and fluid sets.16 The third is to postpone solving material problems for the present, and to write programmatic texts while waiting for computers or people to improve. Each of these approaches has the effect of working from microtheories, and of postponing as long as possible a consideration of their margins and limits. We will see that, using a form/field reversal (Gestalt switch), it is possible to follow an inverse strategy and to work from open systems by treating microtheories as a special case-as a condensation.
There is in effect a fourth path, one that does not rely on an extension of formalism, on the injection of Heidegger or Garfinkel into the chips, or on programmatic dreams featuring an Achilles who in fact never catches up with the tortoise. This is the path that we will take with the Hume-Condillac machine. It involves adapting our philosophy, ontology, and sociology as far as possible to what computers can do: statistics about the counting of labelled occurrences. Instead of taking the royal road, which consists of making computers intelligent so that they are as skillful as the finest sociologists and most intricate hermeneuticians-a road which very soon becomes impossibly steep-we will take the service escalator. We accept the elementary stupidity of computers and we fashion a sociology, a logic and an ontology that work at their level of stupidity. Instead of being strong, we take the solution of weakness, hoping to turn this weakness into strength, since today's computers-and not those postulated for the year 2050-can come to our aid right away. Whoever tries least goes furthest. We adopt this strategy of weakness, which was attempted without success by Hume and Condillac for explaining the human mind, for dealing with computers whose non-human mind is sufficiently moronic to really resemble Condillac's statue or Hume's tabula rasa.
Why should this new approach succeed when so many dreams of automata have failed? Precisely because the Hume machine does not dream, but takes the computer for what it is, without imposing anthropomorphic projections and epistemological beliefs on it. The objection often made by hermeneuticians against the "thoughts" of computers is that since it has no body nor project nor worries, the computer is not thrown-into-the-world, as Heidegger said humans were. But this is the point we are making-we are not talking about imitating humans. Amorphous silicon and electrons have their own way of being in the world. We have to work from them instead of vesting them with human properties so as to immediately deny that they have any.
What, then, is the minimal property that we should start by giving the Hume machine? Received opinion says that computers need a set of rules in order to calculate. Formalists claim that the computer is above all a generator of rules that are all reducible to an inference engine of the form IF...THEN.... This point is also accepted by hermeneutic critics, who go on to say that it is impossible to completely regulate language games and consciousness. But this first supposition (viz. that computers obey rules) is already an anthropomorphic projection.17 It involves attributing a particular view of formal human thought as elaborated by epistemologists to computers, with all the paradoxes that that entails.
Now, the computer does not have any IFs or THENs-these are already functions within a predefined language. All it has is occurrences of addresses linked between themselves by an elementary association: address 1 "is the same as" or "is different from" address 2. For its own part, in its own world, all the computer does is blindly deal with associations between contingent and specific addresses. In other words, it is already in itself an association network, in the sense defined above by our principle of calculability.
We can now understand that the only necessary point of departure is offered us by the computer itself. This is a network of associations between contingencies, having no other a priori characteristics than that they are different from one another, and that they can be addressed. Is the computer blind? Then so are we. Does the computer have no formal rules to start with? Neither do we. Does the computer not deal in abstractions? Neither do we. Does the computer just feel its way from trial to trial, from circumstance to circumstance? So do we, and we don't ask any more of it.
The paradox of discussions about the possibilities of computers is that they are lent qualities that they do not have (formalism, the epistemological dream of humans) whereas they are denied hermeneutic capacities that they already have. Indeed, in its own terms the computer is already an open system. It respects contingencies and specificities much more than the humans who try to program it believe. We thus give up on ever getting out of it what it is already capable of, and what it would deliver if only we abandoned our epistemological fantasies.
This conception of what computers do is clearly the result of applying our argument about the origin of structured forms back to them, since it is this argument that enables us to dispute anthropomorphic projections onto computers. But this result is in turn what will allow us to use the computer right away to prove our argument about the network origin of said forms. The Hume machine is an associationist machine and is only that. It does not come with any logical category, any syntactic form, any structure. It is-dare we say-blindly empiricist. What we hope to gain from our strategy is to find, instead of the chaos that might be expected, all the emergent properties that are worthy of our attention, and which will enable us to circulate between microtheories.18
In order for the machine to remain an open system, we will suppose that it has at the outset no information at all about the nature of the data it is dealing with. The only information that can be fed in is of a general nature, relating to the way in which its perceptions are to be memorized, associated and aggregated.19 The different objects present in its memory are always represented by labels. This is an indispensable condition for the treatment of the data. The only logic needed to govern these labels is that of identity: two perceptions are either the same or they are different.
Working from a flux of data present in the form of labels without properties (at least before any learning is done), how can the internal state of the machine structure itself to be able to offer interpretations that sufficiently resemble those of the sociologist, logician or historian who want to use it to help them deal with their research data?
The fact of being able to perceive labels without any other treatment is clearly not enough to reconstruct microtheories. The least that we can then ask is to be able to count the occurrences of these labels, then to record statistical associations between labels by counting their co-occurrences-as Hume or Condillac's statue did. No structure, no microtheory, should come into the machine that it has not obtained by its analysis of associations.
But won't it be said that it is the absurdity of just such an associationist project that, waking Kant from his "dogmatic sleep," proved the necessity of synthetic a priori judgements? Why should we be able to succeed where Hume and Condillac failed? The fact is that they tried to pass directly from the recording of associations to formal structures. They forgot to deploy an essential mediator: the network. In addition, they did not have the benefit of computers, which alone can accumulate enough contingent circumstances to substitute the force of number for the formal force of rules. It is the network that will allow the machine to serve its apprenticeship by transforming any set of contingent circumstances into a point that will then serve for the reconstitution of the network. A network of co-occurrences composed of associations of actants is very poor compared to microtheories, but it has an advantage over all of them that largely overcomes this imbalance: it is a formidable means of travel and of displacement.
In order to prove that it is possible to obtain from networks of co-occurrences what it was believed could only be achieved with formal rules, we have to provide a scale-model of the Hume-Condillac machine. Indeed, since we refuse programmatic discourse, we should already be able to realize in the model certain of the capacities of the machine, since only its size and not its principle will distinguish the current machine from any future one.20
In order to construct this model, we are going to take the least favorable conditions. That is to say, we will take a microcomputer treating full texts reduced to keywords, and we are going to establish a network using co-word analysis. The kind of calculus that seems the most appropriate for our project is that done by the Leximappe™ or Candide™ programs. If, under these extreme conditions, we are able to prove that this simple network of associations already enables us to bring out even a limited number of structures believed until then to be defined by formal elements, then we will have proved that any real Hume machine treating a greater number of texts will be able to realize our goal.21
The diagram above (see opposite page for an approximate English translation) summarizes the modus operandi of Candide: each sentence is replaced by the network of words it contains. Those networks are then added together to constitute the network of the analyzed text. The diagram above does not include the values of the associations provided by the coefficient E (see below).
The model that we have made works as follows. It first of all records occurrences of keywords in the machine's addresses. These keywords have no characteristics other than that of having an address. It draws up a list of all the occurrences of a word; for the machine, this list is made up of a string of 0s and 1s. Next it performs its calculation: that is, a comparison of co-occurrences. It classifies the associations it finds in order of degree of co-occurrence (maximum 1, minimum 0).
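The first two steps can be sketched in a few lines of Python. This is only an illustration, assuming that each document has already been reduced to a set of keywords; the function names and the toy corpus are hypothetical and are not taken from Leximappe or Candide.

```python
from itertools import combinations


def occurrence_vectors(documents):
    """Map each keyword to a string of 0s and 1s, one position per document."""
    keywords = sorted({kw for doc in documents for kw in doc})
    return {
        kw: "".join("1" if kw in doc else "0" for doc in documents)
        for kw in keywords
    }


def co_occurrence_counts(vectors):
    """Count, for every pair of keywords, the documents in which both occur."""
    counts = {}
    for a, b in combinations(sorted(vectors), 2):
        both = sum(x == "1" and y == "1" for x, y in zip(vectors[a], vectors[b]))
        if both:
            counts[(a, b)] = both
    return counts


# Hypothetical toy corpus: each document is a set of keywords.
documents = [
    {"laboratory", "instrument", "fact"},
    {"laboratory", "fact"},
    {"instrument", "network"},
]
vectors = occurrence_vectors(documents)
pairs = co_occurrence_counts(vectors)
```

Nothing here depends on what the keywords mean: the only operations are identity tests on labels and counting, which is the level of poverty the model starts from.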
From the range of possible measures, we chose the following coefficient of equivalence E:
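$$E_{ij} = \frac{C_{ij}^{2}}{C_i \, C_j}$$

where $C_{ij}$ is the number of co-occurrences of keywords $i$ and $j$, and $C_i$ and $C_j$ are the numbers of occurrences of keyword $i$ and of keyword $j$ respectively.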
This coefficient has the advantage of not making links depend on the total number of occurrences: a high degree of co-occurrence between infrequent words is ranked alongside a high degree of co-occurrence between common words, and is not lost sight of.22 The result is then projected in the form of relationships between words. These relationships have no other content than that of being "indicators" that register the relative degree of co-occurrence. Any extension of the corpus (whatever type of corpus it may be) will produce a (possibly null) modification of the value of the "tensors." It is this modification, which results from a trial of strength, that is the sole and unique point of departure in this rough prototype of the Hume-Condillac machine for any interpretation of the nature, essence, and form of actors, and of the nature, essence, and robustness of structures linking those actors. The recording of the variation of associations as a function of these trials is its only reality principle.
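A small numerical sketch may help show why the coefficient does not favor frequent words. It assumes the usual co-word form of the coefficient, the squared number of co-occurrences divided by the product of the two occurrence counts; the function names and the counts below are invented for illustration.

```python
def equivalence(c_ij, c_i, c_j):
    """Equivalence coefficient: C_ij squared over C_i times C_j (0 to 1)."""
    if c_i == 0 or c_j == 0:
        return 0.0
    return (c_ij * c_ij) / (c_i * c_j)


def ranked_links(occurrences, co_occurrences):
    """Return (pair, E) tuples sorted from the strongest link to the weakest."""
    links = [
        (pair, equivalence(c_ij, occurrences[pair[0]], occurrences[pair[1]]))
        for pair, c_ij in co_occurrences.items()
    ]
    return sorted(links, key=lambda item: item[1], reverse=True)


# Hypothetical counts: a rare pair that always appears together, and a
# common pair that appears together only half of the time.
occurrences = {"rare_a": 2, "rare_b": 2, "common_a": 100, "common_b": 100}
co_occurrences = {("rare_a", "rare_b"): 2, ("common_a", "common_b"): 50}
links = ranked_links(occurrences, co_occurrences)
```

With these counts the rare pair obtains E = 1 and the common pair E = 0.25, so the infrequent but exclusive association comes first. Extending the corpus only changes the counts, and therefore the values of E; re-running `ranked_links` registers the modification.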
Our model of the Hume-Condillac machine starts from this level of self-imposed poverty and empirical blindness. Nothing in its constitution, except for the indexing of keywords and the choice of the coefficient of equivalence E, is at odds with the functioning of Boolean logic or with the material functioning of electrons in transistors. On the contrary, the "higher-level" language of co-occurrences, the machine language, and the material all constitute how the machine operates (or thinks, or speaks) in exactly the same way.
The diagram above (see the following one for an approximate English translation) summarizes the modus operandi of Candide: each sentence is replaced by the network of words it contains. Those networks are added together to constitute the network of the analyzed text. The diagram above does not include the values of the coefficient E (see above).
What can we learn from such a primitive network of co-occurrences and such a contingent treatment of associated keywords? Nothing, say the formalists, and their fraternal enemies the hermeneuticians or the sociologists who defend humans' intrinsic difference. Everything, we say. Or, at least, everything of interest to us in looking at large bodies of qualitative documents which have remained opaque to costlier and more sophisticated treatments.
Those who are not used to looking at heterogeneous networks and who prefer the solid shelter of microtheories always imagine that putting things into network terms signifies going from order to chaos. However, a network is not undifferentiated; it is not "the night wherein all cows are equally grey," to use Hegel's expression. Chaos would mean that all associations were equally probable. That is to say that in the model any one keyword would have exactly the same chance of being associated with any one term as with any other. Now, a record of keywords is, in contrast to this hypothetical chaos, highly differentiated. It is not the case that any given keyword is associated with any other. There are preferences, asymmetries, power relationships. In brief, there is order. It is simply that since these differences do not appear in terms of structure or category but as a trajectory of associations, they do not appear right away. However, it is sufficient to get used to the idea of their being there to discern the minimum order with which the Hume machine will learn to organize its world, and thus help human researchers organize their own.
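One way of making the contrast with chaos concrete, offered here only as an illustration and not as something the Leximappe or Candide programs are stated to do, is to measure how far a keyword's associations depart from the case where all of them are equally probable. The sketch below uses normalized Shannon entropy for this purpose; the choice of measure and the sample values are assumptions.

```python
import math


def normalized_entropy(strengths):
    """Shannon entropy of a keyword's associations, scaled to the range 0..1.

    A value of 1 means every association is equally probable (chaos);
    lower values mean the keyword has marked preferences.
    """
    total = sum(strengths)
    if total == 0 or len(strengths) < 2:
        return 0.0
    probs = [s / total for s in strengths if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(strengths))


# Hypothetical association strengths of one keyword with four others.
uniform = normalized_entropy([1, 1, 1, 1])   # no preference at all
skewed = normalized_entropy([10, 1, 0, 0])   # strong preference for one term
```

A record of keywords in which most values lie well below 1 is, in the sense of the paragraph above, differentiated rather than chaotic.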
We will show that even at the current state of the machine, the use of network analysis already enables us to obtain effects of meaning that are much richer than those that others strive at great cost to impose on machines.
a) "No machine can ever recognize hermeneutical finesses, such as synonymy." On the contrary. Nothing is easier for a network of co-occurrences-even the prototype of the Hume machine can already do it.
How will we make the machine understand that two distinct terms admit a single referent? The first solution that comes to mind is to enter a dictionary of synonyms into the machine. This would enable it automatically to substitute one term for the other. However, this solution poses many more problems than it resolves, since linguists have shown that there is never any pure synonymy, and that it is necessary to take the words' use context into account in order to decide the substitution of one term for another.
Now the very interest of a network of co-occurrences resides in the fact that there is no other definition of an actant than a contextual one: that is to say, in terms of the set of actants (or semantic field) it is associated with. Thus by working from a network of associations, we can in principle recover synonyms-at whatever degree of purity or impurity-without having to enter a dictionary into the computer. This quite clearly leads us to modify the definition of synonymy along the way, as always with the Hume machine, from a substantialist to an "existentialist" one. Two words are synonymous in the context formed by a given body of texts if they give rise to the same association profile. However, it is clear that with rare exceptions two words never have exactly the same profile. There are no pure synonyms. The small differences that have to be eliminated at great cost in the dictionary approach and by categories are all maintained in the approach by networks of co-occurrences. The richness of language is in its use context. Thus, paradoxically, a management in terms of association networks retains more richness than a classification by definition.
In the current model, we analyze the first Leximappe™ network using a second program called Vector™, which compares not the keywords but their association profiles.23 Thus we can recognize synonyms by the simple fact that two terms having neighboring association profiles are put side by side on the Vector™ map. If one is superposed on the other, then they are pure synonyms.
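The comparison of association profiles can be sketched as follows. The text does not say which distance the Vector program uses, so cosine similarity is an assumption made for the example, as are the function names, the vocabulary, and the link values.

```python
import math


def link(links, a, b):
    """Strength of the association between a and b (0.0 if none recorded)."""
    return links.get(frozenset((a, b)), 0.0)


def profile_similarity(a, b, links, vocabulary):
    """Cosine similarity of the association profiles of a and b.

    Profiles are taken over every other word in the vocabulary, so two
    words can be close without ever co-occurring with each other.
    """
    others = [w for w in vocabulary if w not in (a, b)]
    pa = [link(links, a, w) for w in others]
    pb = [link(links, b, w) for w in others]
    dot = sum(x * y for x, y in zip(pa, pb))
    norm = math.sqrt(sum(x * x for x in pa)) * math.sqrt(sum(y * y for y in pb))
    return dot / norm if norm else 0.0


# Hypothetical network: "car" and "automobile" share the same contexts.
vocabulary = ["car", "automobile", "road", "engine", "coffee", "break"]
links = {
    frozenset(("car", "road")): 0.8,
    frozenset(("car", "engine")): 0.6,
    frozenset(("automobile", "road")): 0.8,
    frozenset(("automobile", "engine")): 0.6,
    frozenset(("coffee", "break")): 0.9,
}
near = profile_similarity("car", "automobile", links, vocabulary)
far = profile_similarity("car", "coffee", links, vocabulary)
```

In this toy network the two first words have identical profiles, so they would be superposed on a map; in a real corpus the similarity would fall somewhere short of that, which is exactly the residue of use context that a dictionary would erase.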
This approach even enables us to treat the implicit and the hidden or absent referent. Suppose that the set under consideration is made up of interviews with people who are talking about a thing that for some hidden reason they never actually mention explicitly. In our center, for example, everyone interviewed uses the word "Mac"; the researcher may not know what a "Mac" is. Looking at its association profile, the researcher will be able to reconstitute the semantic field of the hidden word "microcomputer," even if the word itself does not figure in any of the interviews. If the association profile is markedly different from that of the hidden word that the researcher believed it proper to substitute, then the burden of proof is on the researcher. Does s/he have the right to impute this hidden referent to interviewees, even though co-occurrence analysis does not justify the inference? Is a "Mac" the same thing as a "microcomputer" or is it really something quite different? Here again, the retention of use contexts in the machine enables us to retain the "existentialist" richness that "essentialist" questions always impoverish: for us locally a Mac is not a computer but the computer, and there is no way an IBM computer could be the token of the general type "computer"! A lot would be lost if this nuance were ignored!
From the example of synonymy, we can see the strategy of the Hume machine. Instead of invoking a weighty formalism that tries to store up thousands of particular dictionary rules in an effort to reduce ambiguities, we offer a disorderly accumulation of a body of whatever size and the rediscovery of fine nuances through a simple mapping of semantic fields. On the one side there are hundreds of rules that in the long run do not enable us to take the different uses of words into account, and on the other there are no rules, but there is the contextual richness of use.
b) "The definition of categories necessarily depends on human intervention. In themselves raw data are scattered all over the place." On the contrary. The Hume machine finds it particularly easy to generate categories automatically. Even our prototype can already do this.
It is said that when we look at form compared to context, we find ourselves faced with "empirical" data void of all significance, dispersed. The role of the researcher is seen to be that of "putting things into order," imposing definitions, examining special cases. This strange duty does not exist when one is faced with a network-but then, neither does empirical dispersion exist. When the form is nothing more than a condensation of the context it is no longer necessary to follow Kant and impose categories on a shapeless dust of a stimulus. There are no brute facts, there are only researchers who brutalize their data. All you have to do is to ask the context itself to designate its own categories, by bringing out regroupings implicit in the network. The naming of the category can itself be made entirely automatically by a process of genuinely democratic elections-bottom-up, not top-down!
Our Candide™ model, using Vector™, can already do this. Take a set of keywords whose co-occurrence has been calculated. We obtain a network of points and of tensors. This network enables us to detect clusters, that is to say sets of points that have similar association profiles.24 Is there one macro-term that can summarize the cluster better than any other? The machine holds elections, and designates the word or words whose association profile is closest to that of the cluster as a whole. This word, generally a composite one, henceforth serves to designate the whole of the network, looked at from a certain point of view. This "nomination" is entirely revocable and reversible. The chosen keyword is not a substance, it is simply the representative or the network node that will enable us during other treatments of the data to gain access to the category and through it to the network that alone gives it meaning. It is not, as used to be the case, the category alone that gives meaning to a scattered collection of data. On the contrary, it is the network alone that gives meaning to the category. Further, since the election is dependent on the point of view taken, one can always within a given network untangle the initial categories and tangle up the data into alternative ones for some other purpose.
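The election can be sketched as follows, taking the cluster as already detected. Two assumptions are made for the example and are not drawn from the text: the profile of "the cluster as a whole" is taken to be the mean of its members' profiles, and closeness is measured by cosine similarity. The names and figures are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def elect_representative(cluster, profiles):
    """Return the member whose profile is closest to the cluster's mean profile."""
    members = sorted(cluster)
    size = len(profiles[members[0]])
    mean = [sum(profiles[m][k] for m in members) / len(members) for k in range(size)]
    return max(members, key=lambda m: cosine(profiles[m], mean))


# Hypothetical association profiles over four context words.
profiles = {
    "enzyme": [0.9, 0.8, 0.1, 0.0],
    "protein": [0.8, 0.7, 0.3, 0.1],
    "assay": [0.2, 0.9, 0.6, 0.0],
}
winner = elect_representative({"enzyme", "protein", "assay"}, profiles)
```

Here "protein" is elected because its profile lies closest to the average of the three. The nomination is revocable in a direct sense: calling the function again on a different cluster, or on profiles updated by new texts, may return a different representative.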
Thus we can see the advantages to be gained from automating our procedures. In our own relativist or post-Garfinkel world, it has become impossible to define a category from above. We have to let the actants work out their own dimensions, liaisons, and relative weight. But the task of triangulation seems to be enormous. Once there are more than a few actants, how can we do enough "by hand" to respect the multiplicity of categories and of definitions of actants? It is only this practical difficulty that has made people reject the relativist (or, better, relationist) consequences of network theories and of most ethnomethodological requirements. In the absence of material means for letting the actors organize themselves in their own way, researchers believe that they are forced to continue imposing their own metalanguage.
This procedure allows us to have at the same time, in the same machine, the hitherto contradictory advantages of nominalism and of categories. Indeed it is possible, in our model, to re-aggregate the data using macro-terms by obtaining clusters of clusters, up to any desired level of granularity. But since any category keeps a memory of its own engendering through a series of elections based on a particular and contingent list of associations, it is always possible to retrace one's steps and to rediscover any particular use context. It is this zoom and backtracking effect, so particular to modern network theories, that is the principal advantage of the Hume-Condillac machine, since it enables us to deal with large masses of heterogeneous data without splitting them up into micro and macro levels, into case studies and general theories, or into raw data and interpretations.
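One way to picture this zoom and backtracking effect is a category that never forgets what it condenses. The sketch below is a hypothetical data structure of our own devising (the class name `Category` and its methods `unfold` and `zoom` are assumptions, as are the example labels); it supposes that the elections have already taken place and only shows how clusters of clusters remain reversible.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A macro-term together with the memory of what it condenses."""

    label: str                                    # the elected macro-term
    members: list = field(default_factory=list)   # keywords (str) or Category objects

    def unfold(self):
        """Backtrack all the way down to the original keywords."""
        leaves = []
        for member in self.members:
            if isinstance(member, Category):
                leaves.extend(member.unfold())
            else:
                leaves.append(member)
        return leaves

    def zoom(self, depth):
        """Return the labels visible at a given depth (0 = this macro-term)."""
        if depth == 0:
            return [self.label]
        visible = []
        for member in self.members:
            if isinstance(member, Category):
                visible.extend(member.zoom(depth - 1))
            else:
                visible.append(member)
        return visible


if __name__ == "__main__":
    # Hypothetical clusters; the labels stand for words elected elsewhere.
    optics = Category("laser", ["laser", "lens"])
    policy = Category("funding", ["funding", "patent"])
    top = Category("laser-funding", [optics, policy])

    print(top.zoom(0))   # the most condensed view
    print(top.zoom(1))   # one level down: the sub-clusters' elected terms
    print(top.unfold())  # the full list of keywords behind the macro-term
```

Nothing is split into a "micro" file and a "macro" file: the same object answers at every level of granularity, which is the property the paragraph above insists on.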
c) "There is no automatic treatment of written language that we can use to feed into the Hume machine from above." Yes there is. Provided we look at semantics rather than syntax. The prototype already does this to a degree.
It will be pointed out that vast bodies of texts are needed in order to get at a word's semantic field, or to get self-designated categories. Now since there is no practical way of dealing with complete texts, it would appear that the Hume machine has merely displaced the problem from logic to linguistics. By claiming that language-here texts and other written documents-can restore the context (whose logic is only a condensation of the network), we are still faced with the problem of dealing with language. Granted that the structure of language is infinitely more complex than the structure of formal logic, it is precisely because of this complexity that language is less distant from the network of associations. Words are actants like any others.
It is possible to reduce syntax to semantics in the same way that we have reduced formal structures to particular instances of what they summarize in the association. It is also possible to reduce semantics to the list of trials each word-actant is submitted to. Quite clearly this irreduction would be absurd if it involved going up against the richness of the language used in the interior of the many microtheories that make up linguistics. It is not a matter of re-establishing rules about the agreement of participles from a calculation of co-occurrences. Nor is it a question of re-discovering the relationship between volume and pressure in gases using a Leximappe™ network. The work of cognitivists notwithstanding, this is not our aim. The aim of the Hume machine is to travel between microtheories. It does not need any heavy equipment, but on the contrary it needs just the bare minimum to enable it to produce meaningful statements about sets of texts in the absence of any microtheory. Like all explorers who have to carry the most food in the least space, it wants to be able to do this with a language concentrate. Contrary to AI systems, our goal is not to displace expertise from humans to machines but to let the machine develop a minimal expertise where no human has any.
Now one of the most radical ways of "concentrating" language is to keep only substantives, and to consider all syntactical forms (verbs in particular) as configurations of word networks. In the phrase "the cats eat the mouse," we only take away with us the co-occurrence of cats and mice, and we ask the machine to restore the verb "to eat," if necessary, by recognizing the non co-occurrence of "cats" and "mice"! This procedure is not too efficient for the verb "to eat," which instantiates powerful microtheories, but is much more useful for exploring configurations of networks for which there are no verbs in the language, or for which existing verbs-"to be able to," "to cause," "to want," "to occupy," "to hold" (whether they are taken sociologically or logically)-do not suffice.
This simplification becomes crucial when it is complemented by the irreduction of substantives themselves. In effect, what we said above about categories also applies to words. The model of the Hume machine treats all common words as proper nouns-extreme nominalism. But then, all proper nouns are macro-terms elected by the network itself-that is, reversal of nominalism by the network. There is no definition of any word richer than the co-definition obtained by looking at the use context of all words associated with it.25 In most social sciences, we need to operate in terms of networks, since the multiplicity of points of view, of informants, of transformations can entail that a given name cannot be assigned to a particular person or institution. The "same" person can be successively designated in interviews by initials, "Mr. Smith," "John," "the representative of the authorities," or by "industrialist." If the isotopy of this actor is in question-if the different words do not have the same association profile-then there is nothing forcing the researcher to consider that nevertheless there is a single essence having different manifestations.26 We are simply dealing with a "variable geometry" actor. If we wish to stabilize this actor, then we must work just as hard to maintain this isotopy as we would for any other actant.27
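The question of isotopy can be put to the network directly: do the different designations of the "same" actor have the same association profile? The sketch below is a minimal illustration under assumptions of our own, not a description of the prototype: the function names, the cosine measure, the threshold of 0.8, and the profiles themselves are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(p, q):
    """Cosine similarity between two association profiles (assumed measure)."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def isotopy_breaks(designations, profiles, threshold=0.8):
    """List the pairs of designations whose profiles diverge.

    An empty list means the designations can be treated as one actor from
    this point of view; the threshold is an arbitrary, revisable choice.
    """
    return [
        (a, b)
        for a, b in combinations(designations, 2)
        if cosine(profiles[a], profiles[b]) < threshold
    ]


if __name__ == "__main__":
    # Hypothetical profiles drawn from a set of interviews.
    profiles = {
        "Mr. Smith": Counter({"factory": 3, "union": 1}),
        "John": Counter({"factory": 3, "union": 1}),
        "industrialist": Counter({"ministry": 2}),
    }
    print(isotopy_breaks(["Mr. Smith", "John", "industrialist"], profiles))
```

The function decides nothing about essences. It only reports where the work of maintaining a single actor would have to be done.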
Here, as elsewhere, researchers do not have to decide. Nor do they expect the Hume machine to decide for them. On the contrary, they want it to help them maintain a state of variation, of opening, of a possible recomposition of the association network. Here again we can clearly see the abyss that separates our strategy from that of AI experts. We do not delegate the most rule-driven and formal parts of our actions and the surest of our knowledge to the machine while reserving fine-grained interpretation and ambiguous cases for ourselves. On the contrary, we use the machine to keep the system open as long as possible, by keeping for ourselves the tasks of putting things into categories and of locally closing microtheories. It is the computer that enables us to retain a "natural" form of intelligence, and ourselves who continue to produce "artificial"-that is to say, closed-forms. While AI's delegations hardly help us at all except in managing existing microtheories from the inside, the new mixture between the Hume machine and the researcher promises to be more useful between microtheories.
In the preceding section, we showed that the network of co-occurrences forming the first layer of a Hume machine and the greater part of our model does not dissolve into chaos, and enables us to keep open a great number of characteristics of the context. The network is much richer and much more differentiated than all "higher" terms, which in fact have no other content than the network they summarize or condense.
Let us note at this point that any model of the Hume machine, however primitive and of whatever early version, is already a valuable tool for our declared aim: to help researchers in the social sciences to mobilize masses of heterogeneous data in the form of full texts. Even a machine that could merely let us range over a set of categorizations and synonyms in a mass of interviews is already extremely useful.
However, in the preceding section it was the humans who did the work of simplification, of expressing in terms of microtheories. They only asked the machine to produce reversible categories and to keep the system open. The machine is an extreme empiricist; the researchers are the microtheorists. Now there are two ways that we can present the advantages of the Hume machine. The first is that it is effective because it enables researchers to keep their microtheories reversible. This is the cut-rate version. The second is that it is effective because it itself can produce microtheories. This is the upmarket version, and it is this version that enables the Hume machine to compete with the "scientific discovery" programs reviewed by Thagard on their own ground.28 Like Condillac's statue, it has to be able to learn to recognize network configurations for itself. We want to progress from a primitive model to advanced ones. In order to do so, the machine must not simply keep the association network open, but must also contribute to closing it. This becomes essential once the number of databases increases. It must be able to serve its apprenticeship and initiate a new dialogue between researcher and machine, whereby it can test its interpretations of the state of the network. It should be able to propose microtheories locally, or at least to propose to the human a progressive passage from the unfolded, irreduced network to a condensed microtheory and vice versa. This progressive and reversible passage is the essential feature of the upmarket version of the Hume machine. However, this time we do not have any working model.
We can now see the direction taken by researchers interacting with more and more elaborate versions of our Hume machine. What is happening is that we are getting closer and closer to the techniques of narrative-an essential tool for historians, ethnologists, field sociologists, and naturalists-and to the description of association networks made up of a jumble of heterogeneous databases. In this fusion of qualitative literary qualities and the power of quantitative treatment, we expect a renewal of methods and explanations in the humanities. There is no more powerful explanation than the analysis of the contingent circumstances of association networks, but for the moment the only way of obtaining this form of meaning is through a narrative limited to narrow terrains. Up until now, the only remedy to this limitation was to go over to statistical tables and to quantitative analyses. This was at the price of a rupture with the fine tissue of networks and circumstances, whence the interminable debate between field sociology and the sociology of structures, between history and sociology, between economic history and economic theory, between arts and sciences, between history of science and model-building philosophy. The Hume machine opens an alternative route. It is quali-quantitative. Since it is not based on any particular innovation in material or in programming, it can immediately begin to guide the construction of models, each of which will at once be of use to researchers in the humanities.
This workstation is a contribution to the debate that is taking shape between the cognitive sciences and the new sociology of science.29 But instead of a sterile opposition of psychology and sociology, we propose to select from within each field those schools of thought that give rise to fruitful associations. Instead of restraining the context so as to enable the mind of the scientist or the computer to make discoveries limited to microtheories in complete isolation, we propose on the contrary to choose each time the school of thought that enables us to follow the context in the most continuous and complete fashion. Just as Bloor and Collins's program-asymmetric, since it has a different treatment of society and nature-is badly adapted to the mentalist cognitive sciences as defined by Slezak and to Thagard's computer tools, so does the symmetrical program that we propose in the name of network theory seem well suited to the cognitive sciences as defined by the connectionists. All the debates about the construction of society, science, and psychology that cognitivists hoped to end, continue within cognitive science, within programming languages, within the sociology of science and technology; and probably within computers themselves.
1 See for instance Michel Callon, John Law, and Arie Rip, eds., Mapping the Dynamics of Science and Technology (London: Macmillan, 1986); and Bruno Latour, Science In Action: How to Follow Scientists and Engineers through Society (Cambridge, MA: Harvard UP, 1987).
2 See Harold Garfinkel, Studies in Ethnomethodology (New Jersey: Prentice, 1967); Michael Lynch, Art and Artifact in Laboratory Science: A Study of Shop Work and Shop Talk in a Research Laboratory (London: Routledge, 1985); Steve Woolgar, Science, The Very Idea (London: Tavistock, 1988); Algirdas Julien Greimas, On Meaning: Selected Writings in Semiotic Theory (Minneapolis: University of Minnesota Press, 1976); and Algirdas Julien Greimas and Jean Courtés, Semiotics and Language: An Analytical Dictionary (Bloomington: Indiana UP, 1982).
3 Howard Becker, Art Worlds (Berkeley: California UP, 1982).
4 Thomas P. Hughes, "The Seamless Web: Technology, Science, Etcetera, Etcetera," Social Studies of Science, 16.2 (1986) 281-292.
5 As instantiated in Bruno Latour, Philippe Mauguin, and Geneviève Teil, "Une méthode nouvelle de suivi des innovations: Le chromatographe," La Gestion de la recherche: Nouveaux problèmes, nouveaux outils, ed. Dominique Vinck (Bruxelles: De Boeck, 1991) 419-480.
6 The Hume machine, or Condillac's statue, represents the associationist program, which we hope will thus be awakened from its "dogmatic sleep"... It goes without saying that we are displacing the psychogenesis conjured by these two authors from people onto machines. What they say about humanity is not very likely, but Hume's vision of the formation of human understanding and Condillac's design for his statue should apply much better to the computers described below.
7 Carl Hewitt, "The Challenge Of Open Systems," BYTE (1985) 223ff.
8 See Andy Pickering, ed., Science as Practice and Culture (Chicago: Chicago UP, 1992); Wiebe Bijker and John Law, eds., Shaping Technology-Building Society: Studies in Sociotechnical Change (Cambridge, MA: MIT Press, 1992); and Wiebe Bijker, Thomas Hughes, and Trevor Pinch, eds., The Social Construction of Technological Systems: New Directions in the Sociology and History of Technology (Cambridge, MA: MIT Press, 1987).
9 For the present, we can summarize the analogy as follows: the two principles of calculability replace higher, structured cognitive capacities by a determined and systematic management of stupidity. They are strategies of weakness. The Hume machine is determinist but is not refutable since refutability can only occur within a microtheory.
10 Bruno Latour, "Irreductions," The Pasteurization of France, Part 2 (Cambridge, MA: Harvard UP, 1988).
11 "Actant" is a semiotic term used to replace the distinction between actors (usually humans) and objects. It designates anything that may be said to act in a story.
12 We borrow the term "microtheory" from Hewitt. He uses it to describe a set of formal rules in a closed system (the theory of relativity, calculation of sales tax, stock management programs). Clearly, this definition presupposes that there are no macro-theories. There are only local theories, whether these be theories of relativity, cosmology, or accounting rules. Thus the question is one of knowing if we can extend microtheories in order to help them survive in an "open system," or if we need to profoundly modify the very way in which one microtheory is extended to reach another.
13 See also Francisco Varela, Evan Thompson, and Eleanor Rosch, L'inscription corporelle de l'esprit: Sciences cognitives et expérience humaine (Paris: Le Seuil, 1993), for a similar irreductionist argument in terms of networks.
14 It would be just as absurd to try to make the Hume machine work within a given microtheory as to ask modern astronomers to make their calculations by hand. But this very image is a case in point. It shows to what extent the development of computers has allowed us to demystify microtheories-whence their current name of microtheories. There has been a delegation of the act of calculation to blind automata. This is captured by the concept of "algorithm." Formal procedures are economical and mechanical, they are operative and operational. They no longer have the intellectual aura that made us look for the form or structure "beyond" simple empirical and contingent data. This materialization and banalization of operations once considered to be spiritual or at least mental is probably the most significant cultural effect of computers.
15 The aporia that Gödel arrived at-"any formal system sufficiently rich to include arithmetic contains undecidable propositions"-that was so mortifying to formalists is merely a consequence of the project of expanding microtheories to the set of open systems. The associationism proposed here does not lead to the same contradictions. The Hume machine is not bound by the incompleteness theorem. It produces interpretations in the form of networks of possibilities and not in the form of a microtheory. Thus it is neither complete nor indeterminist. Far from being contradictory, it can be applied to the very theory of open systems, without taking refuge in a metalanguage that would close them. The theory is therefore reflexive. See Steve Woolgar, Knowledge and Reflexivity: New Frontiers in the Sociology of Knowledge (London: Sage, 1988).
16 For the first line, see Edward Feigenbaum, Pamela McCorduck, and H. Penny Nii, The Rise of the Expert Company: How Visionary Companies are Using Artificial Intelligence to Achieve Higher Productivity and Profit (New York: Times Books, 1989). For the second, see Harry Collins, Artificial Experts: Social Knowledge and Intelligent Machines (Cambridge, MA: MIT Press, 1990); Hubert L. Dreyfus, What Computers Still Can't Do (Cambridge, MA: MIT Press, 1992); and Terry Winograd and Fernando Flores, Understanding Computers and Cognition: A New Foundation for Design (New York: Addison Wesley, 1986).
17 Brian C. Smith, "Formality and Separation," unpublished ms., 1991.
18 Our procedure is linked to the ideas used in the field of neural nets and by connectionists. For example, see David E. Rumelhart, James L. McClelland, and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition (Cambridge, MA: MIT Press, 1988). Neural nets very elegantly resolve certain difficulties in calculation and in form recognition by short-circuiting traditional stepwise and rule-bound programming. However, they presuppose the existence of highly structured forms either at the outset or in their output. Even if they offer certain advantages for programming the Hume machine, they have no way of resolving the basic problem of its architecture: there are no structured forms to start with even to reward the net.
19 If one takes formalism to mean "metalanguage," then it is clear that we ourselves have a metalanguage for describing the Hume machine and for interacting with it. However, it is not formal in the strong sense of the word. An inspection of its inherent properties will never enable one, by itself, to judge its correctness. Thus there are metalanguages that designate simple trajectories through the context and stronger metalanguages (microtheories) that aim to substitute themselves for the context or to do without it.
20 This prototype already prefigures the workstation that we want to make of the Hume machine in the future. It is in current use as a tool in network analysis (in the particular fields of science policy, sociology of science and the economy of networks, and, more recently, in libraries). It works in a Hypercard environment on a Macintosh, using a tool for the indexing of key-words and a Leximappe program. It is already capable of treating large bodies of heterogeneous written documents. See the account of the Candide workstation by Geneviève Teil, Candide, un outil de sociologie assistée par ordinateur pour l'analyse quantitative de gros corpus de textes, Doctorat (ENSMP, Paris, 1991).
21 The Leximappe program detects the occurrence of A, and that it co-occurs with B, C and Z. The Vector program analyses this first list, and looks to see that words Y, N, etc., also have B, C and Z as "associates." The interest of such a profile is that it does not depend on the form of the network, but on the contingent nature of the key words that are in fact linked to such and such another word.
22 Bertrand Michelet, Doctorat (Paris VII, Paris, 1988).
23 The "vocabulary" of automatic analysis developed by working from large databases using slow programs has already proven interesting. Take, for example, "negation" (see Jean-Pierre Courtial and J. Pomian, "A System Based on Associational Logic for the Interrogation of Databases," Journal of Information Science, no. 13 (1987) 91-97), judgements about "central research, dense research, peripheral research," etc. (see Michel Callon, Jean-Pierre Courtial, and Françoise Lavergne, La Méthode des mots associés: Un outil pour l'évaluation des programmes publics de recherche. Etude pour la National Science Foundation (Paris: Ecole des Mines, 1989)), the answer "question well put, question badly formulated, peripheral question" and "A and B are in competition" (see Teil, Candide), etc.
24 See Callon, Law, and Rip; and Michelet.
25 This formulation is simply a transcription into network language of Saussure's definition in terms of structure. Its only advantage is that it allows us to do away with the notion of structure, a notion that is too far-fetched for the Hume machine, and in any case not accessible by the model.
26 The Hume machine does not make any assumptions about essences. It considers that all the co-occurrences of a single word are simply homonyms that are then considered in the analysis of the network to see if they are synonyms or not.
27 For the practical means of following this geometry, see Latour, Mauguin, and Teil, "Une méthode"; and Bruno Latour, Philippe Mauguin, and Geneviève Teil, "A Note on Socio-technical Graphs," Social Studies of Science, 22.1 (1992) 33-59, 91-94.
28 Paul Thagard, Conceptual Revolutions (Princeton: Princeton UP, 1992).
29 Peter Slezak, "Scientific Discovery by Computer as Empirical Refutation," Social Studies of Science, 19.4 (1989) 563-600.