Neural Networks

Sylvia Bereknyei
STS 129
Prof. Gorman
November 15, 2001
What if these theories are really true, and we were magically shrunk and put into someone’s brain while he was thinking. We would see all the pumps, pistons, gears and levers working away, and we would be able to describe their workings completely, in mechanical terms, thereby completely describing the thought processes of the brain. But that description would nowhere contain any mention of thought! It would contain nothing but descriptions of pumps, pistons, levers!
—Gottfried Wilhelm Leibniz
The positive scientific impacts of neural nets vary dramatically across domains, from pattern recognition to perception to tree searching. Banks use neural nets in automated teller machines to verify an individual’s identity by evaluating three-dimensional pictures of the individual’s face. Medical diagnoses are made every day with the help of neural nets that compute specific enzyme data. The stock market is even flooded with neural nets: one stock market analysis company regularly follows the Standard & Poor’s 500 with a predictor net that consistently outperforms conventional calculations. Each day, neural network research advances toward overtaking specific human functions, yet ironically, neural nets are based upon animal (human) neural functioning.
“There is an emerging community of researchers who intend to build neural nets the way nature intended: massively parallel, with a dedicated little computer for each neuron” (Kurzweil, 90). Each of these connections would run at speeds over a million times faster than human neurons, yet there is no single central processing power available to connect these intricate models. While neural networks are both innovative and powerful in terms of computing capability and lend themselves to a wide variety of applications, they will not be considered intelligent unless there is a centralized symbolic information processor.
The theoretical foundations of neural nets are rooted in the works of John von Neumann and Rudolf Ortvay. Ortvay, a physicist, was the first to suggest that there is “a connection between the brain and electronic calculating equipment” (Levy 25). Von Neumann went on to study this connection, associating a built-in computational system with living organisms. “‘Anything that can be exhaustively and unambiguously described, anything that can be completely and unambiguously put into words, is ipso facto realizable by a suitable finite neural network,’” remarked von Neumann (Levy 25). Von Neumann’s theory made work on neural networks plausible: he proposed a hypothetical self-reproducing automaton that followed a specific set of rules and was built from computational elements, manipulating elements, cutting elements, fusing elements, sensing elements, and information storage elements that also served as structural elements. This kinematic model of the automaton was quickly rejected, however, because the design of its elemental rules was ambiguous.
Perceptrons
Rosenblatt created a model for perceptual activity in the late 1950s and coined the term ‘perceptron’. Perceptrons are machines built on the theories of von Neumann that are essentially pattern recognition devices: they associate input patterns with specific responses. The machine has binary input cells (similar to a retina) and an output cell (similar to brain cells) that recognizes associative patterns from the input cells.

The perceptron learns by utilizing the McCulloch and Pitts neuron, which treats the output cell as a basic threshold unit. The threshold concept comes directly from the neural biological source: a neuron will propagate the electrical potential to the postsynaptic nerve only if there is a strong enough potential from the presynaptic nerve (or nerves). There are various forms of biological synaptic summation (either temporal or spatial), but neural nets strictly use spatial summation in conjunction with synaptic weights, which multiply the level of each input cell by either an excitatory or inhibitory value. Abdi states that “more formally, the response of the threshold unit depends upon its level of activation, which is computed as the sum of the weights coming from active input cells. The response is then obtained by thresholding the activation” (Abdi 4-5). Therefore, the cell will propagate a signal only if the activation level is greater than the algorithmically assigned threshold.
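The thresholding mechanism described above can be sketched in a few lines of Python. The weights and threshold below are illustrative values chosen for the example, not figures from Abdi.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) only if the weighted spatial sum of the
    active input cells reaches the assigned threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# Excitatory (positive) and inhibitory (negative) synaptic weights:
weights = [0.7, -0.4, 0.9]
print(threshold_unit([1, 0, 1], weights, 1.0))  # activation 1.6 >= 1.0, fires
print(threshold_unit([1, 1, 0], weights, 1.0))  # activation 0.3 <  1.0, silent
```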
Perceptrons have proven to be useful in logical functions. According to Abdi, “a logical function is a function which associates a binary response (i.e., 0 or 1) to any pair of binary numbers” (Abdi 5). The OR function categorizes the four possible inputs to a single-layer perceptron (either [0, 0], [0, 1], [1, 0], or [1, 1]): any input containing at least one 1 produces an active response. How perceptrons learn corresponds with the OR function: “essentially, a perceptron output cell ‘learns’ by adapting (i.e., changing) its weights when the response it gives does not correspond to the response that was expected: The perceptron learns only when it makes mistakes” (Abdi 9). Under such supervised learning, the output cell processes information directly from the input cells and adjusts the synaptic weights so that the error will not recur with the same stimulus.
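A minimal sketch of this mistake-driven learning rule on the OR function follows. The learning rate, zero initial weights, and epoch count are assumptions for illustration, not parameters from Abdi.

```python
def train_or_perceptron(epochs=10, rate=0.5):
    """Train one threshold output cell on the OR truth table."""
    patterns = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
    w, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in patterns:
            out = 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0
            error = target - out        # nonzero only on a mistake
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
            bias += rate * error
    return w, bias

w, b = train_or_perceptron()
for x, target in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]:
    out = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
    print(x, "->", out)   # matches the OR truth table after training
```

Because OR is linearly separable, the weights stop changing after a few epochs, exactly as the convergence claim below suggests.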
The strength of perceptrons primarily lies in their ability to analyze a specific linear model with low error (Fogel). Abdi’s “A Neural Network Primer” explains that “because the activation of the output cell is a linear combination of the retinal input cells, the perceptron can learn only to discriminate linearly separable categories…then if the learning constant is small enough, convergence is guaranteed”. Accordingly, early perceptrons were trained to categorize schematic faces according to specific binary inputs.
The perceptron’s limitations include its dependence on linear functions passing through the network. In “A Neural Network Primer”, Abdi claims that “these networks are equivalent to linear regression and to discriminant analysis”. Nonlinear inputs were next to impossible to compute because of the binary limitations associated with perceptrons. Shortly after the perceptron’s appearance in the artificial life world, it was branded in the early sixties as able to model the human brain only crudely, and the project was abandoned in favor of a symbolic systems approach.
Linear Associators
After researchers realized the drastic limitations of symbolic systems, a resurgence in neural net development occurred in the late seventies and early eighties. The factors behind this resurgence, as stated in Abdi’s “A Neural Network Primer”, include “the general disappointment with the performance of the symbolic approach; the availability of cheap but powerful (micro) computers; the development of nonlinear models of neural networks; and the (re)discovery of techniques for training hidden layers of neurons”.
Linear associative memories “give a response that is a linear combination of all the input values” (Abdi). Organizationally, linear associators resemble perceptrons, either having a single layer in which the input cells also serve as output cells (auto-association) or comprising multiple layers with hidden layers between the input and output cells (hetero-association).
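The idea that the response is a linear combination of the inputs can be sketched as a small heteroassociative memory. The Hebbian outer-product construction and the toy patterns below are a standard textbook illustration, not drawn from Abdi’s text.

```python
def matvec(W, x):
    """Linear response: each output is a weighted sum of the inputs."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

# Two orthogonal input patterns and the outputs to associate with them:
pairs = [([1, 0], [1, 1]), ([0, 1], [0, 1])]
n_out, n_in = 2, 2

# Store each pair by adding its outer product into the weight matrix.
W = [[0.0] * n_in for _ in range(n_out)]
for x, y in pairs:
    for i in range(n_out):
        for j in range(n_in):
            W[i][j] += y[i] * x[j]

print(matvec(W, [1, 0]))  # recalls [1.0, 1.0], the associated output
```

With orthogonal inputs the recall is exact; correlated inputs would blend the stored responses, which is precisely the linearity limitation discussed below.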
An example of a linear
associator is the Boltzmann machine, which activates or inhibits each cell
according to the probability associated with it. The Boltzmann machine is useful in modeling biological systems
with different temperature states:
When the temperature is high, neurons change
states with higher probability than when the temperature is low. During the
stabilization process, the temperature of the network is gradually lowered;
resulting in a system that becomes progressively more stable over time. (Abdi)
Physicists studying metallurgy and mean field theories use this same computational variability. Linear associators, however, are bound to strictly linear representations and are therefore limited to modeling linear phenomena. Since linear associators do not answer all the problems associated with the symbolic systems approach, other neural network models have been developed.
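The temperature effect in the quoted passage can be illustrated with the standard Boltzmann acceptance probability, a sigmoid of the energy gap divided by the temperature. The energy gap and cooling schedule below are made-up values for demonstration.

```python
import math

def flip_probability(energy_gap, temperature):
    """Probability that a unit adopts the active state, given the
    energy gap of the flip and the current temperature."""
    return 1.0 / (1.0 + math.exp(-energy_gap / temperature))

# High T gives near-chance behaviour (frequent state changes);
# low T gives near-deterministic, stable behaviour.
for T in [10.0, 1.0, 0.1]:
    print(T, round(flip_probability(1.0, T), 3))
```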
Similar in structure to perceptrons, back propagation networks adjust the weights of hidden units between the input and output cells. Under supervised learning, the multiple-layered system computes an error estimate from the error signals of the hidden units; this estimate is back propagated through the connection units of the hidden layers, after which the connection weights are recalculated in accordance with the error estimate.
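The cycle described above can be sketched for a single hidden layer. The 2-2-1 sigmoid network, learning rate, and random initialization are illustrative assumptions, not details from the essay’s sources.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
W2 = [random.uniform(-1, 1) for _ in range(2)]                      # hidden -> output
rate = 0.5

def forward(x):
    h = [sigmoid(sum(W1[i][j] * x[j] for j in range(2))) for i in range(2)]
    o = sigmoid(sum(W2[i] * h[i] for i in range(2)))
    return h, o

def train_step(x, target):
    h, o = forward(x)
    delta_o = (target - o) * o * (1 - o)            # output error signal
    delta_h = [delta_o * W2[i] * h[i] * (1 - h[i])  # back propagated to
               for i in range(2)]                   # each hidden unit
    for i in range(2):
        W2[i] += rate * delta_o * h[i]              # weight recalculation
        for j in range(2):
            W1[i][j] += rate * delta_h[i] * x[j]
```

Repeated calls to `train_step` drive the output toward the target, shrinking the squared error on that pattern.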
It is crucial to train the functions within these systems to yield the appropriate stimulus-response. Back propagation is used most often because of its “simple gradient descent search of the error response surface determined by the set of weights and biases” (Fogel, 19). Recently, genetic algorithms have begun to be used to train and optimize neural networks, and experiments show that these alternatives can outperform the supervised gradient approach. In comparative industrial studies, the “genetic algorithm significantly outperformed back propagation in training a neural network for predicting the optimum transistor width in a complementary metal-oxide semiconductor switch” (Haupt, 150). Because the error surfaces of networks with inherently nonlinear functions contain many local minima, back propagating networks often settle on suboptimal solutions in real-life situations.
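As a hedged sketch of the alternative Haupt describes, a simple genetic algorithm can search a small network’s weight space directly instead of following the error gradient. The population size, Gaussian mutation scale, and OR-classification fitness task are assumptions for illustration, not the CMOS problem from the cited study.

```python
import random

random.seed(1)
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # toy OR task

def fitness(genes):
    """Number of patterns the encoded threshold unit classifies correctly."""
    w0, w1, bias = genes
    return sum((1 if w0 * x[0] + w1 * x[1] + bias > 0 else 0) == t
               for x, t in DATA)

def mutate(genes):
    return [g + random.gauss(0, 0.3) for g in genes]

# Evolve a population of weight vectors by selection and mutation.
population = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                           # selection
    population = parents + [mutate(random.choice(parents))
                            for _ in range(15)]        # reproduction
best = max(population, key=fitness)
```

No gradient is computed anywhere: the search relies only on the fitness score, which is why such methods can escape the local minima that trap gradient descent.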
Neural networks currently find use under special conditions. When data does not follow set statistical parameters, back propagation techniques prove effective. Neural networks are adept at recognizing patterns in the collective properties of groups of variables, and are used extensively in massively parallel computation. The adaptive aspects of neural nets, in terms of optimization and associative memory function, make them ideal for nonstationary data. Only recently have neural networks been able to create higher abstractions and new information processing functions (Kohonen).
Machine intelligence still requires human expertise, since machines “solve problems, but they do not solve the problem of how to solve problems” (Fogel, 253). Neural nets have not reached one millionth of the capacity of neural connections that humans possess; although artificial connections are faster than human ones because of their electric synapses, the level of processing power is far inferior to that of humans. The overall parts of human neural networks are available to neural network designers, yet there is a gross focus on specific areas that do not encompass the brain as a whole unit. “In thinking and in preattentive and subconscious information processing there is a tendency to compress by forming reduced representations of the most relevant facts” (Kohonen, 79). There is a lack of communication among parts that may change the environment, or at least the perception of the environment, which may alter the computational outcome in responses.
Centralized processing systems are necessary to attain processing capabilities similar to the human brain’s. Designers focus on one topic area while inherently ignoring any additional inputs. According to Rodney Brooks, “When researchers working on a particular module get to choose both the inputs and outputs that specify the module requirements, [he] believes [that] there is little chance the work they do will fit into a complete intelligent system” (Brooks). Additionally, “no one talks about replicating the full gamut of human intelligence any more. Instead we see a retreat into specialized subproblems, such as ways to represent knowledge, natural language understanding, vision or even more specialized areas such as truth maintenance systems or plan verification” (Brooks).
Until we move beyond mapping the exact functionality of the brain, we will only see how the levers move, and not understand thought, as Leibniz explains. The descriptions that neural network modeling provides show us a glimpse of the brain’s mechanism, but not why we think.
Works Cited

Abdi, H. “A Neural Network Primer.” Journal of Biological Systems 2 (1994): 247-283.

Abdi, H., Valentin, D., and Edelman, B. Neural Networks. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-124. Thousand Oaks, CA: Sage, 1999.

Brooks, Rodney. “Intelligence Without Representation.” Artificial Intelligence 47 (1991): 139-159.

Fogel, David. Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. 2nd edition. New York: The Institute of Electrical and Electronics Engineers, Inc., 2000.

Haupt, Randy, and Haupt, Sue Ellen. Practical Genetic Algorithms. New York: John Wiley and Sons, Inc., 1998.

Kohonen, Teuvo, ed. Self-Organizing Maps. Springer Series in Information Sciences. 2nd edition. Germany: Springer, 1997.

Kurzweil, Ray. The Age of Spiritual Machines: When Computers Exceed Human Intelligence. New York: Penguin, 1999.

Levy, Steven. Artificial Life: A Report from the Frontier Where Computers Meet Biology. New York: Vintage Books, 1992.