Neural Networks

Sylvia Bereknyei
STS 129
Prof. Gorman
November 15, 2001
What if these theories are really true, and we were magically shrunk and put into someone’s brain while he was thinking. We would see all the pumps, pistons, gears and levers working away, and we would be able to describe their workings completely, in mechanical terms, thereby completely describing the thought processes of the brain. But that description would nowhere contain any mention of thought! It would contain nothing but descriptions of pumps, pistons, levers!
—Gottfried Wilhelm Leibniz
The positive scientific impacts of neural nets vary dramatically across domains, from pattern recognition to perception to tree searching. Banks use neural nets in automated teller machines to verify an individual’s identity by evaluating three-dimensional pictures of the individual’s face. Medical diagnoses are made every day with the help of neural nets that compute specific enzyme data. The stock market is even flooded with neural nets: one stock market analysis company regularly follows the Standard & Poor’s 500 with a predictor net that consistently outperforms conventional calculations. Each day, neural network research advances toward overtaking specific human functions, yet ironically, neural nets are based upon animal (human) neural functioning.
“There is an emerging community of researchers who intend to build neural nets the way nature intended: massively parallel, with a dedicated little computer for each neuron” (Kurzweil, 90). Each of these connections would run at speeds over a million times faster than human neurons, yet there is no single central processing power available to connect these intricate models. While neural networks are both innovative and powerful in terms of computing capability and lend themselves to a wide variety of applications, they will not be considered intelligent unless there is a centralized symbolic information processor.
The theoretical foundations of neural nets are rooted in the works of John von Neumann and Rudolf Ortvay. Ortvay, a physicist, was the first to suggest that there is “a connection between the brain and electronic calculating equipment” (Levy 25). Von Neumann went on to study this connection, associating a built-in computational system with living organisms. “‘Anything that can be exhaustively and unambiguously described, anything that can be completely and unambiguously put into words, is ipso facto realizable by a suitable finite neural network,’” remarked von Neumann (Levy 25). Von Neumann’s theory made work on neural networks plausible: he proposed a hypothetical self-reproducing automaton that followed a specific set of rules and was built from computational elements, manipulating elements, cutting elements, fusing elements, sensing elements, and information storage elements that also served as structural elements. This kinematic model of the automaton was quickly rejected, however, because the design of its elemental rules was ambiguous.
Perceptrons
Rosenblatt created a model for perceptual activity in the late 1950s and coined the term ‘perceptron’. Perceptrons are machines built on the theories of von Neumann that are essentially pattern recognition devices: they associate input patterns with specific responses. The machine has binary input cells (similar to a retina) and an output cell (similar to brain cells) that recognizes associative patterns from the input cells.

The perceptron learns by utilizing the McCulloch and Pitts neuron, which treats the output cell as a basic threshold unit. The threshold concept comes directly from the neural biological source: a neuron will propagate the electrical potential to the postsynaptic nerve only if there is a strong enough potential from the presynaptic nerve (or nerves). There are various forms of biological synaptic summation (either temporal or spatial), but neural nets strictly use spatial summation in conjunction with synaptic weights, which multiply the level of each input cell by either an excitatory or inhibitory value. Abdi states that “more formally, the response of the threshold unit depends upon its level of activation, which is computed as the sum of the weights coming from active input cells. The response is then obtained by thresholding the activation” (Abdi 4-5). Therefore, the cell will propagate a signal only if the activation level is greater than the algorithmically assigned threshold.
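The thresholding mechanism described above can be sketched in a few lines of Python. The weights and threshold below are illustrative values chosen for the example, not figures from Abdi.

```python
def threshold_unit(inputs, weights, threshold):
    """Fire (return 1) only if the weighted spatial sum of the
    active input cells reaches the assigned threshold."""
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# Excitatory (positive) and inhibitory (negative) synaptic weights:
weights = [0.7, -0.4, 0.9]
print(threshold_unit([1, 0, 1], weights, 1.0))  # activation 1.6 >= 1.0, fires
print(threshold_unit([1, 1, 0], weights, 1.0))  # activation 0.3 <  1.0, silent
```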
Perceptrons have proven to be useful in logical functions. According to Abdi, “a logical function is a function which associates a binary response (i.e., 0 or 1) to any pair of binary numbers” (Abdi 5). The OR function categorizes the four possible inputs to a single-layer perceptron (either [0, 0], [0, 1], [1, 0], or [1, 1]): any input containing at least one 1 produces an active response. How perceptrons learn corresponds with the OR function: “essentially, a perceptron output cell ‘learns’ by adapting (i.e., changing) its weights when the response it gives does not correspond to the response that was expected: The perceptron learns only when it makes mistakes” (Abdi 9). Under such supervised learning, the output cell processes information directly from the input cells and adjusts the synaptic weights so that the error will not recur with the same stimulus.
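A minimal sketch of this mistake-driven learning rule on the OR function follows. The learning rate, zero initial weights, and epoch count are assumptions for illustration, not parameters from Abdi.

```python
def train_or_perceptron(epochs=10, rate=0.5):
    """Train one threshold output cell on the OR truth table."""
    patterns = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
    w, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in patterns:
            out = 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0
            error = target - out        # nonzero only on a mistake
            w = [wi + rate * error * xi for wi, xi in zip(w, x)]
            bias += rate * error
    return w, bias

w, b = train_or_perceptron()
for x, target in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]:
    out = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
    print(x, "->", out)   # matches the OR truth table after training
```

Because OR is linearly separable, the weights stop changing after a few epochs, exactly as the convergence claim below suggests.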
The strength of perceptrons primarily lies in their ability to analyze a specific linear model with low error (Fogel). Abdi’s “A Neural Network Primer” explains that “because the activation of the output cell is a linear combination of the retinal input cells, the perceptron can learn only to discriminate linearly separable categories…then if the learning constant is small enough, convergence is guaranteed”. Accordingly, early perceptrons were trained to categorize schematic faces according to specific binary inputs.
The perceptron’s limitations include its dependence on linear functions passing through the network. In “A Neural Network Primer”, Abdi claims that “these networks are equivalent to linear regression and to discriminant analysis”. Nonlinear inputs were next to impossible to compute because of the binary limitations associated with perceptrons. Shortly after the perceptron’s appearance in the artificial life world, it was branded in the early sixties as able to model the human brain only crudely, and the project was abandoned in favor of a symbolic systems approach.
Linear Associators
After researchers realized the drastic limitations of symbolic systems, a resurgence in neural net development occurred in the late seventies and early eighties. The factors behind this resurgence, as stated in Abdi’s “A Neural Network Primer”, include “the general disappointment with the performance of the symbolic approach; the availability of cheap but powerful (micro) computers; the development of nonlinear models of neural networks; and the (re)discovery of techniques for training hidden layers of neurons”.
Linear associative memories “give a response that is a linear combination of all the input values” (Abdi). Organizationally, linear associators resemble perceptrons, either having a single layer in which the input cells also serve as output cells (auto-association) or comprising multiple layers with hidden layers between the input and output cells (hetero-association).
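The idea that the response is a linear combination of the inputs can be sketched as a small heteroassociative memory. The Hebbian outer-product construction and the toy patterns below are a standard textbook illustration, not drawn from Abdi’s text.

```python
def matvec(W, x):
    """Linear response: each output is a weighted sum of the inputs."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

# Two orthogonal input patterns and the outputs to associate with them:
pairs = [([1, 0], [1, 1]), ([0, 1], [0, 1])]
n_out, n_in = 2, 2

# Store each pair by adding its outer product into the weight matrix.
W = [[0.0] * n_in for _ in range(n_out)]
for x, y in pairs:
    for i in range(n_out):
        for j in range(n_in):
            W[i][j] += y[i] * x[j]

print(matvec(W, [1, 0]))  # recalls [1.0, 1.0], the associated output
```

With orthogonal inputs the recall is exact; correlated inputs would blend the stored responses, which is precisely the linearity limitation discussed below.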
An example of a linear
associator is the Boltzmann machine, which activates or inhibits each cell
according to the probability associated with it. The Boltzmann machine is useful in modeling biological systems
with different temperature states:
When the temperature is high, neurons change
states with higher probability than when the temperature is low. During the
stabilization process, the temperature of the network is gradually lowered;
resulting in a system that becomes progressively more stable over time. (Abdi)
Physicists studying metallurgy and mean field theories use this same computational variability. Linear associators, however, are bound to strictly linear representations and are therefore limited to modeling linear phenomena. Since linear associators do not answer all the problems associated with the symbolic systems approach, other neural network models have been developed.
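The temperature effect in the quoted passage can be illustrated with the standard Boltzmann acceptance probability, a sigmoid of the energy gap divided by the temperature. The energy gap and cooling schedule below are made-up values for demonstration.

```python
import math

def flip_probability(energy_gap, temperature):
    """Probability that a unit adopts the active state, given the
    energy gap of the flip and the current temperature."""
    return 1.0 / (1.0 + math.exp(-energy_gap / temperature))

# High T gives near-chance behaviour (frequent state changes);
# low T gives near-deterministic, stable behaviour.
for T in [10.0, 1.0, 0.1]:
    print(T, round(flip_probability(1.0, T), 3))
```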
Similar in structure to perceptrons, back propagation networks adjust the weights of hidden units between the input and output cells. Under supervised learning, the multiple-layered system computes an error estimate from the error signals of the hidden units; this estimate is back propagated through the connection units of the hidden layers, after which the connection weights are recalculated in accordance with the error estimate.
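The cycle described above can be sketched for a single hidden layer. The 2-2-1 sigmoid network, learning rate, and random initialization are illustrative assumptions, not details from the essay’s sources.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]  # input -> hidden
W2 = [random.uniform(-1, 1) for _ in range(2)]                      # hidden -> output
rate = 0.5

def forward(x):
    h = [sigmoid(sum(W1[i][j] * x[j] for j in range(2))) for i in range(2)]
    o = sigmoid(sum(W2[i] * h[i] for i in range(2)))
    return h, o

def train_step(x, target):
    h, o = forward(x)
    delta_o = (target - o) * o * (1 - o)            # output error signal
    delta_h = [delta_o * W2[i] * h[i] * (1 - h[i])  # back propagated to
               for i in range(2)]                   # each hidden unit
    for i in range(2):
        W2[i] += rate * delta_o * h[i]              # weight recalculation
        for j in range(2):
            W1[i][j] += rate * delta_h[i] * x[j]
```

Repeated calls to `train_step` drive the output toward the target, shrinking the squared error on that pattern.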
It is crucial to train the functions within these systems to yield the appropriate stimulus-response. Back propagation is used most often because of its “simple gradient descent search of the error response surface determined by the set of weights and biases” (Fogel, 19). Recently, genetic algorithms have begun to be used to train and optimize neural networks, and experiments show that these alternatives can outperform the supervised gradient approach. In comparative industrial studies, the “genetic algorithm significantly outperformed back propagation in training a neural network for predicting the optimum transistor width in a complementary metal-oxide semiconductor switch” (Haupt, 150). Because the error surfaces of networks with inherently nonlinear functions contain many local minima, back propagating networks often settle on suboptimal solutions in real-life situations.
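As a hedged sketch of the alternative Haupt describes, a simple genetic algorithm can search a small network’s weight space directly instead of following the error gradient. The population size, Gaussian mutation scale, and OR-classification fitness task are assumptions for illustration, not the CMOS problem from the cited study.

```python
import random

random.seed(1)
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # toy OR task

def fitness(genes):
    """Number of patterns the encoded threshold unit classifies correctly."""
    w0, w1, bias = genes
    return sum((1 if w0 * x[0] + w1 * x[1] + bias > 0 else 0) == t
               for x, t in DATA)

def mutate(genes):
    return [g + random.gauss(0, 0.3) for g in genes]

# Evolve a population of weight vectors by selection and mutation.
population = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
for generation in range(50):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                           # selection
    population = parents + [mutate(random.choice(parents))
                            for _ in range(15)]        # reproduction
best = max(population, key=fitness)
```

No gradient is computed anywhere: the search relies only on the fitness score, which is why such methods can escape the local minima that trap gradient descent.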
Neural networks currently find use under special conditions. When data does not follow set statistical parameters, back propagation techniques prove effective. Neural networks are adept at recognizing patterns in the collective properties of groups of variables, and are used extensively in massively parallel computation. The adaptive aspects of neural nets, in terms of optimization and associative memory function, make them ideal for nonstationary data. Only recently have neural networks been able to create higher abstractions and new information processing functions (Kohonen).
Machine intelligence still requires human expertise, since machines “solve problems, but they do not solve the problem of how to solve problems” (Fogel, 253). Neural nets have not reached one millionth of the capacity of neural connections that humans possess; although artificial connections are faster than human ones because of their electric synapses, the level of processing power is far inferior to that of humans. The overall parts of human neural networks are available to neural network designers, yet there is a gross focus on specific areas that do not encompass the brain as a whole unit. “In thinking and in preattentive and subconscious information processing there is a tendency to compress by forming reduced representations of the most relevant facts” (Kohonen, 79). There is a lack of communication among parts that may change the environment, or at least the perception of the environment, which may alter the computational outcome in responses.
Centralized processing systems are necessary to attain processing capabilities similar to the human brain’s. Designers focus on one topic area while inherently ignoring any additional inputs. According to Rodney Brooks, “When researchers working on a particular module get to choose both the inputs and outputs that specify the module requirements, [he] believes [that] there is little chance the work they do will fit into a complete intelligent system” (Brooks). Additionally, “no one talks about replicating the full gamut of human intelligence any more. Instead we see a retreat into specialized subproblems, such as ways to represent knowledge, natural language understanding, vision or even more specialized areas such as truth maintenance systems or plan verification” (Brooks).
Until we move beyond mapping the exact functionality of the brain, we will only see how the levers move, and not understand thought, as Leibniz explains. The descriptions that neural network modeling provides show us a glimpse of the brain’s mechanism, but not why we think.
Works Cited

Abdi, H. “A Neural Network Primer.” Journal of Biological Systems 2 (1994): 247-283.

Abdi, H., Valentin, D., and Edelman, B. Neural Networks. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-124. Thousand Oaks, CA: Sage, 1999.

Brooks, Rodney. “Intelligence Without Representation.” Artificial Intelligence 47 (1991): 139-159.

Fogel, David. Evolutionary Computation: Toward a New Philosophy of Machine Intelligence. 2nd edition. New York: The Institute of Electrical and Electronics Engineers, Inc., 2000.

Haupt, Randy, and Haupt, Sue Ellen. Practical Genetic Algorithms. New York: John Wiley and Sons, Inc., 1998.

Kohonen, Teuvo, ed. Self-Organizing Maps. Springer Series in Information Sciences. 2nd edition. Germany: Springer, 1997.

Kurzweil, Ray. The Age of Spiritual Machines: When Computers Exceed Human Intelligence. New York: Penguin, 1999.

Levy, Steven. Artificial Life: A Report from the Frontier Where Computers Meet Biology. New York: Vintage Books, 1992.