Artificial Neural Networks

A Human Nature, Learning & Mind Web Assignment
By Bevan Clark.

An Artificial Neural Network is an approach to modelling the structure and function of the brain. It is an attempt to simulate, with specialised hardware or software, the simple information processing capabilities of neurones connected in multiple layers. It is closely aligned with fields such as connectionism, parallel distributed processing, neuro-processing, and natural intelligent systems.

This paper is intended as an overview of the subject, and does not delve too deeply into the mathematical or technical realms of Artificial Neural Networks.


The first step towards artificial neural networks is generally seen as the 1943 paper on how neurones might work by neurophysiologist Warren McCulloch and mathematician Walter Pitts, and their construction of a simple neural network with electrical circuits. Donald Hebb reinforced this work with his 1949 book "The Organization of Behavior", which suggested that neural pathways are strengthened each time they are used.

The introduction of early electronic computers in the 1950s allowed these theories to be modelled: IBM's Nathaniel Rochester carried out the first computer simulation of a neural network. At the same time, traditional serial computing began to steal the limelight from artificial intelligence and neural network studies, although academic seminars such as the 1956 Dartmouth Summer Research Project on Artificial Intelligence provided some momentum to AI and neural processing.

In 1957, John von Neumann suggested using telegraph relays or thermionic valves to simulate simple neurone functions. In the same year, neuro-biologist Frank Rosenblatt began work on a machine modelled on the eye of a fly. The resulting Perceptron was a hardware device with a single layer of processing. It worked by computing the weighted sum of its inputs, subtracting a threshold level, and passing out one of two possible values. Perceptrons are mainly used for pattern recognition.
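The Perceptron's decision procedure can be sketched in a few lines of Python; the weights and threshold below are illustrative choices, not Rosenblatt's:

```python
def perceptron(inputs, weights, threshold):
    """Weighted sum of inputs, minus a threshold, then one of two values."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) - threshold > 0 else 0

# A unit that fires only when both of its inputs are active
fires = perceptron([1, 1], [0.6, 0.6], threshold=1.0)  # 1.2 - 1.0 > 0, so it fires
```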

In 1959, Bernard Widrow and Marcian Hoff of Stanford developed the first neural network models to be applied to a real-world problem. Their models, ADALINE and MADALINE, were adaptive filters built to eliminate echo on telephone lines.

Research in neural networks then went through a dark age until 1982, when John Hopfield of Caltech presented a paper to the National Academy of Sciences showing, through mathematical analysis, what could and could not be achieved with neural networks. In the two decades since, neural networks have become a "hot" research topic, embraced by commercial interests and applied to numerous business problems.

Biological vs Artificial Neurones

Biological neurones consist of a cell body or soma containing a nucleus; branch-like dendrites that transfer information via synapses from surrounding cells to the soma; and an axon that carries the nerve impulse from the soma to its target structure.

The artificial neurone is a grossly simplified model of the biological specimen, containing the same four basic elements: synapses, dendrites, soma, and axon.

The synapses and dendrites of the artificial neurone are the inputs to its processing element (the soma). Each input has an associated connection weight, which simulates the strength of a particular synaptic connection. The processing element multiplies each input by its connection weight and usually sums these products; the sum is then passed to a transfer function, which generates a result that is transmitted via the output path. The transfer function dictates the firing of the neurone, and may be based on a fixed threshold level, a linear function, or a sigmoid function in which the threshold for output varies. Neurones can be classed as excitatory or inhibitory depending on the effect their output has on the output of a target neurone.
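As a rough sketch, a single artificial neurone with a sigmoid transfer function might look like this in Python; all the weights, inputs, and names are illustrative:

```python
import math

def neurone_output(inputs, weights, threshold):
    """Weighted sum of inputs, minus a threshold, passed through a
    sigmoid transfer function."""
    activation = sum(x * w for x, w in zip(inputs, weights)) - threshold
    return 1.0 / (1.0 + math.exp(-activation))  # sigmoid squashes to (0, 1)

# Two excitatory inputs and one inhibitory (negative-weight) input
out = neurone_output([1.0, 0.5, 1.0], [0.8, 0.4, -0.6], threshold=0.2)
```

Here the negative weight plays the role of an inhibitory connection: it pulls the activation, and hence the output, downwards.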


An approximation of the 3-dimensional interconnectedness of biological neurones is achieved in artificial neural networks by the use of layers.

The three types of layers involved are input, output, and hidden. The input layer accepts some kind of real-world stimulus. This is transmitted to one or more hidden layers, which process the input information according to their connections with the input layer and the weights of those connections. The results of these transformations are then passed to the output layer, processed again with regard to connections and weights, and communicated to the user or environment.

Layer connection types

Some degree of complexity is achieved in an artificial neural network through the application of different systems of layer connection. Between-layer (inter-layer) connections can use the following systems:

Fully Connected
Every neurone of the first layer is connected to every neurone of the second layer.

Partially Connected
Not all neurones in the first layer are connected to the second layer.

Feed Forward
Information flows from the first layer to the second layer without any feedback from the second layer.

Bi-directional
Pathways exist to allow the output of the second layer to become the input of the first layer.

Hierarchical
Neurones of one level may only communicate with neurones of the adjacent level.

Resonance
The use of bi-directional connections to facilitate reaching a target condition.

The complexity of layer connections can be increased by the use of two types of intra-layer systems.


Neural networks are trained towards specific outputs by imposing a learning scheme. Learning occurs when a network alters the weights of its component connections so as to bring itself closer to a desired output or problem solution. Three learning schemes are used:

Unsupervised learning
The hidden layers organise the weights of their connections without influence from outside the network.

Reinforcement/supervised learning
The weights of the connections in the hidden layers are randomised, and the resultant output is graded by an instructor or a target data set according to how near it is to the desired output.

Back propagation
A highly successful method of training multi-level networks, in which the hidden layers receive not only feedback on proximity to target outputs but also information on error levels.

In off-line methods of learning, the connection weights are fixed once the network is in operation. Most networks are off-line.

On-line or real-time learning is where the system continues to learn while being used as a decision tool, such as in a decision support system. This type of learning requires a more complex architecture.
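Back propagation can be sketched with a tiny two-layer network learning the XOR function; the network size, learning rate, and number of training passes below are arbitrary choices for illustration:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# 2 inputs -> 2 hidden neurones -> 1 output; each row ends with a bias weight
w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-1, 1) for _ in range(3)]

data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    o = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, o

def mean_squared_error():
    return sum((t - forward(x)[1]) ** 2 for x, t in data) / len(data)

before = mean_squared_error()
lr = 0.5
for _ in range(5000):                   # on-line updates, one pattern at a time
    for x, t in data:
        h, o = forward(x)
        d_o = (o - t) * o * (1 - o)     # error signal at the output neurone
        d_h = [d_o * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]  # propagated back
        for i in range(2):              # adjust output-layer weights
            w_o[i] -= lr * d_o * h[i]
        w_o[2] -= lr * d_o
        for i in range(2):              # adjust hidden-layer weights
            w_h[i][0] -= lr * d_h[i] * x[0]
            w_h[i][1] -= lr * d_h[i] * x[1]
            w_h[i][2] -= lr * d_h[i]
after = mean_squared_error()
```

After training, the network's mean squared error on the XOR patterns is lower than it was at the random starting weights, which is exactly the "error level" information the scheme feeds back through the hidden layer.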

Learning Laws

Learning laws are mathematical algorithms that dictate how the connection weights of a neural network are altered during learning. This is again a crude approximation of biological function, as our knowledge of biological learning systems is incomplete. Some of the major laws are:

Hebb's Rule
If a neurone receives input from another neurone, and both are highly active, the weight between them should be strengthened.

Hopfield Law
If the desired output and the input are both active or both inactive, increment the connection weight by the learning rate (usually a positive number between 0 and 1); otherwise, decrement it by the learning rate.

The Delta Rule / Least Mean Squared Rule / Widrow-Hoff Rule
A rule in which the input connection weights are continuously modified to reduce the difference (delta) between the desired output and the actual output of the neurone. The rule aims to minimise the mean squared error of the network; error data is back-propagated through the layers in sequence until the first layer is reached.

Kohonen's Learning Law
A law where neurones compete for the opportunity to change their connection weights. The neurone with the largest output is given the power to inhibit its competitors and excite its neighbours.
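The delta rule can be sketched for a single linear neurone. In this illustration the training samples are drawn from the made-up target function y = 2*x1 - x2, and the learning rate is an arbitrary choice:

```python
weights = [0.0, 0.0]
learning_rate = 0.1

# Illustrative samples of the target function y = 2*x1 - x2
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]

for _ in range(200):
    for inputs, desired in samples:
        actual = sum(x * w for x, w in zip(inputs, weights))
        delta = desired - actual                     # the "delta" in the rule's name
        for i, x in enumerate(inputs):
            weights[i] += learning_rate * delta * x  # shrink the squared error
```

Repeated application of the update drives the weights towards [2, -1], the values that reproduce the target function exactly.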


Most applications of neural networks fall into five functional categories:

Prediction
The use of some input values to predict an output. Examples can be seen in stock market prediction, cardiovascular disease prediction, and airline booking systems.

Classification
The use of input values to classify objects. Pattern recognition is a major application area of neural networks, and includes tasks like handwriting recognition, text-to-speech conversion, diagnosis of disease, optical quality control, image processing, and chemical analysis.

Data association
Similar to the classification of objects, but including feedback on errors in the system. This can be found in fault-tolerant systems.

Data conceptualisation
Analysing inputs so that grouping relationships can be inferred. This finds use in the creation of market demographics and in data management.

Data filtering
The use of a neural network to reduce errors or noise in an input. This capability is found in image processing and in improving the signal-to-noise ratio of communication systems.

Neural networks, and parallel systems in general, are becoming a major force in high-end computing, and industry has taken over what was once an obscure research field. IBM is now producing large parallel systems, while at the other end of the market, low-cost supercomputers called Beowulf clusters, built from scavenged PCs, are being assembled in basements and garages around the world.
Despite this, the intelligence of HAL, the paranoid onboard computer in Stanley Kubrick's 2001: A Space Odyssey, still seems to be science fiction for the moment.

