A neural network is a network or circuit of neurons, or in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus a neural network is either a biological neural network, made up of real biological neurons, or an artificial neural network, for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled as weights. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1 and 1.
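The weighted sum and activation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `neuron_output`, the example inputs and weights, and the choice of the logistic sigmoid and hyperbolic tangent as activation functions are assumptions, not something the text prescribes.

```python
import math

def neuron_output(inputs, weights, activation="sigmoid"):
    """Weighted sum of the inputs followed by an activation function."""
    # Linear combination: each input is scaled by its weight, then summed.
    z = sum(w * x for w, x in zip(weights, inputs))
    if activation == "sigmoid":
        return 1.0 / (1.0 + math.exp(-z))  # output lies between 0 and 1
    if activation == "tanh":
        return math.tanh(z)                # output lies between -1 and 1
    raise ValueError(f"unknown activation: {activation}")

inputs = [1.0, 0.5]
weights = [2.0, -4.0]  # positive = excitatory, negative = inhibitory

# Here the excitatory and inhibitory contributions cancel (z = 0).
print(neuron_output(inputs, weights))                     # 0.5
print(neuron_output(inputs, weights, activation="tanh"))  # 0.0
```

The two activation choices correspond to the two output ranges mentioned above: the sigmoid keeps the output between 0 and 1, the hyperbolic tangent between −1 and 1.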
These artificial networks may be used for predictive modeling, adaptive control and applications where they can be trained via a dataset. Self-learning resulting from experience can occur within networks, which can derive conclusions from a complex and seemingly unrelated set of information.
A biological neural network is composed of a group of chemically connected or functionally associated neurons. A single neuron may be connected to many other neurons and the total number of neurons and connections in a network may be extensive. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic synapses and other connections are possible. Apart from the electrical signaling, there are other forms of signaling that arise from neurotransmitter diffusion.
Artificial intelligence, cognitive modeling, and neural networks are information processing paradigms inspired by the way biological neural systems process data. Artificial intelligence and cognitive modeling try to simulate some properties of biological neural networks. In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image analysis and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots.
Historically, digital computers evolved from the von Neumann model, and operate by executing explicit instructions, with a number of processors accessing memory. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems. Unlike the von Neumann model, neural network computing does not separate memory and processing.
Neural network theory has served both to better identify how the neurons in the brain function and to provide the basis for efforts to create artificial intelligence.
The preliminary theoretical base for contemporary neural networks was independently proposed by Alexander Bain (1873) and William James (1890). In their work, both thoughts and body activity resulted from interactions among neurons within the brain.
For Bain, every activity led to the firing of a certain set of neurons. When activities were repeated, the connections between those neurons strengthened. According to his theory, this repetition was what led to the formation of memory. The general scientific community at the time was skeptical of Bain's theory because it required what appeared to be an inordinate number of neural connections within the brain. It is now apparent that the brain is exceedingly complex and that the same brain “wiring” can handle multiple problems and inputs.
James's theory was similar to Bain's; however, he suggested that memories and actions resulted from electrical currents flowing among the neurons in the brain. His model, by focusing on the flow of electrical currents, did not require individual neural connections for each memory or action.
C. S. Sherrington (1898) conducted experiments to test James's theory. He ran electrical currents down the spinal cords of rats. However, instead of demonstrating an increase in electrical current as projected by James, Sherrington found that the electrical current strength decreased as the testing continued over time. Importantly, this work led to the discovery of the concept of habituation.
Warren McCulloch and Walter Pitts (1943) created a computational model for neural networks based on mathematics and algorithms. They called this model threshold logic. The model paved the way for neural network research to split into two distinct approaches. One approach focused on biological processes in the brain and the other focused on the application of neural networks to artificial intelligence.
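The idea behind threshold logic can be illustrated with a unit that outputs 1 when its weighted input sum reaches a threshold and 0 otherwise. The sketch below is a simplified illustration, not the 1943 formulation itself; the function names and the particular weights and thresholds are assumptions chosen so that the units behave like basic logic gates.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 (fire) when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def logical_and(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return threshold_unit([a, b], [1, 1], threshold=2)

def logical_or(a, b):
    # A single active input is enough to reach a threshold of 1.
    return threshold_unit([a, b], [1, 1], threshold=1)

def logical_not(a):
    # An inhibitory (negative) weight keeps the unit silent when a is 1.
    return threshold_unit([a], [-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, logical_and(a, b), logical_or(a, b))
```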
In the late 1940s psychologist Donald Hebb created a hypothesis of learning based on the mechanism of neural plasticity that is now known as Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule and its later variants were early models for long-term potentiation. These ideas started being applied to computational models in 1948 with Turing's B-type machines.
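A common textbook form of the Hebbian rule strengthens a connection in proportion to the joint activity of the units it links. The sketch below assumes that simple form; the function name `hebbian_update`, the learning rate, and the toy activity pattern are illustrative assumptions rather than Hebb's own formulation.

```python
def hebbian_update(weights, inputs, output, learning_rate=0.1):
    """Strengthen each weight in proportion to input activity times output activity."""
    return [w + learning_rate * x * output for w, x in zip(weights, inputs)]

weights = [0.0, 0.0, 0.0]
pattern = [1.0, 1.0, 0.0]  # the first two inputs are active, the third is silent

for _ in range(5):
    output = 1.0  # assumption: the receiving unit is active whenever the pattern appears
    weights = hebbian_update(weights, pattern, output)

# Connections from the co-active inputs have grown; the silent one is unchanged.
print(weights)
```

No target output is supplied at any point, which is why the rule counts as unsupervised: the weights change only as a function of the activity that passes through them.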
Farley and Clark (1954) first used computational machines, then called calculators, to simulate a Hebbian network at MIT. Other neural network computational machines were created by Rochester, Holland, Haibt, and Duda (1956).
Rosenblatt (1958) created the perceptron, an algorithm for pattern recognition based on a two-layer learning computer network using simple addition and subtraction. With mathematical notation, Rosenblatt also described circuitry not in the basic perceptron, such as the exclusive-or circuit, a circuit whose mathematical computation could not be processed until after the backpropagation algorithm was created by Werbos (1975).
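The perceptron's use of simple addition and subtraction can be illustrated with the classic perceptron learning rule. This is a sketch under stated assumptions, not Rosenblatt's original hardware or notation: the bias term, the zero initial weights, the unit learning rate and the function names are all illustrative choices.

```python
def predict(weights, bias, x):
    """Step activation applied to the weighted input sum."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=20):
    """Perceptron learning rule: add or subtract the input after each mistake."""
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
    return weights, bias

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for name, samples in (("AND", AND), ("XOR", XOR)):
    weights, bias = train_perceptron(samples)
    correct = sum(predict(weights, bias, x) == t for x, t in samples)
    print(name, "correct:", correct, "of", len(samples))
```

The AND function is linearly separable, so the rule settles on weights that classify all four cases. For exclusive-or no single line separates the two classes, so at least one case is always misclassified however long training runs.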
Neural network research stagnated after the publication of machine learning research by Marvin Minsky and Seymour Papert (1969). They discovered two key issues with the computational machines that processed neural networks. The first issue was that single-layer neural networks were incapable of processing the exclusive-or circuit. The second significant issue was that computers were not sophisticated enough to effectively handle the long run time required by large neural networks. Neural network research slowed until computers achieved greater processing power. Also key in later advances was the backpropagation algorithm, which effectively solved the exclusive-or problem (Werbos 1975).
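To see why an extra layer removes the exclusive-or limitation, consider the small network sketched below. Its weights are picked by hand purely for illustration; backpropagation is a procedure for finding suitable weights automatically and is not implemented here. The function names and the specific weight values are assumptions.

```python
def step(total):
    """Threshold activation: fire when the input sum is positive."""
    return 1 if total > 0 else 0

def xor_network(a, b):
    # Hidden layer: one unit detects "a OR b", the other detects "a AND b".
    h_or = step(a + b - 0.5)
    h_and = step(a + b - 1.5)
    # Output unit: active when OR fires and AND does not.
    return step(h_or - h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

Each hidden unit draws one line through the input space, and the output unit combines the two regions, which a single-layer network cannot do.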
The parallel distributed processing of the mid-1980s became popular under the name connectionism. The text by Rumelhart and McClelland (1986) provided a full exposition on the use of connectionism in computers to simulate neural processes.
Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and brain biological architecture is debated, as it is not clear to what degree artificial neural networks mirror brain function.
A ''neural network'' (NN), in the case of artificial neurons called ''artificial neural network'' (ANN) or ''simulated neural network'' (SNN), is an interconnected group of natural or artificial neurons that uses a mathematical or computational model for information processing based on a connectionistic approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms, neural networks are non-linear statistical data modeling or decision making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters. Artificial neurons were first proposed in 1943 by Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician, who first collaborated at the University of Chicago.
One classical type of artificial neural network is the recurrent Hopfield network.
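As an illustration of how simple connected elements can produce global behavior, the sketch below builds a very small Hopfield-style network that stores one pattern and recovers it from a corrupted copy. The Hebbian-style storage rule, the fixed sequential update order, the pattern itself and the function names are all assumptions made for this example.

```python
def train_hopfield(patterns):
    """Build symmetric weights from patterns of +1/-1 values (no self-connections)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, sweeps=5):
    """Update units one at a time until the state settles."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            total = sum(weights[i][j] * state[j] for j in range(n))
            state[i] = 1 if total >= 0 else -1
    return state

stored = [1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored])

noisy = [1, -1, 1, -1, 1, 1]   # the last element has been flipped
print(recall(weights, noisy))  # [1, -1, 1, -1, 1, -1]
```

The stored pattern acts as a stable state of the network: each unit only looks at its own weighted inputs, yet together the units pull the corrupted state back to the stored one.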
The concept of a neural network appears to have first been proposed by Alan Turing in his 1948 paper ''Intelligent Machinery'' in which he called them "B-type unorganised machines".
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it. Unsupervised neural networks can also be used to learn representations of the input that capture the salient characteristics of the input distribution, e.g., see the Boltzmann machine (1983), and more recently, deep learning algorithms, which can implicitly learn the distribution function of the observed data. Learning in neural networks is particularly useful in applications where the complexity of the data or task makes the design of such functions by hand impractical.
Neural networks can be used in different fields. The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
* Function approximation, or regression analysis, including time series prediction.
* Classification, including pattern and sequence recognition, novelty detection and sequential decision making.
* Data processing, including filtering, clustering and blind signal separation.
Application areas of ANNs include nonlinear system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering. For example, it is possible to create a semantic profile of a user's interests emerging from pictures trained for object recognition.
Theoretical and computational neuroscience is the field concerned with the analysis and computational modeling of biological neural systems.
Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.
The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).
Types of models
Many models are used, defined at different levels of abstraction and modeling different aspects of neural systems. They range from models of the short-term behaviour of individual neurons, through models of the dynamics of neural circuitry arising from interactions between individual neurons, to models of behaviour arising from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and their relation to learning and memory, from the individual neuron to the system level.
In August 2020, scientists reported that bi-directional connections, or added appropriate feedback connections, can accelerate and improve communication between and in modular neural networks of the brain's cerebral cortex and lower the threshold for their successful communication. They showed that adding feedback connections between a resonance pair can support successful propagation of a single pulse packet throughout the entire network.
A common criticism of neural networks, particularly in robotics, is that they require a large diversity of training samples for real-world operation. This is not surprising, since any learning machine needs sufficient representative examples in order to capture the underlying structure that allows it to generalize to new cases. Dean Pomerleau, in his research presented in the paper "Knowledge-based Training of Artificial Neural Networks for Autonomous Robot Driving," uses a neural network to train a robotic vehicle to drive on multiple types of roads (single lane, multi-lane, dirt, etc.). A large amount of his research is devoted to (1) extrapolating multiple training scenarios from a single training experience, and (2) preserving past training diversity so that the system does not become overtrained (if, for example, it is presented with a series of right turns—it should not learn to always turn right). These issues are common in neural networks that must decide from amongst a wide variety of responses, but can be dealt with in several ways, for example by randomly shuffling the training examples, by using a numerical optimization algorithm that does not take too large steps when changing the network connections following an example, or by grouping examples in so-called mini-batches.
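The three remedies mentioned above (shuffling, small optimization steps and mini-batches) can be sketched together on a toy problem. Everything in this example is a hypothetical illustration: the one-weight model `y = weight * x`, the toy data, the learning rate and the function names are assumptions, and the sketch is unrelated to Pomerleau's actual system.

```python
import random

def minibatches(examples, batch_size, rng):
    """Yield the examples in random order, grouped into mini-batches."""
    shuffled = list(examples)
    rng.shuffle(shuffled)  # break up runs of similar examples
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

def sgd_step(weight, batch, learning_rate):
    """One gradient step on the mean squared error of y = weight * x."""
    gradient = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
    return weight - learning_rate * gradient

# Toy data (hypothetical): targets follow y = 3 * x and arrive sorted by x.
examples = [(i / 10, 3 * (i / 10)) for i in range(-10, 11)]

rng = random.Random(0)  # fixed seed so the run is repeatable
weight = 0.0
for epoch in range(20):
    for batch in minibatches(examples, batch_size=4, rng=rng):
        weight = sgd_step(weight, batch, learning_rate=0.5)

print(round(weight, 3))  # close to 3.0
```

Averaging the gradient over a shuffled mini-batch means that no single example, and no long run of similar examples, dominates an update.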
A. K. Dewdney, a former ''Scientific American'' columnist, wrote in 1997, "Although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool" (Dewdney, p. 82).
Arguments for Dewdney's position are that implementing large and effective software neural networks requires committing substantial processing and storage resources. While the brain has hardware tailored to the task of processing signals through a graph of neurons, simulating even a most simplified form on Von Neumann technology may compel a neural network designer to fill many millions of database rows for its connections, which can consume vast amounts of computer memory and data storage capacity. Furthermore, the designer of neural network systems will often need to simulate the transmission of signals through many of these connections and their associated neurons, which must often be matched with incredible amounts of CPU processing power and time. While neural networks often yield ''effective'' programs, they too often do so at the cost of ''efficiency'' (they tend to consume considerable amounts of time and money).
Arguments against Dewdney's position are that neural nets have been successfully used to solve many complex and diverse tasks, such as autonomously flying aircraft.
Technology writer Roger Bridgman commented on Dewdney's statements about neural nets:
Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource".
In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers. An unreadable table that a useful machine could read would still be well worth having.
Although it is true that analyzing what has been learned by an artificial neural network is difficult, it is much easier to do so than to analyze what has been learned by a biological neural network. Moreover, recent emphasis on the explainability of AI has contributed towards the development of methods, notably those based on attention mechanisms, for visualizing and explaining learned neural networks. Furthermore, researchers involved in exploring learning algorithms for neural networks are gradually uncovering generic principles that allow a learning machine to be successful. For example, Bengio and LeCun (2007) wrote an article regarding local vs non-local learning, as well as shallow vs deep architecture.
Some other criticisms came from believers of hybrid models (combining neural networks and symbolic approaches). They advocate intermixing these two approaches and believe that hybrid models can better capture the mechanisms of the human mind (Sun and Bookman, 1990).
While initially research had been concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine and serotonin on behaviour and learning.
Biophysical models, such as BCM theory, have been important in understanding mechanisms for synaptic plasticity, and have had applications in both computer science and neuroscience. Research is ongoing in understanding the computational algorithms used in the brain, with some recent biological evidence for radial basis networks and neural backpropagation as mechanisms for processing data.
Computational devices have been created in CMOS for both biophysical simulation and neuromorphic computing. More recent efforts show promise for creating nanodevices for very large scale principal components analyses and convolution. If successful, these efforts could usher in a new era of neural computing that is a step beyond digital computing, because it depends on learning rather than programming and because it is fundamentally analog rather than digital, even though the first instantiations may in fact be with CMOS digital devices.
Between 2009 and 2012, the recurrent neural networks and deep feedforward neural networks developed in the research group of Jürgen Schmidhuber at the Swiss AI Lab IDSIA have won eight international competitions in pattern recognition and machine learning. For example, multi-dimensional long short-term memory (LSTM) won three competitions in connected handwriting recognition at the 2009 International Conference on Document Analysis and Recognition (ICDAR), without any prior knowledge about the three different languages to be learned.
Variants of the back-propagation algorithm as well as unsupervised methods by Geoff Hinton and colleagues at the University of Toronto can be used to train deep, highly nonlinear neural architectures, similar to the 1980 Neocognitron by Kunihiko Fukushima, and the "standard architecture of vision", inspired by the simple and complex cells identified by David H. Hubel and Torsten Wiesel in the primary visual cortex.
Radial basis function and wavelet networks have also been introduced. These can be shown to offer best approximation properties and have been applied in nonlinear system identification and classification applications.
Deep learning feedforward networks alternate convolutional layers and max-pooling layers, topped by several pure classification layers. Fast GPU-based implementations of this approach have won several pattern recognition contests, including the IJCNN 2011 Traffic Sign Recognition Competition and the ISBI 2012 Segmentation of Neuronal Structures in Electron Microscopy Stacks challenge. Such neural networks also were the first artificial pattern recognizers to achieve human-competitive or even superhuman performance on benchmarks such as traffic sign recognition (IJCNN 2012), or the MNIST handwritten digits problem of Yann LeCun and colleagues at NYU [D. C. Ciresan, U. Meier, J. Schmidhuber. Multi-column Deep Neural Networks for Image Classification. IEEE Conf. on Computer Vision and Pattern Recognition CVPR 2012].
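The alternation of a convolutional layer and a max-pooling layer can be sketched in plain Python. This is a minimal illustration rather than any of the GPU implementations mentioned above: the function names, the tiny image and the edge-detecting kernel are assumptions, and the convolution is written without flipping the kernel, as is common in neural network practice.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size-by-size block."""
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# A 5x5 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# A kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

features = convolve2d(image, kernel)  # 4x4 feature map
pooled = max_pool(features)           # 2x2 summary
print(features[0])  # [0, 2, 0, 0]
print(pooled)       # [[2, 0], [2, 0]]
```

The convolution marks where the edge occurs, and pooling keeps that response while halving the resolution, which is what lets later layers work on a smaller, more abstract map.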