Connectionism
Connectionism refers to both an approach in the field of cognitive science that hopes to explain mental phenomena using artificial neural networks (ANNs) and to a wide range of techniques and algorithms using ANNs in the context of artificial intelligence to build more intelligent machines. Connectionism presents a cognitive theory based on simultaneously occurring, distributed signal activity via connections that can be represented numerically, where learning occurs by modifying connection strengths based on experience. Some advantages of the connectionist approach include its applicability to a broad array of functions, structural approximation to biological neurons, low requirements for innate structure, and capacity for graceful degradation. Some disadvantages include the difficulty in deciphering how ANNs process information or account for the compositionality of mental representations, and a resultant difficulty explaining phenomena at a higher level. The success of deep learning networks in the past decade has greatly increased the popularity of this approach, but the complexity and scale of such networks have brought with them increased interpretability problems. Connectionism is seen by many to offer an alternative to classical theories of mind based on symbolic computation, but the extent to which the two approaches are compatible has been the subject of much debate since their inception.


Basic principles

The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. The form of the connections and the units can vary from model to model. For example, units in the network could represent neurons and the connections could represent synapses, as in the human brain.


Spreading activation

In most connectionist models, networks change over time. A closely related and very common aspect of connectionist models is ''activation''. At any time, a unit in the network has an activation, which is a numerical value intended to represent some aspect of the unit. For example, if the units in the model are neurons, the activation could represent the probability that the neuron would generate an action potential spike. Activation typically spreads to all the other units connected to it. Spreading activation is always a feature of neural network models, and it is very common in connectionist models used by cognitive psychologists.
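
A minimal sketch of spreading activation in code, assuming a small hand-built network; the weight matrix, decay factor, and clipping below are illustrative choices rather than a standard model:

```python
import numpy as np

# Hypothetical 4-unit network: W[i, j] is the strength of the
# connection from unit j to unit i (all values made up).
W = np.array([
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.3],
    [0.0, 0.0, 0.3, 0.0],
])

def spread(activation, steps=3, decay=0.5):
    """Propagate activation along the connections for a few time steps.

    At each step a unit keeps a decayed share of its own activation
    and receives the weighted activation of its neighbours.
    """
    a = activation.copy()
    for _ in range(steps):
        a = np.clip(decay * a + W @ a, 0.0, 1.0)  # keep activations bounded
    return a

# Activate the first unit and watch activity spread to connected units.
print(spread(np.array([1.0, 0.0, 0.0, 0.0])))
```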


Neural networks

Neural networks are by far the most commonly used connectionist model today. Though there is a large variety of neural network models, they almost always follow two basic principles regarding the mind:
# Any mental state can be described as an N-dimensional vector of numeric activation values over neural units in a network.
# Memory is created by modifying the strength of the connections between neural units. The connection strengths, or "weights", are generally represented as an N×M matrix.

Most of the variety among neural network models comes from:
* ''Interpretation of units'': Units can be interpreted as neurons or groups of neurons.
* ''Definition of activation'': Activation can be defined in a variety of ways. For example, in a Boltzmann machine, the activation is interpreted as the probability of generating an action potential spike, and is determined via a logistic function on the sum of the inputs to a unit.
* ''Learning algorithm'': Different networks modify their connections differently. In general, any mathematically defined change in connection weights over time is referred to as the "learning algorithm".

Connectionists are in agreement that recurrent neural networks (directed networks in which the connections can form a directed cycle) are a better model of the brain than feedforward neural networks (directed networks with no cycles, i.e. directed acyclic graphs). Many recurrent connectionist models also incorporate dynamical systems theory. Many researchers, such as the connectionist Paul Smolensky, have argued that connectionist models will evolve toward fully continuous, high-dimensional, non-linear, dynamic systems approaches.
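
As a concrete illustration of the Boltzmann-machine-style definition of activation mentioned above, a sketch of a single stochastic unit whose firing probability is the logistic function of its summed, weighted input (the inputs and weights are made-up values):

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(x):
    """Logistic (sigmoid) function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def stochastic_unit(inputs, weights, bias=0.0):
    """Fire with probability given by the logistic function of the
    summed, weighted input, as in a Boltzmann machine unit."""
    p_fire = logistic(np.dot(weights, inputs) + bias)
    return int(rng.random() < p_fire), p_fire

# Illustrative inputs and weights (not taken from any particular model).
inputs = np.array([1.0, 0.0, 1.0])
weights = np.array([0.7, -0.4, 0.2])
spike, p = stochastic_unit(inputs, weights)
print(f"P(spike) = {p:.3f}, sampled spike = {spike}")
```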


Biological realism

Connectionist work in general does not need to be biologically realistic and therefore suffers from a lack of neuroscientific plausibility. However, the structure of neural networks is derived from that of biological neurons, and this parallel in low-level structure is often argued to be an advantage of connectionism in modeling cognitive structures compared with other approaches. One area where connectionist models are thought to be biologically implausible is with respect to the error-propagation networks that are needed to support learning. Even so, error propagation can account for some of the biologically generated electrical activity seen at the scalp in event-related potentials such as the N400 and P600, and this provides some biological support for one of the key assumptions of connectionist learning procedures.


Learning

The weights in a neural network are adjusted according to some learning rule or algorithm, such as Hebbian learning, and connectionists have created many sophisticated learning procedures for neural networks. Learning always involves modifying the connection weights. In general, learning rules are mathematical formulas that determine the change in weights given sets of data consisting of activation vectors for some subset of the neural units. Several studies have focused on designing teaching-learning methods based on connectionism. By formalizing learning in this way, connectionists gain access to many mathematical tools. A very common strategy in connectionist learning methods is to incorporate gradient descent over an error surface in a space defined by the weight matrix. All gradient descent learning in connectionist models involves changing each weight by the partial derivative of the error surface with respect to the weight. Backpropagation (BP), first made popular in the 1980s, is probably the most commonly known connectionist gradient descent algorithm today. Connectionism can be traced to ideas more than a century old, which were little more than speculation until the mid-to-late 20th century.
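
A minimal sketch of gradient descent learning for a single linear unit with a squared-error surface (essentially the delta rule); the data, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative training data: inputs X and targets y for a single
# linear unit y_hat = X @ w, with error E = 0.5 * sum((y - y_hat)**2).
X = rng.normal(size=(20, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

w = np.zeros(3)    # initial weights
lr = 0.01          # learning rate (step size)

for epoch in range(500):
    y_hat = X @ w
    # dE/dw: the partial derivative of the error surface w.r.t. each weight.
    grad = -X.T @ (y - y_hat)
    w -= lr * grad  # step against the gradient to reduce the error

print(w)  # converges toward [1.5, -2.0, 0.5]
```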


Parallel distributed processing

The prevailing connectionist approach today was originally known as parallel distributed processing (PDP). It was an artificial neural network approach that stressed the parallel nature of neural processing and the distributed nature of neural representations. It provided a general mathematical framework for researchers to operate in. The framework involved eight major aspects:
* A set of ''processing units'', represented by a set of integers.
* An ''activation'' for each unit, represented by a vector of time-dependent functions.
* An ''output function'' for each unit, represented by a vector of functions on the activations.
* A ''pattern of connectivity'' among units, represented by a matrix of real numbers indicating connection strength.
* A ''propagation rule'' spreading the activations via the connections, represented by a function on the output of the units.
* An ''activation rule'' for combining inputs to a unit to determine its new activation, represented by a function on the current activation and propagation.
* A ''learning rule'' for modifying connections based on experience, represented by a change in the weights based on any number of variables.
* An ''environment'' that provides the system with experience, represented by sets of activation vectors for some subset of the units.

A lot of the research that led to the development of PDP was done in the 1970s, but PDP became popular in the 1980s with the release of the books ''Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations'' and ''Volume 2: Psychological and Biological Models'' by James L. McClelland, David E. Rumelhart and the PDP Research Group.
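
To make the eight components of the framework concrete, a minimal sketch of a PDP-style system in code; the network size and the particular output, propagation, activation, and learning rules, as well as the environment, are all illustrative assumptions rather than anything prescribed by the PDP volumes:

```python
import numpy as np

rng = np.random.default_rng(2)

N = 4
units = np.arange(N)                    # processing units
a = rng.uniform(size=N)                 # activation for each unit
W = rng.normal(scale=0.5, size=(N, N))  # pattern of connectivity

def output(a):
    """Output function: here, simply the identity on activations."""
    return a

def propagate(W, out):
    """Propagation rule: net input is the weighted sum of outputs."""
    return W @ out

def activate(a, net, decay=0.2):
    """Activation rule: combine current activation with net input."""
    return np.tanh((1 - decay) * a + net)

def learn(W, a, lr=0.01):
    """Learning rule: a simple Hebbian update (illustrative choice)."""
    return W + lr * np.outer(a, a)

# Environment: sets of activation vectors clamped onto the units.
environment = [rng.uniform(size=N) for _ in range(3)]

for pattern in environment:
    a = pattern                         # experience from the environment
    for _ in range(5):                  # let activity settle
        a = activate(a, propagate(W, output(a)))
    W = learn(W, a)                     # modify connections

print(np.round(a, 3))
print(np.round(W, 3))
```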