A Logical Calculus of the Ideas Immanent in Nervous Activity

"A Logical Calculus of the Ideas Immanent to Nervous Activity" is a 1943 article written by
Warren McCulloch Warren Sturgis McCulloch (November 16, 1898 – September 24, 1969) was an American neurophysiologist and cybernetician known for his work on the foundation for certain brain theories and his contribution to the cybernetics movement.Ken Aizawa ...
and
Walter Pitts Walter Harry Pitts, Jr. (April 23, 1923 – May 14, 1969) was an American logician who worked in the field of computational neuroscience.Smalheiser, Neil R"Walter Pitts", ''Perspectives in Biology and Medicine'', Volume 43, Number 2, Wint ...
. The paper, published in the journal '' The Bulletin of Mathematical Biophysics,'' proposed a mathematical model of the nervous system as a network of simple logical elements, later known as artificial neurons, or McCulloch-Pitts neurons. These neurons receive inputs, perform a weighted sum, and fire an output signal based on a threshold function. By connecting these units in various configurations, McCulloch and Pitts demonstrated that their model could perform all logical functions. It is a seminal work in
cognitive science Cognitive science is the interdisciplinary, scientific study of the mind and its processes. It examines the nature, the tasks, and the functions of cognition (in a broad sense). Mental faculties of concern to cognitive scientists include percep ...
,
computational neuroscience Computational neuroscience (also known as theoretical neuroscience or mathematical neuroscience) is a branch of  neuroscience which employs mathematics, computer science, theoretical analysis and abstractions of the brain to understand th ...
,
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, and
artificial intelligence Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...
. It was a foundational result in
automata theory Automata theory is the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science with close connections to cognitive science and mathematical l ...
.
John von Neumann John von Neumann ( ; ; December 28, 1903 – February 8, 1957) was a Hungarian and American mathematician, physicist, computer scientist and engineer. Von Neumann had perhaps the widest coverage of any mathematician of his time, in ...
cited it as a significant result.von Neumann, J. (1951).
The general and logical theory of automata
'. In L. A. Jeffress (Ed.), ''Cerebral mechanisms in behavior; the Hixon Symposium'' (pp. 1–41). Wiley.


Mathematics

The artificial neuron used in the original paper is slightly different from the modern version. They considered neural networks that operate in discrete time steps t = 0, 1, \dots. The network contains a number of neurons. Let the state of neuron i at time t be N_i(t). The state can be either 0 or 1, standing for "not firing" and "firing". Each neuron i has a firing threshold \theta_i, such that it fires if the total input reaches the threshold. Each neuron can connect to any other neuron (including itself) with positive synapses (excitatory) or negative synapses (inhibitory). That is, each neuron can connect to another neuron with a weight w taking an integer value. A peripheral afferent is a neuron with no incoming synapses. We can regard each neural network as a directed graph, with the nodes being the neurons and the directed edges being the synapses. A neural network has a circle, or circuit, if there exists a directed cycle in the graph. Let w_{ij}(t) be the connection weight from neuron j to neuron i at time t; then its next state is N_i(t+1) = H\left( \sum_{j=1}^{n} w_{ij}(t) N_j(t) - \theta_i(t) \right), where H is the Heaviside step function (outputting 1 if the input is greater than or equal to 0, and 0 otherwise).
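The update rule above can be sketched in a few lines of code. This is a minimal illustration under the conventions just stated (integer weights, synchronous updates, H(x) = 1 for x ≥ 0); the variable names are ours, not the paper's.

```python
import numpy as np

def step(W, theta, N):
    """One synchronous update of a McCulloch-Pitts network.

    W[i, j] is the integer weight of the synapse from neuron j to neuron i,
    theta[i] is the firing threshold of neuron i, and N is the 0/1 state
    vector at time t.  Returns the state vector at time t + 1, using the
    Heaviside convention H(x) = 1 for x >= 0.
    """
    return (W @ N - theta >= 0).astype(int)

# A two-neuron example: neuron 1 fires iff neuron 0 fired at the previous
# step.  Neuron 0 is a peripheral afferent (no incoming synapses), so it
# falls silent after the initial state.
W = np.array([[0, 0],
              [1, 0]])
theta = np.array([1, 1])
N = np.array([1, 0])          # neuron 0 firing at t = 0
print(step(W, theta, N))      # -> [0 1]
```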


Symbolic logic

The paper used, as a logical language for describing neural networks, "Language II" from ''The Logical Syntax of Language'' by Rudolf Carnap, with some notations taken from ''Principia Mathematica'' by Alfred North Whitehead and Bertrand Russell. Language II covers substantial parts of classical mathematics, including real analysis and portions of set theory. To describe a neural network with peripheral afferents N_1, N_2, \dots, N_p and non-peripheral neurons N_{p+1}, N_{p+2}, \dots, N_n, they considered logical predicates of the form Pr(N_1, N_2, \dots, N_p, t), where Pr is a first-order logic predicate function (a function that outputs a boolean), N_1, \dots, N_p are predicates that take t as an argument, and t is the only free variable in the predicate. Intuitively speaking, N_1, \dots, N_p specify the binary input patterns going into the neural network over all time, and Pr is a function that takes some binary input patterns and constructs an output binary pattern Pr(N_1, \dots, N_p, 0), Pr(N_1, \dots, N_p, 1), \dots. A logical sentence Pr(N_1, N_2, \dots, N_p, t) is realized by a neural network iff there exist a time-delay T \geq 0, a neuron i in the network, and an initial state for the non-peripheral neurons N_{p+1}(0), \dots, N_n(0), such that for any time t, the truth-value of the logical sentence equals the state of neuron i at time t + T. That is, \forall t = 0, 1, 2, \dots, \quad Pr(N_1, N_2, \dots, N_p, t) = N_i(t + T).
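As a concrete illustration of realizability (our example, not one from the paper), the predicate Pr(N_1, N_2, t) = N_1(t) \wedge \neg N_2(t) is realized with delay T = 1 by a single output neuron with an excitatory weight +1 from N_1, an inhibitory weight -1 from N_2, and threshold 1:

```python
def and_not_net(inputs):
    """Realize Pr(N1, N2, t) = N1(t) AND NOT N2(t) with delay T = 1.

    `inputs` is a list of (N1, N2) pairs, one per time step.  The output
    neuron fires at time t + 1 iff 1*N1(t) - 1*N2(t) >= 1, i.e. iff the
    predicate held at time t.
    """
    out = []
    for n1, n2 in inputs:
        out.append(int(1 * n1 - 1 * n2 - 1 >= 0))
    return out

inputs = [(1, 0), (1, 1), (0, 0), (1, 0)]
print(and_not_net(inputs))  # -> [1, 0, 0, 1]
```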


Equivalence

In the paper, they considered some alternative definitions of artificial neural networks and showed them to be equivalent, that is, neural networks under one definition realize precisely the same logical sentences as neural networks under another definition.

They considered three forms of inhibition: relative inhibition, absolute inhibition, and extinction. The definition above is relative inhibition. By "absolute inhibition" they meant that if any negative synapse fires, then the neuron does not fire. By "extinction" they meant that if at time t any inhibitory synapse fires on a neuron i, then \theta_i(t + j) = \theta_i(0) + b_j for j = 1, 2, 3, \dots, until the next time an inhibitory synapse fires on i. It is required that b_j = 0 for all large j. Theorems 4 and 5 state that these are equivalent.

They considered three forms of excitation: spatial summation, temporal summation, and facilitation. The definition above is spatial summation (which they pictured as multiple synapses placed close together, so that the effects of their firing sum up). By "temporal summation" they meant that the total incoming signal is \sum_{\tau=0}^{T} \sum_{j=1}^{n} w_{ij}(t) N_j(t - \tau) for some T \geq 1. By "facilitation" they meant the same as extinction, except that b_j \leq 0. Theorem 6 states that these are equivalent.

They considered neural networks that do not change, and those that change by Hebbian learning. That is, they assumed that at t = 0 some excitatory synaptic connections are not active. If at any time t both N_i(t) = 1 and N_j(t) = 1, then any latent excitatory synapse between i and j becomes active. Theorem 7 states that these are equivalent.
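A minimal sketch (our illustration, not the paper's proof) of why absolute inhibition can be simulated under relative inhibition: give each inhibitory synapse a negative weight larger in magnitude than any possible excitatory total, so that a single inhibitory firing always vetoes the neuron.

```python
def fires(excitatory, inhibitory, theta, n_excitatory_max):
    """Veto-style (absolute) inhibition simulated via relative inhibition.

    `excitatory` and `inhibitory` are lists of 0/1 states of the incoming
    excitatory (weight +1) and inhibitory synapses.  Choosing the inhibitory
    weight -(n_excitatory_max + 1) guarantees that one inhibitory firing
    outweighs any excitatory total, reproducing the absolute-inhibition rule.
    """
    w_inh = -(n_excitatory_max + 1)
    total = sum(excitatory) + w_inh * sum(inhibitory)
    return int(total - theta >= 0)

# Three excitatory inputs, threshold 2: fires on 2-of-3, but never when inhibited.
print(fires([1, 1, 0], [0], theta=2, n_excitatory_max=3))  # -> 1
print(fires([1, 1, 1], [1], theta=2, n_excitatory_max=3))  # -> 0
```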


Logical expressivity

They considered "temporal propositional expressions" (TPE), which are propositional formulas with one free variable t . For example, N_1(t) \vee N_2(t) \wedge \neg N_3(t) is such an expression. Theorem 1 and 2 together showed that neural nets without circles are equivalent to TPE. For neural nets with loops, they noted that "realizable Pr may involve reference to past events of an indefinite degree of remoteness". These then encodes for sentences like "There was some x such that x was a ψ" or (\exists x) (\psi x). Theorems 8 to 10 showed that neural nets with loops can encode all first-order logic with equality and conversely, any looped neural networks is equivalent to a sentence in first-order logic with equality, thus showing that they are equivalent in logical expressiveness. As a remark, they noted that a neural network, if furnished with a tape, scanners, and write-heads, is equivalent to a
Turing machine A Turing machine is a mathematical model of computation describing an abstract machine that manipulates symbols on a strip of tape according to a table of rules. Despite the model's simplicity, it is capable of implementing any computer algori ...
, and conversely, every Turing machine is equivalent to some such neural network. Thus, these neural networks are equivalent to Turing computability, Church's lambda-definability, and
Kleene Stephen Cole Kleene ( ; January 5, 1909 – January 25, 1994) was an American mathematician. One of the students of Alonzo Church, Kleene, along with Rózsa Péter, Alan Turing, Emil Post, and others, is best known as a founder of the branch of ...
's primitive recursiveness.
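The simplest example of "reference to past events of an indefinite degree of remoteness" is a single neuron with a self-loop acting as a latch (our illustration): its state answers "did the input ever fire?", a sentence of the form (\exists t') N(t'), which no circle-free net of bounded depth can realize.

```python
def latch(input_spikes):
    """A one-neuron net with a self-loop: 'the input fired at some past time'.

    The output neuron has weight +1 from the input and weight +1 from itself,
    with threshold 1.  Once it fires, it keeps itself firing forever, so its
    state at time t + 1 reports whether the input fired at any time <= t --
    a reference to past events of indefinite remoteness.
    """
    state = 0
    history = []
    for x in input_spikes:
        state = int(x + state - 1 >= 0)
        history.append(state)
    return history

print(latch([0, 0, 1, 0, 0]))  # -> [0, 0, 1, 1, 1]
```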


Context


Previous work

The paper built upon several previous strands of work. On the symbolic logic side, it built on the previous work of Carnap, Whitehead, and Russell. This was contributed by Walter Pitts, who had a strong proficiency with symbolic logic. Pitts provided mathematical and logical rigor to McCulloch's vague ideas on psychons (atoms of psychological events) and circular causality. On the neuroscience side, it built on previous work by the mathematical biology research group centered around Nicolas Rashevsky, of which McCulloch was a member. The paper was published in the ''Bulletin of Mathematical Biophysics'', which Rashevsky founded in 1939. During the late 1930s, Rashevsky's research group was producing papers that had difficulty getting published in other journals at the time, so Rashevsky decided to found a new journal exclusively devoted to mathematical biophysics. Also in Rashevsky's group was Alston Scott Householder, who in 1941 published an abstract model of the steady-state activity of biological neural networks. The model, in modern language, is an artificial neural network with a ReLU activation function. In a series of papers, Householder calculated the stable states of very simple networks: a chain, a circle, and a bouquet. Walter Pitts' first two papers formulated a mathematical theory of learning and conditioning; the next three were mathematical developments of Householder's model.

In 1938, at age 15, Pitts ran away from home in Detroit and arrived at the University of Chicago. Later, he walked into Rudolf Carnap's office with Carnap's book filled with corrections and suggested improvements. He started studying under Carnap and attending classes during 1938–1943. He wrote several early papers on neuronal network modelling and regularly attended Rashevsky's seminars in theoretical biology. The seminar attendees included Gerhard von Bonin and Householder. In 1940, von Bonin introduced Lettvin to McCulloch. By 1942, both Lettvin and Pitts had moved into McCulloch's home. McCulloch had been interested in circular causality from studies of causalgia after amputation, the epileptic activity of the surgically isolated brain, and Lorente de Nò's research showing that recurrent neural networks are needed to explain vestibular nystagmus. He had difficulty treating circular causality until Pitts demonstrated how it can be handled with the appropriate mathematical tools of modular arithmetic and symbolic logic. Both authors' affiliation in the article was given as "University of Illinois, College of Medicine, Department of Psychiatry at the Illinois Neuropsychiatric Institute, University of Chicago, Chicago, U.S.A."


Subsequent work

It was a foundational result in automata theory. John von Neumann cited it as a significant result. This work led to research on neural networks and their link to finite automata. Kleene introduced the term "regular" for "regular language" in a 1951 technical report, in which he proved that regular languages are all that can be generated by neural networks, among other results. The term "regular" was meant to be suggestive of the "regularly occurring events" that the neural net automaton must process and respond to. Marvin Minsky was influenced by McCulloch, built an early example of a neural network, SNARC (1951), and wrote a PhD thesis on neural networks (1954).

McCulloch was the chair of the ten Macy conferences (1946–1953) on "Circular Causal and Feedback Mechanisms in Biological and Social Systems". This was a key event in the beginning of cybernetics, and of what later became known as cognitive science. Pitts also attended the conferences.

In the 1943 paper, they described how memories can be formed by a neural network with loops in it, or with alterable synapses, which operates over time and implements the logical universals "there exists" and "for all". This was generalized to spatial objects, such as geometric figures, in their 1947 paper ''How we know universals''. Norbert Wiener found this significant evidence for a general method by which animals recognize objects: scanning a scene through multiple transformations and finding a canonical representation. He hypothesized that this "scanning" activity is clocked by the alpha wave, which he mistakenly thought was tightly regulated at 10 Hz (rather than the 8–13 Hz range that modern research shows).

McCulloch worked with Manuel Blum on studying how a neural network can be "logically stable", that is, implement a boolean function even if the activation thresholds of individual neurons are varied (Blum, Manuel. "Properties of a neuron with many inputs." ''Bionics Symposium: Living Prototypes--the Key to New Technology, 13-14-15 September 1960''. WADD technical report, 60-600. 1961). They were inspired by the problem of how the brain can perform the same functions, such as breathing, under the influence of caffeine or alcohol, which shift the activation threshold over the entire brain.
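A small sketch of logical stability (our construction, not Blum's): scaling a neuron's weights opens a gap between its possible input sums, so the boolean function it computes is unchanged across a whole range of thresholds.

```python
from itertools import product

def majority(inputs, theta, w=10):
    """3-input majority neuron with scaled weights.

    Each input has weight w, so the possible input sums are 0, w, 2w, 3w.
    Any threshold theta with w < theta <= 2w yields the same boolean
    function (fire iff at least 2 of the 3 inputs fire): the gap between
    consecutive sums absorbs moderate threshold shifts.
    """
    return int(w * sum(inputs) - theta >= 0)

# The computed function is unchanged for every threshold in 11..20.
for theta in range(11, 21):
    for bits in product([0, 1], repeat=3):
        assert majority(bits, theta) == int(sum(bits) >= 2)
print("majority is logically stable for thresholds 11..20")
```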


See also

* Artificial neural network
* Perceptron
* Connectionism
* ''Principia Mathematica''
* History of artificial neural networks

