In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control. They were first formulated in a 1988 paper by Broomhead and Lowe, both researchers at the Royal Signals and Radar Establishment.
Network architecture

Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function and a linear output layer. The input can be modeled as a vector of real numbers \mathbf{x} \in \mathbb{R}^n. The output of the network is then a scalar function of the input vector, \varphi : \mathbb{R}^n \to \mathbb{R}, and is given by
: \varphi(\mathbf{x}) = \sum_{i=1}^N a_i \rho(\|\mathbf{x} - \mathbf{c}_i\|)
where N is the number of neurons in the hidden layer, \mathbf{c}_i is the center vector for neuron i, and a_i is the weight of neuron i in the linear output neuron. Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. The
norm is typically taken to be the Euclidean distance (although the Mahalanobis distance appears to perform better with pattern recognition) and the radial basis function is commonly taken to be
Gaussian:
: \rho(\|\mathbf{x} - \mathbf{c}_i\|) = \exp\left(-\beta_i \|\mathbf{x} - \mathbf{c}_i\|^2\right).
The Gaussian basis functions are local to the center vector in the sense that
: \lim_{\|\mathbf{x}\| \to \infty} \rho(\|\mathbf{x} - \mathbf{c}_i\|) = 0
i.e. changing the parameters of one neuron has only a small effect on input values that are far away from the center of that neuron.
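As a concrete illustration, the unnormalized Gaussian architecture above can be sketched in a few lines of NumPy. The function name and the parameter layout are illustrative, not from the original paper:

```python
import numpy as np

def rbf_forward(x, centers, betas, weights):
    """Unnormalized Gaussian RBF network:
    phi(x) = sum_i a_i * exp(-beta_i * ||x - c_i||^2)."""
    # Squared Euclidean distance from the input to every center vector.
    d2 = np.sum((centers - x) ** 2, axis=1)
    # Gaussian activations of the hidden layer.
    rho = np.exp(-betas * d2)
    # Linear output neuron: weighted sum of the hidden activations.
    return weights @ rho
```

Note the locality property in code: for an input far from every center, all activations (and hence the output) are close to zero.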
Given certain mild conditions on the shape of the activation function, RBF networks are universal approximators on a compact subset of \mathbb{R}^n. This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.
The parameters a_i, \mathbf{c}_i, and \beta_i are determined in a manner that optimizes the fit between \varphi and the data.
Normalization
Normalized architecture
In addition to the above ''unnormalized'' architecture, RBF networks can be ''normalized''. In this case the mapping is
: \varphi(\mathbf{x}) \equiv \frac{\sum_{i=1}^N a_i \rho(\|\mathbf{x} - \mathbf{c}_i\|)}{\sum_{i=1}^N \rho(\|\mathbf{x} - \mathbf{c}_i\|)} = \sum_{i=1}^N a_i u(\|\mathbf{x} - \mathbf{c}_i\|)
where
: u(\|\mathbf{x} - \mathbf{c}_i\|) \equiv \frac{\rho(\|\mathbf{x} - \mathbf{c}_i\|)}{\sum_{j=1}^N \rho(\|\mathbf{x} - \mathbf{c}_j\|)}
is known as a ''normalized radial basis function''.
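A minimal NumPy sketch of the normalized mapping, assuming Gaussian basis functions (names are illustrative):

```python
import numpy as np

def normalized_rbf_forward(x, centers, betas, weights):
    """Normalized RBF network: phi(x) = sum_i a_i * u_i(x),
    where u_i = rho_i / sum_j rho_j."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    rho = np.exp(-betas * d2)
    # Normalized radial basis functions; they are non-negative and sum to 1.
    u = rho / rho.sum()
    return weights @ u
```

Because the u_i form a partition of unity, the output is a convex combination of the weights a_i, so it always lies between the smallest and largest weight.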
Theoretical motivation for normalization
There is theoretical justification for this architecture in the case of stochastic data flow. Assume a stochastic kernel approximation for the joint probability density
: P(\mathbf{x} \land y) = \frac{1}{N} \sum_{i=1}^N \rho(\|\mathbf{x} - \mathbf{c}_i\|) \, \sigma(|y - e_i|)
where the weights \mathbf{c}_i and e_i are exemplars from the data and we require the kernels to be normalized
: \int \rho(\|\mathbf{x} - \mathbf{c}_i\|) \, d^n\mathbf{x} = 1
and
: \int \sigma(|y - e_i|) \, dy = 1.
The probability densities in the input and output spaces are
: P(\mathbf{x}) = \int P(\mathbf{x} \land y) \, dy = \frac{1}{N} \sum_{i=1}^N \rho(\|\mathbf{x} - \mathbf{c}_i\|)
and
: P(y) = \int P(\mathbf{x} \land y) \, d^n\mathbf{x} = \frac{1}{N} \sum_{i=1}^N \sigma(|y - e_i|).
The expectation of y given an input \mathbf{x} is
: \varphi(\mathbf{x}) \equiv E(y \mid \mathbf{x}) = \int y \, P(y \mid \mathbf{x}) \, dy
where P(y \mid \mathbf{x}) is the conditional probability of y given \mathbf{x}.
The conditional probability is related to the joint probability through Bayes' theorem
: P(y \mid \mathbf{x}) = \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})}
which yields
: \varphi(\mathbf{x}) = \int y \, \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})} \, dy.
This becomes
: \varphi(\mathbf{x}) = \frac{\sum_{i=1}^N e_i \rho(\|\mathbf{x} - \mathbf{c}_i\|)}{\sum_{i=1}^N \rho(\|\mathbf{x} - \mathbf{c}_i\|)} = \sum_{i=1}^N e_i u(\|\mathbf{x} - \mathbf{c}_i\|)
when the integrations are performed. This is exactly the normalized architecture, with the output weights given by the output-space exemplars e_i.
Local linear models
It is sometimes convenient to expand the architecture to include local linear models. In that case the architectures become, to first order,
: \varphi(\mathbf{x}) = \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot (\mathbf{x} - \mathbf{c}_i) \right) \rho(\|\mathbf{x} - \mathbf{c}_i\|)
and
: \varphi(\mathbf{x}) = \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot (\mathbf{x} - \mathbf{c}_i) \right) u(\|\mathbf{x} - \mathbf{c}_i\|)
in the unnormalized and normalized cases, respectively. Here \mathbf{b}_i are weights to be determined. Higher order linear terms are also possible.
This result can be written
: \varphi(\mathbf{x}) = \sum_{i=1}^{2N} \sum_{j=1}^n e_{ij} v_{ij}(\mathbf{x} - \mathbf{c}_i)
where
: e_{ij} = \begin{cases} a_i, & i \in [1, N] \\ b_{ij}, & i \in [N+1, 2N] \end{cases}
and
: v_{ij}(\mathbf{x} - \mathbf{c}_i) \equiv \begin{cases} \delta_{ij} \rho(\|\mathbf{x} - \mathbf{c}_i\|), & i \in [1, N] \\ (x_j - c_{ij}) \rho(\|\mathbf{x} - \mathbf{c}_i\|), & i \in [N+1, 2N] \end{cases}
in the unnormalized case and
: v_{ij}(\mathbf{x} - \mathbf{c}_i) \equiv \begin{cases} \delta_{ij} u(\|\mathbf{x} - \mathbf{c}_i\|), & i \in [1, N] \\ (x_j - c_{ij}) u(\|\mathbf{x} - \mathbf{c}_i\|), & i \in [N+1, 2N] \end{cases}
in the normalized case.
Here \delta_{ij} is a Kronecker delta function defined as
: \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}.
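The first-order expansion can be sketched directly (a hedged illustration with Gaussian basis functions; the function name and argument layout are assumptions):

```python
import numpy as np

def local_linear_rbf(x, centers, betas, a, B, normalized=False):
    """First-order local linear RBF model:
    phi(x) = sum_i (a_i + b_i . (x - c_i)) * basis_i(x),
    where basis_i is rho_i (unnormalized) or u_i (normalized)."""
    diff = x - centers                            # offsets (x - c_i), shape (N, n)
    rho = np.exp(-betas * np.sum(diff ** 2, axis=1))
    basis = rho / rho.sum() if normalized else rho
    local = a + np.sum(B * diff, axis=1)          # local linear term a_i + b_i . (x - c_i)
    return local @ basis
```

With B set to zero this reduces to the plain (zeroth-order) architecture.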
Training
RBF networks are typically trained from pairs of input and target values (\mathbf{x}(t), y(t)), t = 1, \dots, T, by a two-step algorithm.
In the first step, the center vectors \mathbf{c}_i of the RBF functions in the hidden layer are chosen. This step can be performed in several ways: centers can be randomly sampled from some set of examples, or they can be determined using k-means clustering. Note that this step is unsupervised.
The second step simply fits a linear model with coefficients w_i to the hidden layer's outputs with respect to some objective function. A common objective function, at least for regression/function estimation, is the least squares function
: K(\mathbf{w}) \equiv \sum_{t=1}^T K_t(\mathbf{w})
where
: K_t(\mathbf{w}) \equiv \left[ y(t) - \varphi(\mathbf{x}(t), \mathbf{w}) \right]^2.
We have explicitly included the dependence on the weights. Minimization of the least squares objective function by optimal choice of weights optimizes accuracy of fit.
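The two-step procedure can be sketched as follows. This is an illustrative sketch, not a reference implementation: centers are chosen here by random sampling from the training inputs rather than k-means, and the shared width beta is an assumed hyperparameter:

```python
import numpy as np

def train_rbf(X, y, n_centers, beta, seed=0):
    """Two-step RBF training:
    (1) unsupervised choice of centers by random sampling from the inputs,
    (2) linear least-squares fit of the output weights to the
        hidden-layer activations."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    # Design matrix of Gaussian activations: G[t, i] = rho(||x_t - c_i||).
    G = np.exp(-beta * np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2))
    weights, *_ = np.linalg.lstsq(G, y, rcond=None)
    return centers, weights

def predict_rbf(X, centers, weights, beta):
    """Evaluate the trained network on new inputs."""
    G = np.exp(-beta * np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2))
    return G @ weights
```

Because the second step is linear in the weights, it reduces to an ordinary least-squares problem once the centers and widths are fixed.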
There are occasions in which multiple objectives, such as smoothness as well as accuracy, must be optimized. In that case it is useful to optimize a regularized objective function such as
: H(\mathbf{w}) \equiv K(\mathbf{w}) + \lambda S(\mathbf{w}) \equiv \sum_{t=1}^T H_t(\mathbf{w})
where
: S(\mathbf{w}) \equiv \sum_{t=1}^T S_t(\mathbf{w})
and
: H_t(\mathbf{w}) \equiv K_t(\mathbf{w}) + \lambda S_t(\mathbf{w})
where optimization of S maximizes smoothness and \lambda is known as a regularization parameter.
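One concrete and widely used choice for the smoothness term is Tikhonov (ridge) regularization, which penalizes the squared norm of the weight vector; this particular choice is an illustrative assumption, since the text above leaves S generic. With a fixed design matrix G of hidden-layer activations, the regularized least-squares problem then has a closed form:

```python
import numpy as np

def ridge_rbf_weights(G, y, lam):
    """Minimize ||y - G w||^2 + lam * ||w||^2 (Tikhonov/ridge penalty,
    one concrete choice of smoothness term).
    Closed form: w = (G^T G + lam I)^{-1} G^T y."""
    n = G.shape[1]
    return np.linalg.solve(G.T @ G + lam * np.eye(n), G.T @ y)
```

Increasing lam trades fit accuracy for smaller (smoother) weights.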
A third optional backpropagation step can be performed to fine-tune all of the RBF net's parameters.
Interpolation
RBF networks can be used to interpolate a function y : \mathbb{R}^n \to \mathbb{R} when the values of that function are known on a finite number of points: y(\mathbf{x}_i) = b_i, i = 1, \dots, N. Taking the known points \mathbf{x}_i to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, g_{ij} = \rho(\|\mathbf{x}_j - \mathbf{x}_i\|), the weights can be solved from the equation
: \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1N} \\ g_{21} & g_{22} & \cdots & g_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ g_{N1} & g_{N2} & \cdots & g_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}
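A minimal sketch of solving the interpolation system above, assuming Gaussian basis functions with a shared width beta (the function name is illustrative):

```python
import numpy as np

def rbf_interpolation_weights(points, values, beta):
    """Exact RBF interpolation: the centers coincide with the data points,
    so we solve G w = b with g_ij = exp(-beta * ||x_j - x_i||^2)."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=2)
    G = np.exp(-beta * d2)
    return np.linalg.solve(G, values)
```

For distinct points the Gaussian kernel matrix G is symmetric positive definite, so the linear system has a unique solution and the resulting network reproduces the known values exactly at the data points.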