In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control. They were first formulated in a 1988 paper by Broomhead and Lowe, both researchers at the Royal Signals and Radar Establishment.
Network architecture

Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function and a linear output layer. The input can be modeled as a vector of real numbers \( \mathbf{x} \in \mathbb{R}^n \). The output of the network is then a scalar function of the input vector, \( \varphi : \mathbb{R}^n \to \mathbb{R} \), and is given by

: \( \varphi(\mathbf{x}) = \sum_{i=1}^N a_i \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

where \( N \) is the number of neurons in the hidden layer, \( \mathbf{c}_i \) is the center vector for neuron \( i \), and \( a_i \) is the weight of neuron \( i \) in the linear output neuron. Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. The norm is typically taken to be the Euclidean distance (although the Mahalanobis distance appears to perform better with pattern recognition) and the radial basis function is commonly taken to be Gaussian

: \( \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) = \exp\left[-\beta_i \left\|\mathbf{x} - \mathbf{c}_i\right\|^2\right] \).
The Gaussian basis functions are local to the center vector in the sense that

: \( \lim_{\|\mathbf{x}\| \to \infty} \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) = 0, \)

i.e. changing parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.
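The architecture above can be sketched directly in NumPy. This is a minimal illustration, not a reference implementation; the names (`rbf_forward`, `betas`) are chosen here for clarity and are not from the original formulation.

```python
import numpy as np

def rbf_forward(x, centers, betas, weights):
    """Unnormalized RBF network: phi(x) = sum_i a_i * exp(-beta_i * ||x - c_i||^2)."""
    # Squared Euclidean distances from the input to every center vector.
    d2 = np.sum((centers - x) ** 2, axis=1)
    # Gaussian radial basis activations of the hidden layer.
    rho = np.exp(-betas * d2)
    # Linear output neuron: weighted sum of the hidden activations.
    return weights @ rho

# A tiny example: two hidden neurons in a 2-D input space.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
betas = np.array([1.0, 1.0])
weights = np.array([1.0, 2.0])
y = rbf_forward(np.array([0.0, 0.0]), centers, betas, weights)
# At the first center, the first activation is exactly 1 and the second is exp(-2).
```

Evaluating at a center makes the locality easy to see: the neuron centered there contributes its full weight, while distant neurons contribute almost nothing.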
Given certain mild conditions on the shape of the activation function, RBF networks are universal approximators on a compact subset of \( \mathbb{R}^n \).
This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.
The parameters \( a_i \), \( \mathbf{c}_i \), and \( \beta_i \) are determined in a manner that optimizes the fit between \( \varphi \) and the data.
Normalized
Normalized architecture
In addition to the above ''unnormalized'' architecture, RBF networks can be ''normalized''. In this case the mapping is

: \( \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \frac{\sum_{i=1}^N a_i \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right)}{\sum_{i=1}^N \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right)} = \sum_{i=1}^N a_i u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

where

: \( u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \ \stackrel{\mathrm{def}}{=}\ \frac{\rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right)}{\sum_{j=1}^N \rho\left(\left\|\mathbf{x} - \mathbf{c}_j\right\|\right)} \)

is known as a ''normalized radial basis function''.
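A sketch of the normalized variant, under the same Gaussian-kernel assumption as before (the function name is illustrative):

```python
import numpy as np

def normalized_rbf_forward(x, centers, betas, weights):
    """Normalized RBF network: sum_i a_i * u_i(x), with u_i = rho_i / sum_j rho_j."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    rho = np.exp(-betas * d2)
    u = rho / rho.sum()  # normalized radial basis functions; they sum to 1
    return weights @ u

centers = np.array([[0.0], [1.0]])
betas = np.array([1.0, 1.0])
weights = np.array([0.0, 1.0])
# Far from all centers, the basis function of the nearest center dominates,
# so the output approaches that neuron's weight rather than decaying to zero.
y_far = normalized_rbf_forward(np.array([5.0]), centers, betas, weights)
```

Because the \( u_i \) sum to one at every input, the normalized network behaves like a weighted average of the \( a_i \), which is the key qualitative difference from the unnormalized form.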
Theoretical motivation for normalization
There is theoretical justification for this architecture in the case of stochastic data flow. Assume a stochastic kernel approximation for the joint probability density

: \( P(\mathbf{x} \land y) = \frac{1}{N} \sum_{i=1}^N \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \sigma\left(\left|y - e_i\right|\right) \)

where the weights \( \mathbf{c}_i \) and \( e_i \) are exemplars from the data and we require the kernels to be normalized

: \( \int \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \, d^n\mathbf{x} = 1 \)

and

: \( \int \sigma\left(\left|y - e_i\right|\right) \, dy = 1 \).

The probability densities in the input and output spaces are

: \( P(\mathbf{x}) = \int P(\mathbf{x} \land y) \, dy = \frac{1}{N} \sum_{i=1}^N \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

and

: \( P(y) = \int P(\mathbf{x} \land y) \, d^n\mathbf{x} = \frac{1}{N} \sum_{i=1}^N \sigma\left(\left|y - e_i\right|\right). \)

The expectation of y given an input \( \mathbf{x} \) is

: \( \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ E(y \mid \mathbf{x}) = \int y \, P(y \mid \mathbf{x}) \, dy \)

where \( P(y \mid \mathbf{x}) \) is the conditional probability of y given \( \mathbf{x} \). The conditional probability is related to the joint probability through Bayes theorem

: \( P(y \mid \mathbf{x}) = \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})} \)

which yields

: \( \varphi(\mathbf{x}) = \int y \, \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})} \, dy \).

This becomes

: \( \varphi(\mathbf{x}) = \frac{\sum_{i=1}^N e_i \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right)}{\sum_{i=1}^N \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right)} = \sum_{i=1}^N e_i u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

when the integrations are performed.
Local linear models
It is sometimes convenient to expand the architecture to include local linear models. In that case the architectures become, to first order,

: \( \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot \left( \mathbf{x} - \mathbf{c}_i \right) \right) \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

and

: \( \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot \left( \mathbf{x} - \mathbf{c}_i \right) \right) u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right) \)

in the unnormalized and normalized cases, respectively. Here \( \mathbf{b}_i \) are weights to be determined. Higher order linear terms are also possible.

This result can be written

: \( \varphi(\mathbf{x}) = \sum_{i=1}^{2N} \sum_{j=1}^n e_{ij} v_{ij}\left( \mathbf{x} - \mathbf{c}_i \right) \)

where

: \( e_{ij} = \begin{cases} a_i, & i \in [1, N] \\ b_{ij}, & i \in [N+1, 2N] \end{cases} \)

and

: \( v_{ij}\left( \mathbf{x} - \mathbf{c}_i \right) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right), & i \in [1, N] \\ \left( x_j - c_{ij} \right) \rho\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right), & i \in [N+1, 2N] \end{cases} \)

in the unnormalized case and

: \( v_{ij}\left( \mathbf{x} - \mathbf{c}_i \right) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right), & i \in [1, N] \\ \left( x_j - c_{ij} \right) u\left(\left\|\mathbf{x} - \mathbf{c}_i\right\|\right), & i \in [N+1, 2N] \end{cases} \)

in the normalized case.

Here \( \delta_{ij} \) is a Kronecker delta function defined as

: \( \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} \).
Training
RBF networks are typically trained from pairs of input and target values \( \mathbf{x}(t), y(t) \), \( t = 1, \dots, T \), by a two-step algorithm.

In the first step, the center vectors \( \mathbf{c}_i \) of the RBF functions in the hidden layer are chosen. This step can be performed in several ways; centers can be randomly sampled from some set of examples, or they can be determined using k-means clustering. Note that this step is unsupervised.
The second step simply fits a linear model with coefficients \( w_i \) to the hidden layer's outputs with respect to some objective function. A common objective function, at least for regression/function estimation, is the least squares function:

: \( K(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T K_t(\mathbf{w}) \)

where

: \( K_t(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \left[ y(t) - \varphi\left( \mathbf{x}(t), \mathbf{w} \right) \right]^2 \).

We have explicitly included the dependence on the weights. Minimization of the least squares objective function by optimal choice of weights optimizes accuracy of fit.
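The two-step algorithm can be sketched end to end. This example uses random sampling of training inputs for the centers (one of the options mentioned above) and NumPy's least-squares solver for the second step; the shared width `beta` and the test problem (fitting a sine curve) are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training pairs (x(t), y(t)): a 1-D regression problem.
X = np.linspace(0.0, 2 * np.pi, 100)[:, None]
y = np.sin(X[:, 0])

# Step 1 (unsupervised): choose centers by randomly sampling training inputs.
N = 10
centers = X[rng.choice(len(X), size=N, replace=False)]
beta = 1.0  # shared Gaussian width, fixed by hand here

# Hidden-layer design matrix: G[t, i] = rho(||x(t) - c_i||)
d2 = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
G = np.exp(-beta * d2)

# Step 2 (supervised): fit the linear output weights by least squares.
w, *_ = np.linalg.lstsq(G, y, rcond=None)

pred = G @ w
mse = np.mean((pred - y) ** 2)
```

With the centers and widths frozen, the second step is an ordinary linear least-squares problem, which is what makes this two-step procedure so much cheaper than training a fully non-linear network.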
There are occasions in which multiple objectives, such as smoothness as well as accuracy, must be optimized. In that case it is useful to optimize a regularized objective function such as

: \( H(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K(\mathbf{w}) + \lambda S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T H_t(\mathbf{w}) \)

where

: \( S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T S_t(\mathbf{w}) \)

and

: \( H_t(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K_t(\mathbf{w}) + \lambda S_t(\mathbf{w}) \)

where optimization of S maximizes smoothness and \( \lambda \) is known as a regularization parameter.
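As a concrete (and common) special case, if the penalty term is taken to be the squared weight norm, \( S(\mathbf{w}) = \|\mathbf{w}\|^2 \) — an assumption made for this sketch, since the text leaves the smoothness functional unspecified — the regularized problem has the closed-form ridge-regression solution:

```python
import numpy as np

def fit_rbf_ridge(G, y, lam):
    """Solve (G^T G + lam * I) w = G^T y.

    Minimizes ||G w - y||^2 + lam * ||w||^2, i.e. least squares with an
    L2 penalty standing in for the smoothness term S(w)."""
    N = G.shape[1]
    return np.linalg.solve(G.T @ G + lam * np.eye(N), G.T @ y)

# Small example: increasing the regularization parameter shrinks the weights.
G = np.array([[1.0, 0.5], [0.5, 1.0], [0.2, 0.9]])
y = np.array([1.0, 2.0, 1.5])
w_small = fit_rbf_ridge(G, y, 1e-6)
w_large = fit_rbf_ridge(G, y, 10.0)
```

Larger \( \lambda \) trades accuracy of fit for smaller (smoother) weights, which is exactly the trade-off the regularized objective encodes.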
A third optional
backpropagation step can be performed to fine-tune all of the RBF net's parameters.
Interpolation
RBF networks can be used to interpolate a function \( y: \mathbb{R}^n \to \mathbb{R} \) when the values of that function are known on a finite number of points: \( y(\mathbf{x}_i) = b_i \), \( i = 1, \dots, N \). Taking the known points \( \mathbf{x}_i \) to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, \( g_{ij} = \rho\left(\left\|\mathbf{x}_j - \mathbf{x}_i\right\|\right) \), the weights can be solved from the equation

: \( \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1N} \\ g_{21} & g_{22} & \cdots & g_{2N} \\ \vdots & & \ddots & \vdots \\ g_{N1} & g_{N2} & \cdots & g_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}. \)
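The interpolation system above can be set up and solved in a few lines. The sample points, values, and width `beta` below are arbitrary choices for the illustration; for the Gaussian kernel the matrix \( G \) is symmetric and non-singular when the points are distinct, so a direct solve works.

```python
import numpy as np

# Known points and the function values to interpolate.
xs = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([0.0, 1.0, 0.0, -1.0])
beta = 1.0

# g_ij = rho(||x_j - x_i||): the basis functions evaluated at the known points.
G = np.exp(-beta * (xs[:, None] - xs[None, :]) ** 2)

# Solve G w = b for the weights.
w = np.linalg.solve(G, b)

# The resulting network reproduces the known values exactly at the nodes.
recovered = G @ w
```

Unlike the least-squares training above, interpolation uses one basis function per data point, so the fit passes through every known value exactly.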