
In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on a Gaussian process governed by prior covariances. Under suitable assumptions on the prior, kriging gives the best linear unbiased prediction (BLUP) at unsampled locations. Interpolating methods based on other criteria such as smoothness (e.g., smoothing spline) may not yield the BLUP. The method is widely used in the domain of spatial analysis and computer experiments. The technique is also known as Wiener–Kolmogorov prediction, after Norbert Wiener and Andrey Kolmogorov.
The theoretical basis for the method was developed by the French mathematician Georges Matheron in 1960, based on the master's thesis of Danie G. Krige, the pioneering plotter of distance-weighted average gold grades at the Witwatersrand reef complex in South Africa. Krige sought to estimate the most likely distribution of gold based on samples from a few boreholes. The English verb is ''to krige'', and the most common noun is ''kriging''; both are often pronounced with a hard "g", following an Anglicized pronunciation of the name "Krige". The word is sometimes capitalized as ''Kriging'' in the literature.
Though computationally intensive in its basic formulation, kriging can be scaled to larger problems using various approximation methods.
Main principles
Related terms and techniques
Kriging predicts the value of a function at a given point by computing a weighted average of the known values of the function in the neighborhood of the point. The method is closely related to regression analysis. Both theories derive a best linear unbiased estimator based on assumptions on covariances, make use of the Gauss–Markov theorem to prove independence of the estimate and error, and use very similar formulae. Even so, they are useful in different frameworks: kriging is made for estimation of a single realization of a random field, while regression models are based on multiple observations of a multivariate data set.
The kriging estimation may also be seen as a spline in a reproducing kernel Hilbert space, with the reproducing kernel given by the covariance function. The difference from the classical kriging approach lies in the interpretation: while the spline is motivated by a minimum-norm interpolation based on a Hilbert-space structure, kriging is motivated by an expected squared prediction error based on a stochastic model.
Kriging with ''polynomial trend surfaces'' is mathematically identical to generalized least squares polynomial curve fitting.
Kriging can also be understood as a form of Bayesian inference. Kriging starts with a prior distribution over functions. This prior takes the form of a Gaussian process: any finite set of samples from a function will be jointly normally distributed, where the covariance between any two samples is the covariance function (or kernel) of the Gaussian process evaluated at the spatial locations of the two points. A set of values is then observed, each value associated with a spatial location. Now, a new value can be predicted at any new spatial location by combining the Gaussian prior with a Gaussian likelihood function for each of the observed values. The resulting posterior distribution is also Gaussian, with a mean and covariance that can be simply computed from the observed values, their variance, and the kernel matrix derived from the prior.
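The posterior computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes a zero-mean prior, one-dimensional locations, and a squared-exponential kernel, and the names `rbf_kernel`, `gp_posterior`, `length_scale` and `noise_var` are hypothetical choices made for this example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between 1-D location arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise_var=1e-6):
    """Posterior mean and covariance of a zero-mean Gaussian process at x_new."""
    # Kernel matrix of the observations plus the observation-noise variance
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_new)    # covariances between observed and new locations
    K_ss = rbf_kernel(x_new, x_new)   # prior covariance at the new locations
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov

x_obs = np.array([0.0, 1.0, 2.5])
y_obs = np.array([0.2, 0.9, -0.3])
mean, cov = gp_posterior(x_obs, y_obs, np.array([1.0, 1.7]))
```

With a very small noise variance the posterior mean at an observed location (here 1.0) is close to the observed value, and the posterior variance there is close to zero; at the unobserved location 1.7 the variance is strictly between zero and the prior variance.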
Geostatistical estimator
In geostatistical models, sampled data are interpreted as the result of a random process. The fact that these models incorporate uncertainty in their conceptualization doesn't mean that the phenomenon – the forest, the aquifer, the mineral deposit – has resulted from a random process, but rather it allows one to build a methodological basis for the spatial inference of quantities in unobserved locations and to quantify the uncertainty associated with the estimator.
A stochastic process is, in the context of this model, simply a way to approach the set of data collected from the samples. The first step in geostatistical modelling is to create a random process that best describes the set of observed data.
A value from location $x_1$ (generic denomination of a set of geographic coordinates) is interpreted as a realization $z(x_1)$ of the random variable $Z(x_1)$. In the space $A$, where the set of samples is dispersed, there are $N$ realizations of the random variables $Z(x_1), Z(x_2), \ldots, Z(x_N)$, correlated between themselves.
The set of random variables constitutes a random function, of which only one realization is known: the set $z(x_i)$ of observed data. With only one realization of each random variable, it is theoretically impossible to determine any statistical parameter of the individual variables or the function. The proposed solution in the geostatistical formalism consists in ''assuming'' various degrees of ''stationarity'' in the random function, in order to make the inference of some statistical quantities possible.
For instance, if one assumes, based on the homogeneity of samples in area $A$ where the variable is distributed, the hypothesis that the first moment is stationary (i.e. all random variables have the same mean), then one is assuming that the mean can be estimated by the arithmetic mean of sampled values.
The hypothesis of stationarity related to the second moment is defined in the following way: the correlation between two random variables solely depends on the spatial distance between them and is independent of their location. Thus if $\mathbf{h} = x_2 - x_1$ and $h = |\mathbf{h}|$, then:
: $C\big(Z(x_1), Z(x_2)\big) = C\big(Z(x_i), Z(x_i + \mathbf{h})\big) = C(h),$
: $\gamma\big(Z(x_1), Z(x_2)\big) = \gamma\big(Z(x_i), Z(x_i + \mathbf{h})\big) = \gamma(h).$
For simplicity, we define $C(x_i, x_j) = C\big(Z(x_i), Z(x_j)\big)$ and $\gamma(x_i, x_j) = \gamma\big(Z(x_i), Z(x_j)\big)$.
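This hypothesis allows one to infer those two measures, the variogram and the covariogram:
: $\gamma(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} \big(Z(x_i) - Z(x_j)\big)^2,$
: $C(h) = \frac{1}{|N(h)|} \sum_{(i,j) \in N(h)} \big(Z(x_i) - m(h)\big)\big(Z(x_j) - m(h)\big),$
where:
: $m(h) = \frac{1}{2|N(h)|} \sum_{(i,j) \in N(h)} \big(Z(x_i) + Z(x_j)\big)$;
: $N(h)$ denotes the set of pairs of observations $i, j$ such that $|x_i - x_j| = h$, and $|N(h)|$ is the number of pairs in the set.
In this set, $(i, j)$ and $(j, i)$ denote the same element. Generally an "approximate distance" $h$ is used, implemented using a certain tolerance.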
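As an illustration of the variogram estimate with an approximate distance, the following sketch computes, for each lag, half the average squared difference over all unordered pairs whose separation lies within a tolerance of that lag. The function name `empirical_variogram` and the parameters `lags` and `tol` are hypothetical names chosen for this example, and the symmetric tolerance window is only one of several possible binning conventions.

```python
import numpy as np

def empirical_variogram(coords, values, lags, tol):
    """Estimate gamma(h) for each lag h, using pairs with |distance - h| <= tol."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Unordered pairs i < j, so that (i, j) and (j, i) are counted once
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    gamma = np.full(len(lags), np.nan)      # NaN where a lag has no pairs
    counts = np.zeros(len(lags), dtype=int)
    for k, h in enumerate(lags):
        in_bin = np.abs(dist - h) <= tol
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = 0.5 * sqdiff[in_bin].mean()
    return gamma, counts

# Four samples on a line, at unit spacing
coords = [[0.0], [1.0], [2.0], [3.0]]
values = [1.0, 2.0, 4.0, 7.0]
gamma, counts = empirical_variogram(coords, values, lags=[1.0, 2.0], tol=0.1)
```

In this toy data set three pairs are one unit apart and two pairs are two units apart, so `counts` holds the number of pairs $|N(h)|$ behind each estimate; lags with few pairs give unreliable values in practice.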
Linear estimation
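Spatial inference, or estimation, of a quantity $Z$, at an unobserved location $x_0$, is calculated from a linear combination of the observed values $z_i = Z(x_i)$ and weights $w_i(x_0),\ i = 1, \ldots, N$:
: $\hat{Z}(x_0) = \sum_{i=1}^{N} w_i(x_0)\, Z(x_i).$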
The weights $w_i$ are intended to summarize two extremely important procedures in a spatial inference process:
* reflect the structural "proximity" of samples to the estimation location $x_0$;
* at the same time, they should have a desegregation effect, in order to avoid bias caused by possible sample ''clusters''.
When calculating the weights $w_i$, there are two objectives in the geostatistical formalism: ''unbiasedness'' and ''minimal variance of estimation''.
If the cloud of real values $Z(x_0)$ is plotted against the estimated values $\hat{Z}(x_0)$, the criterion for global unbiasedness, ''intrinsic stationarity'' or wide-sense stationarity of the field, implies that the mean of the estimations must be equal to the mean of the real values.
The second criterion says that the mean of the squared deviations $\big(\hat{Z}(x) - Z(x)\big)^2$ must be minimal, which means that the more dispersed the cloud of estimated values ''versus'' real values is, the less precise the estimator.
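When real values are available at some locations (for example, values held out for validation), the two criteria can be checked empirically. The sketch below is only an illustration: the helper name `estimation_diagnostics` and the sample numbers are hypothetical, and it simply reports the mean error (which should be near zero for an unbiased estimator) and the mean squared deviation (which should be as small as possible).

```python
import numpy as np

def estimation_diagnostics(true_values, estimates):
    """Mean error (bias check) and mean squared deviation (precision check)."""
    err = np.asarray(estimates, dtype=float) - np.asarray(true_values, dtype=float)
    return {"mean_error": err.mean(), "mean_squared_error": (err ** 2).mean()}

# Made-up values: errors are +0.5, -0.5, +0.5, -0.5
diag = estimation_diagnostics([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

Here the errors cancel on average, so the estimator looks unbiased, while the mean squared deviation still measures how widely the cloud of estimates scatters around the real values.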
Methods
Depending on the stochastic properties of the random field and the various degrees of stationarity assumed, different methods for calculating the weights can be deduced, i.e. different types of kriging apply. Classical methods are:
* ''Ordinary kriging'' assumes constant unknown mean only over the search neighborhood of $x_0$.
* ''Simple kriging'' assumes stationarity of the first moment over the entire domain with a known mean: $E[Z(x)] = E[Z(x_0)] = m$, where $m$ is the known mean.
* ''Universal kriging'' assumes a general polynomial trend model, such as linear trend model $E[Z(x)] = \sum_{k=0}^{p} \beta_k f_k(x)$.
* ''IRFk-kriging'' assumes $E[Z(x)]$ to be an unknown polynomial in $x$.
* ''Indicator kriging'' uses indicator functions instead of the process itself, in order to estimate transition probabilities.
** ''Multiple-indicator kriging'' is a version of indicator kriging working with a family of indicators. Initially, MIK showed considerable promise as a new method that could more accurately estimate overall global mineral deposit concentrations or grades. However, these benefits have been outweighed by other inherent problems of practicality in modelling due to the inherently large block sizes used and also the lack of mining-scale resolution. Conditional simulation is fast becoming the accepted replacement technique in this case.
* ''Disjunctive kriging'' is a nonlinear generalisation of kriging.
* ''Log-normal kriging'' interpolates positive data by means of logarithms.
* ''Latent kriging'' applies the various krigings on the latent level (second stage) of the nonlinear mixed-effects model to produce a spatial functional prediction. This technique is useful when analyzing spatial functional data $\{(y_i, x_i, s_i)\}_{i=1}^{n}$, where $y_i = (y_{i1}, y_{i2}, \ldots, y_{iT_i})^\top$ is time series data over $T_i$ periods, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})^\top$ is a vector of $p$ covariates, and $s_i = (s_{i1}, s_{i2})^\top$ is a spatial location (longitude, latitude) of the $i$-th subject.
* ''Co-kriging'' denotes the joint kriging of data from multiple sources with a relationship between the different data sources. Co-kriging is also possible in a Bayesian approach.
* ''Bayesian kriging'' departs from the optimization of unknown coefficients and hyperparameters, which is understood as a maximum likelihood estimate from the Bayesian perspective. Instead, the coefficients and hyperparameters are estimated from their expectation values. An advantage of Bayesian kriging is that it allows one to quantify the evidence for and the uncertainty of the kriging emulator. If the emulator is employed to propagate uncertainties, the quality of the kriging emulator can be assessed by comparing the emulator uncertainty to the total uncertainty (see also Bayesian Polynomial Chaos). Bayesian kriging can also be mixed with co-kriging.
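The difference between the first two methods in the list can be made concrete with a small sketch. It is an illustration under stated assumptions, not a definitive implementation: locations are one-dimensional, the covariance model `exp_cov` (exponential, with hypothetical `sill` and `corr_range` parameters) is chosen arbitrarily, and the ordinary kriging weights are obtained from the standard Lagrange-multiplier system that enforces weights summing to one.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=2.0):
    """Assumed exponential covariance model C(h) = sill * exp(-|h| / corr_range)."""
    return sill * np.exp(-np.abs(h) / corr_range)

def simple_kriging(x_obs, z_obs, x0, mean):
    """Simple kriging: the mean is known, so only residuals are interpolated."""
    C = exp_cov(x_obs[:, None] - x_obs[None, :])   # sample-to-sample covariances
    c0 = exp_cov(x_obs - x0)                       # sample-to-target covariances
    w = np.linalg.solve(C, c0)
    return mean + w @ (z_obs - mean), w

def ordinary_kriging(x_obs, z_obs, x0):
    """Ordinary kriging: unknown constant mean, weights constrained to sum to one."""
    n = len(x_obs)
    A = np.ones((n + 1, n + 1))                    # last row/column carry the constraint
    A[:n, :n] = exp_cov(x_obs[:, None] - x_obs[None, :])
    A[n, n] = 0.0
    b = np.append(exp_cov(x_obs - x0), 1.0)
    sol = np.linalg.solve(A, b)                    # sol = [weights, Lagrange multiplier]
    w = sol[:n]
    return w @ z_obs, w

x_obs = np.array([0.0, 1.0, 3.0])
z_obs = np.array([1.2, 0.7, 1.9])
sk_est, sk_w = simple_kriging(x_obs, z_obs, x0=2.0, mean=1.0)
ok_est, ok_w = ordinary_kriging(x_obs, z_obs, x0=2.0)
```

Both estimators reproduce an observed value when `x0` coincides with a sample location. Far from all samples, the simple kriging estimate falls back to the known mean, whereas the ordinary kriging weights still sum to one, so the estimate remains a weighted average of the data.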
Ordinary kriging
The unknown value $Z(x_0)$ is interpreted as a random variable located in $x_0$, as well as the values of neighboring samples $Z(x_i),\ i = 1, \ldots, N$. The estimator $\hat{Z}(x_0)$ is also interpreted as a random variable located in $x_0$, a result of the linear combination of variables.
In order to deduce the kriging system for the assumptions of the model, the following error committed while estimating $Z(x)$ in $x_0$ is declared:
: $\epsilon(x_0) = \hat{Z}(x_0) - Z(x_0) = \sum_{i=1}^{N} w_i(x_0)\, Z(x_i) - Z(x_0).$
The two quality criteria referred to previously can now be expressed in terms of the mean and variance of the new random variable $\epsilon(x_0)$:
; Lack of bias
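Since the random function is stationary, $E[Z(x_i)] = E[Z(x_0)] = m$, the following constraint is observed:
: $E[\epsilon(x_0)] = 0 \Leftrightarrow \sum_{i=1}^{N} w_i(x_0)\, E[Z(x_i)] - E[Z(x_0)] = 0 \Leftrightarrow$
: $\Leftrightarrow m \sum_{i=1}^{N} w_i(x_0) - m = 0 \Leftrightarrow \sum_{i=1}^{N} w_i(x_0) = 1.$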
In order to ensure that the model is unbiased, the weights must sum to one.
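A small numerical check of this constraint, as a sketch: for a field with constant mean $m$, linearity of expectation gives an expected error of $m \sum_i w_i - m$, which vanishes for every $m$ only when the weights sum to one. The helper name `expected_error` and the example weights are hypothetical.

```python
import numpy as np

def expected_error(weights, mean):
    """Expected estimation error m * sum(w_i) - m for a field with constant mean m."""
    weights = np.asarray(weights, dtype=float)
    return mean * weights.sum() - mean

raw = np.array([0.5, 0.3, 0.4])    # arbitrary weights that sum to 1.2
normalized = raw / raw.sum()       # rescaled so that the weights sum to one
bias_raw = expected_error(raw, mean=10.0)
bias_normalized = expected_error(normalized, mean=10.0)
```

The unnormalized weights overestimate the mean by about 20 percent of $m$, while the rescaled weights give an expected error of zero up to rounding.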
; Minimum variance
Two estimators can have $E[\epsilon(x_0)] = 0$, but the dispersion around their mean determines the difference between the quality of estimators. To find an estimator with minimum variance, we need to minimize $E[\epsilon(x_0)^2]$