Optimal Estimation

In applied statistics, optimal estimation is a regularized matrix inverse method based on Bayes' theorem. It is used very commonly in the geosciences, particularly for atmospheric sounding. A matrix inverse problem looks like this:

: \mathbf{A} \vec{x} = \vec{y}

The essential concept is to transform the matrix \mathbf{A} into a conditional probability and the variables \vec{x} and \vec{y} into probability distributions by assuming Gaussian statistics and using empirically-determined covariance matrices.


Derivation

Typically, one expects the statistics of most measurements to be Gaussian. So, for example, for P(\vec{y} \mid \vec{x}), we can write:

: P(\vec{y} \mid \vec{x}) = \frac{1}{(2 \pi)^{n/2} |\mathbf{S}_y|^{1/2}} \exp \left[ -\frac{1}{2} (\mathbf{A} \vec{x} - \vec{y})^T \mathbf{S}_y^{-1} (\mathbf{A} \vec{x} - \vec{y}) \right]

where ''m'' and ''n'' are the numbers of elements in \vec{x} and \vec{y} respectively, \mathbf{A} is the matrix to be solved (the linear or linearised forward model) and \mathbf{S}_y is the covariance matrix of the vector \vec{y}. This can be similarly done for \vec{x}:

: P(\vec{x}) = \frac{1}{(2 \pi)^{m/2} |\mathbf{S}_a|^{1/2}} \exp \left[ -\frac{1}{2} (\vec{x} - \vec{x}_a)^T \mathbf{S}_a^{-1} (\vec{x} - \vec{x}_a) \right]

Here P(\vec{x}) is taken to be the so-called "a priori" distribution: \vec{x}_a denotes the a priori values for \vec{x} while \mathbf{S}_a is its covariance matrix. The nice thing about Gaussian distributions is that only two parameters are needed to describe them, and so the whole problem can be converted once again to matrices. Assume that P(\vec{x} \mid \vec{y}) takes the following form:

: P(\vec{x} \mid \vec{y}) = \frac{1}{(2 \pi)^{m/2} |\hat{\mathbf{S}}|^{1/2}} \exp \left[ -\frac{1}{2} (\vec{x} - \hat{x})^T \hat{\mathbf{S}}^{-1} (\vec{x} - \hat{x}) \right]

P(\vec{y}) may be neglected since, for a given value of \vec{x}, it is simply a constant scaling term. Now it is possible to solve for both the expectation value of \vec{x}, \hat{x}, and for its covariance matrix \hat{\mathbf{S}} by equating P(\vec{x} \mid \vec{y}) and P(\vec{y} \mid \vec{x}) P(\vec{x}). This produces the following equations:

: \hat{\mathbf{S}} = (\mathbf{A}^T \mathbf{S}_y^{-1} \mathbf{A} + \mathbf{S}_a^{-1})^{-1}

: \hat{x} = \vec{x}_a + \hat{\mathbf{S}} \mathbf{A}^T \mathbf{S}_y^{-1} (\vec{y} - \mathbf{A} \vec{x}_a)

Because we are using Gaussians, the expected value is equivalent to the maximum likely value, and so this is also a form of maximum likelihood estimation. Typically with optimal estimation, in addition to the vector of retrieved quantities, one extra matrix is returned along with the covariance matrix. This is sometimes called the resolution matrix or the averaging kernel and is calculated as follows:

: \mathbf{R} = (\mathbf{A}^T \mathbf{S}_y^{-1} \mathbf{A} + \mathbf{S}_a^{-1})^{-1} \mathbf{A}^T \mathbf{S}_y^{-1} \mathbf{A}

This tells us, for a given element of the retrieved vector, how much of the other elements of the vector are mixed in. In the case of a retrieval of profile information, it typically indicates the altitude resolution for a given altitude. For instance, if the resolution vectors for all the altitudes contain non-zero elements (to a numerical tolerance) in their four nearest neighbours, then the altitude resolution is only one fourth that of the actual grid size.
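The retrieval equations above translate directly into linear algebra. The following sketch (variable names and toy test values are illustrative, not from any particular instrument) computes the retrieved state \hat{x}, its covariance \hat{\mathbf{S}}, and the averaging kernel \mathbf{R}:

```python
import numpy as np

def optimal_estimation(A, y, S_y, x_a, S_a):
    """Linear optimal estimation retrieval.

    Implements:
      S_hat = (A^T S_y^-1 A + S_a^-1)^-1
      x_hat = x_a + S_hat A^T S_y^-1 (y - A x_a)
      R     = S_hat A^T S_y^-1 A        (averaging kernel)
    """
    S_y_inv = np.linalg.inv(S_y)
    S_hat = np.linalg.inv(A.T @ S_y_inv @ A + np.linalg.inv(S_a))
    x_hat = x_a + S_hat @ A.T @ S_y_inv @ (y - A @ x_a)
    R = S_hat @ A.T @ S_y_inv @ A
    return x_hat, S_hat, R

# Toy example: 3 unknowns observed through a well-conditioned forward model.
rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.standard_normal((3, 3))
x_true = np.array([1.0, 2.0, 3.0])
y = A @ x_true            # noise-free measurement for illustration
S_y = 0.01 * np.eye(3)    # measurement covariance
x_a = np.zeros(3)         # a priori state
S_a = 10.0 * np.eye(3)    # loose a priori covariance

x_hat, S_hat, R = optimal_estimation(A, y, S_y, x_a, S_a)
```

With a noise-free measurement and a loose prior, the retrieval lands very close to the true state and the averaging kernel is close to the identity; tightening \mathbf{S}_a (smaller prior variance) pulls \hat{x} toward \vec{x}_a and shrinks the diagonal of \mathbf{R}, which is exactly the mixing-in of a priori information that the averaging kernel quantifies.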


References

* {{cite journal |title=Atmospheric Remote Sensing: The Inverse Problem |year=2002 |author=Clive D. Rodgers |publisher=University of Oxford |journal=Proceedings of the Fourth Oxford/RAL Spring School in Quantitative Earth Observation}}