Isoline retrieval is a
remote sensing
Remote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object, in contrast to in situ or on-site observation. The term is applied especially to acquiring information about Ear ...
inverse method that retrieves one or more
isolines of a trace atmospheric constituent or variable. When used to validate another contour, it is the most accurate method possible for the task. When used to retrieve a whole field, it is a general, nonlinear inverse method and a robust estimator.
For validating advected contours
Rationale
Suppose we have, as in
contour advection
Contour advection is a Lagrangian method
of simulating the evolution of one or more contours or isolines of
a tracer as it is stirred by a moving fluid.
Consider a blob of dye injected into a river or stream: to first order it could be modelled ...
, inferred knowledge of a
single contour or isoline of an atmospheric constituent, ''q''
and we wish to validate this against satellite remote-sensing data.
Since satellite instruments cannot measure the constituent directly,
we need to perform some sort of inversion.
In order to validate the contour, it is not necessary to know,
at any given point, the exact value of the constituent. We only need to
know whether it falls inside or outside, that is, is it greater
than or less than the value of the contour, ''q
0''.
This is a classification problem. Let:
:
be the discretized variable.
This will be related to the satellite ''measurement vector'',
,
by some conditional probability,
,
which we approximate by collecting samples, called ''training data'', of both the
measurement vector and the state variable, ''q''.
By generating classification results over the region of interest
and using any contouring algorithm to separate the
two classes, the isoline will have been "retrieved."
The accuracy of a retrieval will be given by integrating
the conditional probability over the area of interest, ''A'':
:
where ''c'' is the retrieved class at position,
.
We can maximize this quantity by maximizing the value of the integrand
at each point:
:
Since this is the definition of maximum likelihood,
a
classification algorithm based on
maximum likelihood
In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed sta ...
is the most accurate method possible of validating an advected contour.
A good method for performing maximum likelihood classification
from a set of training data is
variable kernel density estimation In statistics, adaptive or "variable-bandwidth" kernel density estimation is a form of kernel density estimation
In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a no ...
.
Training data
There are two methods of generating the training data.
The most obvious is empirically, by simply matching measurements of
the variable, ''q'', with
collocated
measurements from the satellite instrument. In this case,
no knowledge of the actual physics that produce the measurement
is required and the retrieval algorithm is purely statistical.
The second is with a forward model:
:
where
is the ''state vector'' and
''q = x
k'' is a single component.
An advantage of this method is that state vectors need not
reflect actual atmospheric configurations, they need only
take on a state that could reasonably occur in the real atmosphere.
There are also none of the errors inherent in
most
collocation
In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation is a type of compositional phraseme, meaning that it can be understood from the words ...
procedures,
e.g. because of offset errors in the locations of the paired samples
and differences in the footprint sizes of the two instruments.
Since retrievals will be biased towards more common states,
however, the statistics ought to reflect those in the real world.
Error characterization
The conditional probabilities,
, provide
excellent error characterization, therefore the classification
algorithm ought to return them.
We define the ''confidence rating'' by rescaling the conditional
probability:
:
where ''n
c'' is the number of classes (in this case, two).
If ''C'' is zero, then the classification is little better than
chance, while if it is one, then it should be perfect.
To transform the confidence rating to a statistical ''tolerance'',
the following line integral can be applied to an isoline retrieval
for which the true isoline is known:
:
where ''s'' is the path, ''l'' is the length of the isoline
and
is the retrieved confidence as a function
of position.
While it appears that the integral must be evaluated separately
for each value of the confidence rating, ''C'', in fact it may be
done for all values of ''C'' by sorting the confidence ratings of the
results,
.
The function relates the threshold value of the confidence rating
for which the tolerance is applicable.
That is, it defines a region that contains a fraction of the true
isoline equal to the tolerance.
Example: water vapour from AMSU
The
Advanced Microwave Sounding Unit (AMSU) series of satellite instruments
are designed to detect temperature and water vapour. They have a high
horizontal resolution (as little as 15 km) and because they are
mounted on more than one satellite, full global coverage can be
obtained in less than one day.
Training data was generated using the second method from
European Centre for Medium-Range Weather Forecasts
The European Centre for Medium-Range Weather Forecasts (ECMWF) is an independent intergovernmental organisation supported by most of the nations of Europe. It is based at three sites: Shinfield Park, Reading, United Kingdom; Bologna, Italy; an ...
(ECMWF) ERA-40
data fed to a fast
radiative transfer
Radiative transfer is the physical phenomenon of energy transfer in the form of electromagnetic radiation. The propagation of radiation through a medium is affected by absorption, emission, and scattering processes. The equation of radiative trans ...
model called
RTTOV.
The function,
has been generated from
simulated retrievals and is shown in the figure to the right.
This is then used to set the 90 percent tolerance in the figure
below by shading all the confidence ratings less than 0.8.
Thus we expect the true isoline to fall within the shading
90 percent of the time.
For continuum retrievals
Isoline retrieval is also useful for retrieving a continuum variable
and constitutes a general,
nonlinear
In mathematics and science, a nonlinear system is a system in which the change of the output is not proportional to the change of the input. Nonlinear problems are of interest to engineers, biologists, physicists, mathematicians, and many other ...
inverse method.
It has the advantage over both a
neural network
A neural network is a network or neural circuit, circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus, a neural network is either a biological neural network, made up ...
, as well as iterative
methods such as
optimal estimation In applied statistics, optimal estimation is a regularized matrix inverse method based on Bayes' theorem.
It is used very commonly in the geosciences, particularly for atmospheric sounding.
A matrix inverse problem looks like this:
:
\mathbf \vec ...
that invert the forward model
directly, in that there is no possibility of getting stuck in a
local minimum
In mathematical analysis, the maxima and minima (the respective plurals of maximum and minimum) of a function, known collectively as extrema (the plural of extremum), are the largest and smallest value of the function, either within a given r ...
.
There are a number of methods of reconstituting the continuum variable
from the discretized one. Once a sufficient number of contours
have been retrieved, it is straightforward to
interpolate
In the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing (finding) new data points based on the range of a discrete set of known data points.
In engineering and science, one often has a n ...
between
them. Conditional probabilities make a good
proxy
Proxy may refer to:
* Proxy or agent (law), a substitute authorized to act for another entity or a document which authorizes the agent so to act
* Proxy (climate), a measured variable used to infer the value of a variable of interest in climate re ...
for
the continuum value.
Consider the transformation from a continuum to a discrete variable:
:
:
Suppose that
is given by a Gaussian:
:
where
is the expectation value and
is the standard deviation, then the conditional probability is related to the
continuum variable, ''q'', by the error function:
:
The figure shows conditional probability versus specific humidity for the example
retrieval discussed above.
As a robust estimator
The location of ''q''
0 is found by setting the conditional probabilities
of the two classes to be equal:
:
In other words, equal amounts of the "zeroeth order moment" lie on either side
of ''q''
0. This type of formulation is characteristic of a
robust estimator
Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such ...
.
References
*
* {{Cite journal
, author = Peter Mills
, title = Efficient statistical classification of satellite measurements
, journal = International Journal of Remote Sensing
, doi = 10.1080/01431161.2010.507795
, year = 2010
, url = http://peteysoft.users.sourceforge.net/TRES_A_507795.pdf
, arxiv = 1202.2194
, access-date = 2011-12-28
, archive-url = https://web.archive.org/web/20120426073755/http://peteysoft.users.sourceforge.net/TRES_A_507795.pdf
, archive-date = 2012-04-26
, url-status = dead
External links
Software for isoline retrieval
Remote sensing
Inverse problems