Isoline retrieval is a remote sensing inverse method that retrieves one or more isolines of a trace atmospheric constituent or variable. When used to validate another contour, it is the most accurate method possible for the task. When used to retrieve a whole field, it is a general, nonlinear inverse method and a robust estimator.

For validating advected contours

Rationale

Suppose we have, as in contour advection, inferred knowledge of a single contour or isoline of an atmospheric constituent, ''q'', and we wish to validate this against satellite remote-sensing data. Since satellite instruments cannot measure the constituent directly, we need to perform some sort of inversion. In order to validate the contour, it is not necessary to know the exact value of the constituent at any given point. We only need to know whether the point falls inside or outside the contour, that is, whether ''q'' is greater than or less than the value of the contour, ''q₀''. This is a classification problem. Let:

j = \begin{cases} 1; & q < q_0 \\
		2; & q \geq q_0 \end{cases}

be the discretized variable. This will be related to the satellite ''measurement vector'',

\vec y

, by some conditional probability,

P(j \mid \vec y)

, which we approximate by collecting samples, called ''training data'', of both the measurement vector and the state variable, ''q''. By generating classification results over the region of interest and using any contouring algorithm to separate the two classes, the isoline will have been "retrieved." The accuracy of a retrieval, ''a'', will be given by integrating the conditional probability over the area of interest, ''A'':

a = \frac{1}{A} \int_A P\left(c(\vec r) \mid \vec y(\vec r)\right) \, d\vec r

where ''c'' is the retrieved class at position,

\vec r

. We can maximize this quantity by maximizing the value of the integrand at each point:

\max(a) = \frac{1}{A} \int_A \max_j \left \lbrace P\left(j \mid \vec y(\vec r)\right) \right \rbrace \, d\vec r

Since this is the definition of maximum likelihood, a classification algorithm based on maximum likelihood is the most accurate method possible of validating an advected contour. A good method for performing maximum likelihood classification from a set of training data is variable kernel density estimation.
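The classification step described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a fixed-bandwidth Gaussian kernel rather than the variable-bandwidth estimator named in the text, and the function names, the `bandwidth` parameter and the toy data are hypothetical. Because both classes share the same kernel, the normalization constants cancel and the ratio of kernel sums estimates the conditional probability of each class.

```python
import math

def kernel_sum(y, samples, bandwidth):
    """Sum of fixed-bandwidth Gaussian kernels centred on the training samples."""
    total = 0.0
    for s in samples:
        d2 = sum((a - b) ** 2 for a, b in zip(y, s))
        total += math.exp(-0.5 * d2 / bandwidth ** 2)
    return total

def classify(y, train_y, train_q, q0, bandwidth=0.5):
    """Return (class, estimated P(class | y)); class 1 is q < q0, class 2 is q >= q0."""
    below = [v for v, q in zip(train_y, train_q) if q < q0]
    above = [v for v, q in zip(train_y, train_q) if q >= q0]
    w1 = kernel_sum(y, below, bandwidth)
    w2 = kernel_sum(y, above, bandwidth)
    total = w1 + w2
    if total == 0.0:
        return 1, 0.5  # no nearby training data: the estimate carries no information
    p1 = w1 / total
    # Pick the class with the larger conditional probability.
    return (1, p1) if p1 > 0.5 else (2, 1.0 - p1)

# Toy one-dimensional measurement vectors paired with values of q (invented data).
train_y = [(0.0,), (0.2,), (1.0,), (1.2,)]
train_q = [1.0, 1.5, 3.0, 3.5]
cls, prob = classify((0.1,), train_y, train_q, q0=2.0)
```

Applying `classify` at every measurement location and contouring the boundary between the two classes would give the retrieved isoline.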

Training data

There are two methods of generating the training data. The most obvious is empirical: simply matching measurements of the variable, ''q'', with collocated measurements from the satellite instrument. In this case, no knowledge of the actual physics that produce the measurement is required and the retrieval algorithm is purely statistical. The second is with a forward model:

\vec y = \vec f(\vec x) \,

where

\vec x

is the ''state vector'' and ''q = x_k'' is a single component. An advantage of this method is that state vectors need not reflect actual atmospheric configurations; they need only take on a state that could reasonably occur in the real atmosphere. There are also none of the errors inherent in most collocation procedures, e.g. those caused by offset errors in the locations of the paired samples and by differences in the footprint sizes of the two instruments. Since retrievals will be biased towards more common states, however, the statistics ought to reflect those in the real world.
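The forward-model route to training data can be sketched as follows. Everything specific here is an assumption made for illustration: `forward_model` is a made-up two-channel function standing in for a real radiative transfer model, the state vector has only two components, and the sampling distributions are invented rather than taken from any climatology.

```python
import random

def forward_model(x):
    """Hypothetical stand-in for a radiative transfer model, y = f(x).

    x = (q, t): q is the retrieved component, t a nuisance variable.
    The coefficients are invented for illustration only.
    """
    q, t = x
    return (t - 20.0 * q, t - 5.0 * q * q)

def make_training_data(n, seed=0):
    """Draw n plausible state vectors and simulate their measurements."""
    rng = random.Random(seed)
    train_y, train_q = [], []
    for _ in range(n):
        # Sample states from an assumed distribution so that the training
        # statistics resemble those expected in the real world.
        x = (rng.uniform(0.0, 1.0), rng.gauss(250.0, 10.0))
        train_q.append(x[0])              # q = x_k with k = 0
        train_y.append(forward_model(x))  # simulated measurement vector
    return train_y, train_q

train_y, train_q = make_training_data(50)
```

The resulting pairs play the same role as collocated measurements, without the offset and footprint errors mentioned above.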

Error characterization

The conditional probabilities,

P(j \mid \vec y)

, provide excellent error characterization, so the classification algorithm ought to return them. We define the ''confidence rating'' by rescaling the conditional probability:

C = \frac{n_c P(c \mid \vec y) - 1}{n_c - 1}

where ''n_c'' is the number of classes (in this case, two). If ''C'' is zero, then the classification is no better than chance, while if it is one, then it should be perfect. To transform the confidence rating to a statistical ''tolerance'', the following line integral can be applied to an isoline retrieval for which the true isoline is known:

\delta(C) = \frac{1}{l} \int_0^l h\left(C - C^\prime(\vec r)\right) \, ds

where ''h'' is the Heaviside step function, ''s'' is the path, ''l'' is the length of the isoline and

C^\prime

is the retrieved confidence as a function of position. While it appears that the integral must be evaluated separately for each value of the confidence rating, ''C'', in fact it may be done for all values of ''C'' by sorting the confidence ratings of the results,

C^\prime

. The function relates each threshold value of the confidence rating to the tolerance that applies to it. That is, the region where the confidence rating falls below the threshold contains a fraction of the true isoline equal to the tolerance.
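The sorting shortcut mentioned above can be sketched as follows. The sketch assumes that the confidence rating is a linear rescaling of the conditional probability that gives zero at chance level and one at certainty, that ''h'' is a Heaviside step with h(0) = 0, and that the retrieved confidence has been sampled at equal arc-length steps along the true isoline. The function names are hypothetical.

```python
from bisect import bisect_left

def confidence_rating(p, n_classes=2):
    """Rescale a winning-class probability so chance maps to 0 and certainty to 1."""
    return (n_classes * p - 1.0) / (n_classes - 1.0)

def tolerance_function(confidences_along_isoline):
    """Build delta(C) from confidences sampled evenly along the true isoline."""
    sorted_c = sorted(confidences_along_isoline)
    n = len(sorted_c)

    def delta(c):
        # Fraction of the isoline where the retrieved confidence is below c.
        return bisect_left(sorted_c, c) / n

    return delta

delta = tolerance_function([0.1, 0.5, 0.9, 0.3])
fraction_below = delta(0.4)  # two of the four samples lie below 0.4
```

Sorting once makes every later evaluation of `delta` a binary search, so the integral does not need to be recomputed for each threshold.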

Example: water vapour from AMSU

The Advanced Microwave Sounding Unit (AMSU) series of satellite instruments is designed to detect temperature and water vapour. The instruments have a high horizontal resolution (as fine as 15 km) and, because they are mounted on more than one satellite, full global coverage can be obtained in less than one day. Training data was generated using the second method, from European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-40 data fed to a fast radiative transfer model called RTTOV. The function,

\delta(C)

, has been generated from simulated retrievals and is shown in the figure to the right. This is then used to set the 90 percent tolerance in the figure below by shading all the confidence ratings less than 0.8. Thus we expect the true isoline to fall within the shading 90 percent of the time.

For continuum retrievals

Isoline retrieval is also useful for retrieving a continuum variable and constitutes a general, nonlinear inverse method. It has the advantage over both neural networks and iterative methods such as optimal estimation, which invert the forward model directly, that there is no possibility of getting stuck in a local minimum.

There are a number of methods of reconstituting the continuum variable from the discretized one. Once a sufficient number of contours have been retrieved, it is straightforward to interpolate between them. Conditional probabilities also make a good proxy for the continuum value. Consider the transformation from a continuum to a discrete variable:

P(1 \mid \vec y) = \int_{-\infty}^{q_0} P(q \mid \vec y) \, dq

P(2 \mid \vec y) = \int_{q_0}^{\infty} P(q \mid \vec y) \, dq
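One hedged way to reconstitute a continuum value from several retrieved isolines is to find where the probability of the upper class crosses one half and interpolate linearly between the neighbouring thresholds. The function below is an illustrative sketch rather than the method of any particular package; its name and arguments are hypothetical, and it assumes the thresholds are sorted in ascending order so that the probabilities are non-increasing.

```python
def interpolate_continuum(thresholds, prob_above):
    """Estimate q at one measurement location from several isoline retrievals.

    thresholds: ascending values q0_i.
    prob_above: estimated P(2 | y) for each threshold, i.e. P(q >= q0_i | y).
    Returns the interpolated q where the probability crosses 0.5,
    or None if no crossing lies inside the range of thresholds.
    """
    for i in range(len(thresholds) - 1):
        p_lo, p_hi = prob_above[i], prob_above[i + 1]
        if p_lo >= 0.5 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 0.5) / (p_lo - p_hi)
            return thresholds[i] + frac * (thresholds[i + 1] - thresholds[i])
    return None

q_est = interpolate_continuum([1.0, 2.0, 3.0], [0.9, 0.6, 0.2])
```

With these invented numbers the crossing falls a quarter of the way between the second and third thresholds.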

Suppose that

P(q \mid \vec y)

is given by a Gaussian:

P(q \mid \vec y) = \frac{1}{\sqrt{2 \pi} \sigma_q}
	\exp \left \lbrace - \frac{(q - \bar q)^2}{2 \sigma_q^2} \right \rbrace

where

\bar q

is the expectation value and

\sigma_q

is the standard deviation, then the conditional probability is related to the continuum variable, ''q'', by the error function:

R = P(2 \mid \vec y) - P(1 \mid \vec y) = \mathrm{erf} \left[ \frac{\bar q - q_0}{\sqrt{2} \sigma_q} \right]

The figure shows conditional probability versus specific humidity for the example retrieval discussed above.
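The Gaussian case can be checked numerically. The sketch below assumes class 1 is ''q'' below the threshold and class 2 is ''q'' at or above it, computes both class probabilities from the Gaussian cumulative distribution, and compares their difference with the error-function expression. The function names and the sample numbers are hypothetical.

```python
import math

def class_probabilities(q_bar, sigma_q, q0):
    """P(1|y) and P(2|y) for a Gaussian P(q|y) with mean q_bar and std sigma_q."""
    z = (q0 - q_bar) / (math.sqrt(2.0) * sigma_q)
    p1 = 0.5 * (1.0 + math.erf(z))  # probability mass below q0
    return p1, 1.0 - p1

def probability_difference(q_bar, sigma_q, q0):
    """R = P(2|y) - P(1|y) written directly with the error function."""
    return math.erf((q_bar - q0) / (math.sqrt(2.0) * sigma_q))

p1, p2 = class_probabilities(q_bar=3.0, sigma_q=1.0, q0=2.5)
r = probability_difference(q_bar=3.0, sigma_q=1.0, q0=2.5)
# p2 - p1 and r agree to rounding error; r is positive because q_bar > q0.
```

Since the error function is monotonic, the difference in conditional probabilities can be inverted to recover the expectation value once the standard deviation is known or assumed.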

As a robust estimator

The location of ''q''₀ is found by setting the conditional probabilities of the two classes to be equal:

\int_{-\infty}^{q_0} P(q \mid \vec y) \, dq = \int_{q_0}^{\infty} P(q \mid \vec y) \, dq

In other words, equal amounts of the "zeroth order moment" lie on either side of ''q''₀. This type of formulation is characteristic of a robust estimator.
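A point that splits the probability mass into two equal halves is the median of the distribution, which is the standard example of a robust estimator. The short sketch below, using invented sample values, shows how a single outlier moves the mean far more than the half-mass point.

```python
import statistics

# Invented samples of q clustered near 1.0, then contaminated by one outlier.
clean = [1.0, 1.1, 0.9, 1.05, 0.95]
contaminated = clean + [100.0]

# The half-mass point (median) barely moves when the outlier is added.
median_shift = abs(statistics.median(contaminated) - statistics.median(clean))

# The first-moment estimate (mean) is dragged far from the cluster.
mean_shift = abs(statistics.mean(contaminated) - statistics.mean(clean))
```

Here `median_shift` stays within the spread of the clean samples while `mean_shift` is larger than the whole range of the uncontaminated data.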

References

* Peter Mills (2010). "Efficient statistical classification of satellite measurements". International Journal of Remote Sensing. doi:10.1080/01431161.2010.507795. arXiv:1202.2194. Archived from the original (http://peteysoft.users.sourceforge.net/TRES_A_507795.pdf) at https://web.archive.org/web/20120426073755/http://peteysoft.users.sourceforge.net/TRES_A_507795.pdf on 2012-04-26. Retrieved 2011-12-28.
