Non-homogeneous Gaussian regression (NGR) is a type of statistical regression analysis used in the atmospheric sciences as a way to convert ensemble forecasts into probabilistic forecasts. Relative to simple linear regression, NGR uses the ensemble spread as an additional predictor, which is used to improve the prediction of uncertainty and allows the predicted uncertainty to vary from case to case. The prediction of uncertainty in NGR is derived from both past forecast error statistics and the ensemble spread. NGR was originally developed for site-specific medium-range temperature forecasting, but has since also been applied to site-specific medium-range wind forecasting and to seasonal forecasts, and has been adapted for precipitation forecasting.
The introduction of NGR was the first demonstration that probabilistic forecasts that take account of the varying ensemble spread could achieve better skill scores than forecasts based on standard model output statistics (MOS) approaches applied to the ensemble mean.
Intuition
Weather forecasts generated by computer simulations of the atmosphere and ocean typically consist of an ensemble of individual forecasts. Ensembles are used as a way to attempt to capture and quantify the uncertainties in the weather forecasting process, such as uncertainty in the initial conditions and uncertainty in the parameterisations in the model. For point forecasts of normally distributed variables, one can summarize an ensemble forecast with the mean and the standard deviation of the ensemble. The ensemble mean is often a better forecast than any of the individual forecasts, and the ensemble standard deviation may give an indication of the uncertainty in the forecast.
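As a small illustration of this summary step, the sketch below reduces a single ensemble to its mean and standard deviation using Python's standard library. The member values and the function name `summarize_ensemble` are hypothetical, chosen only for the example.

```python
from statistics import fmean, stdev


def summarize_ensemble(members):
    """Return (ensemble mean, ensemble standard deviation) for one forecast."""
    # stdev() is the sample standard deviation (divides by len(members) - 1).
    return fmean(members), stdev(members)


# Hypothetical 5-member ensemble of temperature forecasts, in degrees Celsius.
members = [14.2, 15.1, 13.8, 14.9, 15.5]
ens_mean, ens_spread = summarize_ensemble(members)
print(f"ensemble mean = {ens_mean:.2f}, ensemble spread = {ens_spread:.2f}")
```

The two numbers returned here are the only ensemble information that the calibration methods discussed in this article use.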
However, direct output from computer simulations of the atmosphere needs calibration before it can be meaningfully compared with observations of weather variables. This calibration process is often known as model output statistics (MOS). The simplest form of such calibration is to correct biases, using a bias correction calculated from past forecast errors. Bias correction can be applied to both individual ensemble members and the ensemble mean. A more complex form of calibration is to use past forecasts and past observations to train a simple linear regression model that maps the ensemble mean onto the observations. In such a model the uncertainty in the prediction is derived purely from the statistical properties of the past forecast errors. However, ensemble forecasts are constructed with the hope that the ensemble spread may contain additional information about the uncertainty, above and beyond the information that can be derived from analysing past performance of the forecast. In particular, since the ensemble spread is typically different for each successive forecast, it has been suggested that the ensemble spread may give a basis for predicting different levels of uncertainty in different forecasts, which is difficult to do from past performance-based estimates of uncertainty. Whether the ensemble spread actually contains information about forecast uncertainty, and how much, depends on many factors such as the forecast system, the forecast variable, the resolution and the lead time of the forecast.
NGR is a way to include information from the ensemble spread in the calibration of a forecast, by predicting future uncertainty as a weighted combination of the uncertainty estimated using past forecast errors, as in MOS, and the uncertainty estimated using the ensemble spread. The weights on the two sources of uncertainty information are calibrated using past forecasts and past observations in an attempt to derive optimal weighting.
Overview
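Consider a series of past weather observations <math>y_i</math> over a period of <math>n</math> days (or other time interval):
: <math>y_i, \quad i = 1, \ldots, n</math>
and a corresponding series of past ensemble forecasts, characterized by the sample mean <math>m_i</math> and standard deviation <math>s_i</math> of the ensemble:
: <math>(m_i, s_i), \quad i = 1, \ldots, n</math>.
Also consider a new ensemble forecast from the same system with ensemble mean <math>m_{n+1}</math> and ensemble standard deviation <math>s_{n+1}</math>, intended as a forecast for an unknown future weather observation <math>y_{n+1}</math>.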
A straightforward way to calibrate the new ensemble forecast output parameters <math>(m_{n+1}, s_{n+1})</math> and produce a calibrated forecast for <math>y_{n+1}</math> is to use a simple linear regression model based on the ensemble mean <math>m_i</math>, trained using the past weather observations and past forecasts:
: <math>y_i \sim N(\alpha + \beta m_i, \sigma^2)</math>
This model has the effect of bias correcting the ensemble mean and adjusting the level of variability of the forecast.
It can be applied to the new ensemble forecast <math>(m_{n+1}, s_{n+1})</math> to generate a point forecast for <math>y_{n+1}</math> using
: <math>\hat{y}_{n+1} = \hat{\alpha} + \hat{\beta} m_{n+1}</math>
or to obtain a probabilistic forecast for the distribution of possible values for <math>y_{n+1}</math> based on the normal distribution with mean <math>\hat{\alpha} + \hat{\beta} m_{n+1}</math> and variance <math>\hat{\sigma}^2</math>:
: <math>y_{n+1} \sim N(\hat{\alpha} + \hat{\beta} m_{n+1}, \hat{\sigma}^2)</math>
The use of regression to calibrate weather forecasts in this way is an example of model output statistics.
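The regression-based calibration described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the training data are invented, `fit_linear_mos` is a hypothetical helper name, and the residual standard deviation is computed with n - 2 degrees of freedom, which is one common convention and an assumption here.

```python
from statistics import NormalDist, fmean


def fit_linear_mos(ens_means, obs):
    """Least-squares fit of obs = alpha + beta * ens_mean + error."""
    n = len(obs)
    m_bar, y_bar = fmean(ens_means), fmean(obs)
    sxx = sum((m - m_bar) ** 2 for m in ens_means)
    sxy = sum((m - m_bar) * (y - y_bar) for m, y in zip(ens_means, obs))
    beta = sxy / sxx
    alpha = y_bar - beta * m_bar
    residuals = [y - (alpha + beta * m) for m, y in zip(ens_means, obs)]
    # Residual standard deviation with n - 2 degrees of freedom (an assumption;
    # a maximum likelihood fit would divide by n instead).
    sigma = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return alpha, beta, sigma


# Hypothetical training data: past ensemble means and matching observations.
past_means = [10.0, 12.0, 14.0, 16.0, 18.0]
past_obs = [11.2, 12.9, 15.1, 16.8, 19.0]
alpha, beta, sigma = fit_linear_mos(past_means, past_obs)

# Calibrate a new ensemble mean: point forecast plus normal predictive distribution.
new_mean = 15.0
point_forecast = alpha + beta * new_mean
predictive = NormalDist(mu=point_forecast, sigma=sigma)
lower, upper = predictive.inv_cdf(0.05), predictive.inv_cdf(0.95)
print(f"point forecast {point_forecast:.2f}, 90% interval [{lower:.2f}, {upper:.2f}]")
```

In this sketch the width of the interval is the same for every new forecast, because the fitted standard deviation does not depend on the ensemble spread.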
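However, this simple linear regression model does not use the ensemble standard deviation <math>s_i</math>, and hence misses any information that the ensemble standard deviation may contain about the forecast uncertainty. The NGR model was introduced as a way to potentially improve the prediction of uncertainty in the forecast of <math>y_{n+1}</math> by including information extracted from the ensemble standard deviation. It achieves this by generalising the simple linear regression model to either:
: <math>y_i \sim N(\alpha + \beta m_i, (\gamma + \delta s_i)^2)</math>
or
: <math>y_i \sim N(\alpha + \beta m_i, \gamma^2 + \delta^2 s_i^2)</math>
This can then be used to calibrate the new ensemble forecast parameters <math>(m_{n+1}, s_{n+1})</math> using either
: <math>y_{n+1} \sim N(\hat{\alpha} + \hat{\beta} m_{n+1}, (\hat{\gamma} + \hat{\delta} s_{n+1})^2)</math>
or
: <math>y_{n+1} \sim N(\hat{\alpha} + \hat{\beta} m_{n+1}, \hat{\gamma}^2 + \hat{\delta}^2 s_{n+1}^2)</math>
respectively. The prediction uncertainty is now given by two terms: the <math>\gamma</math> term is constant in time, while the <math>\delta</math> term varies as the ensemble spread varies.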
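To make the effect of the spread-dependent term concrete, the sketch below builds an NGR predictive distribution for two new forecasts that share the same ensemble mean but differ in ensemble spread. It assumes the form in which the predictive variance is a constant part plus a part proportional to the squared ensemble spread, written in the code as `gamma**2 + delta**2 * s**2`. The parameter values and the function name `ngr_forecast` are hypothetical.

```python
import math
from statistics import NormalDist


def ngr_forecast(m_new, s_new, alpha, beta, gamma, delta):
    """Normal predictive distribution with spread-dependent variance."""
    mu = alpha + beta * m_new
    # Constant part (gamma) plus a part that scales with the ensemble spread (delta).
    sigma = math.sqrt(gamma ** 2 + delta ** 2 * s_new ** 2)
    return NormalDist(mu=mu, sigma=sigma)


# Hypothetical fitted parameters, for illustration only.
params = dict(alpha=0.5, beta=0.9, gamma=1.0, delta=0.8)

# Same ensemble mean, different ensemble spread.
low_spread = ngr_forecast(20.0, 0.5, **params)
high_spread = ngr_forecast(20.0, 3.0, **params)

for label, dist in [("low spread", low_spread), ("high spread", high_spread)]:
    print(f"{label}: mean {dist.mean:.2f}, standard deviation {dist.stdev:.2f}")
```

Setting `delta` to zero in this sketch recovers a forecast whose uncertainty is the same in every case, as in the simple linear regression model.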
Parameter estimation
In the scientific literature the four parameters <math>\alpha, \beta, \gamma, \delta</math> of NGR have been estimated either by maximum likelihood or by minimum continuous ranked probability score (CRPS). The pros and cons of these two approaches have also been discussed.
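The two estimation criteria can be written as objective functions of the four parameters. The sketch below is an illustration under stated assumptions, not the method of any particular paper: it assumes the variance form `gamma**2 + delta**2 * s**2`, uses the standard closed-form CRPS of a normal distribution, and all function names and data values are hypothetical. Either objective could then be passed to a general-purpose numerical minimiser (for example `scipy.optimize.minimize`), which is not shown here.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def ngr_mean_sd(params, m, s):
    """Predictive mean and standard deviation for one forecast case."""
    alpha, beta, gamma, delta = params
    # gamma and delta must not both be zero, or the standard deviation vanishes.
    return alpha + beta * m, math.sqrt(gamma ** 2 + delta ** 2 * s ** 2)


def mean_neg_log_likelihood(params, means, spreads, obs):
    """Average negative log-likelihood; minimising it is maximum likelihood."""
    total = 0.0
    for m, s, y in zip(means, spreads, obs):
        mu, sigma = ngr_mean_sd(params, m, s)
        z = (y - mu) / sigma
        total += 0.5 * math.log(2 * math.pi) + math.log(sigma) + 0.5 * z * z
    return total / len(obs)


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a normal forecast N(mu, sigma^2) for observation y."""
    z = (y - mu) / sigma
    return sigma * (
        z * (2 * STD_NORMAL.cdf(z) - 1)
        + 2 * STD_NORMAL.pdf(z)
        - 1 / math.sqrt(math.pi)
    )


def mean_crps(params, means, spreads, obs):
    """Average CRPS over the training cases; lower values are better."""
    scores = [crps_normal(*ngr_mean_sd(params, m, s), y)
              for m, s, y in zip(means, spreads, obs)]
    return sum(scores) / len(scores)


# Hypothetical training data and one candidate parameter vector.
means, spreads, obs = [10.0, 14.0, 18.0], [0.5, 1.5, 3.0], [10.8, 13.1, 21.0]
candidate = (0.5, 0.9, 1.0, 0.8)  # (alpha, beta, gamma, delta)
print(mean_neg_log_likelihood(candidate, means, spreads, obs))
print(mean_crps(candidate, means, spreads, obs))
```

Both functions are negatively oriented, so a fitting routine would search for the parameter vector that makes the chosen one as small as possible.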
History
NGR was originally developed in the private sector by scientists at Risk Management Solutions Ltd for the purpose of using information in the ensemble spread for the valuation of weather derivatives.
Terminology
NGR was originally referred to as ‘spread regression’. Subsequent authors, however, introduced first the alternative name Ensemble Model Output Statistics (EMOS) and then NGR. The original name ‘spread regression’ has now fallen from use; EMOS is used to refer generally to any method used for the calibration of ensembles, and NGR is typically used to refer to the method described in this article.
References
* Gebetsberger, M.; Messner, J.; Mayr, G.; Zeileis, A. (2018). "Estimation Methods for Nonhomogeneous Regression Models: Minimum Continuous Ranked Probability Score versus Maximum Likelihood". Monthly Weather Review. 146 (12): 4323–4338. doi:10.1175/MWR-D-17-0364.1.
* Gneiting, T.; Raftery, A.; Westveld, A.; Goldman, T. (2005). "Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation". Monthly Weather Review. 133 (5): 1098. doi:10.1175/MWR2904.1.
* Jewson, S.; Brix, A.; Ziehmann, C. (2004). "A new parametric model for the assessment and calibration of medium‐range ensemble temperature forecasts". Atmospheric Science Letters. 5 (5): 96–102. doi:10.1002/asl.69.
* Lalic, B.; Firany Sremac, A.; Dekic, L.; Eitzinger, J. (2017). "Seasonal forecasting of green water components and crop yields of winter wheat in Serbia and Austria". The Journal of Agricultural Science. 156 (5): 645–657. doi:10.1017/S0021859617000788. PMID 30369628. PMC 6199547.
* Scheuerer, M. (2013). "Probabilistic quantitative precipitation forecasting using Ensemble Model Output Statistics". Quarterly Journal of the Royal Meteorological Society. 140 (680): 1086–1096. doi:10.1002/qj.2183. arXiv:1302.0893. S2CID 88512854.
* Thorarinsdottir, T.; Johnson, M. (2012). "Probabilistic Wind Gust Forecasting Using Nonhomogeneous Gaussian Regression". Monthly Weather Review. 140 (3): 889–897. doi:10.1175/MWR-D-11-00075.1.