An error correction model (ECM) belongs to a category of multiple time series models most commonly used for data where the underlying variables have a long-run common stochastic trend, also known as cointegration. ECMs are a theoretically-driven approach useful for estimating both short-term and long-term effects of one time series on another. The term error-correction relates to the fact that last-period's deviation from a long-run equilibrium, the ''error'', influences its short-run dynamics. Thus ECMs directly estimate the speed at which a dependent variable returns to equilibrium after a change in other variables.
History
Yule (1926) and Granger and Newbold (1974) were the first to draw attention to the problem of spurious correlation and find solutions on how to address it in time series analysis. Given two completely unrelated but integrated (non-stationary) time series, the regression analysis of one on the other will tend to produce an apparently statistically significant relationship and thus a researcher might falsely believe to have found evidence of a true relationship between these variables.
Ordinary least squares will no longer be consistent and commonly used test-statistics will be non-valid. In particular, Monte Carlo simulations show that one will get a very high R squared, a very high individual t-statistic and a low Durbin–Watson statistic. Technically speaking, Phillips (1986) proved that parameter estimates will not converge in probability, the intercept will diverge and the slope will have a non-degenerate distribution as the sample size increases. However, there might be a common stochastic trend to both series that a researcher is genuinely interested in because it reflects a long-run relationship between these variables.
Because of the stochastic nature of the trend it is not possible to break up integrated series into a deterministic (predictable) trend and a stationary series containing deviations from trend. Even in deterministically detrended random walks spurious correlations will eventually emerge. Thus detrending does not solve the estimation problem.
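The spurious regression phenomenon is easy to reproduce numerically. Below is a minimal Python sketch (assuming numpy and statsmodels are available; the sample size and seed are arbitrary) that regresses one independent random walk on another in levels; the fit typically shows a high R squared, a large slope t-statistic and a Durbin–Watson statistic well below 2, even though the series are unrelated.

<syntaxhighlight lang="python">
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(0)   # arbitrary seed, illustrative only
T = 500                          # arbitrary sample size

# Two completely unrelated I(1) series: independent random walks.
x = np.cumsum(rng.standard_normal(T))
y = np.cumsum(rng.standard_normal(T))

# "Spurious" regression of y on x in levels.
res = sm.OLS(y, sm.add_constant(x)).fit()

print("R squared:     ", res.rsquared)              # typically high despite no true relation
print("slope t-stat:  ", res.tvalues[1])            # typically far from zero
print("Durbin-Watson: ", durbin_watson(res.resid))  # typically well below 2
</syntaxhighlight>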
In order to still use the Box–Jenkins approach, one could difference the series and then estimate models such as ARIMA, given that many commonly used time series (e.g. in economics) appear to be stationary in first differences. Forecasts from such a model will still reflect cycles and seasonality that are present in the data. However, any information about long-run adjustments that the data in levels may contain is omitted and longer term forecasts will be unreliable. This led Sargan (1964) to develop the ECM methodology, which retains the level information.
Estimation
Several methods are known in the literature for estimating a refined dynamic model as described above. Among these are the Engle and Granger 2-step approach, estimating their ECM in one step and the vector-based VECM using Johansen's method.
Engle and Granger 2-step approach
The first step of this method is to pretest the individual time series one uses in order to confirm that they are non-stationary in the first place. This can be done by standard unit root testing, using the DF test and the ADF test (the latter to resolve the problem of serially correlated errors).
Take the case of two different series <math>x_t</math> and <math>y_t</math>. If both are I(0), standard regression analysis will be valid. If they are integrated of a different order, e.g. one being I(1) and the other being I(0), one has to transform the model.

If they are both integrated to the same order (commonly I(1)), we can estimate an ECM model of the form

:<math>A(L) \, \Delta y_t = \gamma + B(L) \, \Delta x_t + \alpha (y_{t-1} - \beta_0 - \beta_1 x_{t-1}) + \nu_t, \qquad \alpha \neq 0,</math>

where <math>A(L)</math> and <math>B(L)</math> are lag polynomials. ''If'' both variables are integrated and this ECM exists, they are cointegrated by the Engle–Granger representation theorem.
The second step is then to estimate the model using ordinary least squares:

:<math>y_t = \beta_0 + \beta_1 x_t + \varepsilon_t</math>

If the regression is not spurious as determined by test criteria described above, ordinary least squares will not only be valid, but in fact super consistent (Stock, 1987).
Then the predicted residuals <math>\hat{\varepsilon}_t = y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t</math> from this regression are saved and used in a regression of differenced variables plus a lagged error term:

:<math>A(L) \, \Delta y_t = \gamma + B(L) \, \Delta x_t + \alpha \hat{\varepsilon}_{t-1} + \nu_t.</math>

One can then test for cointegration using a standard t-statistic on <math>\alpha</math>.
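A minimal sketch of the two-step procedure in Python with statsmodels is given below; the simulated data, the variable names and the lag handling are illustrative assumptions rather than part of the original method.

<syntaxhighlight lang="python">
import numpy as np
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

# Illustrative cointegrated pair: x is a random walk, y tracks it in the long run.
rng = np.random.default_rng(1)
T = 300
x = np.cumsum(rng.standard_normal(T))
y = 2.0 + 0.8 * x + rng.standard_normal(T)   # assumed long-run relation, not from the text

# Pretest: both series should be non-stationary in levels (ADF unit root test).
print("ADF p-value for x:", adfuller(x)[1])
print("ADF p-value for y:", adfuller(y)[1])

# Step 1: long-run (cointegrating) regression in levels, y_t = b0 + b1*x_t + e_t.
step1 = sm.OLS(y, sm.add_constant(x)).fit()
resid = step1.resid                          # estimated equilibrium errors

# Step 2: regress the differenced variables plus the lagged residual
# (the error-correction term): dy_t = gamma + b*dx_t + alpha*e_{t-1} + v_t.
dy, dx, ect = np.diff(y), np.diff(x), resid[:-1]
step2 = sm.OLS(dy, sm.add_constant(np.column_stack([dx, ect]))).fit()

# A significantly negative coefficient on the lagged residual (alpha)
# indicates error correction toward the long-run equilibrium.
print(step2.params)    # [constant, short-run dx effect, adjustment speed alpha]
print(step2.tvalues)
</syntaxhighlight>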
While this approach is easy to apply, there are, however, numerous problems:
* The univariate unit root tests used in the first stage have low statistical power
* The choice of dependent variable in the first stage influences test results, i.e. we need weak exogeneity for <math>x_t</math> as determined by Granger causality
* One can potentially have a small sample bias
* The cointegration test on <math>\alpha</math> does not follow a standard distribution
* The validity of the long-run parameters in the first regression stage where one obtains the residuals cannot be verified because the distribution of the OLS estimator of the cointegrating vector is highly complicated and non-normal
* At most one cointegrating relationship can be examined.
VECM
The Engle–Granger approach as described above suffers from a number of weaknesses. Namely, it is restricted to only a single equation with one variable designated as the dependent variable, explained by another variable that is assumed to be weakly exogenous for the parameters of interest. It also relies on pretesting the time series to find out whether variables are I(0) or I(1). These weaknesses can be addressed through the use of Johansen's procedure. Its advantages include that pretesting is not necessary, there can be numerous cointegrating relationships, all variables are treated as endogenous and tests relating to the long-run parameters are possible. The resulting model is known as a vector error correction model (VECM), as it adds error correction features to a multi-factor model known as vector autoregression (VAR). The procedure is done as follows:
* Step 1: Estimate an unrestricted VAR involving potentially non-stationary variables
* Step 2: Test for cointegration using the Johansen test
* Step 3: Form and analyse the VECM.
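The three steps can be sketched with statsmodels as follows; the simulated data, the lag order selection and the deterministic-term choice are illustrative assumptions, not part of the procedure itself.

<syntaxhighlight lang="python">
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.var_model import VAR
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

# Illustrative two-variable system sharing a common stochastic trend.
rng = np.random.default_rng(2)
T = 400
trend = np.cumsum(rng.standard_normal(T))
data = pd.DataFrame({
    "y1": trend + rng.standard_normal(T),
    "y2": 0.5 * trend + rng.standard_normal(T),
})

# Step 1: estimate an unrestricted VAR in levels to pick a lag order.
var_fit = VAR(data).fit(maxlags=8, ic="aic")
k_ar_diff = max(var_fit.k_ar - 1, 1)     # number of lagged differences for the VECM

# Step 2: Johansen cointegration test (trace statistics vs. 5% critical values).
joh = coint_johansen(data, det_order=0, k_ar_diff=k_ar_diff)
print("trace statistics:   ", joh.lr1)
print("5% critical values: ", joh.cvt[:, 1])

# Step 3: form and analyse the VECM with the chosen cointegration rank.
vecm_res = VECM(data, k_ar_diff=k_ar_diff, coint_rank=1, deterministic="co").fit()
print(vecm_res.alpha)   # adjustment (error-correction) coefficients
print(vecm_res.beta)    # cointegrating vector(s)
</syntaxhighlight>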
An example of ECM
The idea of cointegration may be demonstrated in a simple macroeconomic setting. Suppose consumption <math>C_t</math> and disposable income <math>Y_t</math> are macroeconomic time series that are related in the long run (see Permanent income hypothesis). Specifically, let the average propensity to consume be 90%, that is, in the long run <math>C_t = 0.9 Y_t</math>. From the econometrician's point of view, this long run relationship (aka cointegration) exists if errors from the regression <math>C_t = \beta Y_t + \varepsilon_t</math> are a stationary series, although <math>C_t</math> and <math>Y_t</math> are non-stationary. Suppose also that if <math>Y_t</math> suddenly changes by <math>\Delta Y_t</math>, then <math>C_t</math> changes by <math>\Delta C_t = 0.5 \, \Delta Y_t</math>, that is, the marginal propensity to consume equals 50%. Our final assumption is that the gap between current and equilibrium consumption decreases each period by 20%.
In this setting a change <math>\Delta C_t = C_t - C_{t-1}</math> in consumption level can be modelled as

:<math>\Delta C_t = 0.5 \, \Delta Y_t - 0.2 \, (C_{t-1} - 0.9 Y_{t-1}) + \varepsilon_t.</math>

The first term on the RHS describes the short-run impact of a change in <math>Y_t</math> on <math>C_t</math>, the second term explains long-run gravitation towards the equilibrium relationship between the variables, and the third term reflects random shocks that the system receives (e.g. shocks of consumer confidence that affect consumption). To see how the model works, consider two kinds of shocks: permanent and transitory (temporary). For simplicity, let <math>\varepsilon_t</math> be zero for all ''t''. Suppose in period ''t'' − 1 the system is in equilibrium, i.e. <math>C_{t-1} = 0.9 Y_{t-1}</math>. Suppose that in period ''t'', disposable income <math>Y_t</math> increases by 10 and then returns to its previous level. Then <math>C_t</math> first (in period ''t'') increases by 5 (half of 10), but after the second period <math>C_t</math> begins to decrease and converges to its initial level. In contrast, if the shock to <math>Y_t</math> is permanent, then <math>C_t</math> slowly converges to a value that exceeds the initial <math>C_{t-1}</math> by 9.
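The two shock responses can be traced numerically. The sketch below (in Python, with illustrative starting values <math>C = 90</math> and <math>Y = 100</math>) iterates the error correction equation above with zero shocks and reproduces the transitory and permanent cases.

<syntaxhighlight lang="python">
import numpy as np

def simulate(income, c0=90.0):
    """Iterate dC_t = 0.5*dY_t - 0.2*(C_{t-1} - 0.9*Y_{t-1}) with all shocks set to zero."""
    c = [c0]
    for t in range(1, len(income)):
        dc = 0.5 * (income[t] - income[t - 1]) - 0.2 * (c[-1] - 0.9 * income[t - 1])
        c.append(c[-1] + dc)
    return np.array(c)

T = 40
base = np.full(T, 100.0)       # income at its long-run level, so C = 0.9 * 100 = 90

transitory = base.copy()
transitory[1] = 110.0          # income rises by 10 for one period, then reverts
permanent = base.copy()
permanent[1:] = 110.0          # income rises by 10 and stays there

c_trans = simulate(transitory)
c_perm = simulate(permanent)

print(round(c_trans[1] - 90.0, 2))    # 5.0 : immediate rise of half the shock
print(round(c_trans[-1] - 90.0, 2))   # ~0  : consumption returns to its initial level
print(round(c_perm[-1] - 90.0, 2))    # ~9  : new equilibrium at 0.9 * 110
</syntaxhighlight>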
This structure is common to all ECM models. In practice, econometricians often first estimate the cointegration relationship (equation in levels), and then insert it into the main model (equation in differences).
References
Further reading
* {{cite book , last=Martin , first=Vance , last2=Hurn , first2=Stan , last3=Harris , first3=David , title=Econometric Modelling with Time Series , location=New York , publisher=Cambridge University Press , year=2013 , isbn=978-0-521-13981-6 , pages=662–711 }}