Vector autoregression (VAR) is a statistical model used to capture the relationship between multiple quantities as they change over time. VAR is a type of stochastic process model. VAR models generalize the single-variable (univariate) autoregressive model by allowing for multivariate time series. VAR models are often used in economics and the natural sciences. Like the autoregressive model, each variable has an equation modelling its evolution over time. This equation includes the variable's lagged (past) values, the lagged values of the other variables in the model, and an error term. VAR models do not require as much knowledge about the forces influencing a variable as do structural models with simultaneous equations. The only prior knowledge required is a list of variables which can be hypothesized to affect each other over time.

Specification

Definition

A VAR model describes the evolution of a set of ''k'' variables, called ''endogenous variables'', over time. Each period of time is numbered, ''t'' = 1, ..., ''T''. The variables are collected in a vector, ''y_t'', which is of length ''k''. (Equivalently, this vector might be described as a (''k'' × 1)-matrix.) The vector is modelled as a linear function of its previous values. The vector's components are referred to as ''y''_''i'',''t'', meaning the observation at time ''t'' of the ''i''th variable. For example, if the first variable in the model measures the price of wheat over time, then ''y''_1,1998 would indicate the price of wheat in the year 1998.

VAR models are characterized by their ''order'', which refers to the number of earlier time periods the model will use. Continuing the above example, a 5th-order VAR would model each year's wheat price as a linear combination of the last five years of wheat prices, together with the last five years of every other variable in the model. A ''lag'' is the value of a variable in a previous time period. So in general a ''p''th-order VAR refers to a VAR model which includes lags for the last ''p'' time periods. A ''p''th-order VAR is denoted "VAR(''p'')" and sometimes called "a VAR with ''p'' lags". A ''p''th-order VAR model is written as :

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t, \,

The variables of the form ''y''_''t''−''i'' indicate that variable's value ''i'' time periods earlier and are called the "''i''th lag" of ''y''_''t''. The variable ''c'' is a ''k''-vector of constants serving as the intercept of the model. ''A_i'' is a time-invariant (''k'' × ''k'')-matrix and ''e''_''t'' is a ''k''-vector of error terms. The error terms must satisfy three conditions:

# \mathrm{E}(e_t) = 0. Every error term has a mean of zero.
# \mathrm{E}(e_t e_t') = \Omega. The contemporaneous covariance matrix of error terms is a ''k'' × ''k'' positive-semidefinite matrix denoted Ω.
# \mathrm{E}(e_t e_{t-k}') = 0 for any non-zero ''k'' (here ''k'' is a time shift rather than the number of variables). There is no correlation across time. In particular, there is no serial correlation in individual error terms.

The process of choosing the maximum lag ''p'' in the VAR model requires special attention because inference depends on the correctness of the selected lag order.
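The definition can be illustrated with a minimal simulation sketch. The function name `simulate_var` and its arguments are hypothetical, and the shocks are drawn as independent Gaussians with a common standard deviation (so Ω is diagonal here), which is only one simple choice that satisfies the three conditions above.

```python
import random

def simulate_var(c, A, y_init, T, shock_sd=1.0, seed=0):
    """Simulate y_t = c + A[0] y_{t-1} + ... + A[p-1] y_{t-p} + e_t.

    c      : list of k intercepts
    A      : list of p coefficient matrices, each k x k (list of rows)
    y_init : list of p starting vectors, oldest first
    T      : number of periods to simulate
    """
    rng = random.Random(seed)
    k, p = len(c), len(A)
    path = [list(y) for y in y_init]
    for _ in range(T):
        y_new = []
        for i in range(k):
            # intercept plus shock e_{i,t} (assumed i.i.d. Gaussian)
            value = c[i] + rng.gauss(0.0, shock_sd)
            for lag in range(1, p + 1):
                y_lag = path[-lag]  # y_{t-lag}
                value += sum(A[lag - 1][i][j] * y_lag[j] for j in range(k))
            y_new.append(value)
        path.append(y_new)
    return path[p:]  # drop the starting values


# Two variables, one lag, 100 simulated periods.
series = simulate_var(
    c=[1.0, 0.5],
    A=[[[0.5, 0.1], [0.0, 0.3]]],
    y_init=[[0.0, 0.0]],
    T=100,
)
```

Setting `shock_sd=0.0` switches the shocks off, which makes it easy to check the recursion by hand.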

Order of integration of the variables

Note that all variables have to be of the same order of integration. The following cases are distinct:
* All the variables are I(0) (stationary): this is the standard case, i.e. a VAR in levels.
* All the variables are I(''d'') (non-stationary) with ''d'' > 0:
** The variables are cointegrated: the error correction term has to be included in the VAR. The model becomes a vector error correction model (VECM), which can be seen as a restricted VAR.
** The variables are not cointegrated: the variables first have to be differenced ''d'' times, and one has a VAR in differences.
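For the non-cointegrated I(''d'') case, the differencing step might look like the sketch below. The helper name `difference` is hypothetical, and determining ''d'' or testing for cointegration is outside its scope.

```python
def difference(series, d=1):
    """Apply first differencing d times to a list of numbers."""
    out = list(series)
    for _ in range(d):
        # each pass replaces the series by its period-to-period changes
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# A quadratic trend needs two rounds of differencing to become constant.
twice_differenced = difference([1, 4, 9, 16, 25], d=2)
```

Each round of differencing shortens the series by one observation, so every variable in the VAR should be differenced the same number of times to keep the sample aligned.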

Concise matrix notation

One can stack the vectors in order to write a VAR(''p'') as a stochastic matrix difference equation, with a concise matrix notation: :

Y=BZ +U \,

Details of the matrices are in a separate page.

Example

For a general example of a VAR(''p'') with ''k'' variables, see General matrix notation of a VAR(p). A VAR(1) in two variables can be written in matrix form (more compact notation) as :

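\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_1 \\ c_2\end{bmatrix} + \begin{bmatrix}a_{1,1}&a_{1,2} \\ a_{2,1}&a_{2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}e_{1,t} \\ e_{2,t}\end{bmatrix},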

(in which only a single ''A'' matrix appears because this example has a maximum lag ''p'' equal to 1), or, equivalently, as the following system of two equations :

y_{1,t} = c_1 + a_{1,1}y_{1,t-1} + a_{1,2}y_{2,t-1} + e_{1,t}\,

y_{2,t} = c_2 + a_{2,1}y_{1,t-1} + a_{2,2}y_{2,t-1} + e_{2,t}.\,

Each variable in the model has one equation. The current (time ''t'') observation of each variable depends on its own lagged values as well as on the lagged values of each other variable in the VAR.

Writing VAR(''p'') as VAR(1)

A VAR with ''p'' lags can always be equivalently rewritten as a VAR with only one lag by appropriately redefining the dependent variable. The transformation amounts to stacking the lags of the VAR(''p'') variable in the new VAR(1) dependent variable and appending identities to complete the number of equations. For example, the VAR(2) model :

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + e_t

can be recast as the VAR(1) model ::

\begin{bmatrix}y_t \\ y_{t-1}\end{bmatrix} = \begin{bmatrix}c \\ 0\end{bmatrix} + \begin{bmatrix}A_1&A_2 \\ I&0\end{bmatrix}\begin{bmatrix}y_{t-1} \\ y_{t-2}\end{bmatrix} + \begin{bmatrix}e_t \\ 0\end{bmatrix},

where ''I'' is the identity matrix. The equivalent VAR(1) form is more convenient for analytical derivations and allows more compact statements.
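The stacking can be sketched for general ''p'' as follows. The function name `companion_matrix` is hypothetical. It builds only the coefficient matrix of the VAR(1) form, with the coefficient matrices in the top block row and identity blocks below it.

```python
def companion_matrix(A):
    """Stack p coefficient matrices (each k x k) into the kp x kp VAR(1) matrix."""
    p, k = len(A), len(A[0])
    n = k * p
    F = [[0.0] * n for _ in range(n)]
    # top block row: [A_1 A_2 ... A_p]
    for lag, A_lag in enumerate(A):
        for i in range(k):
            for j in range(k):
                F[i][lag * k + j] = A_lag[i][j]
    # identity blocks below the top row carry the lags forward unchanged
    for i in range(k, n):
        F[i][i - k] = 1.0
    return F


A1 = [[0.5, 0.1], [0.0, 0.3]]
A2 = [[0.2, 0.0], [0.1, 0.1]]
F = companion_matrix([A1, A2])
```

For the VAR(2) example above, `F` has the block layout [[A_1, A_2], [I, 0]].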

Structural vs. reduced form

Structural VAR

A ''structural VAR with p lags'' (sometimes abbreviated SVAR) is :

B_0 y_t = c_0 + B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \epsilon_t,

where ''c''₀ is a ''k'' × 1 vector of constants, ''B_i'' is a ''k'' × ''k'' matrix (for every ''i'' = 0, ..., ''p'') and ''ε''_''t'' is a ''k'' × 1 vector of error terms. The main diagonal terms of the ''B''₀ matrix (the coefficients on the ''i''th variable in the ''i''th equation) are scaled to 1. The error terms ''ε''_''t'' (''structural shocks'') satisfy the conditions (1) - (3) in the definition above, with the particularity that all the off-diagonal elements of the covariance matrix

\mathrm{E}(\epsilon_t\epsilon_t') = \Sigma

are zero. That is, the structural shocks are uncorrelated. For example, a two variable structural VAR(1) is: :

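\begin{bmatrix}1&B_{0;1,2} \\ B_{0;2,1}&1\end{bmatrix}\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_{0;1} \\ c_{0;2}\end{bmatrix} + \begin{bmatrix}B_{1;1,1}&B_{1;1,2} \\ B_{1;2,1}&B_{1;2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}\epsilon_{1,t} \\ \epsilon_{2,t}\end{bmatrix},

where :

\Sigma = \mathrm{E}(\epsilon_t \epsilon_t') = \begin{bmatrix}\sigma_1^2&0 \\ 0&\sigma_2^2\end{bmatrix};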

that is, the variances of the structural shocks are denoted \mathrm{var}(\epsilon_i) = \sigma_i^2 (''i'' = 1, 2) and the covariance is \mathrm{cov}(\epsilon_1,\epsilon_2) = 0. Writing the first equation explicitly and passing ''y''_2,''t'' to the right hand side, one obtains :

y_{1,t} = c_{0;1} - B_{0;1,2}y_{2,t} + B_{1;1,1}y_{1,t-1} + B_{1;1,2}y_{2,t-1} + \epsilon_{1,t}\,

where ''B''_0;1,2 denotes the (1, 2) element of ''B''₀, and similarly for the other coefficients. Hence ''y''_2,''t'' can have a contemporaneous effect on ''y''_1,''t'' whenever ''B''_0;1,2 is non-zero. This differs from the case in which ''B''₀ is the identity matrix (all off-diagonal elements are zero, the case in the initial definition), when ''y''_2,''t'' can directly affect ''y''_1,''t''+1 and subsequent future values, but not ''y''_1,''t''. Because of the parameter identification problem,

ordinary least squares estimation of the structural VAR would yield inconsistent parameter estimates. This problem can be overcome by rewriting the VAR in reduced form.

From an economic point of view, if the joint dynamics of a set of variables can be represented by a VAR model, then the structural form is a depiction of the underlying, "structural", economic relationships. Two features of the structural form make it the preferred candidate to represent the underlying relations:
:1. ''Error terms are not correlated''. The structural, economic shocks which drive the dynamics of the economic variables are assumed to be independent, which implies zero correlation between error terms as a desired property. This is helpful for separating out the effects of economically unrelated influences in the VAR. For instance, there is no reason why an oil price shock (as an example of a supply shock) should be related to a shift in consumers' preferences towards a style of clothing (as an example of a demand shock); therefore one would expect these factors to be statistically independent.
:2. ''Variables can have a contemporaneous impact on other variables''. This is a desirable feature especially when using low frequency data. For example, an indirect tax rate increase would not affect tax revenues the day the decision is announced, but one could find an effect in that quarter's data.

Reduced-form VAR

By premultiplying the structural VAR with the inverse of ''B''₀ :

y_t = B_0^{-1}c_0 + B_0^{-1} B_1 y_{t-1} + B_0^{-1} B_2 y_{t-2} + \cdots + B_0^{-1} B_p y_{t-p} + B_0^{-1}\epsilon_t,

and denoting :

B_0^{-1} c_0 = c,\quad B_0^{-1}B_i = A_i \text{ for } i = 1, \dots, p \text{ and } B_0^{-1}\epsilon_t = e_t

one obtains the ''p''th order reduced VAR :

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t

Note that in the reduced form all right hand side variables are predetermined at time ''t''. As there are no time ''t'' endogenous variables on the right hand side, no variable has a ''direct'' contemporaneous effect on other variables in the model. However, the error terms in the reduced VAR are composites of the structural shocks ''e''_''t'' = ''B''₀⁻¹''ε''_''t''. Thus, the occurrence of one structural shock ''ε_i,t'' can potentially lead to the occurrence of shocks in all error terms ''e_j,t'', thus creating contemporaneous movement in all endogenous variables. Consequently, the covariance matrix of the reduced VAR :

\Omega = \mathrm{E}(e_t e_t') = \mathrm{E}(B_0^{-1} \epsilon_t \epsilon_t' (B_0^{-1})') = B_0^{-1}\Sigma(B_0^{-1})'\,

can have non-zero off-diagonal elements, thus allowing non-zero correlation between error terms.
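A small numerical sketch of this mapping for a two-variable structural VAR(1) follows. The helper names and the numbers in `B0`, `B1` and `Sigma` are made up purely for illustration. The point is that a diagonal Σ can still produce a non-diagonal Ω once ''B''₀ has a non-zero off-diagonal element.

```python
def inv2(M):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("B_0 must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

def reduced_form(B0, B1, Sigma):
    """Return A_1 = B0^{-1} B1 and Omega = B0^{-1} Sigma (B0^{-1})'."""
    B0_inv = inv2(B0)
    A1 = matmul(B0_inv, B1)
    Omega = matmul(matmul(B0_inv, Sigma), transpose(B0_inv))
    return A1, Omega


B0 = [[1.0, 0.5], [0.0, 1.0]]      # unit diagonal, one contemporaneous effect
B1 = [[0.4, 0.0], [0.0, 0.2]]
Sigma = [[1.0, 0.0], [0.0, 4.0]]   # uncorrelated structural shocks
A1, Omega = reduced_form(B0, B1, Sigma)
```

With these illustrative inputs the off-diagonal elements of `Omega` are non-zero even though `Sigma` is diagonal.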

Estimation

Estimation of the regression parameters

Starting from the concise matrix notation (for details see this annex): :

Y=BZ +U \,

*The multivariate least squares (MLS) approach for estimating B yields: :

\hat B= YZ'(ZZ')^{-1}.

This can be written alternatively as: :

\operatorname{Vec}(\hat B) = ((ZZ')^{-1} Z \otimes I_{k})\ \operatorname{Vec}(Y),

where \otimes denotes the Kronecker product and Vec the vectorization of the indicated matrix. This estimator is consistent and asymptotically efficient. It is furthermore equal to the conditional maximum likelihood estimator.
* As the explanatory variables are the same in each equation, the multivariate least squares estimator is equivalent to the ordinary least squares estimator applied to each equation separately.
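The least squares formula can be sketched without external libraries as below. This assumes a particular layout that the text does not spell out: each column of ''Z'' is taken to be (1, ''y''_''t''−1, ..., ''y''_''t''−''p'')', so that ''B'' = [''c'', ''A''_1, ..., ''A''_''p'']. The function names are hypothetical, and in practice a numerical library would replace the hand-written matrix inverse.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(M)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def estimate_var(data, p):
    """Return (B_hat, Y, Z) for a VAR(p) with intercept.

    data is a list of observation vectors of length k, ordered in time.
    B_hat is k x (kp + 1) with columns [c, A_1, ..., A_p] (assumed layout).
    """
    Y_cols, Z_cols = [], []
    for t in range(p, len(data)):
        Y_cols.append(list(data[t]))
        z = [1.0]                      # intercept regressor
        for lag in range(1, p + 1):
            z.extend(data[t - lag])    # y_{t-lag}
        Z_cols.append(z)
    Y, Z = transpose(Y_cols), transpose(Z_cols)
    Zt = transpose(Z)
    B_hat = matmul(matmul(Y, Zt), inverse(matmul(Z, Zt)))  # Y Z' (Z Z')^{-1}
    return B_hat, Y, Z
```

Because every equation shares the same regressors, row ''i'' of `B_hat` coincides with the ordinary least squares fit of the ''i''th equation on its own.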

Estimation of the covariance matrix of the errors

As in the standard case, the maximum likelihood estimator (MLE) of the covariance matrix differs from the ordinary least squares (OLS) estimator. Maximum likelihood estimator:

\hat \Sigma = \frac{1}{T} \sum_{t=1}^T \hat \epsilon_t\hat \epsilon_t'

OLS estimator:

\hat \Sigma = \frac{1}{T-kp-1} \sum_{t=1}^T \hat \epsilon_t\hat \epsilon_t'

for a model with a constant, ''k'' variables and ''p'' lags. In a matrix notation, this gives: :

\hat \Sigma = \frac{1}{T-kp-1} (Y-\hat{B}Z)(Y-\hat{B}Z)'.
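The two estimators differ only in the divisor, as the following sketch shows. It assumes the usual degrees-of-freedom correction ''T'' − ''kp'' − 1 for a model with a constant. The function name `residual_covariances` is hypothetical, and the residual vectors are taken as given.

```python
def residual_covariances(residuals, k, p):
    """Return (MLE, OLS) covariance estimates from T residual vectors of length k."""
    T = len(residuals)
    dof = T - k * p - 1  # observations minus parameters per equation
    if dof <= 0:
        raise ValueError("not enough observations for the OLS estimator")
    # sum of outer products of the residual vectors
    S = [[sum(u[i] * u[j] for u in residuals) for j in range(k)] for i in range(k)]
    mle = [[s / T for s in row] for row in S]
    ols = [[s / dof for s in row] for row in S]
    return mle, ols


toy_residuals = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
mle, ols = residual_covariances(toy_residuals, k=2, p=1)
```

The OLS estimate is always the larger of the two, and the gap shrinks as ''T'' grows relative to ''kp'' + 1.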

Estimation of the estimator's covariance matrix

The covariance matrix of the parameters can be estimated as :

\widehat{\mbox{Cov}} (\mbox{Vec}(\hat B)) = (ZZ')^{-1} \otimes \hat \Sigma.\,

Degrees of freedom

Vector autoregression models often involve the estimation of many parameters. For example, with seven variables and four lags, each matrix of coefficients for a given lag length is 7 by 7, and the vector of constants has 7 elements, so a total of 49×4 + 7 = 203 parameters are estimated, substantially lowering the degrees of freedom of the regression (the number of data points minus the number of parameters to be estimated). This can hurt the accuracy of the parameter estimates and hence of the forecasts given by the model.
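The count in the example generalizes to ''k''²''p'' + ''k'' coefficients for ''k'' variables, ''p'' lags and an intercept. A tiny helper (hypothetical name) makes it easy to see how quickly this grows:

```python
def var_parameter_count(k, p, intercept=True):
    """Number of regression coefficients in a VAR(p) with k variables."""
    return k * k * p + (k if intercept else 0)


seven_by_four = var_parameter_count(7, 4)   # the example from the text
per_equation = seven_by_four // 7           # coefficients in each single equation
```

The count excludes the distinct elements of the error covariance matrix, which are estimated separately.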

Interpretation of estimated model

Properties of the VAR model are usually summarized through structural analysis: Granger causality, impulse responses, and forecast error variance decompositions.

Impulse response

Consider the first-order case (i.e., with only one lag), with equation of evolution :

y_t=Ay_{t-1}+e_t,

for evolving (state) vector ''y'' and vector ''e'' of shocks. To find, say, the effect of the ''j''-th element of the vector of shocks upon the ''i''-th element of the state vector 2 periods later, which is a particular impulse response, first write the above equation of evolution one period lagged: :

y_{t-1}=Ay_{t-2}+e_{t-1}.

Use this in the original equation of evolution to obtain :

y_t=A^2y_{t-2}+Ae_{t-1}+e_t;

then repeat using the twice lagged equation of evolution, to obtain :

y_t=A^3y_{t-3}+A^2e_{t-2}+Ae_{t-1}+e_t.

From this, the effect of the ''j''-th component of e_{t-2} upon the ''i''-th component of y_t is the ''i, j'' element of the matrix A^2.

It can be seen from this induction process that any shock will have an effect on the elements of ''y'' infinitely far forward in time, although the effect will become smaller and smaller over time assuming that the AR process is stable, that is, that all the eigenvalues of the matrix ''A'' are less than 1 in absolute value.
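The induction can be sketched for the first-order case by computing successive powers of ''A''. The function name `impulse_responses` and the example matrix are illustrative only. These are responses to the reduced-form shocks ''e'', not to orthogonalized or structural shocks.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def impulse_responses(A, horizon):
    """Return [A^0, A^1, ..., A^horizon] for a VAR(1) with coefficient matrix A.

    Element (i, j) of A^h is the effect of shock j on variable i, h periods later.
    """
    k = len(A)
    power = [[float(i == j) for j in range(k)] for i in range(k)]  # A^0 = I
    responses = [power]
    for _ in range(horizon):
        power = matmul(A, power)
        responses.append(power)
    return responses


A = [[0.5, 0.1], [0.0, 0.3]]
irf = impulse_responses(A, horizon=10)
effect_after_two = irf[2][0][1]   # shock to variable 2 on variable 1, two periods later
```

For this triangular example the eigenvalues are 0.5 and 0.3, so the process is stable and the entries of `irf[h]` shrink towards zero as `h` grows.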

Forecasting using an estimated VAR model

An estimated VAR model can be used for forecasting, and the quality of the forecasts can be judged, in ways that are completely analogous to the methods used in univariate autoregressive modelling.
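As in the univariate case, point forecasts are commonly produced by iterating the estimated equation forward with future shocks set to their mean of zero. The sketch below does this for a VAR(1) only. The function name and the coefficient values are hypothetical.

```python
def forecast_var1(c, A, y_last, steps):
    """Iterate y_hat = c + A y_hat forward from the last observation."""
    k = len(c)
    forecasts, current = [], list(y_last)
    for _ in range(steps):
        # future shocks are replaced by their expected value, zero
        current = [c[i] + sum(A[i][j] * current[j] for j in range(k)) for i in range(k)]
        forecasts.append(current)
    return forecasts


two_step = forecast_var1(
    c=[1.0, 0.0],
    A=[[0.5, 0.0], [0.0, 0.5]],
    y_last=[0.0, 0.0],
    steps=2,
)
```

A VAR(''p'') can be forecast the same way after rewriting it in its VAR(1) form.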

Applications

Christopher Sims has advocated VAR models, criticizing the claims and performance of earlier modeling in macroeconomic econometrics. He recommended VAR models, which had previously appeared in time series statistics and in system identification, a statistical specialty in control theory. Sims advocated VAR models as providing a theory-free method to estimate economic relationships, thus being an alternative to the "incredible identification restrictions" in structural models. VAR models are also increasingly used in health research for automatic analyses of diary data or sensor data.

Software

* R: The package ''vars'' includes functions for VAR models. Other R packages are listed in the CRAN Task View: Time Series Analysis.
* Python: The ''statsmodels'' package's tsa (time series analysis) module supports VARs. ''PyFlux'' has support for VARs and Bayesian VARs.
* SAS: VARMAX
* Stata: "var"
* EViews: "VAR"
* Gretl