In the statistical analysis of time series, autoregressive–moving-average (ARMA) models are a way to describe a (weakly) stationary stochastic process using autoregression (AR) and a moving average (MA), each with a polynomial. They are a tool for understanding a series and predicting future values. AR involves regressing the variable on its own lagged (i.e., past) values. MA involves modeling the error as a linear combination of error terms occurring contemporaneously and at various times in the past. The model is usually denoted ARMA(''p'', ''q''), where ''p'' is the order of AR and ''q'' is the order of MA. The general ARMA model was described in the 1951 thesis of Peter Whittle, ''Hypothesis testing in time series analysis'', and it was popularized in the 1970 book by George E. P. Box and Gwilym Jenkins. ARMA models can be estimated by using the Box–Jenkins method.

Mathematical formulation

Autoregressive model

The notation AR(''p'') refers to the autoregressive model of order ''p''. The AR(''p'') model is written as

: $X_t = \sum_{i=1}^p \varphi_i X_{t-i} + \varepsilon_t$

where $\varphi_1, \ldots, \varphi_p$ are parameters and the random variable $\varepsilon_t$ is white noise, usually independent and identically distributed (i.i.d.) normal random variables. In order for the model to remain stationary, the roots of its characteristic polynomial must lie outside the unit circle. For example, processes in the AR(1) model with $|\varphi_1| \ge 1$ are not stationary because the root of $1 - \varphi_1 B = 0$ lies on or within the unit circle.

The augmented Dickey–Fuller test can assess the stability of intrinsic mode function and trend components. For stationary time series, ARMA models can be used, while for non-stationary series, long short-term memory (LSTM) models can be used to derive abstract features. The final value is obtained by reconstructing the predicted outcomes of each time series.
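The stationarity condition on the AR(''p'') characteristic polynomial can be checked numerically. The following Python sketch assumes NumPy is available; the function names `ar_is_stationary` and `simulate_ar` are hypothetical and chosen only for illustration. It finds the roots of $1 - \varphi_1 z - \cdots - \varphi_p z^p$ and tests whether all of them lie strictly outside the unit circle, then simulates a series under the assumption of standard normal white noise.

```python
import numpy as np


def ar_is_stationary(phi):
    """True if all roots of 1 - phi_1 z - ... - phi_p z^p lie outside the unit circle."""
    phi = np.asarray(phi, dtype=float)
    # np.roots expects coefficients ordered from the highest power to the constant.
    coeffs = np.concatenate((-phi[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))


def simulate_ar(phi, n, rng, burn_in=200):
    """Simulate n values of X_t = sum_i phi_i X_{t-i} + eps_t (standard normal eps_t)."""
    phi = np.asarray(phi, dtype=float)
    total = n + burn_in
    eps = rng.standard_normal(total)
    x = np.zeros(total)
    for t in range(total):
        for i in range(1, min(len(phi), t) + 1):
            x[t] += phi[i - 1] * x[t - i]
        x[t] += eps[t]
    # Discard the burn-in so the zero starting values have little influence.
    return x[burn_in:]


print(ar_is_stationary([0.5]))       # root at z = 2, outside the unit circle
print(ar_is_stationary([1.0]))       # root at z = 1, on the unit circle
series = simulate_ar([0.5, 0.3], n=300, rng=np.random.default_rng(0))
```

The burn-in length of 200 is an arbitrary choice for this sketch, not a recommended value.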

Moving average model

The notation MA(''q'') refers to the moving average model of order ''q'':

: $X_t = \mu + \varepsilon_t + \sum_{i=1}^q \theta_i \varepsilon_{t-i}$

where $\theta_1, \ldots, \theta_q$ are the parameters of the model, $\mu$ is the expectation of $X_t$ (often assumed to equal 0), and $\varepsilon_1, \ldots, \varepsilon_t$ are i.i.d. white noise error terms that are commonly normal random variables.
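Because an MA(''q'') process is a finite linear combination of uncorrelated error terms, its autocovariances follow directly from the coefficients: with $\theta_0 = 1$ and noise variance $\sigma^2$, $\gamma(k) = \sigma^2 \sum_j \theta_j \theta_{j+k}$ for $0 \le k \le q$ and $\gamma(k) = 0$ for $k > q$. A minimal Python sketch of this calculation, assuming NumPy and using the hypothetical name `ma_autocovariance`:

```python
import numpy as np


def ma_autocovariance(theta, sigma2=1.0):
    """Autocovariances gamma(0..q) of X_t = mu + eps_t + sum_i theta_i eps_{t-i}."""
    # Prepend theta_0 = 1 so that X_t - mu = sum_{j=0}^q psi_j eps_{t-j}.
    psi = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    q = len(psi) - 1
    # gamma(k) = sigma^2 * sum_j psi_j * psi_{j+k}; every lag beyond q gives zero.
    return np.array([sigma2 * np.dot(psi[: q + 1 - k], psi[k:]) for k in range(q + 1)])


gamma = ma_autocovariance([0.5])
print(gamma)  # variance and lag-1 autocovariance of an MA(1) with theta_1 = 0.5
```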

ARMA model

The notation ARMA(''p'', ''q'') refers to the model with ''p'' autoregressive terms and ''q'' moving-average terms. This model contains the AR(''p'') and MA(''q'') models:

: $X_t = \varepsilon_t + \sum_{i=1}^p \varphi_i X_{t-i} + \sum_{i=1}^q \theta_i \varepsilon_{t-i}$
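The ARMA(''p'', ''q'') recursion can be applied directly to a given noise sequence. The Python sketch below assumes NumPy; the function name `arma_filter` is hypothetical, and values of $X_t$ and $\varepsilon_t$ before the start of the sequence are assumed to be zero, which is a simplification rather than part of the model.

```python
import numpy as np


def arma_filter(phi, theta, eps):
    """Compute X_t = eps_t + sum_i phi_i X_{t-i} + sum_i theta_i eps_{t-i}.

    Terms that would refer to times before the first observation are dropped,
    i.e. pre-sample values are treated as zero (an assumption of this sketch).
    """
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    eps = np.asarray(eps, dtype=float)
    x = np.zeros_like(eps)
    for t in range(len(eps)):
        ar_part = sum(phi[i - 1] * x[t - i] for i in range(1, min(len(phi), t) + 1))
        ma_part = sum(theta[i - 1] * eps[t - i] for i in range(1, min(len(theta), t) + 1))
        x[t] = eps[t] + ar_part + ma_part
    return x


# Simulate an ARMA(1, 1) series driven by standard normal white noise.
rng = np.random.default_rng(0)
series = arma_filter(phi=[0.5], theta=[0.4], eps=rng.standard_normal(500))
```

Feeding a unit impulse instead of random noise returns the impulse response of the model, which is a convenient way to inspect how a single shock propagates.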

In terms of lag operator

In some texts, the models are specified using the lag operator ''L''. In these terms, the AR(''p'') model is given by

: $\varepsilon_t = \left(1 - \sum_{i=1}^p \varphi_i L^i\right) X_t = \varphi(L) X_t$

where $\varphi$ represents the polynomial

: $\varphi(L) = 1 - \sum_{i=1}^p \varphi_i L^i.$

The MA(''q'') model is given by

: $X_t - \mu = \left(1 + \sum_{i=1}^q \theta_i L^i\right) \varepsilon_t = \theta(L) \varepsilon_t,$

where $\theta$ represents the polynomial

: $\theta(L) = 1 + \sum_{i=1}^q \theta_i L^i.$

Finally, the combined ARMA(''p'', ''q'') model is given by

: $\left(1 - \sum_{i=1}^p \varphi_i L^i\right) X_t = \left(1 + \sum_{i=1}^q \theta_i L^i\right) \varepsilon_t,$

or more concisely,

: $\varphi(L) X_t = \theta(L) \varepsilon_t$

or

: $\frac{\varphi(L)}{\theta(L)} X_t = \varepsilon_t.$

This is the form used in Box, Jenkins & Reinsel. Moreover, starting the summations from $i = 0$ and setting $\varphi_0 = -1$ and $\theta_0 = 1$ gives an even more elegant formulation:

: $-\sum_{i=0}^p \varphi_i L^i \; X_t = \sum_{i=0}^q \theta_i L^i \; \varepsilon_t.$
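The lag-polynomial form also gives a practical way to expand a stationary ARMA model as an infinite moving average, $X_t = \psi(L) \varepsilon_t$ with $\psi(L) = \theta(L) / \varphi(L)$. Matching coefficients in $\varphi(L) \psi(L) = \theta(L)$ yields $\psi_0 = 1$ and $\psi_j = \theta_j + \sum_{i=1}^{\min(j,p)} \varphi_i \psi_{j-i}$, with $\theta_j = 0$ for $j > q$. A minimal Python sketch of this recursion, assuming NumPy and using the hypothetical name `psi_weights`:

```python
import numpy as np


def psi_weights(phi, theta, n_terms):
    """First n_terms coefficients of psi(L) = theta(L) / varphi(L)."""
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    psi = np.zeros(n_terms)
    for j in range(n_terms):
        # Coefficient of L^j in theta(L): 1 for j = 0, theta_j for j <= q, else 0.
        if j == 0:
            value = 1.0
        elif j <= len(theta):
            value = theta[j - 1]
        else:
            value = 0.0
        # Match the coefficient of L^j in varphi(L) * psi(L) = theta(L).
        for i in range(1, min(j, len(phi)) + 1):
            value += phi[i - 1] * psi[j - i]
        psi[j] = value
    return psi


psi = psi_weights(phi=[0.5], theta=[0.4], n_terms=4)
# Multiplying back by varphi(L) = 1 - 0.5 L recovers theta(L) = 1 + 0.4 L.
recovered = np.convolve([1.0, -0.5], psi)[:4]
```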

Spectrum

The spectral density of an ARMA process is

: $S(f) = \frac{\sigma^2}{2\pi} \left\vert \frac{\theta(e^{-if})}{\varphi(e^{-if})} \right\vert^2$

where $\sigma^2$ is the variance of the white noise, $\theta$ is the characteristic polynomial of the moving average part of the ARMA model, and $\varphi$ is the characteristic polynomial of the autoregressive part of the ARMA model.
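The spectral density can be evaluated numerically by substituting $z = e^{-if}$ into both polynomials. The Python sketch below assumes NumPy, takes $f$ as an angular frequency in radians per sample, and uses the sign conventions $\theta(z) = 1 + \sum_i \theta_i z^i$ and $\varphi(z) = 1 - \sum_i \varphi_i z^i$ from the lag-operator form; the function name `arma_spectral_density` is hypothetical.

```python
import numpy as np


def arma_spectral_density(phi, theta, freqs, sigma2=1.0):
    """Evaluate S(f) = sigma^2 / (2 pi) * |theta(e^{-if}) / varphi(e^{-if})|^2."""
    z = np.exp(-1j * np.asarray(freqs, dtype=float))
    # Moving-average polynomial theta(z) = 1 + sum_i theta_i z^i.
    ma_poly = np.ones_like(z)
    for i, coef in enumerate(theta, start=1):
        ma_poly = ma_poly + coef * z**i
    # Autoregressive polynomial varphi(z) = 1 - sum_i varphi_i z^i.
    ar_poly = np.ones_like(z)
    for i, coef in enumerate(phi, start=1):
        ar_poly = ar_poly - coef * z**i
    return sigma2 / (2.0 * np.pi) * np.abs(ma_poly / ar_poly) ** 2


freqs = np.linspace(0.0, np.pi, 5)
flat = arma_spectral_density([], [], freqs)        # white noise: constant spectrum
ar1 = arma_spectral_density([0.5], [], freqs)      # AR(1): power concentrated at low f
```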

Fitting models

Choosing ''p'' and ''q''

An appropriate value of ''p'' in the ARMA(''p'', ''q'') model can be found by plotting the partial autocorrelation functions. Similarly, ''q'' can be estimated by using the autocorrelation functions. Both ''p'' and ''q'' can be determined simultaneously using extended autocorrelation functions (EACF). Further information can be gleaned by considering the same functions for the residuals of a model fitted with an initial selection of ''p'' and ''q''. Brockwell & Davis recommend using the Akaike information criterion (AIC) for finding ''p'' and ''q''. Another option is the Bayesian information criterion (BIC).
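As an illustration of the autocorrelation-based approach, the sample autocorrelation and partial autocorrelation functions can be computed with a few lines of Python. This is a sketch assuming NumPy, not the routine of any particular package; the names `sample_acf` and `sample_pacf` are hypothetical, and the partial autocorrelations are obtained with the Durbin–Levinson recursion, which is one of several possible estimators.

```python
import numpy as np


def sample_acf(x, max_lag):
    """Sample autocorrelations rho(0), ..., rho(max_lag)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    acov = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(max_lag + 1)])
    return acov / acov[0]


def sample_pacf(x, max_lag):
    """Sample partial autocorrelations for lags 1..max_lag (Durbin-Levinson)."""
    rho = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag)
    prev = np.zeros(0)  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - np.dot(prev, rho[k - 1:0:-1])
        den = 1.0 - np.dot(prev, rho[1:k])
        phi_kk = num / den
        # Update the AR(k) coefficients; the last one is the lag-k partial autocorrelation.
        prev = np.concatenate((prev - phi_kk * prev[::-1], [phi_kk]))
        pacf[k - 1] = phi_kk
    return pacf


# Example data: an AR(1) series with coefficient 0.6 (simulated, for illustration only).
rng = np.random.default_rng(1)
eps = rng.standard_normal(5000)
x = np.zeros_like(eps)
x[0] = eps[0]
for t in range(1, len(x)):
    x[t] = 0.6 * x[t - 1] + eps[t]

acf = sample_acf(x, 5)
pacf = sample_pacf(x, 5)
```

For an AR(''p'') process the partial autocorrelations are close to zero beyond lag ''p'', while for an MA(''q'') process the autocorrelations are close to zero beyond lag ''q''; in the simulated AR(1) example only the first partial autocorrelation should be clearly non-zero.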

Estimating coefficients

After choosing ''p'' and ''q'', ARMA models can be fitted by least squares regression to find the values of the parameters which minimize the error term. It is good practice to find the smallest values of ''p'' and ''q'' which provide an acceptable fit to the data. For a pure AR model, the Yule–Walker equations may be used to provide a fit. ARMA outputs are used primarily to forecast (predict), and not to infer causation as in other areas of econometrics and regression methods such as OLS and 2SLS.
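For the pure AR case, the Yule–Walker equations relate the coefficients to the autocovariances through $\gamma(k) = \sum_{i=1}^p \varphi_i \gamma(k-i)$ for $k = 1, \ldots, p$, which is a linear system in $\varphi_1, \ldots, \varphi_p$. The Python sketch below assumes NumPy; the names `sample_autocov` and `yule_walker_from_autocov` are hypothetical, and the biased autocovariance estimator (division by the series length) is an assumption of this example.

```python
import numpy as np


def sample_autocov(x, max_lag):
    """Biased sample autocovariances gamma(0), ..., gamma(max_lag)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(max_lag + 1)])


def yule_walker_from_autocov(gamma, p):
    """Solve the Yule-Walker equations for AR(p) given gamma(0), ..., gamma(p)."""
    gamma = np.asarray(gamma, dtype=float)
    # Toeplitz matrix whose (i, j) entry is gamma(|i - j|).
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    phi = np.linalg.solve(gamma[idx], gamma[1 : p + 1])
    # Implied white-noise variance: gamma(0) - sum_i phi_i gamma(i).
    sigma2 = gamma[0] - np.dot(phi, gamma[1 : p + 1])
    return phi, sigma2


# With the exact autocovariances of an AR(1) with coefficient 0.5 and unit noise
# variance, the equations return the true parameters.
phi_hat, sigma2_hat = yule_walker_from_autocov([4.0 / 3.0, 2.0 / 3.0], p=1)
```

In practice the exact autocovariances are unknown and would be replaced by estimates, for example `yule_walker_from_autocov(sample_autocov(x, p), p)` for an observed series `x`.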

Software implementations

* In R, the standard package stats has the function arima, documented in "ARIMA Modelling of Time Series". Another package provides an improved script called sarima for fitting ARMA models (seasonal and nonseasonal) and sarima.sim to simulate data from these models. Extension packages contain related and extended functionality: package tseries includes the function arma(); package fracdiff contains fracdiff() for fractionally integrated ARMA processes; and a further package includes auto.arima for selecting a parsimonious set of ''p'', ''q''. The relevant CRAN task view contains links to most of these.
* Mathematica has a complete library of time series functions including ARMA.
* MATLAB includes functions such as arma, among others, to estimate autoregressive, exogenous autoregressive and ARMAX models.
* Julia has community-driven packages that implement fitting with an ARMA model, such as arma.jl.
* Python has the statsmodels package, which includes many models and functions for time series analysis, including ARMA. Formerly part of the scikit-learn library, it is now stand-alone and integrates well with Pandas.
* PyFlux has a Python-based implementation of ARIMAX models, including Bayesian ARIMAX models.
* IMSL Numerical Libraries are libraries of numerical analysis functionality including ARMA and ARIMA procedures implemented in standard programming languages like C, Java, C# .NET, and Fortran.
* gretl can estimate ARMA models.
* GNU Octave extra package octave-forge supports AR models.
* Stata includes the function arima for ARMA and ARIMA models.
* SuanShu is a Java library of numerical methods that implements univariate/multivariate ARMA, ARIMA, ARMAX, etc. models, documented in "SuanShu, a Java numerical and statistical library".
* SAS has an econometric package, ETS, that estimates ARIMA models.


  History and interpretations 

The general ARMA model was described in the 1951 thesis of Peter Whittle, who used mathematical analysis (Laurent series and Fourier analysis) and statistical inference. ARMA models were popularized by a 1970 book by George E. P. Box and Jenkins, who expounded an iterative (Box–Jenkins) method for choosing and estimating them. This method was useful for low-order polynomials (of degree three or less).

ARMA is essentially an infinite impulse response filter applied to white noise, with some additional interpretation placed on it.

In digital signal processing, ARMA is represented as a digital filter with white noise at the input and the ARMA process at the output.

  Applications 

ARMA is appropriate when a system is a function of a series of unobserved shocks (the MA or moving average part) as well as its own behavior.  For example, stock prices may be shocked by fundamental information as well as exhibiting technical trending and  mean-reversion effects due to market participants.

  Generalizations 


There are various generalizations of ARMA. Nonlinear AR (NAR), nonlinear MA (NMA) and nonlinear ARMA (NARMA) model nonlinear dependence on past values and error terms. Vector AR (VAR) and vector ARMA (VARMA) model multivariate time series. Autoregressive integrated moving average (ARIMA) models non-stationary time series (that is, whose mean changes over time). Autoregressive conditional heteroskedasticity (ARCH) models time series where the variance changes. Seasonal ARIMA (SARIMA or periodic ARMA) models periodic variation. Autoregressive fractionally integrated moving average (ARFIMA, or Fractional ARIMA, FARIMA) models time series that exhibit long memory. Multiscale AR (MAR) is indexed by the nodes of a tree instead of integers.

  Autoregressive–moving-average model with exogenous inputs (ARMAX) 
 

The notation ARMAX(''p'', ''q'', ''b'') refers to a model with ''p'' autoregressive terms, ''q'' moving average terms and ''b'' exogenous input terms. The last term is a linear combination of the last ''b'' terms of a known and external time series $d_t$. It is given by:

: $X_t = \varepsilon_t + \sum_{i=1}^p \varphi_i X_{t-i} + \sum_{i=1}^q \theta_i \varepsilon_{t-i} + \sum_{i=1}^b \eta_i d_{t-i}$
where $\eta_1, \ldots, \eta_b$ are the ''parameters'' of the exogenous input $d_t$.
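The ARMAX recursion differs from the ARMA recursion only by the additional sum over lagged values of the exogenous series. A minimal Python sketch, assuming NumPy, the hypothetical function name `armax_filter`, and zero values for all quantities before the start of the sample:

```python
import numpy as np


def armax_filter(phi, theta, eta, eps, d):
    """Compute X_t = eps_t + sum phi_i X_{t-i} + sum theta_i eps_{t-i} + sum eta_i d_{t-i}.

    Pre-sample values of X, eps and d are treated as zero (an assumption of this sketch).
    """
    phi, theta, eta = (np.asarray(a, dtype=float) for a in (phi, theta, eta))
    eps = np.asarray(eps, dtype=float)
    d = np.asarray(d, dtype=float)
    if len(d) != len(eps):
        raise ValueError("eps and d must have the same length")
    x = np.zeros_like(eps)
    for t in range(len(eps)):
        value = eps[t]
        for i in range(1, min(len(phi), t) + 1):
            value += phi[i - 1] * x[t - i]       # autoregressive terms
        for i in range(1, min(len(theta), t) + 1):
            value += theta[i - 1] * eps[t - i]   # moving-average terms
        for i in range(1, min(len(eta), t) + 1):
            value += eta[i - 1] * d[t - i]       # exogenous input terms
        x[t] = value
    return x


# Response to a single unit pulse in the exogenous input, with the noise switched off.
response = armax_filter(phi=[0.5], theta=[], eta=[2.0], eps=np.zeros(4), d=[1.0, 0.0, 0.0, 0.0])
```

With the noise set to zero, the output isolates how a pulse in $d_t$ enters with a one-step delay and then decays through the autoregressive part.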

Some nonlinear variants of models with exogenous variables have been defined: see for example  Nonlinear autoregressive exogenous model.

Statistical packages implement the ARMAX model through the use of "exogenous" (that is, independent) variables. Care must be taken when interpreting the output of those packages, because the estimated parameters usually (for example, in R ("ARIMA Modelling of Time Series", R documentation) and gretl) refer to the regression:
: $X_t - m_t = \varepsilon_t + \sum_{i=1}^p \varphi_i (X_{t-i} - m_{t-i}) + \sum_{i=1}^q \theta_i \varepsilon_{t-i}$
where $m_t$ incorporates all exogenous (or independent) variables:
: $m_t = c + \sum_{i=0}^b \eta_i d_{t-i}$

  See also 

* Autoregressive integrated moving average (ARIMA)
* Exponential smoothing
* Linear predictive coding
* Predictive analytics
* Infinite impulse response
* Finite impulse response






  Further reading 


Shumway, R.H. and Stoffer, D.S. (2017). ''Time Series Analysis and Its Applications with R Examples''. Springer. DOI: 10.1007/978-3-319-52452-8




