In statistics, mean absolute error (MAE) is a measure of errors between paired observations expressing the same phenomenon. Examples of ''Y'' versus ''X'' include comparisons of predicted versus observed, subsequent time versus initial time, and one technique of measurement versus an alternative technique of measurement. MAE is calculated as the sum of absolute errors (i.e., the Manhattan distance) divided by the sample size:

\mathrm{MAE} = \frac{\sum_{i=1}^n \left| y_i - x_i \right|}{n} = \frac{\sum_{i=1}^n \left| e_i \right|}{n}.

It is thus an arithmetic average of the absolute errors |e_i| = |y_i - x_i|, where y_i is the prediction and x_i the true value. Alternative formulations may include relative frequencies as weight factors. The mean absolute error uses the same scale as the data being measured. It is therefore a scale-dependent accuracy measure and cannot be used to compare predicted values that use different scales. The mean absolute error is a common measure of forecast error in time series analysis, sometimes used in confusion with the more standard definition of mean absolute deviation. The same confusion exists more generally.
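The definition can be sketched in a few lines of Python. This is only an illustration: the function name `mean_absolute_error` and the sample values are assumptions made for the example, not part of any particular library or dataset. The second call rescales the same data by a factor of 1000 to show the scale dependence described above.

```python
def mean_absolute_error(predicted, observed):
    """Return the arithmetic mean of |y_i - x_i| over paired observations."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one pair is required")
    absolute_errors = [abs(y - x) for y, x in zip(predicted, observed)]
    return sum(absolute_errors) / len(absolute_errors)


# Illustrative values only (hypothetical, not taken from any dataset).
y = [3.0, -0.5, 2.0, 7.0]  # predictions
x = [2.5, 0.0, 2.0, 8.0]   # true values

mae = mean_absolute_error(y, x)

# The same data expressed in units 1000 times smaller:
# the MAE grows by the same factor, so it is scale-dependent.
mae_rescaled = mean_absolute_error(
    [1000 * v for v in y],
    [1000 * v for v in x],
)

print(mae, mae_rescaled)
```

Because the two results differ only by the unit conversion, MAE values are comparable only between series measured on the same scale.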

Quantity disagreement and allocation disagreement

In remote sensing the MAE is sometimes expressed as the sum of two components: quantity disagreement and allocation disagreement. Quantity disagreement is the absolute value of the mean error:

\left| \frac{\sum_{i=1}^n (y_i - x_i)}{n} \right|.

Allocation disagreement is MAE minus quantity disagreement. It is also possible to identify the types of difference by looking at an (x,y) plot. Quantity difference exists when the average of the X values does not equal the average of the Y values. Allocation difference exists if and only if points reside on both sides of the identity line.
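The decomposition can be sketched as follows. The helper name `disagreement_components` and both small data sets are hypothetical, chosen so that one case has all points on the same side of the identity line and the other has points on both sides.

```python
def disagreement_components(predicted, observed):
    """Return (MAE, quantity disagreement, allocation disagreement)."""
    n = len(predicted)
    errors = [y - x for y, x in zip(predicted, observed)]
    mae = sum(abs(e) for e in errors) / n
    quantity = abs(sum(errors) / n)   # absolute value of the mean error
    allocation = mae - quantity       # remainder of the MAE
    return mae, quantity, allocation


# Every prediction overshoots by 1: all points lie on one side of Y = X,
# so the whole MAE is quantity disagreement.
same_side = disagreement_components([2.0, 3.0], [1.0, 2.0])

# Errors of +2 and -2 cancel in the mean: points lie on both sides of Y = X,
# so the whole MAE is allocation disagreement.
both_sides = disagreement_components([3.0, 1.0], [1.0, 3.0])

print(same_side, both_sides)
```

In the first case the averages of X and Y differ and allocation disagreement is zero; in the second the averages agree and quantity disagreement is zero.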

Related measures

The mean absolute error is one of a number of ways of comparing forecasts with their eventual outcomes. Well-established alternatives are the mean absolute scaled error (MASE), mean absolute log error (MALE), and the mean squared error. These all summarize performance in ways that disregard the direction of over- or under-prediction; a measure that does place emphasis on this is the mean signed difference.

Where a prediction model is to be fitted using a selected performance measure, in the sense that the least squares approach is related to the mean squared error, the equivalent for mean absolute error is least absolute deviations.

MAE is not identical to root-mean-square error (RMSE), although some researchers report and interpret it that way. The MAE is conceptually simpler and also easier to interpret than RMSE: it is simply the average absolute vertical or horizontal distance between each point in a scatter plot and the Y=X line. In other words, MAE is the average absolute difference between X and Y. Furthermore, each error contributes to MAE in proportion to the absolute value of the error. This is in contrast to RMSE, which involves squaring the differences, so that a few large differences will increase the RMSE to a greater degree than the MAE.
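A small numerical sketch makes the contrast with RMSE concrete. The function names `mae` and `rmse` and the two error vectors are assumptions for illustration; both vectors have the same total absolute error, but one concentrates it in a single large difference.

```python
import math


def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Square root of the mean of the squared errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical error vectors with the same total absolute error (4.0).
evenly_spread = [1.0, 1.0, 1.0, 1.0]
concentrated = [0.0, 0.0, 0.0, 4.0]

print(mae(evenly_spread), rmse(evenly_spread))
print(mae(concentrated), rmse(concentrated))
```

The MAE is the same for both vectors, while the RMSE is larger for the vector containing the single large error.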

Optimality property

The ''mean absolute error'' of a real variable ''c'' with respect to the random variable ''X'' is

E(\left| X-c \right|).

Provided that the probability distribution of ''X'' is such that the above expectation exists, ''m'' is a median of ''X'' if and only if ''m'' is a minimizer of the mean absolute error with respect to ''X''. In particular, ''m'' is a sample median if and only if ''m'' minimizes the arithmetic mean of the absolute deviations.

More generally, a median is defined as a minimum of

E(|X-c| - |X|),

as discussed at Multivariate median (and specifically at Spatial median). This optimization-based definition of the median is useful in statistical data analysis, for example, in ''k''-medians clustering.
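The sample version of this property can be checked numerically. The sketch below is an illustration under stated assumptions: the sample is made up, the helper `mean_abs_deviation` is a hypothetical name, and the brute-force search over a grid of candidate values is a check for this one sample, not a proof.

```python
import statistics


def mean_abs_deviation(sample, c):
    """Arithmetic mean of |v - c| over the sample."""
    return sum(abs(v - c) for v in sample) / len(sample)


# Hypothetical sample with one large value.
sample = [1.0, 2.0, 3.0, 4.0, 100.0]

m = statistics.median(sample)
mean = statistics.mean(sample)

cost_at_median = mean_abs_deviation(sample, m)
cost_at_mean = mean_abs_deviation(sample, mean)

# Brute-force check on a grid of candidates 0.0, 0.1, ..., 100.0.
candidates = [i / 10 for i in range(0, 1001)]
best = min(candidates, key=lambda c: mean_abs_deviation(sample, c))

print(m, best, cost_at_median, cost_at_mean)
```

For this sample the grid search lands on the sample median, and the mean absolute deviation around the median is smaller than around the arithmetic mean.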

Proof of optimality

Statement: The classifier minimising

\mathbb{E}|y-\hat{y}|

is

\hat{f}(x)=\text{Median}(y \mid X=x).

Proof: The loss function for classification is

\begin{align}
L &= \mathbb{E}[\,|y-a| \mid X=x\,] \\
&= \int_{-\infty}^{\infty} |y-a| f_{Y|X}(y)\, dy \\
&= \int_{-\infty}^a (a-y) f_{Y|X}(y)\, dy + \int_a^{\infty} (y-a) f_{Y|X}(y)\, dy.
\end{align}

Differentiating with respect to ''a'' gives

\frac{\partial}{\partial a} L = \int_{-\infty}^a f_{Y|X}(y)\, dy + \int_a^{\infty} -f_{Y|X}(y)\, dy = 0 .

This means

\int_{-\infty}^a f_{Y|X}(y)\, dy = \int_a^{\infty} f_{Y|X}(y)\, dy .

Hence,

F_{Y|X}(a)=0.5 .
