In statistics, the Jarque–Bera test is a goodness-of-fit test of whether sample data have the skewness and kurtosis matching a normal distribution. The test is named after Carlos Jarque and Anil K. Bera. The test statistic is always nonnegative. If it is far from zero, it signals that the data do not have a normal distribution. The test statistic ''JB'' is defined as:

\mathit{JB} = \frac{n}{6} \left( S^2 + \frac14 (K-3)^2 \right)

where ''n'' is the number of observations (or degrees of freedom in general), ''S'' is the sample skewness, and ''K'' is the sample kurtosis:

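S = \frac{\hat{\mu}_3}{\hat{\sigma}^3} = \frac{\frac1n \sum_{i=1}^n (x_i - \bar{x})^3}{\left( \frac1n \sum_{i=1}^n (x_i - \bar{x})^2 \right)^{3/2}} ,

K = \frac{\hat{\mu}_4}{\hat{\sigma}^4} = \frac{\frac1n \sum_{i=1}^n (x_i - \bar{x})^4}{\left( \frac1n \sum_{i=1}^n (x_i - \bar{x})^2 \right)^{2}} ,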

where \hat{\mu}_3 and \hat{\mu}_4 are the estimates of the third and fourth central moments, respectively, \bar{x} is the sample mean, and \hat{\sigma}^2 is the estimate of the second central moment, the variance.

If the data come from a normal distribution, the ''JB'' statistic asymptotically has a chi-squared distribution with two degrees of freedom, so the statistic can be used to test the hypothesis that the data are from a normal distribution. The null hypothesis is a joint hypothesis of the skewness being zero and the excess kurtosis being zero. Samples from a normal distribution have an expected skewness of 0 and an expected excess kurtosis of 0 (which is the same as a kurtosis of 3). As the definition of ''JB'' shows, any deviation from this increases the ''JB'' statistic.

For small samples the chi-squared approximation is overly sensitive, often rejecting the null hypothesis when it is true. Furthermore, the distribution of ''p''-values departs from a uniform distribution and becomes a right-skewed unimodal distribution, especially for small ''p''-values. This leads to a large Type I error rate. The table below shows some ''p''-values approximated by a chi-squared distribution that differ from their true alpha levels for small samples. (These values have been approximated using Monte Carlo simulation in MATLAB.)

In MATLAB's implementation, the chi-squared approximation for the ''JB'' statistic's distribution is only used for large sample sizes (> 2000). For smaller samples, it uses a table derived from Monte Carlo simulations in order to interpolate ''p''-values.
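The definition and its asymptotic ''p''-value can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names `jarque_bera` and `rejection_rate` are hypothetical, the moments are the divide-by-''n'' estimates, and the ''p''-value uses the fact that a chi-squared variable with two degrees of freedom has survival function exp(−t/2). The second function is an assumed, very small Monte Carlo check of how often normal samples of a given size are rejected at a nominal level; it does not reproduce MATLAB's interpolation table.

```python
import math
import random


def jarque_bera(x):
    """Return (JB, p) for a sequence of numbers, using the chi-squared(2) approximation."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(x) / n
    # Divide-by-n estimates of the second, third and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    if m2 == 0:
        raise ValueError("sample variance is zero")
    s = m3 / m2 ** 1.5  # sample skewness S
    k = m4 / m2 ** 2    # sample kurtosis K
    jb = n / 6.0 * (s ** 2 + 0.25 * (k - 3.0) ** 2)
    # Chi-squared with 2 degrees of freedom: P(X > t) = exp(-t / 2).
    p = math.exp(-jb / 2.0)
    return jb, p


def rejection_rate(n, alpha, trials=2000, seed=0):
    """Fraction of simulated standard-normal samples of size n with p < alpha."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    rejected = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        _, p = jarque_bera(sample)
        if p < alpha:
            rejected += 1
    return rejected / trials


if __name__ == "__main__":
    print(jarque_bera([1, 2, 3, 4, 5]))
    print(rejection_rate(n=20, alpha=0.05))
```

Comparing `rejection_rate(n, alpha)` with `alpha` for several small values of ''n'' gives a rough picture of how far the chi-squared approximation is from the nominal level; the estimate is only as precise as the number of trials allows.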

History

The statistic was derived by Carlos M. Jarque and Anil K. Bera while working on their Ph.D. thesis at the Australian National University.

Jarque–Bera test in regression analysis

According to Robert Hall, David Lilien, et al. (1995), when using this test along with multiple regression analysis, the right estimate is:

\mathit{JB} = \frac{n-k}{6} \left( S^2 + \frac14 (K-3)^2 \right)

where ''n'' is the number of observations and ''k'' is the number of regressors when examining residuals to an equation.
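A minimal sketch of this regression variant follows, assuming the residuals have already been computed elsewhere and that the only change from the basic statistic is the factor (''n'' − ''k'')/6. The function name `jarque_bera_regression` is hypothetical and is not taken from any of the packages mentioned in this article.

```python
def jarque_bera_regression(residuals, k):
    """JB statistic for regression residuals, scaled by (n - k) / 6.

    residuals: sequence of residuals from a fitted equation
    k: number of regressors in that equation
    """
    n = len(residuals)
    if n <= k:
        raise ValueError("need more observations than regressors")
    mean = sum(residuals) / n
    # Divide-by-n central moment estimates of the residuals.
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    if m2 == 0:
        raise ValueError("residual variance is zero")
    s = m3 / m2 ** 1.5    # sample skewness S
    kurt = m4 / m2 ** 2   # sample kurtosis K
    return (n - k) / 6.0 * (s ** 2 + 0.25 * (kurt - 3.0) ** 2)


if __name__ == "__main__":
    # Illustrative numbers only, not real regression output.
    print(jarque_bera_regression([1, 2, 3, 4, 5], k=1))
```

With `k=0` the function reduces to the basic definition of ''JB''.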

Implementations

* ALGLIB includes an implementation of the Jarque–Bera test in C++, C#, Delphi, Visual Basic, etc.
* gretl includes an implementation of the Jarque–Bera test.
* Julia includes an implementation of the Jarque–Bera test, ''JarqueBeraTest'', in the ''HypothesisTests'' package.
* MATLAB includes an implementation of the Jarque–Bera test, the function "jbtest".
* Python statsmodels includes an implementation of the Jarque–Bera test, "statsmodels.stats.stattools.py".
* R includes implementations of the Jarque–Bera test: ''jarque.bera.test'' in the package ''tseries'', for example, and ''jarque.test'' in the package ''moments''.
* Wolfram includes a built-in function called JarqueBeraALMTest, which is not limited to testing against a Gaussian distribution.
