In statistics, asymptotic theory, or large sample theory, is a framework for assessing properties of estimators and statistical tests. Within this framework, it is often assumed that the sample size ''n'' may grow indefinitely; the properties of estimators and tests are then evaluated under the limit of ''n'' → ∞. In practice, a limit evaluation is considered to be approximately valid for large finite sample sizes too.
[Höpfner, R. (2014), ''Asymptotic Statistics'', Walter de Gruyter, 286 pp.]
Overview
Most statistical problems begin with a dataset of size ''n''. The asymptotic theory proceeds by assuming that it is possible (in principle) to keep collecting additional data, so that the sample size grows infinitely, i.e. ''n'' → ∞. Under this assumption, many results can be obtained that are unavailable for samples of finite size. An example is the weak law of large numbers. The law states that for a sequence of independent and identically distributed (IID) random variables ''X''₁, ''X''₂, …, if one value is drawn from each random variable and the average of the first ''n'' values is computed as X̄_n, then X̄_n converges in probability to the population mean E[''X''ᵢ] as ''n'' → ∞.
[DasGupta, A. (2008), ''Asymptotic Theory of Statistics and Probability'', Springer.]
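As a concrete illustration of this convergence, here is a minimal Python sketch (assuming, purely for illustration, IID draws from an exponential distribution with population mean 2); the running average X̄_n drifts toward the population mean as ''n'' grows:

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 2.0  # population mean of the illustrative exponential distribution

# One long IID sample; running_mean[n-1] is the sample average X̄_n of the first n draws.
x = rng.exponential(scale=true_mean, size=100_000)
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)

# The deviation |X̄_n - E[X]| typically shrinks as n grows (convergence in probability).
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}:  |mean - mu| = {abs(running_mean[n - 1] - true_mean):.5f}")
```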
In asymptotic theory, the standard approach is ''n'' → ∞. For some statistical models, slightly different approaches of asymptotics may be used. For example, with panel data, it is commonly assumed that one dimension in the data remains fixed whereas the other dimension grows: ''T'' = constant and ''N'' → ∞, or vice versa.
Besides the standard approach to asymptotics, other alternative approaches exist:
* Within the local asymptotic normality framework, it is assumed that the value of the "true parameter" in the model varies slightly with ''n'', such that the ''n''-th model corresponds to θ_n = θ + h/√n. This approach lets us study the regularity of estimators.
* When statistical tests are studied for their power to distinguish against alternatives that are close to the null hypothesis, it is done within the so-called "local alternatives" framework: the null hypothesis is H₀: θ = θ₀ and the alternative is H₁: θ = θ₀ + h/√n. This approach is especially popular for unit root tests.
* There are models where the dimension of the parameter space slowly expands with ''n'', reflecting the fact that the more observations there are, the more structural effects can be feasibly incorporated into the model.
* In kernel density estimation and kernel regression, an additional parameter is assumed—the bandwidth ''h''. In those models, it is typically taken that ''h'' → 0 as ''n'' → ∞. The rate at which the bandwidth shrinks must be chosen carefully, though, usually ''h'' ∝ ''n''^(−1/5); a simulation sketch of this choice follows the list.
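The snippet below is only an illustration of the last point: it assumes standard normal data, a Gaussian kernel, and the plain choice h = n^(−1/5); the pointwise estimate of the density at 0 then tends to improve as the sample grows.

```python
import numpy as np

def gaussian_kde_at(x0, data, h):
    """Kernel density estimate at x0 using a Gaussian kernel with bandwidth h."""
    u = (x0 - data) / h
    return np.mean(np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)) / h

rng = np.random.default_rng(1)
true_density_at_0 = 1 / np.sqrt(2 * np.pi)  # N(0, 1) density at x = 0

for n in (100, 1_000, 10_000, 100_000):
    data = rng.standard_normal(n)
    h = n ** (-1 / 5)  # bandwidth shrinking at the usual n^(-1/5) rate
    estimate = gaussian_kde_at(0.0, data, h)
    print(f"n = {n:>6}, h = {h:.3f}, error = {abs(estimate - true_density_at_0):.5f}")
```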
In many cases, highly accurate results for finite samples can be obtained via numerical methods (i.e. computers); even in such cases, though, asymptotic analysis can be useful.
Modes of convergence of random variables
Asymptotic properties
Estimators
''Consistency''
A sequence of estimates is said to be ''consistent'' if it converges in probability to the true value of the parameter being estimated:
: \hat\theta_n \ \xrightarrow{p}\ \theta_0.
That is, roughly speaking, with an infinite amount of data the estimator (the formula for generating the estimates) would almost surely give the correct result for the parameter being estimated.
''Asymptotic distribution''
If it is possible to find sequences of non-random constants a_n, b_n (possibly depending on the value of θ₀), and a non-degenerate distribution ''G'' such that
: b_n(\hat\theta_n - a_n)\ \xrightarrow{d}\ G,
then the sequence of estimators \hat\theta_n is said to have the ''asymptotic distribution'' ''G''.
Most often, the estimators encountered in practice are asymptotically normal, meaning their asymptotic distribution is the normal distribution, with a_n = θ₀, b_n = √n, and G = N(0, V):
: \sqrt{n}\,(\hat\theta_n - \theta_0)\ \xrightarrow{d}\ \mathcal{N}(0, V).
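As a rough numerical check of this statement (a sketch, again assuming IID exponential draws with mean 2, so that V equals the variance 4), the scaled and centred sample mean √n(X̄_n − μ) should look approximately N(0, 4) for large n:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = 2.0   # population mean of the Exponential(scale=2) draws
V = 4.0    # its variance, which is the asymptotic variance of the sample mean here
n, replications = 2_000, 5_000

# For each replication, compute sqrt(n) * (sample mean - mu).
samples = rng.exponential(scale=mu, size=(replications, n))
z = np.sqrt(n) * (samples.mean(axis=1) - mu)

# If the sample mean is asymptotically N(0, V), these should be near 0 and 4.
print("mean of sqrt(n)*(mean - mu):", round(z.mean(), 3))
print("variance (compare with V = 4):", round(z.var(), 3))
```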
''Asymptotic confidence regions''
Asymptotic theorems
* Central limit theorem
* Continuous mapping theorem
* Glivenko–Cantelli theorem
* Law of large numbers
* Law of the iterated logarithm
* Slutsky's theorem
* Delta method
See also
* Asymptotic analysis
* Exact statistics
* Large deviations theory
References
Bibliography
* DasGupta, A. (2008), ''Asymptotic Theory of Statistics and Probability'', Springer.
* Höpfner, R. (2014), ''Asymptotic Statistics'', Walter de Gruyter, 286 pp.