In statistics, the concept of being an invariant estimator is a criterion that can be used to compare the properties of different estimators for the same quantity. It is a way of formalising the idea that an estimator should have certain intuitively appealing qualities. Strictly speaking, "invariant" would mean that the estimates themselves are unchanged when both the measurements and the parameters are transformed in a compatible way, but the meaning has been extended to allow the estimates to change in appropriate ways with such transformations. The term equivariant estimator is used in formal mathematical contexts that include a precise description of how the estimator changes in response to changes to the dataset and parameterisation: this corresponds to the use of "equivariance" in more general mathematics.
General setting

Background

In statistical inference, there are several approaches to estimation theory that can be used to decide immediately what estimators should be used according to those approaches. For example, ideas from Bayesian inference would lead directly to Bayesian estimators. Similarly, the theory of classical statistical inference can sometimes lead to strong conclusions about what estimator should be used. However, the usefulness of these theories depends on having a fully prescribed statistical model and may also depend on having a relevant loss function to determine the estimator. Thus a Bayesian analysis might be undertaken, leading to a posterior distribution for relevant parameters, but the use of a specific utility or loss function may be unclear. Ideas of invariance can then be applied to the task of summarising the posterior distribution. In other cases, statistical analyses are undertaken without a fully defined statistical model, or the classical theory of statistical inference cannot be readily applied because the family of models being considered is not amenable to such treatment. In addition to these cases where general theory does not prescribe an estimator, the concept of invariance of an estimator can be applied when seeking estimators of alternative forms, either for the sake of simplicity of application of the estimator or so that the estimator is robust.
The concept of invariance is sometimes used on its own as a way of choosing between estimators, but this is not necessarily definitive. For example, a requirement of invariance may be incompatible with the requirement that the estimator be mean-unbiased; on the other hand, the criterion of median-unbiasedness is defined in terms of the estimator's sampling distribution and so is invariant under many transformations.
One use of the concept of invariance is where a class or family of estimators is proposed and a particular formulation must be selected amongst these. One procedure is to impose relevant invariance properties and then to find the formulation within this class that has the best properties, leading to what is called the optimal invariant estimator.
Some classes of invariant estimators

There are several types of transformations that are usefully considered when dealing with invariant estimators. Each gives rise to a class of estimators which are invariant to those particular types of transformation.
*Shift invariance: Notionally, estimates of a location parameter should be invariant to simple shifts of the data values. If all data values are increased by a given amount, the estimate should change by the same amount. When considering estimation using a weighted average, this invariance requirement immediately implies that the weights should sum to one. While the same result is often derived from a requirement for unbiasedness, the use of "invariance" does not require that a mean value exists and makes no use of any probability distribution at all.
*Scale invariance: This concerns the invariance of an estimator of a scale parameter. It is not to be confused with the more general notion of scale invariance, which concerns the behavior of systems under aggregate properties (in physics).
*Parameter-transformation invariance: Here, the transformation applies to the parameters alone. The concept here is that essentially the same inference should be made from data and a model involving a parameter θ as would be made from the same data if the model used a parameter φ, where φ is a one-to-one transformation of θ, φ = h(θ). According to this type of invariance, results from transformation-invariant estimators should also be related by φ = h(θ). Maximum likelihood estimators have this property when the transformation is monotonic. Though the asymptotic properties of the estimator might be invariant, the small sample properties can be different, and a specific distribution needs to be derived (Gouriéroux and Monfort, 1995).
*Permutation invariance: Where a set of data values can be represented by a statistical model in which they are outcomes of independent and identically distributed random variables, it is reasonable to impose the requirement that any estimator of any property of the common distribution should be permutation-invariant: specifically, the estimator, considered as a function of the set of data values, should not change if items of data are swapped within the dataset.
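The parameter-transformation invariance of maximum likelihood estimators described in the list above can be checked numerically. The sketch below is illustrative only: it assumes an exponential model with rate λ, reparameterised by its mean φ = h(λ) = 1/λ, and maximises each log-likelihood separately with a hypothetical helper `golden_max`. The data values are made up for the example.

```python
import math

def golden_max(f, lo, hi, tol=1e-9):
    """Maximise a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0

# Illustrative data (assumed), modelled as exponential with rate lam.
x = [0.8, 2.4, 1.1, 3.7, 2.0]
n, total = len(x), sum(x)

def loglik_rate(lam):
    """Exponential log-likelihood in the rate parameterisation."""
    return n * math.log(lam) - lam * total

def loglik_mean(phi):
    """Same log-likelihood, written in terms of the mean phi = 1 / lam."""
    return -n * math.log(phi) - total / phi

lam_hat = golden_max(loglik_rate, 0.01, 20.0)
phi_hat = golden_max(loglik_mean, 0.01, 20.0)

# Up to the optimiser's tolerance, phi_hat should agree with h(lam_hat).
print(lam_hat, phi_hat, 1.0 / lam_hat)
```

The two optimisations never share information, so their agreement reflects the invariance property itself and not a shared computation.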
The combination of permutation invariance and location invariance for estimating a location parameter from an independent and identically distributed dataset using a weighted average implies that the weights should be identical and sum to one. Of course, estimators other than a weighted average may be preferable.
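The two weighted-average claims above (weights summing to one for shift invariance, identical weights for permutation invariance) can be illustrated with a small sketch. The data, the weight vectors and the helper names are assumptions chosen for the example.

```python
def weighted_average(x, w):
    """Linear estimator sum_i w_i * x_i."""
    return sum(wi * xi for wi, xi in zip(w, x))

def shift_response(w, x, c):
    """Change in the estimate when every data value is increased by c."""
    shifted = [xi + c for xi in x]
    return weighted_average(shifted, w) - weighted_average(x, w)

x = [2.0, 5.0, 11.0]   # illustrative data (assumed)
c = 10.0               # common shift applied to all data values

w_sum_one = [0.2, 0.3, 0.5]      # weights sum to one
w_sum_not_one = [0.2, 0.3, 0.4]  # weights sum to 0.9
w_equal = [1.0 / 3.0] * 3        # identical weights summing to one

# The estimate moves by c * sum(w), so it tracks the shift only if sum(w) == 1.
resp_ok = shift_response(w_sum_one, x, c)
resp_bad = shift_response(w_sum_not_one, x, c)

# Reordering the data leaves the estimate unchanged only for identical weights.
x_perm = [11.0, 5.0, 2.0]
perm_gap_unequal = weighted_average(x_perm, w_sum_one) - weighted_average(x, w_sum_one)
perm_gap_equal = weighted_average(x_perm, w_equal) - weighted_average(x, w_equal)

print(resp_ok, resp_bad, perm_gap_unequal, perm_gap_equal)
```

No probability model appears anywhere in the sketch, which matches the remark that the invariance argument makes no use of a distribution.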
Optimal invariant estimators

Under this setting, we are given a set of measurements $x$ which contains information about an unknown parameter $\theta$. The measurements $x$ are modelled as a vector random variable having a probability density function $f(x \mid \theta)$ which depends on a parameter vector $\theta$.
The problem is to estimate $\theta$ given $x$. The estimate, denoted by $a$, is a function of the measurements and belongs to a set $A$. The quality of the result is defined by a loss function $L = L(a,\theta)$ which determines a risk function $R = R(a,\theta) = E[L(a,\theta) \mid \theta]$. The sets of possible values of $x$, $\theta$ and $a$ are denoted by $X$, $\Theta$ and $A$, respectively.
In classification

In statistical classification, the rule which assigns a class to a new data-item can be considered to be a special type of estimator. A number of invariance-type considerations can be brought to bear in formulating prior knowledge for pattern recognition.
Mathematical setting

Definition

An invariant estimator is an estimator which obeys the following two rules:
# Principle of Rational Invariance: The action taken in a decision problem should not depend on the transformation applied to the measurements used.
# Invariance Principle: If two decision problems have the same formal structure (in terms of $X$, $\Theta$, $f(x \mid \theta)$ and $L$), then the same decision rule should be used in each problem.
To define an invariant or equivariant estimator formally, some definitions related to groups of transformations are needed first. Let $X$ denote the set of possible data-samples. A group of transformations of $X$, to be denoted by $G$, is a set of (measurable) 1:1 and onto transformations of $X$ into itself, which satisfies the following conditions:
# If $g_1 \in G$ and $g_2 \in G$ then $g_1 g_2 \in G$.
# If $g \in G$ then $g^{-1} \in G$, where $g^{-1}(g(x)) = x$. (That is, each transformation has an inverse within the group.)
# $e \in G$ (i.e. there is an identity transformation $e(x) = x$).
Datasets $x_1$ and $x_2$ in $X$ are equivalent if $x_1 = g(x_2)$ for some $g \in G$. All the equivalent points form an equivalence class. Such an equivalence class is called an orbit (in $X$). The $x_0$ orbit, $X(x_0)$, is the set $X(x_0) = \{g(x_0) : g \in G\}$.
If $X$ consists of a single orbit then $G$ is said to be transitive.
A family of densities $F$ is said to be invariant under the group $G$ if, for every $g \in G$ and $\theta \in \Theta$ there exists a unique $\theta^* \in \Theta$ such that $Y = g(x)$ has density $f(y \mid \theta^*)$. $\theta^*$ will be denoted $\bar{g}(\theta)$.
If $F$ is invariant under the group $G$ then the loss function $L(\theta,a)$ is said to be invariant under $G$ if for every $g \in G$ and $a \in A$ there exists an $a^* \in A$ such that $L(\theta,a) = L(\bar{g}(\theta),a^*)$ for all $\theta \in \Theta$. The transformed value $a^*$ will be denoted by $\tilde{g}(a)$.
In the above, $\bar{G} = \{\bar{g} : g \in G\}$ is a group of transformations from $\Theta$ to itself and $\tilde{G} = \{\tilde{g} : g \in G\}$ is a group of transformations from $A$ to itself.
An estimation problem is invariant (equivariant) under $G$ if there exist three groups $G$, $\bar{G}$, $\tilde{G}$ as defined above.
For an estimation problem that is invariant under $G$, estimator $\delta(x)$ is an invariant estimator under $G$ if, for all $x \in X$ and $g \in G$,
:$\delta(g(x)) = \tilde{g}(\delta(x)).$
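The defining condition $\delta(g(x)) = \tilde{g}(\delta(x))$ can be tested directly for a concrete group. The sketch below is an illustration under stated assumptions and is not taken from the text: it uses the multiplicative scale group $g_c(x) = cx$ with $c > 0$, takes the action on estimates to be $\tilde{g}_c(a) = ca$ (as for a scale parameter), and checks two candidate estimators on made-up data. The helper `is_equivariant` is hypothetical.

```python
import statistics

def is_equivariant(delta, g, g_tilde, x, params, tol=1e-9):
    """Check delta(g_c(x)) == g_tilde_c(delta(x)) for every c in params."""
    return all(
        abs(delta(g(c, x)) - g_tilde(c, delta(x))) <= tol
        for c in params
    )

def g(c, x):
    """Assumed action on the sample space: rescale every observation by c."""
    return [c * xi for xi in x]

def g_tilde(c, a):
    """Assumed induced action on estimates of a scale parameter."""
    return c * a

x = [1.5, 2.0, 4.5, 7.0]    # illustrative data (assumed)
scales = [0.5, 2.0, 10.0]   # a few group elements c > 0

# The sample standard deviation rescales by c, so it satisfies the condition.
sd_ok = is_equivariant(statistics.stdev, g, g_tilde, x, scales)

# The sample variance rescales by c**2, so it fails under this choice of g_tilde.
var_ok = is_equivariant(statistics.variance, g, g_tilde, x, scales)

print(sd_ok, var_ok)
```

Whether an estimator counts as invariant therefore depends on the triple of groups chosen: the variance would pass if the action on estimates were taken to be $\tilde{g}_c(a) = c^2 a$ instead.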
Properties

# The risk function of an invariant estimator, $\delta$, is constant on orbits of $\Theta$. Equivalently, $R(\theta,\delta) = R(\bar{g}(\theta),\delta)$ for all $\theta \in \Theta$ and $\bar{g} \in \bar{G}$.
# The risk function of an invariant estimator with transitive $\bar{G}$ is constant.
For a given problem, the invariant estimator with the lowest risk is termed the "best invariant estimator". A best invariant estimator cannot always be achieved. A special case for which it can be achieved is the case when $\bar{G}$ is transitive.
Example: Location parameter

Suppose $\theta$ is a location parameter, that is, the density of $X$ is of the form $f(x-\theta)$. For $\Theta = A = \mathbb{R}^1$ and $L = L(a-\theta)$, the problem is invariant under $G = \bar{G} = \tilde{G} = \{g_c : g_c(x) = x + c,\ c \in \mathbb{R}\}$. The invariant estimator in this case must satisfy
:$\delta(x+c) = \delta(x) + c \text{ for all } c \in \mathbb{R},$
thus it is of the form $\delta(x) = x + K$ ($K \in \mathbb{R}$). $\bar{G}$ is transitive on $\Theta$ so the risk does not vary with $\theta$: that is, $R(\theta,\delta) = R(0,\delta) = \operatorname{E}[L(X+K) \mid \theta = 0]$. The best invariant estimator is the one whose constant $K$ minimises this risk; under squared-error loss this gives $\delta(x) = x - \operatorname{E}[X \mid \theta = 0]$.
Pitman estimator

The estimation problem is that $X = (X_1,\dots,X_n)$ has density $f(x_1-\theta,\dots,x_n-\theta)$, where $\theta$ is a parameter to be estimated, and where the loss function is $L(|a-\theta|)$. This problem is invariant with the following (additive) transformation groups:
:$G = \{g_c : g_c(x) = (x_1+c,\dots,x_n+c),\ c \in \mathbb{R}^1\},$
:$\bar{G} = \{g_c : g_c(\theta) = \theta + c,\ c \in \mathbb{R}^1\},$
:$\tilde{G} = \{g_c : g_c(a) = a + c,\ c \in \mathbb{R}^1\}.$
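The best invariant estimator $\delta(x)$ is the one that minimizes
:$\frac{\int_{-\infty}^{\infty} L(|\delta(x)-\theta|)\, f(x_1-\theta,\dots,x_n-\theta)\, d\theta}{\int_{-\infty}^{\infty} f(x_1-\theta,\dots,x_n-\theta)\, d\theta},$
and this is Pitman's estimator (1939).
For the squared error loss case, the result is
:$\delta(x) = \frac{\int_{-\infty}^{\infty} \theta\, f(x_1-\theta,\dots,x_n-\theta)\, d\theta}{\int_{-\infty}^{\infty} f(x_1-\theta,\dots,x_n-\theta)\, d\theta}.$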
If $x \sim N(\theta 1_n, I)$ (i.e. a multivariate normal distribution with independent, unit-variance components) then
:$\delta_{\text{Pitman}} = \delta_{ML} = \frac{\sum_{i=1}^{n} x_i}{n}.$
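The normal case can be checked numerically against the squared-error form of the Pitman estimator (the ratio of $\int \theta f\, d\theta$ to $\int f\, d\theta$). This is a rough sketch and not a production integrator: `pitman_estimate` is a hypothetical helper using a plain midpoint rule on a truncated interval, and the data are made up.

```python
import math

def pitman_estimate(x, density, lo, hi, steps=20000):
    """Approximate the ratio of int theta * f(x - theta) dtheta to
    int f(x - theta) dtheta with the midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        theta = lo + (k + 0.5) * h
        # Joint density of independent components evaluated at x - theta.
        w = 1.0
        for xi in x:
            w *= density(xi - theta)
        num += theta * w
        den += w
    return num / den  # the step size h cancels in the ratio

def std_normal_pdf(z):
    """Density of a unit-variance normal component centred at zero."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

x = [1.2, -0.4, 0.9, 2.3]   # illustrative data (assumed)

# Truncating 10 units beyond the data range discards negligible mass here.
pitman = pitman_estimate(x, std_normal_pdf, min(x) - 10.0, max(x) + 10.0)
sample_mean = sum(x) / len(x)

print(pitman, sample_mean)
```

The truncation limits and step count are arbitrary choices that happen to be ample for unit-variance normal data; heavier-tailed densities would need a wider interval.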
If $x \sim C(\theta 1_n, I \sigma^2)$ (independent components having a Cauchy distribution with scale parameter $\sigma$) then $\delta_{\text{Pitman}} \ne \delta_{ML}$. However, the result is
:$\delta_{\text{Pitman}} = \sum_{k=1}^{n} x_k \left[\frac{\operatorname{Re}\{w_k\}}{\sum_{m=1}^{n} \operatorname{Re}\{w_m\}}\right], \qquad n>1,$
with
:$w_k = \prod_{j \ne k} \left[\frac{1}{(x_k-x_j)^2+4\sigma^2}\right]\left[1-\frac{2\sigma}{(x_k-x_j)}i\right].$
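As a sanity check, the closed-form Cauchy weights can be compared with direct numerical evaluation of the squared-error Pitman ratio. The sketch below assumes distinct data values (the weights divide by $x_k - x_j$), uses made-up data with $\sigma = 1$, and relies on a simple midpoint rule over a wide truncated interval; the function names are hypothetical.

```python
def pitman_cauchy_closed_form(x, sigma):
    """Weighted sum of the data using the real parts of the complex weights w_k."""
    n = len(x)
    re_w = []
    for k in range(n):
        w = complex(1.0, 0.0)
        for j in range(n):
            if j != k:
                d = x[k] - x[j]  # assumed non-zero (distinct data values)
                w *= complex(1.0, -2.0 * sigma / d) / (d * d + 4.0 * sigma ** 2)
        re_w.append(w.real)
    return sum(xk * rk for xk, rk in zip(x, re_w)) / sum(re_w)

def pitman_cauchy_numeric(x, sigma, half_width=200.0, steps=400000):
    """Midpoint-rule approximation of int theta*f dtheta / int f dtheta."""
    lo, hi = min(x) - half_width, max(x) + half_width
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        theta = lo + (k + 0.5) * h
        # Product of Cauchy kernels; normalising constants cancel in the ratio.
        w = 1.0
        for xi in x:
            w /= (xi - theta) ** 2 + sigma ** 2
        num += theta * w
        den += w
    return num / den

x = [0.0, 1.0, 4.0]   # illustrative data (assumed), n > 1 and all distinct
sigma = 1.0

closed = pitman_cauchy_closed_form(x, sigma)
numeric = pitman_cauchy_numeric(x, sigma)
print(closed, numeric)
```

For these three points the estimate falls between the sample median and the sample mean, which fits the idea that the Pitman estimator downweights the outlying observation without ignoring it.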