MINQUE
In statistics, the theory of minimum norm quadratic unbiased estimation (MINQUE) was developed by C. R. Rao. MINQUE is a theory alongside other estimation methods in estimation theory, such as the method of moments or maximum likelihood estimation. Similar to the theory of best linear unbiased estimation, MINQUE is specifically concerned with linear regression models. The method was originally conceived to estimate heteroscedastic error variance in multiple linear regression. MINQUE estimators also provide an alternative to maximum likelihood estimators or restricted maximum likelihood estimators for variance components in mixed effects models. MINQUE estimators are quadratic forms of the response variable and are used to estimate a linear function of the variances.


Principles

We are concerned with a mixed effects model for the random vector \mathbf{y} \in \mathbb{R}^n with the following linear structure.

\mathbf{y} = \mathbf{X}\boldsymbol\beta + \mathbf{U}_1 \boldsymbol\xi_1 + \cdots + \mathbf{U}_k \boldsymbol\xi_k

Here, \mathbf{X} \in \mathbb{R}^{n \times m} is a design matrix for the fixed effects, \boldsymbol\beta \in \mathbb{R}^m represents the unknown fixed-effect parameters, \mathbf{U}_i \in \mathbb{R}^{n \times c_i} is a design matrix for the i-th random-effect component, and \boldsymbol\xi_i \in \mathbb{R}^{c_i} is a random vector for the i-th random-effect component. The random effects are assumed to have zero mean (\mathbb{E}[\boldsymbol\xi_i] = \mathbf{0}) and be uncorrelated (\mathbb{V}[\boldsymbol\xi_i] = \sigma^2_i \mathbf{I}_{c_i}). Furthermore, any two random effect vectors are also uncorrelated (\mathbb{V}[\boldsymbol\xi_i, \boldsymbol\xi_j] = \mathbf{0} \;\; \forall\, i \neq j). The unknown variances \sigma^2_1, \cdots, \sigma^2_k represent the variance components of the model. This is a general model that captures commonly used linear regression models.

1. Gauss-Markov Model: If we consider a one-component model where \mathbf{U}_1 = \mathbf{I}_n, then the model is equivalent to the Gauss-Markov model \mathbf{y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\epsilon with \mathbb{E}[\boldsymbol\epsilon] = \mathbf{0} and \mathbb{V}[\boldsymbol\epsilon] = \sigma^2_1 \mathbf{I}_n.
2. Heteroscedastic Model: Each set of random variables in \mathbf{y} that shares a common variance can be modeled as an individual variance component with an appropriate \mathbf{U}_i.

A compact representation for the model is the following, where \mathbf{U} = \left[\begin{matrix}\mathbf{U}_1 & \cdots & \mathbf{U}_k\end{matrix}\right] and \boldsymbol\xi^\top = \left[\begin{matrix}\boldsymbol\xi_1^\top & \cdots & \boldsymbol\xi_k^\top\end{matrix}\right].

\mathbf{y} = \mathbf{X}\boldsymbol\beta + \mathbf{U}\boldsymbol\xi

Note that this model makes no distributional assumptions about \mathbf{y} other than the first and second moments.

\mathbb{E}[\mathbf{y}] = \mathbf{X}\boldsymbol\beta

\mathbb{V}[\mathbf{y}] = \sigma^2_1 \mathbf{U}_1\mathbf{U}_1^\top + \cdots + \sigma^2_k \mathbf{U}_k\mathbf{U}_k^\top \equiv \sigma^2_1 \mathbf{V}_1 + \cdots + \sigma^2_k \mathbf{V}_k

The goal in MINQUE is to estimate \theta = \sum_{i=1}^k p_i \sigma^2_i using a quadratic form \hat\theta = \mathbf{y}^\top \mathbf{A} \mathbf{y}. MINQUE estimators are derived by identifying a matrix \mathbf{A} such that the estimator has some desirable properties, described below.
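The model structure above can be made concrete with a small simulation. The layout below (three groups of four observations with a random intercept per group plus an i.i.d. error term) and all variable names are illustrative assumptions, not part of the original presentation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical one-way layout: 3 groups of 4 observations, n = 12.
n, m = 12, 1
X = np.ones((n, m))                        # fixed-effect design (common mean)
U1 = np.kron(np.eye(3), np.ones((4, 1)))   # group indicators: random intercepts
U2 = np.eye(n)                             # residual error component

# V_i = U_i U_i^T are the known covariance "basis" matrices.
V1, V2 = U1 @ U1.T, U2 @ U2.T

beta = np.array([2.0])
sigma2 = np.array([1.5, 0.5])              # true variance components

# y = X beta + U1 xi1 + U2 xi2, with uncorrelated zero-mean random effects.
xi1 = rng.normal(0.0, np.sqrt(sigma2[0]), U1.shape[1])
xi2 = rng.normal(0.0, np.sqrt(sigma2[1]), U2.shape[1])
y = X @ beta + U1 @ xi1 + U2 @ xi2

# Cov(y) = sigma2_1 * V1 + sigma2_2 * V2, per the second-moment assumption.
cov_y = sigma2[0] * V1 + sigma2[1] * V2
```

Here every diagonal entry of cov_y equals sigma2_1 + sigma2_2, since each observation carries exactly one group intercept and one error term.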


Optimal Estimator Properties to Constrain MINQUE


Invariance to translation of the fixed effects

Consider a new fixed-effect parameter \boldsymbol\gamma = \boldsymbol\beta - \boldsymbol\beta_0, which represents a translation of the original fixed effect. The new, equivalent model is now the following.

\mathbf{y} - \mathbf{X}\boldsymbol\beta_0 = \mathbf{X}\boldsymbol\gamma + \mathbf{U}\boldsymbol\xi

Under this equivalent model, the MINQUE estimator is now (\mathbf{y} - \mathbf{X}\boldsymbol\beta_0)^\top \mathbf{A} (\mathbf{y} - \mathbf{X}\boldsymbol\beta_0). Rao argued that since the underlying models are equivalent, this estimator should be equal to \mathbf{y}^\top \mathbf{A} \mathbf{y}. This can be achieved by constraining \mathbf{A} such that \mathbf{A}\mathbf{X} = \mathbf{0}, which ensures that all terms other than \mathbf{y}^\top \mathbf{A} \mathbf{y} in the expansion of the quadratic form are zero.
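The invariance property can be checked numerically. A convenient way to build a symmetric \mathbf{A} with \mathbf{A}\mathbf{X} = \mathbf{0} is to sandwich an arbitrary symmetric matrix between copies of the annihilator \mathbf{M} = \mathbf{I} - \mathbf{X}\mathbf{X}^{+}; this construction and the dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 2
X = rng.normal(size=(n, m))
y = rng.normal(size=n)

# Any A of the form M C M with M = I - X X^+ satisfies A X = 0,
# since M annihilates the column space of X.
M = np.eye(n) - X @ np.linalg.pinv(X)
C = rng.normal(size=(n, n))
A = M @ (C + C.T) @ M

beta0 = rng.normal(size=m)   # arbitrary translation of the fixed effects
q_original = y @ A @ y
q_translated = (y - X @ beta0) @ A @ (y - X @ beta0)

# The quadratic form is unchanged by translating the fixed effects.
print(np.isclose(q_original, q_translated))  # True
```

The cross terms -2\boldsymbol\beta_0^\top\mathbf{X}^\top\mathbf{A}\mathbf{y} and \boldsymbol\beta_0^\top\mathbf{X}^\top\mathbf{A}\mathbf{X}\boldsymbol\beta_0 both vanish because \mathbf{A}\mathbf{X} = \mathbf{0} and \mathbf{A} is symmetric.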


Unbiased estimation

Suppose that we constrain \mathbf{A}\mathbf{X} = \mathbf{0}, as argued in the section above. Then, the MINQUE estimator has the following form.

\begin{align} \hat\theta &= \mathbf{y}^\top \mathbf{A} \mathbf{y}\\ &= (\mathbf{X}\boldsymbol\beta + \mathbf{U}\boldsymbol\xi)^\top \mathbf{A} (\mathbf{X}\boldsymbol\beta + \mathbf{U}\boldsymbol\xi)\\ &= \boldsymbol\xi^\top \mathbf{U}^\top \mathbf{A} \mathbf{U} \boldsymbol\xi \end{align}

To ensure that this estimator is unbiased, the expectation of the estimator \mathbb{E}[\hat\theta] must equal the parameter of interest, \theta. Below, the expectation of the estimator can be decomposed for each component since the components are uncorrelated with each other. Furthermore, the cyclic property of the trace is used to evaluate the expectation with respect to \boldsymbol\xi_i.

\begin{align} \mathbb{E}[\hat\theta] &= \mathbb{E}[\boldsymbol\xi^\top \mathbf{U}^\top \mathbf{A} \mathbf{U} \boldsymbol\xi]\\ &= \sum_{i=1}^k \mathbb{E}[\boldsymbol\xi_i^\top \mathbf{U}_i^\top \mathbf{A} \mathbf{U}_i \boldsymbol\xi_i]\\ &= \sum_{i=1}^k \sigma_i^2 \, \mathrm{tr}[\mathbf{U}_i^\top \mathbf{A} \mathbf{U}_i] \end{align}

To ensure that this estimator is unbiased, Rao suggested setting \sum_{i=1}^k \sigma_i^2 \, \mathrm{tr}[\mathbf{U}_i^\top \mathbf{A} \mathbf{U}_i] = \sum_{i=1}^k p_i \sigma_i^2, which can be accomplished by constraining \mathbf{A} such that \mathrm{tr}[\mathbf{U}_i^\top \mathbf{A} \mathbf{U}_i] = \mathrm{tr}[\mathbf{A}\mathbf{V}_i] = p_i for all components.
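The identity \mathbb{E}[\mathbf{y}^\top\mathbf{A}\mathbf{y}] = \sum_i \sigma_i^2 \mathrm{tr}[\mathbf{A}\mathbf{V}_i] can be verified by Monte Carlo. The group layout, variance values, and choice of \mathbf{A} below are illustrative assumptions; Gaussian draws are used only for convenience, since the result needs only second moments.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10
X = np.ones((n, 1))
U1 = np.kron(np.eye(5), np.ones((2, 1)))   # hypothetical: 5 groups of 2
U2 = np.eye(n)                             # i.i.d. error component
V = [U1 @ U1.T, U2 @ U2.T]
sigma2 = [1.2, 0.4]
beta = np.array([3.0])

# A symmetric A with AX = 0, so the fixed-effect term drops out of E[y'Ay].
M = np.eye(n) - X @ np.linalg.pinv(X)
A = M @ np.diag(rng.uniform(1.0, 2.0, n)) @ M

# Monte Carlo estimate of E[y'Ay] over many draws of y.
N = 100_000
xi1 = rng.normal(0.0, np.sqrt(sigma2[0]), (N, U1.shape[1]))
xi2 = rng.normal(0.0, np.sqrt(sigma2[1]), (N, n))
Y = X @ beta + xi1 @ U1.T + xi2            # N draws of y, one per row
mc_mean = np.einsum('ni,ij,nj->n', Y, A, Y).mean()

# Exact value from the trace formula: sum_i sigma_i^2 tr(A V_i).
exact = sum(s * np.trace(A @ Vi) for s, Vi in zip(sigma2, V))
```

With 100,000 draws the Monte Carlo mean agrees with the trace formula to well within sampling error.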


Minimum Norm

Rao argues that if \boldsymbol\xi were observed, a "natural" estimator for \theta would be the following, since \mathbb{E}[\boldsymbol\xi_i^\top \boldsymbol\xi_i] = c_i \sigma_i^2. Here, \boldsymbol\Delta is defined as a block-diagonal matrix in which the ratio p_i / c_i is repeated c_i times along the diagonal.

\frac{p_1}{c_1}\boldsymbol\xi_1^\top\boldsymbol\xi_1 + \cdots + \frac{p_k}{c_k}\boldsymbol\xi_k^\top\boldsymbol\xi_k = \boldsymbol\xi^\top \left[\mathrm{diag}\left(\frac{p_1}{c_1}\mathbf{I}_{c_1}, \cdots, \frac{p_k}{c_k}\mathbf{I}_{c_k}\right)\right] \boldsymbol\xi \equiv \boldsymbol\xi^\top \boldsymbol\Delta \boldsymbol\xi

The difference between the proposed estimator and the natural estimator is \boldsymbol\xi^\top (\mathbf{U}^\top \mathbf{A} \mathbf{U} - \boldsymbol\Delta) \boldsymbol\xi. This difference can be minimized by minimizing the norm of the matrix, \lVert \mathbf{U}^\top \mathbf{A} \mathbf{U} - \boldsymbol\Delta \rVert.


Procedure

Given the constraints and optimization strategy derived from the optimal properties above, the MINQUE estimator \hat\theta for \theta = \sum_{i=1}^k p_i \sigma_i^2 is derived by choosing a matrix \mathbf{A} that minimizes \lVert \mathbf{U}^\top \mathbf{A} \mathbf{U} - \boldsymbol\Delta \rVert, subject to the constraints

1. \mathbf{A}\mathbf{X} = \mathbf{0}, and
2. \mathrm{tr}[\mathbf{A}\mathbf{V}_i] = p_i.


Examples of Estimators


Standard Estimator for Homoscedastic Error

In the Gauss-Markov model, the error variance \sigma^2 is estimated using the following.

s^2 = \frac{1}{n-m}(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta})^\top (\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta})

This estimator is unbiased and can be shown to minimize the Euclidean norm of the form \lVert \mathbf{U}^\top \mathbf{A} \mathbf{U} - \boldsymbol\Delta \rVert. Thus, the standard estimator for error variance in the Gauss-Markov model is a MINQUE estimator.
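The estimator s^2 is exactly the quadratic form \mathbf{y}^\top\mathbf{A}\mathbf{y} with \mathbf{A} = (\mathbf{I} - \mathbf{X}(\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top)/(n-m), and this \mathbf{A} satisfies both MINQUE constraints with p_1 = 1. The dimensions and coefficients in the sketch below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 20, 3
X = rng.normal(size=(n, m))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.7, size=n)

# s^2 as the quadratic form y'Ay with A = (I - X(X'X)^{-1}X') / (n - m).
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
A = M / (n - m)
s2 = y @ A @ y

# Same value as the usual residual-based formula.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
print(np.isclose(s2, resid @ resid / (n - m)))  # True

# A satisfies both MINQUE constraints: AX = 0 and tr(A V_1) = tr(A) = 1.
print(np.allclose(A @ X, 0.0), np.isclose(np.trace(A), 1.0))  # True True
```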


Random Variables with Common Mean and Heteroscedastic Error

For random variables Y_1, \cdots, Y_n with a common mean and different variances \sigma^2_1, \cdots, \sigma^2_n, the MINQUE estimator for \sigma^2_i is \frac{n}{n-2}(Y_i - \overline{Y})^2 - \frac{s^2}{n-2}, where \overline{Y} = \frac{1}{n} \sum_{i=1}^n Y_i and s^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i - \overline{Y})^2.
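This formula is short enough to implement directly. The function name and data below are illustrative assumptions; one easy sanity check is the algebraic identity that the n per-observation estimates average exactly to the pooled sample variance s^2.

```python
import numpy as np

def minque_hetero(y):
    """MINQUE for per-observation variances under a common mean:
    sigma2_i = n/(n-2) * (Y_i - Ybar)^2 - s^2/(n-2)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    s2 = ((y - ybar) ** 2).sum() / (n - 1)
    return n / (n - 2) * (y - ybar) ** 2 - s2 / (n - 2)

y = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
est = minque_hetero(y)
s2 = ((y - y.mean()) ** 2).sum() / (len(y) - 1)

# Identity: the per-observation estimates average exactly to s^2.
print(np.isclose(est.mean(), s2))  # True
```

Note that individual estimates can be negative, a known feature of unbiased quadratic variance estimators.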


Estimator for Variance Components

Rao proposed a MINQUE estimator for the variance components model based on minimizing the Euclidean norm. The Euclidean norm \lVert \cdot \rVert_2 is the square root of the sum of squares of all elements in the matrix. When evaluating this norm below, \mathbf{V} = \mathbf{V}_1 + \cdots + \mathbf{V}_k = \mathbf{U}\mathbf{U}^\top. Furthermore, using the cyclic property of traces and the unbiasedness constraint \mathrm{tr}[\mathbf{A}\mathbf{V}_i] = p_i,

\mathrm{tr}[\mathbf{U}^\top\mathbf{A}\mathbf{U}\boldsymbol\Delta] = \mathrm{tr}[\mathbf{A}\mathbf{U}\boldsymbol\Delta\mathbf{U}^\top] = \mathrm{tr}\left[\sum_{i=1}^k \frac{p_i}{c_i}\mathbf{A}\mathbf{V}_i\right] = \sum_{i=1}^k \frac{p_i^2}{c_i} = \mathrm{tr}[\boldsymbol\Delta\boldsymbol\Delta].

\begin{align} \lVert \mathbf{U}^\top\mathbf{A}\mathbf{U} - \boldsymbol\Delta \rVert^2_2 &= \mathrm{tr}\left[(\mathbf{U}^\top\mathbf{A}\mathbf{U} - \boldsymbol\Delta)^\top (\mathbf{U}^\top\mathbf{A}\mathbf{U} - \boldsymbol\Delta)\right]\\ &= \mathrm{tr}[\mathbf{U}^\top\mathbf{A}\mathbf{U}\mathbf{U}^\top\mathbf{A}\mathbf{U}] - 2\,\mathrm{tr}[\mathbf{U}^\top\mathbf{A}\mathbf{U}\boldsymbol\Delta] + \mathrm{tr}[\boldsymbol\Delta\boldsymbol\Delta]\\ &= \mathrm{tr}[\mathbf{A}\mathbf{V}\mathbf{A}\mathbf{V}] - \mathrm{tr}[\boldsymbol\Delta\boldsymbol\Delta] \end{align}

Note that since \mathrm{tr}[\boldsymbol\Delta\boldsymbol\Delta] does not depend on \mathbf{A}, the MINQUE with the Euclidean norm is obtained by identifying the matrix \mathbf{A} that minimizes \mathrm{tr}[\mathbf{A}\mathbf{V}\mathbf{A}\mathbf{V}], subject to the MINQUE constraints discussed above. Rao showed that the matrix \mathbf{A} that satisfies this optimization problem is

\mathbf{A}_\star = \sum_{i=1}^k \lambda_i \mathbf{R}\mathbf{V}_i\mathbf{R},

where \mathbf{R} = \mathbf{V}^{-}(\mathbf{I} - \mathbf{P}), \mathbf{P} = \mathbf{X}(\mathbf{X}^\top\mathbf{V}^{-}\mathbf{X})^{-}\mathbf{X}^\top\mathbf{V}^{-} is the projection matrix into the column space of \mathbf{X}, and (\cdot)^{-} represents the generalized inverse of a matrix. Therefore, the MINQUE estimator is the following, where the vectors \boldsymbol\lambda and \mathbf{Q} are defined based on the sum.

\begin{align} \hat\theta &= \mathbf{y}^\top \mathbf{A}_\star \mathbf{y}\\ &= \sum_{i=1}^k \lambda_i \mathbf{y}^\top\mathbf{R}\mathbf{V}_i\mathbf{R}\mathbf{y}\\ &\equiv \sum_{i=1}^k \lambda_i Q_i\\ &\equiv \boldsymbol\lambda^\top \mathbf{Q} \end{align}

The vector \boldsymbol\lambda is obtained by using the constraint \mathrm{tr}[\mathbf{A}_\star\mathbf{V}_j] = p_j. That is, the vector represents the solution to the following system of equations \forall j \in \{1,\cdots,k\}.

\begin{align} \mathrm{tr}[\mathbf{A}_\star\mathbf{V}_j] &= p_j\\ \mathrm{tr}\left[\sum_{i=1}^k \lambda_i \mathbf{R}\mathbf{V}_i\mathbf{R}\mathbf{V}_j\right] &= p_j\\ \sum_{i=1}^k \lambda_i \,\mathrm{tr}[\mathbf{R}\mathbf{V}_i\mathbf{R}\mathbf{V}_j] &= p_j \end{align}

This can be written as a matrix product \mathbf{S}\boldsymbol\lambda = \mathbf{p}, where \mathbf{p} = [p_1\,\cdots\,p_k]^\top and \mathbf{S} is the following.

\mathbf{S} = \begin{bmatrix} \mathrm{tr}[\mathbf{R}\mathbf{V}_1\mathbf{R}\mathbf{V}_1] & \cdots & \mathrm{tr}[\mathbf{R}\mathbf{V}_k\mathbf{R}\mathbf{V}_1]\\ \vdots & \ddots & \vdots\\ \mathrm{tr}[\mathbf{R}\mathbf{V}_1\mathbf{R}\mathbf{V}_k] & \cdots & \mathrm{tr}[\mathbf{R}\mathbf{V}_k\mathbf{R}\mathbf{V}_k] \end{bmatrix}

Then, \boldsymbol\lambda = \mathbf{S}^{-}\mathbf{p}. This implies that the MINQUE is \hat\theta = \boldsymbol\lambda^\top\mathbf{Q} = \mathbf{p}^\top(\mathbf{S}^{-})^\top\mathbf{Q} = \mathbf{p}^\top\mathbf{S}^{-}\mathbf{Q}. Note that \theta = \sum_{i=1}^k p_i \sigma_i^2 = \mathbf{p}^\top\boldsymbol\sigma, where \boldsymbol\sigma = [\sigma^2_1\,\cdots\,\sigma^2_k]^\top. Therefore, the estimator for the variance components is \hat{\boldsymbol\sigma} = \mathbf{S}^{-}\mathbf{Q}.
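The closed form \hat{\boldsymbol\sigma} = \mathbf{S}^{-}\mathbf{Q} can be sketched directly, assuming unit prior weights so that the working \mathbf{V} is simply \mathbf{V}_1 + \cdots + \mathbf{V}_k as in the derivation above; the function name and test data are illustrative assumptions.

```python
import numpy as np

def minque(y, X, U_list):
    """MINQUE variance-component estimates (unit prior weights),
    following sigma_hat = S^- Q with R = V^-(I - P)."""
    n = len(y)
    V_list = [U @ U.T for U in U_list]
    V = sum(V_list)
    Vinv = np.linalg.pinv(V)                           # generalized inverse
    P = X @ np.linalg.pinv(X.T @ Vinv @ X) @ X.T @ Vinv
    R = Vinv @ (np.eye(n) - P)
    Q = np.array([y @ R @ Vi @ R @ y for Vi in V_list])
    S = np.array([[np.trace(R @ Vi @ R @ Vj) for Vi in V_list]
                  for Vj in V_list])
    return np.linalg.pinv(S) @ Q

rng = np.random.default_rng(4)
n, m = 15, 2
X = rng.normal(size=(n, m))
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

# With a single component U_1 = I, the formula reduces to the usual s^2:
# R = I - X(X'X)^{-1}X', Q_1 = y'Ry, S = tr(R) = n - m.
sigma_hat = minque(y, X, [np.eye(n)])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
print(np.isclose(sigma_hat[0], resid @ resid / (n - m)))  # True
```

The single-component check recovers the Gauss-Markov example above, which is a useful way to validate any implementation before applying it to multi-component models.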


Extensions

MINQUE estimators can be obtained without the invariance criteria, in which case the estimator is only unbiased and minimizes the norm. Such estimators have slightly different constraints on the minimization problem. The model can be extended to estimate covariance components. In such a model, the random effects of a component are assumed to have a common covariance structure \mathbb{V}[\boldsymbol\xi_i] = \boldsymbol\Sigma. A MINQUE estimator for a mixture of variance and covariance components was also proposed. In this model, \mathbb{V}[\boldsymbol\xi_i] = \boldsymbol\Sigma for i \in \{1,\cdots,s\} and \mathbb{V}[\boldsymbol\xi_i] = \sigma_i^2 \mathbf{I}_{c_i} for i \in \{s+1,\cdots,k\}.

