
In statistics, a random effects model, also called a variance components model, is a statistical model in which the model parameters are random variables. It is a kind of hierarchical linear model, which assumes that the data being analysed are drawn from a hierarchy of different populations whose differences relate to that hierarchy. A random effects model is a special case of a mixed model. Contrast this with the definitions used in biostatistics, where "fixed" and "random" effects refer respectively to the population-average and subject-specific effects, the latter generally being unobserved latent variables.


Qualitative description

Random effects models help control for unobserved heterogeneity when that heterogeneity is constant over time and uncorrelated with the independent variables. Because such an effect is time-invariant, it can also be removed from longitudinal data by differencing: taking a first difference eliminates any time-invariant components of the model. Two common assumptions can be made about the individual-specific effect: the random effects assumption and the fixed effects assumption. The random effects assumption is that the individual-specific unobserved heterogeneity is uncorrelated with the independent variables. The fixed effects assumption is that it is correlated with the independent variables. If the random effects assumption holds, the random effects estimator is more efficient than the fixed effects estimator.
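As an illustrative sketch (all names, sizes, and coefficient values here are hypothetical), the following simulates a panel ''y''''it'' = ''βx''''it'' + ''α''''i'' + ''e''''it'' with a time-invariant unit effect ''α''''i'', and shows that first-differencing recovers ''β'' because the constant effect cancels:

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 500, 6

# Hypothetical panel: y_it = beta * x_it + alpha_i + e_it,
# where alpha_i is time-invariant unobserved heterogeneity.
alpha = rng.normal(0.0, 2.0, size=(n_units, 1))
x = rng.normal(size=(n_units, n_periods))
e = rng.normal(scale=0.5, size=(n_units, n_periods))
beta = 1.5
y = beta * x + alpha + e

# First-differencing: alpha_i is constant over time, so it cancels.
dy = np.diff(y, axis=1).ravel()
dx = np.diff(x, axis=1).ravel()
beta_fd = (dx @ dy) / (dx @ dx)  # OLS slope on differenced data
```

With the effect differenced out, `beta_fd` is close to the true slope 1.5 even though `alpha` was never observed.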


Simple example

Suppose ''m'' large elementary schools are chosen randomly from among thousands in a large country. Suppose also that ''n'' pupils of the same age are chosen randomly at each selected school, and their scores on a standard aptitude test are ascertained. Let ''Y''''ij'' be the score of the ''j''th pupil at the ''i''th school. A simple way to model this variable is

: Y_{ij} = \mu + U_i + W_{ij} + \epsilon_{ij},

where ''μ'' is the average test score for the entire population. In this model ''U''''i'' is the school-specific random effect: it measures the difference between the average score at school ''i'' and the average score in the entire country. The term ''W''''ij'' is the individual-specific random effect, i.e., the deviation of the ''j''th pupil's score from the average for the ''i''th school.

The model can be augmented by including additional explanatory variables, which would capture differences in scores among different groups. For example:

: Y_{ij} = \mu + \beta_1 \mathrm{Sex}_{ij} + \beta_2 \mathrm{ParentsEduc}_{ij} + U_i + W_{ij} + \epsilon_{ij},

where Sex''ij'' is a dummy variable for boys/girls and ParentsEduc''ij'' records, say, the average education level of a child's parents. This is a mixed model, not a purely random effects model, as it introduces fixed-effects terms for Sex and ParentsEduc.
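As a sketch under assumed values (the school counts, means, and standard deviations below are illustrative, and the pupil-level noise is collapsed into the single term ''W''''ij''), the school model can be simulated directly; pupils who share a school share ''U''''i'', so their scores are correlated:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2000, 2          # m schools, n pupils sampled per school (toy sizes)
mu = 100.0              # population mean score
tau, sigma = 8.0, 12.0  # sd of school effect U_i and of pupil effect W_ij

U = rng.normal(0.0, tau, size=(m, 1))    # school-specific random effects
W = rng.normal(0.0, sigma, size=(m, n))  # pupil-specific deviations
Y = mu + U + W                           # Y_ij = mu + U_i + W_ij

# Two pupils from the same school share U_i; the implied correlation
# between their scores is tau^2 / (tau^2 + sigma^2).
icc_theory = tau**2 / (tau**2 + sigma**2)
icc_hat = np.corrcoef(Y[:, 0], Y[:, 1])[0, 1]
```

The empirical correlation between classmates' scores, `icc_hat`, approximates the theoretical intraclass correlation `icc_theory` (about 0.31 for these assumed values).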


Variance components

The variance of ''Y''''ij'' is the sum of the variances ''τ''2 and ''σ''2 of ''U''''i'' and ''W''''ij'' respectively. Let

: \overline{Y}_{i\bullet} = \frac{1}{n}\sum_{j=1}^n Y_{ij}

be the average, not of all scores at the ''i''th school, but of those at the ''i''th school that are included in the random sample. Let

: \overline{Y}_{\bullet\bullet} = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n Y_{ij}

be the grand average. Let

: SSW = \sum_{i=1}^m\sum_{j=1}^n (Y_{ij} - \overline{Y}_{i\bullet})^2
: SSB = n\sum_{i=1}^m (\overline{Y}_{i\bullet} - \overline{Y}_{\bullet\bullet})^2

be respectively the sum of squares due to differences ''within'' groups and the sum of squares due to differences ''between'' groups. Then it can be shown that

: \frac{1}{m(n-1)} E(SSW) = \sigma^2

and

: \frac{1}{(m-1)n} E(SSB) = \frac{\sigma^2}{n} + \tau^2.

These "expected mean squares" can be used as the basis for estimation of the "variance components" ''σ''2 and ''τ''2. The ratio ''τ''2/(''τ''2 + ''σ''2) is called the intraclass correlation coefficient.
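The expected mean squares above suggest method-of-moments (ANOVA) estimators: solve the two equations for ''σ''2 and ''τ''2 after replacing expectations with observed sums of squares. A minimal sketch, using assumed true values for the variance components:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 200, 10          # m schools, n pupils per school (toy sizes)
tau, sigma = 8.0, 12.0  # assumed true standard deviations

# Simulate Y_ij = mu + U_i + W_ij
Y = 100.0 + rng.normal(0, tau, (m, 1)) + rng.normal(0, sigma, (m, n))

school_mean = Y.mean(axis=1, keepdims=True)
grand_mean = Y.mean()

SSW = ((Y - school_mean) ** 2).sum()               # within-group sum of squares
SSB = n * ((school_mean - grand_mean) ** 2).sum()  # between-group sum of squares

# Invert the expected mean square relations:
#   SSW / (m(n-1))  estimates sigma^2
#   SSB / ((m-1)n)  estimates sigma^2/n + tau^2
sigma2_hat = SSW / (m * (n - 1))
tau2_hat = SSB / ((m - 1) * n) - sigma2_hat / n
```

With these sample sizes, `sigma2_hat` and `tau2_hat` land near the assumed true values ''σ''2 = 144 and ''τ''2 = 64.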


Applications

Random effects models used in practice include the Bühlmann model of insurance contracts and the Fay–Herriot model used for small area estimation.


See also

* Bühlmann model
* Hierarchical linear modeling
* Fixed effects
* MINQUE
* Covariance estimation
* Conditional variance



