In statistics, identifiability is a property which a model must satisfy for precise inference to be possible. A model is identifiable if it is theoretically possible to learn the true values of its underlying parameters after obtaining an infinite number of observations from it. Mathematically, this is equivalent to saying that different values of the parameters must generate different probability distributions of the observable variables. Usually the model is identifiable only under certain technical restrictions, in which case the set of these requirements is called the identification conditions.

A model that fails to be identifiable is said to be non-identifiable or unidentifiable: two or more parametrizations are observationally equivalent. In some cases, even though a model is non-identifiable, it is still possible to learn the true values of a certain subset of the model parameters. In this case we say that the model is partially identifiable. In other cases it may be possible to learn the location of the true parameter up to a certain finite region of the parameter space, in which case the model is set identifiable.

Aside from strictly theoretical exploration of the model's properties, identifiability can be referred to in a wider scope when a model is tested with experimental data sets, using identifiability analysis.


Definition

Let \mathcal{P} = \{ P_\theta : \theta\in\Theta \} be a statistical model with parameter space \Theta. We say that \mathcal{P} is identifiable if the mapping \theta\mapsto P_\theta is one-to-one:

: P_{\theta_1} = P_{\theta_2} \quad\Rightarrow\quad \theta_1 = \theta_2 \quad \text{for all } \theta_1,\theta_2\in\Theta.

This definition means that distinct values of ''θ'' should correspond to distinct probability distributions: if ''θ''1 ≠ ''θ''2, then also ''P''''θ''1 ≠ ''P''''θ''2. If the distributions are defined in terms of probability density functions (pdfs), then two pdfs should be considered distinct only if they differ on a set of non-zero measure (for example the two functions ƒ1(''x'') = 1_{0 ≤ ''x'' < 1} and ƒ2(''x'') = 1_{0 ≤ ''x'' ≤ 1} differ only at the single point ''x'' = 1, a set of measure zero, and thus cannot be considered distinct pdfs).

Identifiability of the model in the sense of invertibility of the map \theta\mapsto P_\theta is equivalent to being able to learn the model's true parameter if the model can be observed indefinitely long. Indeed, if \{x_t\} ⊆ ''S'' is the sequence of observations from the model, then by the strong law of large numbers,

: \frac{1}{T} \sum_{t=1}^T \mathbf{1}_{\{x_t\in A\}} \ \xrightarrow{\text{a.s.}}\ \Pr[x_t\in A]

for every measurable set ''A'' ⊆ ''S'' (here \mathbf{1}_{\{\cdot\}} is the indicator function). Thus, with an infinite number of observations we will be able to find the true probability distribution ''P''0 in the model, and since the identifiability condition above requires that the map \theta\mapsto P_\theta be invertible, we will also be able to find the true value of the parameter which generated the given distribution ''P''0.
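To make the definition concrete, here is a minimal Python sketch (assuming NumPy is available; the parametrization and the numbers are made up for illustration). It sets up a deliberately non-identifiable model in which observations are drawn from N(''a'' + ''b'', 1): only the sum ''a'' + ''b'' can ever be learned, and two parameter pairs with the same sum are observationally equivalent, producing the same likelihood no matter how large the sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-identifiable parametrization: observations are N(a + b, 1),
# so only the sum a + b is learnable from data, not (a, b) individually.
def sample(a, b, n):
    return rng.normal(loc=a + b, scale=1.0, size=n)

def log_likelihood(x, a, b):
    # Gaussian log-likelihood with unit variance, up to an additive constant
    return -0.5 * np.sum((x - (a + b)) ** 2)

x = sample(a=1.0, b=2.0, n=100_000)

# (a, b) = (1, 2) and (0, 3) induce the same distribution of the observables,
# so their likelihoods coincide: no amount of data separates them.
print(log_likelihood(x, 1.0, 2.0))
print(log_likelihood(x, 0.0, 3.0))
```

Re-parametrizing such a model in terms of the sum ''a'' + ''b'' alone would restore identifiability, which is what the identification conditions mentioned above formalize.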


Examples


Example 1

Let \mathcal{P} be the normal location-scale family:

: \mathcal{P} = \Big\{\ f_\theta(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}\ \Big|\ \theta = (\mu,\sigma):\ \mu\in\mathbb{R},\ \sigma>0 \ \Big\}.

Then

: \begin{align}
  & f_{\theta_1} = f_{\theta_2} \\
  \Longleftrightarrow\ & \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{1}{2\sigma_1^2}(x-\mu_1)^2\right) = \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\!\left(-\frac{1}{2\sigma_2^2}(x-\mu_2)^2\right) \\
  \Longleftrightarrow\ & \frac{1}{2\sigma_1^2}(x-\mu_1)^2 + \ln\sigma_1 = \frac{1}{2\sigma_2^2}(x-\mu_2)^2 + \ln\sigma_2 \\
  \Longleftrightarrow\ & x^2\left(\frac{1}{2\sigma_1^2} - \frac{1}{2\sigma_2^2}\right) - 2x\left(\frac{\mu_1}{2\sigma_1^2} - \frac{\mu_2}{2\sigma_2^2}\right) + \left(\frac{\mu_1^2}{2\sigma_1^2} - \frac{\mu_2^2}{2\sigma_2^2} + \ln\sigma_1 - \ln\sigma_2\right) = 0
  \end{align}

This expression is equal to zero for almost all ''x'' only when all its coefficients are equal to zero, which is only possible when |''σ''1| = |''σ''2| and ''μ''1 = ''μ''2. Since the scale parameter ''σ'' is restricted to be greater than zero, we conclude that the model is identifiable: ƒ''θ''1 = ƒ''θ''2 ⇔ ''θ''1 = ''θ''2.
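The coefficient-matching step above can also be checked symbolically. The following sketch (assuming SymPy is available; the symbol names are chosen purely for the example) expands ln ƒ''θ''1 − ln ƒ''θ''2 as a quadratic in ''x'' and solves for the parameter values that make its leading coefficients vanish.

```python
import sympy as sp

x, mu1, mu2 = sp.symbols('x mu1 mu2', real=True)
s1, s2 = sp.symbols('sigma1 sigma2', positive=True)

# Normal densities for two candidate parameter values theta_1 and theta_2
f1 = sp.exp(-(x - mu1)**2 / (2*s1**2)) / (sp.sqrt(2*sp.pi) * s1)
f2 = sp.exp(-(x - mu2)**2 / (2*s2**2)) / (sp.sqrt(2*sp.pi) * s2)

# log f1 - log f2 is a quadratic polynomial in x; it can vanish for almost
# every x only if each of its coefficients is zero.
diff = sp.expand(sp.expand_log(sp.log(f1) - sp.log(f2), force=True))
coeffs = sp.Poly(diff, x).all_coeffs()   # [x**2 coeff, x coeff, constant]

# The x**2 and x coefficients already force sigma1 = sigma2 and mu1 = mu2
# (given sigma > 0); the constant term then vanishes automatically.
print(sp.solve(coeffs[:2], [s1, mu1], dict=True))
```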


Example 2

Let \mathcal{P} be the standard linear regression model:

: y = \beta'x + \varepsilon, \quad \mathrm{E}[\varepsilon \mid x] = 0

(where ′ denotes matrix transpose). Then the parameter ''β'' is identifiable if and only if the matrix \mathrm{E}[xx'] is invertible. Thus, this is the identification condition in the model.
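To illustrate why the invertibility of \mathrm{E}[xx'] matters, the following sketch (assuming NumPy; the design is a toy example) builds a regressor matrix whose second column is an exact multiple of the first, so the second-moment matrix is singular and two different coefficient vectors are observationally equivalent.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Toy design in which the second regressor is an exact multiple of the first,
# so the second-moment matrix E[x x'] is singular and beta is not identified.
x1 = rng.normal(size=n)
X = np.column_stack([x1, 2.0 * x1])

print(np.linalg.matrix_rank(X.T @ X / n))   # 1 rather than 2: condition fails

# Two different coefficient vectors produce the same value of beta'x, hence
# the same distribution of y given x: they are observationally equivalent.
b1 = np.array([1.0, 1.0])
b2 = np.array([3.0, 0.0])
print(np.allclose(X @ b1, X @ b2))          # True
```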


Example 3

Suppose \mathcal{P} is the classical errors-in-variables linear model:

: \begin{cases} y = \beta x^* + \varepsilon, \\ x = x^* + \eta, \end{cases}

where (''ε'',''η'',''x*'') are jointly normal independent random variables with zero expected value and unknown variances, and only the variables (''x'',''y'') are observed. Then this model is not identifiable: only the product ''βσ''² is (where ''σ''² is the variance of the latent regressor ''x*''). This is also an example of a set identifiable model: although the exact value of ''β'' cannot be learned, we can guarantee that it must lie somewhere in the interval (''β''yx, 1/''β''xy), where ''β''yx is the coefficient in the OLS regression of ''y'' on ''x'', and ''β''xy is the coefficient in the OLS regression of ''x'' on ''y''. If we abandon the normality assumption and instead require that ''x*'' not be normally distributed, retaining only the independence condition ''ε'' ⊥ ''η'' ⊥ ''x*'', then the model becomes identifiable.
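A small simulation illustrates the interval bound. In the sketch below (assuming NumPy; the true ''β'' and the variances are illustrative choices), the two OLS slopes are computed from data generated by the errors-in-variables model, and the true ''β'' falls between ''β''yx and 1/''β''xy, while neither regression alone recovers it.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 1.5                               # illustrative "true" coefficient

x_star = rng.normal(scale=2.0, size=n)   # latent regressor x*
eps = rng.normal(scale=1.0, size=n)      # regression error
eta = rng.normal(scale=1.0, size=n)      # measurement error

y = beta * x_star + eps
x = x_star + eta                         # observed, error-contaminated regressor

C = np.cov(x, y)                         # sample covariance matrix of (x, y)
beta_yx = C[0, 1] / C[0, 0]              # OLS slope of y on x (attenuated)
beta_xy = C[0, 1] / C[1, 1]              # OLS slope of x on y

print(beta_yx, 1.0 / beta_xy)            # roughly 1.2 and 1.67, bracketing beta = 1.5
```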


See also

* Observability
* System identification
* Simultaneous equations model




Further reading



Econometrics

* Rothenberg, Thomas J. (1971). "Identification in Parametric Models". ''Econometrica''. 39 (3): 577–591. doi:10.2307/1913267. JSTOR 1913267.