In statistics, the logit function is the quantile function associated with the standard logistic distribution. It has many uses in data analysis and machine learning, especially in data transformations. Mathematically, the logit is the inverse of the standard logistic function

\sigma(x) = 1/(1+e^{-x})

, so the logit is defined as

\operatorname{logit} p = \sigma^{-1}(p) = \ln \frac{p}{1-p} \quad \text{for} \quad p \in (0,1)

. Because of this, the logit is also called the log-odds, since it is equal to the logarithm of the odds

\frac{p}{1-p}

where p is a probability. Thus, the logit maps probability values from

(0, 1)

to real numbers in

(-\infty, +\infty)

, akin to the probit function.
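The inverse relationship between the logit and the standard logistic function can be checked numerically. The following is a minimal Python sketch using only the standard library; the function names `logit` and `logistic` are chosen here for illustration and are not taken from any particular package.

```python
import math

def logistic(x: float) -> float:
    """Standard logistic function sigma(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p: float) -> float:
    """Log-odds ln(p / (1 - p)), defined for p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in the open interval (0, 1)")
    return math.log(p / (1.0 - p))

# Round trip: logistic(logit(p)) recovers p up to floating-point error.
for p in (0.1, 0.5, 0.9):
    print(p, logit(p), logistic(logit(p)))
```

Note that this simple `logistic` can overflow in `math.exp` for very large negative inputs; a production implementation would typically guard against that.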

Definition

If p is a probability, then p/(1 − p) is the corresponding odds; the logit of the probability is the logarithm of the odds, i.e.:

\operatorname{logit}(p)=\ln\left( \frac{p}{1-p} \right) =\ln(p)-\ln(1-p)=-\ln\left( \frac{1}{p}-1\right)=2\operatorname{atanh}(2p-1)
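The expressions in this chain of equalities are algebraically equivalent, which can be confirmed at a sample value. A minimal Python sketch, assuming only the standard `math` module; the helper name `logit_forms` is hypothetical:

```python
import math

def logit_forms(p: float) -> tuple:
    """Evaluate four equivalent expressions for logit(p), with 0 < p < 1."""
    return (
        math.log(p / (1.0 - p)),           # ln(p / (1 - p))
        math.log(p) - math.log(1.0 - p),   # ln(p) - ln(1 - p)
        -math.log(1.0 / p - 1.0),          # -ln(1/p - 1)
        2.0 * math.atanh(2.0 * p - 1.0),   # 2 atanh(2p - 1)
    )

# For p = 0.8 the odds are 4, so every form should be close to ln(4).
print(logit_forms(0.8))
```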

The base of the logarithm function used is of little importance in the present article, as long as it is greater than 1, but the natural logarithm with base e is the one most often used. The choice of base corresponds to the choice of logarithmic unit for the value: base 2 corresponds to a shannon, base e to a "nat", and base 10 to a hartley; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity. The "logistic" function of any number

\alpha

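is given by the inverse-logit:

\operatorname{logit}^{-1}(\alpha) = \operatorname{logistic}(\alpha) = \frac{1}{1 + \exp(-\alpha)} = \frac{\exp(\alpha)}{\exp(\alpha) + 1} = \frac{\tanh(\frac{\alpha}{2}) + 1}{2}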
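Both points of this paragraph, the choice of logarithm base and the equivalent forms of the inverse-logit, can be illustrated in a few lines. This is a sketch under stated assumptions: the `base` parameter and the helper names `logit` and `logistic_forms` are introduced here for illustration only.

```python
import math

def logit(p: float, base: float = math.e) -> float:
    """Log-odds of p in the given base (base > 1): 2 -> shannons, e -> nats, 10 -> hartleys."""
    return math.log(p / (1.0 - p)) / math.log(base)

def logistic_forms(alpha: float) -> tuple:
    """Three equivalent expressions for the natural-base inverse-logit of alpha."""
    return (
        1.0 / (1.0 + math.exp(-alpha)),
        math.exp(alpha) / (math.exp(alpha) + 1.0),
        (math.tanh(alpha / 2.0) + 1.0) / 2.0,
    )

p = 0.8  # odds of 4 to 1
print("nats:     ", logit(p))
print("shannons: ", logit(p, base=2))
print("hartleys: ", logit(p, base=10))
print("inverse-logit forms at alpha = 2:", logistic_forms(2.0))
```

Changing the base only rescales the value by a constant factor, so the sign and ordering of logits are the same in every unit.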

The difference between the logits of two probabilities is the logarithm of the odds ratio (R), thus providing a shorthand for writing the correct combination of odds ratios only by adding and subtracting:

\ln(R)=\ln\left( \frac{p_1/(1-p_1)}{p_2/(1-p_2)} \right) =\ln\left( \frac{p_1}{1-p_1} \right) - \ln\left(\frac{p_2}{1-p_2}\right)=\operatorname{logit}(p_1)-\operatorname{logit}(p_2)\,.
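As a small worked example of this identity, the log of an odds ratio computed directly can be compared with the difference of two logits. The probabilities 0.8 and 0.5 below are arbitrary illustrative values, and `log_odds_ratio` is a hypothetical helper name.

```python
import math

def logit(p: float) -> float:
    """Natural-log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def log_odds_ratio(p1: float, p2: float) -> float:
    """Log of the odds ratio of p1 against p2, written as a difference of logits."""
    return logit(p1) - logit(p2)

p1, p2 = 0.8, 0.5
odds_ratio = (p1 / (1.0 - p1)) / (p2 / (1.0 - p2))  # odds 4 against odds 1
print(math.log(odds_ratio), log_odds_ratio(p1, p2))  # the two values agree up to rounding
```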

History

There have been several efforts to adapt linear regression methods to a domain where the output is a probability value,

(0, 1)

, instead of any real number

(-\infty, +\infty)

. In many cases, such efforts have focused on modeling this problem by mapping the range

(0, 1)

to

(-\infty, +\infty)

and then running the linear regression on these transformed values. In 1934, Chester Ittner Bliss used the cumulative normal distribution function to perform this mapping and called his model probit, an abbreviation for "probability unit". However, this is computationally more expensive. In 1944, Joseph Berkson used the log of odds and called this function ''logit'', an abbreviation for "logistic unit", following the analogy for probit. Log odds was used extensively by Charles Sanders Peirce (late 19th century). G. A. Barnard in 1949 coined the commonly used term ''log-odds''; the log-odds of an event is the logit of the probability of the event. Barnard also coined the term ''lods'' as an abstract form of "log-odds", but suggested that "in practice the term 'odds' should normally be used, since this is more familiar in everyday life".

Uses and properties

* The logit in logistic regression is a special case of a link function in a generalized linear model: it is the canonical link function for the Bernoulli distribution.
* The logit function is the negative of the derivative of the binary entropy function.
* The logit is also central to the probabilistic Rasch model for measurement, which has applications in psychological and educational assessment, among other areas.
* The inverse-logit function (i.e., the logistic function) is also sometimes referred to as the ''expit'' function.
* In plant disease epidemiology, the logit is used to fit the data to a logistic model. Together with the Gompertz and monomolecular models, all three are known as Richards family models.
* The log-odds function of probabilities is often used in state estimation algorithms because of its numerical advantages in the case of small probabilities. Instead of multiplying very small floating point numbers, log-odds probabilities can just be summed up to calculate the (log-odds) joint probability.
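The summation of log-odds mentioned in the last item can be sketched as follows. The example assumes one common formulation that is not spelled out in the text: a binary Bayes filter in which several conditionally independent estimates of the same binary event are fused against a shared prior. The function names and the sample probabilities are hypothetical.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def fuse_log_odds(probabilities, prior: float = 0.5) -> float:
    """Fuse independent estimates by adding their log-odds relative to the prior."""
    l0 = logit(prior)
    total = l0 + sum(logit(p) - l0 for p in probabilities)
    return logistic(total)

def fuse_by_products(probabilities, prior: float = 0.5) -> float:
    """Same fusion written with multiplied odds, for comparison."""
    prior_odds = prior / (1.0 - prior)
    odds = prior_odds
    for p in probabilities:
        odds *= (p / (1.0 - p)) / prior_odds
    return odds / (1.0 + odds)

readings = [0.7, 0.7, 0.7]  # three illustrative estimates of the same event
print(fuse_log_odds(readings), fuse_by_products(readings))
```

With a prior of 0.5 the prior log-odds is zero, so the update reduces to a plain sum of logits followed by one application of the logistic function.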

Comparison with probit

Closely related to the logit function (and logit model) are the probit function and probit model. The logit and probit are both inverses of sigmoid functions, with a domain between 0 and 1, which makes them both quantile functions, i.e., inverses of the cumulative distribution function (CDF) of a probability distribution. In fact, the logit is the quantile function of the logistic distribution, while the probit is the quantile function of the normal distribution. The probit function is denoted

\Phi^{-1}(x)

, where

\Phi(x)

is the CDF of the standard normal distribution, as just mentioned:

\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-y^2/2} dy.

As shown in the graph on the right, the logit and probit functions are extremely similar when the probit function is scaled, so that its slope at y = 0 matches the slope of the logit. As a result, probit models are sometimes used in place of logit models because for certain applications (e.g., in Bayesian statistics) the implementation is easier.
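The similarity of the two functions can be examined numerically. The sketch below uses `statistics.NormalDist().inv_cdf` from the Python standard library as the probit. The scale factor sqrt(8/π) is an assumption derived here, not stated in the text: at p = 1/2, where both functions are zero, the logit has slope 4 and the probit has slope sqrt(2π), and their ratio is sqrt(8/π).

```python
import math
from statistics import NormalDist

# Ratio of the slopes at p = 1/2: 4 / sqrt(2*pi) = sqrt(8/pi).
SCALE = math.sqrt(8.0 / math.pi)

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def scaled_probit(p: float) -> float:
    """Standard normal quantile function, rescaled to match the logit's slope at p = 1/2."""
    return SCALE * NormalDist().inv_cdf(p)

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  logit={logit(p):+.4f}  scaled probit={scaled_probit(p):+.4f}")
```

The two curves track each other closely near the middle of the interval and drift apart toward 0 and 1, where the logistic distribution has heavier tails than the normal.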
