In
statistics, specifically
regression analysis
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...
, a binary regression estimates a relationship between one or more
explanatory variables and a single output
binary variable. Generally the probability of the two alternatives is modeled, instead of simply outputting a single value, as in
linear regression
In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is ...
.
Binary regression is usually analyzed as a special case of
binomial regression
In statistics, binomial regression is a regression analysis technique in which the response (often referred to as ''Y'') has a binomial distribution: it is the number of successes in a series of independent Bernoulli trials, where each trial ha ...
, with a single outcome (
), and one of the two alternatives considered as "success" and coded as 1: the value is the
count
Count (feminine: countess) is a historical title of nobility in certain European countries, varying in relative status, generally of middling rank in the hierarchy of nobility. Pine, L. G. ''Titles: How the King Became His Majesty''. New Yor ...
of successes in 1 trial, either 0 or 1. The most common binary regression models are the
logit model (
logistic regression
In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear function (calculus), linear combination of one or more independent var ...
) and the
probit model (
probit regression).
Applications
Binary regression is principally applied either for prediction (
binary classification), or for estimating the
association between the explanatory variables and the output. In economics, binary regressions are used to model
binary choice
In economics, discrete choice models, or qualitative choice models, describe, explain, and predict choices between two or more discrete alternatives, such as entering or not entering the labor market, or choosing between modes of transport. Su ...
.
Interpretations
Binary regression models can be interpreted as
latent variable models, together with a measurement model; or as probabilistic models, directly modeling the probability.
Latent variable model
The latent variable interpretation has traditionally been used in
bioassay, yielding the
probit model, where normal variance and a cutoff are assumed. The latent variable interpretation is also used in
item response theory (IRT).
Formally, the latent variable interpretation posits that the outcome ''y'' is related to a vector of explanatory variables ''x'' by
: