Applications
Example
Problem
A group of 20 students spends between 0 and 6 hours studying for an exam. How does the number of hours spent studying affect the probability of the student passing the exam?
Model
Fit
Parameter estimation
Predictions
Model evaluation
Generalizations
Background
Definition of the logistic function
Definition of the inverse of the logistic function
Interpretation of these terms
Definition of the odds
The odds ratio
Multiple explanatory variables
Definition
Many explanatory variables, two categories
Multinomial logistic regression: Many explanatory variables and many categories
Interpretations
As a generalized linear model
As a latent-variable model
Two-way latent-variable model
As a "log-linear" model
As a single-layer perceptron
In terms of binomial data
Model fitting
Maximum likelihood estimation (MLE)
Iteratively reweighted least squares (IRLS)
Bayesian
"Rule of ten"
Error and significance of fit
Deviance and likelihood ratio test – a simple case
Goodness of fit summary
Deviance and likelihood ratio tests
Pseudo-R-squared
Hosmer–Lemeshow test
Coefficient significance
Likelihood ratio test
Wald statistic
Case-control sampling
Discussion
Although the dependent variable in logistic regression is Bernoulli, the logit is on an unrestricted scale. The logit function is the link function in this kind of generalized linear model, i.e.
:\operatorname{logit}(\operatorname{E}[Y]) = \beta_0 + \beta_1 x
where Y is the Bernoulli-distributed response variable and x is the predictor variable; \beta_0 and \beta_1 are the linear parameters. The logit of the probability of success is then fitted to the predictors. The predicted value of the logit is converted back into predicted odds via the exponential function, the inverse of the natural logarithm. Thus, although the observed dependent variable in binary logistic regression is a 0-or-1 variable, the logistic regression estimates the odds, as a continuous variable, that the dependent variable is a 'success'. In some applications, the odds are all that is needed. In others, a specific yes-or-no prediction is needed for whether the dependent variable is or is not a 'success'; this categorical prediction can be based on the computed odds of success, with predicted odds above some chosen cutoff value being translated into a prediction of success.
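The conversion from fitted logit to odds, probability, and a categorical prediction can be made concrete with a short sketch. The coefficient values and the 0.5 probability cutoff below are illustrative assumptions, not quantities taken from any fitted model in this article.

import numpy as np

# Hypothetical fitted coefficients, chosen for illustration only.
beta0, beta1 = -4.0, 1.5

x = np.array([1.0, 2.5, 4.0])            # predictor values
logit = beta0 + beta1 * x                # linear predictor on the log-odds scale
odds = np.exp(logit)                     # inverse of the natural log: predicted odds
prob = odds / (1.0 + odds)               # equivalently, the logistic function of the logit
predicted = (prob >= 0.5).astype(int)    # yes-or-no prediction via a chosen cutoff (0.5 here)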
Maximum entropy
Of all the functional forms used for estimating the probabilities of a particular categorical outcome which optimize the fit by maximizing the likelihood function (e.g. probit regression, Poisson regression, etc.), the logistic regression solution is unique in that it is a maximum entropy solution. This is a case of a general property: an exponential family of distributions maximizes entropy, given an expected value. In the case of the logistic model, the logit is the natural parameter of the Bernoulli distribution (the distribution is in canonical form, and the logit is the canonical link function), while other sigmoid functions correspond to non-canonical link functions; this underlies the model's mathematical elegance and ease of optimization. See for details.
Proof
In order to show this, we use the method of Lagrange multipliers. The Lagrangian is equal to the entropy plus the sum of the products of Lagrange multipliers times various constraint expressions. The general multinomial case will be considered, since the proof is not made much simpler by considering simpler cases. Equating the derivative of the Lagrangian with respect to the various probabilities to zero yields a functional form for those probabilities which corresponds to the one used in logistic regression.
As in the above section on multinomial logistic regression, we will consider M+1 explanatory variables denoted x_m, which include x_0=1. There will be a total of K data points, indexed by k=\{1,2,\dots,K\}, and the data points are given by x_{mk} and y_k. The x_{mk} will also be represented as an (M+1)-dimensional vector \boldsymbol{x}_k = \{x_{0k},x_{1k},\dots,x_{Mk}\}. There will be N+1 possible values of the categorical variable y, ranging from 0 to N.
Let p_n(\boldsymbol{x}) be the probability, given the explanatory variable vector \boldsymbol{x}, that the outcome will be y=n. Define p_{nk}=p_n(\boldsymbol{x}_k), the probability that for the k-th measurement the categorical outcome is n.
The Lagrangian will be expressed as a function of the probabilities p_{nk} and will be extremized by equating its derivatives with respect to these probabilities to zero. An important point is that the probabilities are treated equally, and the fact that they sum to unity is part of the Lagrangian formulation rather than being assumed from the beginning.
The first contribution to the Lagrangian is the entropy:
:\mathcal{L}_{ent}=-\sum_{k=1}^K\sum_{n=0}^N p_{nk}\ln(p_{nk})
The log-likelihood is:
:\ell=\sum_{k=1}^K\sum_{n=0}^N \Delta(n,y_k)\ln(p_{nk})
where \Delta(n,y_k) equals 1 when n=y_k and 0 otherwise (the Kronecker delta). Assuming the multinomial logistic function, the derivative of the log-likelihood with respect to the beta coefficients was found to be:
:\frac{\partial \ell}{\partial \beta_{nm}}=\sum_{k=1}^K ( p_{nk}x_{mk}-\Delta(n,y_k)x_{mk})
A very important point here is that this expression is (remarkably) not an explicit function of the beta coefficients. It is only a function of the probabilities p_{nk} and the data. Rather than being specific to the assumed multinomial logistic case, it is taken to be a general statement of the condition at which the log-likelihood is maximized, making no reference to the functional form of p_{nk}. There are then (M+1)(N+1) fitting constraints, and the fitting-constraint term in the Lagrangian is:
:\mathcal{L}_{fit}=\sum_{n=0}^N\sum_{m=0}^M \lambda_{nm}\sum_{k=1}^K (p_{nk}x_{mk}-\Delta(n,y_k)x_{mk})
where the \lambda_{nm} are the appropriate Lagrange multipliers. There are K normalization constraints, which may be written:
:\sum_{n=0}^N p_{nk}=1
so that the normalization term in the Lagrangian is:
:\mathcal{L}_{norm}=\sum_{k=1}^K \alpha_k \left(1-\sum_{n=0}^N p_{nk}\right)
where the \alpha_k are the appropriate Lagrange multipliers.
The Lagrangian is then the sum of the above three terms:
:\mathcal{L}=\mathcal{L}_{ent} + \mathcal{L}_{fit} + \mathcal{L}_{norm}
Setting the derivative of the Lagrangian with respect to one of the probabilities to zero yields:
:\frac{\partial \mathcal{L}}{\partial p_{n'k'}}=0=-\ln(p_{n'k'})-1+\sum_{m=0}^M (\lambda_{n'm}x_{mk'})-\alpha_{k'}
Using the more condensed vector notation:
:\sum_{m=0}^M \lambda_{nm}x_{mk} = \boldsymbol{\lambda}_n\cdot\boldsymbol{x}_k
and dropping the primes on the n and k indices, and then solving for p_{nk}, yields:
:p_{nk}=e^{\boldsymbol{\lambda}_n\cdot\boldsymbol{x}_k}/Z_k
where:
:Z_k=e^{1+\alpha_k}
Imposing the normalization constraint, we can solve for the Z_k and write the probabilities as:
:p_{nk}=\frac{e^{\boldsymbol{\lambda}_n\cdot\boldsymbol{x}_k}}{\sum_{u=0}^N e^{\boldsymbol{\lambda}_u\cdot\boldsymbol{x}_k}}
The \boldsymbol{\lambda}_n are not all independent. We can add any constant (M+1)-dimensional vector to each of the \boldsymbol{\lambda}_n without changing the value of the p_{nk} probabilities, so that there are only N rather than N+1 independent \boldsymbol{\lambda}_n. In the multinomial logistic regression section above, \boldsymbol{\lambda}_0 was subtracted from each \boldsymbol{\lambda}_n, which set the exponential term involving \boldsymbol{\lambda}_0 to unity, and the beta coefficients were given by \boldsymbol{\beta}_n=\boldsymbol{\lambda}_n-\boldsymbol{\lambda}_0.
Other approaches
In machine learning applications where logistic regression is used for binary classification, the MLE minimizes the cross-entropy loss function. Logistic regression is an important machine learning algorithm. The goal is to model the probability of a random variable Y being 0 or 1 given experimental data. Consider a generalized linear model function parameterized by \theta,
:h_\theta(X) = \frac{1}{1+e^{-\theta^T X}} = \Pr(Y=1 \mid X; \theta)
Therefore,
:\Pr(Y=0 \mid X; \theta) = 1 - h_\theta(X)
and since Y \in \{0,1\}, we see that \Pr(y \mid X; \theta) is given by
:\Pr(y \mid X; \theta) = h_\theta(X)^y(1 - h_\theta(X))^{1-y}.
We now calculate the likelihood function, assuming that all the observations in the sample are independently Bernoulli distributed,
:\begin{align} L(\theta \mid y; x) &= \Pr(Y \mid X; \theta) \\ &= \prod_i \Pr(y_i \mid x_i; \theta) \\ &= \prod_i h_\theta(x_i)^{y_i}(1 - h_\theta(x_i))^{1-y_i} \end{align}
Typically, the log likelihood is maximized,
:N^{-1} \log L(\theta \mid y; x) = N^{-1} \sum_{i=1}^N \log \Pr(y_i \mid x_i; \theta)
which is maximized using optimization techniques such as gradient descent.
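A minimal sketch of this maximization, assuming plain batch gradient ascent on the mean log-likelihood and synthetic data; the learning rate, iteration count, and data-generating coefficients below are illustrative choices, not values from the text.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    # Maximize the Bernoulli log-likelihood (equivalently, minimize the
    # average cross-entropy) by batch gradient ascent.
    theta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_iter):
        p = sigmoid(X @ theta)        # h_theta(x_i) for every observation
        grad = X.T @ (y - p) / n      # gradient of the mean log-likelihood
        theta += lr * grad            # ascend the log-likelihood
    return theta

# Synthetic demonstration: one predictor plus an intercept column.
rng = np.random.default_rng(0)
x = rng.uniform(0, 6, size=200)
y = (rng.uniform(size=200) < sigmoid(-4.0 + 1.5 * x)).astype(float)
X = np.column_stack([np.ones_like(x), x])
print(fit_logistic(X, y))  # roughly recovers the generating values (-4.0, 1.5)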
Assuming the (x, y) pairs are drawn uniformly from the underlying distribution, then in the limit of large N,
:\begin{align} & \lim_{N \to +\infty} N^{-1} \sum_{i=1}^N \log \Pr(y_i \mid x_i; \theta) = \sum_{x \in X} \sum_{y \in Y} \Pr(X=x, Y=y) \log \Pr(Y=y \mid X=x; \theta) \\ = {} & \sum_{x \in X} \sum_{y \in Y} \Pr(X=x, Y=y) \left( - \log\frac{\Pr(Y=y \mid X=x)}{\Pr(Y=y \mid X=x; \theta)} + \log \Pr(Y=y \mid X=x) \right) \\ = {} & - D_\text{KL}( Y \parallel Y_\theta ) - H(Y \mid X) \end{align}
where H(Y\mid X) is the conditional entropy and D_\text{KL} is the Kullback–Leibler divergence. This leads to the intuition that by maximizing the log-likelihood of a model, you are minimizing the KL divergence of your model from the maximal-entropy distribution; intuitively, you are searching for the model that makes the fewest assumptions in its parameters.
Comparison with linear regression
Logistic regression can be seen as a special case of the generalized linear model and is thus analogous to linear regression. The model of logistic regression, however, is based on quite different assumptions (about the relationship between the dependent and independent variables) from those of linear regression. In particular, the key differences between these two models can be seen in the following two features of logistic regression. First, the conditional distribution y \mid x is a Bernoulli distribution rather than a Gaussian distribution, because the dependent variable is binary. Second, the predicted values are probabilities and are therefore restricted to (0,1) through the logistic distribution function, because logistic regression predicts the probability of particular outcomes rather than the outcomes themselves.
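The second difference can be seen directly in a short sketch; the coefficients here are arbitrary illustrative values, not fitted estimates.

import numpy as np

x = np.linspace(-2.0, 8.0, 6)
linear_pred = -0.3 + 0.2 * x                           # a linear mean can leave [0, 1]
logistic_pred = 1.0 / (1.0 + np.exp(4.0 - 1.5 * x))    # the logistic mean cannot

print(linear_pred)     # includes values below 0 and above 1
print(logistic_pred)   # all values strictly inside (0, 1)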
Alternatives
A common alternative to the logistic model (logit model) is the probit model, as the related names suggest. From the perspective of generalized linear models, these differ in the choice of link function: the logistic model uses the logit function (inverse logistic function), while the probit model uses the probit function (the inverse of the standard normal cumulative distribution function). Equivalently, in the latent-variable interpretations of these two methods, the first assumes a standard logistic distribution of errors and the second a standard normal distribution of errors. Other sigmoid functions or error distributions can be used instead.
Logistic regression is an alternative to Fisher's 1936 method, linear discriminant analysis. If the assumptions of linear discriminant analysis hold, the conditioning can be reversed to produce logistic regression. The converse is not true, however, because logistic regression does not require the multivariate normal assumption of discriminant analysis. The assumption of linear predictor effects can easily be relaxed using techniques such as spline functions.
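The closeness of the two link choices can be illustrated numerically. The rescaling constant of about 1.7 below is a conventional approximation for matching the two curves, not an exact identity.

import numpy as np
from scipy.stats import logistic, norm

z = np.linspace(-4.0, 4.0, 9)
print(logistic.cdf(z))      # inverse logit link (logistic model)
print(norm.cdf(z))          # inverse probit link (probit model)
print(norm.cdf(z / 1.7))    # rescaled probit roughly matches the logit curve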
History
A detailed history of the logistic regression is given in . The logistic function was developed as a model of population growth and named "logistic" by Pierre François Verhulst in the 1830s and 1840s, under the guidance of Adolphe Quetelet; see for details. In his earliest paper (1838), Verhulst did not specify how he fit the curves to the data. In his more detailed paper (1845), Verhulst determined the three parameters of the model by making the curve pass through three observed points, which yielded poor predictions.
The logistic function was independently developed in chemistry as a model of autocatalysis (Wilhelm Ostwald, 1883). An autocatalytic reaction is one in which one of the products is itself a catalyst for the same reaction, while the supply of one of the reactants is fixed. This naturally gives rise to the logistic equation for the same reason as population growth: the reaction is self-reinforcing but constrained.
The logistic function was independently rediscovered as a model of population growth in 1920 by Raymond Pearl and Lowell Reed, published as , which led to its use in modern statistics. They were initially unaware of Verhulst's work and presumably learned about it from L. Gustave du Pasquier, but they gave him little credit and did not adopt his terminology. Verhulst's priority was acknowledged and the term "logistic" revived by Udny Yule in 1925, and it has been followed since. Pearl and Reed first applied the model to the population of the United States, and also initially fitted the curve by making it pass through three points; as with Verhulst, this again yielded poor results.
In the 1930s, the probit model was developed and systematized by Chester Ittner Bliss, who coined the term "probit" in , and by John Gaddum in , and the model was fit by maximum likelihood estimation by Ronald A. Fisher in , as an addendum to Bliss's work. The probit model was principally used in bioassay, and had been preceded by earlier work dating to 1860; see . The probit model influenced the subsequent development of the logit model, and these models competed with each other.
The logistic model was likely first used as an alternative to the probit model in bioassay by Edwin Bidwell Wilson and his student Jane Worcester in . However, the development of the logistic model as a general alternative to the probit model was principally due to the work of Joseph Berkson over many decades, beginning in , where he coined "logit" by analogy with "probit", and continuing through and following years. The logit model was initially dismissed as inferior to the probit model, but "gradually achieved an equal footing with the probit", particularly between 1960 and 1970. By 1970, the logit model achieved parity with the probit model in use in statistics journals and thereafter surpassed it. This relative popularity was due to the adoption of the logit outside of bioassay, rather than its displacing the probit within bioassay, and to its informal use in practice; the logit's popularity is credited to the logit model's computational simplicity, mathematical properties, and generality, allowing its use in varied fields.
Various refinements occurred during that time, notably by David Cox, as in . The multinomial logit model was introduced independently in and , which greatly increased the scope of application and the popularity of the logit model. In 1973 Daniel McFadden linked the multinomial logit to the theory of discrete choice, specifically Luce's choice axiom, showing that the multinomial logit followed from the assumption of independence of irrelevant alternatives and interpreting the odds of alternatives as relative preferences; this gave a theoretical foundation for logistic regression.
Extensions
There are large numbers of extensions:
* Multinomial logistic regression (or multinomial logit) handles the case of a multi-way categorical dependent variable (with unordered values, also called "classification"); see the sketch after this list. Note that the general case of having dependent variables with more than two values is termed polytomous regression.
* Ordered logistic regression (or ordered logit) handles ordinal dependent variables (ordered values).
* Mixed logit is an extension of multinomial logit that allows for correlations among the choices of the dependent variable.
* An extension of the logistic model to sets of interdependent variables is the conditional random field.
* Conditional logistic regression handles matched or stratified data when the strata are small. It is mostly used in the analysis of observational studies.
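A minimal multinomial sketch, assuming scikit-learn's LogisticRegression (whose default solver fits the multinomial objective for multiclass labels); the data-generating weights below are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))

# Sample labels from a true multinomial logit with illustrative class weights.
W = np.array([[0.0, 1.0, -1.0],
              [0.0, -1.0, 1.0]])
scores = X @ W
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=p) for p in probs])

clf = LogisticRegression().fit(X, y)   # multinomial fit across the 3 classes
print(clf.predict_proba(X[:3]))        # each row sums to one across classes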
Software
Most statistical software can do binary logistic regression.
* SPSS
** LOGISTIC REGRESSION for basic logistic regression.
* Stata
* SAS
** PROC LOGISTIC for basic logistic regression.
** PROC CATMOD when all the variables are categorical.
** PROC GLIMMIX for multilevel-model logistic regression.
* R
** glm in the stats package (using family = binomial)
** lrm in the rms package
** the glmnet package for an efficient implementation of regularized logistic regression
** lmer for mixed-effects logistic regression
** the Rfast package command gm_logistic for fast and heavy calculations involving large-scale data
** the arm package for Bayesian logistic regression
* Python
** Logit in the Statsmodels module (see the sketch after this list).
** LogisticRegression in the scikit-learn module.
** LogisticRegressor in the TensorFlow module.
** A full example of logistic regression in the Theano tutorial.
** Bayesian logistic regression with ARD prior (code, tutorial).
** Variational Bayes logistic regression with ARD prior (code, tutorial).
** Bayesian logistic regression (code, tutorial).
* NCSS
** Logistic Regression in NCSS.
* MATLAB
** mnrfit in the Statistics and Machine Learning Toolbox (with "incorrect" coded as 2 instead of 0)
** fminunc/fmincon, fitglm, mnrfit, fitclinear, and mle can all do logistic regression.
* Java (JVM)
** LibLinear
** Apache Flink
** Apache Spark
*** SparkML supports logistic regression
* FPGA
** Logistic Regression IP core in HLS for FPGA.
Notably, Microsoft Excel's statistics extension package does not include it.
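As a usage illustration for one of the entries above, a minimal Statsmodels sketch on synthetic data; the true coefficients and sample size are arbitrary choices made for the example.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
hours = rng.uniform(0, 6, size=100)                    # synthetic predictor
p_true = 1.0 / (1.0 + np.exp(4.0 - 1.5 * hours))       # illustrative true model
passed = (rng.uniform(size=100) < p_true).astype(int)  # synthetic 0/1 outcomes

X = sm.add_constant(hours)           # prepend the intercept column
result = sm.Logit(passed, X).fit()   # maximum likelihood fit
print(result.params)                 # estimates of beta_0 and beta_1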
See also
* Logistic function
* Discrete choice
* Jarrow–Turnbull model
* Limited dependent variable
* Multinomial logit model
* Ordered logit
* Hosmer–Lemeshow test
* Brier score
* mlpack - contains a C++ implementation of logistic regression
* Local case-control sampling
* Logistic model tree
References
Further reading
External links
* Econometrics lecture on the logit model, by Mark Thoma
* Logistic Regression tutorial
* Logistic regression software in C for teaching purposes