
In statistics, a zero-inflated model is a statistical model based on a zero-inflated probability distribution, i.e. a distribution that allows for frequent zero-valued observations.


Zero-inflated Poisson

One well-known zero-inflated model is Diane Lambert's zero-inflated Poisson model, which concerns a random event containing excess zero-count data in unit time. For example, the number of insurance claims within a population for a certain type of risk would be zero-inflated by those people who have not taken out insurance against the risk and thus are unable to claim. The zero-inflated Poisson (ZIP) model mixes two zero-generating processes. The first process generates only zeros. The second process is governed by a Poisson distribution that generates counts, some of which may be zero. The mixture distribution is described as follows:
: \Pr(Y = 0) = \pi + (1 - \pi) e^{-\lambda}
: \Pr(Y = y_i) = (1 - \pi) \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}, \qquad y_i = 1, 2, 3, \ldots
where the outcome variable y_i has any non-negative integer value, \lambda is the expected Poisson count for the ith individual, and \pi is the probability of extra zeros. The mean is (1 - \pi) \lambda and the variance is \lambda (1 - \pi) (1 + \pi \lambda).
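The mixture probabilities above can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function names `zip_pmf` and `zip_moments`, the parameter values, and the truncation of the sum at `ymax` are all assumptions made for illustration. It evaluates the ZIP probabilities and compares the summed mean and variance with the closed forms (1 - \pi) \lambda and \lambda (1 - \pi) (1 + \pi \lambda).

```python
import math

def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson: extra-zero probability pi, Poisson mean lam."""
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    if y == 0:
        # Zeros come from both processes: the extra-zero process and the Poisson process.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson

def zip_moments(pi, lam, ymax=100):
    """Mean and variance obtained by summing the pmf over 0..ymax.

    Truncating at ymax is an assumption; it is adequate for moderate lam.
    """
    probs = [zip_pmf(y, pi, lam) for y in range(ymax + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# Illustrative parameter values (hypothetical, not taken from the article).
pi, lam = 0.3, 2.5
mean, var = zip_moments(pi, lam)
print("mean:", mean, "closed form:", (1 - pi) * lam)
print("variance:", var, "closed form:", lam * (1 - pi) * (1 + pi * lam))
```

For very large `ymax` the factorial no longer fits in a floating-point number, so a log-space evaluation would be the safer choice in that regime.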


Estimators of ZIP parameters

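The method of moments estimators are given by
: \hat{\lambda}_{mo} = \frac{s^2 + m^2}{m} - 1,
: \hat{\pi}_{mo} = \frac{s^2 - m}{s^2 + m^2 - m},
where m is the sample mean and s^2 is the sample variance. The maximum likelihood estimator of \lambda can be found by solving the following equation
: m \left( 1 - e^{-\hat{\lambda}_{ml}} \right) = \hat{\lambda}_{ml} \left( 1 - \frac{n_0}{n} \right),
where \frac{n_0}{n} is the observed proportion of zeros. A closed-form solution of this equation is given by
: \hat{\lambda}_{ml} = W_0(-s e^{-s}) + s
with W_0 being the main branch of Lambert's W-function and
: s = \frac{m}{1 - \frac{n_0}{n}}.
Note that this s is the mean of the nonzero observations and is unrelated to the sample variance s^2 used above. Alternatively, the equation can be solved by iteration. The maximum likelihood estimator for \pi is given by
: \hat{\pi}_{ml} = 1 - \frac{m}{\hat{\lambda}_{ml}}.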
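The estimators above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the function names and the sample data are hypothetical, the sample variance is taken to be the unbiased (n - 1) version, and the likelihood equation is solved by the fixed-point iteration \lambda \leftarrow s (1 - e^{-\lambda}), which is one possible form of the "iteration" mentioned above. The sketch also assumes the data contain at least one nonzero count and that the mean of the nonzero counts exceeds 1.

```python
import math
import statistics

def zip_moment_estimates(data):
    """Method of moments estimates (lambda, pi) from sample mean and variance."""
    m = statistics.mean(data)
    s2 = statistics.variance(data)  # assumption: unbiased (n - 1) sample variance
    lam = (s2 + m * m) / m - 1
    pi = (s2 - m) / (s2 + m * m - m)
    return lam, pi

def zip_ml_estimates(data, tol=1e-12, max_iter=10_000):
    """Maximum likelihood estimates (lambda, pi) via fixed-point iteration."""
    n = len(data)
    n0 = sum(1 for y in data if y == 0)  # number of observed zeros
    m = sum(data) / n                    # sample mean
    s = m / (1 - n0 / n)                 # mean of the nonzero observations
    lam = s                              # starting value at or above the root
    for _ in range(max_iter):
        new = s * (1 - math.exp(-lam))   # rearranged likelihood equation
        if abs(new - lam) < tol:
            lam = new
            break
        lam = new
    pi = 1 - m / lam
    return lam, pi

# Hypothetical sample with many zeros.
data = [0, 0, 0, 0, 0, 0, 1, 2, 3, 2]
print("method of moments:", zip_moment_estimates(data))
print("maximum likelihood:", zip_ml_estimates(data))
```

Rearranging the likelihood equation as \lambda = s (1 - e^{-\lambda}) avoids any dependency on a Lambert W implementation; where such a function is available, the closed form W_0(-s e^{-s}) + s should give the same root.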


Related models

In 1994, Greene considered the zero-inflated negative binomial (ZINB) model. Daniel B. Hall adapted Lambert's methodology to an upper-bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model.


Discrete pseudo compound Poisson model

If the count data Y is such that the probability of zero is larger than the probability of nonzero, namely
: \Pr(Y = 0) > 0.5
then the discrete data Y obey a discrete pseudo compound Poisson distribution. In fact, let G(z) = \sum\limits_{n=0}^\infty P(Y = n) z^n be the probability generating function of Y. If p_0 = \Pr(Y = 0) > 0.5, then for |z| \le 1
: |G(z)| \geqslant p_0 - \sum\limits_{i=1}^\infty p_i = 2 p_0 - 1 > 0.
Then, from the Wiener–Lévy theorem, G(z) is the probability generating function of a discrete pseudo compound Poisson distribution. We say that the discrete random variable Y satisfying the probability generating function characterization
: G_Y(z) = \sum\limits_{n=0}^\infty P(Y = n) z^n = \exp\left( \sum_{k=1}^\infty \alpha_k \lambda (z^k - 1) \right), \quad (|z| \le 1)
has a discrete pseudo compound Poisson distribution with parameters
: (\lambda_1, \lambda_2, \ldots) = (\alpha_1 \lambda, \alpha_2 \lambda, \ldots) \in \mathbb{R}^\infty \left( \sum_{k=1}^\infty \alpha_k = 1, \sum\limits_{k=1}^\infty |\alpha_k| < \infty, \alpha_k \in \mathbb{R}, \lambda > 0 \right).
When all the \alpha_k are non-negative, it is the discrete compound Poisson distribution (non-Poisson case) with the overdispersion property.
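As a rough numerical illustration of the characterization above, the parameters \lambda_k = \alpha_k \lambda can be read off as the power-series coefficients of \log G(z). The Python sketch below is an assumption-laden example, not a method taken from the source: the recurrence comes from differentiating G(z) = \exp(L(z)), which gives n p_n = \sum_{j=1}^{n} j c_j p_{n-j} for the coefficients c_j of L, and the function names and parameter values are hypothetical. It uses a ZIP distribution with \Pr(Y = 0) > 0.5 and checks that the \lambda_k sum to -\log p_0, as the constraint \sum \alpha_k = 1 requires.

```python
import math

def zip_pmf(y, pi, lam):
    """ZIP probability P(Y = y); pi = 0 reduces to an ordinary Poisson."""
    poisson = math.exp(-lam) * lam**y / math.factorial(y)
    return (pi if y == 0 else 0.0) + (1 - pi) * poisson

def pseudo_cp_parameters(p, kmax):
    """Return [lambda_1, ..., lambda_kmax] from pmf values p[0..kmax], with p[0] > 0.

    c[k] are the coefficients of log G(z); for k >= 1 they equal lambda_k.
    """
    c = [math.log(p[0])]  # c_0 = log p_0 = -(lambda_1 + lambda_2 + ...)
    for n in range(1, kmax + 1):
        acc = sum(j * c[j] * p[n - j] for j in range(1, n))
        c.append((p[n] - acc / n) / p[0])
    return c[1:]

# Hypothetical ZIP parameters chosen so that P(Y = 0) exceeds 0.5.
pi, lam, kmax = 0.6, 1.0, 40
p = [zip_pmf(y, pi, lam) for y in range(kmax + 1)]
params = pseudo_cp_parameters(p, kmax)
print("P(Y = 0) > 0.5:", p[0] > 0.5)
print("sum of lambda_k:", sum(params), "-log p_0:", -math.log(p[0]))
```

Setting `pi` to 0 gives an ordinary Poisson distribution, for which the recurrence should return \lambda_1 equal to the Poisson mean and all higher \lambda_k equal to zero up to rounding.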


See also

* Poisson distribution
* Zero-truncated Poisson distribution
* Compound Poisson distribution
* Sparse approximation
* Hurdle model


Software


* pscl and brms, R packages

