In statistics and probability, the Neyman Type A distribution is a discrete probability distribution from the family of compound Poisson distributions. It is easiest to understand through the following example, taken from ''Univariate Discrete Distributions''.

Consider a statistical model for the distribution of larvae in a unit area of field (a unit of habitat). The number of clusters of eggs per unit area is assumed to follow a Poisson distribution with parameter <math>\lambda</math>, while the numbers of larvae developing per cluster of eggs are assumed to be independent Poisson variables, all with the same parameter <math>\phi</math>. To find how many larvae there are, we define a random variable ''Y'' as the sum of the numbers of larvae hatched in each cluster (given ''j'' clusters). Therefore, ''Y'' = ''X''<sub>1</sub> + ''X''<sub>2</sub> + ... + ''X''<sub>''j''</sub>, where ''X''<sub>1</sub>, ..., ''X''<sub>''j''</sub> are independent Poisson variables with parameter <math>\phi</math> and the number of clusters ''j'' is itself a Poisson variable with parameter <math>\lambda</math>.
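The larvae example can be imitated by simulation. The following is a minimal sketch using only the Python standard library; the names `lam` (mean number of egg clusters per unit area) and `phi` (mean number of larvae per cluster), the helper functions, and the numerical values are illustrative assumptions, not part of the original model description.

```python
import math
import random

def sample_poisson(rng, mean):
    """Draw one Poisson(mean) variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_larvae(rng, lam, phi):
    """Total larvae in one unit area: Poisson(lam) clusters, Poisson(phi) larvae each."""
    clusters = sample_poisson(rng, lam)
    return sum(sample_poisson(rng, phi) for _ in range(clusters))

rng = random.Random(12345)   # fixed seed so the run is repeatable
lam, phi = 2.0, 3.0          # hypothetical parameter values
draws = [sample_larvae(rng, lam, phi) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)           # should be close to lam * phi
```

The sample mean should settle near the product of the two parameters, since each of the (on average) `lam` clusters contributes `phi` larvae on average.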
History
Jerzy Neyman, a Polish statistician born in Russia on April 16, 1894, spent the first part of his career in Europe. In 1939 he developed the Neyman Type A distribution to describe the distribution of larvae in experimental field plots. It is used above all to describe populations subject to contagion, e.g., in entomology (Beall 1940, Evans 1953), accidents (Creswell and Froggatt 1963), and bacteriology.
The original derivation of this distribution was on the basis of a biological model and, presumably, it was expected that a good fit to the data would justify the hypothesized model. However, it is now known that this distribution can be derived from different models (William Feller 1943), and in view of this, Neyman's distribution can be derived as a compound Poisson distribution. This interpretation makes such distributions suitable for modelling heterogeneous populations and renders them examples of apparent contagion.
Despite this, the difficulties in dealing with Neyman's Type A distribution arise from the fact that its expressions for probabilities are highly complex. Even parameter estimation by efficient methods, such as maximum likelihood, leads to equations that are tedious and not easy to interpret.
Definition
Probability generating function
Consider a branching process in which a random number ''N'' of individuals, with probability generating function (pgf) ''G''<sub>1</sub>(''z''), gives rise to ''N'' independent random variables ''X''<sub>''j''</sub>. Each ''X''<sub>''j''</sub> is the random number of individuals produced, where ''X''<sub>1</sub>, ''X''<sub>2</sub>, ... have the same distribution as ''X'', with pgf ''G''<sub>2</sub>(''z''). The total number of individuals is then the random variable
:<math>Y = S_N = X_1 + X_2 + \cdots + X_N.</math>
The pgf of the distribution of ''S''<sub>''N''</sub> is
:<math>E[z^{S_N}] = E_N\left[E[z^{S_N} \mid N]\right] = E_N\left[G_2(z)^N\right] = G_1(G_2(z)).</math>
A particularly helpful symbolic notation for an ''F''<sub>1</sub> distribution that has been generalized by an ''F''<sub>2</sub> distribution is
:<math>Y \sim F_1 \bigvee F_2.</math>
In this instance, it is written as
:<math>Y \sim \operatorname{Pois}(\lambda) \bigvee \operatorname{Pois}(\phi).</math>
Finally, the probability generating function is
:<math>G_Y(z) = \exp\left(\lambda\left(e^{\phi(z-1)} - 1\right)\right).</math>
From the probability generating function we can calculate the probability mass function, explained below.
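The composition of generating functions can be checked numerically. This is a small sketch, assuming both the number of clusters and the cluster sizes are Poisson, so that each pgf has the standard form exp(mean (''z'' − 1)); the function names and the parameter values `lam` and `phi` are illustrative assumptions.

```python
import math

def pgf_poisson(z, mean):
    """pgf of a Poisson(mean) variable: exp(mean * (z - 1))."""
    return math.exp(mean * (z - 1.0))

def pgf_neyman_a(z, lam, phi):
    """Composition G1(G2(z)): G1 is the Poisson(lam) pgf, G2 the Poisson(phi) pgf."""
    return pgf_poisson(pgf_poisson(z, phi), lam)

lam, phi = 2.0, 3.0  # hypothetical parameter values

# The composition agrees with the closed form exp(lam * (exp(phi * (z - 1)) - 1)).
closed_form = math.exp(lam * (math.exp(phi * (0.5 - 1.0)) - 1.0))

# The derivative of a pgf at z = 1 is the mean; estimate it by a central difference.
h = 1e-6
mean_estimate = (pgf_neyman_a(1 + h, lam, phi) - pgf_neyman_a(1 - h, lam, phi)) / (2 * h)
print(closed_form, mean_estimate)
```

The central-difference estimate of the derivative at ''z'' = 1 should be close to `lam * phi`, the mean of the compound variable.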
Probability mass function
Let ''X''<sub>1</sub>, ''X''<sub>2</sub>, ..., ''X''<sub>''j''</sub> be independent Poisson variables with parameter <math>\phi</math>, where the number of terms ''j'' is itself Poisson with parameter <math>\lambda</math>. The probability distribution of the random variable ''Y'' = ''X''<sub>1</sub> + ''X''<sub>2</sub> + ... + ''X''<sub>''j''</sub> is the Neyman Type A distribution with parameters <math>\lambda</math> and <math>\phi</math>:
:<math>p_x = P(Y = x) = \frac{e^{-\lambda}\phi^x}{x!} \sum_{j=0}^{\infty} \frac{(\lambda e^{-\phi})^j j^x}{j!}, \qquad x = 0, 1, 2, \ldots</math>
Alternatively,
:<math>p_x = P(Y = x) = \frac{e^{-\lambda + \lambda e^{-\phi}}\phi^x}{x!} \sum_{j=0}^{x} S(x,j)\,\lambda^j e^{-j\phi}.</math>
To see how the previous expression arises, recall that the probability mass function is obtained from the probability generating function, and use the property of Stirling numbers. The development is as follows:
:<math>G_Y(z) = e^{-\lambda} \sum_{j=0}^{\infty} \frac{(\lambda e^{-\phi})^j}{j!} e^{j\phi z} = e^{-\lambda} \sum_{j=0}^{\infty} \frac{(\lambda e^{-\phi})^j}{j!} \sum_{x=0}^{\infty} \frac{(j\phi)^x z^x}{x!}</math>
:<math>p_x = \frac{e^{-\lambda}\phi^x}{x!} \sum_{j=0}^{\infty} \frac{(\lambda e^{-\phi})^j j^x}{j!}</math>
:<math>p_x = \frac{e^{-\lambda + \lambda e^{-\phi}}\phi^x}{x!} \sum_{j=0}^{x} S(x,j)\,\lambda^j e^{-j\phi}</math>
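The two forms of the probability mass function can be compared numerically. The sketch below assumes the infinite-series form with weights (''λ'' e<sup>−''φ''</sup>)<sup>''j''</sup> ''j''<sup>''x''</sup> / ''j''! and the finite form with Stirling numbers of the second kind ''S''(''x'', ''j''); the function names, the truncation at 200 terms, and the parameter values are illustrative assumptions.

```python
import math

def stirling2_row(x):
    """Return [S(x, 0), ..., S(x, x)] via S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    row = [1]  # S(0, 0) = 1
    for n in range(1, x + 1):
        new = [0] * (n + 1)
        for k in range(1, n + 1):
            prev_same = row[k] if k < n else 0  # S(n-1, n) = 0
            new[k] = k * prev_same + row[k - 1]
        row = new
    return row

def pmf_series(x, lam, phi, terms=200):
    """Infinite-series form, truncated after `terms` terms (adequate for small x)."""
    a = lam * math.exp(-phi)
    total, weight = 0.0, 1.0        # weight holds a**j / j!
    for j in range(terms):
        total += weight * (j ** x)  # Python evaluates 0 ** 0 as 1
        weight *= a / (j + 1)
    return math.exp(-lam) * phi ** x / math.factorial(x) * total

def pmf_stirling(x, lam, phi):
    """Finite-sum form using Stirling numbers of the second kind."""
    a = lam * math.exp(-phi)
    s = stirling2_row(x)
    finite_sum = sum(s[j] * a ** j for j in range(x + 1))
    return math.exp(-lam + a) * phi ** x / math.factorial(x) * finite_sum

lam, phi = 2.0, 3.0  # hypothetical parameter values
for x in range(6):
    print(x, pmf_series(x, lam, phi), pmf_stirling(x, lam, phi))
```

The finite form avoids truncating an infinite sum, at the cost of building a row of Stirling numbers for each ''x''.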
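Another way to compute the probabilities is through the recurrence relation
:<math>p_x = P(Y = x) = \frac{\lambda\phi e^{-\phi}}{x} \sum_{r=0}^{x-1} \frac{\phi^r}{r!}\, p_{x-r-1}, \qquad p_0 = \exp\left(-\lambda + \lambda e^{-\phi}\right).</math>
Although the number of terms grows with ''x'', this recurrence relation is used only for numerical computation and is particularly useful for computer applications.
In the expressions above: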
* ''x'' = 0, 1, 2, ..., except in the recurrence relation, where ''x'' = 1, 2, 3, ...
* <math>\lambda > 0</math> and <math>\phi > 0</math>.
* ''x''! and ''j''! are the factorials of ''x'' and ''j'', respectively.
* ''S''(''x'', ''j'') are Stirling numbers of the second kind, one of whose properties is as follows:
:<math>\sum_{j=0}^{\infty} \frac{j^x a^j}{j!} = e^{a} \sum_{j=0}^{x} S(x,j)\, a^j.</math>
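The recurrence lends itself to direct computation. The following sketch assumes the starting value ''p''<sub>0</sub> = exp(−''λ'' + ''λ'' e<sup>−''φ''</sup>) and the recurrence ''p''<sub>''x''</sub> = (''λφ'' e<sup>−''φ''</sup> / ''x'') Σ<sub>''r''</sub> (''φ''<sup>''r''</sup> / ''r''!) ''p''<sub>''x''−''r''−1</sub>, which follows from differentiating the probability generating function; the function name, the cut-off at 150, and the parameter values are illustrative assumptions.

```python
import math

def pmf_recurrence(max_x, lam, phi):
    """Return [p_0, ..., p_max_x] computed with the recurrence relation."""
    probs = [math.exp(-lam + lam * math.exp(-phi))]  # p_0
    scale = lam * phi * math.exp(-phi)
    for x in range(1, max_x + 1):
        acc, coeff = 0.0, 1.0          # coeff holds phi**r / r!
        for r in range(x):
            acc += coeff * probs[x - r - 1]
            coeff *= phi / (r + 1)
        probs.append(scale / x * acc)
    return probs

lam, phi = 2.0, 3.0                    # hypothetical parameter values
probs = pmf_recurrence(150, lam, phi)
total = sum(probs)
mean = sum(x * p for x, p in enumerate(probs))
variance = sum(x * x * p for x, p in enumerate(probs)) - mean ** 2
print(total, mean, variance)
```

Each new probability needs all earlier ones, so computing ''p''<sub>0</sub>, ..., ''p''<sub>''n''</sub> costs on the order of ''n''<sup>2</sup> operations. For these parameter values the probabilities beyond 150 are negligible, so the truncated list should sum to about 1.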
Notation
:<math>Y \sim \operatorname{NA}(\lambda, \phi)</math>
Properties
Moment and cumulant generating functions
The moment generating function of a random variable ''X'' is defined as the expected value of ''e''<sup>''tX''</sup>, as a function of the real parameter ''t''. For an <math>\operatorname{NA}(\lambda, \phi)</math> distribution, the moment generating function exists and is equal to
:<math>M(t) = G_Y(e^t) = \exp\left(\lambda\left(e^{\phi(e^t - 1)} - 1\right)\right).</math>
The cumulant generating function is the logarithm of the moment generating function and is equal to
:<math>K(t) = \log M(t) = \lambda\left(e^{\phi(e^t - 1)} - 1\right).</math>
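Cumulants can be read off the cumulant generating function. The sketch below assumes the form ''K''(''t'') = ''λ''(exp(''φ''(e<sup>''t''</sup> − 1)) − 1); because the inner exponential is the moment generating function of a Poisson variable with mean ''φ'', the ''r''-th cumulant equals ''λ'' times the ''r''-th raw moment of that Poisson variable, which is a sum over Stirling numbers of the second kind. The function names and parameter values are illustrative assumptions.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind S(n, k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def cumulant(r, lam, phi):
    """r-th cumulant: lam times the r-th raw moment of a Poisson(phi) variable."""
    return lam * sum(stirling2(r, k) * phi ** k for k in range(1, r + 1))

def cgf(t, lam, phi):
    """Cumulant generating function K(t)."""
    return lam * (math.exp(phi * (math.exp(t) - 1.0)) - 1.0)

lam, phi = 2.0, 3.0  # hypothetical parameter values
h = 1e-4
# Finite-difference estimates of K'(0) and K''(0), i.e. the mean and the variance.
first = (cgf(h, lam, phi) - cgf(-h, lam, phi)) / (2 * h)
second = (cgf(h, lam, phi) - 2 * cgf(0.0, lam, phi) + cgf(-h, lam, phi)) / h ** 2
print([cumulant(r, lam, phi) for r in range(1, 5)], first, second)
```

The first two cumulants are the mean ''λφ'' and the variance ''λφ''(1 + ''φ''), and the finite-difference estimates should agree with them to several digits.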
The following table lists the moments of orders 1 to 4.
Skewness
The skewness is the third moment centered around the mean divided by the 3/2 power of the variance (that is, the cube of the standard deviation), and for the <math>\operatorname{NA}</math> distribution is
:<math>\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{\lambda\phi(1 + 3\phi + \phi^2)}{\left(\lambda\phi(1 + \phi)\right)^{3/2}}.</math>
Kurtosis
The kurtosis is the fourth moment centered around the mean, divided by the square of the variance, and for the <math>\operatorname{NA}</math> distribution is
:<math>\beta_2 = \frac{\mu_4}{\mu_2^2} = 3 + \frac{1 + 7\phi + 6\phi^2 + \phi^3}{\lambda\phi(1 + \phi)^2}.</math>
The excess kurtosis is just a correction to make the kurtosis of the normal distribution equal to zero, and it is
:<math>\gamma_2 = \beta_2 - 3 = \frac{1 + 7\phi + 6\phi^2 + \phi^3}{\lambda\phi(1 + \phi)^2}.</math>
* Always <math>\beta_2 > 3</math>, or <math>\gamma_2 > 0</math>: the distribution has a high acute peak around the mean and fatter tails.
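The shape measures can be evaluated directly from the parameters. This sketch assumes the variance ''λφ''(1 + ''φ''), the third central moment ''λφ''(1 + 3''φ'' + ''φ''<sup>2</sup>), and the fourth cumulant ''λφ''(1 + 7''φ'' + 6''φ''<sup>2</sup> + ''φ''<sup>3</sup>), which follow from the cumulant generating function; the function name and parameter values are illustrative assumptions.

```python
def neyman_a_shape(lam, phi):
    """Return (skewness, kurtosis, excess kurtosis) of the distribution."""
    variance = lam * phi * (1 + phi)
    mu3 = lam * phi * (1 + 3 * phi + phi ** 2)                    # third central moment
    kappa4 = lam * phi * (1 + 7 * phi + 6 * phi ** 2 + phi ** 3)  # fourth cumulant
    skewness = mu3 / variance ** 1.5
    excess = kappa4 / variance ** 2
    return skewness, excess + 3.0, excess

lam, phi = 2.0, 3.0  # hypothetical parameter values
skewness, kurtosis, excess = neyman_a_shape(lam, phi)
print(skewness, kurtosis, excess)
```

Both numerator and denominator of the excess kurtosis are positive for positive parameters, so the computed value is positive for any valid input.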
Characteristic function
For a discrete distribution, the characteristic function of a real-valued random variable ''X'' is defined as the expected value of <math>e^{itX}</math>, where ''i'' is the imaginary unit and ''t'' ∈ ''R'':
:<math>\varphi_X(t) = E\left[e^{itX}\right] = \sum_{j=0}^{\infty} e^{ijt} P[X = j].</math>
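For a count variable the characteristic function is the probability generating function evaluated at ''z'' = e<sup>''it''</sup>. The sketch below assumes the pgf exp(''λ''(e<sup>''φ''(''z'' − 1)</sup> − 1)) and compares that closed form with the defining sum over probabilities, which are computed here with a recurrence started at ''p''<sub>0</sub>; the function names, the cut-off at 150, and the parameter values are illustrative assumptions.

```python
import cmath
import math

def char_fn(t, lam, phi):
    """Closed form: the pgf evaluated at z = exp(i t)."""
    z = cmath.exp(1j * t)
    return cmath.exp(lam * (cmath.exp(phi * (z - 1)) - 1))

def pmf_recurrence(max_x, lam, phi):
    """Return [p_0, ..., p_max_x] computed with the recurrence relation."""
    probs = [math.exp(-lam + lam * math.exp(-phi))]
    scale = lam * phi * math.exp(-phi)
    for x in range(1, max_x + 1):
        acc, coeff = 0.0, 1.0          # coeff holds phi**r / r!
        for r in range(x):
            acc += coeff * probs[x - r - 1]
            coeff *= phi / (r + 1)
        probs.append(scale / x * acc)
    return probs

lam, phi, t = 2.0, 3.0, 0.7            # hypothetical values
probs = pmf_recurrence(150, lam, phi)
# Defining sum E[exp(i t Y)], truncated where the probabilities are negligible.
direct = sum(p * cmath.exp(1j * t * x) for x, p in enumerate(probs))
print(char_fn(t, lam, phi), direct)
```

The two complex numbers should agree closely, and the modulus of the result never exceeds 1, as for any characteristic function.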