The generalized normal distribution (GND) or generalized Gaussian distribution (GGD) is either of two families of parametric

continuous probability distribution In probability theory and statistics, a probability distribution is a Function (mathematics), function that gives the probabilities of occurrence of possible events for an Experiment (probability theory), experiment. It is a mathematical descri ...

s on the real line. Both families add a

shape parameter In probability theory and statistics, a shape parameter (also known as form parameter) is a kind of numerical parameter of a parametric family of probability distributionsEveritt B.S. (2002) Cambridge Dictionary of Statistics. 2nd Edition. CUP. th ...

to the

normal distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = \frac ...

. To distinguish the two families, they are referred to below as "symmetric" and "asymmetric"; however, this is not a standard nomenclature.

Symmetric version

The symmetric generalized normal distribution, also known as the exponential power distribution or the generalized error distribution, is a parametric family of symmetric distributions. It includes all normal and

Laplace Pierre-Simon, Marquis de Laplace (; ; 23 March 1749 – 5 March 1827) was a French polymath, a scholar whose work has been instrumental in the fields of physics, astronomy, mathematics, engineering, statistics, and philosophy. He summariz ...

distributions, and as limiting cases it includes all

continuous uniform distribution In probability theory and statistics, the continuous uniform distributions or rectangular distributions are a family of symmetric probability distributions. Such a distribution describes an experiment where there is an arbitrary outcome that li ...

s on bounded intervals of the real line. This family includes the

when

\textstyle\beta=2

(with mean

\textstyle\mu

and variance

\textstyle \frac

) and it includes the

Laplace distribution In probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the double exponential distribution, because it can be thought of as two exponen ...

when

\textstyle\beta=1

. As

\textstyle\beta\rightarrow\infty

, the density converges pointwise to a uniform density on

\textstyle (\mu-\alpha,\mu+\alpha)

. This family allows for tails that are either heavier than normal (when

\beta<2

) or lighter than normal (when

\beta>2

). It is a useful way to parametrize a continuum of symmetric,

platykurtic In probability theory and statistics, kurtosis (from , ''kyrtos'' or ''kurtos'', meaning "curved, arching") refers to the degree of “tailedness” in the probability distribution of a real-valued random variable. Similar to skewness, kurtosis ...

densities spanning from the normal (

\textstyle\beta=2

) to the uniform density (

\textstyle\beta=\infty

), and a continuum of symmetric, leptokurtic densities spanning from the Laplace (

\textstyle\beta=1

) to the normal density (

\textstyle\beta=2

). The shape parameter

\beta

also controls the

peakedness In probability theory and statistics, a shape parameter (also known as form parameter) is a kind of numerical parameter of a parametric family of probability distributionsEveritt B.S. (2002) Cambridge Dictionary of Statistics. 2nd Edition. CUP. t ...

in addition to the tails.

Parameter estimation

Parameter estimation via

maximum likelihood In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed stati ...

and the method of moments has been studied. The estimates do not have a closed form and must be obtained numerically. Estimators that do not require numerical calculation have also been proposed. The generalized normal log-likelihood function has infinitely many continuous derivates (i.e. it belongs to the class C^∞ of

smooth function In mathematical analysis, the smoothness of a function is a property measured by the number of continuous derivatives (''differentiability class)'' it has over its domain. A function of class C^k is a function of smoothness at least ; t ...

s) only if

\textstyle\beta

is a positive, even integer. Otherwise, the function has

\textstyle\lfloor \beta \rfloor

continuous derivatives. As a result, the standard results for consistency and asymptotic normality of

estimates of

\beta

only apply when

\textstyle\beta\ge 2

Maximum likelihood estimator

It is possible to fit the generalized normal distribution adopting an approximate

method. With

\mu

initially set to the sample first moment

m_1

\textstyle\beta

is estimated by using a

Newton–Raphson In numerical analysis, the Newton–Raphson method, also known simply as Newton's method, named after Isaac Newton and Joseph Raphson, is a root-finding algorithm which produces successively better approximations to the roots (or zeroes) of a ...

iterative procedure, starting from an initial guess of

\textstyle\beta=\textstyle\beta_0

, :

\beta _0 = \frac,

where :

m_1= \sum_^N , x_i, ,

is the first statistical moment of the absolute values and

m_2

is the second statistical moment. The iteration is :

\beta_ = \beta_ - \frac ,

where :

g(\beta)= 1 + \frac - \frac +  \frac ,

and :

& - \frac, \end

and where

\psi

and

\psi'

are the

digamma function In mathematics, the digamma function is defined as the logarithmic derivative of the gamma function: :\psi(z) = \frac\ln\Gamma(z) = \frac. It is the first of the polygamma functions. This function is Monotonic function, strictly increasing a ...

and

trigamma function In mathematics, the trigamma function, denoted or , is the second of the polygamma functions, and is defined by : \psi_1(z) = \frac \ln\Gamma(z). It follows from this definition that : \psi_1(z) = \frac \psi(z) where is the digamma functi ...

. Given a value for

\textstyle\beta

, it is possible to estimate

\mu

by finding the minimum of: :

\min_\mu = \sum_^ , x_i-\mu, ^\beta

Finally

\textstyle\alpha

is evaluated as :

\alpha = \left( \frac \sum_^N, x_i-\mu, ^\beta\right)^ .

For

\beta \leq 1

, median is a more appropriate estimator of

\mu

. Once

\mu

is estimated,

\beta

and

\alpha

can be estimated as described above.

Applications

The symmetric generalized normal distribution has been used in modeling when the concentration of values around the mean and the tail behavior are of particular interest. Other families of distributions can be used if the focus is on other deviations from normality. If the

symmetry Symmetry () in everyday life refers to a sense of harmonious and beautiful proportion and balance. In mathematics, the term has a more precise definition and is usually used to refer to an object that is Invariant (mathematics), invariant und ...

of the distribution is the main interest, the skew normal family or asymmetric version of the generalized normal family discussed below can be used. If the tail behavior is the main interest, the student t family can be used, which approximates the normal distribution as the degrees of freedom grows to infinity. The t distribution, unlike this generalized normal distribution, obtains heavier than normal tails without acquiring a

cusp A cusp is the most pointed end of a curve. It often refers to cusp (anatomy), a pointed structure on a tooth. Cusp or CUSP may also refer to: Mathematics * Cusp (singularity), a singular point of a curve * Cusp catastrophe, a branch of bifu ...

at the origin. It finds uses in plasma physics under the name of Langdon Distribution resulting from inverse bremsstrahlung. In a

linear regression In statistics, linear regression is a statistical model, model that estimates the relationship between a Scalar (mathematics), scalar response (dependent variable) and one or more explanatory variables (regressor or independent variable). A mode ...

problem modeled as

y \sim \mathrm(X\cdot\theta, \alpha, p)

, the MLE will be the

\arg\min_\, X\cdot\theta-y\, _p

where the

p-norm In mathematics, the spaces are function spaces defined using a natural generalization of the -norm for finite-dimensional vector spaces. They are sometimes called Lebesgue spaces, named after Henri Lebesgue , although according to the Bourbaki ...

is used.

Properties

Moments

Let

X_\beta

be zero mean generalized Gaussian distribution of shape

\beta

and scaling parameter

\alpha

. The moments of

X_\beta

exist and are finite for any k greater than −1. For any non-negative integer k, the plain central moments are :

= \begin 0 & \textk\text \\ \alpha^ \Gamma \left( \frac \right) \Big/ \, \Gamma \left( \frac \right) & \textk\text \end

Connection to Stable Count Distribution

From the viewpoint of the

Stable count distribution In probability theory, the stable count distribution is the conjugate prior of a Stable distribution#One-sided stable distribution and stable count distribution, one-sided stable distribution. This distribution was discovered by Stephen Lihn (Chi ...

\beta

can be regarded as Lévy's stability parameter. This distribution can be decomposed to an integral of kernel density where the kernel is either a

or a

Gaussian distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real number, real-valued random variable. The general form of its probability density function is f(x ...

: :

\frac \frac e^ =
      \begin
        \displaystyle\int_0^\infty \frac \left( \frac e^ \right)
        \mathfrak_\beta(\nu) \, d\nu ,
        & 1 \geq \beta > 0; \text \\
        \displaystyle\int_0^\infty \frac \left( \frac e^ \right) 
        V_(s) \, ds ,
        & 2 \geq \beta > 0;
      \end

where

\mathfrak_\beta(\nu)

is the

and

V_(s)

is the Stable vol distribution.

Connection to Positive-Definite Functions

The probability density function of the symmetric generalized normal distribution is a

positive-definite function In mathematics, a positive-definite function is, depending on the context, either of two types of function. Definition 1 Let \mathbb be the set of real numbers and \mathbb be the set of complex numbers. A function f: \mathbb \to \mathbb is ...

for

\beta \in (0,2]

Infinite divisibility

The symmetric generalized Gaussian distribution is an

infinitely divisible distribution Infinity is something which is boundless, endless, or larger than any natural number. It is denoted by \infty, called the infinity symbol. From the time of the ancient Greeks, the philosophical nature of infinity has been the subject of m ...

if and only if

\beta \in (0,1] \cup \

Generalizations

The multivariate generalized normal distribution, i.e. the product of

n

exponential power distributions with the same

\beta

and

\alpha

parameters, is the only probability density that can be written in the form

p(\mathbf x)=g(\, \mathbf x\, _\beta)

and has independent marginals. The results for the special case of the

Multivariate normal distribution In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional ( univariate) normal distribution to higher dimensions. One d ...

is originally attributed to

Maxwell Maxwell may refer to: People * Maxwell (surname), including a list of people and fictional characters with the name ** James Clerk Maxwell, mathematician and physicist * Justice Maxwell (disambiguation) * Maxwell baronets, in the Baronetage of N ...

Asymmetric version

The asymmetric generalized normal distribution is a family of continuous probability distributions in which the shape parameter can be used to introduce asymmetry or skewness. When the shape parameter is zero, the normal distribution results. Positive values of the shape parameter yield left-skewed distributions bounded to the right, and negative values of the shape parameter yield right-skewed distributions bounded to the left. Only when the shape parameter is zero is the density function for this distribution positive over the whole real line: in this case the distribution is a

, otherwise the distributions are shifted and possibly reversed

log-normal distribution In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normal distribution, normally distributed. Thus, if the random variable is log-normally distributed ...

Parameter estimation

Parameters can be estimated via

maximum likelihood estimation In statistics, maximum likelihood estimation (MLE) is a method of estimation theory, estimating the Statistical parameter, parameters of an assumed probability distribution, given some observed data. This is achieved by Mathematical optimization, ...

or the method of moments. The parameter estimates do not have a closed form, so numerical calculations must be used to compute the estimates. Since the sample space (the set of real numbers where the density is non-zero) depends on the true value of the parameter, some standard results about the performance of parameter estimates will not automatically apply when working with this family.

Applications

The asymmetric generalized normal distribution can be used to model values that may be normally distributed, or that may be either right-skewed or left-skewed relative to the normal distribution. The

skew normal distribution In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness. Definition Let \phi(x) denote the Normal distribution, standard ...

is another distribution that is useful for modeling deviations from normality due to skew. Other distributions used to model skewed data include the

gamma Gamma (; uppercase , lowercase ; ) is the third letter of the Greek alphabet. In the system of Greek numerals it has a value of 3. In Ancient Greek, the letter gamma represented a voiced velar stop . In Modern Greek, this letter normally repr ...

lognormal In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normal distribution, normally distributed. Thus, if the random variable is log-normally distributed ...

, and Weibull distributions, but these do not include the normal distributions as special cases.

Kullback-Leibler divergence between two PDFs

Kullback-Leibler divergence (KLD) is a method using for compute the divergence or similarity between two probability density functions. Let

P(x)

and

Q(x)

two generalized Gaussian distributions with parameters

\alpha_1, \beta_1, \mu_1

and

\alpha_2, \beta_2, \mu_2

subject to the constraint

\mu_1 = \mu_2 = 0

. Then this divergence is given by: :

(P(x), , Q(x)) = -\frac + \frac + \log \left(\frac\right)

Other distributions related to the normal

The two generalized normal families described here, like the skew normal family, are parametric families that extends the normal distribution by adding a shape parameter. Due to the central role of the normal distribution in probability and statistics, many distributions can be characterized in terms of their relationship to the normal distribution. For example, the

log-normal In probability theory, a log-normal (or lognormal) distribution is a continuous probability distribution of a random variable whose logarithm is normal distribution, normally distributed. Thus, if the random variable is log-normally distributed ...

, folded normal, and inverse normal distributions are defined as transformations of a normally-distributed value, but unlike the generalized normal and skew-normal families, these do not include the normal distributions as special cases. Actually all distributions with finite variance are in the limit highly related to the normal distribution. The Student-t distribution, the

Irwin–Hall distribution In probability and statistics, the Irwin–Hall distribution, named after Joseph Oscar Irwin and Philip Hall, is a probability distribution for a random variable defined as the sum of a number of independent random variables, each having a unifo ...

and the

Bates distribution In probability and business statistics, the Bates distribution, named after Grace Bates, is a probability distribution of the mean of a number of statistically independent uniformly distributed random variables on the unit interval. This dist ...

also extend the normal distribution, and ''include'' in the limit the normal distribution. So there is no strong reason to prefer the "generalized" normal distribution of type 1, e.g. over a combination of Student-t and a normalized extended Irwin–Hall – this would include e.g. the triangular distribution (which cannot be modeled by the generalized Gaussian type 1). A symmetric distribution which can model both tail (long and short) ''and'' center behavior (like flat, triangular or Gaussian) completely independently could be derived e.g. by using ''X'' = IH/chi. The Tukey g- and h-distribution also allows for a deviation from normality, both through skewness and fat tails.The Tukey g-and-h Distribution
Yuan Yan, Marc G. Genton Significance, Volume 16, Issue 3, June 2019, Pages 12–13,

References

{{DEFAULTSORT:Generalized Normal Distribution Continuous distributions

Symmetric version

Parameter estimation

Maximum likelihood estimator

Applications

Properties

Moments

Connection to Stable Count Distribution

Connection to Positive-Definite Functions

Infinite divisibility

Generalizations

Asymmetric version

Parameter estimation

Applications

Kullback-Leibler divergence between two PDFs

Other distributions related to the normal

See also

References