The concept of a normalizing constant arises in probability theory and a variety of other areas of mathematics. The normalizing constant is used to reduce any probability function to a probability density function with total probability of one.
Definition
In probability theory, a normalizing constant is a constant by which an everywhere non-negative function must be multiplied so the area under its graph is 1, e.g., to make it a probability density function or a probability mass function.
Examples
If we start from the simple Gaussian function
:p(x) = e^{-x^2/2}, \quad x \in (-\infty, \infty),
we have the corresponding Gaussian integral
:\int_{-\infty}^{\infty} e^{-x^2/2} \, dx = \sqrt{2\pi}.
Now if we use the latter's reciprocal value as a normalizing constant for the former, defining a function φ(x) as
:\varphi(x) = \frac{1}{\sqrt{2\pi}} p(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2},
so that its integral is one,
:\int_{-\infty}^{\infty} \varphi(x) \, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2} \, dx = 1,
then the function φ(x) is a probability density function. This is the density of the standard normal distribution. (''Standard'', in this case, means the expected value is 0 and the variance is 1.) The constant 1/√(2π) is the normalizing constant of the function p(x).
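As a quick numerical check of this example, here is a minimal sketch in Python (assuming NumPy and SciPy are available; neither appears in the original text): it integrates the unnormalized Gaussian and confirms that the reciprocal of the resulting area agrees with 1/√(2π).

<syntaxhighlight lang="python">
import numpy as np
from scipy import integrate

# Unnormalized Gaussian p(x) = exp(-x^2 / 2).
p = lambda x: np.exp(-x**2 / 2)

# Gaussian integral over the whole real line; quad accepts infinite limits.
area, _ = integrate.quad(p, -np.inf, np.inf)

# The normalizing constant is the reciprocal of that area.
c = 1.0 / area
print(c, 1.0 / np.sqrt(2 * np.pi))  # both ~0.3989422804014327

# phi(x) = c * p(x) integrates to 1, i.e. it is the standard normal density.
total, _ = integrate.quad(lambda x: c * p(x), -np.inf, np.inf)
print(total)  # ~1.0
</syntaxhighlight>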
Similarly,
:\sum_{n=0}^{\infty} \frac{\lambda^n}{n!} = e^{\lambda},
and consequently
:f(n) = \frac{\lambda^n e^{-\lambda}}{n!}
is a probability mass function on the set of all nonnegative integers. This is the probability mass function of the Poisson distribution with expected value λ.
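The same kind of check works in the discrete case; the sketch below (Python standard library only, with the series truncated at an illustrative cutoff of 100 terms) verifies that the series sums to e^λ and that multiplying each term by the normalizing constant e^{−λ} yields a mass function summing to 1.

<syntaxhighlight lang="python">
import math

lam = 3.5  # arbitrary rate parameter, chosen only for illustration

# The series sum_{n>=0} lam^n / n! converges to e^lam; the terms are
# negligible well before n = 100 for this lam, so a truncated sum suffices.
partial = sum(lam**n / math.factorial(n) for n in range(100))
print(partial, math.exp(lam))  # agree to machine precision

# Multiplying each term by the normalizing constant e^{-lam} gives the
# Poisson probability mass function, which sums to 1.
pmf = [lam**n * math.exp(-lam) / math.factorial(n) for n in range(100)]
print(sum(pmf))  # ~1.0
</syntaxhighlight>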
Note that if the probability density function is a function of various parameters, so too will be its normalizing constant. The parametrised normalizing constant for the Boltzmann distribution plays a central role in statistical mechanics. In that context, the normalizing constant is called the partition function.
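For a finite state space, the role of the partition function is easy to see in code. The following sketch uses made-up energy levels and a unit system with the Boltzmann constant set to 1; it is an illustration of the normalization step, not a statement about any particular physical system.

<syntaxhighlight lang="python">
import math

# Hypothetical energy levels, in units where the Boltzmann constant k_B = 1,
# and an arbitrary temperature.
energies = [0.0, 1.0, 2.5, 4.0]
T = 1.5

# Unnormalized Boltzmann weights e^{-E_i / T}.
weights = [math.exp(-E / T) for E in energies]

# The partition function Z is precisely the normalizing constant.
Z = sum(weights)

# Dividing by Z turns the weights into a probability distribution over states.
probs = [w / Z for w in weights]
print(Z, sum(probs))  # sum(probs) == 1.0 up to rounding
</syntaxhighlight>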
Bayes' theorem
Bayes' theorem says that the posterior probability measure is proportional to the product of the prior probability measure and the likelihood function. ''Proportional to'' implies that one must multiply or divide by a normalizing constant to assign measure 1 to the whole space, i.e., to get a probability measure. In a simple discrete case we have
:P(H_0 \mid D) = \frac{P(H_0) \, P(D \mid H_0)}{P(D)},
where P(H₀) is the prior probability that the hypothesis is true; P(D | H₀) is the conditional probability of the data given that the hypothesis is true, but given that the data are known it is the likelihood of the hypothesis (or its parameters) given the data; P(H₀ | D) is the posterior probability that the hypothesis is true given the data. P(D) should be the probability of producing the data, but on its own is difficult to calculate, so an alternative way to describe this relationship is as one of proportionality:
:P(H_0 \mid D) \propto P(D \mid H_0) \, P(H_0).
Since P(H | D) is a probability, the sum over all possible (mutually exclusive) hypotheses should be 1, leading to the conclusion that
:P(H_0 \mid D) = \frac{P(D \mid H_0) \, P(H_0)}{\sum_i P(D \mid H_i) \, P(H_i)}.
In this case, the reciprocal of the value
:P(D) = \sum_i P(D \mid H_i) \, P(H_i)
is the ''normalizing constant''. It can be extended from countably many hypotheses to uncountably many by replacing the sum by an integral.
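The discrete case above translates directly into code. In this sketch the three hypotheses, their priors, and the likelihoods are all made-up numbers chosen only to show the normalization step.

<syntaxhighlight lang="python">
# Three mutually exclusive hypotheses with illustrative priors P(H_i)
# and likelihoods P(D | H_i) of the observed data under each.
priors      = [0.5, 0.3, 0.2]
likelihoods = [0.10, 0.40, 0.70]

# P(D) = sum_i P(D | H_i) P(H_i); its reciprocal is the normalizing constant.
p_data = sum(l * p for l, p in zip(likelihoods, priors))

# Posterior P(H_i | D) = P(D | H_i) P(H_i) / P(D).
posteriors = [l * p / p_data for l, p in zip(likelihoods, priors)]
print(posteriors, sum(posteriors))  # the posteriors sum to 1
</syntaxhighlight>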
In practice, there are many methods of estimating the normalizing constant; they include the bridge sampling technique, the naive Monte Carlo estimator, the generalized harmonic mean estimator, and importance sampling.
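As one concrete instance, here is a minimal importance-sampling sketch (Python with NumPy assumed) that estimates the normalizing integral of the unnormalized Gaussian from the earlier example, whose exact value is √(2π); the proposal distribution and sample size are arbitrary choices made for illustration.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target from the Gaussian example; its integral is sqrt(2*pi).
f = lambda x: np.exp(-x**2 / 2)

# Proposal: a wider zero-mean normal density q(x) that we can both sample
# from and evaluate pointwise.
sigma = 2.0
q = lambda x: np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Importance sampling: Z = E_q[f(X)/q(X)], estimated by a sample mean.
xs = rng.normal(0.0, sigma, size=100_000)
z_hat = np.mean(f(xs) / q(xs))
print(z_hat, np.sqrt(2 * np.pi))  # estimate vs. exact value
</syntaxhighlight>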
Non-probabilistic uses
The Legendre polynomials are characterized by orthogonality with respect to the uniform measure on the interval [−1, 1] and the fact that they are normalized so that their value at 1 is 1. The constant by which one multiplies a polynomial so that its value at 1 is 1 is a normalizing constant.
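As a small illustration (Python with NumPy and SciPy assumed), the polynomial 3x² − 1 is orthogonal to all lower-degree polynomials on [−1, 1] but takes the value 2 at x = 1; dividing by that value is exactly the normalization described above, and it recovers the Legendre polynomial P₂.

<syntaxhighlight lang="python">
from numpy.polynomial import Polynomial
from scipy import integrate

# 3x^2 - 1 is orthogonal to 1 and x on [-1, 1], but its value at x = 1 is 2.
p = Polynomial([-1, 0, 3])

# Dividing by p(1) is the normalization step: the result is the Legendre
# polynomial P_2(x) = (3x^2 - 1)/2, which satisfies P_2(1) = 1.
P2 = p / p(1)
print(P2(1.0))  # 1.0

# The rescaling does not affect orthogonality to lower-degree polynomials.
print(integrate.quad(lambda x: P2(x) * 1, -1, 1)[0])  # ~0: orthogonal to 1
print(integrate.quad(lambda x: P2(x) * x, -1, 1)[0])  # ~0: orthogonal to x
</syntaxhighlight>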
Orthonormal functions are normalized such that
:\langle f_i , f_j \rangle = \delta_{i,j}
with respect to some inner product ⟨f, g⟩.
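For instance, under the L² inner product on [0, 1] (an illustrative choice; any inner product works the same way), a function is normalized by dividing it by the square root of its inner product with itself, as in this sketch:

<syntaxhighlight lang="python">
import numpy as np
from scipy import integrate

# Inner product <f, g> = integral of f(x) g(x) over [0, 1] (illustrative).
f = lambda x: x

norm_sq, _ = integrate.quad(lambda x: f(x) * f(x), 0.0, 1.0)  # <f, f> = 1/3

# Dividing by sqrt(<f, f>) is the normalization; the result has unit norm.
g = lambda x: f(x) / np.sqrt(norm_sq)
unit, _ = integrate.quad(lambda x: g(x) * g(x), 0.0, 1.0)
print(unit)  # ~1.0
</syntaxhighlight>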
The constant 1/√2 is used to establish the hyperbolic functions cosh and sinh from the lengths of the adjacent and opposite sides of a hyperbolic triangle.
See also
* Normalization (statistics)