In information theory, the binary entropy function, denoted H(p) or H_b(p), is defined as the entropy of a Bernoulli process with probability p of one of two values. It is a special case of H(X), the entropy function. Mathematically, the Bernoulli trial is modelled as a random variable X that can take on only two values: 0 and 1, which are mutually exclusive and exhaustive.
If Pr(X = 1) = p, then Pr(X = 0) = 1 − p, and the entropy of X (in shannons) is given by

:\operatorname{H}_\text{b}(p) = -p \log_2 p - (1 - p) \log_2 (1 - p),

where 0 log_2 0 is taken to be 0. The logarithms in this formula are usually taken (as shown in the graph) to the base 2. See ''binary logarithm''.
When p = 1/2, the binary entropy function attains its maximum value. This is the case of an unbiased coin flip.
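
As an illustration, the definition above can be written as a short Python sketch; the function name binary_entropy is chosen here for clarity and is not standard:

import math

def binary_entropy(p: float) -> float:
    """Binary entropy H_b(p) in shannons (bits), with 0*log2(0) taken as 0."""
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))  # 1.0, the maximum, attained at p = 1/2
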
H_b(p) is distinguished from the entropy function H(X) in that the former takes a single real number as a parameter whereas the latter takes a distribution or random variable as a parameter.
Sometimes the binary entropy function is also written as H_2(p). However, it is different from and should not be confused with the Rényi entropy, which is denoted as H_2(X).
Explanation
In terms of information theory, ''entropy'' is considered to be a measure of the uncertainty in a message. To put it intuitively, suppose p = 0. At this probability, the event is certain never to occur, and so there is no uncertainty at all, leading to an entropy of 0. If p = 1, the result is again certain, so the entropy is 0 here as well. When p = 1/2, the uncertainty is at a maximum; if one were to place a fair bet on the outcome in this case, there is no advantage to be gained with prior knowledge of the probabilities. In this case, the entropy is maximum at a value of 1 bit. Intermediate values fall between these cases; for instance, if p = 1/4, there is still a measure of uncertainty on the outcome, but one can still predict the outcome correctly more often than not, so the uncertainty measure, or entropy, is less than 1 full bit.
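
For example, evaluating the definition at p = 1/4 gives

:\operatorname{H}_\text{b}(\tfrac{1}{4}) = -\tfrac{1}{4} \log_2 \tfrac{1}{4} - \tfrac{3}{4} \log_2 \tfrac{3}{4} = \tfrac{1}{2} + \tfrac{3}{4} \log_2 \tfrac{4}{3} \approx 0.811\ \text{bits}.
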
Derivative
The derivative of the binary entropy function may be expressed as the negative of the logit function:

:\frac{d}{dp} \operatorname{H}_\text{b}(p) = -\operatorname{logit}_2(p) = -\log_2\left(\frac{p}{1-p}\right).
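
As a quick numerical sketch (an illustration, not part of the usual presentation), this can be checked against a central finite difference:

import math

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p, h = 0.3, 1e-6
finite_diff = (binary_entropy(p + h) - binary_entropy(p - h)) / (2 * h)
neg_logit2 = -math.log2(p / (1 - p))
print(finite_diff, neg_logit2)  # both approximately 1.2224
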
Taylor series
The Taylor series of the binary entropy function in a neighborhood of 1/2 is

:\operatorname{H}_\text{b}(p) = 1 - \frac{1}{2 \ln 2} \sum_{n=1}^{\infty} \frac{(1-2p)^{2n}}{n(2n-1)}

for 0 \le p \le 1.
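
A brief sketch (our own illustration) comparing a truncated partial sum of this series with the exact value:

import math

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def taylor_approx(p, terms=10):
    # Partial sum of the series expanded around p = 1/2
    s = sum((1 - 2 * p) ** (2 * n) / (n * (2 * n - 1)) for n in range(1, terms + 1))
    return 1 - s / (2 * math.log(2))

p = 0.3
print(binary_entropy(p), taylor_approx(p))  # ~0.8813 for both
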
Bounds
The following bounds hold for 0 < p < 1:

:\ln(2) \cdot \log_2(p) \cdot \log_2(1-p) \le \operatorname{H}_\text{b}(p) \le \log_2(p) \cdot \log_2(1-p)

and

:4p(1-p) \le \operatorname{H}_\text{b}(p) \le \left(4p(1-p)\right)^{1/\ln 4},

where ln denotes the natural logarithm.
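
A small numerical sketch (our own check, assuming the bounds as stated above) verifying them on a grid of probabilities:

import math

def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for i in range(1, 100):
    p = i / 100
    h = binary_entropy(p)
    log_product = math.log2(p) * math.log2(1 - p)
    assert math.log(2) * log_product <= h <= log_product
    assert 4 * p * (1 - p) <= h <= (4 * p * (1 - p)) ** (1 / math.log(4))
print("bounds hold on the sampled grid")
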
See also
* Metric entropy
* Information theory
* Information entropy
* Quantities of information
References
Further reading
* MacKay, David J. C. ''Information Theory, Inference, and Learning Algorithms''. Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1.