In the mathematical theory of probability, the entropy rate or source information rate of a stochastic process is, informally, the time density of the average information in a stochastic process. For stochastic processes with a countable index, the entropy rate H(X) is the limit of the joint entropy of n members of the process divided by n, as n tends to infinity:
:H(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n)
when the limit exists. An alternative, related quantity is:
:H'(X) = \lim_{n \to \infty} H(X_n \mid X_{n-1}, X_{n-2}, \ldots, X_1)
For strongly stationary stochastic processes, H(X) = H'(X). The entropy rate can be thought of as a general property of stochastic sources; this is the asymptotic equipartition property. The entropy rate may be used to estimate the complexity of stochastic processes. It is used in diverse applications, ranging from characterizing the complexity of languages and blind source separation to optimizing quantizers and data compression algorithms. For example, a maximum entropy rate criterion may be used for feature selection in machine learning.
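As an illustrative numerical sketch (not from the source; the function name is hypothetical), the definition above can be checked empirically: the plug-in estimate of H(X_1, \ldots, X_n)/n computed from length-n blocks of a sample path should approach the entropy rate as n grows. The snippet below does this for a fair-coin i.i.d. source, whose entropy rate is 1 bit per symbol.
<syntaxhighlight lang="python">
import numpy as np
from collections import Counter

def block_entropy_rate(seq, n):
    # Plug-in estimate of H(X_1, ..., X_n)/n from the empirical distribution
    # of overlapping length-n blocks of the observed sequence.
    blocks = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = sum(blocks.values())
    probs = np.array([c / total for c in blocks.values()])
    return float(-np.sum(probs * np.log2(probs)) / n)

rng = np.random.default_rng(0)
seq = rng.integers(0, 2, size=100_000)   # fair-coin i.i.d. source; true rate = 1 bit/symbol
for n in (1, 2, 4, 8):
    print(n, round(block_entropy_rate(seq, n), 4))   # all values close to 1.0
</syntaxhighlight>
For a source with memory, the estimates decrease toward the true entropy rate as the block length n increases, which is exactly the limiting behaviour in the definition.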
Entropy rates for Markov chains
Since a stochastic process defined by a Markov chain that is irreducible, aperiodic and positive recurrent has a stationary distribution, the entropy rate is independent of the initial distribution.
For example, for such a Markov chain Y_k defined on a countable number of states, given the transition matrix P_{ij}, the entropy rate H(Y) is given by:
:H(Y) = -\sum_{ij} \mu_i P_{ij} \log P_{ij}
where \mu_i is the asymptotic distribution of the chain.
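A minimal sketch of this formula (not from the source; the helper name is hypothetical): compute the stationary distribution \mu as the left eigenvector of the transition matrix for eigenvalue 1, then evaluate the double sum.
<syntaxhighlight lang="python">
import numpy as np

def markov_entropy_rate(P):
    # Entropy rate (bits/step) of an irreducible, aperiodic Markov chain with
    # row-stochastic transition matrix P:  H(Y) = -sum_i mu_i sum_j P_ij log2 P_ij
    vals, vecs = np.linalg.eig(P.T)                       # mu solves mu P = mu
    mu = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    mu = mu / mu.sum()                                    # normalize to a probability vector
    with np.errstate(divide="ignore"):
        logP = np.where(P > 0, np.log2(P), 0.0)           # treat 0 * log 0 as 0
    return float(-np.sum(mu[:, None] * P * logP))

# Two-state example: stays in the current state with probability 0.9.
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
print(markov_entropy_rate(P))   # ~0.469 bits/step, the binary entropy of 0.1
</syntaxhighlight>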
A simple consequence of this definition is that an
i.i.d. stochastic process has an entropy rate that is the same as the
entropy
of any individual member of the process.
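This follows because the joint entropy of n independent, identically distributed variables is the sum of n identical single-variable entropies (a one-line check, not spelled out in the source):
:H(X) = \lim_{n \to \infty} \frac{1}{n} H(X_1, \ldots, X_n) = \lim_{n \to \infty} \frac{1}{n}\, n H(X_1) = H(X_1).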
See also
* Information source (mathematics)
* Markov information source
* Asymptotic equipartition property
* Maximal entropy random walk - chosen to maximize entropy rate
References
* Cover, T. and Thomas, J. (1991) Elements of Information Theory, John Wiley and Sons, Inc., ISBN 0-471-06259-6