The data processing inequality is an
information theoretic concept which states that the information content of a signal cannot be increased via a local physical operation. This can be expressed concisely as 'post-processing cannot increase information'.
Definition
Let three random variables form the
Markov chain
A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happen ...
, implying that the conditional distribution of
depends only on
and is
conditionally independent
In probability theory, conditional independence describes situations wherein an observation is irrelevant or redundant when evaluating the certainty of a hypothesis. Conditional independence is usually formulated in terms of conditional probabil ...
of
. Specifically, we have such a Markov chain if the joint probability mass function can be written as
:
In this setting, no processing of
, deterministic or random, can increase the information that
contains about
. Using the
mutual information
In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual dependence between the two variables. More specifically, it quantifies the " amount of information" (in units such as ...
, this can be written as :
:
With the equality
if and only if
, i.e.
and
contain the same information about
, and
also forms a Markov chain.
Proof
One can apply the
chain rule for mutual information to obtain two different decompositions of
:
:
By the relationship
, we know that
and
are conditionally independent, given
, which means the
conditional mutual information,
. The data processing inequality then follows from the non-negativity of
.
See also
*
Garbage in, garbage out
In computer science, garbage in, garbage out (GIGO) is the concept that flawed, or nonsense (garbage) input data produces nonsense output. Rubbish in, rubbish out (RIRO) is an alternate wording.
The principle applies to all logical argumentatio ...
References
External links
*http://www.scholarpedia.org/article/Mutual_information
Data processing
{{Comp-sci-stub