In
probability theory
Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set ...
, a pairwise independent collection of
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
s is a set of random variables any two of which are
independent
Independent or Independents may refer to:
Arts, entertainment, and media Artist groups
* Independents (artist group), a group of modernist painters based in the New Hope, Pennsylvania, area of the United States during the early 1930s
* Independe ...
. Any collection of
mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent. Pairwise independent random variables with finite
variance
In probability theory and statistics, variance is the expectation of the squared deviation of a random variable from its population mean or sample mean. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbe ...
are
uncorrelated
In probability theory and statistics, two real-valued random variables, X, Y, are said to be uncorrelated if their covariance, \operatorname ,Y= \operatorname Y- \operatorname \operatorname /math>, is zero. If two variables are uncorrelated, ther ...
.
A pair of random variables ''X'' and ''Y'' are independent if and only if the random vector (''X'', ''Y'') with
joint
A joint or articulation (or articular surface) is the connection made between bones, ossicles, or other hard structures in the body which link an animal's skeletal system into a functional whole.Saladin, Ken. Anatomy & Physiology. 7th ed. McGraw- ...
cumulative distribution function (CDF)
satisfies
:
or equivalently, their joint density
satisfies
:
That is, the joint distribution is equal to the product of the marginal distributions.
Unless it is not clear in context, in practice the modifier "mutual" is usually dropped so that independence means mutual independence. A statement such as " ''X'', ''Y'', ''Z'' are independent random variables" means that ''X'', ''Y'', ''Z'' are mutually independent.
Example
Pairwise independence does not imply mutual independence, as shown by the following example attributed to S. Bernstein.
[ Remark 2.6.1, p. 120.]
Suppose ''X'' and ''Y'' are two independent tosses of a fair coin, where we designate 1 for heads and 0 for tails. Let the third random variable ''Z'' be equal to 1 if exactly one of those coin tosses resulted in "heads", and 0 otherwise (i.e.,
). Then jointly the triple (''X'', ''Y'', ''Z'') has the following
probability distribution
In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon ...
:
:
Here the
marginal probability distribution
In probability theory and statistics, the marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in the subset. It gives the probabilities of various values of the variables ...
s are identical:
and
The
bivariate distributions also agree:
where
Since each of the pairwise joint distributions equals the product of their respective marginal distributions, the variables are pairwise independent:
* ''X'' and ''Y'' are independent, and
* ''X'' and ''Z'' are independent, and
* ''Y'' and ''Z'' are independent.
However, ''X'', ''Y'', and ''Z'' are not
mutually independent, since
the left side equalling for example 1/4 for (''x'', ''y'', ''z'') = (0, 0, 0) while the right side equals 1/8 for (''x'', ''y'', ''z'') = (0, 0, 0). In fact, any of
is completely determined by the other two (any of ''X'', ''Y'', ''Z'' is the
sum (modulo 2) of the others). That is as far from independence as random variables can get.
Probability of the union of pairwise independent events
Bounds on the
probability
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
that the sum of
Bernoulli random variables
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
is at least one, commonly known as the
union bound
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the indivi ...
, are provided by the
Boole–Fréchet[Boole, G. (1854). ''An Investigation of the Laws of Thought, On Which Are Founded the Mathematical Theories of Logic and Probability.'' Walton and Maberly, London. See Boole's "major" and "minor" limits of a conjunction on page 299.][Fréchet, M. (1935). Généralisations du théorème des probabilités totales. ''Fundamenta Mathematicae'' 25: 379–387.] inequalities. While these bounds assume only
univariate
In mathematics, a univariate object is an expression, equation, function or polynomial involving only one variable. Objects involving more than one variable are multivariate. In some cases the distinction between the univariate and multivariate ...
information, several bounds with knowledge of general
bivariate probabilities, have been proposed too. Denote by
which subtracts the maximum weight of a
star
A star is an astronomical object comprising a luminous spheroid of plasma (physics), plasma held together by its gravity. The List of nearest stars and brown dwarfs, nearest star to Earth is the Sun. Many other stars are visible to the naked ...
spanning tree
In the mathematical field of graph theory, a spanning tree ''T'' of an undirected graph ''G'' is a subgraph that is a tree which includes all of the vertices of ''G''. In general, a graph may have several spanning trees, but a graph that is ...
on a
complete graph
In the mathematical field of graph theory, a complete graph is a simple undirected graph in which every pair of distinct vertices is connected by a unique edge. A complete digraph is a directed graph in which every pair of distinct vertices is ...
with
nodes (where the edge weights are given by
) from the sum of the
marginal
Marginal may refer to:
* ''Marginal'' (album), the third album of the Belgian rock band Dead Man Ray, released in 2001
* ''Marginal'' (manga)
* '' El Marginal'', Argentine TV series
* Marginal seat or marginal constituency or marginal, in polit ...
probabilities
.
Hunter-Worsley
tightened this
upper bound by optimizing over
as follows:
::
where
is the set of all
spanning trees on the graph. These bounds are not the
tightest possible with general
bivariates even when
feasibility is guaranteed as shown in Boros et.al.
However, when the variables are
pairwise independent (
), Ramachandra-Natarajan
showed that the Kounias-Hunter-Worsley
bound is
tight
Tight may refer to:
Clothing
* Skin-tight garment, a garment that is held to the skin by elastic tension
* Tights, a type of leg coverings fabric extending from the waist to feet
* Tightlacing, the practice of wearing a tightly-laced corset
...
by proving that the maximum probability of the union of events admits a
closed-form expression
In mathematics, a closed-form expression is a mathematical expression that uses a finite number of standard operations. It may contain constants, variables, certain well-known operations (e.g., + − × ÷), and functions (e.g., ''n''th r ...
given as:
where the
probabilities
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
are sorted in increasing order as
. It is interesting to note that the
tight
Tight may refer to:
Clothing
* Skin-tight garment, a garment that is held to the skin by elastic tension
* Tights, a type of leg coverings fabric extending from the waist to feet
* Tightlacing, the practice of wearing a tightly-laced corset
...
bound in depends only on the sum of the smallest
probabilities
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
and the largest probability
. Thus, while
ordering
Order, ORDER or Orders may refer to:
* Categorization, the process in which ideas and objects are recognized, differentiated, and understood
* Heterarchy, a system of organization wherein the elements have the potential to be ranked a number of ...
of the
probabilities
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
plays a role in the derivation of the bound, the
ordering
Order, ORDER or Orders may refer to:
* Categorization, the process in which ideas and objects are recognized, differentiated, and understood
* Heterarchy, a system of organization wherein the elements have the potential to be ranked a number of ...
among the smallest
probabilities
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
is inconsequential since only their sum is used.
Comparison with the Boole–Fréchet
union bound
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the indivi ...
It is useful to compare the smallest bounds on the probability of the union with arbitrary
dependence and
pairwise independence respectively. The
tightest Boole–Fréchet upper union bound
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the indivi ...
(assuming only
univariate
In mathematics, a univariate object is an expression, equation, function or polynomial involving only one variable. Objects involving more than one variable are multivariate. In some cases the distinction between the univariate and multivariate ...
information) is given as:
As shown in Ramachandra-Natarajan,
it can be easily verified that the ratio of the two
tight
Tight may refer to:
Clothing
* Skin-tight garment, a garment that is held to the skin by elastic tension
* Tights, a type of leg coverings fabric extending from the waist to feet
* Tightlacing, the practice of wearing a tightly-laced corset
...
bounds in and is
upper bounded by
where the maximum value of
is attained when
::
,
where the
probabilities
Probability is the branch of mathematics concerning numerical descriptions of how likely an event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and 1, where, roughly speaking, ...
are sorted in increasing order as
. In other words, in the best-case scenario, the pairwise independence bound in provides an improvement of
over the
univariate
In mathematics, a univariate object is an expression, equation, function or polynomial involving only one variable. Objects involving more than one variable are multivariate. In some cases the distinction between the univariate and multivariate ...
bound in .
Generalization
More generally, we can talk about ''k''-wise independence, for any ''k'' ≥ 2. The idea is similar: a set of
random variable
A random variable (also called random quantity, aleatory variable, or stochastic variable) is a mathematical formalization of a quantity or object which depends on random events. It is a mapping or a function from possible outcomes (e.g., the po ...
s is ''k''-wise independent if every subset of size ''k'' of those variables is independent. ''k''-wise independence has been used in theoretical computer science, where it was used to prove a theorem about the problem
MAXEkSAT
MAXEkSAT is a problem in computational complexity theory that is a maximization version of the Boolean satisfiability problem 3SAT. In MAXEkSAT, each clause has exactly ''k'' literals, each with distinct variables, and is in conjunctive normal form ...
.
''k''-wise independence is used in the proof that
k-independent hashing functions are secure unforgeable
message authentication code
In cryptography, a message authentication code (MAC), sometimes known as a ''tag'', is a short piece of information used for authenticating a message. In other words, to confirm that the message came from the stated sender (its authenticity) and ...
s.
See also
*
Pairwise
Pairwise generally means "occurring in pairs" or "two at a time."
Pairwise may also refer to:
* Pairwise disjoint
* Pairwise independence of random variables
* Pairwise comparison, the process of comparing two entities to determine which is prefer ...
*
Pairwise disjoint
References
{{reflist
Theory of probability distributions
Independence (probability theory)