In probability theory, a pairwise independent collection of random variables is a set of random variables any two of which are independent. Any collection of mutually independent random variables is pairwise independent, but some pairwise independent collections are not mutually independent. Pairwise independent random variables with finite variance are uncorrelated.

A pair of random variables ''X'' and ''Y'' are independent if and only if the random vector (''X'', ''Y'') with joint cumulative distribution function (CDF) F_{X,Y}(x,y) satisfies
:F_{X,Y}(x,y) = F_X(x) F_Y(y),
or equivalently, their joint density f_{X,Y}(x,y) satisfies
:f_{X,Y}(x,y) = f_X(x) f_Y(y).
That is, the joint distribution is equal to the product of the marginal distributions.

Unless context indicates otherwise, in practice the modifier "mutual" is usually dropped, so that ''independence'' means mutual independence. A statement such as "''X'', ''Y'', ''Z'' are independent random variables" means that ''X'', ''Y'', ''Z'' are mutually independent.
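As a concrete illustration of the factorization criterion, the following sketch checks the product rule f_{X,Y}(x,y) = f_X(x) f_Y(y) for two independent fair coin tosses; it is a minimal Python example of our own, not code from any source cited here.

<syntaxhighlight lang="python">
from itertools import product

# Joint pmf of two independent fair coin tosses (0 = tails, 1 = heads).
joint = {(x, y): 0.25 for x, y in product([0, 1], repeat=2)}

# Marginal pmfs obtained by summing the joint pmf over the other variable.
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in [0, 1]}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in [0, 1]}

# Independence holds iff the joint pmf factors into the product of marginals.
print(all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-12 for x, y in joint))  # True
</syntaxhighlight>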


Example

Pairwise independence does not imply mutual independence, as shown by the following example attributed to S. Bernstein (Remark 2.6.1, p. 120).

Suppose ''X'' and ''Y'' are two independent tosses of a fair coin, where we designate 1 for heads and 0 for tails. Let the third random variable ''Z'' be equal to 1 if exactly one of those coin tosses resulted in "heads", and 0 otherwise (i.e., Z = X \oplus Y). Then jointly the triple (''X'', ''Y'', ''Z'') has the following probability distribution:
:(X,Y,Z)=\left\{\begin{matrix} (0,0,0) & \text{with probability}\ 1/4, \\ (0,1,1) & \text{with probability}\ 1/4, \\ (1,0,1) & \text{with probability}\ 1/4, \\ (1,1,0) & \text{with probability}\ 1/4. \end{matrix}\right.

Here the marginal probability distributions are identical: f_X(0)=f_Y(0)=f_Z(0)=1/2 and f_X(1)=f_Y(1)=f_Z(1)=1/2. The bivariate distributions also agree: f_{X,Y}=f_{X,Z}=f_{Y,Z}, where f_{X,Y}(0,0)=f_{X,Y}(0,1)=f_{X,Y}(1,0)=f_{X,Y}(1,1)=1/4. Since each of the pairwise joint distributions equals the product of its respective marginal distributions, the variables are pairwise independent:
* ''X'' and ''Y'' are independent, and
* ''X'' and ''Z'' are independent, and
* ''Y'' and ''Z'' are independent.

However, ''X'', ''Y'', and ''Z'' are not mutually independent, since f_{X,Y,Z}(x,y,z) \neq f_X(x)f_Y(y)f_Z(z); for example, the left side equals 1/4 at (''x'', ''y'', ''z'') = (0, 0, 0) while the right side equals 1/8. In fact, any one of \{X,Y,Z\} is completely determined by the other two (each of ''X'', ''Y'', ''Z'' is the sum, modulo 2, of the other two). That is as far from independence as random variables can get.
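The following sketch makes this check explicit by enumerating the joint distribution of Bernstein's example; the helper function and variable names are ours, chosen only for illustration.

<syntaxhighlight lang="python">
from itertools import product

# Joint pmf of Bernstein's example: (X, Y, Z) with Z = X XOR Y.
joint = {(x, y, x ^ y): 0.25 for x, y in product([0, 1], repeat=2)}

def marginal(axes):
    """Marginal pmf of the variables indexed by `axes` (a subset of {0, 1, 2})."""
    pmf = {}
    for outcome, p in joint.items():
        key = tuple(outcome[a] for a in axes)
        pmf[key] = pmf.get(key, 0.0) + p
    return pmf

singles = [marginal((i,)) for i in range(3)]

# Pairwise independence: every bivariate pmf factors into its two marginals.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    pair = marginal((i, j))
    assert all(abs(pair[(a, b)] - singles[i][(a,)] * singles[j][(b,)]) < 1e-12
               for a, b in pair)

# Mutual independence fails: the trivariate pmf does not factor.
left = joint[(0, 0, 0)]                                         # 1/4
right = singles[0][(0,)] * singles[1][(0,)] * singles[2][(0,)]  # 1/8
print(left, right)  # 0.25 0.125
</syntaxhighlight>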


Probability of the union of pairwise independent events

Bounds on the probability that the sum of Bernoulli random variables is at least one, commonly known as the union bound, are provided by the Boole–Fréchet inequalities (Boole 1854; Fréchet 1935). While these bounds assume only univariate information, several bounds that use knowledge of general bivariate probabilities have been proposed as well. Denote by A_1, A_2, \ldots, A_n a collection of n Bernoulli events with probabilities of occurrence \mathbb{P}(A_i) = p_i for each i and bivariate probabilities \mathbb{P}(A_i \cap A_j) = p_{ij} for every pair of indices (i,j). Kounias derived the upper bound
:: \mathbb{P}(\displaystyle {\cup}_i A_{i}) \leq \displaystyle \sum_{i=1}^n p_{i}-\underset{j}{\max}\sum_{i\neq j} p_{ij},
which subtracts the maximum weight of a star spanning tree on a complete graph with n nodes (where the edge weights are given by p_{ij}) from the sum of the marginal probabilities \sum_i p_i.
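As a small numerical sketch of our own (not from the cited sources), the Kounias bound can be computed directly from the marginals p_i and the bivariate probabilities p_{ij}; here the events are taken to be pairwise independent, so p_{ij} = p_i p_j.

<syntaxhighlight lang="python">
def kounias_bound(p, pij):
    """Sum of the marginals minus the weight of the heaviest star,
    i.e. the maximum over centers j of sum_{i != j} p_ij."""
    n = len(p)
    heaviest_star = max(sum(pij[i][j] for i in range(n) if i != j)
                        for j in range(n))
    return sum(p) - heaviest_star

# Example with pairwise independent events: p_ij = p_i * p_j.
p = [0.1, 0.2, 0.3, 0.4]
pij = [[p[i] * p[j] for j in range(len(p))] for i in range(len(p))]
print(kounias_bound(p, pij))  # approximately 0.76 = 1.0 - 0.4*(0.1 + 0.2 + 0.3)
</syntaxhighlight>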
Hunter and Worsley tightened this upper bound by optimizing over all spanning trees \tau \in T as follows:
:: \mathbb{P}(\displaystyle {\cup}_i A_{i}) \leq \displaystyle \sum_{i=1}^n p_{i}-\underset {\tau \in T}{\max}\sum_{(i,j) \in \tau} p_{ij},
where T is the set of all spanning trees on the graph. These bounds are not the tightest possible with general bivariates p_{ij}, even when feasibility is guaranteed, as shown in Boros et al. However, when the variables are pairwise independent (p_{ij}=p_ip_j), Ramachandra and Natarajan showed that the Kounias-Hunter-Worsley bound is tight, by proving that the maximum probability of the union of events admits the closed-form expression
:: \mathbb{P}(\displaystyle {\cup}_i A_{i}) \leq \min \left[\sum_{i=1}^{n-1} p_{i}\left(1-p_{n}\right)+p_{n}, ~1\right], \qquad (1)
where the probabilities are sorted in increasing order as 0 \leq p_{1} \leq p_{2} \leq \ldots \leq p_{n} \leq 1. The tight bound in (1) depends only on the sum of the smallest n-1 probabilities, \sum_{i=1}^{n-1} p_{i}, and the largest probability p_n. Thus, while the ordering of the probabilities plays a role in the derivation of the bound, the ordering among the smallest n-1 probabilities \{p_1,p_2,...,p_{n-1}\} is inconsequential, since only their sum is used.
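The following sketch (again our own, using a plain implementation of Prim's algorithm rather than any particular graph library) computes the Hunter-Worsley bound for pairwise independent events and checks that it matches the closed form in (1).

<syntaxhighlight lang="python">
def hunter_worsley_bound(p, pij):
    """Sum of marginals minus the weight of a maximum spanning tree on the
    complete graph whose edge (i, j) carries weight p_ij (Prim's algorithm)."""
    n = len(p)
    in_tree = [False] * n
    in_tree[0] = True
    best = [pij[0][j] if j else 0.0 for j in range(n)]  # heaviest edge into tree
    tree_weight = 0.0
    for _ in range(n - 1):
        j = max((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        in_tree[j] = True
        tree_weight += best[j]
        for k in range(n):
            if not in_tree[k]:
                best[k] = max(best[k], pij[j][k])
    return sum(p) - tree_weight

def closed_form_bound(p):
    """Tight bound (1) for pairwise independent events."""
    q = sorted(p)
    return min(sum(q[:-1]) * (1 - q[-1]) + q[-1], 1.0)

p = [0.1, 0.2, 0.3, 0.4]
pij = [[p[i] * p[j] for j in range(len(p))] for i in range(len(p))]
print(hunter_worsley_bound(p, pij), closed_form_bound(p))  # both approximately 0.76
</syntaxhighlight>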


Comparison with the Boole–Fréchet union bound

It is useful to compare the smallest bounds on the probability of the union under arbitrary dependence and under pairwise independence, respectively. The tightest Boole–Fréchet upper union bound (assuming only univariate information) is given as
:: \mathbb{P}(\displaystyle {\cup}_i A_{i}) \leq \min \left[\sum_{i=1}^{n} p_{i}, ~1\right]. \qquad (2)
As shown by Ramachandra and Natarajan, it can be easily verified that the ratio of the two tight bounds in (2) and (1) is upper bounded by 4/3, where the maximum value of 4/3 is attained when
::\sum_{i=1}^{n-1} p_{i}=1/2, \quad p_n=1/2,
with the probabilities sorted in increasing order as 0 \leq p_{1} \leq p_{2} \leq \ldots \leq p_{n} \leq 1. In other words, in the best-case scenario, the pairwise independence bound in (1) provides an improvement of 25\% over the univariate bound in (2).
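This extremal case can be verified numerically; the short self-contained sketch below (ours, with an arbitrarily chosen n = 6) evaluates both bounds at that configuration and reproduces the 4/3 ratio.

<syntaxhighlight lang="python">
# Extremal configuration: the n-1 smallest probabilities sum to 1/2
# and the largest probability equals 1/2 (here n = 6).
p = [0.1] * 5 + [0.5]

boole_frechet = min(sum(p), 1.0)                        # bound (2): 1.0
q = sorted(p)
pairwise = min(sum(q[:-1]) * (1 - q[-1]) + q[-1], 1.0)  # bound (1): 0.75

print(boole_frechet / pairwise)  # 1.333..., i.e. the 4/3 ratio (a 25% improvement)
</syntaxhighlight>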


Generalization

More generally, we can talk about ''k''-wise independence, for any ''k'' ≥ 2. The idea is similar: a set of random variables is ''k''-wise independent if every subset of size ''k'' of those variables is independent. ''k''-wise independence has been used in theoretical computer science, where it was used to prove a theorem about the problem MAXEkSAT. ''k''-wise independence is also used in the proof that ''k''-independent hashing functions are secure unforgeable message authentication codes.
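As a concrete illustration, the standard hash family h_{a,b}(x) = (ax + b) \bmod p, with p prime and a, b drawn uniformly at random from \{0, \ldots, p-1\}, is pairwise (2-wise) independent, and a random polynomial of degree k-1 over the same field gives k-wise independence. The empirical check below is our own sketch of this standard construction and is not taken from this article.

<syntaxhighlight lang="python">
import random
from itertools import product

P = 7  # a small prime; hash values live in {0, ..., P-1}

def sample_hash():
    """Draw h(x) = (a*x + b) mod P with a, b uniform over Z_P."""
    a = random.randrange(P)
    b = random.randrange(P)
    return lambda x: (a * x + b) % P

# Pairwise independence means that for fixed x != y the pair (h(x), h(y))
# is uniform over Z_P x Z_P, so each of the P*P value pairs should appear
# with empirical frequency close to 1/P^2.
x, y = 2, 5
counts = {pair: 0 for pair in product(range(P), repeat=2)}
trials = 200_000
for _ in range(trials):
    h = sample_hash()
    counts[(h(x), h(y))] += 1

print(min(counts.values()) / trials, max(counts.values()) / trials)
# both close to 1/49, i.e. about 0.0204
</syntaxhighlight>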


See also

* Pairwise
* Disjoint sets


References

* Boole, G. (1854). ''An Investigation of the Laws of Thought, On Which Are Founded the Mathematical Theories of Logic and Probability''. Walton and Maberly, London. See Boole's "major" and "minor" limits of a conjunction on page 299.
* Fréchet, M. (1935). "Généralisations du théorème des probabilités totales". ''Fundamenta Mathematicae'' 25: 379–387.