In mathematics, especially in probability and combinatorics, a doubly stochastic matrix (also called bistochastic matrix) is a square matrix <math>X=(x_{ij})</math> of nonnegative real numbers, each of whose rows and columns sums to 1, i.e.,

: <math>\sum_i x_{ij}=\sum_j x_{ij}=1.</math>

Thus, a doubly stochastic matrix is both left stochastic and right stochastic. Indeed, any matrix that is both left and right stochastic must be square: if every row sums to one then the sum of all entries in the matrix must be equal to the number of rows, and since the same holds for columns, the number of rows and columns must be equal.
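The definition translates directly into a check. The following is a minimal sketch, assuming the matrix is given as a list of rows; the function name `is_doubly_stochastic` and the sample matrix are illustrative choices, not part of the article. Exact rational arithmetic is used so that the sums can be compared with 1 without rounding issues.

```python
from fractions import Fraction


def is_doubly_stochastic(matrix):
    """Return True if `matrix` is square, nonnegative, and every row and column sums to 1."""
    n = len(matrix)
    # The matrix must be square (and non-empty).
    if n == 0 or any(len(row) != n for row in matrix):
        return False
    # All entries must be nonnegative.
    if any(entry < 0 for row in matrix for entry in row):
        return False
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(matrix[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


# Illustrative 3x3 example (a circulant matrix with entries 1/2, 1/3, 1/6).
half, third, sixth = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)
sample = [
    [half, third, sixth],
    [third, sixth, half],
    [sixth, half, third],
]
print(is_doubly_stochastic(sample))
```

A matrix such as `[[1, 0], [1, 0]]` is rejected: its rows sum to 1, but its columns sum to 2 and 0, so it is right stochastic without being doubly stochastic.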

Birkhoff polytope

The class of <math>n\times n</math> doubly stochastic matrices is a convex polytope known as the Birkhoff polytope <math>B_n</math>. Using the matrix entries as Cartesian coordinates, it lies in an <math>(n-1)^2</math>-dimensional affine subspace of <math>n^2</math>-dimensional Euclidean space defined by <math>2n-1</math> independent linear constraints specifying that the row and column sums all equal one. (There are <math>2n-1</math> constraints rather than <math>2n</math> because one of these constraints is dependent, as the sum of the row sums must equal the sum of the column sums.) Moreover, the entries are all constrained to be non-negative and less than or equal to one.
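The constraint count can be checked numerically for small <math>n</math>. The sketch below, with the hypothetical helper names `constraint_matrix` and `rank`, builds the coefficient matrix of the <math>2n</math> row-sum and column-sum equations and computes its rank by Gaussian elimination over the rationals. Under these assumptions the rank should come out as <math>2n-1</math>, leaving <math>n^2-(2n-1)=(n-1)^2</math> free dimensions.

```python
from fractions import Fraction


def constraint_matrix(n):
    """Coefficient rows of the 2n sum constraints on the n*n entries (row-major order)."""
    rows = []
    for i in range(n):  # constraint: entries of matrix row i sum to 1
        rows.append([Fraction(1 if k // n == i else 0) for k in range(n * n)])
    for j in range(n):  # constraint: entries of matrix column j sum to 1
        rows.append([Fraction(1 if k % n == j else 0) for k in range(n * n)])
    return rows


def rank(rows):
    """Rank of a rational matrix, computed by Gaussian elimination."""
    m = [row[:] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a pivot in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c in every other row.
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


for n in range(1, 6):
    k = rank(constraint_matrix(n))
    print(f"n={n}: independent constraints={k}, dimension={n * n - k}")
```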

Birkhoff–von Neumann theorem

The Birkhoff–von Neumann theorem (often known simply as Birkhoff's theorem; see W. B. Jurkat and H. J. Ryser, "Term Ranks and Permanents of Nonnegative Matrices" (1967)) states that the polytope <math>B_n</math> is the convex hull of the set of <math>n\times n</math> permutation matrices, and furthermore that the vertices of <math>B_n</math> are precisely the permutation matrices. In other words, if <math>X</math> is a doubly stochastic matrix, then there exist <math>\theta_1,\ldots,\theta_k \ge 0</math> with <math>\sum_{i=1}^k \theta_i = 1</math> and permutation matrices <math>P_1,\ldots,P_k</math> such that

: <math>X = \theta_1 P_1 + \cdots + \theta_k P_k.</math>

(An expression of this form is known as a 'convex combination' of the <math>P_i</math>.) A proof of the theorem based on Hall's marriage theorem is given below. This representation is known as the Birkhoff–von Neumann decomposition, and may not be unique. It is often described as a real-valued generalization of Kőnig's theorem, where the correspondence is established through adjacency matrices of graphs.
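That the decomposition need not be unique can be seen on a small case. The following sketch (helper names are illustrative) writes the <math>3\times 3</math> matrix with all entries <math>1/3</math> in two different ways: as the average of the three even permutation matrices, and as the average of the three odd ones.

```python
from fractions import Fraction
from itertools import permutations


def perm_matrix(p):
    """Permutation matrix with a 1 in position (i, p[i]) of each row i."""
    n = len(p)
    return [[1 if p[i] == j else 0 for j in range(n)] for i in range(n)]


def convex_combination(weights, matrices):
    """Entrywise weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(
        1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b]
    )
    return inversions % 2 == 0


perms = list(permutations(range(3)))
even = [perm_matrix(p) for p in perms if is_even(p)]
odd = [perm_matrix(p) for p in perms if not is_even(p)]

weights = [Fraction(1, 3)] * 3
uniform = [[Fraction(1, 3)] * 3 for _ in range(3)]

from_even = convex_combination(weights, even)
from_odd = convex_combination(weights, odd)
print(from_even == uniform, from_odd == uniform)  # True True
```

The two sets of permutation matrices are disjoint, so these are genuinely different convex combinations of the same matrix.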

Other properties

* The product of two doubly stochastic matrices is doubly stochastic. However, the inverse of a nonsingular doubly stochastic matrix need not be doubly stochastic (indeed, the inverse is doubly stochastic if and only if it has nonnegative entries).
* The stationary distribution of an irreducible aperiodic finite Markov chain is uniform if and only if its transition matrix is doubly stochastic.
* Sinkhorn's theorem states that any matrix with strictly positive entries can be made doubly stochastic by pre- and post-multiplication by diagonal matrices.
* For <math>n=2</math>, all bistochastic matrices are unistochastic and orthostochastic, but for larger <math>n</math> this is not the case.
* Van der Waerden's conjecture states that the minimum permanent among all <math>n\times n</math> doubly stochastic matrices is <math>n!/n^n</math>, achieved by the matrix for which all entries are equal to <math>1/n</math>. Proofs of this conjecture were published in 1980 by B. Gyires and in 1981 by G. P. Egorychev and D. I. Falikman; for this work, Egorychev and Falikman won the Fulkerson Prize in 1982 (Fulkerson Prize, Mathematical Optimization Society, retrieved 2012-08-19).
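Sinkhorn's theorem asserts only that suitable diagonal scalings exist. One common way to approximate them, assumed here rather than taken from the text above, is to alternately rescale rows and columns until the row and column sums are close to one. The sketch below applies this alternating normalization to a square matrix with strictly positive entries; the function name `sinkhorn_scale`, the iteration cap and the tolerance are illustrative choices.

```python
def sinkhorn_scale(a, iterations=1000, tol=1e-12):
    """Approximate positive vectors r, c such that diag(r) * a * diag(c) is doubly stochastic.

    Assumes `a` is a square matrix (list of rows) with strictly positive entries.
    Returns (r, c, scaled), where scaled[i][j] = r[i] * a[i][j] * c[j].
    """
    n = len(a)
    r = [1.0] * n
    c = [1.0] * n
    scaled = [[float(v) for v in row] for row in a]
    for _ in range(iterations):
        # Choose r so that every row of diag(r) * a * diag(c) sums to 1.
        r = [1.0 / sum(a[i][j] * c[j] for j in range(n)) for i in range(n)]
        # Choose c so that every column of diag(r) * a * diag(c) sums to 1.
        c = [1.0 / sum(r[i] * a[i][j] for i in range(n)) for j in range(n)]
        scaled = [[r[i] * a[i][j] * c[j] for j in range(n)] for i in range(n)]
        # Columns are exact after the last step; stop once the rows agree too.
        if all(abs(sum(row) - 1.0) < tol for row in scaled):
            break
    return r, c, scaled


r, c, scaled = sinkhorn_scale([[1.0, 2.0], [3.0, 4.0]])
for row in scaled:
    print(row)
```

The result is only approximately doubly stochastic, to within floating-point accuracy and the chosen tolerance.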

Proof of the Birkhoff–von Neumann theorem

Let ''X'' be a doubly stochastic matrix. We will show that there exists a permutation matrix ''P'' such that <math>x_{ij} \ne 0</math> whenever <math>p_{ij} \ne 0</math>. Thus if we let λ be the smallest <math>x_{ij}</math> corresponding to a non-zero <math>p_{ij}</math>, the difference ''X'' − λ''P'' will be a scalar multiple of a doubly stochastic matrix and will have at least one more zero cell than ''X''. The existence argument applies unchanged to such a multiple, since all of its row and column sums are equal. Accordingly we may successively reduce the number of non-zero cells in ''X'' by removing scalar multiples of permutation matrices until we arrive at the zero matrix, at which point we will have constructed a convex combination of permutation matrices equal to the original ''X'' (Birkhoff's theorem, notes by Gábor Hetyei).

For instance, if

: <math>X=\frac{1}{12}\begin{pmatrix} 7 & 0 & 5 \\ 2 & 6 & 4 \\ 3 & 6 & 3 \end{pmatrix}</math>

then

: <math>P=\begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \quad \lambda = \frac{2}{12},</math>

and

: <math>X-\lambda P=\frac{1}{12}\begin{pmatrix} 7 & 0 & 3 \\ 0 & 6 & 4 \\ 3 & 4 & 3 \end{pmatrix}.</math>

''Proof:'' Construct a bipartite graph in which the rows of ''X'' are listed in one part and the columns in the other, and in which row ''i'' is connected to column ''j'' if and only if <math>x_{ij} \ne 0</math>. Let ''A'' be any set of rows, and define A' as the set of columns joined to rows in ''A'' in the graph. We want to express the sizes |''A''| and |A'| of the two sets in terms of the <math>x_{ij}</math>.

For every ''i'' in ''A'', the sum over ''j'' in A' of <math>x_{ij}</math> is 1, since all columns ''j'' for which <math>x_{ij} \ne 0</math> are included in A', and ''X'' is doubly stochastic; hence |''A''| is the sum over all ''i'' ∈ ''A'', ''j'' ∈ A' of <math>x_{ij}</math>. Meanwhile |A'| is the sum over all ''i'' (whether or not in ''A'') and all ''j'' in A' of <math>x_{ij}</math>, and this is ≥ the corresponding sum in which the ''i'' are limited to rows in ''A''. Hence |A'| ≥ |''A''|.

It follows that the conditions of Hall's marriage theorem are satisfied, and that we can therefore find a set of edges in the graph which join each row in ''X'' to exactly one (distinct) column. These edges define a permutation matrix whose non-zero cells correspond to non-zero cells in ''X''. ∎
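The reduction used in the proof can be turned into a procedure. The sketch below is one possible realization under stated assumptions, not a reference implementation: it finds a permutation inside the non-zero cells with a standard augmenting-path bipartite matching (often called Kuhn's algorithm, a choice made here for brevity), subtracts the largest admissible multiple, and repeats. Entries are exact fractions so that cells become exactly zero. The function names are hypothetical, and the input is an illustrative 3×3 matrix with entries in twelfths.

```python
from fractions import Fraction


def perfect_matching(support):
    """Match every row to a distinct column j with support[i][j] True.

    Returns a list `match` with match[i] = column of row i, or None if impossible.
    """
    n = len(support)
    col_owner = [None] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        # Try to place row i, re-routing previously matched rows if needed.
        for j in range(n):
            if support[i][j] and not seen[j]:
                seen[j] = True
                if col_owner[j] is None or try_row(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    match = [None] * n
    for j, i in enumerate(col_owner):
        match[i] = j
    return match


def birkhoff_decomposition(x):
    """Return a list of (weight, match) pairs whose weighted permutation matrices sum to x."""
    n = len(x)
    rest = [row[:] for row in x]
    terms = []
    while any(v != 0 for row in rest for v in row):
        match = perfect_matching([[v != 0 for v in row] for row in rest])
        if match is None:
            raise ValueError("input is not a multiple of a doubly stochastic matrix")
        # Smallest entry on the chosen permutation: the largest multiple we can remove.
        lam = min(rest[i][match[i]] for i in range(n))
        terms.append((lam, match))
        for i in range(n):
            rest[i][match[i]] -= lam  # at least one cell becomes exactly zero
    return terms


X = [[Fraction(a, 12) for a in row] for row in [[7, 0, 5], [2, 6, 4], [3, 6, 3]]]
for weight, match in birkhoff_decomposition(X):
    print(weight, match)
```

The permutations found depend on the order in which the matching explores columns, so the output may differ from any particular hand-worked decomposition; the weights always sum to one and the weighted permutation matrices always add up to the input.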

Generalisations

There is a simple generalisation to matrices with more columns than rows such that the ''i''th row sum is equal to <math>r_i</math> (a positive integer), the column sums are equal to 1, and all cells are non-negative (the sum of the row sums being equal to the number of columns). Any matrix in this form can be expressed as a convex combination of matrices in the same form made up of 0s and 1s. The proof is to replace the ''i''th row of the original matrix by <math>r_i</math> separate rows, each equal to the original row divided by <math>r_i</math>; to apply Birkhoff's theorem to the resulting square matrix; and at the end to additively recombine the <math>r_i</math> rows into a single ''i''th row.

In the same way it is possible to replicate columns as well as rows, but the result of recombination is not necessarily limited to 0s and 1s. A different generalisation (with a significantly harder proof) has been put forward by R. M. Caron et al. (R. M. Caron et al., 'Nonsquare "Doubly Stochastic" Matrices', 1996).
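The row-splitting step can be illustrated on a small case. The following sketch uses an illustrative 2×3 matrix with row sums 2 and 1 and column sums 1; the helper names `split_rows` and `recombine` are hypothetical. Splitting yields a 3×3 doubly stochastic matrix, and recombining the rows of any permutation matrix gives a 0/1 matrix of the original shape with the same row and column sums.

```python
from fractions import Fraction


def split_rows(matrix, row_sums):
    """Replace row i by row_sums[i] copies of (row i) / row_sums[i].

    Returns the enlarged matrix and, for each new row, the index of its original row.
    """
    square = []
    origin = []
    for i, (row, r) in enumerate(zip(matrix, row_sums)):
        for _ in range(r):
            square.append([Fraction(v) / r for v in row])
            origin.append(i)
    return square, origin


def recombine(square, origin, n_rows):
    """Add together the rows that came from the same original row."""
    n_cols = len(square[0])
    out = [[0] * n_cols for _ in range(n_rows)]
    for k, row in enumerate(square):
        for j, v in enumerate(row):
            out[origin[k]][j] += v
    return out


# Illustrative input: row sums (2, 1), every column sums to 1.
M = [
    [Fraction(1, 2), 1, Fraction(1, 2)],
    [Fraction(1, 2), 0, Fraction(1, 2)],
]
square, origin = split_rows(M, [2, 1])
print(square)  # a 3x3 matrix whose rows and columns all sum to 1

# Recombining a permutation matrix gives a 0/1 matrix with row sums (2, 1).
P = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
print(recombine(P, origin, 2))  # [[1, 1, 0], [0, 0, 1]]
```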

External links

PlanetMath page on Birkhoff–von Neumann theorem

PlanetMath page on proof of Birkhoff–von Neumann theorem