HOME

TheInfoList



OR:

In
mathematics Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
and economics, transportation theory or transport theory is a name given to the study of optimal
transport Transport (in British English) or transportation (in American English) is the intentional Motion, movement of humans, animals, and cargo, goods from one location to another. Mode of transport, Modes of transport include aviation, air, land tr ...
ation and allocation of resources. The problem was formalized by the French
mathematician A mathematician is someone who uses an extensive knowledge of mathematics in their work, typically to solve mathematical problems. Mathematicians are concerned with numbers, data, quantity, mathematical structure, structure, space, Mathematica ...
Gaspard Monge Gaspard Monge, Comte de Péluse (; 9 May 1746 – 28 July 1818) was a French mathematician, commonly presented as the inventor of descriptive geometry, (the mathematical basis of) technical drawing, and the father of differential geometry. Dur ...
in 1781.G. Monge. ''Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année'', pages 666–704, 1781. In the 1920s A.N. Tolstoi was one of the first to study the transportation problem mathematically. In 1930, in the collection ''Transportation Planning Volume I'' for the National Commissariat of Transportation of the Soviet Union, he published a paper "Methods of Finding the Minimal Kilometrage in Cargo-transportation in space". Major advances were made in the field during World War II by the
Soviet The Union of Soviet Socialist Republics. (USSR), commonly known as the Soviet Union, was a List of former transcontinental countries#Since 1700, transcontinental country that spanned much of Eurasia from 1922 until Dissolution of the Soviet ...
mathematician and economist Leonid Kantorovich.L. Kantorovich. ''On the translocation of masses.'' C.R. (Doklady) Acad. Sci. URSS (N.S.), 37:199–201, 1942. Consequently, the problem as it is stated is sometimes known as the Monge–Kantorovich transportation problem. The
linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...
formulation of the transportation problem is also known as the HitchcockKoopmans transportation problem.


Motivation


Mines and factories

Suppose that we have a collection of m mines mining iron ore, and a collection of n factories which use the iron ore that the mines produce. Suppose for the sake of argument that these mines and factories form two disjoint
subset In mathematics, a Set (mathematics), set ''A'' is a subset of a set ''B'' if all Element (mathematics), elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they a ...
s M and F of the
Euclidean plane In mathematics, a Euclidean plane is a Euclidean space of Two-dimensional space, dimension two, denoted \textbf^2 or \mathbb^2. It is a geometric space in which two real numbers are required to determine the position (geometry), position of eac ...
\mathbb^2. Suppose also that we have a ''cost function'' c : \mathbb^2 \times \mathbb^2 \to [0, \infty), so that c(x, y) is the cost of transporting one shipment of iron from x to y. For simplicity, we ignore the time taken to do the transporting. We also assume that each mine can supply only one factory (no splitting of shipments) and that each factory requires precisely one shipment to be in operation (factories cannot work at half- or double-capacity). Having made the above assumptions, a ''transport plan'' is a bijection T: M \to F. In other words, each mine m \in M supplies precisely one target factory T(m) \in F and each factory is supplied by precisely one mine. We wish to find the ''optimal transport plan'', the plan T whose ''total cost'' :c(T) := \sum_ c(m, T(m)) is the least of all possible transport plans from M to F. This motivating special case of the transportation problem is an instance of the assignment problem. More specifically, it is equivalent to finding a minimum weight matching in a bipartite graph.


Moving books: the importance of the cost function

The following simple example illustrates the importance of the cost function in determining the optimal transport plan. Suppose that we have n books of equal width on a shelf (the
real line A number line is a graphical representation of a straight line that serves as spatial representation of numbers, usually graduated like a ruler with a particular origin (geometry), origin point representing the number zero and evenly spaced mark ...
), arranged in a single contiguous block. We wish to rearrange them into another contiguous block, but shifted one book-width to the right. Two obvious candidates for the optimal transport plan present themselves: # move all n books one book-width to the right ("many small moves"); # move the left-most book n book-widths to the right and leave all other books fixed ("one big move"). If the cost function is proportional to Euclidean distance (c(x, y) = \alpha \, x - y\, for some \alpha > 0) then these two candidates are ''both'' optimal. If, on the other hand, we choose the strictly convex cost function proportional to the square of Euclidean distance (c(x, y) = \alpha \, x - y\, ^2 for some \alpha > 0), then the "many small moves" option becomes the unique minimizer. Note that the above cost functions consider only the horizontal distance traveled by the books, not the horizontal distance traveled by a device used to pick each book up and move the book into position. If the latter is considered instead, then, of the two transport plans, the second is always optimal for the Euclidean distance, while, provided there are at least 3 books, the first transport plan is optimal for the squared Euclidean distance.


Hitchcock problem

The following transportation problem formulation is credited to F. L. Hitchcock: :Suppose there are m sources x_1, \ldots, x_m for a commodity, with a(x_i) units of supply at x_i and n sinks y_1, \ldots, y_n for the commodity, with the demand b(y_j) at y_j. If c(x_i,\ y_j) is the unit cost of shipment from x_i to y_j, find a flow that satisfies demand from supplies and minimizes the flow cost. This challenge in logistics was taken up by D. R. Fulkerson and in the book ''Flows in Networks'' (1962) written with L. R. Ford Jr. Tjalling Koopmans is also credited with formulations of transport economics and allocation of resources.


Abstract formulation of the problem


Monge and Kantorovich formulations

The transportation problem as it is stated in modern or more technical literature looks somewhat different because of the development of
Riemannian geometry Riemannian geometry is the branch of differential geometry that studies Riemannian manifolds, defined as manifold, smooth manifolds with a ''Riemannian metric'' (an inner product on the tangent space at each point that varies smooth function, smo ...
and
measure theory In mathematics, the concept of a measure is a generalization and formalization of geometrical measures (length, area, volume) and other common notions, such as magnitude (mathematics), magnitude, mass, and probability of events. These seemingl ...
. The mines-factories example, simple as it is, is a useful reference point when thinking of the abstract case. In this setting, we allow the possibility that we may not wish to keep all mines and factories open for business, and allow mines to supply more than one factory, and factories to accept iron from more than one mine. Let X and Y be two separable
metric space In mathematics, a metric space is a Set (mathematics), set together with a notion of ''distance'' between its Element (mathematics), elements, usually called point (geometry), points. The distance is measured by a function (mathematics), functi ...
s such that any
probability measure In mathematics, a probability measure is a real-valued function defined on a set of events in a σ-algebra that satisfies Measure (mathematics), measure properties such as ''countable additivity''. The difference between a probability measure an ...
on X (or Y) is a Radon measure (i.e. they are Radon spaces). Let c : X \times Y \to , \infty) be a Borel-measurable function. Given probability measures \mu on X and \nu on Y, Monge's formulation of the optimal transportation problem is to find a transport map T : X \to Y that realizes the infimum :\inf \left\, where T_*(\mu) denotes the pushforward measure, push forward of \mu by T. A map T that attains this infimum (''i.e.'' makes it a minimum instead of an infimum) is called an "optimal transport map". Monge's formulation of the optimal transportation problem can be ill-posed, because sometimes there is no T satisfying T_*(\mu) = \nu : this happens, for example, when \mu is a Dirac measure but \nu is not. We can improve on this by adopting Kantorovich's formulation of the optimal transportation problem, which is to find a probability measure \gamma on X \times Y that attains the infimum :\inf \left\, where \Gamma (\mu, \nu) denotes the collection of all probability measures on X \times Y with marginals \mu on X and \nu on Y. It can be shownL. Ambrosio, N. Gigli & G. Savaré.
Gradient Flows in Metric Spaces and in the Space of Probability Measures
'' Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel. (2005)
that a minimizer for this problem always exists when the cost function c is lower semi-continuous and \Gamma(\mu, \nu) is a tight collection of measures (which is guaranteed for Radon spaces X and Y). (Compare this formulation with the definition of the Wasserstein metric W_p on the space of probability measures.) A gradient descent formulation for the solution of the Monge–Kantorovich problem was given by Sigurd Angenent, Steven Haker, and Allen Tannenbaum.


Duality formula

The minimum of the Kantorovich problem is equal to :\sup \left( \int_X \varphi (x) \, \mathrm \mu (x) + \int_Y \psi (y) \, \mathrm \nu (y) \right), where the supremum runs over all pairs of bounded and
continuous function In mathematics, a continuous function is a function such that a small variation of the argument induces a small variation of the value of the function. This implies there are no abrupt changes in value, known as '' discontinuities''. More preci ...
s \varphi : X \rightarrow \mathbb and \psi : Y \rightarrow \mathbb such that :\varphi (x) + \psi (y) \leq c(x, y).


Economic interpretation

The economic interpretation is clearer if signs are flipped. Let x \in X stand for the vector of characteristics of a worker, y \in Y for the vector of characteristics of a firm, and \Phi(x,y) =-c(x,y) for the economic output generated by worker x matched with firm y. Setting u(x) = -\varphi(x) and v(y) =-\psi(y), the Monge–Kantorovich problem rewrites: :\sup \left\ which has dual: :\inf \left\ where the infimum runs over bounded and continuous function u:X\to \mathbb and v:Y\to \mathbb. If the dual problem has a solution, one can see that: :v(y) =\sup_x \left\ so that u(x) interprets as the equilibrium wage of a worker of type x, and v(y) interprets as the equilibrium profit of a firm of type y.


Solution of the problem


Optimal transportation on the real line

For 1 \leq p < \infty, let \mathcal_p(\mathbb) denote the collection of
probability measure In mathematics, a probability measure is a real-valued function defined on a set of events in a σ-algebra that satisfies Measure (mathematics), measure properties such as ''countable additivity''. The difference between a probability measure an ...
s on \mathbb that have finite p-th moment. Let \mu, \nu \in \mathcal_p(\mathbb) and let c(x, y) = h(x-y), where h:\mathbb \to ,\infty) is a convex function. # If \mu has no atom (measure theory)">atom Atoms are the basic particles of the chemical elements. An atom consists of a atomic nucleus, nucleus of protons and generally neutrons, surrounded by an electromagnetically bound swarm of electrons. The chemical elements are distinguished fr ...
, i.e., if the cumulative distribution function F_\mu : \mathbb\to[0,1] of \mu is a
continuous function In mathematics, a continuous function is a function such that a small variation of the argument induces a small variation of the value of the function. This implies there are no abrupt changes in value, known as '' discontinuities''. More preci ...
, then F_^ \circ F_ : \mathbb \to \mathbb is an optimal transport map. It is the unique optimal transport map if h is strictly convex. # We have ::\min_ \int_ c(x, y) \, \mathrm \gamma (x, y) = \int_0^1 c \left( F_^ (s), F_^ (s) \right) \, \mathrm s. The proof of this solution appears in Rachev & Rüschendorf (1998).Rachev, Svetlozar T., and Ludger Rüschendorf. ''Mass Transportation Problems: Volume I: Theory''. Vol. 1. Springer, 1998.


Discrete version and linear programming formulation

In the case where the margins \mu and \nu are discrete, let \mu_x and \nu_y be the probability masses respectively assigned to x\in \mathbf and y\in \mathbf, and let \gamma_ be the probability of an xy assignment. The objective function in the primal Kantorovich problem is then : \sum_ \gamma_c_ and the constraint \gamma \in \Gamma(\mu ,\nu) expresses as : \sum_\gamma_=\mu_x,\forall x\in \mathbf and : \sum_ \gamma_=\nu_y,\forall y\in \mathbf. In order to input this in a
linear programming Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...
problem, we need to vectorize the matrix \gamma_ by either stacking its columns or its rows, we call \operatorname this operation. In the column-major order, the constraints above rewrite as : \left( 1_\otimes I_\right) \operatorname(\gamma)=\mu and \left( I_\otimes 1_\right) \operatorname(\gamma)=\nu where \otimes is the Kronecker product, 1_ is a matrix of size n\times m with all entries of ones, and I_ is the identity matrix of size n. As a result, setting z=\operatorname(\gamma) , the linear programming formulation of the problem is : \begin & \text &&\operatorname(c)^\top z \\ pt& \text && z \ge 0, \\ pt& && \begin 1_\otimes I_ \\ I_\otimes 1_ \end z=\binom \end which can be readily inputted in a large-scale linear programming solver (see chapter 3.4 of Galichon (2016) Galichon, Alfred. ''Optimal Transport Methods in Economics''. Princeton University Press, 2016.).


Semi-discrete case

In the semi-discrete case, X=Y=\mathbb^d and \mu is a continuous distribution over \mathbb^d, while \nu =\sum_^J \nu_j\delta_ is a discrete distribution which assigns probability mass \nu_j to site y_j \in \mathbb^d. In this case, we can see that the primal and dual Kantorovich problems respectively boil down to: : \inf \left\ for the primal, where \gamma \in \Gamma(\mu ,\nu) means that \int_X d\gamma_j(x)=\nu_j and \sum_j d\gamma_j(x) =d\mu(x), and: : \sup \left\ for the dual, which can be rewritten as: : \sup_\left\ which is a finite-dimensional convex optimization problem that can be solved by standard techniques, such as gradient descent. In the case when c(x,y)=, x-y, ^2/2, one can show that the set of x\in \mathbf assigned to a particular site j is a convex polyhedron. The resulting configuration is called a power diagram.


Quadratic normal case

Assume the particular case \mu =\mathcal(0,\Sigma_X), \nu =\mathcal(0,\Sigma _), and c(x,y) =, y-Ax, ^2/2 where A is invertible. One then has : \varphi(x) =-x^\top \Sigma_X^\left( \Sigma_X^A^\top \Sigma_Y A\Sigma_X^\right) ^\Sigma_^x/2 : \psi(y) =-y^\top A\Sigma_X^\left( \Sigma_X^A^\top \Sigma_Y A\Sigma_^\right)^ \Sigma_X^Ay/2 : T(x) = (A^\top)^\Sigma_X^ \left(\Sigma_X^A^\top \Sigma_Y A\Sigma_X^ \right)^ \Sigma_X^x The proof of this solution appears in Galichon (2016).


Separable Hilbert spaces

Let X be a separable
Hilbert space In mathematics, a Hilbert space is a real number, real or complex number, complex inner product space that is also a complete metric space with respect to the metric induced by the inner product. It generalizes the notion of Euclidean space. The ...
. Let \mathcal_p(X) denote the collection of probability measures on X that have finite p-th moment; let \mathcal_p^r(X) denote those elements \mu \in \mathcal_p(X) that are Gaussian regular: if g is any strictly positive
Gaussian measure In mathematics, Gaussian measure is a Borel measure on finite-dimensional Euclidean space \mathbb^n, closely related to the normal distribution in statistics. There is also a generalization to infinite-dimensional spaces. Gaussian measures are na ...
on X and g(N) = 0, then \mu(N) = 0 also. Let \mu \in \mathcal_p^r (X), \nu \in \mathcal_p(X), c (x, y) = , x - y , ^p/p for p\in(1,\infty), p^ + q^ = 1. Then the Kantorovich problem has a unique solution \kappa, and this solution is induced by an optimal transport map: i.e., there exists a Borel map r\in L^p(X, \mu; X) such that :\kappa = (\mathrm_X \times r)_ (\mu) \in \Gamma (\mu, \nu). Moreover, if \nu has bounded support, then :r(x) = x - , \nabla \varphi (x) , ^ \, \nabla \varphi (x) for \mu-almost all x\in X for some locally Lipschitz, c-concave and maximal Kantorovich potential \varphi. (Here \nabla \varphi denotes the Gateaux derivative of \varphi.)


Entropic regularization

Consider a variant of the discrete problem above, where we have added an entropic regularization term to the objective function of the primal problem : \begin & \text \sum_\gamma_c_+\varepsilon \gamma_ \ln \gamma_ \\ pt& \text \\ pt& \gamma\ge0 \\ pt& \sum_\gamma_ = \mu_x, \forall x\in \mathbf \\ pt& \sum_\gamma_ = \nu_y, \forall y\in \mathbf \end One can show that the dual regularized problem is : \max_ \sum_ \varphi_x \mu_x + \sum_ \psi_y v_y - \varepsilon \sum_ \exp \left( \frac\right) where, compared with the unregularized version, the "hard" constraint in the former dual (\varphi_x + \psi_y - c_\geq 0) has been replaced by a "soft" penalization of that constraint (the sum of the \varepsilon \exp \left( (\varphi _x + \psi_y - c_)/\varepsilon \right) terms). The optimality conditions in the dual problem can be expressed as : \mu_x = \sum_ \exp \left( \frac \right) ~\forall x\in \mathbf : \nu_y = \sum_ \exp \left( \frac\right) ~\forall y\in \mathbf Denoting A as the , \mathbf, \times , \mathbf, matrix of term A_=\exp \left(-c_ / \varepsilon \right), solving the dual is therefore equivalent to looking for two diagonal positive matrices D_1 and D_2 of respective sizes , \mathbf, and , \mathbf, , such that D_1AD_2 1_=\mu and (D_1AD_2)^\top 1_=\nu . The existence of such matrices generalizes Sinkhorn's theorem and the matrices can be computed using the Sinkhorn–Knopp algorithm, which simply consists of iteratively looking for \varphi_x to solve , and \psi_y to solve . Sinkhorn–Knopp's algorithm is therefore a coordinate descent algorithm on the dual regularized problem.


Applications

The Monge–Kantorovich optimal transport has found applications in wide range in different fields. Among them are: * Image registration and warping * Reflector design * Retrieving information from shadowgraphy and proton radiography * Seismic tomography and reflection seismology * The broad class of
economic An economy is an area of the Production (economics), production, Distribution (economics), distribution and trade, as well as Consumption (economics), consumption of Goods (economics), goods and Service (economics), services. In general, it is ...
modelling that involves gross substitutes property (among others, models of matching and discrete choice).


See also

* Wasserstein metric * Transport function *
Hungarian algorithm The Hungarian method is a combinatorial optimization algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific ...
* Transportation planning * Earth mover's distance * Monge–Ampère equation


References


Further reading

* {{DEFAULTSORT:Transportation Theory Calculus of variations Matching (graph theory) Mathematical economics Measure theory Transport economics Optimization in vector spaces Mathematical optimization in business