In
mathematics
Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
and economics, transportation theory or transport theory is a name given to the study of optimal
transport
Transport (in British English) or transportation (in American English) is the intentional Motion, movement of humans, animals, and cargo, goods from one location to another. Mode of transport, Modes of transport include aviation, air, land tr ...
ation and
allocation of resources
In economics, resource allocation is the assignment of available resources to various uses. In the context of an entire economy, resources can be allocated by various means, such as markets, or planning.
In project management, resource allocation ...
. The problem was formalized by the French
mathematician
A mathematician is someone who uses an extensive knowledge of mathematics in their work, typically to solve mathematical problems. Mathematicians are concerned with numbers, data, quantity, mathematical structure, structure, space, Mathematica ...
Gaspard Monge
Gaspard Monge, Comte de Péluse (; 9 May 1746 – 28 July 1818) was a French mathematician, commonly presented as the inventor of descriptive geometry, (the mathematical basis of) technical drawing, and the father of differential geometry. Dur ...
in 1781.
[G. Monge. ''Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences de Paris, avec les Mémoires de Mathématique et de Physique pour la même année'', pages 666–704, 1781.]
In the 1920s A.N. Tolstoi was one of the first to study the transportation problem
mathematically. In 1930, in the collection ''Transportation Planning Volume I'' for the National Commissariat of Transportation of the Soviet Union, he published a paper "Methods of Finding the Minimal Kilometrage in Cargo-transportation in space".
Major advances were made in the field during World War II by the
Soviet
The Union of Soviet Socialist Republics. (USSR), commonly known as the Soviet Union, was a List of former transcontinental countries#Since 1700, transcontinental country that spanned much of Eurasia from 1922 until Dissolution of the Soviet ...
mathematician and economist
Leonid Kantorovich
Leonid Vitalyevich Kantorovich (, ; 19 January 19127 April 1986) was a Soviet mathematician and economist, known for his theory and development of techniques for the optimal allocation of resources. He is regarded as the founder of linear programm ...
.
[L. Kantorovich. ''On the translocation of masses.'' C.R. (Doklady) Acad. Sci. URSS (N.S.), 37:199–201, 1942.] Consequently, the problem as it is stated is sometimes known as the Monge–Kantorovich transportation problem.
The
linear programming
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...
formulation of the transportation problem is also known as the
Hitchcock
Sir Alfred Joseph Hitchcock (13 August 1899 – 29 April 1980) was an English film director. He is widely regarded as one of the most influential figures in the history of cinema. In a career spanning six decades, he directed over 50 featu ...
–
Koopmans Koopmans is a Dutch occupational surname meaning "merchant's".
Motivation
Mines and factories
Suppose that we have a collection of mines mining iron ore, and a collection of factories which use the iron ore that the mines produce. Suppose for the sake of argument that these mines and factories form two disjoint subset
In mathematics, a Set (mathematics), set ''A'' is a subset of a set ''B'' if all Element (mathematics), elements of ''A'' are also elements of ''B''; ''B'' is then a superset of ''A''. It is possible for ''A'' and ''B'' to be equal; if they a ...
s and of the Euclidean plane
In mathematics, a Euclidean plane is a Euclidean space of Two-dimensional space, dimension two, denoted \textbf^2 or \mathbb^2. It is a geometric space in which two real numbers are required to determine the position (geometry), position of eac ...
. Suppose also that we have a ''cost function'' , so that is the cost of transporting one shipment of iron from to . For simplicity, we ignore the time taken to do the transporting. We also assume that each mine can supply only one factory (no splitting of shipments) and that each factory requires precisely one shipment to be in operation (factories cannot work at half- or double-capacity). Having made the above assumptions, a ''transport plan'' is a bijection .
In other words, each mine supplies precisely one target factory and each factory is supplied by precisely one mine.
We wish to find the ''optimal transport plan'', the plan whose ''total cost''
:
is the least of all possible transport plans from to . This motivating special case of the transportation problem is an instance of the assignment problem
The assignment problem is a fundamental combinatorial optimization problem. In its most general form, the problem is as follows:
:The problem instance has a number of ''agents'' and a number of ''tasks''. Any agent can be assigned to perform any t ...
.
More specifically, it is equivalent to finding a minimum weight matching in a bipartite graph
In the mathematics, mathematical field of graph theory, a bipartite graph (or bigraph) is a Graph (discrete mathematics), graph whose vertex (graph theory), vertices can be divided into two disjoint sets, disjoint and Independent set (graph theo ...
.
Moving books: the importance of the cost function
The following simple example illustrates the importance of the cost function in determining the optimal transport plan. Suppose that we have books of equal width on a shelf (the real line
A number line is a graphical representation of a straight line that serves as spatial representation of numbers, usually graduated like a ruler with a particular origin (geometry), origin point representing the number zero and evenly spaced mark ...
), arranged in a single contiguous block. We wish to rearrange them into another contiguous block, but shifted one book-width to the right. Two obvious candidates for the optimal transport plan present themselves:
# move all books one book-width to the right ("many small moves");
# move the left-most book book-widths to the right and leave all other books fixed ("one big move").
If the cost function is proportional to Euclidean distance ( for some ) then these two candidates are ''both'' optimal. If, on the other hand, we choose the strictly convex cost function proportional to the square of Euclidean distance ( for some ), then the "many small moves" option becomes the unique minimizer.
Note that the above cost functions consider only the horizontal distance traveled by the books, not the horizontal distance traveled by a device used to pick each book up and move the book into position. If the latter is considered instead, then, of the two transport plans, the second is always optimal for the Euclidean distance, while, provided there are at least 3 books, the first transport plan is optimal for the squared Euclidean distance.
Hitchcock problem
The following transportation problem formulation is credited to F. L. Hitchcock:
:Suppose there are sources for a commodity, with units of supply at and sinks for the commodity, with the demand at . If is the unit cost of shipment from to , find a flow that satisfies demand from supplies and minimizes the flow cost. This challenge in logistics was taken up by D. R. Fulkerson
Delbert Ray Fulkerson (; August 14, 1924 – January 10, 1976) was an American mathematician who co-developed the FordFulkerson algorithm, one of the most well-known algorithms to solve the maximum flow problem in networks.
Early life and educa ...
and in the book ''Flows in Networks'' (1962) written with L. R. Ford Jr.
Lester Randolph Ford Jr. (September 23, 1927 – February 26, 2017) was an American mathematician specializing in network flow problems. He was the son of mathematician Lester R. Ford Sr.
Ford's paper with D. R. Fulkerson on the maximum flow pr ...
Tjalling Koopmans
Tjalling Charles Koopmans (August 28, 1910 – February 26, 1985) was a Dutch-American mathematician and economist. He was the joint winner with Leonid Kantorovich of the 1975 Nobel Memorial Prize in Economic Sciences for his work on the theory ...
is also credited with formulations of transport economics
Transport economics is a branch of economics founded in 1959 by American economist John R. Meyer that deals with the resource allocation, allocation of resources within the transport sector. It has strong links to civil engineering. Transport ec ...
and allocation of resources.
Abstract formulation of the problem
Monge and Kantorovich formulations
The transportation problem as it is stated in modern or more technical literature looks somewhat different because of the development of Riemannian geometry
Riemannian geometry is the branch of differential geometry that studies Riemannian manifolds, defined as manifold, smooth manifolds with a ''Riemannian metric'' (an inner product on the tangent space at each point that varies smooth function, smo ...
and measure theory
In mathematics, the concept of a measure is a generalization and formalization of geometrical measures (length, area, volume) and other common notions, such as magnitude (mathematics), magnitude, mass, and probability of events. These seemingl ...
. The mines-factories example, simple as it is, is a useful reference point when thinking of the abstract case. In this setting, we allow the possibility that we may not wish to keep all mines and factories open for business, and allow mines to supply more than one factory, and factories to accept iron from more than one mine.
Let and be two separable metric space
In mathematics, a metric space is a Set (mathematics), set together with a notion of ''distance'' between its Element (mathematics), elements, usually called point (geometry), points. The distance is measured by a function (mathematics), functi ...
s such that any probability measure
In mathematics, a probability measure is a real-valued function defined on a set of events in a σ-algebra that satisfies Measure (mathematics), measure properties such as ''countable additivity''. The difference between a probability measure an ...
on (or ) is a Radon measure
In mathematics (specifically in measure theory), a Radon measure, named after Johann Radon, is a measure on the -algebra of Borel sets of a Hausdorff topological space that is finite on all compact sets, outer regular on all Borel sets, and ...
(i.e. they are Radon space
In the mathematical discipline of general topology, a Polish space is a separable completely metrizable topological space; that is, a space homeomorphic to a complete metric space that has a countable dense subset. Polish spaces are so named beca ...
s). Let be a Borel-measurable function. Given probability measures on and on , Monge's formulation of the optimal transportation problem is to find a transport map that realizes the infimum
:
where denotes the pushforward measure, push forward of by . A map that attains this infimum (''i.e.'' makes it a minimum
In mathematical analysis, the maximum and minimum of a function are, respectively, the greatest and least value taken by the function. Known generically as extremum, they may be defined either within a given range (the ''local'' or ''relative ...
instead of an infimum) is called an "optimal transport map".
Monge's formulation of the optimal transportation problem can be ill-posed, because sometimes there is no satisfying : this happens, for example, when is a Dirac measure
In mathematics, a Dirac measure assigns a size to a set based solely on whether it contains a fixed element ''x'' or not. It is one way of formalizing the idea of the Dirac delta function, an important tool in physics and other technical fields.
...
but is not.
We can improve on this by adopting Kantorovich's formulation of the optimal transportation problem, which is to find a probability measure on that attains the infimum
:
where denotes the collection of all probability measures on with marginals on and on . It can be shown[L. Ambrosio, N. Gigli & G. Savaré. ]
Gradient Flows in Metric Spaces and in the Space of Probability Measures
'' Lectures in Mathematics ETH Zürich, Birkhäuser Verlag, Basel. (2005) that a minimizer for this problem always exists when the cost function is lower semi-continuous and is a tight
Tight may refer to:
Clothing
* Skin-tight garment, a garment that is held to the skin by elastic tension
* Tights, a type of leg coverings fabric extending from the waist to feet
* Tightlacing, the practice of wearing a tightly-laced corset
...
collection of measures (which is guaranteed for Radon spaces and ). (Compare this formulation with the definition of the Wasserstein metric
In mathematics, the Wasserstein distance or Kantorovich– Rubinstein metric is a distance function defined between probability distributions on a given metric space M. It is named after Leonid Vaseršteĭn.
Intuitively, if each distribution ...
on the space of probability measures.) A gradient descent formulation for the solution of the Monge–Kantorovich problem was given by Sigurd Angenent
Sigurd Bernardus Angenent (born 1960) is a Dutch-born mathematician and professor at the University of Wisconsin–Madison. Angenent works on partial differential equations and dynamical systems, with his recent research focusing on heat equation a ...
, Steven Haker, and Allen Tannenbaum
Allen Robert Tannenbaum (January 25, 1953 – ) was an American applied mathematician who finished his career as a Distinguished Professor in the Departments of Computer Science and Applied Mathematics & Statistics at the State University of New ...
.
Duality formula
The minimum of the Kantorovich problem is equal to
:
where the supremum
In mathematics, the infimum (abbreviated inf; : infima) of a subset S of a partially ordered set P is the greatest element in P that is less than or equal to each element of S, if such an element exists. If the infimum of S exists, it is unique, ...
runs over all pairs of bounded and continuous function
In mathematics, a continuous function is a function such that a small variation of the argument induces a small variation of the value of the function. This implies there are no abrupt changes in value, known as '' discontinuities''. More preci ...
s and such that
:
Economic interpretation
The economic interpretation is clearer if signs are flipped. Let stand for the vector of characteristics of a worker, for the vector of characteristics of a firm, and for the economic output generated by worker matched with firm . Setting and , the Monge–Kantorovich problem
rewrites:
:
which has dual
Dual or Duals may refer to:
Paired/two things
* Dual (mathematics), a notion of paired concepts that mirror one another
** Dual (category theory), a formalization of mathematical duality
*** see more cases in :Duality theories
* Dual number, a nu ...
:
:
where the infimum runs over bounded and continuous function and . If the dual problem has a solution, one can see that:
:
so that interprets as the equilibrium wage of a worker of type , and interprets as the equilibrium profit of a firm of type .[
]
Solution of the problem
Optimal transportation on the real line
For , let denote the collection of probability measure
In mathematics, a probability measure is a real-valued function defined on a set of events in a σ-algebra that satisfies Measure (mathematics), measure properties such as ''countable additivity''. The difference between a probability measure an ...
s on that have finite -th moment. Let and let , where is a convex function.
# If has no atom (measure theory)">atom
Atoms are the basic particles of the chemical elements. An atom consists of a atomic nucleus, nucleus of protons and generally neutrons, surrounded by an electromagnetically bound swarm of electrons. The chemical elements are distinguished fr ...
, i.e., if the cumulative distribution function
of
is a
continuous function
In mathematics, a continuous function is a function such that a small variation of the argument induces a small variation of the value of the function. This implies there are no abrupt changes in value, known as '' discontinuities''. More preci ...
, then
is an optimal transport map. It is the unique optimal transport map if
is strictly convex.
# We have
::
The proof of this solution appears in Rachev & Rüschendorf (1998).
[Rachev, Svetlozar T., and Ludger Rüschendorf. ''Mass Transportation Problems: Volume I: Theory''. Vol. 1. Springer, 1998.]
Discrete version and linear programming formulation
In the case where the margins
and
are discrete, let
and
be the probability masses respectively assigned to
and
, and let
be the probability of an
assignment. The objective function in the primal Kantorovich problem is then
:
and the constraint
expresses as
:
and
:
In order to input this in a
linear programming
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements and objective are represented by linear function#As a polynomia ...
problem, we need to
vectorize the matrix
by either stacking
its columns or its rows, we call
this operation. In the
column-major order
In computing, row-major order and column-major order are methods for storing multidimensional arrays in linear storage such as random access memory.
The difference between the orders lies in which elements of an array are contiguous in memory. In ...
, the constraints above rewrite as
:
and
where
is the
Kronecker product
In mathematics, the Kronecker product, sometimes denoted by ⊗, is an operation on two matrices of arbitrary size resulting in a block matrix. It is a specialization of the tensor product (which is denoted by the same symbol) from vector ...
,
is a matrix of size
with all entries of ones, and
is the identity matrix of size
. As a result, setting
, the linear programming formulation of the problem is
:
which can be readily inputted in a large-scale linear programming solver (see chapter 3.4 of Galichon (2016)
[ Galichon, Alfred. ''Optimal Transport Methods in Economics''. Princeton University Press, 2016.]).
Semi-discrete case
In the semi-discrete case,
and
is a continuous distribution over
, while
is a discrete distribution which assigns probability mass
to site
. In this case, we can see that the primal and dual Kantorovich problems respectively boil down to:
:
for the primal, where
means that
and
, and:
:
for the dual, which can be rewritten as:
:
which is a finite-dimensional convex optimization problem that can be solved by standard techniques, such as
gradient descent
Gradient descent is a method for unconstrained mathematical optimization. It is a first-order iterative algorithm for minimizing a differentiable multivariate function.
The idea is to take repeated steps in the opposite direction of the gradi ...
.
In the case when
, one can show that the set of
assigned to a particular site
is a convex polyhedron. The resulting configuration is called a
power diagram
In computational geometry, a power diagram, also called a Laguerre–Voronoi diagram, Dirichlet cell complex, radical Voronoi tesselation or a sectional Dirichlet tesselation, is a partition of the Euclidean plane into polygonal cells defined from ...
.
Quadratic normal case
Assume the particular case
,
, and
where
is invertible. One then has
:
:
:
The proof of this solution appears in Galichon (2016).
[
]
Separable Hilbert spaces
Let be a separable Hilbert space
In mathematics, a Hilbert space is a real number, real or complex number, complex inner product space that is also a complete metric space with respect to the metric induced by the inner product. It generalizes the notion of Euclidean space. The ...
. Let denote the collection of probability measures on that have finite -th moment; let denote those elements that are Gaussian regular: if is any strictly positive Gaussian measure
In mathematics, Gaussian measure is a Borel measure on finite-dimensional Euclidean space \mathbb^n, closely related to the normal distribution in statistics. There is also a generalization to infinite-dimensional spaces. Gaussian measures are na ...
on and , then also.
Let , , for . Then the Kantorovich problem has a unique solution , and this solution is induced by an optimal transport map: i.e., there exists a Borel map such that
:
Moreover, if has bounded support
Support may refer to:
Arts, entertainment, and media
* Supporting character
* Support (art), a solid surface upon which a painting is executed
Business and finance
* Support (technical analysis)
* Child support
* Customer support
* Income Su ...
, then
:
for -almost all for some locally Lipschitz
In mathematical analysis, Lipschitz continuity, named after German mathematician Rudolf Lipschitz, is a strong form of uniform continuity for functions. Intuitively, a Lipschitz continuous function is limited in how fast it can change: there e ...
, -concave and maximal Kantorovich potential . (Here denotes the Gateaux derivative
In mathematics, the Gateaux differential or Gateaux derivative is a generalization of the concept of directional derivative in differential calculus. Named after René Gateaux, it is defined for functions between locally convex topological vect ...
of .)
Entropic regularization
Consider a variant of the discrete problem above, where we have added an entropic regularization term to the objective function of the primal problem
:
One can show that the dual regularized problem is
:
where, compared with the unregularized version, the "hard" constraint in the former dual () has been replaced by a "soft" penalization of that constraint (the sum of the terms). The optimality conditions in the dual problem can be expressed as
:
:
Denoting as the matrix of term , solving the dual is therefore equivalent to looking for two diagonal positive matrices and of respective sizes and , such that and . The existence of such matrices generalizes Sinkhorn's theorem and the matrices can be computed using the Sinkhorn–Knopp algorithm, which simply consists of iteratively looking for to solve , and to solve . Sinkhorn–Knopp's algorithm is therefore a coordinate descent
Coordinate descent is an optimization algorithm that successively minimizes along coordinate directions to find the minimum of a function. At each iteration, the algorithm determines a coordinate or coordinate block via a coordinate selection ru ...
algorithm on the dual regularized problem.
Applications
The Monge–Kantorovich optimal transport has found applications in wide range in different fields. Among them are:
* Image registration
Image registration is the process of transforming different sets of data into one coordinate system. Data may be multiple photographs, data from different sensors, times, depths, or viewpoints. It is used in computer vision, medical imaging, mil ...
and warping
* Reflector design
* Retrieving information from shadowgraph
Shadowgraph is an optical method that reveals non-uniformities in transparent media like air, water, or glass. It is related to, but simpler than, the schlieren and schlieren photography methods that perform a similar function. Shadowgraph is a ...
y and proton radiography
* Seismic tomography
Seismic tomography or seismotomography is a technique for imaging the subsurface of the Earth using seismic waves. The properties of seismic waves are modified by the material through which they travel. By comparing the differences in seismic waves ...
and reflection seismology
Reflection seismology (or seismic reflection) is a method of exploration geophysics that uses the principles of seismology to estimate the properties of the Earth's subsurface from reflection (physics), reflected seismic waves. The method requir ...
* The broad class of economic
An economy is an area of the Production (economics), production, Distribution (economics), distribution and trade, as well as Consumption (economics), consumption of Goods (economics), goods and Service (economics), services. In general, it is ...
modelling that involves gross substitutes property (among others, models of matching and discrete choice
In economics, discrete choice models, or qualitative choice models, describe, explain, and predict choices between two or more discrete alternatives, such as entering or not entering the labor market, or choosing between modes of transport. Such c ...
).
See also
* Wasserstein metric
In mathematics, the Wasserstein distance or Kantorovich– Rubinstein metric is a distance function defined between probability distributions on a given metric space M. It is named after Leonid Vaseršteĭn.
Intuitively, if each distribution ...
* Transport function
* Hungarian algorithm
The Hungarian method is a combinatorial optimization algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific ...
* Transportation planning
Transportation planning is the process of defining future policies, goals, investments, and spatial planning designs to prepare for future needs to move people and goods to destinations. As practiced today, it is a collaborative process that i ...
* Earth mover's distance
In computer science, the earth mover's distance (EMD) is a measure of dissimilarity between two frequency distributions, densities, or measures, over a metric space ''D''.
Informally, if the distributions are interpreted as two different ways of ...
*Monge–Ampère equation
In mathematics, a (real) Monge–Ampère equation is a nonlinear second-order partial differential equation of special kind. A second-order equation for the unknown function ''u'' of two variables ''x'',''y'' is of Monge–Ampère type if it is l ...
References
Further reading
*
{{DEFAULTSORT:Transportation Theory
Calculus of variations
Matching (graph theory)
Mathematical economics
Measure theory
Transport economics
Optimization in vector spaces
Mathematical optimization in business