A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. That is, it is a spanning tree whose sum of edge weights is as small as possible. More generally, any edge-weighted undirected graph (not necessarily connected) has a minimum spanning forest, which is a union of the minimum spanning trees for its connected components.

There are many use cases for minimum spanning trees. One example is a telecommunications company trying to lay cable in a new neighborhood. If it is constrained to bury the cable only along certain paths (e.g. roads), then there would be a graph containing the points (e.g. houses) connected by those paths. Some of the paths might be more expensive, because they are longer, or require the cable to be buried deeper; these paths would be represented by edges with larger weights. Currency is an acceptable unit for edge weight; there is no requirement for edge lengths to obey normal rules of geometry such as the triangle inequality. A ''spanning tree'' for that graph would be a subset of those paths that has no cycles but still connects every house; there might be several spanning trees possible. A ''minimum spanning tree'' would be one with the lowest total cost, representing the least expensive way to lay the cable.

Properties

Possible multiplicity

If there are $n$ vertices in the graph, then each spanning tree has $n - 1$ edges.

There may be several minimum spanning trees of the same weight; in particular, if all the edge weights of a given graph are the same, then every spanning tree of that graph is minimum.

Uniqueness

''If each edge has a distinct weight then there will be only one, unique minimum spanning tree''. This is true in many realistic situations, such as the telecommunications company example above, where it's unlikely any two paths have ''exactly'' the same cost. This generalizes to spanning forests as well.

Proof:
# Assume the contrary, that there are two different MSTs $A$ and $B$.
# Since $A$ and $B$ differ despite containing the same nodes, there is at least one edge that belongs to one but not the other. Among such edges, let $e_1$ be the one with least weight; this choice is unique because the edge weights are all distinct. Without loss of generality, assume $e_1$ is in $A$.
# As $B$ is a spanning tree, $B \cup \{e_1\}$ must contain a cycle $C$ that includes $e_1$.
# As a tree, $A$ contains no cycles, therefore $C$ must have an edge $e_2$ that is not in $A$.
# Since $e_1$ was chosen as the unique lowest-weight edge among those belonging to exactly one of $A$ and $B$, the weight of $e_2$ must be greater than the weight of $e_1$.
# As $e_1$ and $e_2$ are part of the cycle $C$, replacing $e_2$ with $e_1$ in $B$ therefore yields a spanning tree with a smaller weight.
# This contradicts the assumption that $B$ is an MST.

More generally, if the edge weights are not all distinct then only the (multi-)set of weights in minimum spanning trees is certain to be unique; it is the same for all minimum spanning trees.
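The multiplicity and uniqueness statements can be checked by brute force on very small graphs. The following is a minimal sketch, not an efficient algorithm: it assumes vertices numbered `0..n-1` and edges given as `(u, v, weight)` tuples, and the function names and the three example graphs are made up for this illustration.

```python
from itertools import combinations

def spanning_trees(n, edges):
    """Yield every spanning tree of the graph as a tuple of n - 1 edges."""
    for subset in combinations(edges, n - 1):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        acyclic = True
        for u, v, _ in subset:
            ru, rv = find(u), find(v)
            if ru == rv:  # this edge would close a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:  # n - 1 acyclic edges on n vertices form a spanning tree
            yield subset

def minimum_spanning_trees(n, edges):
    """Return all spanning trees of minimum total weight."""
    trees = list(spanning_trees(n, edges))
    best = min(sum(w for _, _, w in t) for t in trees)
    return [t for t in trees if sum(w for _, _, w in t) == best]

# A 4-cycle with one chord, first with distinct weights, then with equal weights.
distinct = [(0, 1, 1), (1, 2, 2), (2, 3, 3), (3, 0, 4), (0, 2, 5)]
tied = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1), (0, 2, 1)]
# A triangle of weight-1 edges plus one pendant edge of weight 2.
mixed = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 2)]

print(len(minimum_spanning_trees(4, distinct)))  # one MST
print(len(minimum_spanning_trees(4, tied)))      # every spanning tree is minimum
print({tuple(sorted(w for _, _, w in t)) for t in minimum_spanning_trees(4, mixed)})
```

With distinct weights the search finds a single tree; with all weights equal, every spanning tree qualifies; and in the mixed case the several minimum trees all share one multiset of weights.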

Minimum-cost subgraph

If the weights are ''positive'', then a minimum spanning tree is in fact a minimum-cost subgraph connecting all vertices, since subgraphs containing cycles necessarily have more total weight.

Cycle property

''For any cycle $C$ in the graph, if the weight of an edge $e$ of $C$ is strictly larger than the weight of every other edge of $C$, then this edge cannot belong to an MST.''

Proof: Assume the contrary, i.e. that $e$ belongs to an MST $T_1$. Then deleting $e$ will break $T_1$ into two subtrees with the two ends of $e$ in different subtrees. The remainder of $C$ reconnects the subtrees, hence there is an edge $f$ of $C$ with ends in different subtrees, i.e., it reconnects the subtrees into a tree $T_2$ with weight less than that of $T_1$, because the weight of $f$ is less than the weight of $e$. This contradicts the assumption that $T_1$ is an MST.
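The exchange step used in this proof can be written out directly. The sketch below is illustrative only: the helper names `tree_path` and `exchange`, the `(u, v, weight)` edge format, and the example tree are assumptions made here. Adding a non-tree edge $f$ to a spanning tree closes a cycle; if the heaviest tree edge $e$ on that cycle is strictly heavier than $f$, swapping $e$ for $f$ gives a lighter spanning tree.

```python
def tree_path(tree_edges, src, dst):
    """Return the edges on the unique path from src to dst in a tree."""
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, (u, v, w)))
        adj.setdefault(v, []).append((u, (u, v, w)))
    stack = [(src, None, [])]
    while stack:
        node, prev, path = stack.pop()
        if node == dst:
            return path
        for nxt, edge in adj.get(node, []):
            if nxt != prev:  # a tree has no cycles, so avoiding the parent suffices
                stack.append((nxt, node, path + [edge]))
    raise ValueError("dst is not reachable from src")

def exchange(tree_edges, f):
    """Swap the heaviest tree edge on the cycle closed by f for f, if that lowers the weight."""
    u, v, w = f
    cycle = tree_path(tree_edges, u, v)       # together with f this is the cycle C
    e = max(cycle, key=lambda edge: edge[2])  # heaviest tree edge on C
    if e[2] > w:
        return [t for t in tree_edges if t != e] + [f]
    return list(tree_edges)

t1 = [(0, 2, 7), (1, 2, 2), (2, 3, 3)]  # a spanning tree that is not minimum
t2 = exchange(t1, (0, 1, 4))            # drops (0, 2, 7) in favour of (0, 1, 4)
print(sorted(t2), sum(w for _, _, w in t2))
```

Applying `exchange` again with the discarded edge `(0, 2, 7)` leaves `t2` unchanged, since that edge is now the strictly heaviest edge on its cycle.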

Cut property

''For any cut $C$ of the graph, if the weight of an edge $e$ in the cut-set of $C$ is strictly smaller than the weights of all other edges of the cut-set of $C$, then this edge belongs to all MSTs of the graph.''

Proof: Assume that there is an MST $T$ that does not contain $e$. Adding $e$ to $T$ will produce a cycle that crosses the cut once at $e$ and crosses back at another edge $e'$. Deleting $e'$ we get a spanning tree $T \setminus \{e'\} \cup \{e\}$ of strictly smaller weight than $T$. This contradicts the assumption that $T$ was an MST.

By a similar argument, if more than one edge is of minimum weight across a cut, then each such edge is contained in some minimum spanning tree.
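The cut property can be exercised on a small graph by enumerating every cut. In the sketch below the names `kruskal`, `unique_lightest_crossing` and `cut_property_holds`, the `(u, v, weight)` edge format, and the example graph are all assumptions for illustration; the MST is computed with a plain union-find version of Kruskal's algorithm.

```python
from itertools import combinations

def kruskal(n, edges):
    """Return the set of MST edges of a connected graph on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two different components
            parent[ru] = rv
            tree.add((u, v, w))
    return tree

def unique_lightest_crossing(edges, side):
    """Return the strictly lightest edge with exactly one end in `side`, or None on a tie."""
    crossing = sorted(
        (e for e in edges if (e[0] in side) != (e[1] in side)),
        key=lambda e: e[2],
    )
    if not crossing:
        return None
    if len(crossing) > 1 and crossing[0][2] == crossing[1][2]:
        return None
    return crossing[0]

def cut_property_holds(n, edges):
    """Check, for every cut, that a strictly lightest crossing edge is in the MST."""
    mst = kruskal(n, edges)
    for size in range(1, n):
        for side in combinations(range(n), size):
            e = unique_lightest_crossing(edges, set(side))
            if e is not None and e not in mst:
                return False
    return True

edges = [(0, 1, 4), (1, 2, 2), (0, 2, 7), (2, 3, 3), (1, 3, 6)]
print(cut_property_holds(4, edges))
```

The exhaustive loop over cuts is exponential in the number of vertices, so this is only a check for toy inputs.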

Minimum-cost edge

''If the minimum cost edge $e$ of a graph is unique, then this edge is included in any MST.''

Proof: if $e$ were not included in the MST, removing any of the (larger cost) edges in the cycle formed after adding $e$ to the MST would yield a spanning tree of smaller weight.

Contraction

If $T$ is a tree of MST edges, then we can ''contract'' $T$ into a single vertex while maintaining the invariant that the MST of the contracted graph plus $T$ gives the MST for the graph before contraction.

Algorithms

In all of the algorithms below, $m$ is the number of edges in the graph and $n$ is the number of vertices.

Classic algorithms

The first algorithm for finding a minimum spanning tree was developed by Czech scientist Otakar Borůvka in 1926 (see Borůvka's algorithm). Its purpose was an efficient electrical coverage of Moravia. The algorithm proceeds in a sequence of stages. In each stage, called a ''Borůvka step'', it identifies a forest $F$ consisting of the minimum-weight edge incident to each vertex in the graph $G$, then forms the graph $G_1 = G \setminus F$ as the input to the next step. Here $G \setminus F$ denotes the graph derived from $G$ by contracting edges in $F$ (by the cut property, these edges belong to the MST). Each Borůvka step takes linear time. Since the number of vertices is reduced by at least half in each step, Borůvka's algorithm takes $O(m \log n)$ time.

A second algorithm is Prim's algorithm, which was invented by Vojtěch Jarník in 1930 and rediscovered by Prim in 1957 and Dijkstra in 1959. It grows the MST ($T$) one edge at a time. Initially, $T$ contains an arbitrary vertex. In each step, $T$ is augmented with a least-weight edge $(x, y)$ such that $x$ is in $T$ and $y$ is not yet in $T$. By the cut property, all edges added to $T$ are in the MST. Its run-time is either $O(m \log n)$ or $O(m + n \log n)$, depending on the data structures used.

A third algorithm commonly in use is Kruskal's algorithm, which also takes $O(m \log n)$ time.

A fourth algorithm, not as commonly used, is the reverse-delete algorithm, which is the reverse of Kruskal's algorithm. Its runtime is $O(m \log n (\log \log n)^3)$.

All four of these are greedy algorithms. Since they run in polynomial time, the problem of finding such trees is in FP, and related decision problems such as determining whether a particular edge is in the MST or determining if the minimum total weight exceeds a certain value are in P.
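The descriptions of Prim's and Borůvka's algorithms above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the `(u, v, weight)` edge format, the vertex numbering `0..n-1`, the function names, and the example graph are assumptions made for the example. In `boruvka`, a union-find structure stands in for explicitly contracting the forest found in each step, and ties are broken by edge order.

```python
import heapq

def prim(n, edges, start=0):
    """Grow a tree T from `start`, always adding a least-weight edge leaving T."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((w, u, v))
        adj[v].append((w, v, u))
    in_tree = [False] * n
    in_tree[start] = True
    heap = list(adj[start])
    heapq.heapify(heap)
    tree = []
    while heap and len(tree) < n - 1:
        w, x, y = heapq.heappop(heap)
        if in_tree[y]:
            continue  # both ends are already in T
        in_tree[y] = True
        tree.append((x, y, w))
        for item in adj[y]:
            if not in_tree[item[2]]:
                heapq.heappush(heap, item)
    return tree

def boruvka(n, edges):
    """Repeat Borůvka steps until a single component remains."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    components = n
    while components > 1:
        cheapest = {}  # component root -> lightest edge leaving that component
        for u, v, w in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][2]:
                    cheapest[r] = (u, v, w)
        if not cheapest:
            break  # disconnected input: the result is a spanning forest
        for u, v, w in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # the same edge may be picked by both of its components
                parent[ru] = rv
                tree.append((u, v, w))
                components -= 1
    return tree

edges = [(0, 1, 4), (1, 2, 2), (0, 2, 7), (2, 3, 3), (1, 3, 6), (3, 4, 5), (2, 4, 8)]
print(sorted(prim(5, edges)))
print(sorted(boruvka(5, edges)))
```

On this example both functions select the same four edges, with total weight 14. The binary heap used in `prim` corresponds to the simpler of the two running times quoted above; the faster bound relies on a more elaborate priority queue that is not shown here.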

Faster algorithms

Several researchers have tried to find more computationally efficient algorithms.

In a comparison model, in which the only allowed operations on edge weights are pairwise comparisons, Karger, Klein & Tarjan (1995) found a linear-time randomized algorithm based on a combination of Borůvka's algorithm and the reverse-delete algorithm.

The fastest non-randomized comparison-based algorithm with known complexity, by Bernard Chazelle, is based on the soft heap, an approximate priority queue. Its running time is $O(m\,\alpha(m, n))$, where $\alpha$ is the classical functional inverse of the Ackermann function. The function $\alpha$ grows extremely slowly, so that for all practical purposes it may be considered a constant no greater than 4; thus Chazelle's algorithm takes very close to linear time.

Linear-time algorithms in special cases

Dense graphs

If the graph is dense (i.e. $m/n \geq \log \log \log n$), then a deterministic algorithm by Fredman and Tarjan finds the MST in time $O(m)$. The algorithm executes a number of phases. Each phase executes Prim's algorithm many times, each for a limited number of steps. The run-time of each phase is $O(m + n)$. If the number of vertices before a phase is $n'$, the number of vertices remaining after a phase is at most $\tfrac{n'}{2^{m/n'}}$. Hence, at most $\log^* n$ phases are needed, which gives a linear run-time for dense graphs.

There are other algorithms that work in linear time on dense graphs.

Integer weights

If the edge weights are integers represented in binary, then deterministic algorithms are known that solve the problem in $O(m + n)$ integer operations.

Whether the problem can be solved ''deterministically'' for a ''general graph'' in ''linear time'' by a comparison-based algorithm remains an open question.

Decision trees

Given a graph $G$ where the nodes and edges are fixed but the weights are unknown, it is possible to construct a binary decision tree (DT) for calculating the MST for any permutation of weights. Each internal node of the DT contains a comparison between two edges, e.g. "Is the weight of the edge between $x$ and $y$ larger than the weight of the edge between $w$ and $z$?". The two children of the node correspond to the two possible answers "yes" or "no". In each leaf of the DT, there is a list of edges from $G$ that correspond to an MST. The runtime complexity of a DT is the largest number of queries required to find the MST, which is just the depth of the DT. A DT for a graph $G$ is called ''optimal'' if it has the smallest depth of all correct DTs for $G$.

For every integer $r$, it is possible to find optimal decision trees for all graphs on $r$ vertices by brute-force search. This search proceeds in two steps.

A. Generating all potential DTs
* There are $2^{\binom{r}{2}}$ different graphs on $r$ vertices.
* For each graph, an MST can always be found using $r(r-1)$ comparisons, e.g. by Prim's algorithm.
* Hence, the depth of an optimal DT is less than $r^2$.
* Hence, the number of internal nodes in an optimal DT is less than $2^{r^2}$.
* Every internal node compares two edges. The number of edges is at most $r^2$, so the number of different comparisons is at most $r^4$.
* Hence, the number of potential DTs is less than $(r^4)^{2^{r^2}} = r^{2^{r^2+2}}$.

B. Identifying the correct DTs

To check if a DT is correct, it should be checked on all possible permutations of the edge weights.
* The number of such permutations is at most $(r^2)!$.
* For each permutation, solve the MST problem on the given graph using any existing algorithm, and compare the result to the answer given by the DT.
* The running time of any MST algorithm is at most $r^2$, so the total time required to check all permutations is at most $(r^2+1)!$.

Hence, the total time required for finding an optimal DT for ''all'' graphs with $r$ vertices is:
: $2^{\binom{r}{2}} \cdot r^{2^{r^2+2}} \cdot (r^2+1)!,$
which is less than
: $2^{2^{r^2+o(r)}}.$
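To get a feel for how quickly these counts grow, the bounds can be evaluated for very small $r$. The sketch below assumes the bounds used in the counting argument: $2^{\binom{r}{2}}$ graphs, depth below $r^2$, fewer than $2^{r^2}$ internal nodes, at most $r^4$ distinct comparisons, and at most $(r^2+1)!$ time to check one tree. The function name `dt_search_bounds` is hypothetical, and sizes are reported as bit lengths because the numbers themselves are enormous.

```python
from math import comb, factorial

def dt_search_bounds(r):
    """Evaluate the counting bounds for brute-force search over DTs on r vertices."""
    graphs = 2 ** comb(r, 2)          # labelled graphs on r vertices
    depth = r * r                     # an optimal DT has depth below r^2
    internal_nodes = 2 ** depth       # a binary tree of that depth
    comparisons = r ** 4              # pairs chosen from at most r^2 edges
    potential_dts = comparisons ** internal_nodes
    checking = factorial(r * r + 1)   # r^2 * (r^2)! <= (r^2 + 1)!
    total = graphs * potential_dts * checking
    return graphs, potential_dts, checking, total

for r in (2, 3):
    graphs, dts, checking, total = dt_search_bounds(r)
    # Python integers are arbitrary precision, so the exact values are available.
    print(r, graphs, dts.bit_length(), checking, total.bit_length())
```

The same function also confirms the algebraic step $(r^4)^{2^{r^2}} = r^{2^{r^2+2}}$ numerically for these values of $r$.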

Optimal algorithm

Seth Pettie and Vijaya Ramachandran have found a provably optimal deterministic comparison-based minimum spanning tree algorithm. The following is a simplified description of the algorithm.
# Let $r = \log \log \log n$, where $n$ is the number of vertices. Find all optimal decision trees on $r$ vertices. This can be done in time $O(n)$ (see Decision trees above).
# Partition the graph to components with at most $r$ vertices in each component. This partition uses a soft heap, which "corrupts" a small number of the edges of the graph.
# Use the optimal decision trees to find an MST for the uncorrupted subgraph within each component.
# Contract each connected component spanned by the MSTs to a single vertex, and apply any algorithm which works on dense graphs in time $O(m)$ to the contraction of the uncorrupted subgraph.
# Add back the corrupted edges to the resulting forest to form a subgraph guaranteed to contain the minimum spanning tree, and smaller by a constant factor than the starting graph. Apply the optimal algorithm recursively to this graph.

The runtime of all steps in the algorithm is $O(m)$, ''except for the step of using the decision trees''. The runtime of this step is unknown, but it has been proved that it is optimal: no algorithm can do better than the optimal decision tree. Thus, this algorithm has the peculiar property that it is ''provably optimal'' although its runtime complexity is ''unknown''.

Parallel and distributed algorithms

Research has also considered parallel algorithms for the minimum spanning tree problem. With a linear number of processors it is possible to solve the problem in $O(\log n)$ time. Bader & Cong (2006) demonstrate an algorithm that can compute MSTs 5 times faster on 8 processors than an optimized sequential algorithm.

Other specialized algorithms have been designed for computing minimum spanning trees of a graph so large that most of it must be stored on disk at all times. These ''external storage'' algorithms, for example as described in "Engineering an External Memory Minimum Spanning Tree Algorithm" by Roman Dementiev et al., can operate, by the authors' claims, as little as 2 to 5 times slower than a traditional in-memory algorithm. They rely on efficient external storage sorting algorithms and on graph contraction techniques for reducing the graph's size efficiently.

The problem can also be approached in a distributed manner. If each node is considered a computer and no node knows anything except its own connected links, one can still calculate the distributed minimum spanning tree.

MST on complete graphs

Alan M. Frieze showed that given a complete graph on $n$ vertices, with edge weights that are independent identically distributed random variables with distribution function $F$ satisfying $F'(0) > 0$, then as $n$ approaches $+\infty$ the expected weight of the MST approaches $\zeta(3)/F'(0)$, where $\zeta$ is the Riemann zeta function (more specifically, $\zeta(3)$ is Apéry's constant). Frieze and Steele also proved convergence in probability. Svante Janson proved a central limit theorem for the weight of the MST.

For uniform random weights in $[0,1]$, the exact expected size of the minimum spanning tree has been computed for small complete graphs.
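Frieze's limit can be explored with a small simulation. For weights uniform on $[0,1]$ we have $F'(0) = 1$, so the expected MST weight should approach $\zeta(3) \approx 1.202$. The sketch below is an assumption-laden illustration: the function names, the array-based $O(n^2)$ variant of Prim's algorithm, and the choice to draw each edge weight lazily the first (and only) time the edge is inspected are all choices made here, not part of the cited results.

```python
import random

def random_complete_mst_weight(n, rng):
    """MST weight of the complete graph on n vertices with i.i.d. Uniform(0, 1) weights."""
    best = [float("inf")] * n  # lightest known edge from each vertex to the tree
    in_tree = [False] * n
    best[0] = 0.0              # start the tree at vertex 0 at no cost
    total = 0.0
    for _ in range(n):
        v = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[v] = True
        total += best[v]
        for u in range(n):
            if not in_tree[u]:
                w = rng.random()  # weight of edge {v, u}; each edge is drawn exactly once
                if w < best[u]:
                    best[u] = w
    return total

def estimate_expected_weight(n, trials, seed=0):
    """Average the MST weight over several independent random complete graphs."""
    rng = random.Random(seed)
    return sum(random_complete_mst_weight(n, rng) for _ in range(trials)) / trials

print(estimate_expected_weight(60, 40))
```

For moderate $n$ the estimate lands somewhat below the limiting value and moves towards it as $n$ grows; more trials reduce the sampling noise.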

Applications

Minimum spanning trees have direct applications in the design of networks, including computer networks, telecommunications networks, transportation networks, water supply networks, and electrical grids (which they were first invented for, as mentioned above). They are invoked as subroutines in algorithms for other problems, including the Christofides algorithm for approximating the traveling salesman problem, approximating the multi-terminal minimum cut problem (which is equivalent in the single-terminal case to the maximum flow problem), and approximating the minimum-cost weighted perfect matching.

Other practical applications based on minimal spanning trees include:
* Taxonomy.
* Cluster analysis: clustering points in the plane, single-linkage clustering (a method of hierarchical clustering), graph-theoretic clustering, and clustering gene expression data.
* Constructing trees for broadcasting in computer networks.
* Image registration and segmentation (see minimum spanning tree-based segmentation).
* Curvilinear feature extraction in computer vision.
* Handwriting recognition of mathematical expressions.
* Circuit design: implementing efficient multiple constant multiplications, as used in finite impulse response filters.
* Regionalisation of socio-geographic areas, the grouping of areas into homogeneous, contiguous regions.
* Comparing ecotoxicology data.
* Topological observability in power systems.
* Measuring homogeneity of two-dimensional materials.
* Minimax process control.
* Minimum spanning trees can also be used to describe financial markets. A correlation matrix can be created by calculating a coefficient of correlation between any two stocks. This matrix can be represented topologically as a complex network and a minimum spanning tree can be constructed to visualize relationships.
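As an illustration of the clustering application in the list above, single-linkage clustering into $k$ groups can be obtained by running Kruskal's algorithm on the complete distance graph and stopping once $k$ components remain, which is equivalent to deleting the $k - 1$ heaviest edges of the MST. The sketch below is a hedged example: the function name `single_linkage_clusters`, the use of Euclidean distance, and the sample points are assumptions for this illustration.

```python
def single_linkage_clusters(points, k):
    """Group points into k clusters by stopping Kruskal's algorithm at k components."""
    n = len(points)
    # All pairwise Euclidean distances, lightest first.
    edges = sorted(
        (sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5, i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for _, i, j in edges:
        if components == k:
            break  # the remaining MST edges are the k - 1 heaviest ones
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0)]
print(single_linkage_clusters(points, 3))
```

The result is a list of index groups into `points`. Building all pairwise distances takes quadratic space, so this form only suits small inputs.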

External links

The Stony Brook Algorithm Repository - Minimum Spanning Tree codes

Implemented in QuickGraph for .Net