Parallel Algorithms For Minimum Spanning Trees
In graph theory, a minimum spanning tree (MST) T of a graph G = (V, E) with |V| = n and |E| = m is a tree subgraph of G that contains all of its vertices and is of minimum weight.

MSTs are useful and versatile tools utilised in a wide variety of practical and theoretical fields. For example, a company looking to supply multiple stores with a certain product from a single warehouse might use an MST originating at the warehouse to calculate the shortest paths to each store. In this case the stores and the warehouse are represented as vertices and the road connections between them as edges. Each edge is labelled with the length of the corresponding road connection.

If G is edge-unweighted, every spanning tree possesses the same number of edges and thus the same weight. In the edge-weighted case, a spanning tree whose total edge weight is lowest among all spanning trees of G is called a minimum spanning tree (MST). It is not necessarily unique. More generally, graphs that are not necessarily connected have minimum spanning forests, which consist of a union of MSTs, one for each connected component.

As finding MSTs is a widespread problem in graph theory, there exist many sequential algorithms for solving it. Among them are Prim's, Kruskal's and Borůvka's algorithms, each utilising different properties of MSTs. They all operate in a similar fashion: a subset of E is iteratively grown until a valid MST has been discovered. However, as practical problems are often quite large (road networks sometimes have billions of edges), performance is a key factor. One option for improving it is to parallelise known MST algorithms.


Prim's algorithm

This algorithm utilises the cut property of MSTs. A simple high-level pseudocode implementation is provided below:

 T \gets \emptyset
 S \gets \{s\} where s is a random vertex in V
 repeat |V| - 1 times
     find lightest edge (u, v) s.t. u \in S but v \in V \setminus S
     S \gets S \cup \{v\}
     T \gets T \cup \{(u, v)\}
 return T

Each edge is observed exactly twice, namely when examining each of its endpoints. Each vertex is examined exactly once, for a total of O(n + m) operations aside from the selection of the lightest edge at each loop iteration. This selection is often performed using a priority queue (PQ). For each edge at most one decreaseKey operation (amortised in O(1)) is performed, and each loop iteration performs one deleteMin operation (O(\log n)). Thus, using Fibonacci heaps, the total runtime of Prim's algorithm is asymptotically in O(m + n \log n).

It is important to note that the loop is inherently sequential and cannot be properly parallelised. This is the case since the lightest edge with one endpoint in S and the other in V \setminus S might change with the addition of edges to T. Thus no two selections of a lightest edge can be performed at the same time. However, there do exist some attempts at parallelisation. One possible idea is to use O(n) processors to support PQ access in O(1) on an EREW-PRAM machine, thus lowering the total runtime to O(n + m).


Kruskal's algorithm

Kruskal's MST algorithm utilises the cycle property of MSTs. A high-level pseudocode representation is provided below:

 T \gets forest with every vertex in its own subtree
 foreach (u, v) \in E in ascending order of weight
     if u and v in different subtrees of T
         T \gets T \cup \{(u, v)\}
 return T

The subtrees of T are stored in union-find data structures, which is why checking whether or not two vertices are in the same subtree is possible in amortised O(\alpha(m, n)), where \alpha(m, n) is the inverse Ackermann function. Thus the total runtime of the algorithm is in O(\mathrm{sort}(m) + m \alpha(n)). Here \alpha(n) denotes the single-valued inverse Ackermann function, for which any realistic input yields an integer less than five.
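The pseudocode above can be sketched with an explicit union-find structure (a minimal version with path compression but, for brevity, without union by rank; the edge-list format is an assumption):

```python
def kruskal_mst(n, edges):
    # edges: list of (w, u, v); vertices are 0..n-1
    parent = list(range(n))

    def find(x):
        # find with path compression (halving)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):          # ascending order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                       # different subtrees: no cycle
            parent[ru] = rv                # union
            tree.append((u, v, w))
    return tree
```

The sorted() call is the sorting step discussed below; everything after it is the inherently sequential union-find phase.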


Approach 1: Parallelising the sorting step

Similarly to Prim's algorithm, there are components of Kruskal's approach that cannot be parallelised in its classical variant. For example, determining whether or not two vertices are in the same subtree is difficult to parallelise, as two union operations might attempt to join the same subtrees at the same time. In fact, the only opportunity for parallelisation lies in the sorting step. As sorting is linear in the optimal case on O(\log n) processors, the total runtime can be reduced to O(m \alpha(n)).


Approach 2: Filter-Kruskal

Another approach would be to modify the original algorithm by growing T more aggressively. This idea was presented by Osipov et al. The basic idea behind Filter-Kruskal is to partition the edges in a similar way to quicksort and filter out edges that connect vertices that belong to the same tree, in order to reduce the cost of sorting. A high-level pseudocode representation is provided below:

 filterKruskal(G):
     if m < KruskalThreshold:
         return kruskal(G)
     pivot \gets chooseRandom(E)
     (E_{\leq}, E_{>}) \gets partition(E, pivot)
     A \gets filterKruskal(E_{\leq})
     E_{>} \gets filter(E_{>})
     A \gets A \cup filterKruskal(E_{>})
     return A

 partition(E, pivot):
     E_{\leq} \gets \emptyset
     E_{>} \gets \emptyset
     foreach (u, v) \in E:
         if weight(u, v) \leq pivot:
             E_{\leq} \gets E_{\leq} \cup \{(u, v)\}
         else:
             E_{>} \gets E_{>} \cup \{(u, v)\}
     return (E_{\leq}, E_{>})

 filter(E):
     E_{filtered} \gets \emptyset
     foreach (u, v) \in E:
         if find-set(u) \neq find-set(v):
             E_{filtered} \gets E_{filtered} \cup \{(u, v)\}
     return E_{filtered}

Filter-Kruskal is better suited for parallelisation, since sorting, partitioning and filtering have intuitively easy parallelisations in which the edges are simply divided between the cores.
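A sequential sketch of the scheme, under stated assumptions: the threshold value and the guard against a degenerate pivot (all weights on one side) are illustrative additions, and a simple union-find replaces the find-set of the pseudocode:

```python
import random

def filter_kruskal(n, edges, threshold=16):
    # edges: list of (w, u, v); vertices 0..n-1
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []

    def kruskal(es):
        # base case: plain Kruskal on a small edge set
        for w, u, v in sorted(es):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v, w))

    def rec(es):
        if len(es) < threshold:
            kruskal(es)
            return
        pivot = random.choice(es)[0]
        lo = [e for e in es if e[0] <= pivot]
        hi = [e for e in es if e[0] > pivot]
        if not hi:                  # degenerate pivot: avoid infinite recursion
            kruskal(lo)
            return
        rec(lo)
        # filter: drop edges whose endpoints are already connected
        hi = [(w, u, v) for (w, u, v) in hi if find(u) != find(v)]
        rec(hi)

    rec(edges)
    return tree
```

The filter step is where the savings come from: heavy edges that the light half has already connected never need to be sorted.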


Borůvka's algorithm

The main idea behind Borůvka's algorithm is edge contraction. An edge \{u, v\} is contracted by first removing v from the graph and then redirecting every edge \{w, v\} \in E to \{w, u\}. These new edges retain their old edge weights. If the goal is not just to determine the weight of an MST but also which edges it comprises, it must be noted between which pairs of vertices an edge was contracted. A high-level pseudocode representation is presented below:

 T \gets \emptyset
 while |V| > 1
     S \gets \emptyset
     for v \in V
         S \gets S \cup \{lightest \{u, v\} \in E\}
     for \{u, v\} \in S
         contract \{u, v\}
     T \gets T \cup S
 return T

It is possible that contractions lead to multiple edges between a pair of vertices. The intuitive way of choosing the lightest of them is not possible in O(m). However, if all contractions that share a vertex are performed in parallel, this is doable. The recursion stops when there is only a single vertex remaining, which means the algorithm needs at most \log n iterations, leading to a total runtime in O(m \log n).
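A sequential sketch of the rounds above. Instead of physically contracting edges, this version keeps a component label per vertex, which has the same effect; distinct edge weights are assumed so the selected edges cannot form a cycle:

```python
def boruvka_mst(n, edges):
    # edges: list of (w, u, v) with distinct weights; vertices 0..n-1
    comp = list(range(n))            # component label per vertex
    tree = []
    num_comp = n
    while num_comp > 1:
        # S: lightest edge leaving each component
        best = {}
        for w, u, v in edges:
            cu, cv = comp[u], comp[v]
            if cu == cv:
                continue             # internal edge: would be contracted away
            for c in (cu, cv):
                if c not in best or w < best[c][0]:
                    best[c] = (w, u, v)
        # add the selected edges and merge components ("contraction")
        for w, u, v in set(best.values()):
            cu, cv = comp[u], comp[v]
            if cu != cv:
                tree.append((u, v, w))
                comp = [cv if c == cu else c for c in comp]
                num_comp -= 1
    return tree
```

Each round at least halves the number of components, matching the \log n bound on the number of iterations.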


Parallelisation

One possible parallelisation of this algorithm yields a polylogarithmic time complexity, i.e. T(m, n, p) \cdot p \in O(m \log n) and there exists a constant c so that T(m, n, p) \in O(\log^c m). Here T(m, n, p) denotes the runtime for a graph with m edges and n vertices on a machine with p processors. The basic idea is the following:

 while |V| > 1
     find lightest incident edges                     // O(\frac{m}{p} + \log n + \log p)
     assign the corresponding subgraph to each vertex // O(\frac{n}{p} + \log n)
     contract each subgraph                           // O(\frac{m}{p} + \log n)

The MST then consists of all the found lightest edges.

This parallelisation utilises the adjacency array graph representation for G = (V, E). It consists of three arrays: \Gamma of length n + 1 for the vertices, \gamma of length m for the endpoints of each of the m edges, and c of length m for the edges' weights. With this representation, the other endpoint of each edge incident to a vertex i can be found in the entries \gamma[\Gamma[i]] through \gamma[\Gamma[i + 1] - 1], and the weight of the i-th edge in \gamma can be found in c[i]. The i-th edge in \gamma is between vertices u and v if and only if \Gamma[u] \leq i < \Gamma[u + 1] and \gamma[i] = v.


Finding the lightest incident edge

First the edges are distributed between each of the p processors. The i-th processor receives the edges stored between \gamma[\frac{i \cdot m}{p}] and \gamma[\frac{(i + 1) \cdot m}{p} - 1]. Furthermore, each processor needs to know to which vertex these edges belong (since \gamma only stores one of the edge's endpoints) and stores this in the array pred. Obtaining this information is possible in O(\log n) using p binary searches or in O(\frac{n}{p} + p) using a linear search. In practice the latter approach is sometimes quicker, even though it is asymptotically worse.

Now each processor determines the lightest edge incident to each of its vertices:

 v \gets find(\frac{i \cdot m}{p}, \Gamma)
 for e \gets \frac{i \cdot m}{p} to \frac{(i + 1) \cdot m}{p} - 1
     if \Gamma[v + 1] = e
         v++
     if c[e] < c[pred[v]]
         pred[v] \gets e

Here the issue arises that some vertices are handled by more than one processor. A possible solution is that every processor has its own pred array, which is later combined with those of the others using a reduction. Each processor has at most two vertices that are also handled by other processors, and each reduction is in O(\log p). Thus the total runtime of this step is in O(\frac{m}{p} + \log n + \log p).
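A sequential simulation of this step, under stated assumptions: the p "processors" run one after another, the owner of an edge is found by binary search (the O(\log n) variant from the text), and the final loop plays the role of the reduction that merges the per-processor pred arrays:

```python
import bisect

def lightest_incident_edges(Gamma, gamma, c, p=2):
    # Simulates p processors; returns pred, where pred[v] is the
    # index in gamma/c of the lightest edge incident to v (or -1).
    n = len(Gamma) - 1
    m = len(gamma)
    local = []                           # one pred array per "processor"
    for i in range(p):
        lo, hi = i * m // p, (i + 1) * m // p
        pred = [-1] * n
        for e in range(lo, hi):
            # owner v of edge e satisfies Gamma[v] <= e < Gamma[v+1]
            v = bisect.bisect_right(Gamma, e) - 1
            if pred[v] == -1 or c[e] < c[pred[v]]:
                pred[v] = e
        local.append(pred)
    # reduction: min-combine the per-processor results
    pred = [-1] * n
    for loc in local:
        for v in range(n):
            if loc[v] != -1 and (pred[v] == -1 or c[loc[v]] < c[pred[v]]):
                pred[v] = loc[v]
    return pred
```

Only the boundary vertices of each block can appear in two local arrays, which is why the real reduction costs only O(\log p) per processor.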


Assigning subgraphs to vertices

Observe the graph that consists solely of the edges collected in the previous step. These edges are directed away from the vertex to which they are the lightest incident edge. The resulting graph decomposes into multiple weakly connected components. The goal of this step is to assign to each vertex the component of which it is a part. Note that every vertex has exactly one outgoing edge and therefore each component is a pseudotree, i.e. a tree with a single extra edge that runs in parallel to the lightest edge in the component but in the opposite direction. The following code mutates this extra edge into a loop:

 parallel forAll v \in V
     w \gets pred[v]
     if pred[w] = v \land v < w
         pred[v] \gets v

Now every weakly connected component is a directed tree where the root has a loop. This root is chosen as the representative of each component. The following code uses doubling to assign each vertex its representative:

 while \exists v \in V: pred[v] \neq pred[pred[v]]
     forAll v \in V
         pred[v] \gets pred[pred[v]]

Now every subgraph is a star. With some advanced techniques this step needs O(\frac{n}{p} + \log n) time.
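Both loops above can be sketched sequentially. Here pred[v] initially holds the other endpoint of v's lightest incident edge; building a fresh list in each doubling round mimics the synchronous parallel update:

```python
def find_representatives(pred):
    # pred[v]: other endpoint of v's lightest incident edge
    pred = pred[:]
    n = len(pred)
    # turn the two-cycle in each pseudotree into a self-loop at
    # the smaller-numbered endpoint (the component's root)
    for v in range(n):
        w = pred[v]
        if pred[w] == v and v < w:
            pred[v] = v
    # pointer doubling: after O(log n) rounds every vertex points
    # directly at its root, so each component is a star
    while any(pred[v] != pred[pred[v]] for v in range(n)):
        pred = [pred[pred[v]] for v in range(n)]
    return pred
```

Each doubling round halves the distance from every vertex to its root, which is where the \log n factor in this step comes from.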


Contracting the subgraphs

In this step each subgraph is contracted to a single vertex:

 k \gets number of subgraphs
 V' \gets \{0, \ldots, k - 1\}
 find a bijective function f: star roots \rightarrow \{0, \ldots, k - 1\}
 E' \gets \{(f(pred[u]), f(pred[v]), c(u, v)) : (u, v) \in E \land pred[u] \neq pred[v]\}

Finding the bijective function is possible in O(\frac{n}{p} + \log p) using a prefix sum. As we now have a new set of vertices and edges, the adjacency array must be rebuilt, which can be done using integer sorting on E' in O(\frac{m}{p} + \log p) time.
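The contraction can be sketched as follows, assuming pred already maps every vertex to its star root. The exclusive prefix sum over a root-indicator array realises the bijection f from the pseudocode:

```python
def contract(n, edges, pred):
    # edges: list of (w, u, v); pred[v]: star root of v's component
    is_root = [1 if pred[v] == v else 0 for v in range(n)]
    f = [0] * n
    k = 0
    for v in range(n):          # exclusive prefix sum over is_root
        f[v] = k
        k += is_root[v]
    new_edges = []
    for w, u, v in edges:
        if pred[u] != pred[v]:  # keep only inter-component edges
            new_edges.append((w, f[pred[u]], f[pred[v]]))
    return k, new_edges
```

The returned edge list may contain parallel edges between contracted vertices; as the text notes, rebuilding the adjacency array (e.g. via integer sorting on the new endpoints) handles them in the next iteration.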


Complexity

Each iteration now needs O(\frac{m}{p} + \log n) time and, just like in the sequential case, there are \log n iterations, resulting in a total runtime of O(\log n (\frac{m}{p} + \log n)). If m \in \Omega(p \log^2 p), the efficiency of the algorithm is in \Theta(1) and it is relatively efficient. If m \in O(n) then it is absolutely efficient.


Further algorithms

There are multiple other parallel algorithms that deal with the problem of finding an MST. With a linear number of processors it is possible to achieve this in O(\log n). Bader and Cong presented an MST algorithm that was five times quicker on eight cores than an optimal sequential algorithm. Another challenge is the external memory model; there is a proposed algorithm due to Dementiev et al. that is claimed to be only two to five times slower than an algorithm that only makes use of internal memory.

