HOME

TheInfoList



OR:

In
graph theory In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
, reachability refers to the ability to get from one
vertex Vertex, vertices or vertexes may refer to: Science and technology Mathematics and computer science *Vertex (geometry), a point where two or more curves, lines, or edges meet *Vertex (computer graphics), a data structure that describes the position ...
to another within a graph. A vertex s can reach a vertex t (and t is reachable from s) if there exists a sequence of adjacent vertices (i.e. a
walk Walking (also known as ambulation) is one of the main gaits of terrestrial locomotion among legged animals. Walking is typically slower than running and other gaits. Walking is defined by an 'inverted pendulum' gait in which the body vaults ov ...
) which starts with s and ends with t. In an undirected graph, reachability between all pairs of vertices can be determined by identifying the connected components of the graph. Any pair of vertices in such a graph can reach each other
if and only if In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicond ...
they belong to the same connected component; therefore, in such a graph, reachability is symmetric (s reaches t
iff In logic and related fields such as mathematics and philosophy, "if and only if" (shortened as "iff") is a biconditional logical connective between statements, where either both statements are true or both are false. The connective is bicond ...
t reaches s). The connected components of an undirected graph can be identified in linear time. The remainder of this article focuses on the more difficult problem of determining pairwise reachability in a
directed graph In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pa ...
(which, incidentally, need not be symmetric).


Definition

For a directed graph G = (V, E), with vertex set V and edge set E, the reachability relation of G is the
transitive closure In mathematics, the transitive closure of a binary relation on a set is the smallest relation on that contains and is transitive. For finite sets, "smallest" can be taken in its usual sense, of having the fewest related pairs; for infinit ...
of E, which is to say the set of all ordered pairs (s,t) of vertices in V for which there exists a sequence of vertices v_0 = s, v_1, v_2, ..., v_k = t such that the edge (v_,v_i) is in E for all 1 \leq i \leq k.. If G is acyclic, then its reachability relation is a
partial order In mathematics, especially order theory, a partially ordered set (also poset) formalizes and generalizes the intuitive concept of an ordering, sequencing, or arrangement of the elements of a set. A poset consists of a set together with a binar ...
; any partial order may be defined in this way, for instance as the reachability relation of its
transitive reduction In the mathematical field of graph theory, a transitive reduction of a directed graph is another directed graph with the same vertices and as few edges as possible, such that for all pairs of vertices , a (directed) path from to in exists i ...
. A noteworthy consequence of this is that since partial orders are anti-symmetric, if s can reach t, then we know that t ''cannot'' reach s. Intuitively, if we could travel from s to t and back to s, then G would contain a cycle, contradicting that it is acyclic. If G is directed but ''not'' acyclic (i.e. it contains at least one cycle), then its reachability relation will correspond to a
preorder In mathematics, especially in order theory, a preorder or quasiorder is a binary relation that is reflexive and transitive. Preorders are more general than equivalence relations and (non-strict) partial orders, both of which are special ca ...
instead of a partial order.


Algorithms

Algorithms for determining reachability fall into two classes: those that require preprocessing and those that do not. If you have only one (or a few) queries to make, it may be more efficient to forgo the use of more complex data structures and compute the reachability of the desired pair directly. This can be accomplished in
linear time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
using algorithms such as breadth first search or
iterative deepening depth-first search In computer science, iterative deepening search or more specifically iterative deepening depth-first search (IDS or IDDFS) is a state space/graph search strategy in which a depth-limited version of depth-first search is run repeatedly with inc ...
. If you will be making many queries, then a more sophisticated method may be used; the exact choice of method depends on the nature of the graph being analysed. In exchange for preprocessing time and some extra storage space, we can create a data structure which can then answer reachability queries on any pair of vertices in as low as O(1) time. Three different algorithms and data structures for three different, increasingly specialized situations are outlined below.


Floyd-Warshall Algorithm

The
Floyd–Warshall algorithm In computer science, the Floyd–Warshall algorithm (also known as Floyd's algorithm, the Roy–Warshall algorithm, the Roy–Floyd algorithm, or the WFI algorithm) is an algorithm for finding shortest paths in a directed weighted graph with p ...
can be used to compute the transitive closure of any directed graph, which gives rise to the reachability relation as in the definition, above. The algorithm requires O(, V, ^3) time and O(, V, ^2) space in the worst case. This algorithm is not solely interested in reachability as it also computes the shortest path distance between all pairs of vertices. For graphs containing negative cycles, shortest paths may be undefined, but reachability between pairs can still be noted.


Thorup's Algorithm

For planar digraphs, a much faster method is available, as described by Mikkel Thorup in 2004. This method can answer reachability queries on a planar graph in O(1) time after spending O(n \log) preprocessing time to create a data structure of O(n \log) size. This algorithm can also supply approximate shortest path distances, as well as route information. The overall approach is to associate with each vertex a relatively small set of so-called separator paths such that any path from a vertex v to any other vertex w must go through at least one of the separators associated with v or w. An outline of the reachability related sections follows. Given a graph G, the algorithm begins by organizing the vertices into layers starting from an arbitrary vertex v_0. The layers are built in alternating steps by first considering all vertices reachable ''from'' the previous step (starting with just v_0) and then all vertices which reach ''to'' the previous step until all vertices have been assigned to a layer. By construction of the layers, every vertex appears at most two layers, and every
directed path In graph theory, a path in a graph is a finite or infinite sequence of edges which joins a sequence of vertices which, by most definitions, are all distinct (and since the vertices are distinct, so are the edges). A directed path (sometimes c ...
, or dipath, in G is contained within two adjacent layers L_i and L_. Let k be the last layer created, that is, the lowest value for k such that \bigcup_^ L_i = V. The graph is then re-expressed as a series of digraphs G_0, G_1, \ldots, G_ where each G_i = r_i \cup L_i \cup L_ and where r_i is the contraction of all previous levels L_0 \ldots L_ into a single vertex. Because every dipath appears in at most two consecutive layers, and because each G_i is formed by two consecutive layers, every dipath in G appears in its entirety in at least one G_i (and no more than 2 consecutive such graphs) For each G_i, three separators are identified which, when removed, break the graph into three components which each contain at most 1/2 the vertices of the original. As G_i is built from two layers of opposed dipaths, each separator may consist of up to 2 dipaths, for a total of up to 6 dipaths over all of the separators. Let S be this set of dipaths. The proof that such separators can always be found is related to the
Planar Separator Theorem In graph theory, the planar separator theorem is a form of isoperimetric inequality for planar graphs, that states that any planar graph can be split into smaller pieces by removing a small number of vertices. Specifically, the removal of vertic ...
of Lipton and Tarjan, and these separators can be located in linear time. For each Q \in S, the directed nature of Q provides for a natural indexing of its vertices from the start to the end of the path. For each vertex v in G_i, we locate the first vertex in Q reachable by v, and the last vertex in Q that reaches to v. That is, we are looking at how early into Q we can get from v, and how far we can stay in Q and still get back to v. This information is stored with each v. Then for any pair of vertices u and w, u can reach w ''via'' Q if u connects to Q earlier than w connects from Q. Every vertex is labelled as above for each step of the recursion which builds G_0 \ldots, G_k. As this recursion has logarithmic depth, a total of O(\log) extra information is stored per vertex. From this point, a logarithmic time query for reachability is as simple as looking over each pair of labels for a common, suitable Q. The original paper then works to tune the query time down to O(1). In summarizing the analysis of this method, first consider that the layering approach partitions the vertices so that each vertex is considered only O(1) times. The separator phase of the algorithm breaks the graph into components which are at most 1/2 the size of the original graph, resulting in a logarithmic recursion depth. At each level of the recursion, only linear work is needed to identify the separators as well as the connections possible between vertices. The overall result is O(n \log n) preprocessing time with only O(\log) additional information stored for each vertex.


Kameda's Algorithm

An even faster method for pre-processing, due to T. Kameda in 1975, can be used if the graph is planar, acyclic, and also exhibits the following additional properties: all 0-
indegree In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pai ...
and all 0-
outdegree In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pa ...
vertices appear on the same
face The face is the front of an animal's head that features the eyes, nose and mouth, and through which animals express many of their emotions. The face is crucial for human identity, and damage such as scarring or developmental deformities may ...
(often assumed to be the outer face), and it is possible to partition the boundary of that face into two parts such that all 0-indegree vertices appear on one part, and all 0-outdegree vertices appear on the other (i.e. the two types of vertices do not alternate). If G exhibits these properties, then we can preprocess the graph in only O(n) time, and store only O(\log) extra bits per vertex, answering reachability queries for any pair of vertices in O(1) time with a simple comparison. Preprocessing performs the following steps. We add a new vertex s which has an edge to each 0-indegree vertex, and another new vertex t with edges from each 0-outdegree vertex. Note that the properties of G allow us to do so while maintaining planarity, that is, there will still be no edge crossings after these additions. For each vertex we store the list of adjacencies (out-edges) in order of the planarity of the graph (for example, clockwise with respect to the graph's embedding). We then initialize a counter i = n + 1 and begin a Depth-First Traversal from s. During this traversal, the adjacency list of each vertex is visited from left-to-right as needed. As vertices are popped from the traversal's stack, they are labelled with the value i, and i is then decremented. Note that t is always labelled with the value n+1 and s is always labelled with 0. The depth-first traversal is then repeated, but this time the adjacency list of each vertex is visited from right-to-left. When completed, s and t, and their incident edges, are removed. Each remaining vertex stores a 2-dimensional label with values from 1 to n. Given two vertices u and v, and their labels L(u) = (a_1, a_2) and L(v) =(b_1, b_2), we say that L(u) < L(v) if and only if a_1 \leq b_1, a_2 \leq b_2, and there exists at least one component a_1 or a_2 which is strictly less than b_1 or b_2, respectively. The main result of this method then states that v is reachable from u if and only if L(u) < L(v), which is easily calculated in O(1) time.


Related problems

A related problem is to solve reachability queries with some number k of vertex failures. For example: "Can vertex u still reach vertex v even though vertices s_1, s_2, ..., s_k have failed and can no longer be used?" A similar problem may consider edge failures rather than vertex failures, or a mix of the two. The breadth-first search technique works just as well on such queries, but constructing an efficient oracle is more challenging.. Another problem related to reachability queries is in quickly recalculating changes to reachability relationships when some portion of the graph is changed. For example, this is a relevant concern to
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclable m ...
which needs to balance the reclamation of memory (so that it may be reallocated) with the performance concerns of the running application.


See also

* Gammoid * ''st''-connectivity


References

{{reflist Graph connectivity