
The nearest neighbor graph (NNG) is a
directed graph defined for a set of points in a
metric space, such as the
Euclidean distance in
the plane. The NNG has a
vertex for each point, and a
directed edge from ''p'' to ''q'' whenever ''q'' is a nearest neighbor of ''p'', a point whose distance from ''p'' is minimum among all the given points other than ''p'' itself.
In many uses of these graphs, the directions of the edges are ignored and the NNG is defined instead as an
undirected graph. However, the nearest neighbor relation is not a
symmetric one, i.e., ''p'' from the definition is not necessarily a nearest neighbor for ''q''. In theoretical discussions of algorithms a kind of
general position is often assumed, namely, the nearest (k-nearest) neighbor is unique for each object. In implementations of the algorithms it is necessary to bear in mind that this is not always the case. For situations in which it is necessary to make the nearest neighbor for each object unique, the set ''P'' may be indexed and in the case of a tie the object with, e.g., the largest index may be taken as the nearest neighbor.
[
]
The ''k''-nearest neighbor graph (''k''-NNG) is a graph in which two vertices ''p'' and ''q'' are connected by an edge, if the distance between ''p'' and ''q'' is among the ''k''-th smallest distances from ''p'' to other objects from ''P''. The NNG is a special case of the ''k''-NNG, namely it is the 1-NNG. ''k''-NNGs obey a
separator theorem: they can be partitioned into two subgraphs of at most vertices each by the removal of O(''k''
1/''d''''n''
1 − 1/''d'') points.
Another variation is the farthest neighbor graph (FNG), in which each point is connected by an edge to the farthest point from it, instead of the nearest point.
NNGs for points in the plane as well as in multidimensional spaces find applications, e.g., in
data compression,
motion planning, and
facilities location. In
statistical analysis
Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability.Upton, G., Cook, I. (2008) ''Oxford Dictionary of Statistics'', OUP. . Inferential statistical analysis infers propertie ...
, the
nearest-neighbor chain algorithm based on following
paths in this graph can be used to find
hierarchical clustering
In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into ...
s quickly. Nearest neighbor graphs are also a subject of
computational geometry
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
.
The method can be used to induce a graph on nodes with unknown connectivity.
NNGs for sets of points
One-dimensional case
For a set of points on a line, the nearest neighbor of a point is its left or right (or both) neighbor, if they are sorted along the line. Therefore, the NNG is a
path or a
forest of several paths and may be constructed in
O(''n'' log ''n'') time by
sorting
Sorting refers to ordering data in an increasing or decreasing manner according to some linear relationship among the data items.
# ordering: arranging items in a sequence ordered by some criterion;
# categorizing: grouping items with similar pro ...
. This estimate is
asymptotically optimal for certain
models of computation, because the constructed NNG gives the answer to the
element uniqueness problem : it is sufficient to check whether the NNG has a zero-length edge.
Higher dimensions
Unless stated otherwise, it is assumed that the NNGs are digraphs with uniquely defined nearest neighbors as described in the introduction.
#Along any directed path in an NNG the edge lengths are non-increasing.
[
#Only cycles of length 2 are possible in an NNG and each weakly connected component of an NNG with at least 2 vertices has exactly one 2-cycle.][
#For the points in the plane the NNG is a ]planar graph
In graph theory, a planar graph is a graph that can be embedded in the plane, i.e., it can be drawn on the plane in such a way that its edges intersect only at their endpoints. In other words, it can be drawn in such a way that no edges cross ...
with vertex degrees at most 6. If points are in general position, the degree is at most 5.[
#The NNG (treated as an undirected graph with multiple nearest neighbors allowed) of a set of points in the plane or any higher dimension is a subgraph of the Delaunay triangulation, the Gabriel graph, and the ]Semi-Yao graph The ''k''-semi-Yao graph (''k''-SYG) of a set of ''n'' objects ''P'' is a geometric proximity graph, which was first described to present a kinetic data structure for maintenance of all the nearest neighbors on moving objects. It is named for its ...
. If the points are in general position or if the single nearest neighbor condition is imposed, the NNG is a forest, a subgraph of the Euclidean minimum spanning tree.
References
{{reflist
Geometric graphs