In
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. Evidence suggests that in most real-world networks, and in particular
social network
A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for ...
s, nodes tend to create tightly knit groups characterised by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes (Holland and Leinhardt, 1971; Watts and Strogatz, 1998
).
Two versions of this measure exist: the global and the local. The global version was designed to give an overall indication of the clustering in the network, whereas the local gives an indication of the embeddedness of single nodes.
Local clustering coefficient
The local clustering coefficient of a
vertex (node) in a
graph
Graph may refer to:
Mathematics
*Graph (discrete mathematics), a structure made of vertices and edges
**Graph theory, the study of such graphs and their properties
*Graph (topology), a topological space resembling a graph in the sense of discre ...
quantifies how close its
neighbours are to being a
clique
A clique ( AusE, CanE, or ), in the social sciences, is a group of individuals who interact with one another and share similar interests. Interacting with cliques is part of normative social development regardless of gender, ethnicity, or popula ...
(complete graph).
Duncan J. Watts and
Steven Strogatz introduced the measure in 1998 to determine whether a graph is a
small-world network.
A graph
formally consists of a set of vertices
and a set of edges
between them. An edge
connects vertex
with vertex
.
The
neighbourhood
A neighbourhood (British English, Irish English, Australian English and Canadian English) or neighborhood (American English; American and British English spelling differences, see spelling differences) is a geographically localised community ...
for a vertex
is defined as its immediately connected neighbours as follows:
:
We define
as the number of vertices,
, in the neighbourhood,
, of a vertex.
The local clustering coefficient
for a vertex
is then given by a proportion of the number of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. For a directed graph,
is distinct from
, and therefore for each neighbourhood
there are
links that could exist among the vertices within the neighbourhood (
is the number of neighbours of a vertex). Thus, the local clustering coefficient for directed graphs is given as
:
An undirected graph has the property that
and
are considered identical. Therefore, if a vertex
has
neighbours,
edges could exist among the vertices within the neighbourhood. Thus, the local clustering coefficient for undirected graphs can be defined as
:
Let
be the number of triangles on
for undirected graph
. That is,
is the number of subgraphs of
with 3 edges and 3 vertices, one of which is
. Let
be the number of triples on
. That is,
is the number of subgraphs (not necessarily induced) with 2 edges and 3 vertices, one of which is
and such that
is incident to both edges. Then we can also define the clustering coefficient as
:
It is simple to show that the two preceding definitions are the same, since
:
These measures are 1 if every neighbour connected to
is also connected to every other vertex within the neighbourhood, and 0 if no vertex that is connected to
connects to any other vertex that is connected to
.
Since any graph is fully specified by its
adjacency matrix
In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.
In the special case of a finite simple ...
''A'', the local clustering coefficient for a simple undirected graph can be expressed in terms of ''A'' as:
:
where:
:
and ''C
i''=0 when ''k
i'' is zero or one. In the above expression, the numerator counts twice the number of complete triangles that vertex ''i'' is involved in. In the denominator, ''k
i2'' counts the number of edge pairs that vertex ''i'' is involved in plus the number of single edges traversed twice. ''k
i'' is the number of edges connected to vertex i, and subtracting ''k
i'' then removes the latter, leaving only a set of edge pairs that could conceivably be connected into triangles. For every such edge pair, there will be another edge pair which could form the same triangle, so the denominator counts twice the number of conceivable triangles that vertex ''i'' could be involved in.
Global clustering coefficient
The global clustering coefficient is based on triplets of nodes. A triplet is three nodes that are connected by either two (open triplet) or three (closed triplet) undirected ties. A
triangle graph therefore includes three closed triplets, one centered on each of the nodes (
n.b.
(, or ; plural form ) is a Latin phrase meaning "note well".
It is often abbreviated as NB, n.b., or with the ligature
and first appeared in English writing . In Modern English, it is used, particularly in legal papers, to draw the atten ...
this means the three triplets in a triangle come from overlapping selections of nodes). The global clustering coefficient is the number of closed triplets (or 3 x triangles) over the total number of triplets (both open and closed). The first attempt to measure it was made by Luce and Perry (1949). This measure gives an indication of the clustering in the whole network (global), and can be applied to both undirected and directed networks (often called transitivity, see Wasserman and Faust, 1994, page 243).
The global clustering coefficient is defined as:
:
.
The number of closed triplets has also been referred to as 3 × triangles in the literature, so:
:
.
A generalisation to
weighted networks A weighted network is a network where the ties among nodes have weights assigned to them. A network is a system whose elements are somehow connected. The elements of a system are represented as nodes (also known as actors or vertices) and the connec ...
was proposed by Opsahl and Panzarasa (2009), and a redefinition to two-mode networks (both binary and weighted) by Opsahl (2009).
Since any graph is fully specified by its
adjacency matrix
In graph theory and computer science, an adjacency matrix is a square matrix used to represent a finite graph. The elements of the matrix indicate whether pairs of vertices are adjacent or not in the graph.
In the special case of a finite simple ...
''A'', the global clustering coefficient for an undirected graph can be expressed in terms of ''A'' as:
:
where:
:
and ''C''=0 when the denominator is zero.
Network average clustering coefficient
As an alternative to the global clustering coefficient, the overall level of clustering in a network is measured by Watts and Strogatz
as the average of the local clustering coefficients of all the vertices
:
:
It is worth noting that this metric places more weight on the low degree nodes, while the transitivity ratio places more weight on the high degree nodes.
A generalisation to
weighted networks A weighted network is a network where the ties among nodes have weights assigned to them. A network is a system whose elements are somehow connected. The elements of a system are represented as nodes (also known as actors or vertices) and the connec ...
was proposed by Barrat et al. (2004), and a redefinition to
bipartite graph
In the mathematical field of graph theory, a bipartite graph (or bigraph) is a graph whose vertices can be divided into two disjoint and independent sets U and V, that is every edge connects a vertex in U to one in V. Vertex sets U and V ar ...
s (also called two-mode networks) by Latapy et al. (2008) and Opsahl (2009).
Alternative generalisations to weighted and
directed graph
In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs.
Definition
In formal terms, a directed graph is an ordered pai ...
s have been provided by Fagiolo (2007) and Clemente and Grassi (2018).
This formula is not, by default, defined for graphs with isolated vertices; see Kaiser (2008) and Barmpoutis et al.
The networks with the largest possible average clustering coefficient are found to have a modular structure, and at the same time, they have the smallest possible average distance among the different nodes.
Percolation of clustered networks
For a random
tree-like network without degree-degree correlation, it can be shown that such network can have a
giant component
In network theory, a giant component is a connected component of a given random graph that contains a finite fraction of the entire graph's vertices.
Giant component in Erdős–Rényi model
Giant components are a prominent feature of the Erdő ...
, and the
percolation threshold
The percolation threshold is a mathematical concept in percolation theory that describes the formation of long-range connectivity in random systems. Below the threshold a giant connected component does not exist; while above it, there exists a ...
(transmission probability) is given by
, where
is the
generating function
In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary ser ...
corresponding to the
excess degree distribution.
In networks with low clustering,
, the critical point gets scaled by
such that:
This indicates that for a given degree distribution, the clustering leads to a larger percolation threshold, mainly because for a fixed number of links, the clustering structure reinforces the core of the network with the price of diluting the global connections. For networks with high clustering, strong clustering could induce the core–periphery structure, in which the core and periphery might percolate at different critical points, and the above approximate treatment is not applicable.
For studying the robustness of clustered networks a percolation approach is developed.
See also
*
Directed graph
In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs.
Definition
In formal terms, a directed graph is an ordered pai ...
*
Graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
*
Network theory
Network theory is the study of graphs as a representation of either symmetric relations or asymmetric relations between discrete objects. In computer science and network science, network theory is a part of graph theory: a network can be de ...
*
Network science
Network science is an academic field which studies complex networks such as telecommunication networks, computer networks, biological networks, cognitive and semantic networks, and social networks, considering distinct elements or actors repr ...
*
Percolation theory
In statistical physics and mathematics, percolation theory describes the behavior of a network when nodes or links are added. This is a geometric type of phase transition, since at a critical fraction of addition the network of small, disconnecte ...
*
Scale free network
A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as
:
P(k ...
*
Small world
References
External links
*
{{DEFAULTSORT:Clustering Coefficient
Graph invariants
Algebraic graph theory
Network theory
Network analysis