Network science is an academic field which studies
complex network
In the context of network theory, a complex network is a graph (network) with non-trivial topological features—features that do not occur in simple networks such as lattices or random graphs but often occur in networks representing real ...
s such as
telecommunication network
A telecommunications network is a group of nodes interconnected by telecommunications links that are used to exchange messages between the nodes. The links may use a variety of technologies based on the methodologies of circuit switching, mes ...
s,
computer network
A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections ar ...
s,
biological networks, cognitive and
semantic network
A semantic network, or frame network is a knowledge base that represents semantic relations between concepts in a network. This is often used as a form of knowledge representation. It is a directed or undirected graph consisting of vertices ...
s, and
social network
A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for ...
s, considering distinct elements or actors represented by ''nodes'' (or ''vertices'') and the connections between the elements or actors as ''links'' (or ''edges''). The field draws on theories and methods including
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
from mathematics,
statistical mechanics from physics,
data mining and
information visualization
Information is an abstract concept that refers to that which has the power to inform. At the most fundamental level information pertains to the interpretation of that which may be sensed. Any natural process that is not completely random, ...
from computer science,
inferential modeling from statistics, and
social structure
In the social sciences, social structure is the aggregate of patterned social arrangements in society that are both emergent from and determinant of the actions of individuals. Likewise, society is believed to be grouped into structurally rela ...
from sociology. The
United States National Research Council
The National Academies of Sciences, Engineering, and Medicine (also known as NASEM or the National Academies) are the collective scientific national academy of the United States. The name is used interchangeably in two senses: (1) as an umbrell ...
defines network science as "the study of network representations of physical, biological, and social phenomena leading to predictive models of these phenomena."
Background and history
The study of networks has emerged in diverse disciplines as a means of analyzing complex relational data. The earliest known paper in this field is the famous
Seven Bridges of Königsberg
The Seven Bridges of Königsberg is a historically notable problem in mathematics. Its negative resolution by Leonhard Euler in 1736 laid the foundations of graph theory and prefigured the idea of topology.
The city of Königsberg in Prussia (n ...
written by
Leonhard Euler
Leonhard Euler ( , ; 15 April 170718 September 1783) was a Swiss mathematician, physicist, astronomer, geographer, logician and engineer who founded the studies of graph theory and topology and made pioneering and influential discoveries in ma ...
in 1736. Euler's mathematical description of vertices and edges was the foundation of
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
, a branch of mathematics that studies the properties of pairwise relations in a network structure. The field of
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
continued to develop and found applications in chemistry (Sylvester, 1878).
Dénes Kőnig, a Hungarian mathematician and professor, wrote the first book in Graph Theory, entitled "Theory of finite and infinite graphs", in 1936.

In the 1930s
Jacob Moreno
Jacob Levy Moreno (born Iacob Levy; May 18, 1889 – May 14, 1974) was a Romanian-American psychiatrist, psychosociologist, and educator, the founder of psychodrama, and the foremost pioneer of group psychotherapy. During his lifetime, he was r ...
, a psychologist in the
Gestalt tradition, arrived in the United States. He developed the
sociogram and presented it to the public in April 1933 at a convention of medical scholars. Moreno claimed that "before the advent of sociometry no one knew what the interpersonal structure of a group 'precisely' looked like" (Moreno, 1953). The sociogram was a representation of the social structure of a group of elementary school students. The boys were friends of boys and the girls were friends of girls with the exception of one boy who said he liked a single girl. The feeling was not reciprocated. This network representation of social structure was found so intriguing that it was printed in
The New York Times
''The New York Times'' (''the Times'', ''NYT'', or the Gray Lady) is a daily newspaper based in New York City with a worldwide readership reported in 2020 to comprise a declining 840,000 paid print subscribers, and a growing 6 million paid ...
(April 3, 1933, page 17). The sociogram has found many applications and has grown into the field of
social network analysis
Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory. It characterizes networked structures in terms of ''nodes'' (individual actors, people, or things within the network) ...
.
Probabilistic theory in network science developed as an offshoot of
graph theory
In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
with
Paul Erdős and
Alfréd Rényi
Alfréd Rényi (20 March 1921 – 1 February 1970) was a Hungarian mathematician known for his work in probability theory, though he also made contributions in combinatorics, graph theory, and number theory.
Life
Rényi was born in Budapest ...
's eight famous papers on
random graphs
In mathematics, random graph is the general term to refer to probability distributions over graphs. Random graphs may be described simply by a probability distribution, or by a random process which generates them. The theory of random graphs ...
. For
social networks
A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for ...
the
exponential random graph model
Exponential family random graph models (ERGMs) are a family of statistical models for analyzing data from social and other networks. Examples of networks examined using ERGM include knowledge networks, organizational networks, colleague networks, ...
or p* is a notational framework used to represent the probability space of a tie occurring in a
social network
A social network is a social structure made up of a set of social actors (such as individuals or organizations), sets of dyadic ties, and other social interactions between actors. The social network perspective provides a set of methods for ...
. An alternate approach to network probability structures is the
network probability matrix
The network probability matrix describes the probability structure of a network based on the historical presence or absence of edges in a network. For example, individuals in a social network are not connected to other individuals with uniform ...
, which models the probability of edges occurring in a network, based on the historic presence or absence of the edge in a sample of networks.
In 1998,
David Krackhardt David Krackhardt is Professor of Organizations at Heinz College and the Tepper School of Business, with courtesy appointments in the Department of Social and Decision Sciences (Dietrich College of Humanities and Social Sciences) and the Machine L ...
and
Kathleen Carley
Kathleen M. Carley is an American social scientist specializing in dynamic network analysis. She is a professor in the School of Computer Science in the Institute for Software Research at Carnegie Mellon University and also holds appointments in ...
introduced the idea of a meta-network with the PCANS Model. They suggest that "all organizations are structured along these three domains, Individuals, Tasks, and Resources". Their paper introduced the concept that networks occur across multiple domains and that they are interrelated. This field has grown into another sub-discipline of network science called
dynamic network analysis
Dynamic network analysis (DNA) is an emergent scientific field that brings together traditional social network analysis (SNA), link analysis (LA), social simulation and multi-agent systems (MAS) within network science and network theory. Dynamic ...
.
More recently other network science efforts have focused on mathematically describing different network topologies.
Duncan Watts and
Steven Strogatz reconciled empirical data on networks with mathematical representation, describing the
small-world network
A small-world network is a type of mathematical graph in which most nodes are not neighbors of one another, but the neighbors of any given node are likely to be neighbors of each other and most nodes can be reached from every other node by a sm ...
.
Albert-László Barabási and
Reka Albert
Reka may refer to:
Places
* Řeka, a village in the Czech Republic
* Reka, Cerkno, a village near Cerkno, Slovenia
* Reka, Laško, a village near Laško, Slovenia
* Reka (Kladovo), a village near Kladovo, Serbia
* Reka, Koprivnica, a village nea ...
discovered
scale-free networks
A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as
:
P( ...
, a property that captures the fact that in real network hubs coexist with many small degree vertices, and offered a dynamical model to explain the origin of this scale-free state.
Department of Defense initiatives
The U.S. military first became interested in
network-centric warfare
Network-centric warfare, also called network-centric operations or net-centric warfare, is a military doctrine or theory of war that aims to translate an information advantage, enabled partly by information technology, into a competitive advant ...
as an operational concept based on network science in 1996. John A. Parmentola, the U.S. Army Director for Research and Laboratory Management, proposed to the Army's Board on Science and Technology (BAST) on December 1, 2003 that Network Science become a new Army research area. The BAST, the Division on Engineering and Physical Sciences for the National Research Council (NRC) of the National Academies, serves as a convening authority for the discussion of science and technology issues of importance to the Army and oversees independent Army-related studies conducted by the National Academies. The BAST conducted a study to find out whether identifying and funding a new field of investigation in basic research, Network Science, could help close the gap between what is needed to realize Network-Centric Operations and the current primitive state of fundamental knowledge of networks.
As a result, the BAST issued the NRC study in 2005 titled Network Science (referenced above) that defined a new field of basic research in Network Science for the Army. Based on the findings and recommendations of that study and the subsequent 2007 NRC report titled Strategy for an Army Center for Network Science, Technology, and Experimentation, Army basic research resources were redirected to initiate a new basic research program in Network Science. To build a new theoretical foundation for complex networks, some of the key Network Science research efforts now ongoing in Army laboratories address:
* Mathematical models of network behavior to predict performance with network size, complexity, and environment
* Optimized human performance required for network-enabled warfare
* Networking within ecosystems and at the molecular level in cells.
As initiated in 2004 by Frederick I. Moxley with support he solicited from David S. Alberts, the Department of Defense helped to establish the first Network Science Center in conjunction with the U.S. Army at the United States Military Academy (USMA). Under the tutelage of Dr. Moxley and the faculty of the USMA, the first interdisciplinary undergraduate courses in Network Science were taught to cadets at West Point. In order to better instill the tenets of network science among its cadre of future leaders, the USMA has also instituted a five-course undergraduate minor in Network Science.
In 2006, the U.S. Army and the United Kingdom (UK) formed the Network and Information Science
International Technology Alliance The International Technology Alliance (ITA) refers to a series of research programs that were jointly sponsored by UK Ministry of Defence (United Kingdom) and the US Army Research Laboratory (ARL). One such program focusing on network sciences NIS ...
, a collaborative partnership among the Army Research Laboratory, UK Ministry of Defense and a consortium of industries and universities in the U.S. and UK. The goal of the alliance is to perform basic research in support of Network- Centric Operations across the needs of both nations.
In 2009, the U.S. Army formed the
Network Science CTA
The Network Science Collaborative Technology Alliance (NS CTA) is a collaborative research alliance funded by the US Army Research Laboratory (ARL) and focused on fundamental research on the critical scientific and technical challenges that emerge ...
, a collaborative research alliance among the
Army Research Laboratory,
CERDEC
The Combat Capabilities Development Command (CCDC) C5ISR Center, formerly the Communications-Electronics RD&E Center (CERDEC), is the United States Army information technologies and integrated systems center. CCDC C5ISR Center is headquartered ...
, and a consortium of about 30 industrial R&D labs and universities in the U.S. The goal of the alliance is to develop a deep understanding of the underlying commonalities among intertwined social/cognitive, information, and communications networks, and as a result improve our ability to analyze, predict, design, and influence complex systems interweaving many kinds of networks.
Subsequently, as a result of these efforts, the U.S. Department of Defense has sponsored numerous research projects that support Network Science.
Network properties
Often, networks have certain attributes that can be calculated to analyze the properties & characteristics of the network. The behavior of these network properties often define
network model
The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, ...
s and can be used to analyze how certain models contrast to each other. Many of the definitions for other terms used in network science can be found in
Glossary of graph theory.
Size
The size of a network can refer to the number of nodes
or, less commonly, the number of edges
which (for connected graphs with no multi-edges) can range from
(a tree) to
(a complete graph). In the case of a simple graph (a network in which at most one (undirected) edge exists between each pair of vertices, and in which no vertices connect to themselves), we have
; for directed graphs (with no self-connected nodes),
; for directed graphs with self-connections allowed,
. In the circumstance of a graph within which multiple edges may exist between a pair of vertices,
.
Density
The density
of a network is defined as a normalized ratio between 0 and 1 of the number of edges
to the number of possible edges in a network with
nodes. Network density is a measure of the percentage of "optional" edges that exist in the network and can be computed as
where
and
are the minimum and maximum number of edges in a connected network with
nodes, respectively. In the case of simple graphs,
is given by the
binomial coefficient
In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers and is written \tbinom. It is the coefficient of the t ...
and
, giving density
.
Another possible equation is
whereas the ties
are unidirectional (Wasserman & Faust 1994). This gives a better overview over the network density, because unidirectional relationships can be measured.
Planar network density
The density
of a network, where there is no intersection between edges, is defined as a ratio of the number of edges
to the number of possible edges in a network with
nodes, given by a graph with no intersecting edges
, giving
Average degree
The
degree
Degree may refer to:
As a unit of measurement
* Degree (angle), a unit of angle measurement
** Degree of geographical latitude
** Degree of geographical longitude
* Degree symbol (°), a notation used in science, engineering, and mathemati ...
of a node is the number of edges connected to it. Closely related to the density of a network is the average degree,
(or, in the case of directed graphs,
, the former factor of 2 arising from each edge in an undirected graph contributing to the degree of two distinct vertices). In the
ER random graph model (
) we can compute the expected value of
(equal to the expected value of
of an arbitrary vertex): a random vertex has
other vertices in the network available, and with probability
, connects to each. Thus,
.
Average shortest path length (or characteristic path length)
The average shortest path length is calculated by finding the
shortest path
In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized.
The problem of finding the shortest path between t ...
between all pairs of nodes, and taking the average over all paths of the length thereof (the length being the number of intermediate edges contained in the path, i.e., the distance
between the two vertices
within the graph). This shows us, on average, the number of steps it takes to get from one member of the network to another. The behavior of the expected average shortest path length (that is, the ensemble average of the average shortest path length) as a function of the number of vertices
of a random network model defines whether that model exhibits the small-world effect; if it scales as
, the model generates small-world nets. For faster-than-logarithmic growth, the model does not produce small worlds. The special case of
is known as ultra-small world effect.
Diameter of a network
As another means of measuring network graphs, we can define the diameter of a network as the longest of all the calculated shortest paths in a network. It is the shortest distance between the two most distant nodes in the network. In other words, once the shortest path length from every node to all other nodes is calculated, the diameter is the longest of all the calculated path lengths. The diameter is representative of the linear size of a network. If node A-B-C-D are connected, going from A->D this would be the diameter of 3 (3-hops, 3-links).
Clustering coefficient
The clustering coefficient is a measure of an "all-my-friends-know-each-other" property. This is sometimes described as the friends of my friends are my friends. More precisely, the clustering coefficient of a node is the ratio of existing links connecting a node's neighbors to each other to the maximum possible number of such links. The clustering coefficient for the entire network is the average of the clustering coefficients of all the nodes. A high clustering coefficient for a network is another indication of a
small world.
The clustering coefficient of the
'th node is
:
where
is the number of neighbours of the
'th node, and
is the number of connections between these neighbours. The maximum possible number of connections between neighbors is, then,
:
From a probabilistic standpoint, the expected local clustering coefficient is the likelihood of a link existing between two arbitrary neighbors of the same node.
Connectedness
The way in which a network is connected plays a large part into how networks are analyzed and interpreted. Networks are classified in four different categories:
* ''Clique''/''Complete Graph'': a completely connected network, where all nodes are connected to every other node. These networks are symmetric in that all nodes have in-links and out-links from all others.
* ''Giant Component'': A single connected component which contains most of the nodes in the network.
* ''Weakly Connected Component'': A collection of nodes in which there exists a path from any node to any other, ignoring directionality of the edges.
* ''Strongly Connected Component'': A collection of nodes in which there exists a ''directed'' path from any node to any other.
Node centrality
Centrality indices produce rankings which seek to identify the most important nodes in a network model. Different centrality indices encode different contexts for the word "importance." The
betweenness centrality
In graph theory, betweenness centrality (or "betweeness centrality") is a measure of centrality in a graph based on shortest paths. For every pair of vertices in a connected graph, there exists at least one shortest path between the vertices suc ...
, for example, considers a node highly important if it form bridges between many other nodes. The
eigenvalue centrality, in contrast, considers a node highly important if many other highly important nodes link to it. Hundreds of such measures have been proposed in the literature.
Centrality indices are only accurate for identifying the most important nodes. The measures are seldom, if ever, meaningful for the remainder of network nodes.
[
][
] Also, their indications are only accurate within their assumed context for importance, and tend to "get it wrong" for other contexts.
[
] For example, imagine two separate communities whose only link is an edge between the most junior member of each community. Since any transfer from one community to the other must go over this link, the two junior members will have high betweenness centrality. But, since they are junior, (presumably) they have few connections to the "important" nodes in their community, meaning their eigenvalue centrality would be quite low.
Node influence
Limitations to centrality measures have led to the development of more general measures.
Two examples are
the accessibility, which uses the diversity of random walks to measure how accessible the rest of the network is from a given start node,
and the expected force, derived from the expected value of the
force of infection generated by a node.
Both of these measures can be meaningfully computed from the structure of the network alone.
Community structure
Nodes in a network may be partitioned into groups representing communities. Depending on the context, communities may be distinct or overlapping. Typically, nodes in such communities will be strongly connected to other nodes in the same community, but weakly connected to nodes outside the community. In the absence of a
ground truth
Ground truth is information that is known to be real or true, provided by direct observation and measurement (i.e. empirical evidence) as opposed to information provided by inference.
Etymology
The ''Oxford English Dictionary'' (s.v. "ground t ...
describing the
community structure
In the study of complex networks, a network is said to have community structure if the nodes of the network can be easily grouped into (potentially overlapping) sets of nodes such that each set of nodes is densely connected internally. In the par ...
of a specific network, several algorithms have been developed to infer possible community structures using either supervised of unsupervised clustering methods.
Network models
Network models serve as a foundation to understanding interactions within empirical complex networks. Various
random graph generation models produce network structures that may be used in comparison to real-world complex networks.
Erdős–Rényi random graph model

The
Erdős–Rényi model
In the mathematical field of graph theory, the Erdős–Rényi model is either of two closely related models for generating random graphs or the evolution of a random network. They are named after Hungarian mathematicians Paul Erdős and Alf ...
, named for
Paul Erdős and
Alfréd Rényi
Alfréd Rényi (20 March 1921 – 1 February 1970) was a Hungarian mathematician known for his work in probability theory, though he also made contributions in combinatorics, graph theory, and number theory.
Life
Rényi was born in Budapest ...
, is used for generating
random graphs in which edges are set between nodes with equal probabilities. It can be used in the
probabilistic method to prove the existence of graphs satisfying various properties, or to provide a rigorous definition of what it means for a property to hold for almost all graphs.
To generate an Erdős–Rényi model
two parameters must be specified: the total number of nodes and the probability that a random pair of nodes has an edge.
Because the model is generated without bias to particular nodes, the degree distribution is binomial: for a randomly chosen vertex
,
:
In this model the clustering coefficient is
a.s. The behavior of
can be broken into three regions.
''Subcritical''
: All components are simple and very small, the largest component has size
;
''Critical''
:
;
''Supercritical''
:
where
is the positive solution to the equation
.
The largest connected component has high complexity. All other components are simple and small
.
Configuration model
The configuration model takes a degree sequence
or degree distribution
(which subsequently is used to generate a degree sequence) as the input, and produces randomly connected graphs in all respects other than the degree sequence. This means that for a given choice of the degree sequence, the graph is chosen uniformly at random from the set of all graphs that comply with this degree sequence. The degree
of a randomly chosen vertex is an
independent and identically distributed
In probability theory and statistics, a collection of random variables is independent and identically distributed if each random variable has the same probability distribution as the others and all are mutually independent. This property is usua ...
random variable with integer values. When
, the configuration graph contains the
giant connected component, which has infinite size.
The rest of the components have finite sizes, which can be quantified with the notion of the size distribution. The probability
that a randomly sampled node is connected to a component of size
is given by
convolution powers of the degree distribution:
where
denotes the degree distribution and
. The giant component can be destroyed by randomly removing the critical fraction
of all edges. This process is called
percolation on random networks. When the second moment of the degree distribution is finite,
, this critical edge fraction is given by
, and the
average vertex-vertex distance in the giant component scales logarithmically with the total size of the network,
.
In the directed configuration model, the degree of a node is given by two numbers, in-degree
and out-degree
, and consequently, the degree distribution is two-variate. The expected number of in-edges and out-edges coincides, so that