In
computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, ...
, Chord is a protocol and
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
for a
peer-to-peer
Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network. They are said to form a peer-to-peer ...
distributed hash table
A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table: key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The ...
. A distributed hash table stores
key-value pairs by assigning keys to different computers (known as "nodes"); a node will store the values for all the keys for which it is responsible. Chord specifies how keys are assigned to nodes, and how a node can discover the value for a given key by first locating the node responsible for that key.
Chord is one of the four original
distributed hash table
A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table: key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The ...
protocols, along with
CAN,
Tapestry
Tapestry is a form of textile art, traditionally woven by hand on a loom. Tapestry is weft-faced weaving, in which all the warp threads are hidden in the completed work, unlike most woven textiles, where both the warp and the weft threads may ...
, and
Pastry
Pastry is baked food made with a dough of flour, water and shortening (solid fats, including butter or lard) that may be savoury or sweetened. Sweetened pastries are often described as '' bakers' confectionery''. The word "pastries" suggests ...
. It was introduced in 2001 by
Ion Stoica
Ion Stoica is a Romanian-American computer scientist specializing in distributed systems, cloud computing and computer networking. He is a professor of computer science at the University of California, Berkeley and co-director of AMPLab. He co-fo ...
,
Robert Morris,
David Karger
David Ron Karger (born May 1, 1967) is a professor of computer science and a member of the Computer Science and Artificial Intelligence Laboratory ( CSAIL) at the Massachusetts Institute of Technology.
Education
Karger received a Bachelor of A ...
,
Frans Kaashoek, and
Hari Balakrishnan
Hari Balakrishnan is the Fujitsu Professor of Computer Science and Artificial Intelligence in the Department of Electrical Engineering and Computer Science at MIT, and the Co-founder and CTO at Cambridge Mobile Telematics.
Early life and car ...
, and was developed at
MIT
The Massachusetts Institute of Technology (MIT) is a private land-grant research university in Cambridge, Massachusetts. Established in 1861, MIT has played a key role in the development of modern technology and science, and is one of the ...
.
The 2001 Chord paper
won a ACM SIGCOMM Test of Time award in 2011.
Subsequent research by
Pamela Zave has shown that the original Chord algorithm (as specified in the 2001 SIGCOMM paper,
the 2001 Technical report,
the 2002 PODC paper,
[
] and
the 2003 TON paper
[
]) can mis-order the ring, produce several rings, and break the ring.
Overview

Nodes and keys are assigned an
-bit ''identifier'' using ''
consistent hashing''. The
SHA-1
In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20- byte) hash value known as a message digest – typically rendered as 40 hexadec ...
algorithm is the base
hashing function
A hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by a hash function are called ''hash values'', ''hash codes'', ''digests'', or simply ''hashes''. The values are usually ...
for consistent hashing. Consistent hashing is integral to the robustness and performance of Chord because both keys and nodes (in fact, their
IP address
An Internet Protocol address (IP address) is a numerical label such as that is connected to a computer network that uses the Internet Protocol for communication.. Updated by . An IP address serves two main functions: network interface ident ...
es) are uniformly distributed in the same identifier space with a negligible possibility of
collision
In physics, a collision is any event in which two or more bodies exert forces on each other in a relatively short time. Although the most common use of the word ''collision'' refers to incidents in which two or more objects collide with great fo ...
. Thus, it also allows nodes to join and leave the network without disruption. In the protocol, the term ''node'' is used to refer to both a node itself and its identifier (ID) without ambiguity. So is the term ''key''.
Using the Chord lookup protocol, nodes and keys are arranged in an identifier circle that has at most
nodes, ranging from
to
. (
should be large enough to avoid collision.) Some of these nodes will map to machines or keys while others (most) will be empty.
Each node has a ''successor'' and a ''predecessor''. The successor to a node is the next node in the identifier circle in a clockwise direction. The predecessor is counter-clockwise. If there is a node for each possible ID, the successor of node 0 is node 1, and the predecessor of node 0 is node
; however, normally there are "holes" in the sequence. For example, the successor of node 153 may be node 167 (and nodes from 154 to 166 do not exist); in this case, the predecessor of node 167 will be node 153.
The concept of successor can be used for keys as well. The ''successor node'' of a key
is the first node whose ID equals to
or follows
in the identifier circle, denoted by
. Every key is assigned to (stored at) its successor node, so looking up a key
is to query
.
Since the successor (or predecessor) of a node may disappear from the network (because of failure or departure), each node records an arc of
nodes in the middle of which it stands, i.e., the list of
nodes preceding it and
nodes following it. This list results in a high probability that a node is able to correctly locate its successor or predecessor, even if the network in question suffers from a high failure rate.
Protocol details
Basic query
The core usage of the Chord protocol is to query a key from a client (generally a node as well), i.e. to find
. The basic approach is to pass the query to a node's successor, if it cannot find the key locally. This will lead to a
query time where
is the number of machines in the ring.
Finger table
To avoid the linear search above, Chord implements a faster search method by requiring each node to keep a ''finger table'' containing up to
entries, recall that
is the number of bits in the hash key. The
entry of node
will contain
. The first entry of finger table is actually the node's immediate successor (and therefore an extra successor field is not needed). Every time a node wants to look up a key
, it will pass the query to the closest successor or predecessor (depending on the finger table) of
in its finger table (the "largest" one on the circle whose ID is smaller than
), until a node finds out the key is stored in its immediate successor.
With such a finger table, the number of nodes that must be contacted to find a successor in an ''N''-node network is
. (See proof below.)
Node join
Whenever a new node joins, three invariants should be maintained (the first two ensure correctness and the last one keeps querying fast):
# Each node's successor points to its immediate successor correctly.
# Each key is stored in
.
# Each node's finger table should be correct.
To satisfy these invariants, a ''predecessor'' field is maintained for each node. As the successor is the first entry of the finger table, we do not need to maintain this field separately any more. The following tasks should be done for a newly joined node
:
# Initialize node
(the predecessor and the finger table).
# Notify other nodes to update their predecessors and finger tables.
# The new node takes over its responsible keys from its successor.
The predecessor of
can be easily obtained from the predecessor of
(in the previous circle). As for its finger table, there are various initialization methods. The simplest one is to execute find successor queries for all
entries, resulting in
initialization time. A better method is to check whether
entry in the finger table is still correct for the
entry. This will lead to
. The best method is to initialize the finger table from its immediate neighbours and make some updates, which is
.
Stabilization
To ensure correct lookups, all successor pointers must be up to date. Therefore, a stabilization protocol is running periodically in the background which updates finger tables and successor pointers.
The stabilization protocol works as follows:
* Stabilize(): n asks its successor for its predecessor p and decides whether p should be n's successor instead (this is the case if p recently joined the system).
* Notify(): notifies n's successor of its existence, so it can change its predecessor to n
* Fix_fingers(): updates finger tables
Potential uses
*Cooperative Mirroring: A
load balancing mechanism by a local network hosting information available to computers outside of the local network. This scheme could allow developers to balance the load between many computers instead of a central server to ensure availability of their product.
*Time-shared storage: In a network, once a computer joins the network its available data is distributed throughout the network for retrieval when that computer disconnects from the network. As well as other computers' data is sent to the computer in question for offline retrieval when they are no longer connected to the network. Mainly for nodes without the ability to connect full-time to the network.
*Distributed Indices: Retrieval of files over the network within a searchable database. e.g. P2P file transfer clients.
*Large scale combinatorial searches: Keys being candidate solutions to a problem and each key mapping to the node, or computer, that is responsible for evaluating them as a solution or not. e.g. Code Breaking
*Also used in wireless sensor networks for reliability
Proof sketches
With high probability, Chord contacts
nodes to find a successor in an
-node network.
Suppose node
wishes to find the successor of key
. Let
be the predecessor of
. We wish to find an upper bound for the number of steps it takes for a message to be routed from
to
. Node
will examine its finger table and route the request to the closest predecessor of
that it has. Call this node
. If
is the
entry in
's finger table, then both
and
are at distances between
and
from
along the identifier circle. Hence, the distance between
and
along this circle is at most
. Thus the distance from
to
is less than the distance from
to
: the new distance to
is at most half the initial distance.
This process of halving the remaining distance repeats itself, so after
steps, the distance remaining to
is at most
; in particular, after
steps, the remaining distance is at most
. Because nodes are distributed uniformly at random along the identifier circle, the expected number of nodes falling within an interval of this length is 1, and with high probability, there are fewer than
such nodes. Because the message always advances by at least one node, it takes at most
steps for a message to traverse this remaining distance. The total expected routing time is thus
.
If Chord keeps track of
predecessors/successors, then with high probability, if each node has probability of 1/4 of failing, find_successor (see below) and find_predecessor (see below) will return the correct nodes
Simply, the probability that all
nodes fail is
, which is a low probability; so with high probability at least one of them is alive and the node will have the correct pointer.
Pseudocode
;Definitions for pseudocode
:;finger
first node that succeeds
:;successor: the next node from the node in question on the identifier ring
:;predecessor: the previous node from the node in question on the identifier ring
The pseudocode to find the ''successor'' node of an id is given below:
''// ask node n to find the successor of id''
n.find_successor(id)
''// Yes, that should be a closing square bracket to match the opening parenthesis.''
''// It is a half closed interval.''
if id ∈ (n, successor] then
return successor
else
// forward the query around the circle
n0 := closest_preceding_node(id)
return n0.find_successor(id)
''// search the local table for the highest predecessor of id''
n.closest_preceding_node(id)
for i = m downto 1 do
if (finger
∈ (n, id)) then
return finger
return n
The pseudocode to stabilize the chord ring/circle after node joins and departures is as follows:
''// create a new Chord ring.''
n.create()
predecessor := nil
successor := n
''// join a Chord ring containing node n'.''
n.join(n')
predecessor := nil
successor := n'.find_successor(n)
''// called periodically. n asks the successor''
''// about its predecessor, verifies if n's immediate''
''// successor is consistent, and tells the successor about n''
n.stabilize()
x = successor.predecessor
if x ∈ (n, successor) then
successor := x
successor.notify(n)
''// n' thinks it might be our predecessor.''
n.notify(n')
if predecessor is nil or n'∈(predecessor, n) then
predecessor := n'
''// called periodically. refreshes finger table entries.''
''// next stores the index of the finger to fix''
n.fix_fingers()
next := next + 1
if next > m then
next := 1
finger
ext
Ext, ext or EXT may refer to:
* Ext functor, used in the mathematical field of homological algebra
* Ext (JavaScript library), a programming library used to build interactive web applications
* Exeter Airport (IATA airport code), in Devon, Engla ...
:= find_successor(n+2);
''// called periodically. checks whether predecessor has failed.''
n.check_predecessor()
if predecessor has failed then
predecessor := nil
See also
*
Kademlia
*
Koorde
In peer-to-peer networks, Koorde is a distributed hash table (DHT) system based on the Chord DHT and the De Bruijn graph (De Bruijn sequence). Inheriting the simplicity of Chord, Koorde meets hops per node (where is the number of nodes in the D ...
*
OverSim - the overlay simulation framework
*
SimGrid - a toolkit for the simulation of distributed applications -
References
External links
The Chord Project(redirect from: http://pdos.lcs.mit.edu/chord/)
Open Chord - An Open Source Java ImplementationChordless - Another Open Source Java ImplementationjDHTUQ- An open source java implementation. API to generalize the implementation of peer-to-peer DHT systems. Contains GUI in mode data structure
{{DEFAULTSORT:Chord (Peer-To-Peer)
Articles with example pseudocode
Distributed data storage
Software using the MIT license