The content-addressable network (CAN) is a distributed, decentralized
P2P infrastructure that provides
hash table
In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...
functionality on an
Internet
The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a ''internetworking, network of networks'' that consists ...
-like scale. CAN was one of the original four
distributed hash table proposals, introduced concurrently with
Chord
Chord may refer to:
* Chord (music), an aggregate of musical pitches sounded simultaneously
** Guitar chord a chord played on a guitar, which has a particular tuning
* Chord (geometry), a line segment joining two points on a curve
* Chord ( ...
,
Pastry
Pastry is baked food made with a dough of flour, water and shortening (solid fats, including butter or lard) that may be savoury or sweetened. Sweetened pastries are often described as '' bakers' confectionery''. The word "pastries" suggests ...
, and
Tapestry
Tapestry is a form of textile art, traditionally woven by hand on a loom. Tapestry is weft-faced weaving, in which all the warp threads are hidden in the completed work, unlike most woven textiles, where both the warp and the weft threads may ...
.
Overview
Like other distributed hash tables, CAN is designed to be
scalable,
fault tolerant, and
self-organizing. The architectural design is a virtual multi-dimensional
Cartesian coordinate space
In mathematics and physics, a vector space (also called a linear space) is a set whose elements, often called ''vectors'', may be added together and multiplied ("scaled") by numbers called '' scalars''. Scalars are often real numbers, but c ...
, a type of
overlay network, on a multi-
torus
In geometry, a torus (plural tori, colloquially donut or doughnut) is a surface of revolution generated by revolving a circle in three-dimensional space about an axis that is coplanar with the circle.
If the axis of revolution does not ...
. This n-dimensional coordinate space is a
virtual logical address, completely independent of the physical location and physical connectivity of the nodes.
Points within the space are identified with coordinates. The entire coordinate space is dynamically partitioned among all the nodes in the system such that every node possesses at least one distinct zone within the overall space.
Routing
A CAN node maintains a
routing table
In computer networking, a routing table, or routing information base (RIB), is a data table stored in a router or a network host that lists the routes to particular network destinations, and in some cases, metrics (distances) associated with t ...
that holds the
IP address
An Internet Protocol address (IP address) is a numerical label such as that is connected to a computer network that uses the Internet Protocol for communication.. Updated by . An IP address serves two main functions: network interface ident ...
and virtual coordinate zone of each of its neighbors. A node routes a message towards a destination point in the coordinate space. The node first determines which neighboring zone is closest to the destination point, and then looks up that zone's node's IP address via the routing table.
[
]
Node joining
To join a CAN, a joining node must:
# Find a node already in the overlay network.
# Identify a zone that can be split
# Update the routing tables of nodes neighboring the newly split zone.[
To find a node already in the overlay network, bootstrapping nodes may be used to inform the joining node of IP addresses of nodes currently in the overlay network.][
After the joining node receives an IP address of a node already in the CAN, it can attempt to identify a zone for itself. The joining node randomly picks a point in the coordinate space and sends a join request, directed to the random point, to one of the received IP addresses. The nodes already in the overlay network route the join request to the correct device via their zone-to-IP routing tables. Once the node managing the destination point's zone receives the join request, it may honor the join request by splitting its zone in half, allocating itself the first half, and allocating the joining node the second half. If it does not honor the join request, the joining node keeps picking random points in the coordinate space and sending join requests directed to these random points until it successfully joins the network.][
After the zone split and allocation is complete, the neighboring nodes are updated with the coordinates of the two new zones and the corresponding IP addresses. Routing tables are updated and updates are propagated across the network.][
]
Node departing
To handle a node departing, the CAN must
# identify a node is departing
# have the departing node's zone merged or taken over by a neighboring node
# update the routing tables across the network.[
Detecting a node's departure can be done, for instance, via heartbeat messages that periodically broadcast routing table information between neighbors. After a predetermined period of silence from a neighbor, that neighboring node is determined as failed and is considered a departing node.][ Alternatively, a node that is willingly departing may broadcast such a notice to its neighbors.
After a departing node is identified, its zone must be either merged or taken over. First the departed node's zone is analyzed to determine whether a neighboring node's zone can merge with the departed node's zone to form a valid zone. For example, a zone in a 2d coordinate space must be either a square or rectangle and cannot be L-shaped. The validation test may cycle through all neighboring zones to determine if a successful merge can occur. If one of the potential merges is deemed a valid merge, the zones are then merged. If none of the potential merges are deemed valid, then the neighboring node with the smallest zone takes over control of the departing node's zone.][ After a take-over, the take-over node may periodically attempt to merge its additionally controlled zones with respective neighboring zones.
If the merge is successful, routing tables of neighboring zones' nodes are updated to reflect the merge. The network will see the subsection of the overlay network as one, single zone after a merge and treat all routing processing with this mindset. To effectuate a take-over, the take-over node updates neighboring zones' nodes' routing tables, so that requests to either zone resolve to the take-over node. And, as such, the network still sees the subsection of the overlay network as two separate zones and treats all routing processing with this mindset.
]
Developers
Sylvia Ratnasamy, Paul Francis, Mark Handley
Mark Handley is a playwright and screenwriter.
In 1977, he and his wife moved to the Pacific Northwest where they lived in isolation in a log cabin that they built themselves. He is best known for his play '' Idioglossia'', which was later pro ...
, Richard Karp, Scott Shenker
See also
* Chord
Chord may refer to:
* Chord (music), an aggregate of musical pitches sounded simultaneously
** Guitar chord a chord played on a guitar, which has a particular tuning
* Chord (geometry), a line segment joining two points on a curve
* Chord ( ...
* Pastry
Pastry is baked food made with a dough of flour, water and shortening (solid fats, including butter or lard) that may be savoury or sweetened. Sweetened pastries are often described as '' bakers' confectionery''. The word "pastries" suggests ...
* Tapestry
Tapestry is a form of textile art, traditionally woven by hand on a loom. Tapestry is weft-faced weaving, in which all the warp threads are hidden in the completed work, unlike most woven textiles, where both the warp and the weft threads may ...
References
{{reflist
Routing
Distributed data storage
Peer-to-peer computing
Hash based data structures