A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. The term usually refers specifically to either a distributed database where users store information on a ''number of nodes'', or a computer network in which users store information on a ''number of peer network nodes''.
Distributed databases
Distributed databases are usually non-relational databases that enable quick access to data over a large number of nodes. Some distributed databases expose rich query abilities while others are limited to key-value store semantics. Examples of limited distributed databases are Google's Bigtable (which is much more than a distributed file system or a peer-to-peer network), Amazon's Dynamo, and Microsoft Azure Storage.
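As a rough illustration of key-value store semantics spread over several nodes, the following minimal sketch hashes each key to pick a primary node and copies the value to the next node as a replica. The class `TinyKeyValueCluster`, its methods, and the placement rule are hypothetical and are not taken from Bigtable, Dynamo, or Azure Storage.

```python
import hashlib


class TinyKeyValueCluster:
    """Hypothetical in-memory cluster: each node is just a dict."""

    def __init__(self, node_count, replicas=2):
        if not 1 <= replicas <= node_count:
            raise ValueError("replicas must be between 1 and node_count")
        self.nodes = [{} for _ in range(node_count)]
        self.replicas = replicas

    def _owners(self, key):
        # Hash the key to pick a primary node, then use the following
        # nodes (wrapping around) as replicas.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        primary = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every node that owns the key.
        for index in self._owners(key):
            self.nodes[index][key] = value

    def get(self, key, down=frozenset()):
        # Read from the first owner that is not marked as down.
        for index in self._owners(key):
            if index not in down:
                return self.nodes[index].get(key)
        return None


cluster = TinyKeyValueCluster(node_count=4, replicas=2)
cluster.put("user:42", "Ada")
primary = cluster._owners("user:42")[0]
# The value stays readable even when the primary node is unavailable.
print(cluster.get("user:42", down=frozenset({primary})))
```

The interface is limited to `put` and `get` by key; there is no way to query by value, which is the trade-off the paragraph above describes.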
Because the ability to run arbitrary queries is less important than availability, designers of distributed data stores have increased availability at the expense of consistency. High-speed read/write access therefore comes with reduced consistency, since it is not possible to guarantee both consistency and availability on a partitioned network, as stated by the CAP theorem.
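The trade-off can be sketched with a toy store of two replicas. This is only an illustration under simplifying assumptions: the class `TwoReplicaStore`, the `partitioned` flag, and the `prefer_availability` switch are hypothetical names, and real systems use far more elaborate protocols.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoReplicaStore:
    """Hypothetical two-replica store with a switchable policy."""

    def __init__(self, prefer_availability):
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False
        self.prefer_availability = prefer_availability

    def write(self, via, key, value):
        local, remote = (self.a, self.b) if via == "a" else (self.b, self.a)
        if self.partitioned:
            if not self.prefer_availability:
                # Consistency first: refuse the write rather than diverge.
                return False
            # Availability first: accept locally; the other replica is stale.
            local.data[key] = value
            return True
        # No partition: both replicas receive the write.
        local.data[key] = value
        remote.data[key] = value
        return True

    def read(self, via, key):
        replica = self.a if via == "a" else self.b
        return replica.data.get(key)


store = TwoReplicaStore(prefer_availability=True)
store.write("a", "x", 1)
store.partitioned = True
store.write("a", "x", 2)
# With availability preferred, the two replicas now return different values.
print(store.read("a", "x"), store.read("b", "x"))
```

With `prefer_availability=False`, the second write would be rejected instead, so both replicas keep agreeing but the store is not fully available during the partition.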
Peer network node data stores
In peer network data stores, the user can usually reciprocate and allow other users to use their computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network.
Most peer-to-peer networks do not have distributed data stores, in that the user's data is only available when their node is on the network. However, this distinction is somewhat blurred in a system such as BitTorrent, where the originating node can go offline while the content continues to be served. Still, this is only the case for individual files requested by the redistributors, in contrast to networks such as Freenet, Winny, Share and Perfect Dark, where any node may be storing any part of the files on the network.
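The idea that any node may hold any part of a file can be sketched as follows. This is a simplified illustration, not the actual protocol of BitTorrent, Freenet, or the other networks named above: peers are modeled as plain dictionaries, chunks are identified by their SHA-256 hash, and the function names and round-robin placement are assumptions made for the example.

```python
import hashlib


def split_into_chunks(data, chunk_size):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def publish(data, peers, chunk_size=4, copies=2):
    """Store each chunk on `copies` peers; return the ordered chunk hashes."""
    manifest = []
    for number, chunk in enumerate(split_into_chunks(data, chunk_size)):
        chunk_id = hashlib.sha256(chunk).hexdigest()
        manifest.append(chunk_id)
        for offset in range(copies):
            # Round-robin placement (an assumption for this sketch).
            peer = peers[(number + offset) % len(peers)]
            peer[chunk_id] = chunk
    return manifest


def fetch(manifest, online_peers):
    """Reassemble the file from whichever online peers hold each chunk."""
    parts = []
    for chunk_id in manifest:
        for peer in online_peers:
            chunk = peer.get(chunk_id)
            # Verify the chunk against its hash before trusting it.
            if chunk is not None and hashlib.sha256(chunk).hexdigest() == chunk_id:
                parts.append(chunk)
                break
        else:
            raise LookupError(f"chunk {chunk_id[:8]} is not available")
    return b"".join(parts)


peers = [{} for _ in range(3)]
manifest = publish(b"distributed data", peers)
# Peer 0 goes offline; the remaining peers still hold every chunk.
print(fetch(manifest, peers[1:]))
```

Because every chunk lives on more than one peer, the file survives the loss of any single peer in this sketch; losing two of the three peers makes `fetch` raise `LookupError`.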
Distributed data stores typically use an error detection and correction technique. Some distributed data stores (such as Parchive over NNTP) use forward error correction techniques to recover the original file when parts of that file are damaged or unavailable. Others instead retry the download of that file from a different mirror.
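A minimal sketch of forward error correction is a single XOR parity block, which lets a receiver rebuild any one missing block without asking for it again. This is a deliberately simpler scheme than the codes used by tools such as Parchive, and the function names here are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def add_parity(blocks):
    """Return the data blocks followed by one parity block."""
    return list(blocks) + [xor_blocks(blocks)]


def recover(received):
    """Rebuild at most one missing block (marked None); return the data blocks."""
    missing = [i for i, block in enumerate(received) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one missing block")
    repaired = list(received)
    if missing:
        # The XOR of all surviving blocks equals the missing block.
        present = [block for block in received if block is not None]
        repaired[missing[0]] = xor_blocks(present)
    return repaired[:-1]  # drop the parity block


data = [b"abcd", b"efgh", b"ijkl"]
coded = add_parity(data)
coded[1] = None  # simulate a block that was damaged or never arrived
print(recover(coded) == data)
```

The cost is one extra block of storage or bandwidth; tolerating more than one lost block requires a stronger code than this single-parity sketch.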
Examples
Distributed non-relational databases
Peer network node data stores
* BitTorrent
* Blockchain (database)
* Chord project
* Freenet
* GNUnet
* IPFS
* Mnet
* Napster
* NNTP (the distributed data storage protocol used for Usenet news)
* Unity, of the software Perfect Dark
* Share
* Siacoin
* DeNet
* Storage@home
* STORJ
* Tahoe-LAFS
* Winny
* ZeroNet
See also
* Cooperative storage cloud
* Data store
* Distributed file system
* Keyspace, the DDS schema
* Peer-to-peer
* Distributed hash table
* Distributed cache
* Cyber Resilience