Shardmap
   HOME

TheInfoList



OR:

Shardmap is a directory index design by Daniel Phillips who created the HTree and PHTree tree data structures and the
Tux3 Tux3 is an open-source versioning filesystem created by Daniel Phillips. He introduced the filesystem as a public replacement for his Tux2 filesystem which had encountered licensing issues due to the filing of several patents. Phillips had previo ...
file system. A Shardmap index consists of a scalable number of index shards. Each shard entry maps a hash key to the logical block number of a directory entry block known to contain a name that hashes to that key. Each shard is represented as an unsorted fifo on disk and a small
hash table In computer science, a hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps Unique key, keys to Value (computer science), values. ...
in memory. Shardmap scales in two ways: # Rehash a cached shard to a larger number of hash buckets # Reshard a stored shard fifo to divide it into multiple, smaller shards. These operations are staggered to avoid latency spikes. The reshard operation imposes a modest degree of write multiplication on the Shardmap design, asymptotically approaching a factor of two. The key ideas of Shardmap are: # The representation of directory data is not the same on media as it is in cache. On media we have fifos, but in cache we have hash tables. # Updating a fifo is cache efficient. Only the tail block of the fifo needs to be present in cache. The cache footprint of the media image of a shardmap is therefore just one disk block per shard. # A small fifo on media is easily loaded and converted to an efficient hash table shard on demand. Once in cache, index updates are performed by updating the cached hash table and appending the same entries to the final block of the shard fifo. The shardmap implementation in the
Tux3 Tux3 is an open-source versioning filesystem created by Daniel Phillips. He introduced the filesystem as a public replacement for his Tux2 filesystem which had encountered licensing issues due to the filing of several patents. Phillips had previo ...
file system uses
SipHash SipHash is an add–rotate–xor (ARX) based family of pseudorandom functions created by Jean-Philippe Aumasson and Daniel J. Bernstein in 2012, in response to a spate of "hash flooding" denial-of-service attacks (HashDoS) in late 2011. SipHash ...
hash function designed by Jean-Philippe Aumasson and Daniel J. Bernstein.


See also

* Dirhash


External links

* https://lkml.org/lkml/2013/6/18/869 on the
Linux kernel mailing list The Linux kernel mailing list (LKML) is the main electronic mailing list for Linux kernel development, where the majority of the announcements, discussions, debates, and flame wars over the kernel take place. Many other mailing lists exist to d ...
(LKML) Database index techniques