UUHash

UUHash is a hash algorithm employed by clients on the FastTrack network. It is used because it can hash very large files in a very short time, even on older computers. However, this speed is achieved by hashing only a fraction of the file. This weakness makes it trivial to create a hash collision: large sections of a file can be completely altered without altering the checksum. This method is used by Kazaa. The weakness of UUHash is exploited by anti-P2P agencies to corrupt downloads.


How it works

The UUHash is a 160-bit string that is usually Base64-encoded for presentation. It is a concatenation of an MD5 hash and a CRC32 sum of selected chunks of the file (sig2dat source code, file ''sig2dat.c'', function ''GetHashWin32'', retrieved 2014-08-20).

The first 307,200 bytes (300 KiB, one "chunk size") of the file are MD5-hashed (fewer if the file is shorter). The 32-bit little-endian integer value ''smallhash'' is initialized to 0. If the file is strictly larger than one chunk size, a series of chunks at file offsets of 2^n MiB (n ≥ 0) and one chunk right at the end of the file are hashed using CRC32 (polynomial 0xEDB88320 reversed, 0x04C11DB7 normal). The last chunk of the power-of-two series ends strictly ''more'' than one chunk size before the end of the file, i.e. there is always at least one unread byte between the last two chunks (if there are that many chunks). (BitCollider/0.4.0 implemented this unfaithfully.) The end-of-file chunk may be shorter than one chunk size; it starts at or after one chunk size into the file. The CRC is initialized using ''smallhash'' and the result is stored back into ''smallhash''. So, for example:

:offset 0 MiB, 300 KiB hashed with MD5
:offset 1 MiB, 300 KiB hashed with CRC32
:offset 2 MiB, 300 KiB hashed...
:offset 4 MiB, 300 KiB hashed...
:offset 8 MiB, 300 KiB hashed...
:...
:last 300 KiB of file hashed with CRC32

Finally, the bitwise complement of ''smallhash'' (still zero for files up to 300 KiB) is XORed with the file size in bytes. The 160-bit UUHash is the concatenation of the 128-bit MD5 hash and the final 32-bit ''smallhash'' value.
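The procedure described above can be sketched in Python. This is an illustrative reading of the description, not a port of sig2dat, and it has not been checked against sig2dat output. The function name `uuhash` and the constants `CHUNK` and `MIB` are hypothetical. The sketch assumes that `zlib.crc32` with a running start value matches "the CRC is initialized using ''smallhash''", that the 32-bit value is serialized in little-endian byte order, and that the file size is truncated to 32 bits before the XOR.

```python
import hashlib
import io
import struct
import zlib

CHUNK = 307_200   # 300 KiB, one "chunk size" (hypothetical constant name)
MIB = 1 << 20     # 1 MiB


def uuhash(stream):
    """Return a 20-byte UUHash for a seekable binary stream (sketch only)."""
    size = stream.seek(0, io.SEEK_END)
    stream.seek(0)

    # MD5 over the first chunk (fewer bytes if the file is shorter).
    md5 = hashlib.md5(stream.read(CHUNK)).digest()

    smallhash = 0
    if size > CHUNK:
        # Chunks at 1, 2, 4, ... MiB. Each must end strictly more than
        # one chunk size before the end of the file.
        offset = MIB
        while offset + 2 * CHUNK < size:
            stream.seek(offset)
            smallhash = zlib.crc32(stream.read(CHUNK), smallhash)
            offset *= 2

        # End-of-file chunk: starts at or after one chunk size into the
        # file, so it may be shorter than CHUNK.
        stream.seek(max(CHUNK, size - CHUNK))
        smallhash = zlib.crc32(stream.read(CHUNK), smallhash)

    # Complement, XOR with the file size, keep 32 bits (assumption:
    # sizes above 4 GiB are truncated).
    smallhash = (~smallhash ^ size) & 0xFFFFFFFF

    # Assumption: the 32-bit value is stored little-endian.
    return md5 + struct.pack("<I", smallhash)
```

Because only a handful of 300 KiB chunks are read, the cost grows with the logarithm of the file size. Any byte outside those chunks can be changed without affecting the result, which is the weakness described in the introduction.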


Test Vectors

The test vectors are hashes (Base64 and hex) for strings of various lengths containing only 0x00 or 0xFF bytes, generated by sig2dat. Note that all strings that have a complete MD5 chunk share the same 128-bit prefix for a given fill byte, because their first chunks are identical (all 0x00 or all 0xFF). For files that have the same number of chunks, the CRC part differs only because of the included file length; this holds only because all chunks are identical. For files up to 300 KiB, the file length can be extracted from the last four bytes of the hash, because ''smallhash'' is still zero and its complement, ~0, is XORed with the length.
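The length recovery for small files can be illustrated with a short Python sketch. The function name `length_from_small_uuhash` is hypothetical, and the sketch assumes the last four bytes hold the 32-bit value in little-endian byte order. The range check is only a heuristic, since a larger file could by coincidence produce a value in the same range.

```python
import struct

CHUNK = 307_200  # 300 KiB


def length_from_small_uuhash(uuhash_bytes):
    """Recover the file length from a 20-byte UUHash of a file <= 300 KiB."""
    if len(uuhash_bytes) != 20:
        raise ValueError("a UUHash is 20 bytes (160 bits)")
    # Assumption: the trailing 32-bit value is little-endian.
    (tail,) = struct.unpack("<I", uuhash_bytes[16:])
    # For small files the tail is ~0 XOR length, so XOR with ~0 undoes it.
    length = tail ^ 0xFFFFFFFF
    if length > CHUNK:
        raise ValueError("tail does not look like a small-file hash")
    return length
```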


Sig2Dat

The name UUHash derives from the sig2dat utility, which creates URIs referencing files on Kazaa. These URIs are of the form:

:sig2dat://, File: surprise.mp3, Length:5845871Bytes, UUHash:=1LDYkHDl65OprVz37xN1VSo9b00=

Apart from the fact that this URI format is not RFC-compliant, the name UUHash here refers to the Base64 encoding of the hash and not to the hash itself.
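As an illustration of the distinction between the Base64 text and the hash itself, the following Python sketch decodes the UUHash field from the example URI into its two binary parts. The function name `decode_uuhash_field` is hypothetical. The sketch assumes the leading "=" is a field delimiter, which is safe because valid Base64 cannot begin with a padding character.

```python
import base64


def decode_uuhash_field(field):
    """Split a Base64 UUHash field into (md5_part, smallhash_part) bytes."""
    # Drop the leading "=" delimiter; trailing "=" is Base64 padding.
    raw = base64.b64decode(field.lstrip("="), validate=True)
    if len(raw) != 20:
        raise ValueError("expected 20 bytes (160 bits) after decoding")
    return raw[:16], raw[16:]


md5_part, smallhash_part = decode_uuhash_field("=1LDYkHDl65OprVz37xN1VSo9b00=")
print(md5_part.hex(), smallhash_part.hex())
```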

