cryptography Cryptography, or cryptology (from "hidden, secret"; and ''graphein'', "to write", or ''-logy, -logia'', "study", respectively), is the practice and study of techniques for secure communication in the presence of Adversary (cryptography), ...

and

computer security Computer security (also cybersecurity, digital security, or information technology (IT) security) is a subdiscipline within the field of information security. It consists of the protection of computer software, systems and computer network, n ...

, a length extension attack is a type of attack where an attacker can use Hash(''message₁'') and the length of ''message₁'' to calculate Hash(''message₁'' ‖ ''message₂'') for an attacker-controlled ''message₂'', without needing to know the content of ''message₁''. This is problematic when the hash is used as a

message authentication code In cryptography, a message authentication code (MAC), sometimes known as an authentication tag, is a short piece of information used for authentication, authenticating and Data integrity, integrity-checking a message. In other words, it is used t ...

with construction Hash(''secret'' ‖ ''message''), and ''message'' and the length of ''secret'' is known, because an attacker can include extra information at the end of the message and produce a valid hash without knowing the secret. Algorithms like

MD5 The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value. MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4, and was specified in 1992 as Request for Comments, RFC 1321. MD5 ...

SHA-1 In cryptography, SHA-1 (Secure Hash Algorithm 1) is a hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits. It was designed by the United States ...

and most of SHA-2 that are based on the Merkle–Damgård construction are susceptible to this kind of attack. Truncated versions of SHA-2, including SHA-384 and SHA-512/256 are not susceptible, nor is the SHA-3 algorithm.

HMAC In cryptography, an HMAC (sometimes expanded as either keyed-hash message authentication code or hash-based message authentication code) is a specific type of message authentication code (MAC) involving a cryptographic hash function and a se ...

also uses a different construction and so is not vulnerable to length extension attacks. A secret suffix MAC, which is calculated as Hash(''message'' ‖ ''secret''), isn't vulnerable to a length extension attack, but is vulnerable to another attack based on a hash collision.

Explanation

The vulnerable hashing functions work by taking the input message, and using it to transform an internal state. After all of the input has been processed, the hash digest is generated by outputting the internal state of the function. It is possible to reconstruct the internal state from the hash digest, which can then be used to process the new data. In this way, one may extend the message and compute the hash that is a valid signature for the new message.

Example

A server for delivering waffles of a specified type to a specific user at a location could be implemented to handle requests of the given format: Original Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo Original Signature: 6d5f807e23db210bc254a28be2d6759a0f5f5d99 The server would perform the request given (to deliver ten waffles of type eggo to the given location for user "1") only if the signature is valid for the user. The signature used here is a MAC, signed with a key not known to the attacker. It is possible for an attacker to modify the request in this example by switching the requested waffle from " eggo" to " liege." This can be done by taking advantage of a flexibility in the message format if duplicate content in the query string gives preference to the latter value. This flexibility does not indicate an exploit in the message format, because the message format was never designed to be cryptographically secure in the first place, without the signature algorithm to help it. Desired New Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo&waffle=liege In order to sign this new message, typically the attacker would need to know the key the message was signed with, and generate a new signature by generating a new MAC. However, with a length extension attack, it is possible to feed the hash (the signature given above) into the state of the hashing function, and continue where the original request had left off, so long as the length of the original request is known. In this request, the original key's length was 14 bytes, which could be determined by trying forged requests with various assumed lengths, and checking which length results in a request that the server accepts as valid. The message as fed into the hashing function is often padded, as many algorithms can only work on input messages whose lengths are a multiple of some given size. The content of this padding is always specified by the hash function used. The attacker must include all of these padding bits in their forged message before the internal states of their message and the original will line up. Thus, the attacker constructs a slightly different message using these padding rules: New Data: count=10&lat=37.351&user_id=1&long=-119.827&waffle=eggo\x80\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00 \x00\x00\x00\x02\x28&waffle=liege This message includes all of the padding that was appended to the original message inside of the hash function before their payload (in this case, a 0x80 followed by a number of 0x00s and a message length, 0x228 = 552 = (14+55)*8, which is the length of the key plus the original message, appended at the end). The attacker knows that the state behind the hashed key/message pair for the original message is identical to that of new message up to the final "&." The attacker also knows the hash digest at this point, which means they know the internal state of the hashing function at that point. It is then trivial to initialize a hashing algorithm at that point, input the last few characters, and generate a new digest which can sign his new message without the original key. New Signature: 0e41270260895979317fff3898ab85668953aaa2 By combining the new signature and new data into a new request, the server will see the forged request as a valid request due to the signature being the same as it would have been generated if the password was known.

Notes

References

Cryptographic attacks {{cryptography navbox, hash