Billion Laughs Attack
   HOME

TheInfoList



OR:

In
computer security Computer security (also cybersecurity, digital security, or information technology (IT) security) is a subdiscipline within the field of information security. It consists of the protection of computer software, systems and computer network, n ...
, a billion laughs attack is a type of denial-of-service (DoS) attack which is aimed at
parser Parsing, syntax analysis, or syntactic analysis is a process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar by breaking it into parts. The term '' ...
s of
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing data. It defines a set of rules for encoding electronic document, documents in a format that is both human-readable and Machine-r ...
documents. It is also referred to as an XML bomb or as an exponential entity expansion attack.


Details

The example attack consists of defining 10 entities, each defined as consisting of 10 of the previous entity, with the document consisting of a single instance of the largest entity, which expands to one
billion Billion is a word for a large number, and it has two distinct definitions: * 1,000,000,000, i.e. one thousand million, or (ten to the ninth power), as defined on the short scale. This is now the most common sense of the word in all varieties of ...
copies of the first entity. Versions with larger amount of entries also exist. In the most frequently cited example, the first entity is the
string String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...
" lol", hence the name "billion laughs". At the time this vulnerability was first reported, the
computer memory Computer memory stores information, such as data and programs, for immediate use in the computer. The term ''memory'' is often synonymous with the terms ''RAM,'' ''main memory,'' or ''primary storage.'' Archaic synonyms for main memory include ...
used by a billion instances of the string "lol" would likely exceed that available to the process parsing the XML. While the original form of the attack was aimed specifically at XML parsers, the term may be applicable to similar subjects as well. The problem was first reported as early as 2002, but began to be widely addressed in 2008. Defenses against this kind of attack include capping the memory allocated in an individual parser if loss of the document is acceptable, or treating entities symbolically and expanding them lazily only when (and to the extent) their content is to be used.


Code example

]> &lol9; When an XML parser loads this document, it sees that it includes one root element, "lolz", that contains the text "&lol9;". However, "&lol9;" is a defined entity that expands to a string containing ten "&lol8;" strings. Each "&lol8;" string is a defined entity that expands to ten "&lol7;" strings, and so on. After all the entity expansions have been processed, this small (< 1 KB) block of XML will actually contain 109 = a billion "lol"s, taking up almost 3
gigabyte The gigabyte () is a multiple of the unit byte for digital information. The SI prefix, prefix ''giga-, giga'' means 109 in the International System of Units (SI). Therefore, one gigabyte is one billion bytes. The unit symbol for the gigabyte i ...
s of memory.


Variations

The billion laughs attack described above can take an
exponential Exponential may refer to any of several mathematical topics related to exponentiation, including: * Exponential function, also: **Matrix exponential, the matrix analogue to the above *Exponential decay, decrease at a rate proportional to value * Ex ...
amount of space or time. The quadratic blowup variation causes
quadratic growth In mathematics, a function or sequence is said to exhibit quadratic growth when its values are proportional to the square of the function argument or sequence position. "Quadratic growth" often means more generally "quadratic growth in the limi ...
in resource requirements by simply repeating a large entity over and over again, to avoid countermeasures that detect heavily nested entities. (See
computational complexity theory In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and explores the relationships between these classifications. A computational problem ...
for comparisons of different growth classes.) A "billion laughs" attack could exist for any file format that can contain macro expansions, for example this
YAML YAML ( ) is a human-readable data serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted. YAML targets many of the same communications applications as Extensible Marku ...
bomb: a: &a lol","lol","lol","lol","lol","lol","lol","lol","lol"b: &b a,*a,*a,*a,*a,*a,*a,*a,*ac: &c b,*b,*b,*b,*b,*b,*b,*b,*bd: &d c,*c,*c,*c,*c,*c,*c,*c,*ce: &e d,*d,*d,*d,*d,*d,*d,*d,*df: &f e,*e,*e,*e,*e,*e,*e,*e,*eg: &g f,*f,*f,*f,*f,*f,*f,*f,*fh: &h g,*g,*g,*g,*g,*g,*g,*g,*gi: &i h,*h,*h,*h,*h,*h,*h,*h,*h This crashed earlier versions of Go because the Go YAML processor (contrary to the YAML spec) expands references as if they were macros. The Go YAML processor was modified to fail parsing if the result object becomes too large. Enterprise software like
Kubernetes Kubernetes (), also known as K8s is an open-source software, open-source OS-level virtualization, container orchestration (computing), orchestration system for automating software deployment, scaling, and management. Originally designed by Googl ...
has been affected by this attack through its YAML parser. For this reason, either a parser with intentionally limited capabilities is preferred (like StrictYAML) or file formats that do not allow references are often preferred for data arriving from untrusted sources.


See also

*
Fork bomb In computing, a fork bomb (also called rabbit virus) is a denial-of-service (DoS) attack wherein a process continually replicates itself to deplete available system resources, slowing down or crashing the system due to resource starvation. ...
: a similar method to exhaust a system's resources through
recursion Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
*
Zip bomb In computing, a zip bomb, also known as a decompression bomb or zip of death (ZOD), is a malicious archive file designed to crash or render useless the program or system reading it. The older the system or program, the less likely it is that the ...
: a similar attack utilizing zip archives *
XML external entity attack XML External Entity attack, or simply XXE attack, is a type of attack against an application that parses XML input. This attack occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser. Thi ...
: an XML attack to return arbitrary server files * Document type definition: a template for validating XML files


References

{{Reflist Algorithmic complexity attacks Denial-of-service attacks XML