Top 2025 Application Security Predictions with Aaron Lord. Register Now.

XML injection vulnerability: Examples, cheatsheet and prevention

XML Injection is a type of attack that targets web applications that generate XML content. Attackers use malicious code to exploit vulnerabilities in XML parsers to manipulate the content of an XML document.

Medusa

Sep 19, 2023

XML Injection: examples, cheatsheet and prevention

What is XML Injection?

XML (Extensible Markup Language) is a markup language used to store and transport data. It is designed to be self-descriptive and allows users to define their own tags, making it highly customizable. XML is commonly used for data storage, configuration files, and web services.

XML Injection is a type of attack that targets web applications that generate or process XML content. Attackers supply input containing XML metacharacters or markup, which the application inserts into an XML document or query without escaping. This lets them alter the structure or content of the document. The result can be unauthorized access to sensitive data, denial of service, and other risks to the application and its users.

What is XML and what are its common use cases?

XML represents data as a hierarchy of nested elements, which makes complex data structures easy to organize and access. Its most common use cases are data storage, where it holds large amounts of data in a structured format; configuration files, where it stores settings and options for applications and systems; and web services, where it is used to exchange data between different systems and applications.

Potential risks of XML Injection

  • Unauthorized access to sensitive data: XML Injection can allow attackers to access sensitive information such as user credentials, financial data, personally identifiable information (PII), and more.

  • Denial of service attacks: Attackers can send large amounts of data to web applications using XML Injection, causing them to become unresponsive, crash, or slow down.

  • Data manipulation or corruption: Attackers can modify or delete XML data, causing incorrect data to be displayed or processed by web applications.

  • Compromised system security: XML Injection can compromise the security of a web application and its underlying systems, allowing attackers to gain unauthorized access to resources and perform other malicious actions.

The Role of an XML Parser

XML parsers play a critical role in processing and interpreting XML data. However, they can be exploited by attackers due to improper configuration or implementation flaws. If a parser does not adequately validate or sanitize input, it might misinterpret malicious XML content as legitimate, potentially leading to unintended actions or exposure of sensitive information.

XPath, XQuery, and XXE in XML Processing

XPath, XQuery, and XXE are often mentioned together in XML processing, but they are different things: XPath and XQuery are query languages, while XXE is a class of vulnerability in XML parsing.

XPath (XML Path Language)

  • What it is: XPath is a language used for navigating and querying XML documents. It provides a way to locate and select elements and attributes within an XML document.

  • Use Cases: XPath is primarily used for extracting specific data from XML documents. It's commonly employed in web scraping, XML-based databases, and XSLT transformations.

  • Exploitation (XPath Injection): XPath injections can occur if an application uses untrusted user input to construct XPath queries. An attacker may manipulate input to extract sensitive data or disrupt the application. To mitigate, avoid building queries by string concatenation: pass user input as XPath variables where the library supports them, and validate input against an allowlist.

XPath Vulnerable Code

Let's say you have a web application that takes a username and password and uses XPath to query an XML file to authenticate users.

import lxml.etree as ET

# User input (usually obtained through a web form)
username = input("Enter your username: ")
password = input("Enter your password: ")

# Construct the XPath query with user input
query = "/users/user[username/text()='" + username + "' and password/text()='" + password + "']"

# Parse the XML file
tree = ET.parse("users.xml")
root = tree.getroot()

# Attempt to find a user with the given credentials
user = root.xpath(query)

# Check if the user was found
if user:
    print("Authentication successful. Welcome, " + username + "!")
else:
    print("Authentication failed. Invalid username or password.")

XPath Vulnerable Code Exploitation

An attacker can exploit this code by entering carefully crafted input to manipulate the XPath query. For example, if the attacker enters the following as the username:

' or 1=1 or ''='

and any value (here, x) as the password, the query becomes:

/users/user[username/text()='' or 1=1 or ''='' and password/text()='x']

In XPath, and binds more tightly than or, so the predicate is evaluated as username/text()='' or 1=1 or (''='' and password/text()='x'). Because 1=1 is always true, the predicate is true for every user node. As a result, the attacker can log in without knowing a valid username or password, effectively bypassing authentication.
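One way to avoid this class of bug is to keep user input out of the query string entirely. The sketch below is a minimal illustration using only Python's standard library: it walks the user elements and compares values in Python instead of building an XPath expression. The sample XML, the authenticate function name, and the plaintext passwords are assumptions made for the demo (a real application should store password hashes).

```python
import hmac
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for users.xml
USERS_XML = """<users>
  <user><username>alice</username><password>s3cret</password></user>
  <user><username>bob</username><password>hunter2</password></user>
</users>"""


def authenticate(root, username, password):
    """Return True only if some <user> matches both values exactly.

    User input is compared as plain data and is never parsed as part
    of a query, so quotes or 'or 1=1' have no special meaning.
    """
    for user in root.findall("user"):
        stored_name = user.findtext("username", default="")
        stored_pw = user.findtext("password", default="")
        name_ok = hmac.compare_digest(stored_name.encode(), username.encode())
        pw_ok = hmac.compare_digest(stored_pw.encode(), password.encode())
        if name_ok and pw_ok:
            return True
    return False


root = ET.fromstring(USERS_XML)
print(authenticate(root, "alice", "s3cret"))
print(authenticate(root, "' or 1=1 or ''='", "x"))
```

If full XPath is needed, lxml's xpath() method accepts variables as keyword arguments, for example root.xpath("/users/user[username=$u]", u=username), which keeps the input out of the expression in the same way.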

XQuery

  • What it is: XQuery is a query and functional programming language designed for querying and manipulating XML data. It can perform complex queries and transformations on XML documents.

  • Use Cases: XQuery is often used in databases and content management systems to retrieve and manipulate XML data. It's useful for generating reports, searching XML databases, and transforming XML data.

  • Exploitation (XQuery Injection): Similar to SQL injection, XQuery injection can occur if user input is improperly included in XQuery expressions. Attackers can manipulate input to execute unintended queries or cause data leakage. Protect against this by binding user input to external variables (the XQuery equivalent of parameterized queries) instead of concatenating it into the expression.

XXE (XML External Entity Injection)

  • What it is: XXE is a security vulnerability that arises when an XML parser is configured to resolve external entities (references to external resources) declared in untrusted XML input.

  • Use Cases: XXE vulnerabilities can be exploited when an application processes XML input from untrusted sources. This can occur in various contexts, including web applications, file uploads, and XML-based APIs.

  • Exploitation: To exploit XXE, an attacker submits XML input containing malicious external entity declarations. They might read sensitive files, initiate network requests, or perform denial-of-service attacks. To prevent XXE, disable external entity processing, use secure parsers, and validate XML inputs carefully.
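As a rough illustration of "disable external entity processing", the sketch below rejects any document that contains a DOCTYPE declaration before handing it to the real parser. Without a DOCTYPE, no custom or external entities can be declared. It uses only Python's standard library; the names parse_untrusted and DoctypeForbidden are hypothetical, and rejecting every DOCTYPE is an assumption that suits applications which never need DTDs.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # expat calls this as soon as it sees "<!DOCTYPE ...".
    raise DoctypeForbidden(f"DOCTYPE declarations are not allowed: {name!r}")


def parse_untrusted(xml_text):
    """Parse XML only if it declares no DOCTYPE (and therefore no entities)."""
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(xml_text, True)  # raises DoctypeForbidden or ExpatError
    return ET.fromstring(xml_text)


XXE_PAYLOAD = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    "<foo>&xxe;</foo>"
)

print(parse_untrusted("<foo>ok</foo>").text)
try:
    parse_untrusted(XXE_PAYLOAD)
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

The same check also stops entity-expansion denial-of-service payloads, since those rely on entities declared in a DOCTYPE. The third-party defusedxml package offers hardened parsers built on the same idea.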

Types of XML Injection

  1. Entity Injection: An attacker can inject XML entities to modify or expose sensitive data, resulting in security vulnerabilities like information disclosure, denial of service, and remote code execution.

  2. Attribute Injection: An attacker can inject quote characters into an XML attribute value to close it early and then add or override attributes, which take effect when the element is processed.

  3. Schema Poisoning: An attacker can modify the schema of an XML document to manipulate the parser's behavior or cause it to crash.

  4. XSLT Injection: An attacker can inject malicious code into the XSLT stylesheet used to transform an XML document. This can be used to execute arbitrary code or manipulate the output of the transformation.

How to Discover XML injection vulnerabilities?

The initial phase of evaluating an application for potential XML Injection vulnerabilities involves an attempt to introduce XML metacharacters.

These XML metacharacters encompass:

  • Single quote: ' - When not adequately sanitized, this character has the potential to disrupt XML parsing, especially if the injected content becomes part of an attribute within a tag.

    For instance, imagine a tag such as:

    <item type='$itemName'/>

    If we consider:

    itemName = gadget'

    When this gets incorporated into the attribute's value:

    <item type='gadget''/>

    This results in an XML document that is not well-formed.

  • Double quote: " - This character operates similarly to a single quote and can create issues if placed within attribute values enclosed by double quotes.

    Let's consider the tag:

    <item type="$itemName"/>

    If we have:

    itemName = device"

    The substitution yields:

    <item type="device""/>

    This leads to an XML document that is not well-formed.

  • Angle brackets: > and < - Opening or closing angle brackets in user input can break the XML structure.

    For example, if a user enters:

    ItemName = cool<x>

    The application constructs the structure like this:

    <product> <name>cool<x></name> <price>19.99</price> <stock>100</stock> </product>

    Because the injected <x> tag is never closed, the resulting XML is not well-formed.

  • Comment tag: <!-- and --> - These sequences mark the start and end of a comment. Injecting either one into a parameter can produce malformed XML.

    Let's say a user inputs:

    ItemName = amazing<!--

    The application assembles a tag like:

    <product>
        <name>amazing<!--</name>
        <price>24.99</price>
        <stock>50</stock>
    </product>

    The comment is never closed, so the parser treats everything after <!-- as comment text and the document is not well-formed.

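  • Ampersand: & - The ampersand introduces an entity reference in XML. An entity is written as &name; and expands to its replacement text; for example, &lt; represents the < character. A bare & in user input that does not begin a valid reference makes the document not well-formed.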

These are some examples of XML metacharacters that, if not managed correctly, can lead to vulnerabilities in the application.
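The metacharacters above stop being dangerous once they are escaped before being placed into the document. The following is a minimal sketch using Python's standard xml.sax.saxutils helpers; the build_item and build_name functions are hypothetical stand-ins for the templates shown in the examples.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_item(item_name):
    # quoteattr() adds the surrounding quotes itself and escapes
    # whatever would otherwise terminate the attribute value.
    return f"<item type={quoteattr(item_name)}/>"


def build_name(item_name):
    # escape() replaces &, < and > with &amp;, &lt; and &gt;.
    return f"<name>{escape(item_name)}</name>"


for value in ["gadget'", 'device"', "cool<x>", "amazing<!--", "fish & chips"]:
    item = ET.fromstring(build_item(value))
    name = ET.fromstring(build_name(value))
    # Both documents parse, and the parser hands back the original text.
    print(item.get("type") == value, name.text == value)
```

Each input that broke the earlier examples now round-trips as plain data: the parser returns the original string instead of treating it as markup.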

XML Vulnerability Testing Guidelines

Applications that exchange data as XML need to be tested for XML-specific weaknesses, which range from information disclosure to remote code execution. The checklist below covers the main areas to test when assessing an application that handles XML data.

  • Locate all input fields that accept XML data or XML-like content.

  • Assess input validation by injecting payloads containing malformed XML, invalid characters, or encoding anomalies.

  • Probe for file inclusion vulnerabilities by submitting XML payloads that reference external resources such as DTD files, and check whether the application fetches them.

  • Test for XML External Entity (XXE) attacks by submitting payloads that declare external entities, and check the response to confirm whether the entities were resolved.

  • Test SOAP endpoints by embedding payloads within SOAP messages and confirming whether the attack succeeds.

  • Test for injection vulnerabilities by submitting payloads containing SQL, XPath, or other query-language syntax, then examine the results for signs of successful injection.

  • Test for denial-of-service vulnerabilities by introducing payloads designed to consume substantial system resources, potentially leveraging recursive or deeply nested XML structures.

  • Evaluate input filtering and sanitization mechanisms by injecting payloads with HTML tags, JavaScript code, or malicious content, gauging the success of the attack.

  • Review application error messages and logs for any information that could be exploited to construct XML Injection attacks.

  • Validate that the application refrains from disclosing sensitive data, such as user credentials or personal information, within its XML output.

  • Confirm that the application implements robust restrictions and validation for user-uploaded or downloaded XML files to prevent potential vulnerabilities.
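The first checks in this list can be partly automated. The sketch below is a simplified probe, assuming you can call the code that builds the XML directly: it appends each metacharacter to a benign value and records which ones produce a document that no longer parses. The naive_builder function is a hypothetical vulnerable target; against a real application you would send the probes over HTTP and inspect the responses instead.

```python
import xml.etree.ElementTree as ET

# Metacharacter probes taken from the discovery section
PROBES = ["'", '"', "<", ">", "<!--", "&"]


def naive_builder(value):
    # Hypothetical vulnerable target: inserts the value without escaping.
    return f"<product><name>{value}</name></product>"


def find_breaking_probes(build_xml, probes=PROBES):
    """Return the probes that make build_xml() emit malformed XML."""
    breaking = []
    for probe in probes:
        try:
            ET.fromstring(build_xml("test" + probe))
        except ET.ParseError:
            breaking.append(probe)
    return breaking


print(find_breaking_probes(naive_builder))
```

A non-empty result means the metacharacter reached the document unescaped, which is a strong hint that structural injection is possible. Note that quotes only break attribute values, so a target that places input inside element content will not react to them.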

Automated Security Testing for XML Injection

  1. Akto: Uncover vulnerabilities in all your APIs through the execution of more than 100 built-in tests. You can also create custom tests and streamline API Security Testing with automation.

  2. Burp Suite: A powerful and comprehensive web application security testing tool that offers specialized features for detecting XML injection vulnerabilities. Burp Suite allows security professionals to intercept and modify XML requests and responses to identify potential weaknesses in web applications.

  3. OWASP ZAP: An open-source web application security testing tool equipped with a suite of utilities for pinpointing XML injection vulnerabilities. It provides a range of functionalities, including a proxy, scanner, and various testing tools, all designed to enhance the security assessment of web applications.

  4. SoapUI: Designed primarily for testing web services, SoapUI excels in testing XML-based web services. It enables testers to send and receive XML requests and responses while allowing manipulation for vulnerability testing purposes.

  5. w3af: As an open-source web application security testing tool, w3af provides capabilities for detecting XML injection vulnerabilities. It supports both manual and automated testing, making it versatile for security assessments.

  6. SoapUI Pro: The commercial version of SoapUI offers advanced features for web service testing, including those using XML. It boasts an intuitive user interface and support for advanced scripting, making it an ideal choice for organizations seeking comprehensive testing capabilities.

These tools are essential resources for security professionals and developers to proactively identify and address XML injection vulnerabilities in web applications and web services.

XML Injection CVEs

Check out Akto's API CVEs page for XML Injection CVEs.

XML Injection example with Practical Demonstration

This form takes a username and password and creates a low-privileged user by adding an entry to an XML file on the backend, which is being used as a database.

[Image: form that takes a username and password]

This is what the XML looks like in the backend after registration:

[Image: XML in the backend]

Vulnerable backend code

<?php
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $username = $_POST['username'];
    $password = $_POST['password'];

    // Open the file
    $db = fopen("users.xml", "r");

    // Read the file contents
    $contents = fread($db, filesize("users.xml"));

    // Remove the last 8 characters (</users>)
    $modifiedContents = substr($contents, 0, -8);

    // VULNERABLE: $username is concatenated into the markup without escaping
    $newUser = "<user><username>$username</username><password>" . password_hash($password, PASSWORD_DEFAULT) . "</password><role>user</role></user></users>";
    fclose($db);

    $db = fopen("users.xml", "w");
    fwrite($db, $modifiedContents . $newUser);
    fclose($db);
}
?>

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Register User</title>
</head>
<body>
<h1>Create low-privileged user</h1>
<form action="" method="post">
<div>
<label for="username">Username:</label>
<input type="text" name="username" id="username" required>
</div>
<div>
<label for="password">Password:</label>
<input type="password" name="password" id="password" required>
</div>
<div>
<button type="submit">Register</button>
</div></form>
</body>
</html>

The issue with the above code is that it inserts user input into the XML document without validating or escaping it. Specifically, the code concatenates the username value directly into a string of XML markup to create a new user element. (The password is passed through password_hash() first, so attacker-supplied markup does not survive in that field; the username is the injectable one.) If an attacker provides a username containing XML metacharacters, any parser that later reads users.xml will interpret the injected markup as legitimate XML structure, potentially leading to unauthorized access, data manipulation, or exposure of sensitive information.

Exploitation of XML Injection vulnerability

An attacker could submit a username like this:

hacker</username><password>a</password><role>admin</role></user><user><username>a

[Image: vulnerable application]

Notice that the payload closes the <username> element early and supplies its own <password> and <role> elements, setting the role to "admin". The trailing <user><username>a opens a second user entry that absorbs the rest of the application's template, so the document stays well-formed.

[Image: injected XML payload in the backend]

The backend writes the payload to users.xml as if it were legitimate XML structure, so any code that later reads the file sees a "hacker" user with admin-level privileges.
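To make the effect concrete, here is a small Python re-creation of the same concatenation logic (an illustrative stand-in, not the PHP code itself). The register_unsafe function and the "HASH" placeholder are assumptions; the payload is the one shown above.

```python
import xml.etree.ElementTree as ET


def register_unsafe(username, password_hash):
    # Mirrors the vulnerable template: the username is pasted in verbatim.
    return (
        "<users><user><username>" + username + "</username>"
        "<password>" + password_hash + "</password>"
        "<role>user</role></user></users>"
    )


payload = (
    "hacker</username><password>a</password><role>admin</role></user>"
    "<user><username>a"
)

doc = ET.fromstring(register_unsafe(payload, "HASH"))
roles = {u.findtext("username"): u.findtext("role") for u in doc.findall("user")}
print(roles)
```

A single registration request produces two user elements: the injected "hacker" entry with the admin role, and a leftover user "a" that receives the role the application intended to assign.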

How to mitigate XML Injection vulnerability?

To mitigate this, you can use proper PHP methods for XML creation instead of using insecure string concatenation. You can securely implement the same thing like this:

<?php
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $username = $_POST['username'];
    $password = $_POST['password'];

    $xml = new SimpleXMLElement('<users/>');

    if (file_exists('users.xml')) {
        $xml = simplexml_load_file('users.xml');
    }

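    $user = $xml->addChild('user');
    // Property assignment escapes <, > and &; the value argument of
    // addChild() does not escape "&" and emits a warning for such input.
    $user->username = $username;
    $user->password = password_hash($password, PASSWORD_DEFAULT);
    $user->role = 'user';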

    $xml->asXML('users.xml');
}
?>

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Register User</title>
</head>
<body>
    <form action="" method="post">
        <div>
            <label for="username">Username:</label>
            <input type="text" name="username" id="username" required>
        </div>
        <div>
            <label for="password">Password:</label>
            <input type="password" name="password" id="password" required>
        </div>
        <div>
            <button type="submit">Register</button>
        </div>
    </form>
</body>
</html>
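The same idea carries over to other languages: build the document through an XML API so that values are stored as text and escaped on serialization. A minimal Python sketch, assuming the same users/user layout (the register_safe function and the "HASH" placeholder are hypothetical):

```python
import xml.etree.ElementTree as ET


def register_safe(root, username, password_hash):
    """Append a low-privileged user; values are stored as text, not markup."""
    user = ET.SubElement(root, "user")
    ET.SubElement(user, "username").text = username
    ET.SubElement(user, "password").text = password_hash
    ET.SubElement(user, "role").text = "user"
    return user


payload = (
    "hacker</username><password>a</password><role>admin</role></user>"
    "<user><username>a"
)

root = ET.Element("users")
register_safe(root, payload, "HASH")

serialized = ET.tostring(root, encoding="unicode")
reparsed = ET.fromstring(serialized)
print(serialized)
print(len(reparsed.findall("user")), reparsed.findtext("user/role"))
```

The injected angle brackets are serialized as &lt; and &gt;, so the payload ends up as an odd-looking username on a single low-privileged user instead of creating an admin entry.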

XML Injection Prevention

To prevent XML Injection, it is important to properly validate and sanitize all user input before it is processed by the XML parser. Here are some best practices to follow:

  • Use secure XML parsing libraries: Use well-established, secure XML parsing libraries that are regularly maintained and updated to protect against known vulnerabilities.

  • Input validation: Validate all user input to ensure that it conforms to expected formats and does not contain any unexpected or malicious characters. Use input validation techniques such as whitelisting to ensure that only expected characters are allowed.

  • Sanitize input: Escape XML metacharacters (&, <, >, and quotes) in all user input before inserting it into a document, or reject input that contains them, so that it cannot be interpreted as markup.

  • Limit access: Limit access to XML parsers and related resources to authorized personnel only.

  • Regularly update software: Keep all software and libraries up-to-date with the latest security patches and updates to prevent known vulnerabilities from being exploited.

  • Secure configuration: Configure XML parsers securely by disabling unnecessary features and restricting access to sensitive resources.
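As an example of the allowlist approach from the input validation point, the sketch below accepts a username only if every character belongs to an expected set. The permitted characters and the 3 to 32 length limits are assumptions chosen for illustration; adjust them to your own requirements.

```python
import re

# Hypothetical policy: letters, digits, underscore, dot and hyphen, 3-32 chars
USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]{3,32}")


def is_valid_username(value):
    """Return True only if the whole string matches the allowlist."""
    return USERNAME_RE.fullmatch(value) is not None


print(is_valid_username("alice_01"))
print(is_valid_username("hacker</username><role>admin</role>"))
```

Validation like this complements escaping and does not replace it: fields such as free-text comments cannot be restricted to a narrow character set, so they still need to be escaped or written through an XML API.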

Take a look at this cheatsheet.

Well-maintained XML libraries such as lxml, libxml2, and Apache Xerces help prevent XML Injection when they are used correctly: building documents through their APIs, rather than by string concatenation, escapes metacharacters automatically when the document is serialized. They are not necessarily safe in every default configuration, however, so disable DTD and external entity processing when parsing untrusted input, and keep the libraries up-to-date with the latest security patches to prevent known vulnerabilities from being exploited.

Final Thoughts

In conclusion, XML Injection is a serious threat that can lead to unauthorized access to sensitive data, denial of service attacks, and other potential risks to applications. It is important to understand how attackers inject malicious XML content and the potential risks to applications. To prevent XML Injection, it is crucial to properly validate and sanitize all user input before it is processed by the XML parser. Additionally, using secure XML parsing libraries, limiting access to XML parsers, and regularly updating software can help prevent XML Injection attacks. By following these best practices, organizations can stay secure and protect against XML Injection attacks.

Learn More: If you want to know more about keeping your API endpoints safe and secure, take a look at Akto. Find out how we can help protect your API endpoints from bad actors and unauthorized access.

Want to ask something?

Our community offers a network of support and resources. You can ask any question there and will get a reply in 24 hours.
