Gopal's Blog: Code Injection: XPath Injection

XPath Injection:

SQL is the most popular type of code injection attack, there are several others that can be just as dangerous to your applications and your data, including LDAP injection and XPath injection. An ‘XPath injection’ attack is similar to an SQL injection attack, but its target is an XML document rather than an SQL database. ‘XPath Injection’ is an attack technique used to exploit web sites that construct XPath queries from user-supplied input.

What is XML?

XML stands for Extensible Markup Language and was designed to describe data. It allows programmers to create their own customized tags to store data. In XML the data is stored in nodes in a tree form. XML Path or XPath language is used for querying information from the nodes of an XML document. Please refer to XML Tutorial for more details on XML.

What is XPath?

“XML Path” or “XPath” 1.0 is a language used to refer to parts of an XML document. Path expressions are used to access elements and attributes in an XML document, which return a node-set, a string, a Boolean or a number. It can be used directly to query an XML document by an application, or as part of a larger operation such as applying an XSLT transformation to an XML document, or applying an XQuery to an XML document. Please refer to XPath Tutorial for more details on XPath.

In Detail:

Code Injection is a technique to Inject code into a program or application code by taking advantage of the unchecked assumptions the application makes about its inputs to bypass or modify the originally intended functionality of the code. All code injection attacks work in a same way; an attacker injects malicious code into the application code through an input field of the application. So, to perform such attacks there must be entry points that are not performing adequate validation.
Consider a Web application that uses XPath to query an XML document to retrieve the social security number of a customer by passing name and password values that are supplied by the user of the application. If the application embeds these values directly in the XPath query then it is vulnerable to XPath Injection.

For Example:

An application is using “CustProfile.xml” to store the customer related data and is like below:

<? xml version = " 1.0" encoding =" utf-8" ?>
<Customers>
<Customer>
<CLogIn> Malapati </CLogIn>
<CName> Pradeep Malapati </CName>
<Email> pradeep.malapati@malapaticorp.com </Email>
<Pwd> Malapati2020 </Pwd>
<SSN> xxxxxxxxxxxx </SSN>
<ACNO> 11111111111 </ACNO>
</Customer>

<Customer>
<CLogIn> Vijaya </CLogIn>
<CName> Vijaya Gavuji </CName>
<Email> Vijaya.Gavuji@vigavinc.com </Email>
<Pwd> Vigav1010 </Pwd>
<SSN> xxxxxxxxxxxx </SSN>
<ACNO> 99999999999 </ACNO>
</Customer>
--------
--------
</Customers>

In application, the code is written like below to retrieve Social Security Number of the customer:

XmlDocument XmlDoc = new XmlDocument();
XmlDoc.Load("CustProfile.xml");
...
XPathNavigator custnav = XmlDoc.CreateNavigator();
XPathExpression xexpr = custnav.Compile("string(//Customers/Customer[CLogIn/text()='"+TextBox1.Text+ "' and Pwd/text()='"+TextBox2.Text+ "']/SSN/text())");
String ssn=Convert.ToString(custnav.Evaluate(xexpr));
if (!ssn=="")
{
// some logic using ssn or return ssn
}
else
{
// return with error message
}

In the above sample code application is allowing its users to retrieve and perform operations on SSN based on the invalidated values supplied by them. Now, this code is vulnerable to XPath Injection! If user enters legitimate values like Malapati and Malapati2020 in Login and password fields respectively then the Query will be looks like below:

// Customers/Customer[CLogIn/text()='Malapati’ and Pwd/text()= 'Malapati2020']/SSN/text()

Works well. But, suppose a user/ attacker enters the following value in the text box provided for customer Login and a blank value in password field.
' Or 1=1 Or 'a'='a

Now, what will be the actual XPath query?

// Customers/Customer[CLogIn/text()=' ' Or 1=1 Or ‘a’ = ‘a’ and Pwd/text()=' ']/SSN/text()

The above expression CLogIn/text() = ‘ ’ Or 1=1 Or ‘a’ = ‘a’ and Pwd/text() = ‘ ’ can be simply represented as below:
(CLogIn/text() = ‘ ’ Or 1=1) Or (‘a’ = ‘a’ and Pwd/text() = ‘ ’) as logical operator AND has higher precedence than OR. So if either first or second condition is true the expression will evaluate to true. In this case the attacker input is having 1=1 is always returns true thus making first condition always becomes true. Now the above query is identical to //Customers/Customer/SSN/text() that results first record’s/node SSN number.

Countermeasures / Preventions:

XPATH Injection can be prevented in the same way as SQL injection since XPath injection attacks are much like SQL injection attacks. Most of these preventative methods are the same as well to prevent other typical code injection attacks.
Input Validation: is one of the best measures to defend applications from XPATH injection attacks. The developer has to ensure that the application does accept only legitimate input. Please refer to my previous posting on Input Validation for more details.
Parameterization: Use parameterized queries to prevent XPATH injection. In Parameterized queries, the queries are precompiled and instead of passing user input as expressions, parameters are passed. For example:
//Customers/Customer[CLogIn/text() = $login and Pwd/text() = $password]/SSN/text()

Please refer to the Mitigating XPath Injection Attacks in .NET for more details.

Gopal's Blog

Wednesday, October 10, 2007

Code Injection: XPath Injection

1 comment:

Gopal Rao Joginipally

My Blog Archive

Other Links:

Security Related Links:

Topics Posted

Useful References