Monday, September 17, 2007

Assume all input is malicious until proven otherwise

Input validation:

Input validation is the most important ingredient of a secure application. Most major security holes today result from input validation flaws. This is something you can fix only by writing secure code; no settings or firewalls can save you here.

Your application’s user input is the attacker’s primary weapon when targeting your application. Buffer overflows, cross-site scripting, SQL injection, canonicalization attacks, code injection, and numerous other denial-of-service and elevation-of-privilege attacks can exploit poor input validation. For example, non-validated input in the Hypertext Markup Language (HTML) output stream leads to cross-site scripting, non-validated input used to generate SQL queries leads to SQL injection, and the use of input file names, URLs, or user names for security decisions leads to canonicalization attacks.
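
To make the SQL injection case concrete, here is a minimal sketch in Python (chosen only for illustration; the post itself targets ASP.NET). The table, the user-name pattern, and the function names are all hypothetical. It shows how input concatenated into a query changes the query's meaning, and how rejecting anything outside a known-good pattern stops the same input.

```python
import re
import sqlite3

# Hypothetical whitelist: 3 to 20 letters, digits, or underscores.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,20}")


def make_demo_db():
    """Create an in-memory database with two illustrative user rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    """Build the query by string concatenation: the flaw described above."""
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_validated(conn, name):
    """Reject anything outside the whitelist, then bind the value as a parameter."""
    if not USERNAME_PATTERN.fullmatch(name):
        raise ValueError("invalid user name")
    rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]
```

With the input `x' OR '1'='1`, `lookup_unsafe` matches every row because the quote characters become part of the SQL text, while `lookup_validated` raises `ValueError` before the database is touched. Binding the value as a parameter is an extra layer on top of validation, not a replacement for it.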

Input is anything that isn’t well known at compile time. Web applications receive input from various sources: all data sent from the user or round-tripped by your application (postback data, view state, cookies, headers, query string parameters, and so forth) and back-end data (databases, configuration data, and other data sources). All of that input influences your request processing at some point.

If you make unfounded assumptions about the type, length, format, or range of input, your application is unlikely to be secure. The attacker can supply carefully crafted input that compromises your application.
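
As a rough sketch of checking type, length, format, and range explicitly rather than assuming them, consider the following Python function. The field (a quantity), its limits, and the function name are hypothetical, picked only for illustration.

```python
def parse_quantity(raw, minimum=1, maximum=100, max_length=3):
    """Return raw as an int, or raise ValueError if any check fails."""
    if not isinstance(raw, str):
        raise ValueError("expected a string")       # type
    if not 0 < len(raw) <= max_length:
        raise ValueError("bad length")              # length
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("digits only")             # format
    value = int(raw)                                # safe: ASCII digits only
    if not minimum <= value <= maximum:
        raise ValueError("out of range")            # range
    return value
```

Here `parse_quantity("42")` returns `42`, while inputs such as `"-5"`, `"1e2"`, `"1000"`, or an empty string are rejected. The checks run from cheapest to most specific, so oversized input is refused before any parsing happens.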

Assume all input is malicious until proven otherwise, and apply a defense-in-depth strategy to input validation, taking particular care to validate input whenever it crosses a trust boundary in your application. Your application must ensure that input from query strings, form fields, and cookies is valid for the application. Consider all user input as possibly malicious, and sanitize it for the context of the downstream code. Validate all input for known valid values and then reject all other input. Use regular expressions to validate input data.
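
"Sanitize for the context of the downstream code" can be sketched as follows, assuming the downstream context is an HTML page. The name pattern and both function names are hypothetical; `html.escape` is from the Python standard library. The first function validates against known-good values and then encodes; the second handles free text, where constraining is impractical and only encoding applies.

```python
import html
import re

# Hypothetical rule: a name starts with a letter and may contain
# letters, spaces, apostrophes, and hyphens, up to 50 characters.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z '\-]{0,49}")


def render_greeting(raw_name):
    """Validate the name, then encode it for the HTML context."""
    if not NAME_PATTERN.fullmatch(raw_name):
        raise ValueError("invalid name")
    return "<p>Hello, " + html.escape(raw_name, quote=True) + "!</p>"


def render_comment(raw_text):
    """Free text cannot be tightly constrained, so encode it on output."""
    return "<p>" + html.escape(raw_text, quote=True) + "</p>"
```

Note that the valid name `O'Brien` passes validation yet still needs encoding, which is why validation and context-specific sanitization are used together.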

You should also validate data coming from the database, treating it as another form of user input, especially if other applications write to that database. Input validation is not always necessary when the input is passed from a trusted source inside your trust boundary, but it should be considered mandatory when the input comes from sources that are not trusted.

Proper input validation is one of your strongest measures of defense against today’s application attacks. Consider the following guidelines for input validation.
  • Assume all input is malicious: Input validation starts with a fundamental supposition that all input is malicious until proven otherwise. Whether input comes from a service, a file share, a user, or a database, validate your input if the source is outside your trust boundary.
  • Centralize your approach: Make your input validation strategy a core element of your application design. Consider a centralized approach to validation.
  • Do not rely on client-side validation: Server-side code should perform its own validation, because client-side validation can be easily bypassed. For example, JavaScript that validates a value entered by the user can be bypassed simply by disabling script in the browser.
  • Be careful with canonicalization issues: Data in canonical form is in its most standard or simplest form. Canonicalization is the process of converting data to its canonical form.
  • Constrain, reject, and sanitize your input: The preferred approach to validating input is to constrain what you allow from the beginning. Validate all input for known valid values and then reject all other input. Regular expressions are a good way to do this.
  • Other countermeasures (defense in depth):
    In addition to the techniques discussed earlier, use other countermeasures for defense in depth, such as setting the correct character encoding, using the ASP.NET version 1.1 validateRequest option, and installing URLScan on your Web server.
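
The canonicalization guideline can be illustrated with file paths. The sketch below is in Python for illustration, and the function name and directory layout are hypothetical. It converts a user-supplied path to canonical form first and only then makes the security decision, so that equivalent spellings such as `reports/../../secret` cannot slip past the check.

```python
import os


def resolve_inside(base_dir, user_path):
    """Return the canonical path for user_path, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # Canonicalize first: this collapses ".." segments and resolves symlinks.
    candidate = os.path.realpath(os.path.join(base, user_path))
    # Decide only on the canonical form, never on the raw input.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

A check that merely searched the raw string for a forbidden prefix would be making its decision on a non-canonical name, which is exactly the mistake this guideline warns about.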

Input Validation Techniques:

You can use a number of input validation techniques to validate data. Most of them fall into either the white listing approach or the black listing approach. Some of these techniques are described below.

  • Black Listing: Developers often feel that black listing is the easiest approach to input validation, but in practice it is very hard to get right, because you cannot predict what unexpected input might prove dangerous as new exploits are developed. How can you determine all of the malicious characters? For example, to counter a cross-site scripting attack you may black list ‘<’, ‘>’, and a few more special characters, but these characters can be represented in many ways. This makes black listing the least reliable approach. You can use it along with the other validation techniques for defense in depth, but don’t rely on it alone to validate input. The ASP.NET version 1.1 “validateRequest” feature mentioned above uses this technique.
  • White Listing: This is the preferred technique for input validation. White listing means defining a set of allowed characters and rejecting anything outside this set. It is the exact opposite of black listing and is much more powerful because it allows only known-good characters, and it is easier to implement.
  • Data Type Conversion: It is always recommended to validate data for type, format, range, and length. The simplest check is to make sure the data is of the data type you expect. Most built-in .NET value types (for example, Int32 and DateTime) provide Parse and TryParse methods that create the corresponding value from a string.
  • Regular Expressions: This is the right choice for validating data for format, type, range, and length in a single shot; it is an incredibly powerful way to implement white listing and pattern matching of strings. The .NET Framework provides the System.Text.RegularExpressions namespace for this purpose.
  • XML Validation: Validating XML data against a schema is another white-listing technique. You should know what to allow and expect, and XML Schema is a powerful way to make sure XML documents comply with a certain format. The .NET Framework provides the System.Xml namespace for this purpose.
    Along with the above, there are a few more techniques, such as sandboxing and integrity checking with hashing.
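
The difference between black listing and white listing can be sketched as below, in Python for illustration. The two character sets and the function names are hypothetical. The black list blocks ‘<’ and ‘>’ but lets a URL-encoded script tag through, because the same characters arrive in a different representation. The white list, built on a regular expression, rejects that input simply because ‘%’ is not an allowed character.

```python
import re
from urllib.parse import unquote

# Hypothetical black list: the two characters the post mentions.
BLACKLIST = ("<", ">")

# Hypothetical white list: letters, digits, spaces, and basic punctuation,
# between 1 and 200 characters.
WHITELIST = re.compile(r"[A-Za-z0-9 .,!?]{1,200}")


def passes_blacklist(text):
    """Accept the text unless it contains a known-bad character."""
    return not any(ch in text for ch in BLACKLIST)


def passes_whitelist(text):
    """Accept the text only if every character is known-good."""
    return WHITELIST.fullmatch(text) is not None


# The same payload in a representation the black list does not anticipate.
encoded_payload = "%3Cscript%3Ealert(1)%3C/script%3E"
decoded_payload = unquote(encoded_payload)  # what a later decoding step yields
```

`passes_blacklist(encoded_payload)` is true even though `decoded_payload` is a complete script tag, whereas `passes_whitelist(encoded_payload)` is false. In .NET, the analogous white-list check would typically use `Regex.IsMatch` from System.Text.RegularExpressions with an anchored pattern.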

Summary:

While developing applications, always remember that the majority of application-level attacks (approximately 80%) rely on maliciously formed input data and poor application input validation. Most Web application attacks require that malicious input is passed within HTTP requests. The general goal is either to coerce the application into performing unauthorized operations or to disrupt its normal operation. This is why thorough input validation is an essential countermeasure to many attacks and should be made a top priority when you develop Web pages. Take special care in this area to make sure that your validation strategy is sound and that all data processed from a non-trusted source is properly validated.

References:

There are several validation routines freely available at:
How To: Use Regular Expressions to Constrain Input in ASP.NET
Security Guidelines: ASP.NET 2.0
