Mitigating Injection Attacks

Thursday, July 07, 2011

kapil assudani


Classifying the Information Flow Lifecycle of Variables - Mitigating Injection Attacks

Continuing my argument that giving developers a secure coding framework is more effective than teaching them secure coding (especially at non-software companies), here are some thoughts on mitigating injection attacks such as cross-site scripting, SQL injection, and OS command injection.

The idea was inspired by a recent visit to an auto repair shop, where the mechanics wanted to run a dye test to locate the exact valve that was leaking in my car. It also expands on the talk by Brian Chess and Jacob West at Black Hat about taint propagation.

Every variable declared in application code has a lifecycle that falls into one of the following categories:

  • The variable is set only by the application code, is passed around within the application server itself, and never leaves the server host; we will call this a "server side variable".
  • The variable is set only by the application code and is passed to other server side components, such as a database server, but never to a client browser; we will call this a "multiple server side variable".
  • The variable is set only by the application code and can be passed between a server and a client, with the intention that the client never tampers with it; we will call this a "server fixed variable".
  • The variable is intended to be set on the client side and is passed between a server and a client one or more times; we will call this a "client side variable".

In the scenarios above, we have identified four variable types based on information flow or, put another way, four trust levels for the variables in application code. Now imagine a secure coding framework that can identify these four variable types by giving each one a standard nomenclature that differentiates it from the others.
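As a rough illustration, the four categories can be written down as an enumeration with a trust ranking. This is only a sketch: the names `VariableKind`, `TRUST_LEVEL` and `least_trusted` are hypothetical, the relative ranking of the two middle categories is my assumption, and the rule that a derived value inherits the lowest trust of its inputs is borrowed from taint propagation rather than stated above.

```python
from enum import Enum


class VariableKind(Enum):
    """The four information-flow categories (identifier names are illustrative)."""
    SERVER_SIDE = "server side variable"
    MULTIPLE_SERVER_SIDE = "multiple server side variable"
    SERVER_FIXED = "server fixed variable"
    CLIENT_SIDE = "client side variable"


# Assumed ranking: a higher number means a more trusted variable.
TRUST_LEVEL = {
    VariableKind.SERVER_SIDE: 3,
    VariableKind.MULTIPLE_SERVER_SIDE: 2,
    VariableKind.SERVER_FIXED: 1,
    VariableKind.CLIENT_SIDE: 0,
}


def least_trusted(*kinds: VariableKind) -> VariableKind:
    """Kind of a value built from several inputs: the least trusted input wins."""
    return min(kinds, key=TRUST_LEVEL.__getitem__)
```

For instance, `least_trusted(VariableKind.SERVER_SIDE, VariableKind.CLIENT_SIDE)` yields `VariableKind.CLIENT_SIDE`, so mixing trusted and untrusted data never raises the trust level of the result.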

For example, a server fixed variable would always be declared with the tag "pi" in its declaration (stringpi x, intpi y, etc.), and a client side variable would always be declared with the tag "sigma" (stringsigma x, intsigma y, etc.).
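In a language without custom declaration keywords, one way to approximate this tagging is with small wrapper types. The sketch below is an assumption about how the nomenclature could be realised, not an existing framework: `StringPi`, `StringSigma` and `classify` are hypothetical names.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class StringPi:
    """Hypothetical 'stringpi': a string that is a server fixed variable."""
    TAG: ClassVar[str] = "pi"
    value: str


@dataclass(frozen=True)
class StringSigma:
    """Hypothetical 'stringsigma': a string that is a client side variable."""
    TAG: ClassVar[str] = "sigma"
    value: str


def classify(var) -> str:
    """Map the tag carried by a variable's type back to its category."""
    categories = {"pi": "server fixed variable", "sigma": "client side variable"}
    return categories[type(var).TAG]
```

Because the tag lives on the type rather than on the value, it cannot be dropped or overwritten when a value is created, which is the property the naming convention is meant to give.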

Once a secure coding framework can classify variables by type, it can apply a specific input validation or output encoding technique depending on the variable type.
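A minimal sketch of that dispatch, assuming the framework keeps a table from tag to validator. Everything here is hypothetical: the `"server"` tag, the `ISSUED_ROLES` set, the function names, and the particular checks chosen for each tag.

```python
import re
from typing import Callable

# Hypothetical set of values the server itself issued to the client.
ISSUED_ROLES = {"viewer", "editor"}


def regex_check(value: str) -> str:
    """Client side variable: accept only a conservative character set."""
    if not re.fullmatch(r"[A-Za-z0-9_ ]{1,64}", value):
        raise ValueError(f"regex check failed: {value!r}")
    return value


def whitelist_check(value: str) -> str:
    """Server fixed variable: must equal something the server sent out."""
    if value not in ISSUED_ROLES:
        raise ValueError(f"value was tampered with: {value!r}")
    return value


def no_check(value: str) -> str:
    """Server side variable: trusted at run time, passed through unchanged."""
    return value


VALIDATORS: dict[str, Callable[[str], str]] = {
    "sigma": regex_check,
    "pi": whitelist_check,
    "server": no_check,
}


def operate(tag: str, value: str) -> str:
    """Every operation on a tagged variable goes through its validator first."""
    return VALIDATORS[tag](value)
```

Here `operate("pi", "editor")` returns the value unchanged, while `operate("pi", "admin")` raises `ValueError` because the server never issued that value.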

Let's say a client side variable keeps its tag through each and every operation performed on it, right from the point where it is declared in the application code. Since it is a client side variable, and hence a low trust variable, a regex check is applied to it on every operation.
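To make this concrete, here is a minimal sketch of a client side string whose operations always return the same tagged type, so the regex check reruns each time. The class name `StringSigma`, its methods, and the allowed character set are assumptions chosen for illustration.

```python
import re


class StringSigma:
    """Hypothetical client side string: low trust, re-validated on every operation."""

    # Assumed whitelist: letters, digits, spaces and a few harmless punctuation marks.
    ALLOWED = re.compile(r"[A-Za-z0-9 _.,@-]*")

    def __init__(self, value: str):
        if not self.ALLOWED.fullmatch(value):
            raise ValueError(f"rejected client side value: {value!r}")
        self.value = value

    def concat(self, other: "StringSigma | str") -> "StringSigma":
        """Return a new StringSigma, so the tag survives and the check reruns."""
        other_value = other.value if isinstance(other, StringSigma) else other
        return StringSigma(self.value + other_value)

    def upper(self) -> "StringSigma":
        """Another operation that keeps the tag and triggers the check again."""
        return StringSigma(self.value.upper())
```

With this, `StringSigma("alice").concat(" smith").upper()` is still a `StringSigma`, while `StringSigma("alice").concat("<script>")` raises `ValueError` at the moment of the operation.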

Similarly, let's say a server side variable never leaves the server and is hence a high trust variable; a regex check is applied to it only once, when the application code is compiled.

For example, every time a stringpi is declared, operated upon, moved, copied, etc., the framework applies the input validation technique best suited to that variable type, such as a regex check, parameterization, or a white-list approach.
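Of the techniques listed, parameterization is the one aimed at SQL injection. A small sketch using Python's standard `sqlite3` module shows the effect; the `users` table and the `find_user` helper are made up for the example.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str):
    """Look up a user with a bound parameter; `name` is never parsed as SQL."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# In-memory database with two sample rows (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))         # [(1, 'alice')]
print(find_user(conn, "x' OR '1'='1"))  # []: the payload is compared as a plain string
```

A framework following the idea above would route every tagged variable that reaches the database through a call of this shape instead of string concatenation.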

Similarly, a client side variable might be subjected to both input validation and output encoding to prevent, let's say, a cross-site scripting attack. With this approach, we achieve a number of objectives for mitigating injection attacks:

  • No variable goes unobserved, since it can be traced through the standard nomenclature.
  • As soon as a variable is operated upon, the framework applies the requisite input validation or output encoding technique for its variable type.
  • The developer's job gets easier: when working on independent code that is a module of the master code, the variable type is already identified, so the corresponding input validation or output encoding technique is applied automatically by the framework.

Please keep in mind that the basic premise of the idea is for coding frameworks to be able to classify variables by information flow as they traverse trust boundaries, and subsequently to apply validation controls or output encoding whenever these "now traceable" variables are operated upon. To minimize the performance impact, the validation control can be applied repeatedly to an untrusted variable, every time it is operated upon, through a coding framework class, or just once to a trusted variable, at compile time.

These are just initial thoughts, and I would appreciate any feedback and validation of this idea.
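As one concrete case of the validation plus output encoding mentioned for client side variables, here is a minimal sketch against cross-site scripting. The validation policy (length limit, no control characters) and the function names are assumptions; `html.escape` is the standard library's HTML encoder.

```python
import html
import re


def validate_comment(raw: str) -> str:
    """Input validation: reject over-long input and control characters (assumed policy)."""
    if len(raw) > 500 or re.search(r"[\x00-\x08\x0b\x0c\x0e-\x1f]", raw):
        raise ValueError("comment rejected")
    return raw


def render_comment(raw: str) -> str:
    """Output encoding: escape HTML metacharacters before the value reaches the page."""
    return "<p>" + html.escape(validate_comment(raw), quote=True) + "</p>"


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The two steps are deliberately separate: validation decides whether the value is accepted at all, and encoding makes whatever was accepted safe for the HTML context it is written into.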