Automated Code Repair in the C Programming Language
This blog post is co-authored by Will Snavely.
Finding violations of secure coding guidelines in source code is daunting, but fixing them is an even greater challenge. We are creating automated tools for source code transformation. Experience in examining software bugs reveals that many security-relevant bugs follow common patterns (which can be automatically detected) and that there are corresponding patterns for repair (which can be performed by automatic program transformation). For example, integer overflow in calculations related to array bounds or indices is almost always a bug. While static analysis tools can help, they typically produce an enormous number of warnings. Once an issue has been identified, teams are only able to eliminate a small percentage of the vulnerabilities identified. As a result, code bases often contain an unknown number of security bug vulnerabilities. This blog post describes our research in automated code repair, which can eliminate security vulnerabilities much faster than the existing manual process and at a much lower cost. While this research focuses to the C programming language, it applies to other languages as well.
Vast Amounts of Code Contain Many Security Vulnerabilities
Based on our experience with the Department of Defense (DoD) source code analysis labs, we know that most software contains many vulnerabilities. Most are caused by simple coding errors. Static analysis tools, typically used late in the development process, produce a huge number of diagnostics. Even after excluding false positives, the volume of true positives can overwhelm the abilities of development teams to fix the code. Consequently, the team eliminates only a small percentage of the vulnerabilities. The existing installed codebases in the DoD now consist of billions of lines of C code that contain an unknown number of security vulnerabilities.
Most analyzers provide basic diagnostics but do not provide automated fixes or code modifications. Integrated development environments (IDEs), such as Eclipse, offer some automated code modification. Some IDEs can fix code that has specific compilation errors, such as Quick Fixes in Eclipse. While IDEs provide some refactoring options, they do not typically change the behavior of working code. Existing techniques for addressing security problems in code often require programmers to add more information--such as annotations and attributes--that can then be post-processed. These techniques can work in new development, but they have the same practical limitations that manually addressing thousands of diagnostics has in existing programs. We need a better way to fix existing code.
Foundations of Our Work
We are developing automated source code transformation tools to remediate vulnerabilities in code that are caused by violations of rules in the CERT Secure Coding Standards. These tools convert noncompliant code into code that complies with the CERT standards. The tools reduce vulnerabilities without the need for developers to manually review thousands of diagnostics produced by static analysis tools. Often the process is completely automated, although some code transformations are only semi-automated and require input from the user.
Our work on automated repair is based on three premises:
- Many security bugs follow common patterns. For example, one common pattern of security bugs is a memory allocation such as
"p = malloc(n * sizeof(T))"where n is attacker-controlled. If n is very large, integer overflow occurs (as discussed below), and too little memory gets allocated, setting the stage for a subsequent buffer overflow.
- By recognizing such a pattern, it is possible to make a reasonable guess of the developer's intention (which we'll call an inferred specification). For example, the inferred specification in the above malloc case would be "Try to allocate enough memory to hold n objects of type T".
- It is possible to repair the code to satisfy this inferred specification. For example, for the
mallocexample, the repair would be to check if overflow occurs, and if so, to simulate
To develop our tool for automated code repair, we extended ROSE, which is a framework for source code transformation. Our ultimate goal is to reduce the number of rule violations that require manual inspection by two orders of magnitude--from thousands to tens. At this scope, a development team can mitigate all unhandled violations. Automated code repair reduces a system's attack surface and improves its ability to withstand cyber-attacks while sustaining critical functions.
Repairing Undesired Integer Overflows
Primitive integer types are represented by a fixed number of bits (e.g., 32 or 64). Overflow occurs when the result of an arithmetic operation cannot be represented by the available number of bits. In this case, the result is computed using modular arithmetic.
In this project, we focused on integer overflow that leads to memory corruption, specifically: (1) Integer overflow in calculation of how much memory to allocate, (2) Integer overflow in bounds checks for an array. The Android Stagefright vulnerabilities disclosed in July 2015 exhibit both of these types of integer overflow.
Our key assumption for repairing these types of integer overflow is that inequality comparisons involving array indices or array bounds should behave as if normal arithmetic (not modular arithmetic) were used. This assumption includes
malloc(), which internally compares the requested amount of memory to the size of the largest unallocated chunk of virtual memory. Integer overflows that are unrelated to memory are not necessarily undesired and are left unchanged. For example, cryptographic and hashing functions are designed to employ modular arithmetic.
In the general case, allowing unbounded loops, normal arithmetic cannot be emulated by fixed-width integers. However, in certain frequently occurring special cases, there are ways to emulate it. In particular, for non-negative integers with only addition or multiplication (no subtraction or division), the value is monotonically non-decreasing (except for multiplication by zero). In this case, when a potentially overflowed value stored in an n-bit integer is compared to a non-overflowed value less than 2n, we can emulate normal arithmetic by using saturation arithmetic: When overflow occurs, we replace the overflowed value with the greatest representable value (2n − 1). Note that passing the saturated value
malloc() will produce the desired behavior of simulating an out-of-memory error.
Figure 1. The arithmetic expression
"start + n" can potentially overflow. To repair this code snippet, we replace
"start + n" with
"UADD(start, n)", where
UADD is defined in a separate header file.
UADD computes the sum using saturation arithmetic instead of modular arithmetic. It can do this using the
__builtin_add_overflow function available in GCC and Clang or using a standard C idiom for detecting unsigned overflow.
Our team is engaged with the DoD Software Assurance Community of Practice and the service evaluation labs, including CERDEC. These labs serve as effective and willing partners for both feedback and technology transition. Project members also collaborated with Claire Le Goues and Christian Kästner, both assistant professors at Carnegie Mellon University's School of Computer Science. Dr. Le Goues is a leading researcher in using genetic programming for automated code repair. In this approach, there are three inputs:
(1) a defective program,
(2) test cases that exercise a fault in the program, and
(3) test cases that exercise normal program behavior.
Genetic programming uses computational analogs of biological mutation and crossover to generate new program variants and search for a variant that gives desired results for all test cases.
Dr. Le Goues's genetic programming approach is a very different to automated repair than our approach, but we plan to work with Dr. Le Goues to explore possibilities of combining our two approaches in a way that preserves the best features of each approach.
Dr. Kästner has pioneered work on symbolically analyzing code under all possible build configurations (combinations of compile-time options). Large projects often have tens or even hundreds of compile-time options, and in the worst case, the number of possible build configurations grows exponentially with the number of options. For example, a project with 20 compile-time options might have a million possible build configurations, way too many to analyze individually. This situation requires a symbolic approach. Most existing work on static analysis considers only one build configuration at a time; the automated code project team hopes to eventually develop a repair that works for all possible build configurations.
Challenges and Future Work
One challenge we encountered in developing an automated repair specifically at the source-code level (as opposed to a source-to-binary compiler pass) is the IR↔AST mapping problem (i.e., the problem of maintaining a mapping between the intermediate representation (IR) and the abstract syntax tree (AST)). Most static analyses work best and/or are far easier to write on an intermediate representation (IR) such as LLVM's IR, and often it is advantageous to perform certain optimization passes such as converting to static single assignment (SSA) form. However, the actual source-to-source repair necessarily must be done at the level of the AST or similar high-level representation that has a direct correspondence to the original source code.
The trouble then is that, once we have found a spot in the IR that needs to be repaired, existing tools provide no way to map it back to the AST. Likewise, it is not just a matter of engineering effort: some useful optimization passes (e.g., copy propagation) destroy the ability to provide an unambiguous mapping from the IR back to the AST. In the coming year we will work on this problem.
The use of macros in C provides an additional obstacle on top of the IR↔AST mapping program. The newest development version of the ROSE compiler framework has some support to modifying source code while preserving macros. Templates in C++ are another, perhaps more vexing, source of difficulty.
A potential downside of automated code repair is that an erroneous transformation can introduce new bugs into the codebase. For this reason, we aim to be extremely conservative in what transformations we apply fully automatically.
In the year ahead, we are also looking forward to working on two automated repair projects:
- Inference of memory bounds
- Incorrect usage of security-related APIs
In conclusion, to achieve the organization's software assurance goals, automated code repair promises to greatly reduce the number of vulnerabilities in a codebase, freeing the organization's resources to focus on fixing the remaining coding errors and developing secure code.
This blog post is based on the paper "Automated Code Repair Based on Inferred Specifications" presented by Will Klieber and Will Snavely at IEEE SecDev Conference 2016 (November 2016.)
Listen to the podcast Is Java More Secure Than C? by David Svoboda.
Read the blog post Prioritizing Alerts from Static Analysis to Find and Fix Code Flaws by Lori Flynn.
Read the SEI technical note Improving the Automated Detection and Analysis of Secure Coding Violations.