Helping Developers Address Security with the CERT C Secure Coding Standard

David Keaton

By analyzing vulnerability reports for the C, C++, Perl, and Java programming languages, the CERT Secure Coding Team observed that a relatively small number of programming errors leads to most vulnerabilities. Our research focuses on identifying insecure coding practices and developing secure alternatives that software programmers can use to reduce or eliminate vulnerabilities before software is deployed. In a previous post, I described our work to identify vulnerabilities that informed the revision of the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC) standard for the C programming language. The CERT Secure Coding Team has also been working on the CERT C Secure Coding Standard, which contains a set of rules and guidelines to help developers code securely. This posting describes our latest set of rules and recommendations, which aims to help developers avoid undefined and/or unexpected behavior in deployed code.

History of Addressing Security Issues in C

The C programming language began to take shape in 1969, long before security concerns became important for its applications. C was first standardized in 1989, too soon to take into account the then-budding security problems on the ARPANET. Due to lack of customer demand for security, even the 1999 revision of the C standard contained only one security-related feature, the snprintf() function mentioned in my previous blog post.

In recent years, however, C developers have been forced to turn their attention to security issues. The CERT C Secure Coding Standard addresses this need by providing rules and recommendations for avoiding security problems in the following categories:

  • preprocessor - issues dealing with macros
  • declarations and initialization - choosing the right storage duration and type qualifiers, and C language rules for the uniqueness of variable names
  • expressions - order of evaluation, safe use of C syntax
  • integers - arithmetic issues such as avoiding integer overflow
  • floating point - quirks of computer arithmetic that are often overlooked by people who are used to using integers
  • arrays - allocating and communicating the correct size, and using the correct types
  • characters and strings - ensuring that character sequences are null-terminated, and proper use of narrow and wide characters
  • memory management - avoiding memory leaks, double free, and underallocation
  • input/output (I/O) - proper use of C's file I/O library
  • environment - interfacing with the operating system
  • signals - best practices for handling asynchronous events
  • error handling - ensuring correct detection of error conditions
  • application programming interfaces - security-conscious design of the interfaces between parts of a program
  • concurrency - issues that arise in multithreaded programs
  • miscellaneous - issues not covered by other categories, such as assertions and maintaining the security of function pointers
  • POSIX - issues specific to operating systems that conform to POSIX, which is widely used with C.

Examples of CERT C Secure Coding Rules

The remainder of this blog posting gives some examples of the types of secure coding rules we've defined for C.

Preprocessor macros. One part of the CERT C Secure Coding Standard focuses on the C preprocessor, which is a macro expander that executes at the beginning of the compilation process. Far too often, programmers overlook security-related consequences of preprocessor misuse. If a programmer passes an expression that has a side effect as an argument to a macro, the macro may cause the side effect to occur multiple times, depending on how it uses that argument.

The CERT Secure Coding team recently developed rules to ensure that a programmer doesn't accidentally pass an argument with a side effect to such a macro, or that the code is written so that the side effect occurs only once. While the coding errors addressed by these and other rules described in this post wouldn't normally be considered security risks, they have resulted in security problems when the code is deployed and activated.

For example, we have developed the following rules and recommendations for developers to follow when they write code that involves the C preprocessor:

  1. Avoid side effects in arguments to unsafe macros (Identifier PRE31-C). This hard-and-fast rule states that if a developer is using a macro that uses its arguments more than once, then the developer must avoid passing any arguments with side effects to that macro. An example of code that violates PRE31-C is

    #define ABS(x) (((x) < 0) ? -(x) : (x))
    /* ... */
    m = ABS(++n); /* n is incremented twice, not once */

  2. Do not define unsafe macros (Identifier PRE12-C). This recommendation is related to PRE31-C, which defines an unsafe macro as one that evaluates any of its arguments more than one time.
  3. Macro replacement lists should be parenthesized (Identifier PRE02-C). This recommendation suggests that developers should use parentheses around macro replacement lists; otherwise, operator precedence may cause the expression to be computed in unexpected ways. For example, if an argument contains a plus sign (+), and a macro contains a multiplication sign (*), and the argument has not been parenthesized, then the multiplication will occur first, followed by the addition, which may not be what a developer expected. Likewise, if the replacement list as a whole is not parenthesized, an operator next to the macro invocation can bind to only part of the expansion. An example of code that violates PRE02-C is listed below:

    #define CUBE(X) (X) * (X) * (X)
    int i = 3;
    int a = 81 / CUBE(i); /* evaluates to 243 */
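
The following is a minimal sketch of how the two examples above might be rewritten to follow these rules. The names iabs, abs_after_increment, and divide_81_by_cube are hypothetical helpers introduced only for illustration; they are not part of the CERT C Secure Coding Standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replacement for ABS(x): a function evaluates its
   argument exactly once, so a side effect happens only once.
   Like the macro, it must not be called with INT_MIN. */
static inline int iabs(int x) {
    return (x < 0) ? -x : x;
}

/* Replacement list fully parenthesized, so the expansion behaves
   as a single operand inside a larger expression. */
#define CUBE(X) ((X) * (X) * (X))

/* Increments *n once, then returns the absolute value of the result. */
int abs_after_increment(int *n) {
    assert(n != NULL);
    return iabs(++*n);
}

/* Divides 81 by the cube of i; i must be nonzero. */
int divide_81_by_cube(int i) {
    assert(i != 0);
    return 81 / CUBE(i);
}
```

With n equal to -3, abs_after_increment leaves n at -2 and returns 2, and divide_81_by_cube(3) yields 3 rather than 243.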

Declarations. C provides a range of mechanisms to declare data types and variables of these data types. For example, a developer might declare a variable in an outer scope and also declare another variable of the same name in a nested inner scope. In such a case, the variable in the inner scope hides the variable in the outer scope. When a developer makes changes to that variable, he or she might assume that changes are being made to the outer-scope variable when, in fact, only the inner-scope variable is being changed.
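
As an illustration of the hiding problem described above, here is a small sketch. The function names largest_shadowed and largest_fixed are assumptions made for this example, and both functions are meant to return the largest non-negative value in an array.

```c
#include <assert.h>
#include <stddef.h>

/* The inner declaration hides the outer `max`, so the outer variable
   is never updated and the function always returns 0. */
int largest_shadowed(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int max = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] > max) {
            int max = values[i]; /* new inner variable, not an assignment */
            (void)max;           /* silence unused-variable warnings */
        }
    }
    return max;
}

/* Assigning to the existing variable gives the intended result. */
int largest_fixed(const int *values, size_t count) {
    assert(values != NULL || count == 0);
    int max = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] > max) {
            max = values[i];
        }
    }
    return max;
}
```

For the array {3, 9, 4}, largest_shadowed returns 0 while largest_fixed returns 9. Compilers such as GCC and Clang can report this kind of hiding when the -Wshadow warning is enabled.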

The declarations problem is compounded by the fact that the C standard guarantees only a limited number of significant characters in an identifier. An implementation may ignore characters beyond that limit when deciding whether two names are the same. The C standard has several requirements including the following:

  1. A macro name has 63 significant initial characters. If a program has two macro names that differ only in the 64th character or later, the compiler is allowed to treat them as the same name.
  2. An external identifier has 31 significant initial characters. If a program has two external variables whose names differ only in the 32nd character or later, the compiler is allowed to treat them as the same variable.

The limits described above can cause a problem in which a developer declares two variables that inadvertently collide in the same scope. While the two names differ as written, the compiler might truncate them to such a degree that the names are the same.
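
One way to picture these limits is a small check that reports whether two differently spelled names agree in all of their significant characters. This is only a sketch: may_collide and the two enum constants are hypothetical names, and the limits are the ones quoted above rather than those of any particular compiler, many of which treat more characters as significant.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Minimum numbers of significant initial characters quoted above. */
enum { EXTERNAL_SIGNIFICANT = 31, MACRO_SIGNIFICANT = 63 };

/* Returns true when a and b are spelled differently but agree in
   their first `significant` characters, so an implementation that
   honors only the minimum limit may treat them as the same name. */
bool may_collide(const char *a, const char *b, size_t significant) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) != 0 && strncmp(a, b, significant) == 0;
}
```

For example, global_configuration_table_version1 and global_configuration_table_version2 differ only in their 35th character, so may_collide reports true for the 31-character external limit and false for the 63-character macro limit.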

Characters and strings. C defines a set of functions that operate on strings composed of characters. It is common practice for developers to count the number of characters needed in a string and allocate exactly the same number of bytes.

For example, to store the text version of the IPv4 address 255.255.255.255, a developer might allocate 15 bytes, one per character, when 16 bytes are actually needed. The additional byte holds the null terminator, a byte whose value is zero that marks the end of the string.

While languages such as Fortran store a count of how many characters the string contains as part of the string data structure, C doesn't do that. If the marker that indicates the end of the string is missing, then the software doesn't know where to stop. Instead, it keeps reading or writing through memory beyond the end of the allocated space. While the previous rules and recommendations are intended to prevent errors that can eventually lead to security problems, neglecting to include a byte for the null terminator leads directly to a buffer overflow.
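
The IPv4 example can be sketched as follows, using the snprintf() function mentioned earlier so that the output is always null-terminated within the given size. The helper format_ipv4 and the constant IPV4_TEXT_SIZE are hypothetical names chosen for this illustration, and the helper does not check that each value fits in one octet.

```c
#include <assert.h>
#include <stdio.h>

/* "255.255.255.255" is 15 characters; one more byte is needed
   for the null terminator. */
enum { IPV4_TEXT_SIZE = 16 };

/* Writes a, b, c, and d as dotted-decimal text into buf, which holds
   `size` bytes. Returns 0 on success, or -1 if the text was truncated
   or a formatting error occurred. */
int format_ipv4(char *buf, size_t size,
                unsigned a, unsigned b, unsigned c, unsigned d) {
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "%u.%u.%u.%u", a, b, c, d);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A 16-byte buffer holds the full address plus its terminator, while a 15-byte buffer causes format_ipv4 to report truncation instead of writing past the end of the buffer.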

Future Work

Since publishing the first version of the CERT C Secure Coding Standard, we've gained experience and improved our approach to vulnerability analysis and to developing rules and recommendations.

A new version of the CERT C Secure Coding Standard will eventually be published to update the existing rules and recommendations, as well as to add some for new C features, such as the standard C multithreading library.

Meanwhile, the current work has already met with success. Cisco and Oracle have adopted the CERT C Secure Coding Standard as part of their internal processes. We continue to hear of additional interest from various organizations.

Additional Resources

For more information about the CERT Secure Coding initiative, please visit
https://www.sei.cmu.edu/research-capabilities/all-work/display.cfm?customel_datapageid_4050=21274
