Is Java More Secure than C?
PUBLISHED IN
Secure Development
Whether Java is more secure than C is a simple question to ask, but a hard question to answer well. When we began writing the SEI CERT Oracle Coding Standard for Java, we thought that Java would require fewer secure coding rules than the SEI CERT C Coding Standard because Java was designed with security in mind. We naively assumed that a more secure language would need fewer rules than a less secure one. However, Java has 168 coding rules compared to just 116 for C. Why? Was our (admittedly simplistic) assumption completely spurious? Or, are there problems with our C or Java rules? Or, are Java programs, on average, just as susceptible to vulnerabilities as C programs? In this post, I attempt to analyze our CERT rules for both C and Java to determine if they indeed refute the conventional wisdom that Java is more secure than C.
I will assume that the C and Java rules are both consistent and comprehensive. That is, every vulnerability that can be encoded using the subsets of C and Java, as specified in both standards' Scope sections, can be categorized as noncompliant with exactly one rule. Thus, there are no vulnerabilities covered by two rules, nor do the rules omit any vulnerabilities.
Both standards were written with these goals in mind. For C we restrict our rules to the ISO C standard [C11], and for Java we restrict our rules to the Java Language Specification plus some of the core libraries. These standards were also developed on our public wiki and reviewed by experts in the C and Java communities. We acknowledge that the number of rules for any domain is an interesting but not persuasive metric regarding the domain's security. A more persuasive line of thought is as follows: If the secure coding rules for one domain are a proper subset of the secure coding rules for a second domain, then the first domain is more secure than the second. We would also assert that the first domain is simpler to use, because a developer concerned with security would need to remember fewer rules.
Media coverage of exploits of Java desktop and server apps has been relatively limited. While Java has suffered a few high-profile exploits, the most notorious of these exploited vulnerabilities in the Java core libraries and only worked on applets. Most exploits that involve Java are injection exploits, such as cross-site scripting (XSS), that are not specific to the language itself. In contrast, C has a long and sordid history of exploits going back to the late 1980s (and probably earlier). For these reasons, Java is often considered more secure.
For this analysis, I decided to focus on the most critical rules in the C and Java coding standards. Both coding standards provide a severity metric for each rule, which ranks the consequences of violating the rule as follows:
So I simply counted the rules that have a high severity for both C and Java, assuming Java would have fewer high-severity rules than C. Here are my results:
It appears that Java does have slightly fewer rules that address code execution or privilege escalation. The difference, however, appears rather marginal.
I then looked more deeply into the high-severity rules in both C and Java and tried to categorize them by asking why each one was high-severity. Here is a summary of my results, which are then explained further below.
Categorization of High-Severity Rules
- Memory Corruption. Memory corruption comprises the biggest category of high-severity rules in C. Java has no analogous rules because its type system prevents memory corruption, which includes vulnerabilities such as buffer overflows, format-string vulnerabilities, and use-after-free errors. It has become harder to exploit memory corruption in C programs, however, due to the advent of memory-protection technologies, such as address space layout randomization (ASLR) and data execution prevention (DEP). ASLR randomizes the layout of the program and its associated data in memory. On a typical 32-bit Linux system, ASLR reduces the success rate of a code-execution attack by a factor of 65,536 (or 2^16), according to Shacham and colleagues (Shacham 2004). ASLR can be defeated by learning the memory layout of a long-running program by means of some lesser exploit that reveals memory layout. ASLR also requires support from the program, all associated libraries, and the operating platform.
DEP partitions memory into writable memory (that contains data) and executable memory (that contains code) and forbids executing memory that is also writable. Consequently, DEP thwarts simple exploit techniques but can still be subject to more advanced techniques, such as return-oriented programming (Shacham 2007). Both ASLR and DEP are supported by major desktop and mobile operating systems. They may not be available, however, on embedded platforms, which also support C programs. Because these technologies are neither perfect nor universally available, we continue to promote adherence to the CERT C rules associated with memory corruption.
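As a rough illustration of why Java has no memory-corruption rules, the sketch below runs the kind of unchecked copy loop that could overflow a buffer in C. The class and method names are hypothetical and chosen only for this example. In Java, the store past the end of the destination array is rejected with an exception instead of overwriting adjacent memory.

```java
// Hypothetical demo class; not part of either CERT standard.
final class BoundsCheckSketch {

    // Copies src into dst with no explicit length check, as a careless C loop might.
    // Returns true if every byte fit, false if the JVM rejected an out-of-bounds store.
    static boolean copyUnchecked(byte[] src, byte[] dst) {
        try {
            for (int i = 0; i < src.length; i++) {
                dst[i] = src[i]; // the JVM checks i against dst.length on every store
            }
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The out-of-bounds store never happened, so nothing beyond dst was modified.
            // Catching this exception is done here only to make the outcome observable.
            return false;
        }
    }
}
```

The failed copy is still a bug, and an uncaught exception can still terminate the program, but the consequence is closer to a denial of service than to arbitrary code execution.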
- Privilege Escalation. Both C and Java have rules describing privilege escalations, but the privileges vary widely between the two languages. Java has an internal privilege system, which allows some code to run with no restrictions, while other code in the same program can have various privileges withheld, such as the ability to write a file to disk or send data over the network. Applets, typically, have restricted privilege, and the most well-publicized Java exploits undermined Java's internal privilege model, granting themselves the same privilege as desktop Java applications. Thus, the Java rules address internal privilege escalation (IPE), which undermines Java's internal security model.
In contrast, C has no internal privilege model; there is no feature in the C standard that a program can use to limit what some other portion of the program can do. The C rules that address privilege escalation focus instead on an external privilege model, in which some C programs execute. The Windows privilege model and the UNIX permissions model typify the privilege systems alluded to in the four C rules that discuss external privilege escalation (EPE). These privilege models are external to any programming language, so they apply to Java code just as much as they apply to C code.
- Injection. Injection refers to the ability of an attacker to run malicious code in some language other than C or Java. For SQL injection, the language is SQL, and for XSS, the language is HTML, which can include Flash or JavaScript. The two C rules that cover injection are "ENV33-C. Do not call system()" and "FIO30-C. Exclude user input from format strings." One might argue that FIO30-C qualifies under both memory corruption and injection. I have classified it under injection because so few other rules could be classified here. Injection vulnerabilities can occur in both languages. Java has more injection rules than C simply because it comes with more subsystems than standard C. For example, SQL injection is possible in both C and Java, but only Java provides a standard library for connecting to SQL databases (JDBC); hence, only Java has a rule about SQL injection.
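To make the SQL injection case concrete, here is a minimal sketch using the standard JDBC API. The table name `users`, its columns, and the class and method names are assumptions made up for this example. The first method shows how concatenated input becomes part of the SQL text; the second binds the input as data through `PreparedStatement`.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical demo class; the "users" table and its columns are assumed.
final class InjectionSketch {

    // Noncompliant: attacker-controlled text is spliced into the statement itself.
    static String unsafeQuery(String userName) {
        return "SELECT id FROM users WHERE name = '" + userName + "'";
    }

    // Compliant: the statement text is fixed and the input is bound as a parameter,
    // so quotes inside userName cannot change the structure of the query.
    static boolean userExists(Connection conn, String userName) throws SQLException {
        String sql = "SELECT id FROM users WHERE name = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, userName);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next();
            }
        }
    }
}
```

Passing the input `x' OR '1'='1` to `unsafeQuery` yields a statement whose WHERE clause is true for every row, which is the language-independent flaw this category of rules addresses.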
- Leftovers. The remaining categories contain only seven C rules and nine Java rules. I will show that all of the Java rules have analogous C rules, or would have C rules if standard C covered the same categories. In other words, these categories only provide vulnerabilities in Java that also exist in C.
The single Java rule about C code execution is JNI03-J. Do not use direct pointers to Java objects in JNI code. This is our first rule about the Java Native Interface (JNI), and it did not fit well in any other category. The rule has high severity because it describes JNI code, which is typically written in C.
Finally, the C standard (C11, Section 3.4.3) provides this definition of undefined behavior:
Undefined Behavior - behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
The C standard is aimed at compiler writers and platform builders, so this definition gives writers and builders the leeway to allow a program that exhibits undefined behavior to do anything it chooses, including allowing itself to get hacked.
The two Java rules classified under unexpected behavior are MET00-J. Validate method arguments and MSC02-J. Generate strong random numbers. Java lacks an analogous concept of undefined behavior. These rules have high severity because an attacker can use violations of these rules to bypass an authentication mechanism and acquire escalated privileges. Both of these problems can occur in C code as well as Java.
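The following sketch combines the ideas behind both rules: it validates its argument before use and draws randomness from `java.security.SecureRandom` rather than `java.util.Random`. The class name, method name, and size limits are assumptions chosen for illustration and are not taken from the CERT standard.

```java
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical demo class; the size limits below are assumed, not prescribed.
final class TokenSketch {
    private static final SecureRandom RNG = new SecureRandom();
    private static final int MIN_BYTES = 16;   // assumed floor: 128 bits of randomness
    private static final int MAX_BYTES = 1024; // assumed ceiling to bound allocation

    // Returns a URL-safe token built from numBytes cryptographically strong random bytes.
    static String newToken(int numBytes) {
        // Validate the argument before it influences any security-relevant state.
        if (numBytes < MIN_BYTES || numBytes > MAX_BYTES) {
            throw new IllegalArgumentException(
                "numBytes must be between " + MIN_BYTES + " and " + MAX_BYTES);
        }
        byte[] raw = new byte[numBytes];
        RNG.nextBytes(raw); // SecureRandom, not java.util.Random
        return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
    }
}
```

A predictable session token or an unvalidated length would be just as dangerous in a C program, which is why neither rule reflects a weakness specific to Java.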
Evaluation
The preceding analysis demonstrates that all of the high-severity Java rules also apply to C code, except for those in Java's biggest category, which is internal privilege escalation (IPE). C has no possibility of IPE because C lacks an internal privilege model. We also note that C's biggest security category is memory corruption, which does not affect Java code. Finally, Java has only a marginally lower percentage of high-severity rules than C.
Let's now consider the scope of these rules. Some rules are simply out-of-scope for many programs. For example, the concurrency rules in both C and Java are inapplicable to single-threaded programs. A developer who writes a single-threaded program therefore has fewer secure coding rules to comply with than a developer writing a multithreaded program.
Next, let's consider Java's IPE category. This set of rules is designed for Java code that operates in the face of untrusted code that runs in the same program. Most Java code, however, does not work with untrusted code. Desktop applications typically do not involve untrusted code. An application with a plugin framework, such as Eclipse, could employ Java's security architecture to restrict plugins, but it does not have to. Eclipse, for example, assumes that any installed plugins are just as trusted as the core application. In other words, Eclipse depends on the user to protect it from untrusted code. Consequently, the IPE rules do not apply to Eclipse or to most desktop applications.
What about applets and servlets? Java applets may be dying, but servlets are quite alive and well, as used by JBoss and Apache Tomcat. Most applets and servlets do run with restricted privileges but typically do not interact with untrusted code. They interact only with their container framework, such as JBoss, so the IPE rules do not apply to applets or servlets.
The IPE rules are designed for code to handle untrusted code, including in applet containers and servlet containers, such as JBoss and Tomcat, and any libraries these containers may depend on, such as the Java core libraries. If you are writing Java core library code, or code that is used in a servlet framework, the IPE rules apply to you. If you are writing only desktop applications, applets, or servlets themselves, however, you can ignore the IPE rules.
The closest analogue provided by C is C code that is privileged by a platform but which must interact with unprivileged code. UNIX programs with root privileges or Windows programs with administrative privileges would apply here.
Consequently, if you are writing unprivileged C code, you can ignore the four EPE rules of C. If you are writing unprivileged Java code, you can ignore the 20 IPE rules. Most code, both in C and Java, is unprivileged.
So, how do our high-severity rules look if we restrict our rules to those that don't cover privilege escalation, either internal or external?
Now that we have excluded privilege escalation, the difference in severity between Java and C rules is striking: 6 percent of Java rules vs. 25 percent of C rules. If you are writing unprivileged code, therefore, you have many fewer rules to worry about in Java than you do in C. Consequently, this table strongly hints that Java is more secure than C. In fact, as we showed earlier, all nine of the remaining high-severity Java rules also apply to C, which provides more rigorous support for our hypothesis.
Wrapping Up and Looking Ahead
In summary, the only technologies that prevent remote execution in the face of memory corruption (such as buffer overflows) are not yet trusted enough by the C community to allow us to ignore the CERT C rules that address memory corruption. In contrast, Java programs do not suffer from memory corruption, because Java's type system prevents it. Java's main exposure is internal privilege escalation, and the Java Virtual Machine never loads untrusted Java code into memory unless specifically instructed to do so by a Java program. Most Java programs do not load untrusted code into memory, so many rules in our Java standard do not apply to them, including the 20 high-severity IPE rules.
If you are writing Java code to manage unprivileged Java code (such as an applet container), you are subject to about as many severe rules as if you are writing C code. If your Java code is itself unprivileged, however, or if you are ignoring Java's privilege model, you are subject to far fewer high-severity rules than if you program in C.
Additional Resources
For more information about the work of the CERT Secure Coding Team, please visit
http://www.cert.org/secure-coding/.
To read the paper On the Effectiveness of Address-Space Randomization by Shacham, Hovav, Page, Matthew, Pfaff, Ben, Goh, Eu-Jin, Modadugu, Nagendra, and Boneh, Dan, please click here.
To read the paper The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) by Shacham, Hovav, please click here.
To view the standard [C11] ISO/IEC. Programming Languages--C, 3rd ed., please click here.