Rust Vulnerability Analysis and Maturity Challenges

While the memory safety and security features of the Rust programming language can be effective in many situations, Rust’s compiler is very particular about what constitutes good software design practices. Whenever design assumptions disagree with real-world data and assumptions, there is the possibility of security vulnerabilities, and of malicious software that can take advantage of those vulnerabilities. In this post, we will focus on users of Rust programs, rather than Rust developers. We will explore some tools for understanding vulnerabilities whether the original source code is available or not. These tools are important for understanding malicious software, where source code is often unavailable. We also comment on possible directions in which tools and automated code analysis can improve, on the maturity of the Rust software ecosystem as a whole, and on how that maturity might impact future security responses, including via the coordinated vulnerability disclosure methods advocated by the SEI’s CERT Coordination Center (CERT/CC). This post is the second in a series exploring the Rust programming language. The first post explored security issues with Rust.

Rust in the Current Vulnerability Ecosystem

A MITRE CVE search for “Rust” in December 2022 returned recent vulnerabilities affecting not only a wide range of community-maintained libraries but also cargo itself, Rust’s default dependency management and software build tool. By default, cargo searches for and installs libraries from crates.io, an online repository of mostly community-contributed unofficial libraries similar to those of other software ecosystems, such as Java’s Maven and the Python Package Index (PyPI). The Rust compiler developers regularly test compiler release candidates against crates.io code to look for regressions. Further research will likely be needed to consider the security of crates.io and its impact on vulnerability management and on maintaining a software bill of materials (or software supply chain), especially if the Rust ecosystem is used in critical systems.

Perhaps one of Rust’s most noteworthy features is its borrow checker and ability to track memory lifetimes, along with the unsafe keyword. The borrow checker’s inability to reason about certain situations around the use of unsafe code can result in interesting and surprising vulnerabilities. CVE-2021-28032 is an example of such a vulnerability, in which the software library was able to generate multiple mutable references to the same memory location, violating the memory safety rules normally imposed on Rust code.

The problem addressed by CVE-2021-28032 arose from a custom struct Idx that implemented the Borrow trait, allowing code to borrow some of the internal data contained inside Idx. According to the Borrow trait documentation, to do this correctly and safely, one must also implement the Eq and Hash traits in such a manner as to ensure that the borrow provides consistent references. In particular, borrowable types that also implement Ord need to ensure that Ord’s definition of equality is the same as that of Eq and Hash.
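As a minimal sketch of the consistency requirement described above, the following Rust code implements Borrow<str> for a struct and defines Eq and Hash only over the borrowed field, so that a lookup by the borrowed form agrees with a lookup by the owned form. The Key type, its fields, and lookup_demo are hypothetical names chosen for illustration; they are not taken from the library affected by the CVE.

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Hypothetical key type: only `id` takes part in equality and hashing.
struct Key {
    id: String,
    #[allow(dead_code)]
    note: String, // extra data, deliberately ignored by Eq and Hash
}

impl PartialEq for Key {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}
impl Eq for Key {}

impl Hash for Key {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly what `borrow()` exposes, so Key and str hash alike.
        self.id.hash(state);
    }
}

impl Borrow<str> for Key {
    fn borrow(&self) -> &str {
        &self.id
    }
}

// Insert with the owned key, then look up with the borrowed form.
fn lookup_demo() -> Option<u32> {
    let mut map = HashMap::new();
    map.insert(
        Key { id: "alpha".to_string(), note: "first".to_string() },
        1u32,
    );
    map.get("alpha").copied()
}

fn main() {
    println!("{:?}", lookup_demo());
}
```

If Hash also mixed in the note field, the same lookup could silently miss the entry, because the hash of the owned key would no longer match the hash of the borrowed string.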

In the case of this vulnerability, the Borrow implementation did not properly check for equality across traits and so could generate two different references to the same struct. The borrow checker did not identify this as a problem because it does not check raw pointer dereferences in unsafe code, such as those used for Idx. The issue was mitigated by adding an intermediate temporary variable to hold the borrowed value, to ensure that only one reference to the original object was generated. A more complete solution could include more resilient implementations of the related traits to enforce the assumed unique borrowing. Improvements can also be made to the Rust borrow-checker logic to better search for memory safety violations.
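To illustrate the gap in general terms, here is a small sketch (not the code from the CVE; bump_twice is a hypothetical function) showing that two raw pointers to the same location compile without complaint, whereas the equivalent pair of mutable references would be rejected in safe Rust. This particular example is well-defined because only raw pointers alias; the danger arises when such pointers are turned back into overlapping mutable references.

```rust
// Hypothetical helper: increments the same u32 through two aliasing raw pointers.
fn bump_twice(value: &mut u32) -> u32 {
    // Safe Rust rejects two live mutable borrows of the same place:
    //     let a = &mut *value;
    //     let b = &mut *value; // error[E0499] once `a` is used afterwards
    //
    // Raw pointers carry no such restriction; they may alias freely.
    let p1: *mut u32 = &mut *value as *mut u32;
    let p2: *mut u32 = p1; // raw pointers are Copy
    unsafe {
        // The borrow checker does not track what these dereferences point to.
        *p1 += 1;
        *p2 += 1;
    }
    *value
}

fn main() {
    let mut counter = 40;
    println!("{}", bump_twice(&mut counter));
}
```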

While this is only one example, other CVEs for undefined behavior and other memory access errors appeared in our basic CVE search. These existing CVEs seem to confirm our earlier observations on the limitations of the Rust security model. While it is hard to compare Rust-related CVEs to those of other languages and draw general conclusions about the safety of the language, we can infer that Rust’s memory safety features alone are insufficient to eliminate the introduction of memory-related software vulnerabilities into the code at build time, even if the language and compiler do well at reducing them. The Rust ecosystem must therefore integrate vulnerability analysis, coordinate vulnerability fixes between researchers and vendors, and field solutions rapidly to customers.

In addition to other actions that will be discussed at the end of this post, the Rust community would greatly benefit if the Rust Foundation applied to become or create a related CVE Numbering Authority (CNA). Rust Foundation contributors would be ideal for identifying, cataloging (by assigning CVEs, which are often important for triggering business and government processes), and managing vulnerabilities within the Rust ecosystem, especially if such vulnerabilities stem from rustc, cargo, or basic Rust libraries. Participation in the CVE ecosystem and coordinated vulnerability disclosure (CVD) could help mature the Rust ecosystem as a whole.

Even with Rust’s memory safety features, software engineering best practices will still be needed to avoid vulnerabilities as much as possible. Analysis tools will also be necessary to reason about Rust code, especially to look for vulnerabilities that are more subtle and hard for humans to recognize. We therefore turn to an overview of analysis tools and Rust in the next few sections.

Analysis When Source Code Is Available

The Rust ecosystem provides some experimental tools for analyzing and understanding source code using several methods, including static and dynamic analysis. The simplest tool is Clippy, which can scan source code for certain programming mistakes and adherence to Rust recommended idioms. Clippy can be useful for developers new to Rust, but it is very limited and catches only easy-to-spot errors such as inconsistencies with comments.
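As a small illustration of the kind of idiom check Clippy performs, the sketch below contrasts an index-based loop, which Clippy’s needless_range_loop lint flags, with the iterator-based form it suggests. The function names are hypothetical; Clippy is typically run with cargo clippy.

```rust
// Index-based loop: Clippy's `needless_range_loop` lint flags this pattern.
fn sum_indexed(values: &[u32]) -> u32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

// Iterator-based form: same result, no manual indexing.
fn sum_idiomatic(values: &[u32]) -> u32 {
    values.iter().sum()
}

fn main() {
    let values = [1, 2, 3];
    println!("{} {}", sum_indexed(&values), sum_idiomatic(&values));
}
```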

Rudra is an experimental static-analysis tool that can reason about certain classes of undefined behavior. Rudra has been run against all the crates listed on crates.io and has identified a significant number of bugs and issues, including some that have been assigned CVEs. For example, Rudra discovered CVE-2021-25900, a buffer overflow in the smallvec library, as well as CVE-2021-25907, a double drop vulnerability (analogous to a double-free vulnerability due to Rust’s use of default OS allocators) in the containers library.
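To make the double-drop class of bug more concrete, the following sketch (with hypothetical names, and not taken from either library mentioned above) uses ptr::read to create a bitwise duplicate of a value. Unless one of the two copies is explicitly forgotten, both would be dropped. The Tracked type here only counts drops, so the mistake would be harmless in this sketch; with a heap-owning type such as Box or Vec, the second drop would free the same allocation twice.

```rust
use std::cell::Cell;
use std::mem;
use std::ptr;

// Hypothetical type that records how many times it has been dropped.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the number of times the destructor ran.
fn duplicate_then_forget() -> u32 {
    let drops = Cell::new(0);
    {
        let original = Tracked { drops: &drops };
        // Bitwise copy: two variables now claim ownership of one value.
        let duplicate = unsafe { ptr::read(&original) };
        // Give up the original's ownership so only one destructor runs.
        // Omitting this line would drop the value twice.
        mem::forget(original);
        drop(duplicate);
    }
    drops.get()
}

fn main() {
    println!("destructor ran {} time(s)", duplicate_then_forget());
}
```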

For dynamic analysis, Miri is an experimental Rust interpreter that is designed to also detect certain classes of undefined behavior and memory access violations that are difficult to detect from static analysis alone. Miri works by compiling source code with instrumentation, then running the resulting intermediate representation (IR) in an interpreter that can look for many types of memory errors. Similar to Rudra, Miri has been used to find a number of bugs in the Rust compiler and standard library including memory leaks and shared mutable references.

So how does source-code analysis in Rust compare to source-code analysis in other languages? C and C++ have the most widespread set of static-analysis and dynamic-analysis tools. Java is similar, with the note that FindBugs, while obsolete today, was at one time the most popular open-source static-analysis tool, and consequently has been incorporated into several commercial tools. (C has no analogous most popular open-source static-analysis tool.) In contrast, Python has several open-source tools, such as Pylint, but these only catch easy-to-spot errors such as inconsistent commenting. True static analysis is hard in Python due to its interpreted nature. We would conclude that while the set of Rust code-analysis tools may appear sparse, this sparseness can easily be attributed to Rust’s relative youth and obscurity, plus the fact that the compiler catches many errors that would normally be flagged only by static-analysis tools in other languages. As Rust grows in popularity, it should acquire static- and dynamic-analysis tools as comprehensive as those for C and Java.

While these tools can be useful to developers, source code is not always available. In these cases, we must also look at the status of binary-analysis tools for code generated from Rust.

Binary Analysis Without Source Code

An important example of binary analysis when source code is not immediately available is malware identification. Malware often spreads as binary blobs that are sometimes specifically designed to resist easy analysis. In these cases, semi-automated and fully automated binary-code analysis tools can save a lot of analyst time by automating common tasks and providing crucial information to the analyst.

Increasingly, analysts are reporting malware written in languages other than C. The BlackBerry Research and Intelligence Team identified in 2021 that Go, Rust, and D are increasingly used by malware authors. In 2022, Rust was seen in new and updated ransomware packages, such as BlackCat, Hive, RustyBuer, and Luna. Somewhat ironically, Rust’s memory safety properties make it easier to write cross-platform malware code that “just works” the first time it is run, avoiding memory crashes or other safety violations that may occur in less-safe languages, such as C, when running on unknown hardware and software configurations.

First-run safety is growing in importance as malware authors increasingly target Linux devices and firmware, such as BIOS and UEFI, instead of the historical focus on Windows operating systems. It is very likely that Rust will increasingly be used in malware in the years to come, given that (1) Rust is receiving more support by toolchains and compilers such as GCC, (2) Rust code is now being integrated into the Linux kernel, and (3) Rust is moving toward full support for UEFI-targeted development.

A consequence of this growth is that traditional malware-analysis techniques and tools will need to be modified and expanded to reverse-engineer Rust-based code and better detect non-C-family malware.

To see the sorts of problems that the use of Rust might cause for current binary-analysis tools, let’s look at one concrete example involving representation of types and structures in memory. Rust uses a different default memory layout than C. Consider a struct that consists of two Boolean values in addition to an unsigned int. In C, this could look like:

#include <stdbool.h>

struct Between
{
    bool flag;
    unsigned int value;
    bool secondflag;
};

The C standard requires the representation in memory to match the order in which fields are declared; therefore, memory usage and padding differ considerably depending on whether value appears between the two bools or before or after them. To align along memory boundaries set by hardware, the C representation inserts padding bytes. In struct Between, the default compiler representation on x86 hardware prefers alignment of value. However, flag is represented as 1 byte, which does not fill a full 4-byte “word”. Therefore, the compiler adds 3 bytes of padding after flag, to start value on the appropriate alignment boundary. It then adds 3 more bytes of padding after secondflag to ensure the entire struct’s memory usage stays along alignment boundaries. This means each bool occupies 4 bytes (with padding) instead of 1 byte, and the entire struct takes 4+4+4 = 12 bytes.

Meanwhile, a developer might place value after the two bools, such as the following:

struct Trailing
{
    bool flag;
    bool secondflag;
    unsigned int value;
};

In struct Trailing, we see that the two bools take 1 byte each in the typical representation, and both can fit within the 4-byte alignment boundary. They are therefore packed together with 2 bytes of padding into a single machine word, followed by 4 more (aligned) bytes for value. As a result, the typical C implementation will represent this reordered struct with only 8 bytes: 2 for the two Booleans, 2 bytes as padding up to the word boundary, and then 4 bytes for value.

A Rust implementation of this structure might look like:

struct RustLayout
{
    flag: bool,
    value: u32,
    secondflag: bool,
}

The Rust default layout representation is not required to store fields in the order they are written in the code. Therefore, whether value is placed in between or at the end of the struct in the source code doesn’t matter for the default layout. The default representation allows the Rust compiler freedom to allocate and align space more efficiently. Typically, the values will be placed into memory from larger sizes to smaller sizes in a way that maintains alignment. In this struct RustLayout example, the integer’s 4 bytes might be placed first, followed by the two 1-byte Booleans. This is acceptable for the typical 4-byte hardware alignment and wouldn’t require any additional padding between the fields’ layout. This results in a more compact layout representation, taking only 8 bytes regardless of the source code’s struct field order, as opposed to C’s possible layouts.

In general, the layout used by the Rust compiler depends on other factors, so even having two different structs with fields of exactly the same sizes does not guarantee that the two will use the same memory layout in the final executable. This could cause difficulty for automated tools that make assumptions about layout and sizes in memory based on the constraints imposed by C. To work around these differences and allow interoperability with C via a foreign function interface, Rust does allow an attribute, #[repr(C)], to be placed before a struct to tell the compiler to use the typical C layout. While this is useful, it means that any given program might mix and match representations for memory layout, causing further analysis difficulty. Rust also supports a few other types of layouts, including a packed representation that ignores alignment.
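The layout discussion above can be checked with a short sketch using std::mem::size_of and std::mem::align_of. The struct names below are hypothetical, and the sizes assume a typical target where u32 is 4-byte aligned. The #[repr(C)] sizes follow C’s rules; the size of the default-layout struct is whatever the compiler chooses, so the program simply prints it rather than promising a value.

```rust
use std::mem::{align_of, size_of};

// C-compatible layout, fields kept in declaration order (value in the middle).
#[allow(dead_code)]
#[repr(C)]
struct BetweenC {
    flag: bool,
    value: u32,
    second_flag: bool,
}

// C-compatible layout with the two bools adjacent.
#[allow(dead_code)]
#[repr(C)]
struct TrailingC {
    flag: bool,
    second_flag: bool,
    value: u32,
}

// Default Rust layout: the compiler is free to reorder the fields.
#[allow(dead_code)]
struct RustDefault {
    flag: bool,
    value: u32,
    second_flag: bool,
}

// Packed layout: no padding, alignment reduced to 1.
#[allow(dead_code)]
#[repr(C, packed)]
struct PackedC {
    flag: bool,
    value: u32,
    second_flag: bool,
}

fn main() {
    println!("repr(C), value between: {} bytes", size_of::<BetweenC>());
    println!("repr(C), value trailing: {} bytes", size_of::<TrailingC>());
    println!("default Rust layout:     {} bytes", size_of::<RustDefault>());
    println!(
        "repr(C, packed):         {} bytes, alignment {}",
        size_of::<PackedC>(),
        align_of::<PackedC>()
    );
}
```

A binary-analysis tool that sees only the compiled output has no attribute to consult, so it must infer which of these layouts applies to each struct.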

We can see some effects of the above discussion in simple binary-code analysis tools, including the Ghidra software reverse engineering tool suite. For example, consider compiling the following Rust code (using Rust 1.64 and cargo’s typical release optimizations; also noting that this example was compiled and run on OpenSUSE Tumbleweed Linux):

fn main() {
    println!( "{}", hello_str() );
    println!( "{}", hello_string() );
}
 
fn hello_string() -> String {
    "Hello, world from String".to_string()
}
 
fn hello_str() -> &'static str {
    "Hello, world from str"
}

Loading the resulting executable into Ghidra 10.2 results in Ghidra incorrectly identifying it as gcc-produced code (instead of rustc, which is based on LLVM). Running Ghidra’s standard analysis and decompilation routine takes an uncharacteristically long time for such a small program and reports errors in p-code analysis, indicating some error in representing the program in Ghidra’s intermediate representation. The built-in C decompiler then incorrectly attempts to decompile the p-code to a function with about a dozen local variables and proceeds to execute a wide range of pointer arithmetic and bit-level operations, all for a function that merely returns a reference to a string. Strings themselves are often easy to locate in a C-compiled program; Ghidra includes a string search feature, and even POSIX utilities, such as strings, can dump a list of strings from executables. However, in this case, both Ghidra and strings dump both of the "Hello, world" strings in this program as one long run-on string that runs into error message text.
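One likely contributor to the run-on strings (our interpretation, not something Ghidra reports) is that Rust string slices are stored as a pointer plus a length, without the terminating NUL byte that C strings carry and that many string-search heuristics rely on. The sketch below, using hypothetical helper names, compares the two representations:

```rust
use std::ffi::CString;
use std::mem::size_of;

// Bytes a Rust &str refers to: exactly the text, with no terminator.
fn stored_len_rust(text: &str) -> usize {
    text.as_bytes().len()
}

// Bytes a C-style string needs: the text plus one trailing NUL byte.
fn stored_len_c(text: &str) -> usize {
    CString::new(text)
        .expect("text must not contain interior NUL bytes")
        .as_bytes_with_nul()
        .len()
}

fn main() {
    let text = "Hello, world from str";
    println!("Rust str bytes: {}", stored_len_rust(text));
    println!("C string bytes: {}", stored_len_c(text));
    // A &str is a "fat" pointer: data address plus length.
    println!("size of &str:   {} bytes", size_of::<&str>());
    println!("size of usize:  {} bytes", size_of::<usize>());
}
```

Because the length lives in the reference rather than in the data, adjacent string literals can sit back to back in the binary with nothing separating them.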

Meanwhile, consider the following similar C program:

#include <stdio.h>
 
char* hello_str_p() {
   return "Hello, world from str pointer\n";
}
 
char hello[] = "Hello, world from string array\n";
char* hello_string() {
   return hello;
}
 
int main() {
   printf("Hello, World from main\n");
   printf( hello_str_p() );
   printf( hello_string() );
   return 0;
}

Ghidra imports and analyzes the file quickly, correctly identifies all strings separately in memory, and decompiles the main function to show calls to printf. It also properly decompiles both secondary functions as returning a reference to their respective strings as a char*. This example is but one anecdote, but considering that software doesn’t get much simpler than “Hello, World,” it is easy to envision much more difficulty in analyzing real-world Rust software.

Additional points where tooling may need to be updated include the use of function name mangling, which is necessary to be compatible with most linkers. Linkers generally expect unique function names so that the linker can resolve them at runtime. However, this expectation conflicts with many languages’ support for function/method overloading in which several different functions may share the same name but are distinguishable by the parameters they take.

Compilers address this issue by mangling the function name behind the scenes, creating a compiler-internal unique name for each function by combining the function’s name with some type of scheme to represent its number and types of parameters, its parent class, etc.—all information that helps uniquely identify the function. Rust developers considered using the C++ mangling scheme to support compatibility but ultimately scrapped the idea when creating RFC 2603, which defines a Rust-specific mangling scheme. Since the rules are well-defined, implementation in existing tools should be relatively straightforward, although some tools may require further architectural or user-interface changes for full support and usability.
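Rust itself does not offer function overloading, but the same need for unique symbols arises from same-named functions in different modules and from generic functions instantiated for several types. The following sketch (all names hypothetical) shows both cases; each function or instantiation that ends up in the binary needs its own distinct linker symbol, which is what the mangling scheme provides.

```rust
// Two functions that share the short name `hello` but live in different modules.
mod english {
    pub fn hello() -> &'static str {
        "hello"
    }
}

mod french {
    pub fn hello() -> &'static str {
        "bonjour"
    }
}

// A generic function: each concrete T used produces a separate instantiation.
fn describe<T>(_value: T) -> &'static str {
    std::any::type_name::<T>()
}

fn main() {
    println!("{} / {}", english::hello(), french::hello());
    println!("{} / {}", describe(1u32), describe(1.0f64));
}
```

Inspecting the resulting executable with a symbol-listing tool such as nm would typically show mangled names that encode the module path of each hello function, rather than a bare hello symbol.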

Similarly, Rust has its own implementation of dynamic dispatch that is distinct from C++. Rust’s use of trait objects to connect the actual object data with a pointer to the trait implementation adds a layer of indirection compared with the C++ implementation of attaching a pointer to the implementation directly inside the object. Some argue that this implementation is a worthwhile tradeoff given Rust’s design and objectives; regardless, this decision does impact the binary representation and therefore existing binary-analysis tools. The implementation is also thankfully straightforward, but it is unclear how many tools have so far been updated for this analysis.
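A short sketch can make the difference visible. In the code below (Greet, English, and French are hypothetical names), a reference to a concrete type is a single pointer, while a reference to a trait object is a pair consisting of a data pointer and a pointer to the table of trait method implementations. The object itself carries no such pointer.

```rust
use std::mem::size_of;

trait Greet {
    fn greet(&self) -> String;
}

struct English;
struct French;

impl Greet for English {
    fn greet(&self) -> String {
        "Hello".to_string()
    }
}

impl Greet for French {
    fn greet(&self) -> String {
        "Bonjour".to_string()
    }
}

// Each call goes through the trait object's method table (dynamic dispatch).
fn greet_all(speakers: &[&dyn Greet]) -> Vec<String> {
    speakers.iter().map(|speaker| speaker.greet()).collect()
}

fn main() {
    let speakers: [&dyn Greet; 2] = [&English, &French];
    println!("{:?}", greet_all(&speakers));
    println!("&English:   {} bytes", size_of::<&English>());
    println!("&dyn Greet: {} bytes", size_of::<&dyn Greet>());
}
```

For a binary-analysis tool, this means the link between an object and its methods has to be recovered from the places where such pointer pairs are created, not from the object’s own memory.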

While reverse engineering and analysis tools will need more thorough testing and improved support for non-C-family languages like Rust, we must ask: Is it even possible to consistently and accurately determine only from binary code if a given program was originally written in Rust compared to some other language like C or C++? If so, can we determine if, for example, code using unsafe was used in the original source to conduct further vulnerability analysis? These are open research topics without clear answers. Since Rust uses unique mangling of its function names, as discussed earlier, this could be one way to determine if an executable uses Rust code, but it is unclear how many tools have been updated to work with Rust’s mangled names. Many tools today use heuristics to estimate which C or C++ compiler was used, which suggests that similar heuristics may be able to determine with reasonable accuracy if Rust compiled the binary. Since abstractions are generally lost during the compilation process, it is an open question how many Rust abstractions and idioms can be recovered from the binary. Tools such as the SEI’s CERT Pharos suite are able to reconstruct some C++ classes and types, but further research is needed to determine how heuristics and algorithms must be updated for Rust’s unique features.
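As a rough sketch of the symbol-name heuristic mentioned above, the function below guesses whether a symbol looks Rust-mangled. It assumes two conventions: symbols in the RFC 2603 scheme begin with _R, and symbols in the older scheme look like C++-style _ZN…E names that end in a 17h marker followed by 16 hexadecimal digits. The function name is hypothetical, the check ignores platform-specific prefixes such as an extra leading underscore, and stripped binaries offer no symbols to test at all, so this is an illustration rather than a reliable detector.

```rust
// Heuristic only: guesses whether a symbol name looks Rust-mangled.
fn looks_like_rust_symbol(symbol: &str) -> bool {
    let bytes = symbol.as_bytes();

    // Assumed convention 1: RFC 2603 ("v0") symbols start with "_R".
    if bytes.starts_with(b"_R") {
        return true;
    }

    // Assumed convention 2: legacy symbols look like "_ZN...17h<16 hex digits>E".
    if !bytes.starts_with(b"_ZN") || !bytes.ends_with(b"E") {
        return false;
    }
    // "_ZN" (3 bytes) + "17h" + 16 hex digits (19 bytes) + "E" (1 byte).
    if bytes.len() < 3 + 19 + 1 {
        return false;
    }
    let tail = &bytes[bytes.len() - 20..bytes.len() - 1];
    tail.starts_with(b"17h") && tail[3..].iter().all(|b| b.is_ascii_hexdigit())
}

fn main() {
    let samples = [
        "_RNvCs1234_7mycrate3foo",
        "_ZN4core3fmt5write17h0123456789abcdefE",
        "_ZN3foo3barEv",
        "printf",
    ];
    for sample in samples {
        println!("{sample}: {}", looks_like_rust_symbol(sample));
    }
}
```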

While research is needed to investigate how much can be reconstructed and analyzed from Rust binaries, we must remark that using crates where source is available (such as from public crates on crates.io) conveys a good deal more assurance than using a source-less crate, since one may inspect the source to determine if unsafe features are used.

Rust Stability and Maturity

Much has been written about the stability and maturity of Rust. For this post, we will define stability as the likelihood that working code in one version of a programming language does not break when built and run on newer versions of that language.

The maturity of a language is hard to define. Many strategies have evolved to help measure maturity, such as the Capability Maturity Model Integration. While not complete, we would define the following features as contributing to language maturity:

a working reference implementation, such as a compiler or interpreter
a complete written specification that documents how the language is to be interpreted
a test suite to determine the compliance of third-party implementations
a committee or group to manage evolution of the language
a transparent process for evolving the language
technology for surveying how the language is being used in the wild
a meta-process for allowing the committee to rate and improve its own processes
a repository of free third-party libraries

The maturity of several popular languages, including Rust, is summarized in the following table:

Language	C	Java	Python	Rust
First Appearance	1972	1995	1991	2010
Reference Implementation	None	JDK / HotSpot VM	CPython	rustc
Complete Specification	ISO/IEC 9899:2018	JLS	Python Language Reference	The Rust Reference
Compliance Test Suite	Third-party commercial test suites	JavaTest Harness	None	None
Language Maintenance Group	ISO / IEC / JTC1 / SC22 / WG14	Sun, Oracle	Python Software Foundation	The Rust Project
Transparent Evolution Process	ISO	JCP	PEP Process	Request For Comments (RFC) process
Language Survey Technology	None	None	None	crater
Meta-process to Improve Committee	ISO	None	None	None
Third-party Code Repository	None	None	Python Package Index (PyPI)	crates.io

All four languages have similar approaches to achieving stability. They all use versions of their language or reference implementation. (Rust uses editions rather than versions of its rustc compiler to support stable but old versions of the language.)

However, maturity is a thornier issue. The table showcases a decades-long evolution in how languages seek maturity. Languages born before 1990 sought maturity in bureaucracy: authoritative organizations, such as ISO or ECMA, and documented processes for managing the language. Newer languages rely more on improved technology to enforce compliance with the language. They also rely less on formal documentation and more on reference implementations. Rust continues in this evolutionary vein, using technology (crater) to measure the extent to which improvements to the language or compiler would break working code.

To assist the Rust language in achieving stability, the Rust Project employs a process (crater) to build and test every Rust crate in crates.io and on github.com. The Rust Project uses this large body of code as a regression test suite when testing changes in the rustc compiler, and the data from these tests help guide them in their mantra of “stability without stagnation.” A public crate that has a test which passes under the stable build of the compiler but fails under a nightly build of the compiler would qualify as breaking code (if the nightly build eventually became stable). Thus, the crater process detects both compiler bugs and intentional changes that might break code. If the Rust developers must make a change that breaks code in crates.io, they will at least notify the maintainer of the fragile code of the potential breakage. Unfortunately, this process does not currently extend to privately owned Rust code. However, there is talk about how to resolve this.

The Rust Project also has a process for enforcing the validity of their borrow checker. Any weakness in their borrow checker, which might allow memory-unsafe code to compile without incident, merits a CVE, with CVE-2021-28032 being one such example.

All crates in crates.io have version numbers, and the crates.io registry guarantees that published crates will not become unavailable (as has happened to some Ruby Gems and JavaScript packages in the past). At worst, a crate version might be deprecated (“yanked,” in cargo’s terminology), which prevents new code from depending on it. However, even deprecated crates can still be used by already-published code.

Rust offers one more stability feature not common in C or other languages. Unstable, experimental features ship with the Rust compiler, but if you wish to use an experimental feature, you must opt in explicitly by including a #![feature(…)] attribute in your code, which only nightly builds of the compiler accept. Without such an attribute, your code is limited to the stable features of Rust. In contrast, most C and C++ compilers happily accept code that uses unstable, non-portable, and compiler-specific extensions.

We would conclude that for non-OSS code, Rust offers stability and maturity comparable to Python: The code might break when upgraded to a new version of Rust. However, for OSS code published to crates.io, Rust’s stability is considerably stronger in that any such code on crates.io will not break without prior notification, and the Rust community can provide assistance in fixing the code. Rust currently lacks a complete written specification, and this omission will become acute when other Rust compilers (such as GCC’s proposed Rust front-end) become available. These third-party compilers should also prompt the Rust Project to publish a compliance test suite. These improvements should bring Rust’s maturity close to the level of maturity currently enjoyed by C/C++ developers.

Security Tools Must Mature Alongside Rust

The Rust language will improve over time and become more popular. As Rust evolves, its security—and analysis tools for Rust-based code—should become more comprehensive as well. We encourage the Rust Foundation to apply to become or create a related CVE Numbering Authority (CNA) to better engage in coordinated vulnerability disclosure (CVD), the process by which security issues—along with mitigation guidance and/or fixes—are released to the public by software maintainers and vendors in coordination with security researchers. We would also welcome a complete written specification of Rust and a compliance test suite, which is likely to be prompted by the availability of third-party Rust compilers.

Software Engineering Institute

SEI Blog