Differences Between ASLR on Windows and Linux
Hi folks, it's Will again. In my last blog entry, I discussed a behavior of NX on the Linux platform. Given that NX (or DEP as it's known on the Windows platform) and Address Space Layout Randomization (ASLR) work hand-in-hand, it's worth looking into how ASLR works on Linux. As it turns out, the implementation of ASLR on Linux has some significant differences from ASLR on Windows.
On the Windows platform, ASLR does not affect an application's runtime performance, though it can slow down the initial loading of modules. A program or library that is linked with the /DYNAMICBASE option will be compatible with ASLR on Windows. According to the Windows ISV Software Security Defenses document:
In general, ASLR has no performance impact. In some scenarios, there's a slight performance improvement on 32-bit systems. However, it is possible that degradation could occur in highly congested systems with many images that are loaded at random locations. The performance impact of ASLR is difficult to quantify because the quantity and size of the images need to be taken into account. The performance impact of heap and stack randomization is negligible.
Consequently, there is little reason to link anything without the /DYNAMICBASE option, which enables ASLR. With /DYNAMICBASE enabled, a module's load address is randomized, which means that it cannot easily be used in Return Oriented Programming (ROP) attacks. When it comes to Windows applications, we recommend that all vendors use both DEP and ASLR, as well as the other mitigations outlined in the Windows ISV Software Security Defenses document. If vendors have not elected to use /DYNAMICBASE, users can force ASLR through the use of Microsoft EMET.
On the Linux platform, ASLR does have a performance penalty. This penalty is greatest on the x86 architecture, and perhaps most noticeable in benchmarks. For an executable to be compatible with ASLR on Linux, it must be compiled with the Position Independent Executable (PIE) option. According to the paper Too much PIE is bad for performance [pdf], "Our analysis shows that the overhead for PIE on 32bit x86 is up to 26% for some benchmarks with an (arithmetic) average of 10% and a geometric mean of 9.4%."
Presumably due to this potential performance penalty, Linux distributions do not enable PIE for all of the executables they provide. For example, Ubuntu indicates, "PIE has a large (5-10%) performance penalty on architectures with small numbers of general registers (e.g. x86), so it should only be used for a select number of security-critical packages..."
Red Hat has a similar perspective:
The Fedora Engineering Steering Committee maintains a conservative list of packages that must be built using security features of GCC. Packages not on this list have these security features enabled at the packagers' descretion [sic]. There is not currently a consensus in the community as to when security hardened binaries are necessary. As a result the use of security hardened binaries can be a controversial topic. Most arguments can be reduced to whether the security benefit outweighs the performance overhead involved in using the feature.
If a goal of ASLR is to have executable code at an unpredictable address, why is there such a difference between the Windows and Linux implementations? It's important to note that ASLR compatibility on Windows is a link-time option, while on Linux it's a compile-time option.
With Windows, the code is patched at run time for relocation purposes. In the Linux and Unix worlds, this technique is known as text relocation. With Linux, ASLR is achieved in a different way. Rather than patching the code at runtime, the code is compiled in a way that makes it position independent. That is, it can be loaded at any memory address and still function properly.
At least on the x86 architecture, this position independence is accomplished by dedicating a general-purpose CPU register to hold the address of the Global Offset Table (GOT). With one less register available for the program's own use, code does not run as efficiently. This cost is most noticeable on architectures with a small number of general-purpose registers, such as x86.
Why did the Linux developers choose this technique for implementing ASLR? As with most things security, there is a tradeoff. Because text relocations involve patching, loading such a module would trigger copy-on-write, which subsequently increases the memory footprint of a system. Position-independent code does not require patching, and therefore does not trigger copy-on-write. For a much more detailed look into the position-independent code implementation on Linux, as well as a comparison to load-time relocation, see Eli Bendersky's blog entry: Position Independent Code (PIC) in shared libraries.
What does this mean for most Linux users?
- ASLR is not as prevalent in most Linux distributions as it is on modern Windows systems.
- ASLR cannot be force-enabled for individual applications on Linux the way EMET can force it on Windows.
One thing to consider is that the x86 architecture is becoming less relevant as time passes. The x86_64 architecture doesn't have a significant performance penalty for position-independent code. This smaller penalty is because x86_64 has twice as many general-purpose registers as x86 and because unlike x86, it supports a PC-relative addressing scheme. Moving forward, it is my hope that vendors utilize PIE in a more comprehensive manner, which will allow for ubiquitous use of ASLR.
Thanks to the PaX Team for information provided in this blog entry.
Update: The Reddit /netsec community has a discussion about this topic.