The Curious Case of the Bouncy Castle BKS Passwords

While investigating BKS files, I made an interesting discovery: BKS-V1 files will accept any number of passwords to reveal information about potentially sensitive contents!

In preparation for my BSidesSF talk, I've been looking at a lot of key files. One file type that caught my interest is the Bouncy Castle BKS (version 1) file format. Like password-protected PKCS12 and JKS keystore files, BKS keystore files protect their contents from those who do not know the password. That is, a BKS file may contain only public information, such as a certificate. Or it may contain one or more private keys. But you won't know until after you use the password to unlock it.

Update March 21, 2018:
We have updated this blog post based on feedback from Thomas Pornin, and confirmation from the Bouncy Castle author. Like JKS files, BKS files do not protect the metadata of their contents by default. The keystore-level password and associated key are only used for integrity checking. By default, private keys are encrypted with the same password as the keystore. These private keys are not affected by the keystore-level weakness outlined in this blog post. That is, even if an unexpected password is accepted by a keystore itself, that same password will not be accepted to decrypt the private key contained within a keystore. Original wording in this blog post that is now understood to be inaccurate has been marked in strikeout notation for transparency.

Cracking BKS Files

As I investigated the first BKS file in my list, I quickly assumed that I could not determine what was contained in it unless I had the password. Naively searching the web for things like "bks cracker" and stopping there, I concluded that I'd need to roll my own BKS bruteforce cracker.

Update March 21, 2018:
Tools used to inspect BKS files will refuse to list the contents of the keystore if a valid password is not provided. However, this is actually not because the metadata of the keystore contents are protected. Because the metadata of the keystore contents are not encrypted, this information can be viewed without needing to use a valid password.

Using the pyjks library, I wrote a trivial script:

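#!/usr/bin/env python3
# Usage: crackbks.py <wordlist> <bksfile>

import sys
import jks  # provided by the pyjks package

def trypw(bksfile, pw):
    """Exit with status 0 as soon as pyjks accepts pw for bksfile."""
    try:
        keystore = jks.bks.BksKeyStore.load(bksfile, pw)
        if keystore:
            print('Password for %s found: "%s"' % (bksfile, pw))
            sys.exit(0)
    except jks.util.KeystoreSignatureException:
        # The keystore integrity check failed, so this password is rejected
        pass
    except UnicodeDecodeError:
        # Skip candidates that trigger a decoding error
        pass

# Read one candidate password per line
with open(sys.argv[1]) as h:
    pwlist = h.readlines()

for pw in pwlist:
    trypw(sys.argv[2], pw.rstrip())

# No candidate was accepted
sys.exit(1)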

Let's try this on the test BKS file that I have:

$ python crackbks.py strings.txt test.bks
Password for test.bks found: "Redefinir senha"

Cool. "Redefinir senha" seems like an unexpected password to me, but it's not terrible in strength. It has 15 characters, and uses mixed-case and a non-alphanumeric character (a space). Depending on the password-cracking technique used, it could hold up pretty well to bruteforce attacks.

The above proof-of-concept script is quite slow, since it will serially attempt passwords, one at a time. Taking advantage of multi-core systems in Python isn't as easy as it should be, due to the Python GIL. As a simple test, I tried using the ProcessPoolExecutor to see if I could increase my password-attempt throughput. ProcessPoolExecutor side-steps the GIL by spreading the work across multiple Python processes. Each Python process has its own GIL, but because multiple Python processes are being used, this approach should help better utilize my multiprocessor system.

Let's try this version of the brute-force cracking tool:

$ python crackbks.py strings.txt test.bks
Password for test.bks found: "Redefinir senha"
Password for test.bks found: "Activity started without extras"
Password for test.bks found: "query.is.any.user.logged.in"

Wait, what is going on here? How can a single BKS file accept multiple passwords? As it turns out, there are two things going on:

First, when I optimized my BKS bruteforce script with the use of ProcessPoolExecutor, I didn't factor in how the script would behave when it is distributed across multiple processes. In the single-threaded instance above, the script exits as soon as it finds the password. However, when it's distributed across multiple processes using ProcessPoolExecutor, things are different. I didn't have any code to explicitly terminate the parent Python process or any of the forked Python processes. The impact of this is that my multi-process BKS cracking script will continue to make attempts after it finds the password.
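One way to get the early-exit behavior back is to collect results as they complete and cancel whatever has not started yet. The following is a minimal sketch, not the script used above: the names check_password and first_match and the executor_cls parameter are made up for illustration, and the exception handling simply mirrors the single-process script.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def check_password(bksfile, pw):
    """Return pw if pyjks accepts it for bksfile, otherwise None."""
    import jks  # third-party pyjks, imported lazily inside the worker

    try:
        jks.bks.BksKeyStore.load(bksfile, pw)
        return pw
    except (jks.util.KeystoreSignatureException, UnicodeDecodeError):
        return None


def first_match(check, bksfile, passwords, executor_cls=ProcessPoolExecutor):
    """Try all candidates in parallel, but stop at the first accepted one."""
    with executor_cls() as pool:
        futures = [pool.submit(check, bksfile, pw) for pw in passwords]
        for fut in as_completed(futures):
            hit = fut.result()
            if hit is not None:
                # cancel() only affects tasks that have not started yet;
                # tasks already running are allowed to finish.
                for other in futures:
                    other.cancel()
                return hit
    return None
```

A call such as first_match(check_password, 'test.bks', candidates) would return as soon as one worker reports an accepted password. Because tasks that are already running still finish, a few extra attempts can happen after the first hit. On platforms that start worker processes by spawning, the call should sit under an `if __name__ == "__main__":` guard.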

The other thing that is happening is related to the BKS file format, which I discuss below.

Hashes and Collisions

When a resource is password-protected with a single password, it is extremely unlikely that another password can also be used to unlock the resource. Consider the simple case where a collision-resistant hash function is used to verify the password "Is this password unique?"

Applying a cryptographic hash function to the password results in the following hashes:
MD5 (128-bit): 18fcfa801383d10dd0a1fea051674469
SHA-1 (160-bit): c9e2ef80e5f2afb8aef0d058182cc7f59e93e025
SHA-256 (256-bit): 08a6c455079687616e997c7bfd626ae754ba1a71b229db1b3a515cfa45e9d4ea
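For readers who want to compute digests like these themselves, here is a small sketch using Python's hashlib. It assumes the example password is the literal string shown in the code, encoded as UTF-8 with no trailing newline; whether the output matches the values listed above depends on the exact bytes that were originally hashed.

```python
import hashlib

# Assumption: the example password, as UTF-8 bytes with no trailing newline.
password = "Is this password unique?".encode("utf-8")

for name in ("md5", "sha1", "sha256"):
    h = hashlib.new(name, password)
    # digest_size is in bytes; multiply by 8 for the size in bits
    print("%s (%d-bit): %s" % (name, h.digest_size * 8, h.hexdigest()))
```

The digest sizes printed (128, 160, and 256 bits) are the figures that matter for the collision discussion that follows.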

The MD5 hash algorithm, which has a digest size of 128 bits, was shown in 1996 to be unsafe if a collision-resistant hash is required. By 2005, researchers produced a pair of PostScript documents and a pair of X.509 certificates where each pair shared the same MD5 hash. While it takes a bit of CPU processing power to find such collisions, it's feasible to do so with modern computing hardware.

The SHA-1 hash algorithm, which has a digest size of 160 bits, is more resistant to collisions than MD5. However, in February 2017, the first known SHA-1 collision was produced. This attack required "the equivalent processing power as 6,500 years of single-CPU computations and 110 years of single-GPU computations."

The SHA-256 hash algorithm, which has a digest size of 256 bits, is even more resistant to collisions than SHA-1. To date, no collisions have been found using the SHA-256 hashing algorithm.

BKS-V1 Files and Accidental Collisions

My naive BKS bruteforcing script produced three different passwords for the same BKS file. Let's look at the code for handling BKS files in pyjks:

hmac_fn = hashlib.sha1
hmac_digest_size = hmac_fn().digest_size
hmac_key_size = hmac_digest_size*8 if version != 1 else hmac_digest_size
hmac_key = rfc7292.derive_key(hmac_fn, rfc7292.PURPOSE_MAC_MATERIAL, store_password, salt, iteration_count, hmac_key_size//8)

Here we can see that the HMAC function is SHA-1, which isn't bad. However, it turns out that it's the HMAC key (and its size) that is important, since that's what determines whether the correct password has been provided to unlock the BKS keystore file. If the file is a BKS version 1 file, the hmac_key_size value will be the same as hmac_digest_size.

In the case of hashlib.sha1, the digest_size is 20 bytes (160 bits). But where it gets interesting is the derivation of hmac_key. The size of hmac_key is determined by hmac_key_size//8 (integer division, dropping any remainder). In this case, it's 20//8, which is 2 bytes (16 bits). Why is there integer division by 8 at all? It's not clear, but perhaps the developer confused where bits are used and bytes are used in the code.
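The arithmetic can be checked in isolation. The sketch below copies the two size expressions from the pyjks excerpt into a hypothetical helper, hmac_key_bytes; it does not call pyjks itself.

```python
import hashlib


def hmac_key_bytes(version, hmac_fn=hashlib.sha1):
    """Reproduce the key-length arithmetic from the excerpt above."""
    hmac_digest_size = hmac_fn().digest_size  # 20 bytes for SHA-1
    hmac_key_size = hmac_digest_size * 8 if version != 1 else hmac_digest_size
    # This is the length, in bytes, that is passed to the key derivation
    return hmac_key_size // 8


for version in (1, 2):
    n = hmac_key_bytes(version)
    print("version %d: %d-byte HMAC key (%d bits)" % (version, n, n * 8))
```

For version 1 the result is 20 // 8 = 2 bytes, while for any other version it is 160 // 8 = 20 bytes.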

Let's add a debugging print() statement to the bks.py component of pyjks and test our three different passwords for the same BKS keystore:

$ python -c "import jks; keystore = jks.bks.BksKeyStore.load('test.bks', 'Redefinir senha')"
hmac_key: c019
$ python -c "import jks; keystore = jks.bks.BksKeyStore.load('test.bks', 'Activity started without extras')"
hmac_key: c019
$ python -c "import jks; keystore = jks.bks.BksKeyStore.load('test.bks', 'query.is.any.user.logged.in')"
hmac_key: c019

Here we can see that the hmac_key value is c019 (hex) with each of the three different passwords that are provided. In each of the three cases, the password is accepted by the BKS-V1 keystore, despite the likelihood that not one of the three accepted passwords was the one chosen by the software developer.

Why was I accidentally able to find BKS-V1 password collisions due to my shoddy Python programming skills? The maximum entropy you get from any BKS-V1 password is only 16 bits. This is nowhere near enough bits to represent a password. When it comes to password strength, entropy can be used as a measure. If only bruteforce techniques are used, each case-sensitive Latin alphabet character adds 5.7 bits of entropy. So a randomly-chosen, three-character, case-sensitive Latin alphabet password will have 17.1 bits of entropy, which already exceeds the complexity of what you can represent in 16 bits. In other words, while a developer can choose a reasonably strong password to protect the integrity of a BKS-V1 file, the file format itself only supports complexity equivalent to just less than what is provided by a randomly-selected, case-sensitive three-letter password.
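The entropy figures above follow from a base-2 logarithm of the alphabet size. As a quick check, assuming an alphabet of 52 characters (26 lowercase plus 26 uppercase letters) chosen uniformly at random:

```python
import math

ALPHABET_SIZE = 52  # 26 lowercase + 26 uppercase Latin letters

bits_per_char = math.log2(ALPHABET_SIZE)
print(round(bits_per_char, 1))      # 5.7
print(round(3 * bits_per_char, 1))  # 17.1

# Number of distinct passwords versus number of distinct 16-bit keys
print(ALPHABET_SIZE ** 2, 2 ** 16, ALPHABET_SIZE ** 3)  # 2704 65536 140608
```

Two-letter passwords fit comfortably within 16 bits, but three-letter passwords already outnumber the 65,536 possible key values.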

Cracking BKS-V1 Files

What amount of integrity protection does a 16-bit hmac_key provide? Virtually nothing. 16 bits can only represent 65,536 different values. What this means is regardless of the password complexity the developer has chosen, a bruteforce password cracker needs to try at most 65,536 times. A high-end GPU these days can crunch through over 10 billion SHA-1 operations per second.
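To get a feel for how small this key space is, the toy below derives a 2-byte key from a long passphrase and then searches throwaway guesses for one that maps to the same key. It is only an illustration under stated assumptions: PBKDF2-HMAC-SHA1 with a single iteration stands in for the PKCS#12 key derivation that BKS actually uses, and the salt, passphrase, and function names are all made up.

```python
import hashlib
from itertools import count

SALT = b"example-salt"  # hypothetical; a real keystore stores its own salt


def toy_hmac_key(password, key_bytes=2):
    """Derive a tiny key. PBKDF2 is a stand-in KDF, not the one BKS uses."""
    return hashlib.pbkdf2_hmac("sha1", password.encode("utf-8"), SALT, 1, dklen=key_bytes)


def find_colliding_password(target_key):
    """Return (guess, attempts) for the first guess deriving target_key."""
    for n in count():
        candidate = "guess%d" % n
        if toy_hmac_key(candidate) == target_key:
            return candidate, n + 1


strong = "A long, carefully chosen passphrase!"
target = toy_hmac_key(strong)
alias, tries = find_colliding_password(target)
print("%r derives the same 2-byte key (%s) after %d tries" % (alias, target.hex(), tries))
```

Each guess has roughly a 1 in 65,536 chance of matching, so a search like this typically ends after tens of thousands of attempts, no matter how long the original passphrase is.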

As it turns out, John the Ripper does have BKS file support, despite what my earlier web searches turned up. While there isn't currently GPU support for cracking BKS files, a CPU is plenty fast. My limited testing has shown that any BKS-V1 file can be cracked in about 10 seconds or less using just a single CPU core on a modern system.

Conclusion and Recommendations

Without a doubt, BKS-V1 keystore files are insecure, due to insufficient HMAC key size. Although BKS files support password protection to protect their integrity, the protection supplied by version 1 of the file format is nearly zero. For these reasons, here are recommendations for developers who use Bouncy Castle:

Be sure to use Bouncy Castle version 1.47 or newer. This version, which was introduced on March 30, 2012, increases the default MAC of a BKS key store from 2 bytes to 20 bytes.

This information has been in the release notes for Bouncy Castle for about six years, but it may have been overlooked because no CVE identifier was assigned to this weakness. Approximately 84% of the BKS files seen in Android applications are using the vulnerable version 1. We assigned CVE-2018-5382 to this issue to help ensure that it gets the attention it deserves.

On modern Bouncy Castle versions, do not use the "BKS-V1" format, which was added for legacy compatibility with Bouncy Castle version 1.46 and earlier.

If you rely on password protection provided by BKS-V1 to protect private key material, these private keys should be considered compromised. Such keys should be regenerated and stored in a keystore that provides adequate protection against brute-force attacks, along with a sufficiently complex and long password. For BKS files that contain only public information, such as certificates, the weak password protection provided by version 1 of the format is not important.

For more details, please see CERT Vulnerability Note VU#306792.

Software Engineering Institute

SEI Blog