Implications and Mitigation Strategies for the Loss of End-Entity Private Keys

When a private key in a public-key infrastructure (PKI) environment is lost or stolen, the compromised end-entity certificate can be used to impersonate the principal (a singular and identifiable logical or physical entity, such as a person, machine, server, or device) that is associated with it. An end-entity certificate is one that is not authorized to act as a certification authority, so it cannot be used to issue other certificates. Consequently, the scope of a compromise or loss of an end-entity private key is limited to only those certificates whose keys were lost.

Since it is the certificate that provides the identity used for authorization, authentication of a compromised certificate can lead to critical consequences, such as loss of proprietary data or exposure of sensitive information. Compromised certificates can be used as client-authentication certificates in SSL to authenticate principals associated with the certificate (e.g., a principal mapped in Active Directory, LDAP, or another database) or they may be accepted as is, depending on the service. This blog post describes strategies for how to recover and minimize consequences from the loss or compromise of an end-entity private key.

How Private Keys are Lost

The private key is the half of the key pair that the principal using PKI keeps secret; the certificate itself carries only the matching public key. In some cases, people think they may have lost control of private keys but are not sure. Since anyone in possession of a private key can impersonate its intended user, losing a private key is functionally equivalent to losing a password (as in the Yahoo breach). A private key is harder to replace than a password, however, so its loss is harder to recover from.

There is no central repository of private keys. Unless a certificate appears on a certificate revocation list (CRL), the private key plus the certificate are valid until the certificate expires, which is typically one to three years after issuance. Even after revocation, a relying party that never checks the CRL will continue to treat the certificate as valid, so anyone in possession of the private key can impersonate its intended user for the remainder of the certificate's validity period.

Compromised client-authentication certificates are analogous to the loss of usernames and passwords. Attackers use the stolen credentials to obtain unauthorized access to the breached system. Loss of server-authentication keys allows an attacker to spoof a trusted service and potentially capture or alter confidential data sent between a client and the breached service.

Keys are lost in various ways. A key can be on a system that crashes and be irretrievably lost, preventing the user from decrypting any further encrypted traffic. A key can be stolen when an attacker breaks into a system on which it is stored. Often private key loss occurs because people accidentally send the private key in a message when they mean to send the public key. In such a case, they haven't lost the key; they've lost control of it. The private key now resides on the system to which it was mistakenly sent, where it could be forwarded elsewhere without permission or stolen in an attack on that system.

Another possibility is that a key might be exposed for some period of time. This case is the digital equivalent of leaving a written secret on a desk in a public area where people walk by and could potentially copy it. Organizations using PKI need to institute and document measures that assure proper custody of private keys.

At CERT, we know of an organization in which a breakdown in policies, processes, and procedures for private-key protection caused users to be careless with protecting keys, and a large number of keys were digitally exposed. The organization did not know if anyone copied the private keys while they were exposed, but they easily could have been, so they had to assume that they were compromised and respond accordingly. In this case, revoking all of the certificates and issuing new ones was not an option because doing so would necessitate the system's being shut down and inaccessible. The system was on a closed network that was not accepting information of any kind from outside its network, and it was being actively used, so it couldn't just be brought down completely or even temporarily to make changes.

The same would hold true if someone left a paper secret (say a password on a Post-It note) on a desk while people walked by--that person would have to assume that someone saw the Post-It and would be wise to change the password. A principle that everyone using PKI should adopt is: if you think it's possible that you have lost control of your key, you have effectively lost control of your key. Regardless of whether you know you've lost it, think you've lost it, or don't know whether you've lost it, the actions you should take to mitigate the problem are identical.

First Response

The best first response to the loss or compromise of a private key is to revoke the certificate and use the CRL or the online certificate status protocol (OCSP) to inform users that the certificate is no longer valid. The entity that issues certificates bears the responsibility to maintain a list of certificates that have been revoked and to propagate that list to users.

OCSP is an automated alternative to the CRL: a client queries an OCSP responder for the real-time status of a specific certificate, rather than having to check a list that may be dated. In practice, however, not all implementations that consume certificates check CRLs or OCSP. Machines do the checking, but they must be coded to do so for this check to take place. In many cases, the check that is implemented confirms only that the certificate is valid and not expired, without checking the CRL or OCSP to see if the certificate has been revoked. Often, not checking CRLs or OCSP is a conscious design decision made because the system has limited resources or is operating in an environment with a closed network.
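To make the gap concrete, here is a minimal Python sketch of the difference between a validity-only check and one that also consults revocation data. CertInfo, within_validity_period, and is_acceptable are hypothetical names of my own, and the certificate is reduced to a serial number and two dates; a real implementation would parse X.509 and verify the signature chain as well.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Set


@dataclass(frozen=True)
class CertInfo:
    """Hypothetical, simplified view of a certificate (not an X.509 parser)."""
    serial: int
    not_before: datetime
    not_after: datetime


def within_validity_period(cert: CertInfo, now: datetime) -> bool:
    """The check many implementations stop at: is the certificate unexpired?"""
    return cert.not_before <= now <= cert.not_after


def is_acceptable(cert: CertInfo, now: datetime,
                  revoked_serials: Optional[Set[int]] = None) -> bool:
    """Accept only if unexpired and, when revocation data is supplied,
    the serial number is not on the revoked list."""
    if not within_validity_period(cert, now):
        return False
    if revoked_serials is not None and cert.serial in revoked_serials:
        return False
    return True


# Illustrative values only.
now = datetime(2017, 6, 1, tzinfo=timezone.utc)
stolen = CertInfo(
    serial=0x1A2B,
    not_before=datetime(2016, 1, 1, tzinfo=timezone.utc),
    not_after=datetime(2018, 1, 1, tzinfo=timezone.utc),
)

print(is_acceptable(stolen, now))                            # True
print(is_acceptable(stolen, now, revoked_serials={0x1A2B}))  # False
```

Without revocation data, the stolen certificate passes; it is rejected only when the caller supplies the revoked serial numbers.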

Moreover, web browser makers have typically chosen not to perform OCSP checks. Certificates include information about the location of their CRL and OCSP responder. By default, however, the browser receiving a certificate from a server often does not check whether the certificate has been revoked. To do so, the client would need to retrieve the CRL or query the OCSP responder. This process consumes time and incurs performance overhead, so the client may choose not to take this action. In many cases, this is not simply poor practice; it could be a design decision motivated by the critical need for performance optimization, such as in a real-time embedded system.

All modern browsers rely on a process going on behind the scenes in which the browsers monitor for events, respond to them, and push out lists of bad certificates at client-update time. When clients update, they receive a new list of blacklisted certificates. It is assumed that the scope of each breach is known and that a discrete set of compromised certificates can be collected. It is also assumed that the certificates will be properly revoked via appropriate channels by the issuing certification authority (CA).

Browser makers prefer this scenario to one where each connection is checked, with the consequent performance penalty. There is a flaw in this scenario, however: it does not protect against the unknown compromised certificate since there is lag time between an event's occurrence and discovery of the event, along with the time it takes to make the changes and push out updates. This lag time represents a window of opportunity to exploit the vulnerability.

This scenario also relies on all users keeping updated versions of browsers, which we know not everyone does. Moreover, for a certificate to appear on a blacklist, there must have been some exploit or compromise that has taken place to put it there. As a result, someone paid a price for every certificate that is on the blacklist.

Disassociating Credentials and Principals

We stated earlier in the post that a principal is a singular and identifiable logical or physical entity. In LDAP, a principal is a person, but it need not be; it could be a machine, a server, or a device. There can be multiple credentials for identifying a single principal. Often, certificates that are associated with a principal establish rights for performing system actions. The certificate provides identity used for authorization to look at a principal and find out what its access rights are.

In the event of a compromise, it is possible to disassociate the compromised certificate from the principal; in such a case, when an entity looks up the certificate, the certificate will still be valid. It will not possess any rights to perform system activities, however, because the link between the certificate and the principal has been severed. Although anyone can present the compromised certificate, it will no longer carry authorization for any system actions or activities.

A client-authentication certificate represents an authentication credential that is associated with a single principal, but it does not represent the principal itself. For instance, I may be able to authenticate to a service using username/password, Kerberos, or a certificate and be logged in to the same account regardless of the authentication mechanism that was used at login. The mapping of credential to principal is typically done by another service, such as Active Directory or LDAP. A client certificate can be disassociated from a principal by deleting the correct attribute(s) from the directory (e.g., userCertificate). This disassociation is analogous to resetting a user's password.
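As a rough sketch of that disassociation step, the Python below generates an LDIF change record that removes one certificate value from a principal's directory entry. The function name, the example DN, and the choice to emit LDIF rather than call a directory API are my assumptions for illustration. Some directory servers expect the attribute to be written as userCertificate;binary, the DN is assumed to be plain ASCII, and long lines are left unfolded.

```python
import base64
import ssl


def ldif_remove_certificate(principal_dn: str, cert_pem: str,
                            attribute: str = "userCertificate") -> str:
    """Build an LDIF 'modify' record that deletes one certificate value.

    Only the named value is removed, so any other certificates mapped to
    the same principal stay in place.
    """
    # The directory stores the certificate as DER, so convert from PEM.
    cert_der = ssl.PEM_cert_to_DER_cert(cert_pem)
    encoded = base64.b64encode(cert_der).decode("ascii")
    return "\n".join([
        f"dn: {principal_dn}",
        "changetype: modify",
        f"delete: {attribute}",
        f"{attribute}:: {encoded}",  # '::' marks a base64-encoded value
        "-",
        "",
    ])
```

The resulting text can be reviewed before it is applied with whatever LDIF-capable tool the directory provides.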

Some services accept a certificate absolutely without further lookups. For these types of services, SSL-protocol-level mitigations must be put into place to prevent the use of a compromised, but otherwise valid, certificate.

Mitigations

In the following sections, we detail specific mitigation strategies, using these assumptions:

Operating Environment: STIG-compliant Windows and Red Hat Enterprise Linux (RHEL) hosts
Certificate types: non-person entity non-CA certificates
Certificate role: Primarily client authentication; however, I will address server authentication where relevant
CRLs and/or OCSP responders are unavailable
The scope of breach is known and a discrete set of compromised certificates can be collected.
The certificates will be properly revoked through appropriate channels with the issuing CA. After revocation the CRL can be downloaded manually from the CA for use in a disconnected environment.

Mitigation Through Principal Mapping

For services that perform principal mapping after the establishment of an SSL session based on the client's certificate, the compromised certificates should be disassociated from those principals. Likewise, in the case where principal mapping is static (e.g., cn -> username), the principal should be deleted or disabled and a new principal provisioned. After these steps are taken, authentication will fail because the compromised, but valid, certificate will not successfully map to a known security principal.
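The effect of this mitigation can be illustrated with a small Python model. PrincipalDirectory is a hypothetical in-memory stand-in for Active Directory or LDAP, and keying the mapping on a SHA-256 fingerprint of the certificate is my assumption; real services may map on the subject DN, the full certificate, or another attribute.

```python
import hashlib
from typing import Dict, Optional


def fingerprint(cert_der: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(cert_der).hexdigest()


class PrincipalDirectory:
    """Hypothetical in-memory model of a certificate-to-principal mapping."""

    def __init__(self) -> None:
        self._by_fingerprint: Dict[str, str] = {}

    def associate(self, cert_der: bytes, principal: str) -> None:
        self._by_fingerprint[fingerprint(cert_der)] = principal

    def disassociate(self, cert_der: bytes) -> None:
        # Removing the mapping does not revoke the certificate itself.
        self._by_fingerprint.pop(fingerprint(cert_der), None)

    def authenticate(self, cert_der: bytes) -> Optional[str]:
        """Return the mapped principal, or None if the certificate maps to nothing."""
        return self._by_fingerprint.get(fingerprint(cert_der))
```

Once disassociate has been called for the compromised certificate and a replacement has been associated, authenticate returns None for the old certificate and the principal's name for the new one, even though the old certificate would still pass a validity check.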

Mitigation Through Client Certificates

Mitigations depend on the platforms and services in use. Below I give as comprehensive a set of mitigation guidelines as possible along with exemplars of mitigating certain specific services. Given the lack of centralized CRL/OCSP, host-level mitigations must be made to all nodes that may communicate with clients presenting compromised certificates.

Windows

MS Crypto API: Import the set of compromised certificates into Disallowed Certificates

Since the compromised certificates are issued by a third party, the typical Active-Directory-based certificate-management capabilities cannot be used, because the CRL distribution point (CDP) of the compromised certificates refers to a location outside the Windows domain. In this case the certificate trust list (CTL) must be installed onto all affected Windows systems.

This solution mitigates the use of compromised certificates only on Windows systems that use the MS Crypto API (CAPI).
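One way to script the import on a single host is sketched below in Python. It assumes the built-in Windows certutil tool and its -addstore verb with the store name Disallowed; verify both against your own systems before relying on them. The function names are hypothetical, and the sketch defaults to a dry run that only returns the commands it would execute.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def disallow_command(cert_file: Path) -> List[str]:
    """Command line that adds one certificate to the Disallowed store
    (assumed certutil syntax; needs an elevated prompt when executed)."""
    return ["certutil", "-addstore", "Disallowed", str(cert_file)]


def disallow_certificates(cert_files: Iterable[Path],
                          dry_run: bool = True) -> List[List[str]]:
    """Build, and optionally run, one import command per compromised certificate."""
    commands = [disallow_command(Path(f)) for f in cert_files]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if certutil reports failure.
            subprocess.run(command, check=True)
    return commands
```

Reviewing the dry-run output first makes it easier to confirm that only the known set of compromised certificates is being imported.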

RHEL 6/7

Add compromised certificates to system certificate blacklist.

On systems using the update-ca-trust mechanism, place DER- or PEM-encoded copies of the compromised certificates into the /etc/pki/ca-trust/source/blacklist/ directory and re-run update-ca-trust. This solution adds the compromised certificates to the system-wide OpenSSL and NSS blacklists for all purposes. Applications that use the system-level trust stores will distrust these certificates automatically. Note, however, that not all services use the extended certificate trust list (ca-bundle.trust.crt) generated by this procedure.
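The procedure above can be sketched as a short Python script. The function name and the run_update switch are hypothetical, and invoking the tool as update-ca-trust extract is my assumption about how to re-run it; the script must run as root to write to the system directory.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Iterable, List

# Directory named in the procedure above.
BLACKLIST_DIR = Path("/etc/pki/ca-trust/source/blacklist")


def blacklist_certificates(cert_files: Iterable[Path],
                           blacklist_dir: Path = BLACKLIST_DIR,
                           run_update: bool = True) -> List[Path]:
    """Copy compromised certificates into the blacklist directory and
    regenerate the consolidated trust stores."""
    blacklist_dir = Path(blacklist_dir)
    copied: List[Path] = []
    for cert_file in cert_files:
        target = blacklist_dir / Path(cert_file).name
        shutil.copy2(cert_file, target)
        copied.append(target)
    if run_update and copied:
        # Assumed invocation; rebuilds the stores that applications read.
        subprocess.run(["update-ca-trust", "extract"], check=True)
    return copied
```

Passing run_update=False and a scratch directory allows the copy step to be exercised on a non-production host first.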

Update non-system NSS stores

For applications that use NSS databases other than the system NSS database, the CRL generated by the compromised certificate's CA must be imported into these NSS databases:

https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/tools/NSS_Tools_crlutil

I do not know of a central mechanism to manage blacklisted certificates on RHEL 5. Non-system NSS stores can be managed the same as for RHEL 6/7.

Application Specific

Applications may have advanced control over their TLS/SSL session handling and can be configured with CRLs to check against. The following list of applications is not comprehensive:

Apache HTTPD: Set SSLCARevocationFile/SSLCARevocationPath as appropriate to point at a file/directory of files containing the CRL signed by the CA of the compromised certificates.

In addition, Apache starting in version 2.4 supports advanced ACL syntax that can be leveraged to check the DN or issuer of client certificates and allow/deny access based thereon.

Apache Tomcat: Set crlFile in the <Connector /> element that has SSLEnabled="true" to the path of the CRL file signed by the CA of the compromised certificates.
nginx: Set ssl_crl to the path of the CRL file signed by the CA of the compromised certificates.
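The same idea applies to services that manage their own TLS sessions in code. As a minimal sketch, assuming a service written in Python with the standard ssl module (which the list above does not cover), the manually downloaded CRL can be loaded next to the CA certificate and leaf-certificate CRL checking switched on. The function names and file parameters are hypothetical.

```python
import ssl


def enable_crl_checking(context: ssl.SSLContext) -> ssl.SSLContext:
    """Require a client certificate and check it against loaded CRLs."""
    context.verify_mode = ssl.CERT_REQUIRED
    # Check only the peer (leaf) certificate against the CRL.
    context.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
    return context


def build_server_context(server_cert: str, server_key: str,
                         ca_file: str, crl_file: str) -> ssl.SSLContext:
    """Server-side context that rejects client certificates listed in the CRL."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=server_cert, keyfile=server_key)
    context.load_verify_locations(cafile=ca_file)   # issuing CA certificate
    context.load_verify_locations(cafile=crl_file)  # CRL downloaded from that CA
    return enable_crl_checking(context)
```

With this flag set, a handshake fails if no CRL from the client certificate's issuer has been loaded, so the CRL file has to be refreshed before it expires.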

Network-Level Mitigation

If it is not possible to mitigate at the host or application level, it may be possible to employ network-level mitigations, either in-line or by widening the security perimeter around the affected services.

Introduce an SSL-Terminating Proxy

If an application allows for it, place an SSL-terminating proxy (e.g., F5 or HAProxy) in front of SSL-enabled services. These devices allow the installation and checking of CRLs at the proxy before allowing traffic back to the service. The downside is that SSL-terminating proxies cannot pass along client-authentication certificates at the protocol level to back-end services. These services typically accept certificates out of band, either via AJP to Java services, or in trusted headers to other HTTP-based services. If the back-end service needs to authenticate the client certificate at the protocol level, an SSL-terminating proxy cannot be used.

Introduce an Intrusion-Prevention System (IPS) Performing Deep-Packet Inspection (DPI) on SSL Session Handshake

For services that cannot utilize traditional SSL-terminating proxies, it may be possible to introduce an in-line intrusion-prevention system that performs deep-packet inspection on SSL session establishment, looking for certificates passed between the client and the server. These devices are transparent to the server and client. Moreover, since certificates are passed in clear text during the handshake (in TLS 1.2 and earlier; TLS 1.3 encrypts them), the IPS can simply terminate the TCP session between the client and server when it detects the use of compromised certificates.

Post-Cleanup Monitoring

After mitigation is complete and the compromised certificates have been revoked and replaced with new certificates whose keys are secure, any attempted use of the compromised credentials should be monitored.

Intrusion Detection System (IDS)

Use IDS services on span ports at both the network core and the network edges, configured to look for the use of the compromised certificates and to send alert notifications to security operations for manual intervention.

Application Audit Logging

Applications should be configured to audit-log the certificate information for all uses, successful and unsuccessful, of clients accessing those applications. These logs should be captured centrally, and the centralized log-analysis system should be configured to alert on the use of the compromised certificates.
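A minimal Python sketch of the alerting side follows. It assumes a hypothetical log format in which each audit line carries a cert_sha256=<hex> field; the field name, the regular expression, and the function name are all illustrative and would need to be adapted to whatever the applications actually log.

```python
import re
from typing import Iterable, Iterator, Set

# Hypothetical audit-log field: cert_sha256=<64 hex characters>
FINGERPRINT_FIELD = re.compile(r"cert_sha256=([0-9a-fA-F]{64})")


def find_compromised_uses(log_lines: Iterable[str],
                          compromised: Set[str]) -> Iterator[str]:
    """Yield every log line that records use of a compromised certificate,
    whether the attempt succeeded or failed."""
    wanted = {fp.lower() for fp in compromised}
    for line in log_lines:
        match = FINGERPRINT_FIELD.search(line)
        if match and match.group(1).lower() in wanted:
            yield line.rstrip("\n")
```

In a centralized log-analysis system, the same match would typically be expressed as a saved search or alert rule rather than a script; the point is that the fingerprints of the known compromised set drive the alert.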

Additional Resources

Read Oracle's Public Key Infrastructure (PKI) policies and procedures to establish a secure information exchange.

Read the National Institute of Standards and Technology's overview of Public Key Infrastructure testing.

Read the paper Observations on Certification Authority Key Compromise by Moez Ben MBarka and Julien P. Stern.

Software Engineering Institute

SEI Blog