Study of Malicious Domain Names: TLD Distribution
Hello, folks. This post comes to you courtesy of Aaron Shelmire from the Network Situational Awareness team. Aaron writes:
Recently the Network Situational Awareness team at CERT has been researching the characteristics of malicious network touchpoints. The findings of this initial research are very telling as to the true state of security on the internet.
The Domain Name System (DNS) can be thought of as a multi-level addressing scheme that overlays the numerical IP addresses. This allows content on a numerical address to be called by an easily remembered name such as cert.org. It also expands the possible naming space to a nearly unlimited number of options, like hcjakaudbre.net or ajkcausdih.biz. While the options are not necessarily easily pronounceable, the possibilities for addresses are endless.
The DNS is laid out in a series of labels referred to as levels. The first, or top, level is the last part of the domain name. As an example, www.google.com is a three-level domain name with a top-level domain (TLD) of com, a second-level domain of google, and a third-level domain of www. So, when your computer needs to look for www.google.com, its resolver asks the DNS root servers which name servers are responsible for the .com TLD. The .com servers, working from the .com zone file, then identify the authoritative DNS server for google. The google domain server would then supply the numerical address of www.google.com. Names on other domains are resolved the same way, from right to left.
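To make the level numbering concrete, here is a minimal Python sketch that splits a name into its labels and lists them from the top level down. The function name domain_levels is invented for illustration; it only manipulates the string and performs no DNS lookups.

```python
def domain_levels(name: str) -> list[str]:
    """Return the labels of a domain name, ordered from the top level down."""
    # Drop an optional trailing root dot, normalize case, and split on dots.
    labels = name.rstrip(".").lower().split(".")
    # The top-level domain is the LAST label, so reverse the order.
    return list(reversed(labels))


if __name__ == "__main__":
    for depth, label in enumerate(domain_levels("www.google.com"), start=1):
        print(f"level {depth}: {label}")
    # level 1: com
    # level 2: google
    # level 3: www
```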
Looking at this construction from a security perspective, we can identify the level responsible for a malicious domain. For example, let's consider the invented address somebadhost.hoster.com. Because somebadhost is a subdomain of hoster.com, hoster.com would be responsible for cleaning up any malicious content on somebadhost. However, if the malicious touchpoint was something like badguy.com, the responsibility for removing the malicious content lies with the registrar that allowed Mr. Bad Guy to register the domain badguy.com, as well as with the registry that operates the .com TLD.
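The reasoning above can be sketched in a few lines of Python. This is a rough illustration, not the method used in the research: the function name responsible_party is invented, and it assumes the registered domain is always the last two labels, which is wrong for suffixes such as .co.uk (a real tool would consult the Public Suffix List).

```python
def responsible_party(name: str) -> tuple[str, str]:
    """Return (registered_domain, who_to_contact) for a domain name.

    ASSUMPTION: the registered domain is the last two labels. This is a
    simplification and does not hold for multi-label suffixes like co.uk.
    """
    labels = name.rstrip(".").lower().split(".")
    if len(labels) < 2:
        raise ValueError(f"not a registrable domain name: {name!r}")
    registered = ".".join(labels[-2:])
    if len(labels) > 2:
        # A deeper name: the owner of the registered domain controls it.
        return registered, "domain operator"
    # The registered domain itself: look to whoever allowed the registration.
    return registered, "registrar"


if __name__ == "__main__":
    for host in ("somebadhost.hoster.com", "badguy.com"):
        print(host, "->", responsible_party(host))
```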
In our research, we established a control case of randomly chosen domains to compare against a population of malicious domains. This allows us to see how the behaviors and characteristics of malicious domains involved in criminal and espionage operations contrast with those of the general population of domains.
As shown, a random sample of domains is mostly distributed over the .com top-level domain, with some distributed over the .org and .net TLDs. For the purpose of comparison, note that the China top-level domain (.cn) is only seen 1.7% of the time.
Using the data of malicious domains, we see a very different distribution.
Most of the malicious domains are still using the .com, .org, and .net TLDs, but these TLDs are less popular than they are in the control sample. In the malicious sample, the .info TLD becomes very prominent (as opposed to only 0.9% of the control case), and the .biz TLD appears as a more popular TLD (compared to 0.2% in the control case). The China TLD nearly doubles its presence to 3.0% of the malicious domains.
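A comparison like this can be reproduced on any two lists of domain names. The Python sketch below is one possible way to do it, not the tooling used in this research. The function names and the tiny sample lists are invented for illustration and do not reflect the study's data.

```python
from collections import Counter


def tld_distribution(domains):
    """Return {tld: percentage of names} for an iterable of domain names."""
    # Take the last label of each non-empty name as its TLD.
    tlds = [d.rstrip(".").lower().rsplit(".", 1)[-1] for d in domains if d.strip(".")]
    counts = Counter(tlds)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tld: 100.0 * n / total for tld, n in counts.items()}


def tld_shift(control, malicious):
    """Return {tld: malicious % minus control %}, in percentage points."""
    tlds = set(control) | set(malicious)
    return {t: malicious.get(t, 0.0) - control.get(t, 0.0) for t in tlds}


if __name__ == "__main__":
    # Hypothetical sample lists, NOT the data from this study.
    control = ["a.com", "b.com", "c.org", "d.net"]
    malicious = ["x.com", "y.info", "z.info", "w.biz"]
    shift = tld_shift(tld_distribution(control), tld_distribution(malicious))
    # Print the TLDs that gained the most share in the malicious sample first.
    for tld, delta in sorted(shift.items(), key=lambda kv: -kv[1]):
        print(f".{tld}: {delta:+.1f} points")
```

A positive value means the TLD is over-represented in the malicious list relative to the control list; a negative value means it is under-represented.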
These results may imply that the TLDs that have smaller proportions in the malicious sample than in the control sample have applied policies and practices that enable them to prevent the use of their resources for malicious activity.