network traffic analysis of zeroaccess bot · 2016-01-22 · botnets and gave a case study on storm...


Network Traffic Analysis of ZeroAccess Bot

Shree Garg, Anil K. Sarje, Sateesh K. Peddoju

Department of Computer Science & Engineering

Indian Institute of Technology Roorkee, Roorkee, India

{shreedec, sarjefec, drpskfec}@iitr.ernet.in

Abstract

Botnets have become a general-purpose platform for malicious cyber-activity and extortion. Botnets use specially designed communication channels to receive commands from their operators and respond accordingly. Early botnets relied on centralized control. To overcome the failures that centralized control is prone to, botnet developers have moved to distributed P2P architectures and have begun designing their own protocols to manage their bots efficiently. The ZeroAccess botnet is one such emerging P2P botnet. Its architecture has made it one of the most robust and durable botnets. Recent reports indicate that 1.9 million computers were infected with ZeroAccess by mid-2013. In view of the seriousness and impact of the threat posed by the ZeroAccess bot, this paper presents a network traffic analysis of the bot, focusing on its protocol and network behaviour, to support easier and faster detection. The results are analysed and presented. The paper concludes with a discussion of directions for detecting this bot.

Keywords: ZeroAccess; Botnet; Traffic analysis; Malware; peer-to-peer; Bot; protocol

1. Introduction

Botnets are the latest platform for cybercrime and a threat to the Internet today, enabling attacks such as click fraud, bitcoin mining, spam and credential theft. A botnet is a network of compromised machines (bots) remotely controlled by an attacker. Bots have started communicating through peer-to-peer (P2P) networks, and the distributed nature of P2P botnets makes them more difficult to detect than IRC- and HTTP-based botnets. In July 2011, an emerging botnet known as ZeroAccess (ZA) was discovered; it is responsible for infecting more than 1.9 million computers worldwide. In late 2013, Microsoft [1] reported that efforts were being made to disrupt ZA and to stop the malicious cyber-attacks and business run through it. However, it has not been removed completely. Detection is difficult because ZA is based on a distributed P2P architecture [2]. Hence, studying this botnet has become a pressing need. This paper aims to understand the network behavior of the ZA bot. Experiments were conducted in which its network traffic was monitored and analyzed for 40 hours, and our findings are reported here. The remaining sections of this paper explain how ZA propagates and communicates over the Internet using network protocols. Related work is discussed in Section 2. Section 3 explains the architecture of the ZA botnet. A detailed network traffic analysis of ZA is given in Section 4, and the peer communication patterns observed are described in Section 5. Conclusions and future work are given in Section 6.

2. Related Work

P2P botnets are becoming a serious threat as they keep changing the design of their protocols. Prior to ZeroAccess, similar botnets have been identified and detected. Holz et al. [3] presented measurements and mitigation of P2P-based botnets and gave a case study on the Storm Worm. The Storm botnet was based on the Kademlia protocol. Sinclair et al. [4] described the protocol and working of the Waledac botnet, which uses a mix of HTTP, P2P and a fast-flux based DNS network. A recent work by Rossow et al. [5] described how the Sality botnet uses an unstructured P2P network to spread URLs from which payloads are to be downloaded, with bots contacting their neighbors to exchange new URLs. Symantec gave an infection analysis of the ZA malware [6]. It discusses different modules, namely backup, infection tracker, network traffic interception, JavaScript for search engine redirection, click fraud and backdoor, and describes in detail the system-level changes (registry, driver, files) made by ZA. Unlike [6], our focus is to elucidate the network traffic behavior of the ZA bot.

3. Architecture of ZeroAccess

Bots communicate with each other using some protocol. They may use an existing protocol (for example, the Storm botnet used Kademlia [3]) or design a new one. ZA uses its own protocol. ZA is a P2P botnet with a layered architecture for distributing commands, updates and necessary files over the network. There are two types of nodes in the ZA botnet: infected systems with a public IP address act as Super nodes, while systems behind NAT are Normal nodes [7], as shown in Figure 1. Super nodes can communicate with every other node in the botnet and act as both server and client. As servers, they provide other peers with files for download, malicious plug-ins and the IP addresses of currently active peers. As clients, they connect to other Super nodes and request the same from them.

The botmaster transfers all necessary files, commands and updates only to the Super nodes, which are responsible for distributing them within the botnet. Normal nodes can only request commands, files and updates from Super nodes and cannot be contacted directly by other peers. Initially, a node can contact only the list of Super nodes hardcoded in its binary.


Figure 1. Architecture of ZeroAccess Botnet

4. Traffic Analysis and Results

This section presents the environment used for analyzing the traffic. Further, it also discusses the various stages of analysis, approach used, and the results achieved.

4.1 Analysis Environment

To collect and analyze the network traffic, a system was infected with the ZeroAccess malware (SHA-256: 5ba0515536ddfbddf46f70909559393d8e8c39bb528613cd8bafa0fcce03dfe6) in a virtual machine running 32-bit Windows with 256 MB RAM and a 20 GB hard disk. The system was assigned a private IP address in the campus network that was mapped to a public IP address on the Internet (Figure 2). Hence the infected monitoring system became a Super node in the botnet. Wireshark was used to capture the network traffic of the monitoring system and was started just before the system was infected. The system was infected on 15 January 2014 at 6:30 PM IST.

Figure 2. Experimental Setup

The entire traffic collected over 40 hours is analyzed in two phases: the initial phase and the popular phase. During the initial phase, the machine tries to contact other peers in the botnet; in the popular phase, the host has become a popular server in the ZA botnet. The initial phase ran for 10 hours and the popular phase for 25 hours (approximately one day), with a 5-hour break in between to give the infected machine time to become popular. The timeline of the experiment is shown in Figure 3. The infection and the installation of the malware on the host are described in Section 4.2.

Figure 3. Timeline of Experiment
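The phase boundaries follow directly from the infection time and the stated durations. The short sketch below works through that arithmetic; the function name `experiment_timeline` and the dictionary layout are illustrative choices, not part of the original experiment.

```python
from datetime import datetime, timedelta, timezone

# IST is UTC+05:30; the phase lengths are the ones stated in the text.
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def experiment_timeline(infected_at, initial_h=10, break_h=5, popular_h=25):
    """Return the phase boundaries implied by the stated durations."""
    initial_end = infected_at + timedelta(hours=initial_h)
    popular_start = initial_end + timedelta(hours=break_h)
    popular_end = popular_start + timedelta(hours=popular_h)
    return {
        "initial": (infected_at, initial_end),
        "break": (initial_end, popular_start),
        "popular": (popular_start, popular_end),
        "total_hours": (popular_end - infected_at) / timedelta(hours=1),
    }


timeline = experiment_timeline(datetime(2014, 1, 15, 18, 30, tzinfo=IST))
for phase in ("initial", "break", "popular"):
    start, end = timeline[phase]
    print(f"{phase:8s} {start:%d %b %Y %H:%M} -> {end:%d %b %Y %H:%M} IST")
```

With these inputs the popular phase runs from 16 January 2014 9:30 AM to 17 January 2014 10:30 AM IST, which agrees with the window given in Section 4.3, and the three phases add up to the 40 hours of capture.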

The major focus was on the popular phase, explained in Section 4.3, because analyzing this traffic gives a better understanding of the network behavior of ZA.

4.2 Initial Phase

Immediately after infection, the first packet the system sends is a DNS query for "j.maxmind.com" to a public DNS server (8.8.8.8:53). This domain belongs to a geo-IP locator service, used to determine the location of the infected host. The system then repeatedly reported to a hardcoded Command and Control (CnC) server using DNS packets (to port 53) from source ports 1044-1054, followed by 3 Simple Service Discovery Protocol (SSDP) packets to 239.255.255.250:1900 from source port 1057.

For initial communication with other bots in the botnet, every bot uses some rallying mechanism. A hardcoded list of initial peers is a very common method, and it is the one used by ZA. The system obtains a list of 256 IP addresses and starts sending UDP packets with a 16-byte payload to these 256 distinct IP addresses, from source port 1062 to destination port 16471. The source port changes when a fresh system is infected with ZA; it may be generated randomly or algorithmically by the malware. We used the same experimental environment to infect another similar system running behind NAT. It sent requests to the same 256 IP addresses, but from source port 1169 to destination port 16471, which confirms the change of source port. The system sends a request to each host in the list of 256 IPs at one-second intervals (Figure 4) and keeps looping over all 256 hosts until it gets a response from any of them. The time pattern of requests made to a particular host by the monitoring system is shown in Figure 5. For the first 8 hours after infection, it did not get any reply from the 256 requested hosts. The requested peers may have been offline, shut down or disinfected. During this period, the system kept sending packets at the same frequency and interval.
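The rallying behaviour described above (one source port sending 16-byte UDP payloads to port 16471 on many distinct hosts) lends itself to a simple fan-out check. The following is a minimal sketch, not the method used in this study: it assumes packets have already been parsed into records, and the `UdpRecord` type, the function names and the `min_peers` threshold of 200 are all hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class UdpRecord(NamedTuple):
    """One captured UDP packet in a hypothetical, already-parsed form."""
    ts: float          # capture time in seconds
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload_len: int   # UDP payload length in bytes


def rally_fanout(records: Iterable[UdpRecord],
                 dst_port: int = 16471,
                 payload_len: int = 16) -> dict:
    """Map (src_ip, src_port) to the set of distinct destination IPs probed.

    Only packets matching the destination port and payload size reported
    in the text are counted.
    """
    fanout = defaultdict(set)
    for r in records:
        if r.dst_port == dst_port and r.payload_len == payload_len:
            fanout[(r.src_ip, r.src_port)].add(r.dst_ip)
    return dict(fanout)


def suspicious_sources(records: Iterable[UdpRecord], min_peers: int = 200):
    """Return sources whose fan-out reaches the (assumed) threshold."""
    return sorted(src for src, peers in rally_fanout(records).items()
                  if len(peers) >= min_peers)
```

Keying on the (source IP, source port) pair rather than the IP alone reflects the observation that each infected system used a single source port (1062 or 1169) for all 256 requests. The threshold is a tuning parameter and would need validation against benign P2P traffic.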


Figure 4. Pattern of unique hosts contacted during initial phase

Figure 5. Request time pattern to a host - in initial phase

After this period, the system received its first packet from a host in the list of 256 IPs. Only 7% of the requested IPs responded within the next hour. This may be because these hosts share a time zone and were all switched on during this one-hour period, consistent with the diurnal nature of bot activity across time zones [7]. To check the geographic location of the responding hosts, we used a geo-location database [8], which provides the country, subdivisions, city, postal code, latitude and longitude associated with IPv4 addresses worldwide. Approximately 61% of the hosts that responded to the infected system are in Europe and 27% are in America. This indicates that the bots that responded to our monitoring system are in the same region and in similar time zones.

The system receives packets with an 848-byte payload from source port 16471 and then starts communicating with other hosts (outside the initial list of 256 IPs).

Within about half an hour, the system also started receiving requests at port 16471 and began responding to them. During the next two hours of monitoring, the system was observed communicating with a large number of different IP addresses, and it sent more packets than it received. A longer infection period would reveal further unique IPs, as P2P networks are highly dynamic.

4.3 Popular Phase

To let the monitoring Super node become popular in the botnet, a 5-hour break was given. Once the Super node had become popular, the analysis was carried out for the next 25 hours (16 January 2014 9:30 AM IST to 17 January 2014 10:30 AM IST). During the popular phase the system started receiving a large number of requests. The abundance of traffic in this phase shows that the system had become a prevalent node in the botnet. The monitoring system had remained online since the initial infection.

Figure 6 shows the number of unique incoming and outgoing IP addresses contacted over the monitoring period. Within a minute, around 1800 unique IP addresses had been contacted; the number reached 20000 within an hour and then increased linearly. As time progressed, the infected system running on a public IP became more popular and more and more different IP addresses contacted it.
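A popularity curve of this kind can be derived from a capture by counting the cumulative number of distinct peer addresses per time bucket. The sketch below is one way to do this and is not the authors' tooling; the function name, the `(timestamp, peer_ip)` input format and the 60-second bucket width are assumptions made for illustration.

```python
def cumulative_unique_peers(events, bucket_seconds=60.0):
    """Cumulative count of distinct peers, sampled per time bucket.

    `events` is an iterable of (timestamp_seconds, peer_ip) pairs in any
    order. Returns a sorted list of (bucket_index, unique_peers_so_far),
    with bucket 0 starting at the earliest timestamp. Buckets without
    traffic are omitted.
    """
    ordered = sorted(events)
    if not ordered:
        return []
    start = ordered[0][0]
    seen = set()
    curve = {}
    for ts, peer_ip in ordered:
        seen.add(peer_ip)
        bucket = int((ts - start) // bucket_seconds)
        curve[bucket] = len(seen)  # the last packet in a bucket sets its value
    return sorted(curve.items())


sample = [(0, "a"), (10, "b"), (59, "a"), (60, "c"), (130, "a"), (150, "d")]
print(cumulative_unique_peers(sample))
```

Running the same function separately on incoming and outgoing packets would give the two curves plotted in a figure such as Figure 6.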

The distribution of network protocols in the collected traces is given in Table 1. The traffic was mostly UDP, and very few TCP packets were seen. The system received packets from 140K unique IP addresses but sent packets to only 99K.

Table 1: Summary of Network Traffic

Network Protocol    Size in MB    Number of Packets
UDP                 3520          13M
TCP                 6             38K
ICMP                136           1392K
DNS                 0.0015        12
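The figures in Table 1 also give a rough mean packet size and packet share per protocol. The sketch below works this out under two assumptions that are not stated in the text: that 1 MB means 10^6 bytes, and that the rounded packet counts (13M, 38K, 1392K) are exact. The results are therefore approximate.

```python
# Values copied from Table 1 as (size in MB, number of packets).
# ASSUMPTION: 1 MB = 10**6 bytes; the table does not say which unit is meant.
BYTES_PER_MB = 1_000_000
TABLE_1 = {
    "UDP": (3520.0, 13_000_000),
    "TCP": (6.0, 38_000),
    "ICMP": (136.0, 1_392_000),
    "DNS": (0.0015, 12),
}


def mean_packet_bytes(table=TABLE_1):
    """Approximate mean bytes per packet for each protocol."""
    return {proto: mb * BYTES_PER_MB / count
            for proto, (mb, count) in table.items()}


def packet_share(table=TABLE_1):
    """Fraction of all captured packets belonging to each protocol."""
    total = sum(count for _, count in table.values())
    return {proto: count / total for proto, (_, count) in table.items()}


for proto, size in mean_packet_bytes().items():
    print(f"{proto:5s} ~{size:6.1f} bytes/packet, "
          f"{packet_share()[proto]:.2%} of packets")
```

Under these assumptions UDP accounts for roughly nine out of ten packets, at an average of about 270 bytes each.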

4.3.1. UDP Traffic

The monitoring system received UDP requests from 140K different IP addresses. It responded quickly to only about 98K of these hosts (70% of incoming IPs), namely those whose requests were made to port 16471. The remaining, un-replied UDP requests were made to the port numbers listed in Table 2. The encrypted payloads of all such reply packets are 848 bytes, 148 bytes or 16 bytes long, and their contents may be drawn from the updated P2P address list.

Table 2: Un-replied UDP requests at different port numbers

Requested UDP Port Numbers                             % of un-replied UDP requests (out of 42K)
16464, 16465 and 16470                                 97.7%
16471                                                  2.2%
13774, 33436, 33437, 33438, 33439, 33440, 33441,
49153, 53, 5353, 7                                     0.01%

Figure 6. Popularity of infected system over time


Apart from these 98K responses, the system contacted an additional 0.8K unique Super nodes, for a total of approximately 99K contacted hosts. No response was received from these 0.8K hosts. The periodicity with which different hosts are contacted varies, as shown in Figure 7; the contact rate ranged from 1 to 8 hosts per second.

On average it sends more than 650 packets to each such IP, and the payload remains 16 bytes. Its Bot-ID remained the same during the whole capture.
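The host counts quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded figures from the text (140K requesters, 98K answered, 0.8K additional hosts contacted); the function name and the returned keys are illustrative.

```python
def udp_accounting(requesters_k=140.0, replied_k=98.0, extra_contacted_k=0.8):
    """Cross-check the rounded UDP host counts (all values in thousands)."""
    unreplied_k = requesters_k - replied_k
    return {
        "reply_share": replied_k / requesters_k,        # fraction answered
        "unreplied_k": unreplied_k,                     # hosts left unanswered
        "unreplied_share": unreplied_k / requesters_k,
        "contacted_k": replied_k + extra_contacted_k,   # hosts we sent to
    }


for key, value in udp_accounting().items():
    print(f"{key:16s} {value:.2f}")
```

The result is consistent with the text: 70% of requesting hosts were answered, 42K (30%) were not, and 98K plus 0.8K gives the roughly 99K hosts to which packets were sent.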

Figure 7. Pattern of unique hosts requested during popular phase

It keeps sending requests to a given host following the time pattern shown in Figure 8. ZA kept changing the timing of its network behaviour, apparently to blend in with other network traffic and bypass security mechanisms based on fixed patterns.

Figure 8. Request time pattern to a host - in popular phase

4.3.2. ICMP Traffic

For the remaining 42K hosts (30% of the requesting hosts), the Super node did not respond with UDP packets. Instead, it sent an ICMP port unreachable packet in reply to all such requests except those made to port 16471. 99.9% of these requests were made to the ZA-like port numbers 16470, 16465, 16464 and 16471, as shown in Table 2. The system used for the experiments runs in a 32-bit environment and might have opened only port 16471 for replies and service.

4.3.3. TCP Traffic

The monitoring system replies to every TCP SYN request with either RST+ACK or SYN+ACK. Requests to port 16471 completed the TCP handshake, as a SYN+ACK was sent in return, while requests to all other ports were answered with RST+ACK. Each host makes many connections to the system using many different source ports. We received connection requests (TCP SYN) at port 16471 from 293 distinct hosts out of the 98K (that had made UDP requests), using 3.6K different source port numbers. TCP is used to download the click-fraud plug-in, malicious DLLs and other requested files.
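The reply behaviour observed in Sections 4.3.1 to 4.3.3 can be summarised as a small decision rule. The sketch below is a simplified model of the monitored Super node only; the names `Reply` and `expected_reply` are hypothetical, and the model ignores the small share of UDP requests to port 16471 that received no reply at all (Table 2).

```python
from enum import Enum

ZA_PORT = 16471  # the only port on which the monitored node offered service


class Reply(Enum):
    UDP_REPLY = "udp-reply"
    ICMP_PORT_UNREACHABLE = "icmp-port-unreachable"
    TCP_SYN_ACK = "syn+ack"
    TCP_RST_ACK = "rst+ack"


def expected_reply(protocol: str, dst_port: int) -> Reply:
    """Reply the monitored Super node typically gave to an incoming request.

    `protocol` is "udp" for a UDP request or "tcp" for a TCP SYN.
    """
    proto = protocol.lower()
    if proto == "udp":
        if dst_port == ZA_PORT:
            return Reply.UDP_REPLY
        return Reply.ICMP_PORT_UNREACHABLE
    if proto == "tcp":
        if dst_port == ZA_PORT:
            return Reply.TCP_SYN_ACK
        return Reply.TCP_RST_ACK
    raise ValueError(f"unsupported protocol: {protocol!r}")


print(expected_reply("udp", 16464).value)
print(expected_reply("tcp", 16471).value)
```

A rule of this form could serve as a baseline when checking whether a host in a capture behaves like the Super node described here.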

4.3.4. DNS Traffic

The system generated only 12 DNS packets. The first is a genuine request, while the rest are malformed DNS packets. ZA used DNS to announce its presence on the network to its botmaster when the bot was first infected. No other DNS packets were seen during the whole monitoring period. DNS is abused by many malware families, and ZA is one of them.

5. Peer Communication Patterns

Once the Super node has become popular, two-way communication is set up and the system starts responding to other peer nodes. It receives a UDP packet (16-byte payload) at port 16471. This request contains the Bot-ID, the requested file, a timestamp and the file size. In most cases the system replied to the requesting endpoint (IP address and port number) from port 16471 with an 848-byte payload. Soon afterwards, another UDP packet (16-byte payload) was sent from source port 1062 to port 16471. The flow of this conversation is shown in Figure 9a.

Figure 9a. Conversation between two infected hosts over UDP (left side system is a Super node)

In most cases, an ICMP port unreachable packet was received in return for the reply (the 2nd packet), as shown in Figure 9b. The data payload of this ICMP packet is 590 bytes, which differs from the standard size of 8 bytes. The peer repeats this process (Figure 9b) from the same source port. This appears to be some kind of acknowledgement or reply from the peer, because the communication is subsequently repeated in the same way on the same port.

Figure 9b. Conversation between two infected hosts over UDP and ICMP (left side system is a Super node)


In a few cases, after a conversation over UDP/ICMP, we saw many TCP SYN packets (Figure 9c) from different source ports coming from other peers. The monitoring Super node acknowledges all such SYN requests at TCP port 16471 and sets up a proper TCP connection, over which data is transferred from the Super node.

Figure 9c. Conversation between two infected hosts over UDP, ICMP and TCP (left side system is a Super node)

6. Conclusions

The traffic analysis of ZeroAccess bots demonstrates how their distributed P2P functionality is implemented. The bots use their own network to obtain the addresses of other active peers over UDP, and the protocol is very bursty. The CnC address is contacted many times during installation of the bot using forged DNS packets. Only a few TCP connections are made, to download plug-ins and other malicious files. The time pattern of requests to Super nodes changes as the bot becomes part of the botnet.

Detecting and mitigating bots on the basis of their network behavior is a widely used approach. Packet sizes, the time interval between packets, the in-degree and out-degree of a node, and the use of source and destination port numbers, individually or in combination, will help to identify infected hosts. In future work we will build on these features to develop a framework for detecting these bots from network traffic.
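Two of the features named above, node degree and inter-packet interval, can be computed from simple flow records. The sketch below illustrates only the feature extraction and is not the proposed detection framework; the function names and the `(src_ip, dst_ip)` input format are assumptions.

```python
from collections import defaultdict


def node_degrees(flows):
    """Compute (in_degree, out_degree) per host.

    `flows` is an iterable of (src_ip, dst_ip) pairs, one per packet or
    flow. In-degree counts the distinct hosts that sent to a node;
    out-degree counts the distinct hosts it sent to.
    """
    inbound = defaultdict(set)
    outbound = defaultdict(set)
    for src, dst in flows:
        outbound[src].add(dst)
        inbound[dst].add(src)
    hosts = set(inbound) | set(outbound)
    return {h: (len(inbound[h]), len(outbound[h])) for h in hosts}


def inter_arrival_times(timestamps):
    """Gaps in seconds between consecutive packets, after sorting."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


flows = [("a", "b"), ("a", "c"), ("b", "a"), ("a", "b")]
print(node_degrees(flows))
print(inter_arrival_times([0.0, 1.0, 2.5]))
```

For the monitored Super node in the popular phase, the figures reported in Section 4.3 correspond to an in-degree of about 140K and an out-degree of about 99K.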

Acknowledgement

We would like to thank Chris Lee for providing the ZeroAccess samples used in our experiments.

References

[1] Microsoft, the FBI, Europol and industry partners disrupt the notorious ZeroAccess botnet http://www.microsoft.com/en-us/news/press/2013/dec13/12-05zeroaccessbotnetpr.aspx, 2013.

[2] J. Wyke, Technical Report, ZeroAccess, SophosLabs UK.

[3] T. Holz, M. Steiner, F. Dahl, E. Biersack, and F. Freiling, Measurements and mitigation of peer-to-peer-based botnets: a case study on storm worm, USENIX Association Berkeley, CA, USA, 2008.

[4] G. Sinclair, C. Nunnery, and B.B.H Kang, The waledac protocol: The how and why, MALWARE, iDefense, Univ. of North Carolina at Charlotte, Charlotte, NC, USA pp. 69-77, 2009.

[5] C. Rossow, D. Andriesse, T. Werner, B. Stone-Gross, D. Plohmann, C. J. Dietrich, and H. Bos, SoK: P2PWNED - Modeling and Evaluating the Resilience of Peer-to-Peer Botnets, Security & Privacy, Inst. for Internet Security, Gelsenkirchen, Germany, pp. 97-111, 2013.

[6] S. Hittel, and R. Zhou, Trojan.ZeroAccess Infection Analysis, Symantec White paper, 2012.

[7] J. Wyke, The ZeroAccess Botnet – Mining and Fraud for Massive Financial Gain, Sophos Technical Paper, 2012.

[8] WebService, MaxMind: GeoIP2 Precision Web Services, http://www.maxmind.com/en/city, 2014.
