Domain Generation Algorithm Malware

Enrico Hugo, CFP, CEH

ID-CERT Malware Summit II

13 April 2017 | Graha Merah Putih PT Telkom Indonesia | Bandung, Indonesia



About Me

Enrico Hugo, CFP, CEH

Bachelor of Science in Computer Science at Binus International

Ex-IT Security Intern at CBN

enrico.hugo [at] yahoo.co.id

http://www.linkedin.com/enricohugo

I have just finished my undergraduate study at Binus University International, Indonesia.

Community

• Indonesia Honeynet Project - Member

Current Research Interests

• DNS Analysis

• Netflow Analysis

• Data Mining

• Machine Learning


Agenda

• Domain Name System and its threats

• Domain Generation Algorithm

• Environment Setup

• Detecting DGA

• DGA Case Study

• Possible Improvements

• Conclusion


Domain Name System and its threats


Domain Name System (DNS)

• Phonebook system that maps domain names to IP addresses

• Also supports reverse lookup to find the domain name that corresponds to an IP address

• Provides a caching system

• Has not had a security upgrade comparable to telnet → ssh or ftp → sftp: plain DNS queries and responses are still unencrypted and unauthenticated by default


DNS Threats

• DNS cache poisoning

• DNS tunneling

• DNS amplification attack

• Domain Generation Algorithm

• DNS Fast Flux

• and many more ...


Domain Generation Algorithm


What is Domain Generation Algorithm?

Domain generation algorithms (DGA) are algorithms seen in various families of malware that are used to periodically generate a large number of domain names that can be used as rendezvous points with their command and control servers.
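To make the definition concrete, here is a toy sketch of the general idea: both the malware and its operator derive the same list of names from a shared seed, here the current date. The function name `toy_dga`, the SHA-256 construction and the reserved `.example` suffix are assumptions made for illustration; this is not the algorithm of any malware family named in this talk.

```python
import hashlib
from datetime import date


def toy_dga(day: date, count: int = 5, tld: str = ".example") -> list[str]:
    """Return `count` pseudo-random names derived only from the date."""
    names = []
    for i in range(count):
        # The seed is fully predictable: anyone who knows the scheme and
        # the date can regenerate exactly the same list.
        seed = f"{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(seed).hexdigest()
        # Map the first 12 hex digits onto the letters a-p to form a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        names.append(label + tld)
    return names


today_names = toy_dga(date(2016, 11, 15))
```

Because the output depends only on the date, a defender who recovers the algorithm can precompute the names for any day, which is what makes the reverse-engineering approach in the case study possible.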


DGA Characteristics

• NXDOMAIN responses

• Usually random in the second-level (2LD) or third-level (3LD) label

• A lot of requests from the same IP address

• Ranges from completely unreadable words (not compliant with Zipf’s Law) to dictionary words (harder to detect)
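Two of the characteristics above (NXDOMAIN responses and many requests from one IP address) can be checked together by counting failed lookups per client. The sketch below assumes DNS log records are already parsed into `(client_ip, query_name, rcode)` tuples; that record shape, the function name and the sample data are hypothetical.

```python
from collections import Counter


def nxdomain_counts(records):
    """Count NXDOMAIN responses per client IP.

    `records` is an iterable of (client_ip, query_name, rcode) tuples;
    this record shape is an assumption for illustration.
    """
    counts = Counter()
    for client_ip, _query, rcode in records:
        if rcode == "NXDOMAIN":
            counts[client_ip] += 1
    return counts


sample = [
    ("192.0.2.10", "qzkxwvbt.example", "NXDOMAIN"),
    ("192.0.2.10", "hjrtplmn.example", "NXDOMAIN"),
    ("192.0.2.10", "www.example.com", "NOERROR"),
    ("192.0.2.20", "mail.example.org", "NOERROR"),
]
print(nxdomain_counts(sample).most_common(1))  # [('192.0.2.10', 2)]
```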


Malware families using DGA

• Kraken

• Conficker

• Gameover Zeus

• Pykspa

• Mad Max

• PandaBanker

• Pushdo

• Ramnit

• Cryptolocker

• Dyre

• Darkshell

• Locky

• Srizbi

• Torpig

• Virut

• etc.


Environment Setup


Detecting DGA


Detecting DGA - Zipf’s Law

Zipf's law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.
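As a rough illustration of the statement above, the sketch below builds a rank-frequency table and places each observed count next to the count an exact Zipf distribution would predict (the top count divided by the rank). The function name `zipf_table` and the tiny word list are made up; the sketch only illustrates the law and is not necessarily how it was applied in this work.

```python
from collections import Counter


def zipf_table(tokens):
    """Return (rank, token, observed, ideal) rows, most frequent first.

    `ideal` is the count an exact Zipf distribution would predict:
    the top count divided by the rank.
    """
    ranked = Counter(tokens).most_common()
    if not ranked:
        return []
    top = ranked[0][1]
    return [
        (rank, tok, freq, top / rank)
        for rank, (tok, freq) in enumerate(ranked, start=1)
    ]


words = ["the"] * 6 + ["of"] * 3 + ["and"] * 2
for row in zipf_table(words):
    print(row)
```

In this constructed sample the observed and ideal columns match exactly; on real text the fit is only approximate, and on random strings it tends to break down.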


DGA Monitor


Detecting DGA - Hierarchical Clustering

Level 1

• Query Length

• Numeric Chars

Level 2

• Unreadable Bigram Ratio

• Consonant-Vowel Ratio

Level 3

• Squared Value of Numeric Chars

Level 4

• Maximum Consonant Sequence Length

• Maximum Label Length

Level 5

• 2LD Frequency Score

• 3LD Frequency Score

Cluster counts shown on the slide, in order of appearance: 5, 2, 2, 2 and 3 clusters.
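Several of the lexical features named in the levels above can be approximated with plain string handling. The sketch below is one possible reading, not the definitions used in this work: it assumes query length counts every character including dots, that only a, e, i, o, u are vowels, and that the consonant-vowel ratio is consonants divided by vowels. The unreadable bigram ratio and the 2LD/3LD frequency scores are left out because their definitions are not in this transcript.

```python
VOWELS = set("aeiou")


def lexical_features(query: str) -> dict:
    """Compute simple per-query features (all definitions are assumptions)."""
    name = query.lower().rstrip(".")
    labels = name.split(".")
    letters = [c for c in name if c.isalpha()]
    vowels = sum(c in VOWELS for c in letters)
    consonants = len(letters) - vowels
    digits = sum(c.isdigit() for c in name)
    return {
        "query_length": len(name),
        "numeric_chars": digits,
        "numeric_chars_squared": digits ** 2,
        "max_label_length": max(len(label) for label in labels),
        # Assumed definition: consonants / vowels (infinite if no vowels).
        "cv_ratio": consonants / vowels if vowels else float("inf"),
    }


print(lexical_features("google.com"))
```

Feature vectors like these could then be handed to any clustering routine, one level at a time.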


UBRatio and CVRatio


Maximum Consonant Sequence Length (MCSLen)

• google.com -> 2 characters

• domobhdst.net -> 5 characters

Algorithmically generated domains tend to have a longer Maximum Consonant Sequence Length (MCSLen).
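A minimal sketch of how MCSLen could be computed, assuming that only a, e, i, o, u count as vowels and that dots, digits and hyphens break a consonant run. The function name `max_consonant_run` is hypothetical. Under these assumptions it reproduces the two values given above.

```python
VOWELS = set("aeiou")


def max_consonant_run(name: str) -> int:
    """Length of the longest run of consecutive consonant letters."""
    longest = current = 0
    for ch in name.lower():
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            # Vowels, dots, digits and hyphens all end the current run.
            current = 0
    return longest


print(max_consonant_run("google.com"))     # 2
print(max_consonant_run("domobhdst.net"))  # 5
```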


Cluster Descriptions


Clustering Results


Case Study


Case Study – The Discovery of Pykspa Malware

Lists shown: 2nd of November 2016 | 8th of November 2016 | 14th of November 2016

"N times" is the number of DNS requests from an IP address that were blocked (by Palo Alto). As can be seen, 210.210.150.30 is on all of the lists shown. Only three sample days are shown in this slide, but in fact the IP is on the Top 20 list every day, which is suspicious.
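The observation above, one source address staying in the Top 20 every day, can be turned into a simple check. The sketch below assumes the blocked-request log has already been reduced to one list of source IPs per day; the input shape, the function name and the sample addresses (taken from the documentation range 192.0.2.0/24) are hypothetical.

```python
from collections import Counter


def persistent_top_sources(daily_blocked, top_n=20):
    """Return the IPs that appear in the top-N list on every day.

    `daily_blocked` maps a day label to a list of source IPs, one entry
    per blocked DNS request (a hypothetical input shape).
    """
    persistent = None
    for ips in daily_blocked.values():
        top = {ip for ip, _ in Counter(ips).most_common(top_n)}
        persistent = top if persistent is None else persistent & top
    return persistent or set()


daily = {
    "day 1": ["192.0.2.1"] * 3 + ["192.0.2.2"],
    "day 2": ["192.0.2.1"] * 2 + ["192.0.2.3"] * 5,
}
print(persistent_top_sources(daily, top_n=2))  # {'192.0.2.1'}
```

An address that survives this intersection over many days is a candidate for closer inspection, not proof of infection.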


Case Study - Steps of Detection

• Deploy a Dionaea honeypot on the same subnet

• Direct SSH access

• List running processes using ps aux


• See resource consumption using top


• Find the suspected file location using find

• Upload the files to VirusTotal

– sujeljlanddrcsuj.exe => KillAV Trojan

– vmqaw.exe => Pykspa Worm


• Pykspa is reported to spread through Skype, so I searched the host for Skype: there was no running Skype instance, but there was a Skype installer file.

• Or ...


Case Study – Proof of Detection

• Johannes Bader (https://johannesbader.ch) reverse-engineered the Pykspa worm and recovered its DGA, which produces many noise (camouflage) domains alongside a smaller set of useful (intended) domains.

• Using his Python script, we get some of the domain names that Pykspa will use on the day the script is run, as seen in the next slide.

• The script: https://johannesbader.ch/2015/03/the-dga-of-pykspa/dga.zip
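Once a day's generated names are available, confirming the infection comes down to checking whether any of them appear among the queries seen from the suspect host. The sketch below shows only that comparison; it does not reproduce or call the referenced script, and `matching_domains` and the placeholder names are hypothetical.

```python
def normalize(domain: str) -> str:
    """Lower-case a name and drop surrounding spaces and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def matching_domains(generated, observed):
    """Return the sorted names present in both lists after normalization."""
    generated_set = {normalize(d) for d in generated}
    observed_set = {normalize(d) for d in observed}
    return sorted(generated_set & observed_set)


generated = ["abcdefgh.example", "ijklmnop.example"]
observed = ["WWW.Example.com", "ijklmnop.example.", "qrstuvwx.example"]
print(matching_domains(generated, observed))  # ['ijklmnop.example']
```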


• 10 sample Pykspa DGA domains for 15th of November 2016


Conclusion


Possible Improvements

• Improve DGA Monitor by creating a blacklist and a whitelist

• Find a method to confirm whether a given domain name is a DGA domain


Conclusion

• Blocked does not mean solved.

• Look for queries answered with NXDOMAIN or SERVFAIL when detecting DGA

• It is necessary to be proactive, not reactive, by consistently performing Threat Hunting


Join Us

• http://www.ihpcon.id

• Indonesia Honeynet Project

• idhoneynet

• http://www.honeynet.or.id

• http://groups.google.com/group/id-honeynet


enrico.hugo [at] yahoo.co.id and +62 857 1631 5877