Malicious Client Detection Using Machine Learning

Satyam Saxena

Upload: cysinfo-cyber-security-community

Post on 21-Feb-2017


TRANSCRIPT

Page 1: Malicious Client Detection using Machine learning

Malicious Client Detection using machine learning SATYAM SAXENA

Page 2: Malicious Client Detection using Machine learning

Threats

• There are many types of malware for all types of devices and operating systems.

• Most, if not all, malware relies on a support system: command and control (C&C) infrastructure.

• Bad guys use DNS to scale and hide their C&C infrastructure.

• Bad guys use DNS for C&C to bypass corporate security (tunneling).

• Bad guys use cloud providers to roll out, scale, manage, and quickly move their C&C infrastructure.

Without relying on any particular endpoint operating system or configuration, we can use big data analytics on network data to detect malware.

Page 3: Malicious Client Detection using Machine learning

Malware use of DNS

[Diagram labels: End point device, DNS Server, DNS Server, Command and Control Infrastructure. The query rndruppbakyokv[.]com is answered with 1.2.3.4.]

Communication channel with C&C is established. Compromised device receives updates, instructions, targets.

Page 4: Malicious Client Detection using Machine learning

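Machine Learning Pipeline

[Diagram: Raw pDNS feeds three classifiers. The Domain Name classifier outputs Malicious Domains, the DNS Resolver classifier outputs Malicious Resolvers, and the Device Behavior classifier outputs Behavior Anomalies. A final classifier combines these to flag a Compromised Device (Security Event). Sub-model labels shown in the figure: DGA, Tunnel, Network, Time.]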

Page 5: Malicious Client Detection using Machine learning

Architecture

Page 6: Malicious Client Detection using Machine learning

DGA Model

• Detects randomly generated domains in the pDNS data.

• The model is trained on 6 malware families, such as Zeus, Tinba, and Pushdo.

• 29 features are extracted from each domain.

• The 29 features are reduced to 16 dimensions using PCA.

• The reduced feature set is then used to train a GBM classifier.

Page 7: Malicious Client Detection using Machine learning

Domain Features

[Figure panels: Common Letter Score, Entropy]
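The two features named on this slide can be sketched in a few lines. Entropy here is the standard Shannon entropy over the characters of the domain label. The slide does not define "Common Letter Score", so `common_letter_score` below is only one plausible reading (the share of characters drawn from a hypothetical set of frequent English letters), not the author's definition.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy of the label's characters, in bits per character."""
    if not label:
        return 0.0
    total = len(label)
    counts = Counter(label)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def common_letter_score(label: str, common: str = "etaoinshr") -> float:
    """ASSUMPTION: fraction of characters that belong to a set of frequent
    English letters. The slide does not say how the score is computed."""
    if not label:
        return 0.0
    label = label.lower()
    return sum(1 for ch in label if ch in common) / len(label)


if __name__ == "__main__":
    # Compare the DGA-style label from the earlier slide with an ordinary word.
    for name in ("rndruppbakyokv", "google"):
        print(name, shannon_entropy(name), common_letter_score(name))
```

Randomly generated labels tend to spread their characters more evenly, which pushes entropy up, while a score based on frequent letters tends to drop. Neither feature is decisive on its own, which is presumably why the deck combines 29 of them.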

Page 8: Malicious Client Detection using Machine learning

Domain Features (2)

[Figure panels: Length of largest meaningful string, Mean length of dictionary words]
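A minimal sketch of the two dictionary-based features named here, under stated assumptions: "meaningful string" is taken to mean a substring that appears in a word list, and the tiny `WORDS` set is a hypothetical stand-in for a real dictionary. The deck does not say which dictionary or matching rule was used.

```python
# Hypothetical word list; a real system would load a full dictionary.
WORDS = frozenset({"face", "book", "mail", "news", "shop", "cloud", "web"})


def dictionary_substrings(label, words=WORDS, min_len=3):
    """Return every substring of label (length >= min_len) found in words."""
    label = label.lower()
    found = []
    for start in range(len(label)):
        for end in range(start + min_len, len(label) + 1):
            piece = label[start:end]
            if piece in words:
                found.append(piece)
    return found


def longest_meaningful_length(label, words=WORDS):
    """Length of the longest dictionary word contained in label (0 if none)."""
    return max((len(w) for w in dictionary_substrings(label, words)), default=0)


def mean_dictionary_word_length(label, words=WORDS):
    """Mean length of the dictionary words contained in label (0.0 if none)."""
    found = dictionary_substrings(label, words)
    return sum(len(w) for w in found) / len(found) if found else 0.0
```

With this word list, `facebook` contains two four-letter words, while a label such as `rndruppbakyokv` contains none, so both features fall to zero.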

Page 9: Malicious Client Detection using Machine learning

DGA Features

Page 10: Malicious Client Detection using Machine learning

DGA Classification Performance

Overall model performance (Random Forest)

Metric       Performance
Accuracy     98.738%
Precision    99.288%
Recall       98.181%
AUC          99.801%

Performance per malware family

Malware Family    % Detection
Conficker         86.309%
Cryptolocker      98.348%
Pushdo            95.515%
Ramdo             99.823%
Tinba             96.715%
Zeus              100.0%
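For reference, the accuracy, precision, and recall figures reported on this slide follow the usual confusion-matrix definitions, sketched below. The counts in the usage example are hypothetical and are not the deck's data. Reading "% Detection" per family as recall on that family's domains is an assumption.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives, with "positive" meaning "classified as DGA".
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(tp=8, fp=2, tn=9, fn=1))
```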

Page 11: Malicious Client Detection using Machine learning

Network Model

• Uses the WHOIS record to decide whether a domain is malicious or benign.

• A WHOIS record contains rich information about a domain.

• Age-based features.

• Registration features.
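As a rough illustration of age-based features, the sketch below derives a few day counts from the dates in a WHOIS record. The record layout (`creation_date`, `updated_date`, `expiration_date` as `datetime.date` values) and the feature names are hypothetical. The deck lists the date fields it uses but not how they are encoded.

```python
from datetime import date


def whois_age_features(record: dict, today: date) -> dict:
    """Derive day-count features from a parsed WHOIS record.

    ASSUMPTION: record holds datetime.date values under the keys
    "creation_date", "updated_date", and "expiration_date".
    """
    created = record["creation_date"]
    updated = record["updated_date"]
    expires = record["expiration_date"]
    return {
        "age_days": (today - created).days,
        "days_since_update": (today - updated).days,
        "days_to_expiry": (expires - today).days,
        "registration_span_days": (expires - created).days,
    }


if __name__ == "__main__":
    # Hypothetical record for illustration; not taken from the deck.
    record = {
        "creation_date": date(2017, 1, 1),
        "updated_date": date(2017, 1, 15),
        "expiration_date": date(2018, 1, 1),
    }
    print(whois_age_features(record, today=date(2017, 2, 21)))
```

Registration features such as registrant country, status, or WHOIS server are categorical, so they would need a separate encoding step before training.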

Page 12: Malicious Client Detection using Machine learning

Network Features – WHOIS Server

[Chart panels: Malicious Domains, Benign Domains]

Page 13: Malicious Client Detection using Machine learning

Network Features – Creation Date

Page 14: Malicious Client Detection using Machine learning

Network Model Performance

• Final set of features: creation date, update date, expiration date, admin country, registrant country, tech country, status, WHOIS server

Metric              Performance
Error               0.00450864127
Area Under Curve    0.96615884041

Page 15: Malicious Client Detection using Machine learning

Compromised Client Detection

[Diagram labels: pDNS Data, Hadoop HDFS, Spark Compute, Group By]

IP     DGA    WHOIS    NX    SERVER
ip1    #10    #3       #4    #5
ip2    #8     #1       #2    #3
ip3    #5     #2       #0    #0
ip4    #3     #3       #0    #0
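The per-IP table on this slide looks like the result of grouping flagged pDNS events by client IP and counting each signal. The deck does this with Spark over HDFS. The sketch below swaps in plain Python to show the same group-by idea on a small scale. The event format and the ranking by total count are assumptions, as the slide does not say how the counts become a verdict.

```python
from collections import Counter, defaultdict

# Column names taken from the slide's table.
SIGNALS = ("DGA", "WHOIS", "NX", "SERVER")


def count_signals_per_ip(events):
    """Group (client_ip, signal) pairs by IP and count each signal.

    ASSUMPTION: each event is a tuple of client IP and one of SIGNALS.
    """
    counts = defaultdict(Counter)
    for ip, signal in events:
        if signal not in SIGNALS:
            raise ValueError(f"unknown signal: {signal!r}")
        counts[ip][signal] += 1
    # Fill in zeros so every row has all four columns.
    return {ip: {s: c[s] for s in SIGNALS} for ip, c in counts.items()}


def rank_clients(table):
    """ASSUMPTION: order IPs by total number of flagged events, highest first."""
    return sorted(table, key=lambda ip: sum(table[ip].values()), reverse=True)


if __name__ == "__main__":
    events = [("ip1", "DGA"), ("ip1", "DGA"), ("ip1", "NX"), ("ip2", "WHOIS")]
    table = count_signals_per_ip(events)
    for ip in rank_clients(table):
        print(ip, table[ip])
```

In Spark the same shape would typically come from a group-by on the client IP column followed by per-signal counts. The plain-Python version is only meant to make the aggregation step concrete.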

Page 16: Malicious Client Detection using Machine learning

Thank You