Improving Intrusion Detectors by Crook-Sourcing

Frederico Araujo, IBM Research

The 35th Computer Security Applications Conference

Gbadebo Ayoade, Khaled Al-Naami, Yang Gao, Kevin Hamlen, and Latifur Khan, The University of Texas at Dallas

The research reported herein was supported in part by ONR award N00014-17-1-2995; NSA award H98230-15-1-0271; AFOSR award FA9550-14-1-0173; NSF FAIN awards DGE-1931800, OAC-1828467, and DGE-1723602; NSF awards DMS-1737978 and MRI-1828467; an IBM faculty award (Research); and an HP grant. Any opinions, recommendations, or conclusions expressed are those of the authors and not necessarily of the aforementioned supporters.

Information Asymmetry (Kasparov vs. Deep Blue, 1997)

1997: IBM Deep Blue becomes the first machine to beat a chess grandmaster (Garry Kasparov) under tournament conditions.

After the match, Kasparov complained that the match was unfair:

“It was difficult to prepare for an opponent with no games. … I couldn’t prepare myself properly for such an event. … You have to know your opponent!” – Garry Kasparov

In contrast, Deep Blue had trained using every match Kasparov had ever played.

IBM Research / June 29, 2018 / © 2018 IBM Corporation

Information asymmetry in cyber defense

Attackers have months or years to study vulnerabilities and defenses

Defenders have seconds to react to never-before-seen attacks

ISTR, vol. 23, 2018

1 in 13 web requests lead to malware

Edgescan, 2019

Ponemon, 2019

19.2% of all web application vulnerabilities are high or critical (24.9% for internal networks)


Information asymmetry & ML for intrusion detection

§ ML offers so much promise for powerful, fast intrusion detection
  – Face and speech recognition, recommendation systems, natural language translation, …

§ Yet, most deployed IDS solutions are still human rule-based with weak AI support... Why?
  (1) Unbalanced data: Hard to get enough malicious data to properly train ML-based IDSes
  (2) Huge feature space: Security-relevant features within the data not known in advance
  (3) Encryption opacity: Encrypted traffic is commonplace and hides much of the best data.
  (4) False alarms: High false alarm rates lead to very low base detection rates.

The task of identifying attacks is fundamentally different from other application domains where machine learning is applied
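Challenge (4) can be made concrete with Bayes' rule. The sketch below is only an illustration: the function name `alarm_precision` and all the numbers are assumptions chosen for the example, not figures from the talk. It computes the probability that a raised alarm corresponds to a real attack when attacks are rare.

```python
def alarm_precision(base_rate: float, tpr: float, fpr: float) -> float:
    """Return P(attack | alarm) via Bayes' rule.

    base_rate: fraction of sessions that are attacks
    tpr: probability an attack raises an alarm (true positive rate)
    fpr: probability a benign session raises an alarm (false positive rate)
    """
    true_alarms = tpr * base_rate
    false_alarms = fpr * (1.0 - base_rate)
    total = true_alarms + false_alarms
    if total == 0.0:
        raise ValueError("no alarms are ever raised with these rates")
    return true_alarms / total


# Hypothetical numbers: 1 in 10,000 sessions is an attack, the detector
# catches 99% of attacks and misfires on 1% of benign sessions.
print(f"{alarm_precision(1e-4, 0.99, 0.01):.4f}")
```

With these assumed rates, fewer than 1 in 100 alarms points at a real attack, which is why even a small false alarm rate matters so much for an IDS.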

Detected attacks are missed IDS training opportunities

Main idea:

When an attack is detected, don’t disconnect it!

Keep the attacker talking to harvest threat data.

Apply automated data mining for IDS training.

IDS learns over time with no data collection burden.

Research Question: Does such an IDS actually learn concepts useful for thwarting real attacks?

(Spoiler alert: Yes, with surprising effectiveness!)

crook-sourcing (noun): the conscription and manipulation of attackers into performing free penetration testing for improved IDS model training and adaptation.

Attack kill chain: a vicious cycle

[Figure: two diagrams. In the first, an attack yields secrets; in the second, the attack is rejected.]

§ facilitates low-risk reconnaissance
§ accentuates the information and time asymmetry that favors attackers
§ amplifies the impact of n-day exploits

Conventional software security patches advertise themselves to attackers

Enhancing IDSes through crook-sourcing


[Figure: the attack is answered with fake secrets.]

Software security patches repurposed as feature extractors
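The contrast between a conventional patch and a patch repurposed as a feature extractor can be sketched in a few lines. This is a minimal illustration under stated assumptions: `Session`, `serve`, `MAX_LEN`, and the length check are all hypothetical stand-ins, and a real deployment would divert the attacker to an isolated decoy rather than return a string.

```python
from dataclasses import dataclass, field
from typing import List

MAX_LEN = 64  # hypothetical input bound that the security patch enforces


@dataclass
class Session:
    session_id: str
    diverted: bool = False                           # set once an exploit attempt is seen
    trace: List[str] = field(default_factory=list)   # labeled attack data for the IDS


def serve(payload: str) -> str:
    """Stand-in for the real request handler."""
    return f"200 OK real:{payload}"


def conventional_patch(session: Session, payload: str) -> str:
    if len(payload) > MAX_LEN:
        return "400 Bad Request"       # the rejection reveals that the hole is patched
    return serve(payload)


def honey_patch(session: Session, payload: str) -> str:
    if len(payload) > MAX_LEN:
        session.diverted = True        # same security check, now a labeling event
    if session.diverted:
        session.trace.append(payload)  # everything from here on is attack data
        return "200 OK fake-secret"    # decoy response keeps the attacker engaged
    return serve(payload)
```

The security check is identical in both functions; only the response differs. In the second version, the check doubles as a ground-truth label for everything the session does afterwards.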

Crook-sourcing advantages


§ Deceive attackers into performing free penetration testing for IDS model training and adaptation
  – attackers contribute their TTP patterns to the data streams processed by the embedded deceptions
  – automatically labels malicious attacker behavior

§ Enables (semi-) supervised learning for intrusion detection
  – improves base detection rates
  – enables multi-class detection and contextually-richer predictions

§ Overcomes issues related to concept differences between honeypot attacks and those against genuine assets
  – deceptions are embedded into the actual target of attacks

System architecture


[Figure: users and attackers connect to honey-patched servers, which produce a monitoring stream, an audit stream, and attack traces; an anomaly detector is also shown. Attack modeling: data queueing and feature extraction turn the audit stream and attack traces into audit data and attack data, which drive the model update. Attack detection: feature extraction turns the monitoring stream into monitoring data for the attack model classifier, which raises alerts.]
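The data flow named in the architecture (queueing, model update, classification of monitoring data, alerts) can be sketched as follows. This is a sketch under loud assumptions: `CentroidModel` is a deliberately simple stand-in classifier and not one of the models from the talk, audit data is assumed to be labeled "normal", and feature vectors are assumed to be already extracted.

```python
from collections import deque
from typing import Deque, Dict, List, Sequence, Tuple

Vector = Sequence[float]


class CentroidModel:
    """Stand-in incremental classifier: one running mean per label."""

    def __init__(self) -> None:
        self.sums: Dict[str, List[float]] = {}
        self.counts: Dict[str, int] = {}

    def update(self, x: Vector, label: str) -> None:
        if label not in self.sums:
            self.sums[label] = [0.0] * len(x)
            self.counts[label] = 0
        self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
        self.counts[label] += 1

    def predict(self, x: Vector) -> str:
        def sq_dist(label: str) -> float:
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=sq_dist)  # label of the nearest centroid


def model_update(model: CentroidModel, queue: Deque[Tuple[Vector, str]]) -> None:
    """Drain the queue of labeled instances into the model."""
    while queue:
        x, label = queue.popleft()
        model.update(x, label)


def detect(model: CentroidModel, monitoring_data: Sequence[Vector]) -> List[int]:
    """Return indices of monitoring instances that raise an alert."""
    return [i for i, x in enumerate(monitoring_data) if model.predict(x) == "attack"]


# Attack traces arrive pre-labeled by the deception; audit data is assumed normal.
queue: Deque[Tuple[Vector, str]] = deque()
for x in [[5.0, 5.1], [4.9, 5.2]]:
    queue.append((x, "attack"))
for x in [[0.1, 0.0], [0.0, 0.2]]:
    queue.append((x, "normal"))

model = CentroidModel()
model_update(model, queue)
alerts = detect(model, [[0.2, 0.1], [4.8, 5.1]])
```

The point of the sketch is the labeling: no human annotates the training data, because the stream an instance arrives on determines its label.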

Feature set models


§ Network features (Bi-Di)
  − Packet length
  − Uni-burst size, time, count
  − Bi-burst size, time

§ System features (N-Gram)
  − System calls: enter or exit
  − Bi-, tri-, and quad-events
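The two feature families above can be sketched as follows. The definitions here are assumptions made for illustration: a uni-burst is taken to be a run of consecutive packets in the same direction, a bi-burst is taken to be a pair of adjacent uni-bursts, and system events are represented as plain strings such as `"open:enter"`. The function names and the packet tuple layout are hypothetical.

```python
from collections import Counter
from itertools import groupby
from typing import List, Sequence, Tuple

# (timestamp in seconds, direction +1 or -1, length in bytes); assumed layout
Packet = Tuple[float, int, int]
Burst = Tuple[int, int, float, int]  # (direction, size, duration, packet count)


def uni_bursts(packets: Sequence[Packet]) -> List[Burst]:
    """Group consecutive same-direction packets into bursts."""
    bursts: List[Burst] = []
    for direction, group in groupby(packets, key=lambda p: p[1]):
        pkts = list(group)
        size = sum(p[2] for p in pkts)
        duration = pkts[-1][0] - pkts[0][0]
        bursts.append((direction, size, duration, len(pkts)))
    return bursts


def bi_bursts(bursts: Sequence[Burst]) -> List[Tuple[int, float]]:
    """Pair each burst with its successor: (combined size, combined time)."""
    return [(a[1] + b[1], a[2] + b[2]) for a, b in zip(bursts, bursts[1:])]


def ngram_counts(events: Sequence[str], n_values: Sequence[int] = (2, 3, 4)) -> Counter:
    """Count bi-, tri-, and quad-events over a system call event sequence."""
    counts: Counter = Counter()
    for n in n_values:
        for i in range(len(events) - n + 1):
            counts[tuple(events[i:i + n])] += 1
    return counts


packets = [(0.0, 1, 100), (0.1, 1, 200), (0.3, -1, 1500), (0.4, 1, 50)]
bursts = uni_bursts(packets)
pairs = bi_bursts(bursts)
grams = ngram_counts(["open:enter", "open:exit", "read:enter", "read:exit"])
```

Because `groupby` only merges adjacent equal keys, neighboring bursts always have opposite directions, so each pair in `bi_bursts` spans one exchange in each direction.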

Attack detection (model 1)


§ Bi-Di-SVM: Network features + SVM
§ N-Gram-SVM: System features + SVM
§ Ens-SVM: ensemble
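The slides do not say how the ensemble combines its two members, so the following is only one plausible combination rule, stated as an assumption: for each instance, take the prediction of whichever classifier is more confident, using the magnitude of a signed decision score as confidence. The function name and the score convention (positive means attack) are hypothetical.

```python
from typing import List, Sequence


def ensemble_predict(bidi_scores: Sequence[float],
                     ngram_scores: Sequence[float]) -> List[int]:
    """Combine two classifiers by trusting the more confident one per instance.

    Each score is a signed decision value: the sign gives the class
    (positive = attack, otherwise normal) and the magnitude the confidence.
    Returns 1 for attack and 0 for normal.
    """
    if len(bidi_scores) != len(ngram_scores):
        raise ValueError("both classifiers must score the same instances")
    predictions = []
    for b, g in zip(bidi_scores, ngram_scores):
        winner = b if abs(b) >= abs(g) else g  # ties go to the network model
        predictions.append(1 if winner > 0 else 0)
    return predictions


combined = ensemble_predict([0.2, -1.5, 0.9], [-0.8, 0.3, 0.1])
```

Raw decision scores from two separately trained models are not on a common scale, so a practical version of this rule would calibrate them first.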

Attack detection (model 2)


§ Bi-Di-OML: Network features + OAML + k-NN
§ N-Gram-OML: System features + OAML + k-NN

Online Adaptive Metric Learning
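To show how a learned metric feeds a k-NN classifier, the sketch below applies a fixed linear map before measuring distances. It is not OAML: the map `L` is hand-written for the example rather than learned online, and the names `embed` and `knn_predict` are hypothetical. It only illustrates why the choice of metric changes which neighbors count as near.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def embed(L: Matrix, x: Vector) -> List[float]:
    """Project x into the metric space defined by the linear map L."""
    return [sum(w * v for w, v in zip(row, x)) for row in L]


def knn_predict(L: Matrix, train: Sequence[Tuple[Vector, str]],
                query: Vector, k: int = 3) -> str:
    """Majority vote among the k training points nearest to query under L."""
    q = embed(L, query)
    scored = []
    for x, label in train:
        e = embed(L, x)
        scored.append((sum((a - b) ** 2 for a, b in zip(q, e)), label))
    scored.sort(key=lambda item: item[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: the first feature separates the classes, the second is noise.
train = [([0.0, 10.0], "normal"), ([0.1, 9.0], "normal"), ([0.2, 11.0], "normal"),
         ([1.0, -10.0], "attack"), ([1.1, -9.0], "attack"), ([0.9, -11.0], "attack")]
query = [1.0, 10.0]

identity = [[1.0, 0.0], [0.0, 1.0]]
learned = [[1.0, 0.0]]  # assumed outcome of metric learning: ignore the noisy feature

plain = knn_predict(identity, train, query)
adapted = knn_predict(learned, train, query)
```

Under the plain Euclidean metric the noisy feature dominates and the query is voted normal; under the assumed learned map it is voted attack.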


Experimental framework


[Figure: three honey-patched deployments, each serving users and attackers and producing a monitoring stream and attack traces; red teaming.]

Vulnerability Classes


Dataset


§ Raw data: 42 GB of (uncompressed) network packets and system events over a period of three weeks

§ Training data: after feature extraction, the training data comprised 1800 normal instances and 1600 attack instances

§ Testing data: 3400 normal and attack instances gathered from monitors deployed at unpatched servers, where the distribution of normal and attack instances varies per experiment

§ Red teaming data: collected over three days from 10 graduate students with basic to advanced offensive security skills, in sessions averaging 45 minutes.

Detection accuracies on simulated environment


Bi-Di: network features; N-Gram: system features

Red teaming validation


Bi-Di: network features; N-Gram: system features

False positive rate reduction


Crook-sourcing advantage


[Figure: accuracy (%) versus number of attack classes (1 to 16), comparing no deception with deceptive defense. Experiments on synthetic data approximating numerous attackers.]

Human subject evaluation: a cautionary tale


[Figure, two panels: accuracy (%) versus number of attack classes, comparing no deception with deceptive defense. Panel titles: "Experiments on synthetic data approximating numerous attackers" and "Experiments with 10 actual human attackers (students)".]

Monitoring performance


Host: 16 cores, 24 GB RAM, 64-bit Ubuntu 16.04 LTS
Benchmark profile: c=10, 500 req/s

Conclusions


§ Crook-sourcing yields higher-accuracy detection models
  – no additional developer effort apart from routine patching activities
  – effortless labeling of the data

§ Deceive attackers into disclosing their TTP patterns for IDS model evolution
  – embedded deceptions extract relevant features from attack sessions

§ Enables semi-supervised learning for intrusion detection
  – improves base detection rates
  – enables multi-class detection and contextually-richer predictions

Thank you

Frederico Araujo—[email protected]/faraujo

IBM Research / June 12, 2019 / © 2019 IBM Corporation ©D. Kirat
