Security Summit Machine Learning and Advanced Cyber Security Analytics

Upload: phamnga

Post on 28-May-2018

Index

1. Company profile

2. Cyber defense

3. Statistics and data viz

4. Machine Learning and Advanced Cyber Security Analytics

- Advanced Cyber Security Analytics

- Machine Learning Analytics

5. Analytics driven by our malware lab

6. Q&A

Company profile

• aizoOn delivers unmatched quality services to over 150 clients all over the world.

• Launched in 2005 with innovation as its primary focus, aizoOn has achieved record growth, with offices across Europe, the USA and Oceania.

Company profile

The Technology Unit Cyber Security is the department through which aizoOn delivers its expertise to provide COMPLETE COVERAGE of the SECURITY NEEDS that may arise within an organization.

• ENTERPRISE CYBER RISK CONSULTING
• INCIDENT HANDLING
• SYSTEMS & NETWORK SECURITY
• APPLICATION SECURITY
• SERVICES PLATFORM & PROGRAMS
• DIGITAL FORENSICS
• IoT SECURITY
• MOBILE SECURITY
• EMBEDDED SECURITY

Cyber defense: In the past

▌ Castles and Moat

▌ Traditional passive technical controls

▌ Isolation via network architecture and access controls

▌ Layered security

▌ Tools:
    ▌ Firewalls
    ▌ VPN
    ▌ IDS/IPS
    ▌ Web application firewall
    ▌ Antivirus
    ▌ NAC

Cyber defense: Today

▌ Current defensive technologies and procedures are only partly effective
▌ The importance of security awareness is often underestimated
▌ Motivated attackers have a high likelihood of breaching
▌ Criminals often manage to escape justice
▌ Effective attacking tools and techniques are economically viable
▌ Most dangerous attacks rely on malware

Cyber defense: Clusit stats

Cyber defense: Our stats

AIZOON BREAK-IN SUCCESS RATE

▌ Successful: 100%
▌ Unsuccessful: 0%

While performing security services (e.g. penetration tests, data exfiltration tests, etc.), aizoOn was able to break into at least one system and retrieve information on every engagement and for every client.

COMPROMISED DATA CONFIDENTIALITY

▌ Business Critical / Highly confidential: 17%
▌ Business Relevant / Confidential: 28%
▌ Proprietary / Internal use: 39%
▌ Other business data: 16%

Cyber defense: Our stats

Malware tiers by DETECTION DIFFICULTY:

▌ High Profile APT: BUDGET to MAKE, EUR 100K – x M
▌ Commercially Advanced Malware: BUDGET to MAKE or BUY, EUR 10K – 100K
▌ Cheap Malware: BUDGET to BUY, EUR 100 – 10K

95% of attacks are SPECIFIC to the organization.

Any company can afford this gap!

Statistics and data viz: you can’t have one without the other

An example:

http://www.datasciencecentral.com/profiles/blogs/when-data-viz-trumps-statistics

Statistics and data viz: Preattentive Processing

“Human perception plays an important role in the area of visualization. An understanding of perception can significantly improve both the quality and the quantity of information being displayed.”

- Ware -

Ware, C. Information Visualization: Perception for Design. Morgan Kaufmann Publishers, Inc., San Francisco, California, 2000.

[Figure: panels (a) and (b)]

Machine Learning and Advanced Cyber Security Analytics

Five Styles of Advanced Threat Defense

WHERE TO LOOK | TIME: REAL TIME / NEAR REAL TIME    | TIME: POSTCOMPROMISE (DAYS / WEEKS)
NETWORK       | Style 1: Network Traffic Analysis   | Style 2: Network Forensics
PAYLOAD       | Style 3: Payload Analysis           |
ENDPOINT      | Style 4: Endpoint Behavior Analysis | Style 5: Endpoint Forensics

Source: Gartner (August 2013), "Five Styles of Advanced Threat Defense", Lawrence Orans, Jeremy D'Hoinne.

Machine Learning and Advanced Cyber Security Analytics

▌ Advanced Analytics: Cyber Security Expert Knowledge, Data Mining techniques
▌ Machine Learning Engine: Machine Learning techniques, Self-Learned Knowledge
▌ Threat Knowledge

Advanced cyber security analytics

Cyber Security expert knowledge and Data Mining techniques combine into Advanced Analytics:

• Strings Domain Analysis
• Traffic Outlier Analysis
• Network Analysis
• et al.

Advanced cyber security analytics: Strings Domain Analysis

Similar but not equal to original ones:

• faceb000k.7host08.com
• googlle.in

Random & pseudorandom domains:

• zn_1i8wqguxh5uzvbz.interceptics.com
• 5fe8a3d39a5226a54c2.activitylt.com
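One common way to flag random-looking labels such as the ones above is character entropy. The sketch below is only an illustration: the slides do not say which heuristic is used, and the function names `shannon_entropy` and `first_label` are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of the characters in text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def first_label(domain: str) -> str:
    """Return the left-most label of a domain name (the part before the first dot)."""
    return domain.split(".", 1)[0].lower()


if __name__ == "__main__":
    # Domains taken from the slide; a higher entropy suggests a more random label.
    for domain in ["googlle.in", "5fe8a3d39a5226a54c2.activitylt.com"]:
        label = first_label(domain)
        print(f"{domain}: {shannon_entropy(label):.2f} bits/char")
```

Entropy alone would not catch look-alike domains such as googlle.in, which is why a string-distance check against known names is still needed.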

Advanced cyber security analytics: Strings Domain Analysis

DNS Analysis

• Near Blacklist: possible malware domain
• Abnormal: distant from both the blacklist and the whitelist

User Agent Analysis

• Near Blacklist: possible malware domain
• Warning: distant from both the blacklist and the whitelist
• Unseen: unknown user agent (distant from the user agents seen in the network)
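The labelling rules above can be sketched as a small classifier. This is a minimal illustration under stated assumptions: `difflib.SequenceMatcher` stands in for the string metric, the thresholds `near` and `far` are hypothetical, and the fallback label "no alert" is not taken from the slides.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; difflib is a stand-in for the slide's string metric."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(value: str, known: list) -> float:
    """Highest similarity between value and any entry of known (0.0 if known is empty)."""
    return max((similarity(value, k) for k in known), default=0.0)


def classify(value: str, blacklist: list, whitelist: list,
             near: float = 0.85, far: float = 0.5) -> str:
    """Label a string by its closeness to the two lists (thresholds are hypothetical)."""
    black = best_match(value, blacklist)
    white = best_match(value, whitelist)
    if black >= near:
        return "near blacklist"   # possible malware domain
    if black < far and white < far:
        return "abnormal"         # distant from both lists
    return "no alert"             # assumption: anything else raises nothing
```

The same function could be applied to user agent strings by passing lists of known-bad and known-good agents instead of domains.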

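Advanced cyber security analytics: Strings Domain Analysis

Jaro-Winkler Distance

$$ s = \frac{m}{3a} + \frac{m}{3b} + \frac{m - t}{3m} $$

m: number of matching characters
t: number of transpositions
a, b: lengths of the two strings

Two equal characters count as matching if their positions differ by at most $\left\lfloor \max(a, b) / 2 \right\rfloor - 1$.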
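The formula above is the Jaro similarity; the Winkler variant adds a bonus for a shared prefix. The sketch below follows the standard textbook definitions, with t counted as half the number of matched characters that are out of order. The prefix weight `p = 0.1` and the prefix cap of 4 are the usual defaults, assumed here rather than taken from the slides.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]: 1.0 for identical strings, 0.0 for no matches."""
    if s1 == s2:
        return 1.0
    a, b = len(s1), len(s2)
    if a == 0 or b == 0:
        return 0.0
    # Characters match only if equal and at most `window` positions apart.
    window = max(max(a, b) // 2 - 1, 0)
    matched1 = [False] * a
    matched2 = [False] * b
    m = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(b, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # t is half the number of matched characters that appear in a different order.
    seq1 = [c for c, hit in zip(s1, matched1) if hit]
    seq2 = [c for c, hit in zip(s2, matched2) if hit]
    t = sum(c1 != c2 for c1, c2 in zip(seq1, seq2)) / 2
    return (m / a + m / b + (m - t) / m) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the common prefix (up to max_prefix)."""
    sim = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:max_prefix], s2[:max_prefix]):
        if c1 != c2:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)
```

For instance, `jaro_winkler("googlle", "google")` gives a high score for the look-alike domain label from the earlier slide, while unrelated strings score close to 0.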

Advanced cyber security analytics: Traffic Outlier Analysis

Chart encoding:

• Shape: traffic direction
• Dimension: volume
• Position: probability of anomaly
• Color: machine class

Advanced cyber security analytics: Traffic Outlier Analysis

Traffic Analysis:

• Identification of anomalies by analysing the trend of traffic previously seen in the network

Protocols and Services Analysis:

• Identification of anomalies by analysing the trend of protocols and services previously seen in the network

Advanced cyber security analytics: Traffic Outlier Analysis

Approaches to calculating the outlier probability:

• Gaussian distribution

• ARIMA model

• Kernel Density Estimate

• Mean

• Median

• Mean of quantiles
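Two of the approaches listed above can be sketched with the standard library alone: a Gaussian fit and a median-based robust score. How these scores map to the "probability of anomaly" shown in the charts is not stated in the slides, so the interpretation in the docstrings is an assumption, and both function names are hypothetical.

```python
import math
import statistics


def gaussian_outlier_score(history, value):
    """P(|Z| <= |z|) under a normal fit of history; values near 1.0 lie far in the tails."""
    mu = statistics.fmean(history)
    sigma = statistics.stdev(history)
    if sigma == 0:
        return 0.0 if value == mu else 1.0
    z = abs(value - mu) / sigma
    return math.erf(z / math.sqrt(2))


def median_outlier_score(history, value):
    """Distance from the median, in units of the median absolute deviation (MAD)."""
    med = statistics.median(history)
    mad = statistics.median([abs(x - med) for x in history])
    if mad == 0:
        return 0.0 if value == med else math.inf
    return abs(value - med) / mad


if __name__ == "__main__":
    # Hypothetical traffic volumes previously seen for one machine.
    history = [100, 102, 98, 101, 99, 100, 103, 97]
    for observed in (101, 150):
        print(observed,
              round(gaussian_outlier_score(history, observed), 4),
              round(median_outlier_score(history, observed), 2))
```

The median-based score is less sensitive to earlier spikes in the history than the Gaussian fit, which is one reason to keep several estimators side by side.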

Advanced cyber security analytics: Network Analysis

• Analysis of IPs and ports
• Analysis of connections between intranet hosts

Advanced cyber security analytics: Network Analysis

• Link analysis (associations between objects)

• Network robustness (identify critical fraction of nodes or links)

• Centrality measures (relative importance of nodes and edges)
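As an illustration of the simplest centrality measure, the sketch below computes degree centrality over an undirected graph of host-to-host connections. The host names and the edge-list input format are hypothetical; the slides do not say which centrality measures are used.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Map each node to its number of distinct neighbours divided by (n - 1)."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        if src != dst:  # self-loops are ignored
            neighbours[src].add(dst)
            neighbours[dst].add(src)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


if __name__ == "__main__":
    # Hypothetical intranet connections observed between hosts.
    connections = [("pc1", "srv1"), ("pc2", "srv1"), ("pc3", "srv1"), ("pc1", "pc2")]
    for host, score in sorted(degree_centrality(connections).items()):
        print(f"{host}: {score:.2f}")
```

A host whose centrality jumps compared with its history, for example a workstation suddenly talking to most other workstations, is the kind of change this analysis can surface.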

Machine Learning analytics: an unsupervised approach

Anomaly detection:

• Bayesian Network
• Support Vector Machine
• Bootstrap and Distribution Tests

Machine Learning analytics: an unsupervised approach

The unsupervised approach enables aramis to:

- Self-learn the behaviour of the network
- Spot unusual activity
- Automatically detect patterns and relationships
- Work without a priori information
- Operate without human input

Machine Learning analytics: Bayesian Networks (BNs)

Variables are expressed through a Directed Acyclic Graph (DAG). Every node in the network is a variable, and each arc expresses a dependency between two variables.
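A minimal sketch of how a DAG with conditional probability tables yields a joint probability: the product of each node's probability given its parents. The three variables and all the numbers below are hypothetical, chosen only to illustrate the structure; they are not taken from aramis.

```python
# Hypothetical DAG: off_hours -> large_upload <- new_destination
PARENTS = {
    "off_hours": (),
    "new_destination": (),
    "large_upload": ("off_hours", "new_destination"),
}

# P(node is True | parent values); illustrative numbers only.
CPT = {
    "off_hours": {(): 0.2},
    "new_destination": {(): 0.1},
    "large_upload": {
        (False, False): 0.01,
        (False, True): 0.05,
        (True, False): 0.03,
        (True, True): 0.30,
    },
}


def joint_probability(assignment):
    """Chain rule over the DAG: multiply P(node | parents) for every node."""
    prob = 1.0
    for node, parents in PARENTS.items():
        key = tuple(assignment[p] for p in parents)
        p_true = CPT[node][key]
        prob *= p_true if assignment[node] else 1.0 - p_true
    return prob


if __name__ == "__main__":
    event = {"off_hours": True, "new_destination": True, "large_upload": True}
    print(joint_probability(event))
```

Events whose joint probability is very low under the learned tables are candidates for anomalies.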

Machine Learning analytics: Bayesian Networks (BNs)

First Layer Evaluation → Second Layer Evaluation → Outcome

• First layer: pc1_model, pc2_model, pc3_model, srv1_model, srv2_model
• Second layer: Global_pc_model, Global_srv_model

Evaluation levels:

• Single Event Evaluation
• Single Machine Evaluation
• Class Machine Evaluation
• Overall Network Evaluation

Machine Learning analytics: Support Vector Machine (SVM)

Maximum margin hyperplane:

Minimization problem:
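The formulas on this slide did not survive in the transcript. For reference, the standard hard-margin SVM formulation is given below; it is an assumption that the slide showed this form. Here $\mathbf{w}$ is the normal vector, $b$ the offset, and $(\mathbf{x}_i, y_i)$ with $y_i \in \{-1, +1\}$ the training pairs.

```latex
% Separating hyperplane
\mathbf{w} \cdot \mathbf{x} + b = 0

% Maximizing the margin 2 / \lVert \mathbf{w} \rVert is equivalent to:
\min_{\mathbf{w},\, b} \ \frac{1}{2} \lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1, \qquad i = 1, \dots, n
```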

Machine Learning analytics: Support Vector Machine (SVM)

Mapping to a space where patterns are separable

Machine Learning analytics: Support Vector Machine (SVM)

Identification of anomalies in terms of distance from the original data distribution.

Separation of the region capturing the training data points, with maximization of its distance from the origin.

[Figure: the Region enclosing the training data, with Outliers outside it]
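The description above matches the one-class SVM of Schölkopf et al.; assuming that is the variant meant, its standard optimization problem is written out below. $\Phi$ is the kernel feature map, $\rho$ the offset from the origin, $\xi_i$ the slack variables, and $\nu \in (0, 1]$ bounds the fraction of training points treated as outliers.

```latex
\min_{\mathbf{w},\, \boldsymbol{\xi},\, \rho} \
\frac{1}{2} \lVert \mathbf{w} \rVert^{2}
+ \frac{1}{\nu n} \sum_{i=1}^{n} \xi_i - \rho
\quad \text{subject to} \quad
\mathbf{w} \cdot \Phi(\mathbf{x}_i) \ge \rho - \xi_i, \qquad \xi_i \ge 0

% Decision function: +1 inside the region, -1 for outliers
f(\mathbf{x}) = \operatorname{sgn}\left( \mathbf{w} \cdot \Phi(\mathbf{x}) - \rho \right)
```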

Machine Learning analytics: Bootstrap and distribution tests

▌ HTTP requests and replies

▌ FTP activity

▌ SSL sessions

▌ SSL certificates used

▌ SMTP traffic on a network

▌ DNS activity on a network

▌ Connections

▌ Network activity on non-standard ports

▌ Files transmitted over the network

▌ Unexpected protocol-level activity

[Figure: outcome distribution, training vs. test]

Machine Learning analytics: Bootstrap and distribution tests

Bootstrap: training vs. test. Same distribution?
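One way to answer the "same distribution?" question is a bootstrap test on the gap between the two empirical distributions. The sketch below uses the two-sample Kolmogorov-Smirnov statistic and resamples from the pooled data; the choice of statistic, the number of rounds, and the function names are assumptions, since the slides do not describe the test that is actually used.

```python
import bisect
import random


def ks_statistic(xs, ys):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    xs, ys = sorted(xs), sorted(ys)
    gap = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        gap = max(gap, abs(fx - fy))
    return gap


def bootstrap_p_value(train, test, rounds=1000, seed=0):
    """Share of pooled resamples whose KS gap is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(train, test)
    pooled = list(train) + list(test)
    hits = 0
    for _ in range(rounds):
        # Under the null hypothesis both samples come from the pooled distribution.
        a = rng.choices(pooled, k=len(train))
        b = rng.choices(pooled, k=len(test))
        if ks_statistic(a, b) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)
```

A small p-value suggests that the test window (for example, today's DNS activity) no longer follows the distribution learned during training.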

Analytics driven by our malware lab

• Enhance malware knowledge
• Identify and analyze new malware behavioural patterns
• Test and refine algorithms
• We cannot infect a real network
• Malware can detect a sandbox
• A machine without a user is not realistic

Analytics driven by our malware lab

• Employee bot
• Typical local, intranet, and internet operations
• Random component (temporal patterns similar to real usage)
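The idea of an employee bot with a random temporal component can be sketched as a scheduler that draws exponentially distributed gaps between actions during working hours. Everything below is hypothetical: the action names, the working hours, the mean gap, and the exponential model are illustrative assumptions, not a description of the lab's bot.

```python
import random

# Hypothetical operations, grouped by the three scopes named on the slide.
ACTIONS = {
    "local": ["open_document", "save_file"],
    "intranet": ["read_wiki", "mount_share"],
    "internet": ["browse_news", "check_webmail"],
}


def simulate_workday(seed=None, start_s=9 * 3600, end_s=18 * 3600, mean_gap_s=300.0):
    """Return a list of (second_of_day, scope, action) tuples with random spacing."""
    rng = random.Random(seed)
    schedule = []
    t = start_s + rng.expovariate(1.0 / mean_gap_s)
    while t < end_s:
        scope = rng.choice(sorted(ACTIONS))
        schedule.append((int(t), scope, rng.choice(ACTIONS[scope])))
        # Exponential gaps give irregular, bursty timing instead of a fixed period.
        t += rng.expovariate(1.0 / mean_gap_s)
    return schedule


if __name__ == "__main__":
    for second, scope, action in simulate_workday(seed=42)[:5]:
        print(f"{second // 3600:02d}:{second % 3600 // 60:02d} {scope:9s} {action}")
```

Driving an infected lab machine with a schedule like this gives the analytics background traffic that resembles a real user, which addresses the "machine without a user" limitation.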

Questions & Answers

Contact:

[email protected]