how machine learning can fence-in bad actors

© 2020 SPLUNK INC.

How Machine Learning Can Fence-in Bad Actors

WSTA | September 2020

Julio Gomez

Financial Services Strategist, [email protected]

Forward-Looking Statements

During the course of this presentation, we may make forward-looking statements regarding future events or plans of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results may differ materially. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, it may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements made herein.

In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only, and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionalities described or to include any such feature or functionality in a future release.

Splunk, Splunk>, Data-to-Everything, D2E and Turn Data Into Doing are trademarks and registered trademarks of Splunk Inc. in the United States and other countries. All other brand names, product names or trademarks belong to their respective owners. © 2020 Splunk Inc. All rights reserved.

Agenda

● Key challenges with predictive cybersecurity defense

● Opportunity for data analytics, machine learning, and AI

● Lessons learned from the field

Why Do Organizations Struggle to Get “Predictive” with Security?

● Data: Volume, Variety, Velocity, and Veracity
● Machine Learning Expertise
● Difficult to obtain
● Analytic Skill Set and Data Literacy
● Data Access and Integration is Difficult
● Security Domain Expertise
● Adequate Tools
● Complicated Data Structure

What Data Scientists Really Do

Data Preparation accounts for about 80% of the work of data scientists

“Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says”, Forbes Mar 23, 2016

Statistical Analysis Is A Start

Account Enumeration & Credential Testing
● Abnormally high number of failed logins from device or IP
● Abnormally high number of account accesses from device or IP

ATM Transactions & Wire Transfers
● Anomalously high number of transactions by merchant
● Anomalously high transaction by account

Data Exfiltration & Access
● User with high reads & writes to database compared to others in the same role
● Servers or users with high bytes_out in comparison to peers

IP Theft
● High number of requests to API service
● Speed violations: accounts requesting data at machine speed
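As an illustration of the "abnormally high number of failed logins from device or IP" check, here is a minimal Python sketch. It is not taken from the presentation: the function name `robust_outliers`, the sample addresses, and the 3.5 cutoff are all assumptions chosen for the example. It scores each IP with a modified z-score built on the median and the median absolute deviation (MAD), which is less distorted by the very outliers being hunted than a mean and standard deviation would be.

```python
from statistics import median

def robust_outliers(counts, threshold=3.5):
    """Return entities whose count is abnormally high by modified z-score.

    counts: dict mapping an entity (device or IP) to an event count.
    threshold: 3.5 is a commonly quoted cutoff; treat it as a tunable assumption.
    """
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    # Assumption for this sketch: with integer counts, never let the spread
    # drop below one event, so identical baselines do not divide by zero.
    mad = max(mad, 1.0)
    return sorted(
        entity
        for entity, value in counts.items()
        if 0.6745 * (value - med) / mad > threshold
    )

# Hypothetical failed-login counts per source IP over one hour.
failed_logins = {
    "10.0.0.1": 3,
    "10.0.0.2": 5,
    "10.0.0.3": 4,
    "10.0.0.4": 2,
    "10.0.0.5": 6,
    "203.0.113.9": 240,
}

print(robust_outliers(failed_logins))  # ['203.0.113.9']
```

The same function could be pointed at transactions per merchant or API requests per account; only the input dictionary changes.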

Prediction & Time Series Forecasting

Use Case: As a network analyst for an investment bank, I want to forecast IP traffic patterns to ensure proper resourcing and identify denial-of-service attacks.
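One simple way to approach this use case is a seasonal baseline: forecast each hour of the day from the same hour on previous days, then flag observations far above that forecast. The Python sketch below assumes hourly traffic totals are already available. The names `hourly_profile` and `flag_spikes`, the synthetic numbers, and the multiplier `k=3.0` are illustrative assumptions, and a production forecast would likely use a richer time series model.

```python
from statistics import mean, stdev

def hourly_profile(history):
    """Build a per-hour forecast from past days.

    history: list of days, each a list of 24 hourly traffic values.
    Returns a list of 24 (forecast, spread) tuples.
    """
    profile = []
    for hour in range(24):
        samples = [day[hour] for day in history]
        profile.append((mean(samples), stdev(samples)))
    return profile

def flag_spikes(profile, observed, k=3.0):
    """Return the hours where observed traffic exceeds forecast + k * spread."""
    return [
        hour
        for hour, value in enumerate(observed)
        if value > profile[hour][0] + k * profile[hour][1]
    ]

# Synthetic history: higher traffic during business hours, small daily variation.
base = [150 if 9 <= hour <= 17 else 100 for hour in range(24)]
history = [[value + (day % 3 - 1) * 5 for value in base] for day in range(6)]

# Today's traffic matches the baseline except for a burst at 03:00.
today = list(base)
today[3] = 900

profile = hourly_profile(history)
print(flag_spikes(profile, today))  # [3]
```

The forecast half of the profile supports resourcing decisions, while the spread half gives the band used to separate ordinary peaks from a possible denial-of-service burst.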

Enterprises Want Answers from their Data

Anomaly detection
► Deviation from past behavior
► Deviation from peers (aka Multivariate AD or Cohesive AD)
► Unusual change in features

Predictive Analytics
► Predict Service Health Score/Churn
► Predicting Events
► Trend Forecasting
► Detecting influencing entities
► Early warning of failure

Clustering
► Identify peer groups
► Event Correlation
► Reduce alert noise
► Behavioral Analytics
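"Deviation from peers" can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not the presenter's method: it treats a user's role as the peer group, compares each user's outbound bytes with the median of the other members of that role, and flags anyone above a hypothetical ratio of 5. The record layout and all sample values are invented for the example.

```python
from collections import defaultdict
from statistics import median

def peer_deviations(records, ratio=5.0):
    """Flag users whose value far exceeds the median of their peers.

    records: list of (user, role, bytes_out) tuples; the role is the peer group.
    ratio: hypothetical multiplier over the peer median that counts as a deviation.
    """
    by_role = defaultdict(list)
    for user, role, value in records:
        by_role[role].append((user, value))

    flagged = []
    for role, members in by_role.items():
        for user, value in members:
            peers = [v for other, v in members if other != user]
            if len(peers) < 2:
                continue  # too few peers for a meaningful comparison
            if value > ratio * median(peers):
                flagged.append((user, role))
    return sorted(flagged)

# Hypothetical daily outbound megabytes per user.
records = [
    ("alice", "dba", 120),
    ("bob", "dba", 150),
    ("carol", "dba", 130),
    ("dave", "dba", 4000),
    ("erin", "web", 900),
    ("frank", "web", 1100),
    ("grace", "web", 1000),
]

print(peer_deviations(records))  # [('dave', 'dba')]
```

Here the peer groups are given; the clustering column on the slide covers the case where the groups themselves have to be discovered from the data.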

Every Search Can Use Machine Learning

Figure: data flows from sources through Analytics and a Machine Learning Tool to a Real Time Alert and its actions.

Data sources: OT, Industrial Assets, IT, Consumer and Mobile Devices

Alert actions:
● Send an email
● File a ticket
● Send a text
● Flash lights
● Trigger automated response & other integrations

Alert destinations: Email, Tickets, Smartphones and Devices, SOAR & 3rd party applications

Supervised Machine Learning Process

Clean, Transform Data
● Structured Data and Unstructured Data
● Data Prep / Pre-Processing
● Attribute Selection
● Training Set and Test Set

Model Creation
● Attribute Selection
● Predictor Field
● Algorithm Selection
● Run the Model (Fit)

Validate & Refine the Model
● Apply Predictions against the Test Set
● Measure Model Accuracy
● Modify Attribute Selection
● Re-Run the Model
● Measure Model Accuracy

Productionize
● Deploy the Model to Unseen Data
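The process above can be walked through end to end with a toy Python example. Everything in it is an assumption made for illustration: the synthetic data, the two attributes (failed logins and distinct accounts touched per session), the "benign" and "suspicious" labels, and the choice of a nearest-centroid classifier as a stand-in for whichever algorithm is selected in practice.

```python
import random

def make_dataset(n_per_class, seed=7):
    """Synthetic labeled sessions: (failed_logins, distinct_accounts) -> label."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_per_class):
        rows.append(((rng.gauss(3, 1), rng.gauss(1, 0.5)), "benign"))
        rows.append(((rng.gauss(40, 5), rng.gauss(25, 5)), "suspicious"))
    rng.shuffle(rows)
    return rows

def split(rows, test_fraction=0.3):
    """Hold out the last part of the shuffled rows as the test set."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def fit(train):
    """Run the model (fit): compute the mean attribute vector per label."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, features):
    """Apply the model: pick the label whose centroid is closest."""
    def dist2(centroid):
        return sum((x - c) ** 2 for x, c in zip(features, centroid))
    return min(model, key=lambda label: dist2(model[label]))

def accuracy(model, rows):
    """Measure model accuracy as the fraction of correct predictions."""
    hits = sum(predict(model, features) == label for features, label in rows)
    return hits / len(rows)

rows = make_dataset(100)            # data prep
train, test = split(rows)           # training set and test set
model = fit(train)                  # model creation
test_accuracy = accuracy(model, test)  # validate against the test set
print(f"test accuracy: {test_accuracy:.2f}")

# Productionize: score a session the model has never seen.
print(predict(model, (35.0, 20.0)))
```

To mimic the "modify attribute selection, re-run the model" loop, one could drop an attribute from each feature tuple, call `fit` again, and compare the new accuracy with the first measurement.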

Benefits of an ML Strategy for Security

and the opportunity for technology and data

Figure: BI Rule Based Clustering compared with ML Based Clustering, with a Possible Fraud Ring highlighted.

Lessons Learned from the Field

Customer Success Stories

● ML used to help automate threat hunting and 90% of security metrics process in two months

● Security analysts given back 30+ hours a month to focus on proactive security, instead of manual data collection and reporting

● ML used to detect insider and external threats.

● Analyst efficiency to gather data and conduct security investigations increased by 50%

● Provides deep reusable correlation rules across all support engineer levels

What are ML/AI Practitioners Detecting?

Account Misuse
● Privilege abuse of admin account
● Detection of account sharing
● Detection of Shadow IT Servers
● Privilege operations on self
● Short lived accounts on Box & AD
● Interactive logins by service accounts
● Unauthorized password change attempt
● Critical file access by service accounts
● Unauthorized file access
● Unauthorized application usage

Compromised User Account
● Compromised service accounts that enabled remote access
● Usage of co-worker’s machine by user
● Query blank password on admin acct

Compromised / Infected Machine
● Malware communication where the standard security tools failed to detect (multiple customers)
● Exposed company's sensitive information on public web services
● Compromised mobile phone generating suspicious outbound connection
● Infected machine based on alert correlation & internal detection

Lateral Movement
● Creation of temporary local accounts across multiple machines
● Unusual process creating sockets across internal machines

What are ML/AI Practitioners Detecting?

Data Exfiltration
▶ User forwarding all corporate emails to personal email address
▶ User copying email archive data that is several years old from central repository
▶ High volume of data downloads from Box and previews indicating data gathering/snooping
▶ Users exfiltrating data out of organization, detected by deviations from user’s and peer group’s profiles

Suspicious Behavior / Unknown Threat
▶ Users with web proxy disabled
▶ Users logging in from unusual/unauthorized geo locations
▶ Suspicious call home activity
▶ Users encountering malvertising
▶ Suspicious account lockout
▶ Misconfigured services using expired credentials
▶ Accounts impersonating user logins
▶ Misconfigured/corrupted VDI profile
▶ Unauthorized use of P2P software
▶ Unauthorized corporate resource usage for Bitcoin mining
▶ DNS abuse activity

Contextual Intelligence
▶ Automatic detection of user/account related details
- Account types: domain administrators, service accounts
- User risk scoring: internal & external risk
▶ Automatic detection of device types in environment
- DCs, exchange servers, email servers, DNS servers, personal laptops, web servers, NTP servers
▶ Popularity of external domains & IPs

External Attack
▶ Detection of directory traversal attacks based on malicious strings in Web URL requests to web servers
▶ RDP traffic from suspicious sources

To Learn More...

● Apps: MLTK / DLTK / Fraud
● Workshops: Fraud, Security, and Compliance
● Essential Guide: This is your map
● Blogs: Learn from the experts

All of these are complimentary

The Data-to-Everything Platform

Figure labels: Data Lakes, Master Data Management, ETL, Point Data Management Solutions, Data Silos, Business Processes, IT, Security, DevOps.
