Detection of Anomalous Behavior
DESCRIPTION
To take action before IT security attacks become critical, organizations need the analytics capabilities necessary to identify anomalous and suspicious behavior quickly. Our Anomalous Behavior Detection Solution addresses security issues that conventional methods can’t. It can help to detect and prevent theft of data or intellectual property (IP), for instance at the behest of nation states or organized crime, or by a disenchanted employee. It can quickly identify when a user is behaving in a way that is abnormal for them and take appropriate action to limit what they can do, or flag the situation for managerial attention. It can also predict when anomalous behavior is likely to occur, flagging events of interest for further investigation as potential security breaches.
TRANSCRIPT
Detecting Anomalous Behavior with the Business Data Lake
Paul Gittins & Steve Jones
BIM
Copyright © 2014 Capgemini. All rights reserved.
Detecting Anomalous Behavior with the Business Data Lake | Steve Jones
The new threat vectors are highly targeted
Bad “Actors”
• Organized criminals
• Foreign States
• Hacktivists
Utilities: Disrupt as a strategic asset
Financial Services: Operational code, user accounts, fraud
Gain access to critical Intellectual Property
Traditional security approaches wouldn’t catch Edward Snowden and can’t adapt quickly enough to new cyber-crime attacks.
The attack surface of the business has significantly increased
Three drivers have increased the attack surface:
• Clouds add complexity
• Blurred boundaries: Increased need to share data/information across the business and with 3rd parties
• Data volumes, variety and velocity are increasing
A new approach is needed to counter the threats
Increased Threats: Traditional tools don’t protect against “bad actors” who target IP, financial information and strategic access.
Detect Anomalous Behavior: Our approach creates insight into anomalous behavior and threats within the business and surrounding ecosystem.
React: Allows you to take appropriate action based on potential impact of threat to reduce risk.
SIEM and GRC could not prevent Mr. Snowden
Social engineering has been a primary attack vector for large threats
Significant IP breaches are often socially engineered
Current tooling is important but insufficient:
• Governance, risk and compliance (GRC) defines a set of “allowed behavior”
• Identity and access management tooling provides the system-level access controls based on policy
• SIEM collates but does not provide insight or analytics in the right ways to identify these threats.
User accessing critical systems within role
Edward Snowden:
• In role
• Right identity & access controls
• Logs collated his activity
• Yet the assets were accessed
The NSA could not spot the anomalous behavior.
[Diagram labels: GRC, SIEM]
We need a different approach
Anomalous Behavior
Traditional approaches need to be complemented; SIEM and GRC are still needed.
GRC says what is approved: the tasks you can do, the gates you can go through.
Anomalous Behavior Detection says whether you should have.
Extend using Anomalous Behavior Detection
This approach:
1. Learns what is normal [the difference between approved and allowed]
2. Identifies what is anomalous and categorizes the risk
3. Alerts so you can react before it becomes a problem.
New Outcomes are Possible
It is an extension of current security approaches that enables a reduction in GRC and can identify threats that GRC cannot:
• It shows where “allowed” is not “normal” and the scope of the deviation from the norm.
• It detects social engineering attacks as well as network-level threats.
• It minimizes the exposure time and loss.
• It can potentially predict the leakage areas ahead of the attack.
• It can be applied to both GRC areas (Snowden) and non-GRC areas (networks, non-controlled information) to build up a broader pattern of behavior.
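The three steps of this approach (learn what is normal, identify and categorize what is anomalous, alert) can be illustrated with a minimal Python sketch. It is not the solution described in these slides: it assumes a simple per-user z-score over daily access counts, and the function names, risk thresholds and sample data are all hypothetical.

```python
from statistics import mean, stdev

def learn_baseline(history):
    """Step 1: learn what is normal for each user from past daily access counts."""
    return {user: (mean(counts), stdev(counts)) for user, counts in history.items()}

def score(baseline, user, todays_count):
    """Step 2: measure the deviation from the user's norm and categorize the risk."""
    mu, sigma = baseline[user]
    if sigma > 0:
        z = (todays_count - mu) / sigma
    else:
        # No past variation: any increase counts as an extreme deviation.
        z = float("inf") if todays_count > mu else 0.0
    # Thresholds are illustrative assumptions, not values from the slides.
    if z >= 6:
        return z, "high"
    if z >= 3:
        return z, "medium"
    return z, "low"

def alerts(baseline, today):
    """Step 3: alert on every user whose activity is not low risk."""
    flagged = []
    for user, count in today.items():
        _, risk = score(baseline, user, count)
        if risk != "low":
            flagged.append((user, risk))
    return flagged

# Hypothetical daily counts of sensitive-document accesses per user.
history = {
    "alice": [10, 12, 11, 9, 13],
    "bob": [40, 42, 38, 41, 39],
}
baseline = learn_baseline(history)
flagged = alerts(baseline, {"alice": 60, "bob": 41})
```

Both users stay within policy here; only the user whose count departs sharply from their own history is flagged, which is the distinction between "allowed" and "normal" drawn above.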
Detection of Anomalous Behavior: from Insight to Action
Structured data: SIEM, AD, HR
Unstructured data: Images, Social, Video
Machine learning defines “normal” across user base
Users accessing key systems within role as defined by GRC
Deviation from norm triggers action
Automated response based on level of deviation and system criticality: Inform management, Adjust policies, Lockdown
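The idea of an automated response that depends on both the level of deviation and the criticality of the system can be sketched as a small decision function. This is only an illustration: the normalized deviation score, the 1 to 3 criticality scale and the thresholds are assumptions, not values from the presentation.

```python
def choose_response(deviation, criticality):
    """Pick a response from the deviation level and the system criticality.

    deviation:   hypothetical normalized score, 0.0 (normal) to 1.0 (extreme)
    criticality: hypothetical scale, 1 (low) to 3 (critical system)
    """
    severity = deviation * criticality
    # Thresholds below are illustrative assumptions.
    if severity >= 2.0:
        return "lockdown"
    if severity >= 1.0:
        return "adjust_policies"
    if severity >= 0.5:
        return "inform_management"
    return "no_action"

responses = [
    choose_response(0.9, 3),  # large deviation on a critical system
    choose_response(0.6, 2),  # moderate deviation on an important system
    choose_response(0.6, 1),  # the same deviation on a low-criticality system
    choose_response(0.1, 3),  # small deviation, even on a critical system
]
```

Multiplying the two inputs is one simple design choice; a lookup table per system would give finer control at the cost of more configuration.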
How we generate insight into anomalies to enable action
By taking a Data Science approach:
Tools:
• Use of the open-source MADlib library to provide in-database functions
• Leading-edge tools to implement machine learning collaboratively
Methods:
• A wide variety of machine learning algorithms, parallelized for optimum performance on the Business Data Lake
• Agile, test-driven, customer-focused
Process:
• Analytical workflow aligned with business needs and optimized for speed
• Supports iterative and collaborative working.
[Figure: Business Data Lake architecture diagram, showing the sources, ingestion tier, distillation tier, processing tier, insights tier, action tier, unified operations tier and unified data management tier]
Examples: SIEM, GRC and Detection of Anomalous Behavior
1. Out of policy access
User tries to access what they shouldn’t
GRC says “no”, notifies SIEM
SIEM collates, alerts, may reduce privileges via GRC/IAM
2. In policy but abnormal access
User accesses single item out of norm but in policy
GRC says “yes”
AB: “but that isn’t normal”, alert to SIEM
SIEM collates, alerts, may reduce privileges via GRC/IAM
3. In policy but extremely abnormal access
User accesses multiple areas out of ordinary but in policy
GRC says “yes”
AB: “this is the ONLY person EVER to do this!”, alert to SIEM
Shutdown of user access + manager alerts
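The three example cases can be expressed as a short classification sketch. It assumes, purely for illustration, that "policy" is a set of permitted resources per user and that "the norm" is the set of resources the user has accessed before; the names, data structures and the rule "more than one unusual resource means extremely abnormal" are hypothetical simplifications.

```python
def classify_session(user, resources, policy, history):
    """Classify one session of accesses into the three cases from the slide."""
    allowed = policy.get(user, set())
    if any(r not in allowed for r in resources):
        return "out_of_policy"        # case 1: GRC says "no"
    usual = history.get(user, set())
    unusual = [r for r in resources if r not in usual]
    if len(unusual) > 1:
        return "extremely_abnormal"   # case 3: multiple areas out of the ordinary
    if len(unusual) == 1:
        return "abnormal"             # case 2: a single item out of the norm
    return "normal"

# Responses taken from the slide text; the dictionary itself is illustrative.
RESPONSES = {
    "out_of_policy": "GRC notifies SIEM; SIEM may reduce privileges via GRC/IAM",
    "abnormal": "alert to SIEM; SIEM may reduce privileges via GRC/IAM",
    "extremely_abnormal": "shutdown of user access + manager alerts",
    "normal": "no action",
}

policy = {"carol": {"hr_db", "finance_db", "source_repo", "design_docs"}}
history = {"carol": {"hr_db"}}

case1 = classify_session("carol", ["payroll"], policy, history)
case2 = classify_session("carol", ["hr_db", "finance_db"], policy, history)
case3 = classify_session("carol", ["finance_db", "source_repo", "design_docs"], policy, history)
```

In cases 2 and 3 the policy check alone would say "yes"; only the comparison against past behavior separates them from normal use.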
How do we build this approach?
Ingest: Ingest both network and wider business information at scale.
Store: Store for both near-real-time and long-term analysis.
Analyze: Create insight into possible anomalous behavior.
Surface: Surface insight to management tools with context.
Act automatically: Take automated action based on risk and potential impact of anomaly.
Investigate: Investigate for final action and to improve algorithms.
GRC, SIEM, Investigator
Use identity and access management to reduce/remove rights automatically
Alert management
Real time, batch, based on business need; swap and switch without re-engineering or recoding
Extendable common platform for whole business, not just security
Ingest as many events as practical for long-term analysis
Ensure closed loop
Typical Use Cases
Visualizing heat maps of issues across an organization by business unit or profile
Profiling systems or devices for indicators of risk, highlighting places where an alert needs to be prioritized over others because of its likelihood of affecting the business
Spotting a compromised host when a particular IP address or user exhibits multiple suspicious characteristics over a week-long period
Providing investigative context after an alert gets triggered to determine the cause or impact of an issue, e.g. if the user downloaded an executable prior to the alert, or the IP accessed a critical asset after triggering the alert
Detecting lateral movement based on active data by using graph analytics to profile user behavior and peers’ behaviors.
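The compromised-host use case (multiple suspicious characteristics from one IP address or user within a week) can be sketched as a sliding-window check. The event format, indicator names and the threshold of three distinct indicators are assumptions made for this example only.

```python
from datetime import datetime, timedelta

def suspicious_hosts(events, window=timedelta(days=7), min_indicators=3):
    """Return hosts showing at least min_indicators distinct indicators in any window.

    events: iterable of (timestamp, host, indicator) tuples (hypothetical format).
    """
    by_host = {}
    for ts, host, indicator in events:
        by_host.setdefault(host, []).append((ts, indicator))
    flagged = set()
    for host, items in by_host.items():
        items.sort()  # order each host's events by time
        for i, (start, _) in enumerate(items):
            # Distinct indicators seen within one window starting at this event.
            seen = {ind for ts, ind in items[i:] if ts - start <= window}
            if len(seen) >= min_indicators:
                flagged.add(host)
                break
    return flagged

events = [
    (datetime(2014, 3, 1), "10.0.0.5", "odd_hours_login"),
    (datetime(2014, 3, 3), "10.0.0.5", "executable_download"),
    (datetime(2014, 3, 6), "10.0.0.5", "critical_asset_access"),
    (datetime(2014, 3, 1), "10.0.0.9", "odd_hours_login"),
    (datetime(2014, 3, 20), "10.0.0.9", "executable_download"),
    (datetime(2014, 3, 25), "10.0.0.9", "critical_asset_access"),
]
flagged_hosts = suspicious_hosts(events)
```

The second host shows the same three indicators, but spread over more than three weeks, so it is not flagged under this rule.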
Business Data Lake Architecture
[Figure: Business Data Lake architecture diagram]
Ingestion tier: real-time ingestion, micro batch ingestion, batch ingestion
Insights tier: real-time insights, interactive insights, batch insights
Unified operations tier: system monitoring, system management
Unified data management tier: data mgmt. services, MDM, RDM, audit and policy mgmt.
Other tiers: distillation tier, processing tier, action tier
Other labels: workflow management; HDFS storage (unstructured and structured data); in-memory; MPP database; real time, micro batch, mega batch; query interfaces (SQL, NoSQL, SQL MapReduce)
Systems shown around the tiers: IAM, SIEM, GRC, Network, Images, Social, HR, AD, Video
Provide platform for future defense capability
Advanced Machine Learning
Advanced Automation
Anticipate Attacks
Enhance through federated sharing of threats
Automated quarantine of resources
MADlib in-database functions
Predictive Modeling Library
Generalized Linear Models
• Linear Regression
• Logistic Regression
• Multinomial Logistic Regression
• Cox Proportional Hazards Regression
• Elastic Net Regularization
• Sandwich Estimators (Huber-White, clustered, marginal effects).
Matrix Factorization
• Singular Value Decomposition (SVD).
Linear Systems
• Sparse and Dense Solvers.
Machine Learning Algorithms
• ARIMA
• Principal Component Analysis (PCA)
• Association Rules (Affinity Analysis, Market Basket)
• Topic Modeling (Parallel LDA)
• Decision Trees
• Ensemble Learners (Random Forests)
• Support Vector Machines
• Conditional Random Field (CRF)
• Clustering (K-means)
• Cross Validation.
Descriptive Statistics
Sketch-based Estimators
• CountMin (Cormode-Muthukrishnan)
• FM (Flajolet-Martin)
• MFV (Most Frequent Values)
Correlation
Summary.
Support Modules
Array Operations
Sparse Vectors
Random Sampling
Probability Functions.
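To give a feel for what a sketch-based estimator such as CountMin does, here is a minimal pure-Python Count-Min sketch. It is not MADlib's in-database implementation and does not use MADlib's API; the class name, table sizes and SHA-256-based hashing are choices made for this illustration. The structure estimates how often an item (for example a user or IP address) appears in a stream using fixed memory, and it can overestimate a count but never underestimate it.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counter with fixed memory (illustrative only)."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # One hash function per row, derived by salting the item with the row number.
        digest = hashlib.sha256(f"{row}:{item}".encode()).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best estimate.
        return min(self.table[row][self._index(row, item)]
                   for row in range(self.depth))

sketch = CountMinSketch()
for user in ["alice"] * 5 + ["bob"] * 2:
    sketch.add(user)
alice_estimate = sketch.estimate("alice")
bob_estimate = sketch.estimate("bob")
```

A wider table lowers the chance of collisions and therefore the overestimate, while more rows lower the chance that every row collides at once.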
The information contained in this presentation is proprietary.
Copyright © 2014 Capgemini. All rights reserved.
Rightshore® is a trademark belonging to Capgemini.
www.pivotal.io/big-data/businessdatalake
www.capgemini.com/bdl
About Capgemini
With almost 140,000 people in over 40 countries, Capgemini is one of the world's foremost providers of consulting, technology and outsourcing services. The Group reported 2013 global revenues of EUR 10.1 billion.
Together with its clients, Capgemini creates and delivers business and technology solutions that fit their needs and drive the results they want. A deeply multicultural organization, Capgemini has developed its own way of working, the Collaborative Business Experience™, and draws on Rightshore®, its worldwide delivery model.