security insights at scale
Security Insights At Scale
Raffael Marty, VP Security Analytics @ Sophos
May 2016
XLDB 2016, Stanford, USA
© Raffael Marty
Disclaimer
"This presentation was prepared solely by Raffael Marty in his personal capacity. The material, views, and opinions expressed in this presentation are the author's own and do not reflect the views of Sophos Ltd. or its affiliates."
Security – Shift Towards Analytics
Past Present Future
Prevention
• Single instance focus
• AV, firewalls, IDS
• Cross entity intelligence
• Synchronized security
Detection
• Data collection and centralization
• Big data technologies
• Machine learning attempts
• Many challenges
• Prediction?
• Machine assisted insights
• UX focus
• Patterns, behaviors, collaboration
+
• Data driven
learn
Why the shift? Attackers use novel and specific methods to compromise each target.
Security
Gaining Insights: Finding novel attacks
Data
• Types of data
  o Time-series (with lots of categorical fields)
  o Context (spatial data) – entities, blacklists, etc.
  o Multiple records for one "transaction" (fusion?)
• Many access use-cases
  o Lookups / joins (external services also)
  o Search, aggregate, compute, … (One interface? (Extended) SQL?)
• Data challenges
  o Collection (many data formats, many transports)
  o Scale (storage cost, access speed)
  o Encryption (transparent, fast)
  o Operational challenges (bottlenecks, etc.)
  o Collaboration (security, transport)
  o How to find relevant insights? Not statistical anomalies!
• Can we get a reference implementation?
The proverbial hairball
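The "lookups / joins" use-case above, where time-series records are combined with context such as entities and blacklists, might be sketched as follows. This is only an illustration under stated assumptions: the `Flow` record, the `ENTITY_OWNER` and `BLACKLIST` tables, and all addresses are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One time-series record (hypothetical schema)."""
    ts: int      # epoch seconds
    src: str     # source IP
    dst: str     # destination IP
    port: int


# Hypothetical context data: an entity inventory and a blacklist.
ENTITY_OWNER = {"10.0.0.5": "alice-laptop", "10.0.0.9": "build-server"}
BLACKLIST = {"203.0.113.7"}


def enrich(flow: Flow) -> dict:
    """Join a flow record with context via simple dictionary/set lookups."""
    return {
        "ts": flow.ts,
        "src": flow.src,
        "dst": flow.dst,
        "port": flow.port,
        "src_entity": ENTITY_OWNER.get(flow.src, "unknown"),
        "dst_blacklisted": flow.dst in BLACKLIST,
    }


flows = [
    Flow(1000, "10.0.0.5", "203.0.113.7", 443),
    Flow(1005, "10.0.0.9", "198.51.100.2", 22),
]
enriched = [enrich(f) for f in flows]
```

In a real deployment the lookups would likely hit an external service or a joined table rather than in-memory dictionaries; the sketch only shows the shape of the enrichment step.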
Analytics
• Mostly anomaly / outlier detection! Finding attacker behavior in the data
  o But what's normal? This is not about statistical outliers!
• Approaches
  o Cohort analysis (users and machines) -> e.g., clustering
  o Hypothesis implementation -> e.g., beacon detection
  o "Learning" behavior -> e.g., interactive visualization of metrics
• Analytics challenges
  o Categorical data
  o Large amounts of data
  o Statistical vs. actual anomalies
  o Distance functions
  o Not a 'closed' system
• We need humans in the loop! And that's where visualization comes in. Analytics drives visualization.
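As one possible reading of "hypothesis implementation -> e.g., beacon detection", the sketch below flags source/destination pairs whose connections arrive at nearly constant intervals. The talk does not specify a method; the coefficient-of-variation heuristic, the function name `find_beacons`, the thresholds, and the sample events are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_beacons(events, min_events=4, max_cv=0.1):
    """Return ((src, dst), mean_gap) for pairs with very regular timing.

    events: iterable of (timestamp_seconds, src, dst).
    A pair is flagged when the coefficient of variation (stddev / mean)
    of its inter-arrival gaps is at most max_cv. Both thresholds are
    illustrative assumptions, not values from the talk.
    """
    times = defaultdict(list)
    for ts, src, dst in events:
        times[(src, dst)].append(ts)

    beacons = []
    for pair, stamps in times.items():
        if len(stamps) < min_events:
            continue
        stamps.sort()
        gaps = [b - a for a, b in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg > 0 and pstdev(gaps) / avg <= max_cv:
            beacons.append((pair, avg))
    return beacons


# Hypothetical events: the first pair connects roughly every 60 seconds,
# the second pair connects at irregular times.
events = [(t, "10.0.0.5", "203.0.113.7") for t in (0, 60, 121, 180, 240)]
events += [(t, "10.0.0.9", "198.51.100.2") for t in (3, 10, 95, 400, 410)]
result = find_beacons(events)
```

A heuristic like this encodes a specific hypothesis about attacker behavior rather than looking for generic statistical outliers, and its hits would still need review by an analyst.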
Visualization – Why?
1. Use analytics to prepare and summarize data.
2. Visualize the output.
3. Help human analysts make decisions and take actions.
Why Visualization?
• SELECT count(distinct protocol) FROM flows;
• SELECT count(distinct port) FROM flows;
• SELECT count(distinct src_network) FROM flows;
• SELECT count(distinct dest_network) FROM flows;
• SELECT port, count(*) FROM flows GROUP BY port;
• SELECT protocol,
    count(CASE WHEN flows <= 200 THEN 1 END) AS [<=200],
    count(CASE WHEN flows >= 201 AND flows <= 300 THEN 1 END) AS [201 - 300],
    count(CASE WHEN flows >= 301 AND flows <= 350 THEN 1 END) AS [301 - 350],
    count(CASE WHEN flows >= 351 THEN 1 END) AS [>=351]
  FROM flows GROUP BY protocol;
• SELECT port, count(distinct src_network) FROM flows GROUP BY port;
• SELECT src_network, count(distinct dest_network) FROM flows GROUP BY src_network;
• SELECT src_network, count(distinct dest_network) AS dn, sum(flows) FROM flows GROUP BY src_network;
• SELECT port, protocol, count(*) FROM flows GROUP BY port, protocol;
• SELECT sum(flows), dest_network FROM flows GROUP BY dest_network;
• etc.
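To make the point about the number of queries concrete, a few of the queries above can be run against a tiny in-memory table. This is a sketch only: the column types and every row of sample data are invented for illustration, and SQLite is used simply because it ships with Python, not because the talk names it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema, inferred from the column names used in the queries.
conn.execute(
    """CREATE TABLE flows (
           src_network TEXT,
           dest_network TEXT,
           protocol TEXT,
           port INTEGER,
           flows INTEGER
       )"""
)

# Hypothetical sample rows.
rows = [
    ("10.0.0.0/24", "192.0.2.0/24", "tcp", 443, 150),
    ("10.0.0.0/24", "198.51.100.0/24", "tcp", 443, 320),
    ("10.0.1.0/24", "192.0.2.0/24", "udp", 53, 250),
    ("10.0.1.0/24", "203.0.113.0/24", "tcp", 22, 400),
]
conn.executemany("INSERT INTO flows VALUES (?, ?, ?, ?, ?)", rows)

# How many distinct ports appear at all?
distinct_ports = conn.execute(
    "SELECT count(DISTINCT port) FROM flows"
).fetchone()[0]

# How many records per port?
per_port = conn.execute(
    "SELECT port, count(*) FROM flows GROUP BY port ORDER BY port"
).fetchall()

# How many distinct destination networks does each source network reach?
fan_out = conn.execute(
    "SELECT src_network, count(DISTINCT dest_network) "
    "FROM flows GROUP BY src_network ORDER BY src_network"
).fetchall()
```

Each query answers one narrow question and returns one small result set, so an analyst has to combine many of them mentally to get the overall picture.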
[Figure: visualization of the flows table; dimensions shown: port, dest_network, protocol, src_network, flows]
Visualization Challenges
• Visualizing 1 TB of data?
• Visualization Mantra by Ben Shneiderman
• Drives backend requirements
• Capture visual learnings – automate findings
Information Visualization Mantra
Overview, Zoom / Filter, Details on Demand
Principle by Ben Shneiderman
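The three stages of the mantra can be read as three kinds of backend request, which is one way to see why visualization drives backend requirements. The sketch below is illustrative only: the function names `overview`, `zoom_filter`, and `details` and the sample records are hypothetical, and a real system would push these operations into the data store rather than run them over in-memory lists.

```python
from collections import Counter

# Hypothetical flow records.
records = [
    {"port": 443, "protocol": "tcp", "src": "10.0.0.5", "flows": 150},
    {"port": 443, "protocol": "tcp", "src": "10.0.0.9", "flows": 320},
    {"port": 53, "protocol": "udp", "src": "10.0.0.5", "flows": 250},
    {"port": 22, "protocol": "tcp", "src": "10.0.0.9", "flows": 400},
]


def overview(recs):
    """Overview: aggregate everything into a small summary (flows per port)."""
    totals = Counter()
    for r in recs:
        totals[r["port"]] += r["flows"]
    return dict(totals)


def zoom_filter(recs, port):
    """Zoom / filter: narrow the data to the part the analyst selected."""
    return [r for r in recs if r["port"] == port]


def details(recs):
    """Details on demand: return the full records, largest first."""
    return sorted(recs, key=lambda r: r["flows"], reverse=True)


summary = overview(records)
selected = zoom_filter(records, 443)
top = details(selected)[0]
```

The overview step needs fast aggregation over all data, while the later steps need selective access to raw records, so the two ends of the mantra stress a backend in different ways.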
Sophos – Security Made Simple
• For non-experts
• Consolidating security capabilities
• Open architecture
• Data science to SOLVE problems, not to highlight issues
Analytics
UTM / Next-Gen Firewall
Wireless
Web
Disk Encryption
File Encryption
Endpoint / Next-Gen Endpoint
Mobile
Server
Sophos Central
[email protected] @raffaelmarty