Machine Learning + Analytics in Splunk
TRANSCRIPT
Copyright © 2015 Splunk Inc.
Operationalizing Machine Learning
Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make.
In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
Why Machine Learning?
Humans are good at learning, but we get lost
in volume and details…
Why do we need Machine Learning?
- Improve Decision Making
- Forecast or Predict KPIs
- Alert on Deviation
- Uncover hidden trends or relationships
All of this requires Diverse Data from across Many Silos. Lots of Unstructured, Real-Time Data.
Run the Business in Real-time
Data From the Past, Real-time Data, Statistical Forecast
T – a few days to T + a few days
Security Operations Center
IT Operations Center
Business Operations Center
Predictive (Models)
Descriptive (BI Tools, Data Lakes)
Grey space
What is Machine Learning?
ML 101: What is Machine Learning?
What: “Field of study that gives computers the ability to learn without being explicitly programmed” – A. Samuel, 1959
How: Generalizing (learning) from examples (data)
Simple ML workflow:
– EXPLORE data
– FIT models based on data
– APPLY models in production
– VALIDATE models
– REPEAT
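The workflow above can be sketched in a few lines of plain Python. This is only an illustration on made-up numbers: the function names (explore, fit, apply_model, validate) and the toy data are assumptions for this example, and a least-squares line stands in for whatever model a real project would use.

```python
from statistics import mean

def explore(xs, ys):
    """EXPLORE: summarize the data before modeling."""
    return {"n": len(xs), "x_mean": mean(xs), "y_mean": mean(ys)}

def fit(xs, ys):
    """FIT: ordinary least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_bar - slope * x_bar}

def apply_model(model, xs):
    """APPLY: score new inputs with the fitted model."""
    return [model["slope"] * x + model["intercept"] for x in xs]

def validate(model, xs, ys):
    """VALIDATE: mean absolute error on data the model has not seen."""
    predictions = apply_model(model, xs)
    return mean(abs(p - y) for p, y in zip(predictions, ys))

# Hypothetical toy data following y = 2x + 10.
train_x, train_y = [1, 2, 3, 4, 5], [12, 14, 16, 18, 20]
test_x, test_y = [6, 7], [22, 24]

summary = explore(train_x, train_y)
model = fit(train_x, train_y)
error = validate(model, test_x, test_y)
# REPEAT: if the error is too high, go back to EXPLORE with more or better data.
```

Keeping validation on held-out data separate from fitting is what makes the REPEAT step meaningful: the error tells you whether another pass is needed.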
How Machines Learn
[Prediction]
• When we see thick clouds and an overcast sky, we predict that it’s likely going to rain
[Estimation / Regression]
• Estimate how much an apartment costs based on its location, condition and prices of properties in that neighborhood
[Classification / Clustering]
• Determine the gender of a person based on her/his features, hair style and the way s/he dresses
[Anomaly Detection]
• Identify the odd one out
[Reinforcement Learning]
• If I made a mistake this time, can I do better next time?
All of us have had some experience in learning. But… what’s behind our experience? How do we translate that knowledge to code?
Major Types of Machine Learning
1. Supervised Learning: generalizing from labeled data
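As a rough illustration of generalizing from labeled data, the sketch below fits a nearest-centroid classifier in plain Python. The feature values and the labels "normal" and "error" are invented for this example, and nearest-centroid is just one simple supervised technique chosen for brevity.

```python
from collections import defaultdict

def fit_centroids(samples):
    """Learn one centroid (mean feature vector) per label from labeled samples."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical labeled examples: (features, label).
labeled = [
    ((1.0, 1.0), "normal"),
    ((1.2, 0.8), "normal"),
    ((8.0, 9.0), "error"),
    ((9.0, 8.0), "error"),
]
centroids = fit_centroids(labeled)
first = predict(centroids, (0.9, 1.1))
second = predict(centroids, (7.5, 8.0))
```

The labels are what make this supervised: the model never has to discover the groups, only where their boundaries lie.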
Major Types of Machine Learning
2. Unsupervised Learning: generalizing from unlabeled data
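For contrast, a minimal sketch of generalizing from unlabeled data: two-cluster k-means on one-dimensional values. The function name kmeans_1d, the min/max seeding, and the sample values are assumptions made to keep the example short and deterministic.

```python
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on 1-D data, seeded with the min and max values."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two current centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical unlabeled measurements with two natural groups.
centers, groups = kmeans_1d([1, 2, 3, 20, 21, 22])
```

No labels are supplied here; the grouping emerges from the data alone.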
Major Types of Machine Learning
3. Reinforcement Learning:
• System is rewarded (or punished) based on the outcomes it generates
• Action leads to a change in the state of the world and generates an error score
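The reward-and-punishment loop of reinforcement learning can be sketched with a tiny greedy agent. Everything here is hypothetical: the action names, the fixed rewards, and the learning rate. The agent also never explores at random, which a real reinforcement learner normally would.

```python
def run_agent(rewards, steps=5, learning_rate=0.5):
    """Pick the action with the best current estimate, then learn from the reward."""
    estimates = {action: 0.0 for action in rewards}
    history = []
    for _ in range(steps):
        # Greedy choice; ties resolve to the first action in the dict.
        action = max(estimates, key=estimates.get)
        reward = rewards[action]
        # Move the estimate toward the observed reward (the "error score").
        estimates[action] += learning_rate * (reward - estimates[action])
        history.append(action)
    return estimates, history

# Hypothetical environment: one action is punished, the other is rewarded.
estimates, history = run_agent({"restart_service": -1.0, "scale_out": 1.0})
```

After one punished attempt the agent switches to the rewarded action and stays there, which is the "can I do better next time?" idea in its simplest form.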
Splunk’s Machine Learning Tour
Overview of ML at Splunk
Core Platform Search, Packaged Premium Solutions, Custom ML
Platform for Operational Intelligence
Search Includes Machine Learning
Core Platform Search is a powerful and highly flexible interface built with ML
anomalydetection
Splunk IT Service Intelligence
Get Data
Define services, entities and KPIs
Monitor and troubleshoot
Analyze and detect
Data-Defined, Data-Driven Service Insights
Packaged ML: Adaptive Thresholds and Anomaly Detection
One of several Premium Solutions
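To give a feel for what an adaptive threshold is, the sketch below derives a separate upper limit for each hour of the day from historical KPI values. It is not the ITSI algorithm: the mean plus k standard deviations rule, the function names, and the sample history are all assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def adaptive_thresholds(history, k=2.0):
    """Map each hour of day to an upper threshold: mean + k * standard deviation."""
    by_hour = defaultdict(list)
    for hour, value in history:
        by_hour[hour].append(value)
    return {hour: mean(vals) + k * pstdev(vals) for hour, vals in by_hour.items()}

def breaches(thresholds, hour, value):
    """True when a KPI value exceeds the threshold learned for that hour."""
    return value > thresholds[hour]

# Hypothetical KPI history as (hour_of_day, value): quiet at 03:00, busy at 15:00.
history = [(3, 10), (3, 12), (3, 14), (15, 100), (15, 110), (15, 120)]
thresholds = adaptive_thresholds(history)
night_alert = breaches(thresholds, 3, 50)
afternoon_alert = breaches(thresholds, 15, 50)
```

The same value of 50 is alarming at 03:00 and unremarkable at 15:00, which a single static threshold cannot express.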
Splunk Machine Learning Toolkit
Assistants: Guide model building, testing, & deploying for common objectives
Showcases: Interactive examples for typical IT, security, business, IoT use cases
Algorithms: 25+ standard algorithms available prepackaged with the toolkit
SPL ML Commands: New commands to fit, test and operationalize models
Python for Scientific Computing Library: 300+ open source algorithms available for use
Build custom analytics for any use case
Extends Splunk platform functions and provides a guided modeling environment
Algorithms supported (v2.0, .conf2016)
ITSI, UBA
Custom Machine Learning – Success Formula
Domain Expertise (IT, Security, …)
Data Science Expertise
Splunk Expertise
Identify use cases
Drive decisions
Set business/ops priorities
SPL
Data prep
Statistics/math background
Algorithm selection
Model building
Splunk ML Toolkit facilitates and simplifies via examples & guidance
Operational success
Summary: The ML Process
Problem: <Stuff in the world> causes big time & money expense.
Value Hypothesis
Solution: Build ML model to forecast <possible incidents>, act pre-emptively & learn
Operationalize
1. Get all the relevant data to the problem; Explore the data
2. Select and Fit an algorithm on the data, generating a model
3. Apply & Validate models until predictions solve the problem
4. Surface the model to XOps, who consume the model to solve the problem
Machine Learning Process with Splunk
Collect Data
Explore/Visualize
Model
Evaluate
Clean/Transform
Publish/Deploy
props.conf, transforms.conf, Data models, Add-ons from Splunkbase, etc.
Pivot, Table UI, SPL, ML Toolkit
Alerts, Dashboards, Reports
Splunk Architecture & ML
Continuous Data Ingest at Scale
Develop, Visualize, Predict, Alert, Search
Engineers, Data Analysts, Security Analysts, Business Users
Native Inputs: TCP, UDP, Logs, Scripts, Wire, Mobile
Industrial Data: SCADA, AMI, Meter Reads
Modular Inputs: MQTT, AMQP, CoAP, REST, JMS
HTTP Event Collector: Token Authenticated Events
Real Time
Technology Partnerships: Kepware, AWS IoT, Cisco, Palo Alto
Maintenance Info
Asset Info
Data Stores
External Lookups / Enrichment
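Of the ingest paths listed on this slide, the HTTP Event Collector is the easiest to show in code. The sketch below only builds a token-authenticated request and does not send it. The host name, token, and event fields are placeholders invented for this example; the endpoint path and the "Splunk <token>" authorization header follow the documented HEC convention.

```python
import json
import urllib.request

def build_hec_request(base_url, token, event, sourcetype="_json"):
    """Build (but do not send) a token-authenticated HEC event request."""
    payload = json.dumps({"event": event, "sourcetype": sourcetype}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/services/collector/event",
        data=payload,
        headers={
            "Authorization": f"Splunk {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Placeholder host and token: substitute values from your own deployment.
request = build_hec_request(
    "https://splunk.example.com:8088",
    "00000000-0000-0000-0000-000000000000",
    {"sensor": "pump-7", "temp_c": 71.5},
)
# Passing the request to urllib.request.urlopen would send it; omitted here
# because the host above does not exist.
```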
OT
Industrial Assets
IT
Consumer and Mobile Devices
Sense and Respond
Real Time, Search, Alert
Third-Party Applications
Smartphones and Devices
Tickets
Send an email
File a ticket
Send a text
Flash lights
Trigger process flow
Every Search Can Use Machine Learning
Splunk: Data Fabric
Real Time
IT users, Analysts, Business Users
Ad Hoc Search
Custom Dashboards
Monitor and Alert
Reports / Analyze
Clickstreams, Hadoop, Devices, Networks
GPS/Cellular
Online Shopping Carts
Servers, Applications
Analysts, Business Users
Data Warehouses
Structured Data Sources
CRM, ERP, HR, Billing, Product, Finance
DB Connect, Look-ups
ODBC, SDK, API
Different lenses into the same data
SCADA Ops Center, Biz Ops Center
IT Ops Center
Compliance
Security Ops Center
Data Reuse = Greater Data Leverage
Fraud Ops Center, etc…
ML Use Cases And Customer Stories
ML Is All Around You!
Recall: EXPLORE > FIT > APPLY > VALIDATE > REPEAT
• Face detection: find faces in images
• Spam filtering: identify spam messages
• Shopping recommendations: predict what customers would like to buy
• Fraud detection: identify credit card transactions which may be fraudulent in nature
• Weather forecast: predict whether or not it will rain tomorrow; estimate daily max/min
Machine Learning Customer Success
Network Incident Detection
Service Degradation Detection
Security / Fraud Prevention
Prioritize Website Issues and Predict Root Cause
Predict Gaming Outages
Fraud Prevention
Machine Learning Consulting Services
Analytics App built on ML Toolkit
Optimizing operations and business results
Cell Tower Incident Detection
Optimize Repair Operations
Entertainment Company
ML Toolkit Customer Use Cases
Speeding website problem resolution by automatically ranking actions for support engineers
Reducing customer service disruption with early identification of difficult-to-detect network incidents
Minimizing cell tower degradation and downtime with improved issue detection sensitivity
Improving cell tower uptime and reducing repair truck rolls with anomaly detection and root cause analysis
Predicting and averting potential gaming outage conditions with finer-grained detection
Ensuring mobile device security by detecting anomalies in ID authentication
Preventing fraud by identifying malicious accounts and suspicious activities
Entertainment Company
Detect Network Outliers
Reduced downtime + increased service availability = better customer satisfaction
ML Use Case
Monitor noise rise for 20,000+ cell towers to increase service and device availability, reduce MTTR
Technical overview
• A customized solution deployed in production based on outlier detection.
• Leverage previous month data and voting algorithms
“The ability to model complex systems and alert on deviations is where IT and security operations are headed… Splunk Machine Learning has given us a head start...”
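The slide does not say which detectors the customer combined, so the following is only a guess at the general shape of outlier detection by voting against a baseline of previous data. The three detectors (z-score, median absolute deviation, padded min/max range), their limits, and the noise-floor readings are all assumptions for illustration.

```python
from statistics import mean, median, pstdev

def zscore_vote(baseline, value, limit=3.0):
    """Vote 'outlier' when the value is more than `limit` std devs from the mean."""
    sd = pstdev(baseline)
    return sd > 0 and abs(value - mean(baseline)) / sd > limit

def mad_vote(baseline, value, limit=3.5):
    """Vote 'outlier' using the modified z-score based on the median absolute deviation."""
    med = median(baseline)
    mad = median(abs(v - med) for v in baseline)
    return mad > 0 and 0.6745 * abs(value - med) / mad > limit

def range_vote(baseline, value, margin=0.25):
    """Vote 'outlier' when the value falls outside the padded baseline range."""
    lo, hi = min(baseline), max(baseline)
    pad = margin * (hi - lo)
    return value < lo - pad or value > hi + pad

def is_outlier(baseline, value, min_votes=2):
    """Flag the value only when at least `min_votes` detectors agree."""
    votes = [
        zscore_vote(baseline, value),
        mad_vote(baseline, value),
        range_vote(baseline, value),
    ]
    return sum(votes) >= min_votes

# Hypothetical noise-floor readings (dBm) standing in for last month's data.
baseline = [-101, -100, -99, -102, -98, -100, -101, -99]
clear_outlier = is_outlier(baseline, -90)
normal_reading = is_outlier(baseline, -99)
borderline = is_outlier(baseline, -96.5)  # only the range detector votes yes
```

Requiring agreement between detectors trades a little sensitivity for fewer false alarms, which matters when the same check runs across thousands of towers.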
Reliable website updates
Proactive website monitoring leads to reduced downtime
ML Use Case
• Very frequent code and config updates (1000+ daily) can cause site issues
• Find errors in server pools, then prioritize actions and predict root cause
Technical overview
• Custom outlier detection built using ML Toolkit Outlier assistant
• Built by Splunk Architect with no Data Science background
“Splunk ML helps us rapidly improve end-user experience by ranking issue severity which helps us determine root causes faster thus reducing MTTR and improving SLA”
Show me the ML!
Next Steps with Splunk ML
• Reach out to your Tech Team! We can help architect ML workflows.
• Lots of ML commands in Core Splunk (predict, anomalydetection, stats)
• ML Toolkit & Showcase – available and free, ready to use
• Splunk ITSI: Applied ML for ITOA use cases
– Manage 1000s of KPIs & alerts
– Adaptive Thresholding & Anomaly Detection
• Splunk UBA: Applied ML for Security
– Unsupervised learning of Users & Entities
– Surfaces Anomalies & Threats
• ML Customer Advisory Program:
– Connect with Product & Engineering teams
– [email protected]
What Else?
• Get the Machine Learning Toolkit from Splunkbase
• Go watch Machine Learning Videos on the Splunk YouTube Channel
http://tiny.cc/splunkmlvideos
• Go watch the Machine Learning talks from .conf2016:
– Advanced Machine Learning in SPL with the Machine Learning Toolkit by Jacob Leverich
– Extending SPL with Custom Search Commands and the Splunk SDK for Python by Jacob Leverich
• Early Adopter And Customer Advisory Program: [email protected]
• Field ML Architects: Andrew Stein (astein@), Brian Nash (bnash@)
Mark Your Calendars!
• .conf2017 is going to DC!
• Sept 26-28, 2017
• Walter E. Washington Convention Center
Thank you!