![Page 1: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/1.jpg)
Copyright©2016Splunk Inc.
JacobLeverichSoftwareEngineer,Splunk
AdvancedMachineLearninginSPLwiththeMachineLearningToolkit
![Page 2: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/2.jpg)
Disclaimer
2
Duringthecourseofthispresentation,wemaymakeforwardlookingstatementsregardingfutureeventsortheexpectedperformanceofthecompany.Wecautionyouthatsuchstatementsreflectourcurrentexpectationsandestimatesbasedonfactorscurrentlyknowntousandthatactualeventsorresultscoulddiffermaterially.Forimportantfactorsthatmaycauseactualresultstodifferfromthose
containedinourforward-lookingstatements,pleasereviewourfilingswiththeSEC.Theforward-lookingstatementsmadeinthethispresentationarebeingmadeasofthetimeanddateofitslivepresentation.Ifreviewedafteritslivepresentation,thispresentationmaynotcontaincurrentoraccurateinformation.Wedonotassumeanyobligationtoupdateanyforwardlookingstatementswemaymake.Inaddition,anyinformationaboutourroadmapoutlinesourgeneralproductdirectionandissubjecttochangeatanytimewithoutnotice.Itisforinformationalpurposesonlyandshallnot,beincorporatedintoanycontractorothercommitment.Splunkundertakesnoobligationeithertodevelopthefeaturesor
functionalitydescribedortoincludeanysuchfeatureorfunctionalityinafuturerelease.
![Page 3: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/3.jpg)
WhoamI?Splunker for2years,basedinSanFrancisco
Engineeringleadfor…– MLToolkitandShowcaseApp– ITSIAnomalyDetectionandAdaptiveThresholdingfeatures– Splunk customsearchcommandinterface
Initialauthoroffit/applycommandsinMLToolkit
Die-hardLonghornsfan
3
![Page 4: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/4.jpg)
Agenda
MachineLearning+SplunkML-SPL:MachineLearninginSPL– Whatitis– Howitworks
OverviewofAlgorithmsandAnalyticsavailableinML-SPLTipsforFeatureEngineeringinSPLWrapup
4
![Page 5: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/5.jpg)
MachineLearning+Splunk
![Page 6: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/6.jpg)
MachineLearningisNotMagic
…it’saprocess.
Theprocessstartswithaquestion:– HowmanyrequestsdoIexpectinthenexthour?– Howlikelyisthisharddrivetofailinthenearfuture?– AmIbeinghacked?
ê IsitunexpectedforJoetologintothebastionhostat2am?
6
![Page 7: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/7.jpg)
MachineLearningisNotMagic
…it’saprocess.
7
CollectData
Explore/Visualize
Model
Evaluate
Clean/Transform
Publish/Deploy
![Page 8: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/8.jpg)
8
“CleaningBigData:MostTime-Consuming,LeastEnjoyableDataScienceTask,SurveySays”,ForbesMar23,2016
![Page 9: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/9.jpg)
Splunk forDataPreparation
9
CollectData
Explore/Visualize
Model
Evaluate
Clean/Transform
Publish/Deploy
props.conf,transforms.conf,DatamodelsAdd-onsfromSplunkbase,etc.
Pivot,TableUI,SPLMLToolkit
Alerts,Dashboards,Reports
![Page 10: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/10.jpg)
ML-SPL:MachineLearninginSPL
![Page 11: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/11.jpg)
ML-SPL:Whatisit?AsuiteofSPLsearchcommandsspecificallyforMachineLearning:– fit– apply– summary– listmodels– deletemodel– sample
ImplementedusingmodulesfromthePythonforScientificComputingadd-onforSplunk:– scikit-learn,numpy,pandas,statsmodels,scipy
![Page 12: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/12.jpg)
ML-SPLCommands:A“grammar”forMLFit(i.e.train)amodelfromsearchresults… | fit <ALGORITHM> <TARGET> from <VARIABLES …>
<PARAMETERS> into <MODEL>
Applyamodeltoobtainpredictionsfrom(new)searchresults… | apply <MODEL>
Inspectthemodelbuiltby<ALGORITHM> (e.g.displaycoefficients)| summary <MODEL>
![Page 13: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/13.jpg)
ML-SPLCommands:fit
13
… | fit <ALGORITHM> <TARGET> from <VARIABLES …><PARAMETERS> into <MODEL>
Examples:
… | fit LinearRegressionsystem_temp from cpu_load fan_rpminto temp_model
… | fit KMeans k=10downloads purchases posts days_active visits_per_dayinto user_behavior_clusters
… | fit LinearRegressionpetal_length from species
optional
![Page 14: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/14.jpg)
14
![Page 15: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/15.jpg)
fit:HowItWorks
15
1. Discardfieldsthatarenullforallsearchresults.2. Discardnon-numericfieldswith>100distinctvalues.3. Discardsearchresultswithanynullfields.4. Convertnon-numericfieldstobinaryindicatorvariables
(i.e.“dummycoding”).5. Converttoanumericmatrixandhandoverto<ALGORITHM>.6. Computepredictionsforallsearchresults.7. Savethelearnedmodel.
![Page 16: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/16.jpg)
fit:HowItWorks
16
1. Discardfieldsthatarenullforallsearchresults.
field_A field_B field_C field_D field_E
ok 41 red 172.24.16.5
ok 32 green 192.168.0.2
FRAUD 1 blue 10.6.6.6
ok 43 171.64.72.1
2 blue 192.168.0.2
Target ExplanatoryVariables…
… | fit LogisticRegression field_A from field_*
![Page 17: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/17.jpg)
fit:HowItWorks
17
2. Discardnon-numericfieldswith>100distinctvalues.
field_A field_B field_D field_E
ok 41 red 172.24.16.5
ok 32 green 192.168.0.2
FRAUD 1 blue 10.6.6.6
ok 43 171.64.72.1
2 blue 192.168.0.2
Target ExplanatoryVariables…
… | fit LogisticRegression field_A from field_*
![Page 18: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/18.jpg)
fit:HowItWorks
18
3. Discardsearchresultswithanynullfields.
field_A field_B field_D
ok 41 red
ok 32 green
FRAUD 1 blue
ok 43
2 blue
Target ExplanatoryVariables…
… | fit LogisticRegression field_A from field_*
![Page 19: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/19.jpg)
fit:HowItWorks
19
field_A field_B field_D
ok 41 red
ok 32 green
FRAUD 1 blue
Target ExplanatoryVariables…
… | fit LogisticRegression field_A from field_*
4. Convertnon-numericfieldstobinaryindicatorvariables.
field_A field_B field_D=red …=green …=blue
ok 41 1 0 0
ok 32 0 1 0
FRAUD 1 0 0 1
![Page 20: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/20.jpg)
fit:HowItWorks
20
5. Converttoanumericmatrixandhandoverto<ALGORITHM>.y = X=
… | fit LogisticRegression field_A from field_*
[1,1,0] [[41,1,0,0],[32,0,1,0],[1,0,0,1]]
𝑦" = 1
1 + 𝑒((*+,)Find𝜃 usingmaximumlikelihoodestimation.
e.g.forLogisticRegression:
Modelinferencegenerallydelegatedtoscikit-learnandstatsmodels.(e.g.sklearn.linear_model.LogisticRegression)
![Page 21: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/21.jpg)
fit:HowItWorks
21
6. Computepredictionsforallsearchresults.
field_A field_B field_C field_D field_E predicted(field_A)
ok 41 red 172.24.16.5 ok
ok 32 green 192.168.0.2 ok
FRAUD 1 blue 10.6.6.6 FRAUD
ok 43 171.64.72.1 ok
2 blue 192.168.0.2 FRAUD
Target ExplanatoryVariables…
… | fit LogisticRegression field_A from field_*
Prediction
![Page 22: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/22.jpg)
22
![Page 23: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/23.jpg)
23
![Page 24: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/24.jpg)
24
![Page 25: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/25.jpg)
fit:HowItWorks
25
7. Savethelearnedmodel.
Serializemodelsettings,coefficients,etc.intoaSplunk lookuptable.• ReplicatedamongstmembersofSearchHeadCluster.• AutomaticallydistributedtoIndexerswithsearchbundle.
… | fit LogisticRegression field_A from field_* into logreg_model
![Page 26: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/26.jpg)
26
![Page 27: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/27.jpg)
fit:Properties
27
Eacheventisan“example”forthelearningalgorithm.
Resilienttomissingvalues.(butbecareful!)
Automaticallyhandlescategorical(e.g.non-numeric)fields.
SAVESITSWORK:– Learnedmodelcanbeappliedtonew,unseen data
withtheapply command.
![Page 28: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/28.jpg)
fit:Scalability
28
Somealgorithmsareinherentlynotscalable.– e.g.Kernel-basedSupportVectorMachinesis𝑂 𝑁1
Inputissampledusingreservoirsampling.– Per-algorithmsamplereservoirsize,typically100,000events– Configurableinmlspl.conf.
Somealgorithmssupportincrementalfitting,e.g.:SGDRegressor,SGDClassifier,NaiveBayes– Use“partial_fit=t”optionwithfit command.– Nosampling,noeventlimit!
Forthemostpart,youdon’tneedtocare.
![Page 29: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/29.jpg)
ML-SPLCommands:apply
29
… | apply <MODEL>
Examples:
… | apply temp_model
… | apply user_behavior_clusters
… | apply petal_length_from_species
![Page 30: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/30.jpg)
30
![Page 31: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/31.jpg)
apply:HowItWorks
31
1. Loadthelearnedmodel.2. Discardfieldsthatarenullforallsearchresults.3. Discardnon-numericfieldswith>100distinctvalues.4. Convertnon-numericfieldstobinaryindicatorvariables
(i.e.“dummycoding”).5. Discardvariablesnotinthelearnedmodel.6. Fillmissingfieldswith0’s.7. Converttoanumericmatrixandhandoverto<ALGORITHM>.8. Computepredictionsforallsearchresults.
![Page 32: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/32.jpg)
field_A field_B field_D
ok 41 red
ok 32 green
FRAUD 1 blue
41 yellow
field_A field_B field_D=red …=green …=blue …=yellow
ok 41 1 0 0 0
ok 32 0 1 0 0
FRAUD 1 0 0 1 0
41 0 0 0 1
apply:HowItWorks
32
Target ExplanatoryVariables…
… | apply fraud_model
4. Convertnon-numericfieldstobinaryindicatorvariables.
![Page 33: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/33.jpg)
field_A field_B field_D=red …=green …=blue …=yellow
ok 41 1 0 0 0
ok 32 0 1 0 0
FRAUD 1 0 0 1 0
41 0 0 0 1
apply:HowItWorks
33
Target ExplanatoryVariables…
… | apply fraud_model
5. Discardvariablesnotinthelearnedmodel.
![Page 34: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/34.jpg)
apply:HowItWorks
34
5. Converttoanumericmatrixandhandoverto<ALGORITHM>.
y = X=
… | apply fraud_model
[1,1,0,1,?] [[41,1,0,0],[32,0,1,0],[1,0,0,1],[41,0,0,0]]
𝑦" = 1
1 + 𝑒((*+,)Compute𝑦" usingθ foundbyfit command.
e.g.forLogisticRegression:
![Page 35: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/35.jpg)
apply:HowItWorks
35
7. Computepredictionsforallsearchresults.
field_A field_B field_C field_D field_E predicted(field_A)
ok 41 red 172.24.16.5 ok
ok 32 green 192.168.0.2 ok
FRAUD 1 blue 10.6.6.6 FRAUD
ok 43 171.64.72.1 ok
41 yellow 192.168.0.2 ok
Target ExplanatoryVariables…
… | apply fraud_model
Prediction
![Page 36: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/36.jpg)
apply:Properties
36
Learnedmodelscanbeappliedtonew,unseen data.
| fit isto| apply
as
| outputlookup isto| lookup
Resilienttomissingvalues.(but,again,becareful!)
Automaticallyhandlescategorical(e.g.non-numeric)fields.
![Page 37: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/37.jpg)
apply:Scalability
37
Nolimits.
Whenpossible,executesattheIndexingtier.– Fullyparallelized;harnesstheCPUpowerofyourIndexingCluster.– Mustset“streaming_apply = true”inmlspl.conf.
![Page 38: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/38.jpg)
ML-SPLCommands:summary
38
… | summary <MODEL>
Examples:
… | summary temp_model
… | summary user_behavior_clusters
… | summary petal_length_from_species
![Page 39: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/39.jpg)
39
![Page 40: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/40.jpg)
40
𝑦" = 1
1 + 𝑒((*+,)
![Page 41: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/41.jpg)
AlgorithmsandAnalyticsinML-SPL
![Page 42: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/42.jpg)
RegressionAlgorithms(e.g.predictnumericfields)
LinearRegression– …includingLasso,Ridge,ElasticNet
KernelRidgeDecisionTreeRegressorRandomForestRegressorSGDRegressor
Allimplementedwithsklearn models.
![Page 43: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/43.jpg)
ClassificationAlgorithms(e.g.predictcategoricalfields)
LogisticRegressionDecisionTreeClassifierRandomForestClassifierSGDClassifierSVMNaïve Bayes– IncludingBernoulliNB andGuassianNB
![Page 44: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/44.jpg)
ClusteringAlgorithms(e.g.grouplikewithlike)
KMeansDBSCANBirchSpectralClustering
![Page 45: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/45.jpg)
FeatureEngineeringAlgorithms(e.g.datapre-processing)
TFIDF (term-frequencyxinversedocument-frequency)– Transformfree-formtextintonumericfields
StandardScaler (i.e.normalization)FieldSelector (i.e.chooseKbestfeaturesforregression/classification)PCA andKernelPCA
![Page 46: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/46.jpg)
“Pipeline”MultipleAlgorithms
46
Example:TextAnalytics– TFIDF totransformfree-formmessagesintonumericfields,
followedby…ê KMeans togroupsimilarmessagesê BernoulliNB toclassifymessages(e.g.accordingtosentiment)ê PCA tovisualizedistributionofmessages
– … | fit TFIDF message | fit Kmeans message_tfidf_* | ...
AnalogoustoPipelineconceptfromsklearn orSparkMLLib
![Page 47: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/47.jpg)
47
![Page 48: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/48.jpg)
48
![Page 49: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/49.jpg)
“Pipeline”MultipleAlgorithms
49
ML-SPLanalyticsarestackable.
VeryadvancedMLuse-casesaresuccinctlyexpressible.
![Page 50: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/50.jpg)
TipsforFeatureEngineering
![Page 51: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/51.jpg)
TipsforFeatureEngineeringWorkonaggregates,notrawevents.– DONOTusefiton1,000,000,000events.DOusestats.
Useeval tocomputenewfeatures.
Usestreamstats toconstructleadingindicators.
…
![Page 52: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/52.jpg)
Workonaggregates,notrawevents… | fit KMeans k=10
downloads purchases posts days_active visits_per_dayinto user_behavior_clusters
Usestats andlookuptablestoconstructfeatures:
index=activity_logs| stats count by action user_id| xyseries user_id action count | fillnull| lookup user_activity user_id
OUTPUT days_active visits_per_day| fit KMeans k=10 …
![Page 53: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/53.jpg)
Useeval tocomputenewfeatures
53
Coercenumbersintocategoriesbyprependingastring:– … | eval region_id = “Region ” + region_id | …
Modelinteractionsbetweenfeatures:– … | eval X_factor = importance * urgency | …– Use+forcategoricalfields,*fornumeric
Makenon-linearfeaturesoutofnumericvalues:– … | eval temperature = pow(temperature,2) | …– … | eval latency = log(latency) | …
![Page 54: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/54.jpg)
54
![Page 55: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/55.jpg)
55
![Page 56: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/56.jpg)
Usestreamstats forleadingindicators
56
index=application_log OR index=tickets| timechart span=1d count(failure) as FAILS,
count(“Change Request”) as CHANGES| reverse| streamstats window=3 sum(FAILS) as FAILS_NEXT_3DAYS| reverse| fit LinearRegression FAILS_NEXT_3DAYS from CHANGES
into FAILS_PREDICTION_MODEL
![Page 57: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/57.jpg)
Wrap-up
![Page 58: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/58.jpg)
Whatdidwecover?
MachineLearning+SplunkML-SPL:MachineLearninginSPL– Whatitis– Howitworks
OverviewofAlgorithmsandAnalyticsavailableinML-SPLTipsforFeatureEngineeringinSPL
58
![Page 59: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/59.jpg)
WhatNow?
59
InstalltheMLToolkitfromSplunkbase!– http://tiny.cc/splunkmlapp
Don’tmissManishSainani’s orAdamOliner’s talks!
ProductManager:ManishSainani <[email protected]>FieldExpert:AndrewStein<[email protected]>Me:JacobLeverich<[email protected]>
![Page 60: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/60.jpg)
THANKYOU
![Page 61: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/61.jpg)
fit:Misc.details
61
Multi-classclassificationproblemstypicallymodeledas“one-vs-rest”
SomealgorithmsdoNOTsupportsavedmodels,e.g.:– DBSCANandSpectralClustering
![Page 62: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/62.jpg)
ML-SPLCommandsfit <ALGORITHM> <TARGET> from <VARIABLES …> <PARAMETERS> into <MODEL>– Fit(i.e.train)amodelfromsearchresults
apply <MODEL>– Applyamodeltoobtainpredictionsfrom(new)searchresults
summary <MODEL>– Inspectthemodelinferredby<ALGORITHM> (e.g.,displaycoefficients)
![Page 63: Advanced Machine Learning in SPL with the …...Advanced Machine Learning in SPL with the Machine Learning Toolkit Disclaimer 2 During the course of this presentation, we may make](https://reader030.vdocuments.mx/reader030/viewer/2022041017/5ecb1615175edb27d35fcdf4/html5/thumbnails/63.jpg)
SlideTitle