real-time alarm verifica/on with spark streaming and ... · 1/23/2017  · real-time alarm...

54
Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger Zurich University of Applied Sciences Swiss Big Data User Group Mee3ng January 23, 2017

Upload: others

Post on 03-Jun-2020

13 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Real-TimeAlarmVerifica/onwithSparkStreamingandMachineLearning

AnaSima,JanStampfli,KurtStockingerZurichUniversityofAppliedSciences

SwissBigDataUserGroupMee3ng

January23,2017

Page 2: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

AboutMe

•  Prof.Dr.KurtStockinger•  Sincesummer2013atZHAW

•  Databases,DataWarehousing,BigData

•  2007-2013:•  DataWarehouse&BusinessIntelligenceArchitect

•  2004-2007:•  ComputerScien3st

•  1999-2003:•  ComputerScien3st

2

Page 3: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

WhatistheProblemwithAlarmSystems?

3

Page 4: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ATypicalExample

4

Seriousuniversi3es…

Page 5: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

…withSeriousStudents

5

Page 6: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ASeriousPartyinaStudentHome

6

Page 7: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

…SeriousFireFighters

7

Page 8: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

WhyMachineLearning?

•  Around90%ofreportedincidentsarefalsealarms•  Humanoperatorsshouldpriori3zetruealarms•  Machinelearningisableto

•  discoverthepa_ernsburiedinthealarmdata•  separatetruefromfalsealarmswithahighaccuracy

8

Page 9: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

9

DIGITAL FOOTPRINT OF A SECURED OBJECT

Alarm panel

Transmitter

User behavior

Alarm types

User behavior will improve overall process User based business models are possible with machine learning tools e.g. Services payments only when an alarm panel is activated

Page 10: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ResearchProjectwithSitasys

10

Page 11: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

TalkOutline

•  Machinelearningforalarmverifica3on

•  End-to-endperformanceanalysis:•  Streamprocessing(liveanalysis)•  Batchprocessing(historicanalysis)•  Onlinemachinelearning(liveverifica3on)

11

Page 12: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

MachineLearning

withSpark

12

Page 13: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

AboutMe

JanStampfli•  ZHAWBachelorinComputerScience

•  Graduatedin2014•  ResearchassistantatZHAW

•  SinceSeptember2014

•  Workinginresearchprojectsonthetopics•  BigData•  MachineLearning•  Informa3onRetrieval

13

The image cannot be displayed. Your computer may not have enough memory to open the image, or the image may have been corrupted. Restart your computer, and then open the file again. If the red x still appears, you may have to delete the image and then insert it again.

Page 14: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ATypicalMachineLearningProcess

14

Page 15: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

SparkMLPipeline

h_p://spark.apache.org/docs/latest/ml-pipeline.html

15

loca/on labelBern 1Chur 0Sion 1

hash206691120996822577109

predic/on probabili/es0 [0.51,0.49]0 [0.99,0.01]1 [0.13,0.87]

DataFrame

Pipeline

PipelineTransformer EstimatorTransformerTransformer

EstimatorEstimator

Page 16: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Pipeline

SparkMLTuning

•  RandomForest

•  Numberoftrees•  Maxdepthoftrees

•  ParameterMap•  [10,20,30]•  [5,25,50]

16

DataFrame

CrossValidator

Pipeline Evaluator

h_p://spark.apache.org/docs/latest/ml-tuning.html

Page 17: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

SparkMLEvalua3on

h_p://spark.apache.org/docs/latest/mllib-evalua3on-metrics.html 17

PredictionDataFrame

label probabili/es1 [0.51,0.49]0 [0.99,0.01]1 [0.13,0.87]

Evaluator

AccuracyPrecision

Recall

Page 18: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

AppliedMachineLearning

18

Page 19: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

DataaboutRealAlarms

•  Features

•  Loca3on,sensortype,dayofweek,…•  Labels

•  0=falsealarm•  1=truealarm

19

Total Falsealarms TruealarmsTotal 339’841 165’997 173’844Training 169’920 84’960 84’960Test 169’921 81’037 88’884

Page 20: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

AppliedAlgorithms

•  RandomForest

•  SupportVectorMachine

•  Logis3cRegression

•  DeepNeuralNetwork(DNN)

20

Page 21: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Results

21

JanStampfli,KurtStockinger(2016).ApplieddataScience:UsingMachineLearningforAlarmVerifica3on,ERCIMNews,107,10.

Results

21

Accuracy

92.33% 92.05%

Page 22: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

•  Confusionmatrix

•  Precision(correctlypredictedalarms)

•  Recall(correctlydetectedalarms)

22

Evalua3on–RandomForest

2222

Predic/onTrue Predic/onFalse

LabelTrue 82’999 5’885

LabelFalse 7’148 73’889

Predic/onTrue Predic/onFalse

LabelTrue 92.07%

LabelFalse 92.62%

Predic/onTrue Predic/onFalse

LabelTrue 93.38%

LabelFalse 91.18%

Page 23: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ConclusionaboutML

•  Humanoperatorsarenowabletopriori3zealarmspredictedastrue

•  Falsealarmsmusts3llbeevaluated

23

Page 24: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Performance

24

Page 25: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

AboutMe

AnaSima

•  EPFLMasterinSWSystems,2015

•  Previouslyworkedonaframeworkforunifiedstream&batchprocessing(Cyclone)

•  JoinedZHAWsinceNovember201625

Page 26: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Welcomebackfromtheholidays!

26

Page 27: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Whyisalarmprocessinga3me-cri3calmission?

27

Page 28: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Whenthingsgowrong…

28

It’sallaboutresponse3me!

Page 29: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Performancetes3ng

29

DataConsumer

Streaming

Historic

MachineLearning

DataGenerator

Alarmstream

Page 30: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Designingtheconsumer

•  Spark-”AunifiedengineforBigDataProcessing”

30

SQL

StreamingMachineLearning

GraphProcessing

DataFrames

Kryo Gson Jackson

DataSets

RDDs

Page 31: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Noonesizefitsall,sohowdoIchoose???

31

Page 32: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Thebasecase

32

Producer

WriteNalarms/s

Consumer

Project&FilterDis3nctComputeHistogramPredictTrue/False

Alarms

Historicquery

Page 33: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

ConsumerWorkflow

•  Everywindowof10s•  Streaming

•  mAddrsInWindow = alarms.map(Alarm::getMacAddress).distinct();

•  Historic•  Select MacAddress, count(*) from mongoDB Where MacAddress in mAddrsInWindow

Group By MacAddress

•  MachineLearning•  (Coveredinfirsthalfoftalk)

33

Page 34: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

34

SAVEBenchmarkIden3fyingBo_lenecks

1

ZHAWDatalab

Page 35: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Iden3fyingbo_lenecks(1)

1.  Startedexperimentwith•  8-coreKataProducer•  8-coreSparkConsumer

2.  Consumerslowevenforabasicstreamingtask•  Countnumberofelementsinwindow=>afewsec

3.  Producernotoutpuungexpectedthroughput•  Stumbled@around12K/s(recordsize≃600B) 35

Page 36: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Commonrootcause?

36

Producer

Serialize,(fasterxmlJackson)WriteNalarms/s

ConsumerDeserialize,

Project&FilterDis3nctComputeHistogramPredictTrue/False

SerializedAlarms

Historicquery

Page 37: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Doesitreallyma_er?

•  Per-objectserializa3ononlytakesafewtensofns

37

•  Mul3plythatby12K..

•  Andbtw,haveitallsentin1s!

Page 38: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Areweusingtherighttool?

•  Answer:benchmarks*!

•  ”IfyourenvironmentprimarilydealswithlotsofsmallJSONrequests[…]thenGSONisyourlibraryofinterest.Jacksonstrugglesthemostwithsmallfiles.”

•  Jackson:upto3xslowerforsmallobjects

*TheulMmateJSONlibrary:hRp://blog.takipi.com/the-ulMmate-json-library-json-simple-vs-gson-vs-jackson-vs-json/

38

Page 39: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

39

SAVEBenchmarkIden3fyingBo_lenecks

2

ZHAWDatalab

Page 40: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Results(1)

05,00010,00015,00020,00025,00030,000

Jackson Gson

Producermaxthroughput(alarms/s)

+

40*<30LOCchange=>�~2xspeedupinbothProducer&Consumer

05,00010,00015,00020,00025,00030,000

Jackson Gson

Consumermaxthroughput(alarms/s)

Page 41: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Howmuchwillfitin10s?

41

2.91

2.3

0.02

0.02

0.02

7

5

7.5

0

2

4

6

8

10

12

Kata+Jackson(3.5Kalarms)

Kata+GSON(3.5Kalarms)

Kata+GSON(5.7Kalarms)

Breakdownof/me/subtask

Streaming Historic MachineLearning

Note:thealarmsRDDisusedforboththestreaming&theMLpart=>serializerchangebenefitsboth

Page 42: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Iden3fyingbo_lenecks(2)

1.  Caching?•  Broughtsomeimprovement,butsmall(~5%)•  Theamountofdatareusedistooli_le(20MB)

2.  Simplifyingthedesign?•  TheKISS*principle

•  KeepItSimple,Stupid!

42

Page 43: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Cappedcollec3on

Simplifica3on

43

Alarms

Cappedcollec3on

Producer

WriteNalarms/s

Consumer

Project&FilterDis3nctComputeHistogramPredictTrue/False

Alarms

Page 44: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

44

SAVEBenchmarkIden3fyingBo_lenecks

3

ZHAWDatalab

Page 45: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Results(2)

Producer

05,00010,00015,00020,00025,00030,000

Maxthroughput(alarms/s)

45

05,00010,00015,00020,00025,00030,000

Maxthroughput(alarms/s)

Note:Producerslow,buts3ll@previousmaxconsumerthroughput!Consumer:~50%speedup.

+

Page 46: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Howmuchwillfitin10s?

46

2.91 1

0.02

0.02 0.02

7

5

2

0

2

4

6

8

10

12

Kata+Jackson(3.5Kalarms)

Kata+GSON(3.5Kalarms)

Mongo+GSON(3.5Kalarms)

Breakdownof/me/subtask

Streaming Historic MachineLearning

Page 47: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Rootcause?

•  UsingKatadirectstreamwithasinglepar33on=>poorparallelism•  Mostopera3onsrunonasingleexecutor

•  EventhoughSparkisconfiguredtorunon8cores

•  UsingMongoDB=>finergrainedcontroloverparallelismlevelfortheRDD•  Readanarrayofdocumentsfromtailablecursor•  ParallelizearrayintoRDD&runMLoneachpart

47

Page 48: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

FIXIT!

ConfiguringtheKataDirectStreaminSparkwithproperseungs….

(numpar33ons=numcores)

48

Page 49: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Results(3)

Producer

05,00010,00015,00020,00025,00030,000

Maxthroughput(alarms/s)

49

05,00010,00015,00020,00025,00030,000

Maxthroughput(alarms/s)

+

Page 50: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Designingtheconsumer

•  Spark-”AunifiedengineforBigDataProcessing”

50

SQL

StreamingMachineLearning

GraphProcessing

DataFrames

Kryo Gson Jackson

DataSets

RDDs

Easytousetools…cansome/mesbetrickytogetright

Page 51: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Conclusions & Lessons Learned

•  Machine learning algorithm of Spark can be directly applied •  Works well for small and big data •  No re-writing necessary to make algorithm scale across computing cluster

•  Spark Streaming can’t be used out of the box for stream and batch processing: •  Need to use a persistency layer •  Requires significant performance tuning

•  Working with our industry partner Sitasys: •  Great collaboration •  Working with real-world problem is very rewarding •  Can make real contribution and enhance existing algorithms and

technology 51

Page 52: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Appendix

52

Page 53: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

PipelineExampleCode(Java)//CreatestringindexerTransformer.StringIndexerstringIndexer=newStringIndexer()

.setInputCol("loca/on").setOutputCol("indexLoca/on");//CreaterandomforestEsMmator.RandomForestClassifierrandomForestClf=newRandomForestClassifier()

.setFeaturesCol("indexLoca/on").setLabelCol("label");//CreatepipelineEsMmator(specifyingtheMLworkflow).Pipelinepipeline=newPipeline()

.setStages(newPipelineStage[]{stringIndexer,randomForestClf});//CreatecrossvalidaMonEsMmator(wrappingthepipelineEsMmator).ParamMap[]paramGrid=newParamGridBuilder()

.addGrid(randomForestClf.numTrees()(),newint[]{10,20,30}).addGrid(randomForestClf.maxDepth(),newint[]{5,25,50}).build();

CrossValidatorcrossValidator=newCrossValidator().setEs3mator(pipeline).setEs3matorParamMaps(paramGrid).setEvaluator(newBinaryClassifica3onEvaluator()).setNumFolds(10);

53

Page 54: Real-Time Alarm Verifica/on with Spark Streaming and ... · 1/23/2017  · Real-Time Alarm Verifica/on with Spark Streaming and Machine Learning Ana Sima, Jan Stampfli, Kurt Stockinger

Training,Tes3ngandEvalua3onExampleCode(Java)//Given(creaMonofdatasetnoincludedinexamplecode)Dataset<Row>trainSet=…;Dataset<Row>testSet=…;//Fitthecrossvalidatortotrainingdocuments.CrossValidatorModelmodel=crossValidator.fit(trainSet);//MakepredicMonsontestdocuments.Dataset<Row>predic3ons=model.transform(testSet);//Createevaluator.Mul3classClassifica3onEvaluatorevaluator=newMul3classClassifica3onEvaluator()

.setLabelCol("label").setPredic3onCol("predic/on");//EvaluatepredicMons.evaluator.setMetricName("accuracy");doubleaccuracy=evaluator.evaluate(predic3ons); 54