Ashfaq Munshi, ML7 Fellow, Pepperdata

Classifying Multivariate Time Series Scalably
Ashfaq Munshi, Saeed Bidhendi, Faramarz Munshi
November 10, 2017

Upload: mlconf

Post on 22-Jan-2018


TRANSCRIPT

Page 1: Ashfaq Munshi, ML7 Fellow, Pepperdata

Classifying Multivariate Time Series Scalably

Ashfaq Munshi, Saeed Bidhendi, Faramarz Munshi

November 10, 2017

Page 2: Ashfaq Munshi, ML7 Fellow, Pepperdata

Overview

• Background and Motivation

• Univariate Time Series (UTS)

• Multivariate Time Series (MTS)

• Conclusion

© Pepperdata, Inc.

Page 3: Ashfaq Munshi, ML7 Fellow, Pepperdata

Background

Page 4: Ashfaq Munshi, ML7 Fellow, Pepperdata
Page 5: Ashfaq Munshi, ML7 Fellow, Pepperdata

Pepperdata Telemetry Data Scale

Example production deployment:

• 570 nodes

• 20 tasks / node

• 300 metrics / task

• 5-second sampling

• 41 million points / minute
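The per-minute figure follows directly from the other four numbers on this slide. A minimal check of the arithmetic (the variable names are illustrative, not from the talk):

```python
# Back-of-the-envelope check of the telemetry volume quoted on the slide.
nodes = 570
tasks_per_node = 20
metrics_per_task = 300
sample_interval_s = 5

# One sample sweep collects one point per metric per task per node.
points_per_sample = nodes * tasks_per_node * metrics_per_task

# With 5-second sampling there are 12 sweeps per minute.
samples_per_minute = 60 // sample_interval_s

points_per_minute = points_per_sample * samples_per_minute
print(f"{points_per_minute:,} points / minute")
```

This gives 41,040,000 points per minute, which the slide rounds to 41 million.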

Page 6: Ashfaq Munshi, ML7 Fellow, Pepperdata

Our Big Data About Production Big Data

• 300 trillion performance data points collected

• 22 thousand production nodes

• 50 million jobs / year

Page 7: Ashfaq Munshi, ML7 Fellow, Pepperdata

Example Time Series


Page 8: Ashfaq Munshi, ML7 Fellow, Pepperdata

Characteristics of our TS

• Highly variable in length

  • 10 data points to 10K+ data points

• Missing data

• Extremely noisy

Page 9: Ashfaq Munshi, ML7 Fellow, Pepperdata

Problem

Classify this collection of time series to give operators a better understanding of resource utilization on their clusters and to enable a scheduler to better optimize cluster resources

Page 10: Ashfaq Munshi, ML7 Fellow, Pepperdata

Univariate Time Series

Page 11: Ashfaq Munshi, ML7 Fellow, Pepperdata

Approaches and Data Set

• Two recent approaches from the literature

  • Transform the TS into an image, then use a tiled CNN [Wang & Oates 2015]

  • Transform the TS into a bag of patterns [Schäfer & Leser 2017]

• Dataset is the UCR data set

  • 82 time series data sets

  • Number of series < 10K

  • Data points per series < 2K

Page 12: Ashfaq Munshi, ML7 Fellow, Pepperdata

Time Series and Images

• Map the time series into

  • Gramian Angular Summation Fields

  • Gramian Angular Difference Fields

  • Markov Transition Fields

• Feed images into a tiled CNN for classification

[Wang & Oates, 2015]
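The slide only names the Markov Transition Field. A minimal sketch of one common way to build it is shown here, assuming quantile bins and a first-order transition matrix; the function name `markov_transition_field` and the default of 4 bins are illustrative choices, not taken from the talk.

```python
import numpy as np

def markov_transition_field(x, n_bins=4):
    """Return an (n, n) image whose entry (i, j) is the estimated
    probability of moving from the bin of x[i] to the bin of x[j]."""
    x = np.asarray(x, dtype=float)

    # Assign every point to a quantile bin (0 .. n_bins - 1).
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    q = np.searchsorted(edges, x, side="right")

    # Count transitions between consecutive points, then row-normalize.
    w = np.zeros((n_bins, n_bins))
    np.add.at(w, (q[:-1], q[1:]), 1.0)
    row_sums = w.sum(axis=1, keepdims=True)
    w = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)

    # Spread the transition probabilities over all pairs of time steps.
    return w[q[:, None], q[None, :]]

field = markov_transition_field([1, 2, 3, 4, 3, 2, 1, 2, 3, 4])
print(field.shape)
```

The image side length equals the series length, so long series would normally be downsampled before being fed to a CNN; that step is not shown.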

Page 13: Ashfaq Munshi, ML7 Fellow, Pepperdata

Gramian Angular Fields

• Normalize the time series into [-1,1]

• Transform to Polar Coordinates

[Wang & Oates, 2015]
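The two steps on this slide can be sketched as follows: rescale the series to [-1, 1], read each value as the cosine of an angle, then form the summation field cos(φ_i + φ_j) and the difference field sin(φ_i − φ_j). This is a minimal NumPy sketch; the function name `gramian_angular_fields` is illustrative, and the radius of the polar encoding (the time stamp) is omitted because the fields depend only on the angles.

```python
import numpy as np

def gramian_angular_fields(x):
    """Return (GASF, GADF) images for a 1-D series without missing values."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")

    # Step 1: min-max rescale into [-1, 1]; clip guards against round-off.
    scaled = np.clip((2 * x - hi - lo) / (hi - lo), -1.0, 1.0)

    # Step 2: polar encoding, each value becomes an angle in [0, pi].
    phi = np.arccos(scaled)

    gasf = np.cos(phi[:, None] + phi[None, :])  # summation field
    gadf = np.sin(phi[:, None] - phi[None, :])  # difference field
    return gasf, gadf

gasf, gadf = gramian_angular_fields([1.0, 2.0, 4.0, 3.0])
print(gasf.shape, gadf.shape)
```

The summation field is symmetric and the difference field is antisymmetric, which is a quick sanity check on any implementation.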

Page 14: Ashfaq Munshi, ML7 Fellow, Pepperdata

Example GADF Image

[Wang & Oates, 2015]

Page 15: Ashfaq Munshi, ML7 Fellow, Pepperdata

Time Series and Bag of Patterns

• Divide TS into windows

• Fourier Transform TS in window

• Apply low-pass filter

• Quantize the Fourier coefficients

• Map window to words

• Extract features from sentences

• Use Logistic Regression classifier

[Schäfer & Leser 2017]
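The first five steps above can be sketched as below. This is a simplified illustration, not the published method: the quantization bins here are quantiles computed from the single input series (an assumption made for brevity), and the function name `bag_of_patterns`, the window length, and the alphabet are all hypothetical. Feature selection and the logistic regression classifier are not shown.

```python
import numpy as np
from collections import Counter

def bag_of_patterns(x, window=16, n_coeffs=4, alphabet="abcd"):
    """Map a series to a bag (Counter) of words, one word per sliding window."""
    x = np.asarray(x, dtype=float)

    # Divide the series into overlapping windows.
    windows = np.lib.stride_tricks.sliding_window_view(x, window)

    # Fourier transform each window; keeping only the lowest non-DC
    # frequencies acts as the low-pass filter.
    spectrum = np.fft.rfft(windows, axis=1)[:, 1:1 + n_coeffs // 2]
    feats = np.concatenate([spectrum.real, spectrum.imag], axis=1)

    # Quantize every coefficient into len(alphabet) bins (assumed quantile bins).
    cuts = np.linspace(0, 1, len(alphabet) + 1)[1:-1]
    edges = np.quantile(feats, cuts, axis=0)
    symbols = np.empty(feats.shape, dtype=int)
    for j in range(feats.shape[1]):
        symbols[:, j] = np.searchsorted(edges[:, j], feats[:, j], side="right")

    # Map each window to a word and count the words.
    words = ["".join(alphabet[s] for s in row) for row in symbols]
    return Counter(words)

rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 200)) + 0.1 * rng.standard_normal(200)
bag = bag_of_patterns(series)
print(bag.most_common(3))
```

The resulting word counts form a sparse feature vector per series, which is what a linear classifier such as logistic regression would consume.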

Page 16: Ashfaq Munshi, ML7 Fellow, Pepperdata

Our “Off the Shelf” Approach (PD)

• Convert TS into image (GADF)

• Use Google’s pre-trained CNN (Inception v3)

• Embed into 2,048-dimensional vector space

• Train MLP

  • 2 hidden layers (50 nodes each)

  • ReLU activation

  • Dropout for regularization (.1, .2)

  • Softmax final layer
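The classifier head described on this slide can be sketched as a plain NumPy forward pass. Several things here are assumptions rather than details from the talk: random vectors stand in for the 2,048-dimensional CNN embeddings, the number of classes (10) is hypothetical, and the two dropout values are read as the rates after the first and second hidden layers. Training (backpropagation) is not shown.

```python
import numpy as np

def init_mlp(rng, n_in=2048, n_hidden=50, n_classes=10):
    """He-initialized weights for 2048 -> 50 -> 50 -> n_classes."""
    sizes = [n_in, n_hidden, n_hidden, n_classes]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, rng=None, drop_rates=(0.1, 0.2)):
    """Class probabilities; dropout is applied only when rng is given."""
    h = x
    for (w, b), rate in zip(params[:-1], drop_rates):
        h = np.maximum(h @ w + b, 0.0)             # ReLU activation
        if rng is not None:                        # training mode
            keep = rng.random(h.shape) >= rate
            h = h * keep / (1.0 - rate)            # inverted dropout
    w, b = params[-1]
    logits = h @ w + b
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)        # softmax final layer

rng = np.random.default_rng(0)
params = init_mlp(rng)
embeddings = rng.normal(size=(4, 2048))            # stand-in for CNN features
probs = forward(params, embeddings)
print(probs.shape)
```

Because the CNN is used only as a fixed feature extractor, the trainable part is this small network, which is consistent with the short training times the talk reports.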

Page 17: Ashfaq Munshi, ML7 Fellow, Pepperdata

Accuracies for a subset of UCR

[Bar chart: accuracy, axis 0% to 100%. Bars: BOSS (91.1), PD (89.8), GADF+GASF+MTF (86.4)]

Page 18: Ashfaq Munshi, ML7 Fellow, Pepperdata

Accuracy on a subset of UCR

[Bar chart: accuracy, axis 68% to 86%. Methods compared: WEASEL, 1-NN DTW CV, 1-NN DTW, BOSS, Learning Shapelet (LS), TSBF, ST, EE (PROP), COTE (ensemble), PD]

Page 19: Ashfaq Munshi, ML7 Fellow, Pepperdata

Training Time Comparison

[Chart: training time comparison across methods, including PD]

Page 20: Ashfaq Munshi, ML7 Fellow, Pepperdata

Multivariate Time Series

Page 21: Ashfaq Munshi, ML7 Fellow, Pepperdata

Approaches and Data Set

• Two recent approaches from the literature

  • Use an ESN (“Echo State Network”) to map MTS into state clouds [Wang, Wang, Liu 2015]

  • Use Dynamic Time Warping with Mahalanobis distance metric [Mei, Liu, Wang, Gao 2016]

• Dataset is from UCI, a small subset of UCR and others

  • Number of series ~ 10K

  • Data points per series ~ 200
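For readers unfamiliar with the second approach, here is a minimal sketch of dynamic time warping between two multivariate series using a squared Mahalanobis distance as the local cost. It illustrates the general idea only: the matrix `m` is simply passed in as a parameter, and how the cited work obtains it is not covered. The function name `mahalanobis_dtw` is illustrative.

```python
import numpy as np

def mahalanobis_dtw(a, b, m):
    """DTW cost between series a (n, d) and b (k, d).

    m is a (d, d) positive semi-definite matrix defining the local
    squared Mahalanobis distance (x - y)^T m (x - y).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Local cost between every pair of time steps.
    diff = a[:, None, :] - b[None, :, :]
    local = np.einsum("ijd,de,ije->ij", diff, m, diff)

    # Classic O(n * k) dynamic program over the warping path.
    n, k = local.shape
    acc = np.full((n + 1, k + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            acc[i, j] = local[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, k]

identity = np.eye(1)
print(mahalanobis_dtw([[0.0], [1.0]], [[0.0], [1.0], [1.0]], identity))
```

The quadratic cost per pair of series is one reason distance-based methods of this kind become expensive on large collections of long series.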

Page 22: Ashfaq Munshi, ML7 Fellow, Pepperdata

Our “Off the Shelf” Approach (PD)

• Make TS for each variable the same length by zero padding

• Convert each TS into a GADF image

• Interpolate any missing data points in the image using linear interpolation on the image

• Stack the images for the five variables

• Use the same process as before for univariate time series
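The first four steps above can be sketched as follows. The details are assumptions made for illustration: missing samples are represented as NaN, padding is appended at the end of each series, and the image interpolation is done as two 1-D linear passes (along rows, then along columns). The helper names `gadf`, `fill_linear`, and `stack_gadf` are hypothetical.

```python
import numpy as np

def gadf(x):
    """Gramian Angular Difference Field; NaN inputs yield NaN pixels."""
    lo, hi = np.nanmin(x), np.nanmax(x)
    scaled = np.clip((2 * x - hi - lo) / (hi - lo), -1.0, 1.0)
    phi = np.arccos(scaled)
    return np.sin(phi[:, None] - phi[None, :])

def fill_linear(v):
    """Linearly interpolate NaNs in a 1-D vector (edges are held constant)."""
    ok = ~np.isnan(v)
    if ok.all() or not ok.any():
        return v
    idx = np.arange(v.size)
    return np.interp(idx, idx[ok], v[ok])

def stack_gadf(series_by_variable):
    """Zero-pad, convert to GADF, fill gaps, and stack as image channels."""
    length = max(len(s) for s in series_by_variable)
    channels = []
    for s in series_by_variable:
        padded = np.zeros(length)
        padded[:len(s)] = s                             # zero padding
        img = gadf(padded)
        img = np.apply_along_axis(fill_linear, 1, img)  # fill along rows
        img = np.apply_along_axis(fill_linear, 0, img)  # then along columns
        channels.append(img)
    return np.stack(channels, axis=-1)

series = [np.array([1.0, np.nan, 3.0, 4.0]), np.array([2.0, 5.0])]
stacked = stack_gadf(series)
print(stacked.shape)
```

The output has one channel per variable, so the same image-based classifier used for the univariate case can consume it once the channel count matches what the network expects.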

Page 23: Ashfaq Munshi, ML7 Fellow, Pepperdata

5-Fold Cross Validation Error

[Bar chart: error, axis 0 to 30, for Robot failure LP1, LP2, LP3, LP4, and LP5. Series: MDDTW Best, PD 5-fold]

Page 24: Ashfaq Munshi, ML7 Fellow, Pepperdata

10-Fold Cross Validation Error

[Bar chart: error, axis 0 to 30, for Robot failure LP1, LP2, LP3, LP4, and LP5. Series: Echo Network Best, PD 10-fold]

Page 25: Ashfaq Munshi, ML7 Fellow, Pepperdata

PD Data

• Four variables:

  • CPU, Virtual Memory, HDFS reads, Network Ops

• Each time series collected over one week

  • 10 data points to 10K+ data points

  • Missing data

  • Extremely noisy

• For periods longer than a week, data is much larger

• Sampling rate is the same for all TS

Page 26: Ashfaq Munshi, ML7 Fellow, Pepperdata

Accuracy per Label on PD Dataset G

[Bar chart: accuracy per label, axis 0 to 100, labels 1 to 17]

Number of TS = 3092

Lengths per TS = 5 to 8500

Average Accuracy = 78.14%

Page 27: Ashfaq Munshi, ML7 Fellow, Pepperdata

Accuracy per Label on PD Dataset R

[Bar chart: accuracy per label, axis 0 to 120, labels 1 to 48]

Number of TS = 6715

Lengths per TS = 5 to 9400

Average Accuracy = 75.95%

Page 28: Ashfaq Munshi, ML7 Fellow, Pepperdata

Summary

Our “Off the Shelf” approach is as good as the best approaches for both UTS and MTS, and the methodology is the same for both types of TS.

Page 29: Ashfaq Munshi, ML7 Fellow, Pepperdata

Thank You