Deep recurrent neural networks for sequence learning in Spark

Post on 09-Jan-2017

Deep recurrent neural networks for Sequence Learning in Spark

Yves MABIALA

www.thalesgroup.com

OPEN

This document may not be reproduced, modified, adapted, published, translated, in any way, in whole or in part or disclosed to a third party without the prior written consent of Thales - Thales 2015 All rights reserved.

Outline
- Thales & Big Data
- On the difficulty of Sequence Learning
- Deep Learning for Sequence Learning
- Spark implementation of Deep Learning
- Use cases
  - Predictive maintenance
  - NLP

Thales & Big Data

Thales systems produce a huge quantity of data
- Transportation systems (ticketing, supervision, ...)
- Security (radar traces, network logs, ...)
- Satellite (photos, videos, ...)

which is often
- Massive
- Heterogeneous
- Extremely dynamic

and where understanding the dynamics of the monitored phenomena is mandatory → Sequence Learning

What is sequence learning?

Sequence learning refers to a set of ML tasks where a model has to deal with sequences as input, produce sequences as output, or both.

Goal: understand the dynamics of a sequence in order to
- Classify
- Predict
- Model

Typical applications
- Text
  - Classify texts (sentiment analysis)
  - Generate textual descriptions of images (image captioning)
- Video
  - Video classification
- Speech
  - Speech to text

How is it typically handled?

Each text is encoded as a binary vector over the vocabulary (bag of words):

Vocabulary: The  is  chair  red  young  cat  on  a

The chair is red:       1 1 1 1 0 0 0 0
The cat is young:       1 1 0 0 1 1 0 0
The cat is on a chair:  1 1 1 0 0 1 1 1
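
The encoding on this slide can be sketched in a few lines of plain Python. This is only an illustration: the function name `binary_bag_of_words` and the lower-casing of words are assumptions, not part of the original deck. The last lines show why this representation is a poor fit for sequence learning: word order is discarded.

```python
def binary_bag_of_words(text, vocabulary):
    """Return a 0/1 vector marking which vocabulary words occur in the text."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocabulary]

# Vocabulary in the order shown on the slide (lower-cased for matching).
VOCABULARY = ["the", "is", "chair", "red", "young", "cat", "on", "a"]

sentences = ["The chair is red", "The cat is young", "The cat is on a chair"]
vectors = {s: binary_bag_of_words(s, VOCABULARY) for s in sentences}

for sentence, vector in vectors.items():
    print(f"{sentence:24s}", *vector)

# Word order is lost: a scrambled sentence gets exactly the same encoding.
scrambled = binary_bag_of_words("red is the chair", VOCABULARY)
print(scrambled == vectors["The chair is red"])
```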

Link with artificial neural network?

[Figure: feed-forward neural network with input, hidden layers, and output]

Recurrent Neural Network basics

Artificial neural networks with one or more recurrent layers

[Figure: classical neural network; recurrent neural network; recurrent neural network unrolled through time]

Able to cope with varying-size sequences either at the input or at the output
- One to many (fixed-size input, sequence output), e.g. image captioning
- Many to many (sequence input to sequence output), e.g. speech to text
- Many to one (sequence input to fixed-size output), e.g. text classification
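
A minimal sketch of a simple recurrent layer "unrolled through time" in the many-to-one setting, written in plain Python. The weights, sizes, and function names are hypothetical and chosen only for illustration; this is not the implementation presented in the talk. The same weights are reused at every time step, which is what lets the layer accept sequences of any length while producing a fixed-size output.

```python
import math

def rnn_step(x, h_prev, W_x, W_h, b):
    """One step of a simple recurrent layer: h = tanh(W_x x + W_h h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def rnn_many_to_one(sequence, W_x, W_h, b):
    """Unroll the layer through time and return the final hidden state."""
    h = [0.0] * len(b)
    for x in sequence:  # the same weights are reused at every time step
        h = rnn_step(x, h, W_x, W_h, b)
    return h

# Hypothetical weights: 2 input features, 3 hidden units.
W_x = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
W_h = [[0.1, 0.0, 0.2], [0.0, 0.3, -0.1], [0.2, -0.2, 0.1]]
b = [0.0, 0.1, -0.1]

short_seq = [[1.0, 0.0], [0.0, 1.0]]
long_seq = short_seq + [[1.0, 1.0], [0.5, -0.5], [0.0, 0.0]]

h_short = rnn_many_to_one(short_seq, W_x, W_h, b)
h_long = rnn_many_to_one(long_seq, W_x, W_h, b)
h_reversed = rnn_many_to_one(short_seq[::-1], W_x, W_h, b)

# Both sequences map to a state of the same size, whatever their length.
print(len(h_short), len(h_long))
```

Unlike a bag-of-words vector, the final state depends on the order of the inputs: `h_reversed` differs from `h_short` even though both sequences contain the same elements.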

On the difficulty of training recurrent networks

- LSTM
- GRU
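
As background for this slide: one well-known difficulty is that the gradient propagated back through many time steps tends to shrink (or blow up), and gated units such as LSTM and GRU are commonly used to mitigate this. The sketch below illustrates the shrinking case on a hypothetical one-unit recurrence h_t = tanh(w * h_{t-1} + x_t); the weight and inputs are arbitrary assumptions, not values from the talk.

```python
import math

def gradient_through_time(w, inputs):
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(w*h_{t-1} + x_t)."""
    h, grad = 0.0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)  # chain rule factor: d h_t / d h_{t-1}
    return grad

w = 0.9  # hypothetical recurrent weight
gradients = {}
for steps in (1, 10, 50):
    inputs = [0.5] * steps  # hypothetical constant input
    gradients[steps] = gradient_through_time(w, inputs)
    print(steps, gradients[steps])
```

Each factor in the product is at most |w| in magnitude, so with |w| < 1 the influence of early time steps on the final state decays at least geometrically with the sequence length.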

Recurrent neural networks in Spark

Spark implementation of DL algorithms (data parallel)
- All the needed blocks
  - Affine, convolutional, recurrent layers (Simple and GRU)
  - SGD, RMSprop, Adadelta optimizers
  - Sigmoid, tanh, ReLU activations
- CPU (and GPU backend)
- Fully compatible with existing DL library in Spark ML

Performance
- On a 6-node cluster (CPU)
  - 5.46 average speedup (some communication overhead)
  - About the same speedup as MLP in Spark ML

[Figure: a driver and three workers (Worker 1, Worker 2, Worker 3): (1) model broadcast, (2) resulting gradients]
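
The two-step data-parallel scheme in the figure, (1) model broadcast and (2) resulting gradients, can be sketched without Spark by simulating the workers with plain Python lists. Everything here is an assumption made for illustration: the linear model, the toy data, the learning rate, and the function names. In an actual Spark job the partitions would be RDD partitions, and steps (1) and (2) would be a broadcast and an aggregation.

```python
def local_gradient(weights, partition):
    """Mean-squared-error gradient of a linear model on one worker's partition."""
    grad = [0.0] * len(weights)
    for features, target in partition:
        error = sum(w * x for w, x in zip(weights, features)) - target
        for j, x in enumerate(features):
            grad[j] += 2.0 * error * x / len(partition)
    return grad

def data_parallel_step(weights, partitions, learning_rate):
    """(1) send the model to each worker, (2) collect and average the gradients."""
    gradients = [local_gradient(list(weights), p) for p in partitions]
    averaged = [sum(g[j] for g in gradients) / len(gradients)
                for j in range(len(weights))]
    # The driver applies one SGD update with the averaged gradient.
    return [w - learning_rate * g for w, g in zip(weights, averaged)]

# Hypothetical data generated from y = 2*x0 - 1*x1, split over three "workers".
data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0),
        ([2.0, 1.0], 3.0), ([1.0, 2.0], 0.0), ([0.5, 0.5], 0.5)]
partitions = [data[0:2], data[2:4], data[4:6]]

weights = [0.0, 0.0]
for _ in range(200):
    weights = data_parallel_step(weights, partitions, learning_rate=0.1)
print([round(w, 3) for w in weights])
```

Because every iteration ships the full model out and the gradients back, communication grows with the model size, which is consistent with the slide's note that the speedup stays somewhat below the number of nodes.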

Use case 1: predictive maintenance (1)

Context
- Thales and its clients build systems in different domains
  - Transportation (ticketing, controlling), Defense (radar), Satellites
- Need better and more accurate maintenance services
  - From planned maintenance (every x days) to alert-based maintenance
  - From expert detection to automatic failure prediction
  - From whole-subsystem changes to more localized repairs

Goal
- Detect early signs of a (sub)system failure using data coming from sensors monitoring the health of a system (HUMS)

Use case 1: predictive maintenance (2)

Example on a real system
- 20 sensors (20 values every 5 minutes), label (failure or not)
- Take 3 hours of data and predict the probability of failure in the next hour (fully customizable)
- Learning using MLlib
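
With one reading every 5 minutes, 3 hours of history is 36 time steps of 20 values, and the next hour is 12 time steps. A minimal sketch of how such training examples could be cut from a sensor log is shown below; the function name `make_windows` and the synthetic data are assumptions for illustration, not the pipeline used in the talk.

```python
SAMPLE_MINUTES = 5
HISTORY_STEPS = 3 * 60 // SAMPLE_MINUTES  # 3 hours of history -> 36 readings
HORIZON_STEPS = 60 // SAMPLE_MINUTES      # next hour -> 12 readings

def make_windows(readings, failures, history=HISTORY_STEPS, horizon=HORIZON_STEPS):
    """Pair each history window with 1 if a failure occurs in the following horizon."""
    examples = []
    for start in range(len(readings) - history - horizon + 1):
        end = start + history
        window = readings[start:end]                   # history x n_sensors values
        label = int(any(failures[end:end + horizon]))  # failure in the next hour?
        examples.append((window, label))
    return examples

# Hypothetical log: 60 time steps of 20 sensor values, one failure at step 50.
n_steps, n_sensors = 60, 20
readings = [[float(t + s) for s in range(n_sensors)] for t in range(n_steps)]
failures = [0] * n_steps
failures[50] = 1

examples = make_windows(readings, failures)
labels = [label for _, label in examples]
print(len(examples), len(examples[0][0]), len(examples[0][0][0]))
```

Changing `history` and `horizon` is all it takes to pose the problem with a different look-back or prediction window, in line with the "fully customizable" remark on the slide.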

Use case 1: predictive maintenance (3)

Recurrent net learning

Impact of recurrent nets
- Logistic regression: 70% detection with 70% accuracy
- Recurrent Neural Network: 85% detection with 75% accuracy

Use case 2: Sentiment analysis (1)

Context
- Social network analysis application developed at Thales (Twitter, Facebook, blogs, forums)
- Analyze both the content of the texts and the relations (texts, actors)
- Multiple (big data) analyses
  - Actor community detection
  - Text clustering (themes)

Focus on
- Sentiment analysis on the collected texts
- Classify texts based on their sentiment

Use case 2: Sentiment analysis (2)

Learning dataset
- Sentiment140 + Kaggle challenge (1.5M labeled tweets)
- 50% positives, 50% negatives

Compare bag of words + traditional classifiers (Naive Bayes, SVM, logistic regression) versus RNN

Use case 2: Sentiment analysis (3)

Results

Training size   NB     SVM    Log Reg   Neural Net (perceptron)   RNN (GRU)
100             61.4   58.4   58.4      55.6                      NA
1 000           70.6   70.6   70.6      70.8                      68.1
10 000          75.4   75.1   75.4      76.1                      72.3
100 000         78.1   76.6   76.9      78.5                      79.2
700 000         80     78.3   78.3      80                        84.1

The end

THANK YOU!
