deep learning and recurrent neural networks in the enterprise

Deep Learning and Recurrent Neural Networks in the Enterprise StampedeCon St. Louis 2016 Josh Patterson, Skymind

Upload: josh-patterson

Post on 08-Jan-2017


TRANSCRIPT

Page 1: Deep Learning and Recurrent Neural Networks in the Enterprise

Deep Learning and Recurrent Neural Networks in the Enterprise

StampedeCon St. Louis 2016

Josh Patterson, Skymind

Page 2: Deep Learning and Recurrent Neural Networks in the Enterprise

Presenter: Josh Patterson

Past

Research in Swarm Algorithms: Real-time optimization techniques in mesh sensor networks

TVA / NERC: Smartgrid, Sensor Collection, and Big Data

Cloudera: Principal SA, Working with Fortune 500

Patterson Consulting: Working with Fortune 500 on Big Data, ML

Today

Skymind, Director Field Engineering

[email protected] / @jpatanooga

DL4J Co-creator, Co-Author on Upcoming O’Reilly Book “Deep Learning: A Practitioner’s Approach”

Page 3: Deep Learning and Recurrent Neural Networks in the Enterprise

Topics

• What is Deep Learning?

• DL4J

• Recurrent Neural Network Applications

Page 4: Deep Learning and Recurrent Neural Networks in the Enterprise

WHAT IS DEEP LEARNING?

Page 5: Deep Learning and Recurrent Neural Networks in the Enterprise

Defining Deep Learning

• Higher neuron counts than in previous generation neural networks

• Different and evolved ways to connect layers inside neural networks

• More computing power to train

• Automated Feature Learning

Page 6: Deep Learning and Recurrent Neural Networks in the Enterprise

Automated Feature Learning

• Deep Learning can be thought of as workflows for automated feature construction
  – From “feature construction” to “feature learning”

• As Yann LeCun says:
  – “machines that learn to represent the world”

Page 7: Deep Learning and Recurrent Neural Networks in the Enterprise
Page 8: Deep Learning and Recurrent Neural Networks in the Enterprise
Page 9: Deep Learning and Recurrent Neural Networks in the Enterprise

These are the features learned at each neuron in a Restricted Boltzmann Machine (RBM)

These features are passed to higher levels of RBMs to learn more complicated things.

Part of the “7” digit

Page 10: Deep Learning and Recurrent Neural Networks in the Enterprise

Unreasonable Effectiveness: Benchmark Records

1. Text-to-speech synthesis (Fan et al., Microsoft, Interspeech 2014)
2. Language identification (Gonzalez-Dominguez et al., Google, Interspeech 2014)
3. Large vocabulary speech recognition (Sak et al., Google, Interspeech 2014)
4. Prosody contour prediction (Fernandez et al., IBM, Interspeech 2014)
5. Medium vocabulary speech recognition (Geiger et al., Interspeech 2014)
6. English to French translation (Sutskever et al., Google, NIPS 2014)
7. Audio onset detection (Marchi et al., ICASSP 2014)
8. Social signal classification (Brueckner & Schuller, ICASSP 2014)
9. Arabic handwriting recognition (Bluche et al., DAS 2014)
10. TIMIT phoneme recognition (Graves et al., ICASSP 2013)
11. Optical character recognition (Breuel et al., ICDAR 2013)
12. Image caption generation (Vinyals et al., Google, 2014)
13. Video to textual description (Donahue et al., 2014)
14. Syntactic parsing for Natural Language Processing (Vinyals et al., Google, 2014)
15. Photo-real talking heads (Soong and Wang, Microsoft, 2014)

Page 11: Deep Learning and Recurrent Neural Networks in the Enterprise

Four Major Architectures

• Deep Belief Networks

• Convolutional Neural Networks

• Recurrent Neural Networks

• Recursive Neural Networks

Page 12: Deep Learning and Recurrent Neural Networks in the Enterprise

Quick Usage Guide

• If I have Timeseries or Audio Input
  – I should use a Recurrent Neural Network
  – Examples: Fraud Detection, Anomaly Detection

• If I have Image input
  – I should use a Convolutional Neural Network

• If I have Video input
  – I should use a hybrid Convolutional + Recurrent Architecture!

Page 13: Deep Learning and Recurrent Neural Networks in the Enterprise

Convolutional Generated Art

Page 14: Deep Learning and Recurrent Neural Networks in the Enterprise

The More Things Change…

• Deep Learning is still trying to answer the same fundamental questions, such as:
  – “is this image a face?”

• The difference is that Deep Learning makes hard questions easier to answer with better architectures and more computing power
  – We do this by matching the correct architecture with the right problem

Page 15: Deep Learning and Recurrent Neural Networks in the Enterprise

Building Deep Neural Networks with DL4J

Page 16: Deep Learning and Recurrent Neural Networks in the Enterprise

DL4J

• “The Hadoop of Deep Learning”
  – Java, Scala, and Python APIs
  – ASF 2.0 Licensed

• Java implementation
  – Parallelization (YARN + Spark)
  – GPU support
    • Also supports multi-GPU per host

• Runtime Neutral
  – Local
  – Hadoop / YARN + Spark

• https://github.com/deeplearning4j/deeplearning4j

• https://github.com/deeplearning4j/deeplearning4j

Page 17: Deep Learning and Recurrent Neural Networks in the Enterprise

DL4J Workflow Toolchain

ETL (DataVec) → Vectorization (DataVec) → Modeling (DL4J) → Evaluation (Arbiter)

Execution Platforms: Spark, Single Machine

ND4J - Linear Algebra Runtime: CPU, GPU

Page 18: Deep Learning and Recurrent Neural Networks in the Enterprise

ND4J: The Need for Speed

• JavaCPP (Cython for Java)
  – Auto-generates JNI bindings for C++ by parsing classes
  – Allows for easy maintenance and deployment of C++ binaries in Java

• CPU backends
  – OpenMP (multithreading within native operations)
  – OpenBLAS or MKL (BLAS operations)
  – SIMD extensions

• GPU backends
  – DL4J supports CUDA 7.5 at the moment, and will support 8.0 as soon as it comes out.
  – Leverages cuDNN as well

Page 19: Deep Learning and Recurrent Neural Networks in the Enterprise

Prepping Data is Time Consuming

http://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#633ea7f67f75

Page 20: Deep Learning and Recurrent Neural Networks in the Enterprise

Preparing Data for Modeling is Hard

Page 21: Deep Learning and Recurrent Neural Networks in the Enterprise

DataVec

• DataVec is a tool for machine learning ETL (Extract, Transform, Load) operations.
  – Spark-enabled and focused on supporting DL4J

• Also performs vectorization
  – Image, CSV, Sequences (timeseries), more

• Open Source, ASF 2.0 Licensed
  – https://github.com/deeplearning4j/DataVec
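To make the vectorization step concrete, here is a minimal sketch of what turning CSV rows into numeric vectors involves. It is written in plain Python with the standard library and does not use the DataVec API; the column names, the sample values, and the `vectorize` helper are all hypothetical.

```python
import csv
import io

# Hypothetical sensor readings; column names and values are illustrative only.
RAW_CSV = """voltage,frequency,status
120.1,60.00,normal
119.8,59.98,normal
131.4,60.35,fault
"""

def vectorize(csv_text, label_column):
    """Turn CSV rows into (features, one_hot_label) pairs of floats."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Build a stable label index from the distinct label values.
    labels = sorted({row[label_column] for row in rows})
    label_index = {name: i for i, name in enumerate(labels)}
    feature_columns = [c for c in rows[0] if c != label_column]

    examples = []
    for row in rows:
        # Every non-label column is parsed as a float feature.
        features = [float(row[c]) for c in feature_columns]
        # The label becomes a one-hot vector over the distinct label values.
        one_hot = [0.0] * len(labels)
        one_hot[label_index[row[label_column]]] = 1.0
        examples.append((features, one_hot))
    return feature_columns, labels, examples

columns, labels, examples = vectorize(RAW_CSV, "status")
for features, one_hot in examples:
    print(features, one_hot)
```

A real pipeline would also handle missing values, normalization, and categorical feature columns, which this sketch leaves out.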

Page 22: Deep Learning and Recurrent Neural Networks in the Enterprise

Using DL4J for

RECURRENT NEURAL NETWORK APPLICATIONS

Page 23: Deep Learning and Recurrent Neural Networks in the Enterprise

Transactional Data Explosion

• 2,500 exabytes of new information in 2012 with Internet as primary driver

• Digital universe grew by 62% last year to 800K petabytes and will grow to 1.2 “zettabytes” this year

[Figure labels: Relational; Transactional (Logs, Sensors); (You)]

Source: IDC White Paper - sponsored by EMC. As the Economy Contracts, the Digital Universe Expands. May 2009.

Page 24: Deep Learning and Recurrent Neural Networks in the Enterprise

NERC Sensor Data Collection

openPDC PMU Data Collection circa 2009

• 120 Sensors

• 30 samples/second

• 4.3B Samples/day

• Housed in Hadoop
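The slide does not show how the daily total is derived. As a rough check, the arithmetic below multiplies out the sensor count and sample rate, then compares the result with the quoted 4.3B figure. The reading that each sensor reports several measurement streams is an assumption, not something the slide states.

```python
SENSORS = 120
SAMPLES_PER_SECOND = 30
SECONDS_PER_DAY = 24 * 60 * 60
REPORTED_SAMPLES_PER_DAY = 4.3e9  # figure quoted on the slide

# Daily volume if each sensor produced exactly one value per sample interval.
single_stream_per_day = SENSORS * SAMPLES_PER_SECOND * SECONDS_PER_DAY

# Assumption: the gap is explained by several measurement streams per sensor.
implied_streams_per_sensor = REPORTED_SAMPLES_PER_DAY / single_stream_per_day

print(f"{single_stream_per_day:,} samples/day at one stream per sensor")
print(f"~{implied_streams_per_sensor:.1f} streams per sensor implied by 4.3B/day")
```

Under that assumption, the quoted volume corresponds to roughly a dozen or so measurement streams per sensor rather than one.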

Page 25: Deep Learning and Recurrent Neural Networks in the Enterprise

Sensor Timeseries Classification with RNNs

• Recurrent Neural Networks have the ability to model change of input over time

• Older techniques (mostly) do not retain time domain
  – Hidden Markov Models do…
    • but are more limited

• Key Takeaway:
  – For working with Timeseries data, RNNs will be more accurate
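The point about modeling change over time can be illustrated with a one-unit recurrent network in plain Python. This is a minimal sketch, not DL4J code: the weights are fixed by hand instead of being learned, and `rnn_final_state` is a hypothetical helper. It shows that two sequences with identical values in a different order end in different hidden states, which an order-insensitive summary such as the sum cannot distinguish.

```python
import math

def rnn_final_state(sequence, w_in=1.0, w_rec=0.5, bias=0.0):
    """Run a one-unit Elman-style RNN over a sequence and return the last hidden state."""
    hidden = 0.0
    for x in sequence:
        # The new state depends on the current input AND the previous state,
        # so the order of the inputs matters.
        hidden = math.tanh(w_in * x + w_rec * hidden + bias)
    return hidden

early_spike = [1.0, 0.0, 0.0]
late_spike = [0.0, 0.0, 1.0]

# Same values, different order: the sums match, the hidden states do not.
print(sum(early_spike) == sum(late_spike))
print(rnn_final_state(early_spike))
print(rnn_final_state(late_spike))
```

In a trained network the weights would be learned so that the final state (or the state at every step) feeds a classifier over the sensor classes.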

Page 26: Deep Learning and Recurrent Neural Networks in the Enterprise

RNN Architectures

Standard supervised learning

Image captioning

Sentiment analysis

Video captioning, Natural language translation

Part of speech tagging

Generative models for text

Page 27: Deep Learning and Recurrent Neural Networks in the Enterprise

Anomaly Detection

• Model the normal patterns in the data

• Autoencoders give us the ability to evaluate data that the model hasn’t seen before
  – Find anomalous patterns in sequences
  – Can also use RNNs for pattern classification

• Interesting Industry Applications
  – Telecom
  – Financial Services
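The idea of modeling normal patterns and flagging what does not fit can be sketched as reconstruction-error thresholding. The code below is not a trained autoencoder: as a stand-in, it "encodes" a window as a single scale factor along the average normal pattern and "decodes" it back, which plays the same role as a one-value bottleneck. The data, the helper names, and the threshold rule are all hypothetical.

```python
def fit_normal_pattern(windows):
    """Stand-in for training: average the normal windows position by position."""
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(len(windows[0]))]

def reconstruct(window, pattern):
    """Encode the window as one scale factor along the pattern, then decode it."""
    scale = sum(x * p for x, p in zip(window, pattern)) / sum(p * p for p in pattern)
    return [scale * p for p in pattern]

def reconstruction_error(window, pattern):
    """Mean squared difference between a window and its reconstruction."""
    rebuilt = reconstruct(window, pattern)
    return sum((x - r) ** 2 for x, r in zip(window, rebuilt)) / len(window)

# Hypothetical windows of normal sensor behaviour.
normal_windows = [
    [1.0, 2.0, 3.0, 2.0, 1.0],
    [1.1, 2.1, 2.9, 2.0, 0.9],
    [0.9, 1.9, 3.1, 2.1, 1.0],
]
pattern = fit_normal_pattern(normal_windows)

# Threshold: a margin above the worst error seen on normal data (hypothetical rule).
threshold = 3.0 * max(reconstruction_error(w, pattern) for w in normal_windows)

def is_anomalous(window):
    """Flag windows that the normal-pattern model reconstructs poorly."""
    return reconstruction_error(window, pattern) > threshold

print(is_anomalous([1.0, 2.05, 3.0, 1.95, 1.0]))  # unseen window with the normal shape
print(is_anomalous([3.0, 1.0, 0.5, 1.0, 3.0]))    # unseen window with an inverted shape
```

With a real autoencoder, `reconstruct` would be the trained network's encode-then-decode pass, and the thresholding step would stay the same.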

Page 28: Deep Learning and Recurrent Neural Networks in the Enterprise

Audio Applications

• Text-to-Speech

• Recognize specific songs / audio

• Enables natural language interfaces

Page 29: Deep Learning and Recurrent Neural Networks in the Enterprise

“Google is living a few years in the future and sending the rest of us messages”
-- Doug Cutting in 2013

• However
  – Most organizations are not built like Google
    • (and Jeff Dean does not work at your company…)

• Anyone building Next-Gen infrastructure has to consider these things

Page 30: Deep Learning and Recurrent Neural Networks in the Enterprise

Certified on Two Hadoop Distributions

• Running Spark on Hadoop via YARN gives us
  – Sharing cluster resources between heterogeneous workloads concurrently
  – Access to the YARN scheduler capabilities
  – Better control of executors in Spark
  – Kerberos support for security

• Certified on CDH 5.4

• Certified on HDP 2.4
  – [ Coming later this month ]

Page 31: Deep Learning and Recurrent Neural Networks in the Enterprise

Questions?

Thank you for your time and attention

“Deep Learning: A Practitioner’s Approach” (O’Reilly, October 2016)

Page 32: Deep Learning and Recurrent Neural Networks in the Enterprise

Running DL4J Workflows on Spark

• DataVec is built to scale out via Spark RDDs
  – RDD<LabeledPoint>
  – RDD<DataSet>

• DL4J uses the same MultiLayerConfiguration as the single host version
  – Uses SparkDl4jMultiLayer to drive the training on Spark
  – Performs Parameter Averaging

spark-submit --class io.skymind.spark.dl4j.datavec.BasicDataVecExample --master yarn --num-executors 1 --properties-file ./spark_extra.props ./Skymind_spark-1.0-SNAPSHOT.jar
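The parameter averaging mentioned above can be sketched in a few lines of plain Python. This is an illustration of the general technique, not the SparkDl4jMultiLayer implementation: the model is a single weight fitting y = w * x, the "workers" run one after another instead of in parallel, and the data, learning rate, and function names are hypothetical.

```python
def local_sgd(weight, shard, learning_rate=0.05, epochs=20):
    """Fit y = weight * x on one worker's shard with plain SGD."""
    for _ in range(epochs):
        for x, y in shard:
            # Gradient of the squared error (weight * x - y)^2 with respect to weight.
            gradient = 2.0 * (weight * x - y) * x
            weight -= learning_rate * gradient
    return weight

def parameter_averaging_round(weight, shards):
    """Each worker trains from the same starting weight; the results are averaged."""
    local_weights = [local_sgd(weight, shard) for shard in shards]
    return sum(local_weights) / len(local_weights)

# Hypothetical data drawn from y = 3x, split across two "workers".
shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
]

weight = 0.0
for _ in range(5):
    # One round: broadcast the current weight, train locally, average the results.
    weight = parameter_averaging_round(weight, shards)
print(weight)
```

The design trade-off is how often to average: frequent averaging keeps the workers in agreement but costs more communication, while infrequent averaging lets the local copies drift apart between rounds.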