NVIDIA® cuDNN GPU-Accelerated Machine Learning

Post on 04-Jun-2020

Page 1: NVIDIA® cuDNN GPU-Accelerated Machine Learningspeech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015/NN Lecture/CUDN… · Deep Learning with cuDNN cuDNN is a library of primitives for deep

NVIDIA® cuDNN

GPU-Accelerated Machine Learning

How GPU Acceleration Works

Application Code = Compute-Intensive Functions (GPU) + Rest of Sequential CPU Code (CPU)

Compute-Intensive Functions: 5% of code, ~80% of run-time
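The split on this slide can be read through Amdahl's law, which the slides do not name; it is brought in here only as an illustration. A minimal sketch, assuming the ~80% figure is the fraction of run-time that moves to the GPU and that `kernel_speedup` (a hypothetical parameter) is how much faster that part runs there:

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of the run-time is accelerated.

    accelerated_fraction: share of the original run-time that is offloaded (0.8 on the slide).
    kernel_speedup: assumed speedup of the offloaded part (illustrative, not from the slides).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction  # sequential CPU code, unchanged
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Sweep a few assumed GPU speedups for the 80% hot spot.
    for s in (2.0, 10.0, 100.0):
        print(f"kernel {s:>5.0f}x faster -> application {overall_speedup(0.8, s):.2f}x faster")
```

Under these assumptions the application speedup is capped at 1 / (1 - 0.8) = 5x, however fast the GPU part becomes, because the remaining sequential CPU code still takes 20% of the original run-time.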

3 Ways to Program GPUs

Applications

- Libraries: “Drop-in” Acceleration
- OpenACC Directives: Easily Accelerate Applications
- Programming Languages: Maximum Flexibility

HPC Today

cuDNN is a library of primitives for deep learning

Deep Learning with cuDNN

cuDNN is a library of primitives for deep learning

Stack: GPUs (Tesla, TX-1, Titan), cuDNN, Frameworks, Applications

LARGE SCALE VISUAL RECOGNITION CHALLENGE (ILSVRC)

[Figure: example images with object labels: person, car, helmet, motorcycle, bird, frog, dog, chair, hammer, flower pot, power drill]

1.2M training images • 1000 object categories

CHALLENGE SUMMARY

Entries using GPUs, 2010 to 2014 (bar values shown: 4, 60, 110)

Image Classification Error Rates

- 2010: 28%
- 2011: 26%
- 2012: 16%
- 2013: 12%
- 2014: 7%

DEEP LEARNING VISUALIZED

Example Use Cases

- Image Classification, Object Detection, Localization
- Face Recognition
- Speech & Natural Language Processing
- Medical Imaging & Interpretation
- Seismic Imaging & Interpretation
- Recommendation

Deep learning revolutionizing medical research

- Detecting Mitosis in Breast Cancer Cells (IDSIA)
- Predicting the Toxicity of New Drugs (Johannes Kepler University)
- Understanding Gene Mutation to Prevent Disease (University of Toronto)

cuDNN Version 2

cuDNN Design Goal

Basic Deep Learning Subroutines

Allow user to write a DNN application without any CUDA code

Flexible Layout

Handle any data layout

Memory/Performance Trade-off

Great performance with more memory use

Good performance with minimal memory usage

DNN ROUTINES

Convolutions – 80-90% of the execution time

Pooling – Spatial smoothing

Activation – Pointwise non-linear function
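To make the two smaller routines concrete, here is a minimal pure-Python sketch of a pointwise activation (ReLU) and of non-overlapping pooling. It is an illustration only, not the cuDNN API: the function names `relu`, `pool2d`, and `mean` are hypothetical, and real frameworks run these steps on the GPU over batched, multi-channel tensors.

```python
def relu(feature_map):
    """Pointwise non-linear function: each value is replaced by max(0, value)."""
    return [[max(0.0, float(v)) for v in row] for row in feature_map]


def pool2d(feature_map, size, reduce_fn):
    """Non-overlapping pooling: reduce every size x size window to one value."""
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % size or cols % size:
        raise ValueError("feature map dimensions must be multiples of the window size")
    pooled = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            window = [feature_map[r + dr][c + dc]
                      for dr in range(size) for dc in range(size)]
            out_row.append(reduce_fn(window))
        pooled.append(out_row)
    return pooled


def mean(values):
    """Average of a window; used for average pooling."""
    return sum(values) / len(values)


if __name__ == "__main__":
    x = [[1, -2, 3, 0],
         [4, 5, -6, 1],
         [0, 1, 2, 3],
         [-1, -1, 4, -5]]
    activated = relu(x)
    print(pool2d(activated, 2, max))   # max pooling
    print(pool2d(activated, 2, mean))  # average pooling
```

Passing `mean` gives average pooling, which is closest to the "spatial smoothing" description on the slide; passing the built-in `max` gives max pooling, which is common in convolutional networks.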

CONVOLUTIONS – The MAIN Workload

2D conv as a GEMV

Image (6 x 6):

I1 I2 I3 I4 I5 I6
I7 I8 I9 I10 I11 I12
I13 I14 I15 I16 I17 I18
I19 I20 I21 I22 I23 I24
I25 I26 I27 I28 I29 I30
I31 I32 I33 I34 I35 I36

Filter (3 x 3):

F1 F2 F3
F4 F5 F6
F7 F8 F9

Image patches unrolled into matrix rows (first three rows shown):

I1 I2 I3 I7 I8 I9 I13 I14 I15
I2 I3 I4 I8 I9 I10 I14 I15 I16
I3 I4 I5 I9 I10 I11 I15 I16 I17

Filter unrolled into a column vector:

F1 F2 F3 F4 F5 F6 F7 F8 F9
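The unrolling on this slide can be sketched in a few lines of pure Python. This is an illustration of the idea only, not how cuDNN implements it: the function names `im2col` and `conv2d_as_gemv` are hypothetical, the filter is assumed square, and there is no padding, stride, or filter flip (F1 multiplies I1, as on the slide).

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2D image into one row, scanning left to right, top to bottom."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            patches.append([image[r + dr][c + dc]
                            for dr in range(k) for dc in range(k)])
    return patches


def conv2d_as_gemv(image, filt):
    """2D convolution written as (patch matrix) x (filter vector)."""
    k = len(filt)                                  # assumes a square k x k filter
    weights = [w for row in filt for w in row]     # F1..F9 as a vector
    patches = im2col(image, k)                     # one row per output pixel
    flat = [sum(p * w for p, w in zip(patch, weights)) for patch in patches]
    out_cols = len(image[0]) - k + 1
    return [flat[i:i + out_cols] for i in range(0, len(flat), out_cols)]


if __name__ == "__main__":
    # Number the 6 x 6 image 1..36 so the values match the indices I1..I36 on the slide.
    image = [[6 * r + c + 1 for c in range(6)] for r in range(6)]
    for row in im2col(image, 3)[:3]:
        print(row)
    print(conv2d_as_gemv(image, [[1, 1, 1], [1, 1, 1], [1, 1, 1]]))
```

With the image numbered 1 to 36, the first three rows of the patch matrix reproduce the index pattern shown on the slide. Applying several filters at once turns the filter vector into a matrix, so the product becomes a matrix-matrix multiply rather than a matrix-vector one.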

Multi-convolve

cuDNN V2 Flexibility

cuDNN V2 new features

cuDNN Version 2

- Accelerates key routines to improve performance of neural net training
- Up to 1.8x faster on AlexNet than a baseline GPU implementation
- New support for 3D convolutions
- Integrated into all major Deep Learning frameworks: Caffe, Theano, Torch

Speedup, Baseline (GPU) vs. With cuDNN

- Caffe (GoogLeNet): 1.0x baseline, 1.6x with cuDNN
- Torch (OverFeat): 1.0x baseline, 1.2x with cuDNN

Images Trained Per Day (Caffe AlexNet), Millions of Images

- 16 Core CPU: 2.5M
- GTX Titan: 18M
- Titan Black, cuDNN v1: 23M
- Titan X, cuDNN v2: 43M

CPU: E5-2698 v3 @ 2.3GHz / 3.6GHz Turbo

NVIDIA® cuDNN Roadmap

Timeline: Q3’14, Q4’14, Q1’15, Q2’15

Release 1 (September 2014)

Layers (forward & backprop)

- Convolutional
- Pooling
- Softmax
- ReLU/Sigmoid/Tanh

Performance Features

- High performance convolution

Release 2 and Release 3

Layers

- Local receptive field
- Contrast normalization
- Fully-connected
- Recurrent

Performance Features

- Support for multiple GPUs per node
- Faster convolution routines
- Tuning for future chips

GPU-Accelerated Deep Learning Frameworks

- CAFFE: Deep Learning Framework. cuDNN: R2. License: BSD-2. Interface(s): text-based definition files, C++, Python, MATLAB.
- TORCH: Scientific Computing Framework. cuDNN: R2. License: BSD. Interface(s): Python, Lua, MATLAB.
- THEANO: Math Expression Compiler. cuDNN: R2. License: BSD. Interface(s): Python.
- NERVANA NEON: Deep Learning Framework. cuDNN: --. License: Apache 2.0. Interface(s): Python.
- CUDA-CONVNET2: Deep Learning Application. cuDNN: --. License: Apache 2.0. Interface(s): C++.
- KALDI: Speech Recognition Toolkit. cuDNN: --. License: Apache 2.0. Interface(s): C++, Shell scripts.

Multi-GPU: In Progress, In Progress, In Progress, (nnet2)

Multi-CPU: (nnet2)

Embedded (TK1)

http://developer.nvidia.com/deeplearning

Using cuDNN

cuDNN Easy to Enable

DIGITS

Deep Learning GPU Training System

- Visualization tool for DNN training
- Use default network, import one, or design your own
- Import your training data from disk or web
- Monitor multiple trainings in parallel

DIGITS

- Test Image
- Monitor Progress
- Configure DNN
- Process Data
- Visualize Layers

DIGITS

Deep Learning GPU Training System

Who it is for

Deep learning researchers

Automotive

Medical Researchers

Defense

Intelligent Video Analytics

Web Companies

Startups

Thank you!

Developer Zone: https://developer.nvidia.com/deeplearning

GPU Technology Conference: http://www.gputechconf.com/

cuDNN Download: https://developer.nvidia.com/cuDNN

DIGITS Download: https://developer.nvidia.com/digits

DIGITS Source: https://www.github.com/nvidia/digits