Fraunhofer Heinrich Hertz Institute, Einsteinufer 37, 10587 Berlin www.hhi.fraunhofer.de
Efficient Deep Learning in Communications
Dr. Wojciech Samek
Fraunhofer HHI, Machine Learning Group
©2
Today’s AI Systems
- AlphaGo beats the human Go champion
- Deep nets outperform humans in image classification
- Deep nets beat humans at recognizing traffic signs
- DeepStack beats professional poker players
- Computers out-play humans in "Doom"
- Dermatologist-level classification of skin cancer with deep nets
- Revolutionizing radiology with deep learning
Today’s AI Systems
Huge volumes of data, computing power, powerful models:
- millions of labeled examples available
- huge models (up to 137 billion parameters and 1001 layers)
- architectures adapted to images, speech, text …
- highly parallel processing
- large power consumption (600 Watts per GPU card)
Communications settings are often different.
ML in Communications
Application areas: satellite communications, autonomous driving, smartphones, Internet of Things, 5G networks, smart data.
Many additional requirements: small size, efficient execution, low energy consumption …
ML in Communications
- Distributed setting
- Large nonstationarity
- Restricted resources
- Communication costs
- Interoperability
- Security & privacy
- Interpretability
- Trustworthiness
- …
We need ML techniques that are adapted to communications.
But it's not only the algorithms; also:
- protocols
- data formats
- frameworks
- mechanisms
- …
Problem 1: Restricted resources
DNNs with millions of weight parameters:
- large size
- energy-hungry training & inference
- floating-point operations
Much recent work compresses neural networks by weight quantization.
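As a minimal sketch of one common quantization scheme (an assumption for illustration, not the talk's exact method): cluster a layer's weights with 1-D k-means so that only k shared values remain; k and the layer shape below are made up.

```python
# Sketch: 1-D k-means weight quantization (illustrative, not the talk's method).
import numpy as np

def quantize_weights(W, k=16, iters=20, seed=0):
    """Quantize the entries of W to k shared values via 1-D k-means."""
    rng = np.random.default_rng(seed)
    flat = W.ravel()
    centroids = rng.choice(flat, size=k, replace=False)  # init from the data
    for _ in range(iters):
        # assign each weight to its nearest centroid
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # move each centroid to the mean of its cluster (skip empty clusters)
        for c in range(k):
            members = flat[idx == c]
            if members.size:
                centroids[c] = members.mean()
    # final assignment so indices match the returned centroids
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, idx.reshape(W.shape)

W = np.random.randn(64, 64).astype(np.float32)
codebook, indices = quantize_weights(W, k=16)
W_q = codebook[indices]                      # reconstructed (quantized) layer
print("mean squared distortion:", float(np.mean((W - W_q) ** 2)))
```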
Problem 1: Restricted resources
Compressed sparse row (CSR) format:
- reduces storage
- fast multiplications
Can we do better? Quantization (rate-distortion theory).
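To make the CSR idea concrete, a small sketch using SciPy; the pruning threshold and layer size are illustrative assumptions, not the talk's setup:

```python
# CSR illustration: a pruned weight matrix keeps only nonzero values
# plus index arrays, and matrix-vector products skip the zeros.
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024)).astype(np.float32)
W[np.abs(W) < 1.5] = 0.0                # magnitude pruning, ~13% survive

W_csr = csr_matrix(W)                   # values + column indices + row pointers
x = rng.standard_normal(1024).astype(np.float32)
y = W_csr @ x                           # multiplication touches nonzeros only

csr_bytes = W_csr.data.nbytes + W_csr.indices.nbytes + W_csr.indptr.nbytes
print("nonzeros:", W_csr.nnz)
print("dense:", W.nbytes, "bytes  CSR:", csr_bytes, "bytes")
```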
Problem 1: Restricted resources
RD-theory-based weight quantization does not necessarily lead to sparse matrices.
Weight-sharing property: subsets of connections share the same weight value.
Rewriting trick: since only a few distinct values occur, the matrix-vector product can be rewritten as a sum over shared values, each multiplying a sum of selected inputs.
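One way to read the rewriting trick, as a sketch under the assumption that W = codebook[indices] with k shared values (shapes are illustrative):

```python
# Sketch of the rewriting trick for a weight-shared matrix-vector product.
import numpy as np

def shared_matvec(codebook, indices, x):
    """(W x)_i = sum_k codebook[k] * sum_{j : indices[i, j] == k} x_j.

    Inputs sharing a weight value are summed first, so each output row
    needs only k multiplications instead of one per column.
    """
    k = codebook.size
    sums = np.zeros((indices.shape[0], k))
    for c in range(k):
        # sum the inputs routed through shared value c, for every row
        sums[:, c] = np.where(indices == c, x[None, :], 0.0).sum(axis=1)
    return sums @ codebook

# check against the dense product
rng = np.random.default_rng(1)
codebook = rng.standard_normal(8)
indices = rng.integers(0, 8, size=(32, 64))
x = rng.standard_normal(64)
assert np.allclose(shared_matvec(codebook, indices, x), codebook[indices] @ x)
```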
Problem 1: Restricted resources
Exploiting weight sharing gives a more efficient format than CSR.
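A back-of-the-envelope sketch of why a shared-value format can undercut CSR; the bit widths and counts below are assumptions for illustration, not the exact format from the talk:

```python
# CSR spends a full value per nonzero; with k shared values each entry
# only needs a log2(k)-bit codebook index plus a small codebook.
import math

def csr_bits(nnz, n_rows, idx_bits=32, val_bits=32):
    # one value + one column index per nonzero, plus row pointers
    return nnz * (val_bits + idx_bits) + (n_rows + 1) * idx_bits

def shared_value_bits(nnz, n_rows, k, idx_bits=32, val_bits=32):
    # per entry: a log2(k)-bit codebook index + a column index;
    # plus row pointers and the k-entry codebook itself
    entry_bits = math.ceil(math.log2(k)) + idx_bits
    return nnz * entry_bits + (n_rows + 1) * idx_bits + k * val_bits

nnz, n_rows, k = 2_000_000, 4096, 16    # illustrative numbers
print("CSR:          %.1f MB" % (csr_bits(nnz, n_rows) / 8e6))
print("shared-value: %.1f MB" % (shared_value_bits(nnz, n_rows, k) / 8e6))
```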
Problem 1: Restricted resources

VGG-16 compression results (size, top-1 accuracy, multiply operations, energy per forward pass):

Variant                                     Size           Acc       Ops       Energy
VGG-16 (baseline)                           553 MB         68.73 %   30940 M   71 mJ
Simple quantization (8 bit)                 138 MB         68.52 %   30940 M   68 mJ
Sparse (CSR) format                         773 (217) MB   68.52 %   29472 M   65 mJ
Weight-sharing (WS) format                  247 (99) MB    68.52 %   16666 M   36 mJ
State-of-the-art compression + sparse       17.8 MB        68.83 %   10081 M   22 mJ
State-of-the-art compression + WS format    12.8 MB        68.83 %   7225 M    16 mJ

(For scale: an iPhone 8 battery stores roughly 25 kJ.)
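To put the energy column in perspective, a quick back-of-the-envelope calculation using the 25 kJ battery figure above (inference energy taken from the table, everything else held equal):

```python
battery_j = 25_000                   # iPhone 8 battery, ~25 kJ (slide figure)
for name, energy_mj in [("VGG-16 baseline", 71),
                        ("compressed + WS format", 16)]:
    n = battery_j / (energy_mj / 1000.0)
    print(f"{name}: ~{n:,.0f} inferences per charge")
# VGG-16 baseline: ~352,113 inferences per charge
# compressed + WS format: ~1,562,500 inferences per charge
```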
Problem 2: Interpretability
(Figure: the deep network as a black box)
- verify the system
- understand weaknesses
- legal aspects
- learn new strategies
Problem 2: Interpretability
Theoretical interpretation: (deep) Taylor decomposition of the neural network.
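For a flavor of the method, a minimal sketch of the z+ propagation rule, one of the rules derived in the deep Taylor framework for ReLU layers with non-negative inputs; the toy layer below is illustrative, not the authors' full pipeline:

```python
# Toy sketch of the z+ rule for one ReLU layer (non-negative inputs).
import numpy as np

def lrp_zplus(a, W, R_out, eps=1e-9):
    """Redistribute relevance R_out from a layer's outputs to its inputs.

    a     : (n_in,)        non-negative input activations
    W     : (n_in, n_out)  layer weights
    R_out : (n_out,)       relevance from the layer above
    """
    Wp = np.maximum(W, 0.0)      # keep only excitatory (positive) weights
    z = a @ Wp + eps             # positive pre-activations
    s = R_out / z                # relevance per unit of activation
    return a * (Wp @ s)          # inputs get relevance in proportion to
                                 # their contribution; the total is conserved

rng = np.random.default_rng(2)
a = np.abs(rng.standard_normal(6))       # ReLU activations are >= 0
W = rng.standard_normal((6, 3))
R_out = np.array([0.2, 0.5, 0.3])        # relevance sums to 1
R_in = lrp_zplus(a, W, R_out)
print(R_in, "sum =", R_in.sum())         # sum stays ~1 (conservation)
```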
Problem 2: Interpretability
(Figure: heatmap explanations for the predictions "cat", "rooster", "dog")
Problem 2: Interpretability
(Figure: heatmaps showing what speaks for / against classification as "3" and as "9")
Problem 2: Interpretability
(Figure: explaining age predictions, e.g. "25-32 years old" vs "60+ years old")
Conclusion
- Bringing ML to communications comes with new challenges
- AI systems may behave differently than expected
- Need for best practices & recommendations (protocols, formats, …)
Thank you for your attention!
Questions?
All our papers are available at:
http://iphome.hhi.de/samek
Acknowledgement
Simon Wiedemann (HHI)
Klaus-Robert Müller (TUB)
Grégoire Montavon (TUB)
Sebastian Lapuschkin (HHI)
Leila Arras (HHI)
…