searching for efficient multi-scale architectures for …...2019/02/27  · searching for efficient...

32
Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys 2019 Authors: Liang-Chieh Chen, Maxwell D. Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, Jonathon Shlens

Upload: others

Post on 29-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Searching for Efficient Multi-Scale Architectures for Dense Image Prediction

Presented by Paras Jain AISys 2019

Authors: Liang-Chieh Chen, Maxwell D. Collins, Yukun Zhu, George Papandreou, Barret Zoph, Florian Schroff, Hartwig Adam, Jonathon Shlens

Page 2: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 3: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

DNNs… now ubiquitous!

Page 4: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

But DNN design is getting more complex

AlexNet (2012)

VGG16 (2014) ResNet (2016)

Page 5: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

# of applications >> # of AI experts

Growing design space of DNNs

Falling price per FLOP

What is the Design Automation stack for DNNs?

AutoML tries to automatically generate high-accuracy models (subject to constraints)

Page 6: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Blueprint image: https://arxiv.org/pdf/1808.05377.pdf Loop image courtesy Barret Zoph, Quoc Le

Page 7: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Blueprint for an AutoML paper

Blueprint image: https://arxiv.org/pdf/1808.05377.pdf Loop image courtesy Barret Zoph, Quoc Le

Page 8: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Learning straight-line DNNs (simple data)

Page 9: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Learning straight-line DNNs (simple data)

NASNet exceeded human performance on CIFAR and COCO

Page 10: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Learning straight-line DNNs (simple data)

NASNet exceeded human performance on CIFAR and COCO(classification, object detection)

Constrained optimization objective for mobile inference latency

Page 11: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Learning straight-line DNNs (simple data)

NASNet exceeded human performance on CIFAR and COCO(classification, object detection)

Constrained optimization objective for mobile inference latency

Low-cost architecture search via backprop into architecture

Page 12: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 13: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Motivation

• AutoML has exceeded human performance on classification

• Can we apply search to a new vision task (semantic segmentation)?

Page 14: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

What is segmentation? Label each pixel of an image with an class

Key application: Autonomous driving, cancer detection, deforestation detection

Metric: Intersection-over-Union aka Jaccard index

Semantic Segmentation task

Images: Mapillary Vistas

Page 15: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Current state of the art in semantic segmentation

Results generalize to scene parsing (above) and person-part matching

Used AutoML to search space of 1011 models, sampled 28000 models

“Cheap AutoML” = 370 GPUs over one week

Page 16: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Only learn a single “Dense Prediction Cell”

Sample graphs using random search

(Vizier)

A) Train using mobile backbone B) Cache feature maps

C) Early stopping (90m per sample = 100x speedup)

Blueprint for an AutoML paper

Page 17: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 18: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Search spacePre-trained backbone• Majority of network arch is fixed

• MobileNet V2 classification net • Xception classification net

• Chop last few layers off classification net and add some new layers (DPC)

Search this

Page 19: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Dense Prediction

Cell

Random Sampling Proxy task

• Average spatial pyramid pooling (downsample, conv1x1, upsample)

• 1x1 convolution

• 3x3 dilated convolution

output

output

4.2 x 1011 search space

Page 20: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Dense Prediction

Cell

Random Sampling Proxy task

4.2 x 1011 search space

Page 21: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 22: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Dense Prediction

Cell

Random Sampling Proxy task

Page 23: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 24: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Faster NAS using proxy tasks• IDEA: Estimate architecture performance using a proxy task • The better the proxy task is, the more efficient search is • Key contribution of this paper is task-specific proxy tasks

Proxy #1Proxy #2Proxy #3Proxy #4Proxy #5

Model #1Model #2Model #4Model #3Model #5

Page 25: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Proxy task 1: Train using MobileNet

• Predict final accuracy by using a smaller classification network • Xception: 21% top-1 error, 22M params • MobileNet v2: 28% top 1 error, 3.4M params

Page 26: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Proxy task 2: Cache activationsCache classification network activations and only train new layers (freeze gradient)

$

Page 27: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Background Paper overview Search space Sampling strategy Performance estimation Results

Page 28: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Cityscapes Semantic Segmentation

Page 29: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Person-part identification

Page 30: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

PASCAL VOC scene understanding

Page 31: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

#1 #2 #3

Dense Prediction Cells learned

Page 32: Searching for Efficient Multi-Scale Architectures for …...2019/02/27  · Searching for Efficient Multi-Scale Architectures for Dense Image Prediction Presented by Paras Jain AISys

Some discussion points• What are new application areas for NAS?

• Ideas? object detection, speech generation, GANs? • Does NAS un-democratize ML?

• Google leads the training compute arms race • Will the NAS workload influence how hardware should look? • Seems like significant domain knowledge is necessary to develop

SoTA NAS methods — is NAS most useful as a research productivity tool?