Squeeze-and-Excitation Networks (image-net.org/challenges/talks_2017/SENet.pdf)
TRANSCRIPT
Squeeze-and-Excitation Networks
Jie Hu¹, Li Shen², Gang Sun¹
¹Momenta  ²University of Oxford
Convolution
A convolutional filter is expected to be an informative combination:
• Fusing channel-wise and spatial information
• Within local receptive fields
Exploration on Spatial Enhancement
• Multi-scale embedding: Inception [9]
• Contextual embedding: Inside-Outside Network [13]
Squeeze-and-Excitation (SE) Networks
• Question: can a network be enhanced by modelling channel relationships?
• Motivation:
  • Explicitly model channel interdependencies within modules
  • Feature recalibration: selectively enhance useful features and suppress less useful ones
Squeeze-and-Excitation Module
Squeeze
• Shrinking the feature maps ∈ ℝ^(w×h×c) through the spatial dimensions (w × h)
• Producing a global distribution of channel-wise responses

Excitation
• Learning W ∈ ℝ^(c×c) to explicitly model channel association
• Gating mechanism to produce channel-wise weights

Scale
• Reweighting the feature maps ∈ ℝ^(w×h×c)
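The three steps above can be sketched in plain NumPy (a minimal illustration, not the authors' implementation; the ReLU between the two FC layers and the bottleneck shape follow the SE design on the next slide):

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on one feature map x of shape (c, h, w).

    w1: (c//r, c) reduction FC weights; w2: (c, c//r) expansion FC weights.
    """
    # Squeeze: global average pooling over spatial dims -> (c,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck FC -> ReLU -> FC -> sigmoid gate in (0, 1)
    s = np.maximum(w1 @ z, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ s)))
    # Scale: reweight each channel of the original map
    return x * gate[:, None, None]

# Toy check with c = 8 channels and reduction ratio r = 4
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 5))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
y = se_block(x, w1, w2)
```

Since the gate lies in (0, 1), every channel is attenuated, never amplified; the relative weighting between channels is what the excitation learns.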
SE-Inception Module / SE-ResNet Module (block diagrams; shared structure):
• Inception (or residual) branch output x̃: c × h × w
• Global pooling → c × 1 × 1
• FC → c/16 × 1 × 1
• FC → c × 1 × 1
• Sigmoid → channel-wise weights, c × 1 × 1
• Scale: reweighted branch output, c × h × w
• In the SE-ResNet module, the scaled branch output is added to the identity shortcut x
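The SE-ResNet wiring above, gate the residual branch and then add the identity, can be sketched as follows (minimal NumPy; the hypothetical `residual_fn` stands in for the block's conv stack):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def se_resnet_module(x, residual_fn, w1, w2):
    """SE-ResNet module: squeeze/excite the residual branch, then add x.

    x: (c, h, w); w1: (c//r, c) and w2: (c, c//r) are the two FC layers.
    """
    u = residual_fn(x)                             # residual branch, (c, h, w)
    z = u.mean(axis=(1, 2))                        # squeeze
    gate = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # excitation
    return x + u * gate[:, None, None]             # scale, then identity add

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
y = se_resnet_module(x, lambda t: 0.5 * t, w1, w2)
```

Note the gate is applied before the addition, so the identity path is left untouched, matching the diagram.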
Model and Computational Complexity
SE-ResNet-50 vs. ResNet-50:
• Parameters: 2%~10% additional
• Computation cost: <1% additional (theoretical)
• GPU inference time: ~10% additional
• CPU inference time: <2% additional
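The parameter overhead can be reproduced back-of-the-envelope: each SE block adds two FC layers, i.e. 2c²/r extra weights with reduction ratio r = 16, summed over the standard ResNet-50 stage widths and block counts:

```python
# Extra parameters per SE block: two FC layers of sizes (c/r, c) and (c, c/r),
# i.e. 2*c*c/r weights with r = 16 (biases ignored for this estimate).
r = 16
# Standard ResNet-50 configuration: (block output channels, number of blocks)
stages = [(256, 3), (512, 4), (1024, 6), (2048, 3)]
extra = sum(blocks * 2 * c * c // r for c, blocks in stages)
print(extra)  # 2514944, roughly 10% of ResNet-50's ~25.6M parameters
```

The heavier late stages (c = 1024, 2048) dominate the overhead, which is why the extra cost sits near the top of the 2%~10% range for ResNet-50.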
Training – Momenta ROCS
• Data augmentation
  ✓ Mirror flip, random-size crop [9], rotation, color jitter
• Mini-batch data sampling
  ✓ Balanced-data strategy [7]
• Training hyper-parameters
  ✓ 4 or 8 GPU servers (8 NVIDIA Titan X per server)
  ✓ Batch size: 1024 / 2048 (32 per GPU)
  ✓ Initial learning rate: 0.6 (decreased every 30 epochs)
  ✓ Synchronous SGD with momentum 0.9
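The learning-rate schedule above can be sketched as a step decay (the tenfold decay factor is an assumption; the slide only says the rate is decreased every 30 epochs):

```python
def learning_rate(epoch, base_lr=0.6, decay_every=30, factor=0.1):
    """Step schedule: start at base_lr, multiply by `factor` every
    `decay_every` epochs. The factor of 0.1 is a conventional choice,
    not stated on the slide."""
    return base_lr * factor ** (epoch // decay_every)
```

So epochs 0..29 run at 0.6, epochs 30..59 at 0.06, and so on.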
Experiments on ImageNet-1k dataset
• Empirical investigations:
  • Benefits against deeper networks
  • Incorporation with modern architectures
• ILSVRC 2017 Classification Task
Benefits against Network Depth
Incorporation with Modern Architectures
Comparison with State-of-the-art
SENet here is an SE-ResNeXt-152 (64 × 4d)
ILSVRC 2017 Classification Task
| Team | Top-5 error (%) |
| --- | --- |
| WMW | 2.251 |
| Trimps-Soushen | 2.481 |
| NUS-Qihoo-DPNs | 2.740 |
| BDAT | 2.962 |
| ILSVRC 2016 Winner | 2.991 |
References
[1] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In CVPR, 2016.
[2] K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In ICCV, 2015.
[3] K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In ECCV, 2016.
[4] G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten. Densely connected convolutional networks. In CVPR, 2017.
[5] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In ICML, 2015.
[6] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei. ImageNet large scale visual recognition challenge. IJCV, 2015.
[7] L. Shen, Z. Lin, and Q. Huang. Relay backpropagation for effective learning of deep convolutional neural networks. In ECCV, 2016.
[8] C. Szegedy, S. Ioffe, and V. Vanhoucke. Inception-v4, Inception-ResNet and the impact of residual connections on learning. arXiv preprint arXiv:1602.07261, 2016.
[9] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In CVPR, 2015.
[10] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. Rethinking the Inception architecture for computer vision. In CVPR, 2016.
[11] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In CVPR, 2017.
[12] X. Zhang, Z. Li, C. C. Loy, and D. Lin. PolyNet: A pursuit of structural diversity in very deep networks. In CVPR, 2017.
[13] S. Bell, C. L. Zitnick, K. Bala, and R. Girshick. Inside-Outside Net: Detecting objects in context with skip pooling and recurrent neural networks. In CVPR, 2016.
Thank you