transfer learning - machine learning graz · "a survey of transfer learning." journal of...
![Page 1: Transfer Learning - Machine Learning Graz · "A survey of transfer learning." Journal of Big Data 3.1 (2016): 9. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet](https://reader034.vdocuments.mx/reader034/viewer/2022042309/5ed6316304e9cb4adb6710be/html5/thumbnails/1.jpg)
Transfer Learning 31-Jul-2018
Adrian Spataru
Data Scientist at Know-Center
[email protected]
www.fb.me/adrian.spataru.5
Outline
- Quick Neural Network Refresher
- What is Transfer Learning?
- Bottlenecking & Fine-Tuning
- Multitask Learning
- Domain-adversarial Training
- Zero-Shot Learning
- Resources for Pretrained models
Perceptron
Multilayer Perceptron
Convolutional Neural Network
Traditional ML
Transfer Learning
Transfer Learning Landscape
Transfer Learning Tutorial (Hung-yi Lee)
Fine-Tuning
- Scenario
  - Lots of labeled source data
  - Limited labeled target data
- Idea: train a model on the source data, then fine-tune the model on the target data.
- Why?
  - Training on the target data alone will likely overfit.
  - Starting from a pretrained model may reduce training time.
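The fine-tuning idea above can be sketched on a toy problem. This is only an illustration, not the method used in the talk: a one-dimensional linear model stands in for a neural network, and the source and target functions, learning rate, and step counts are all made-up assumptions.

```python
def train(points, w, b, lr, steps):
    """Gradient descent on mean squared error for the model y = w*x + b."""
    n = len(points)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(points, w, b):
    """Mean squared error of the model on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


# Source task (assumed): plenty of labeled data from y = 2x + 1.
source = [(k / 10, 2 * (k / 10) + 1) for k in range(-50, 51)]
# Target task (assumed): only three labeled points from the related y = 2x + 1.5.
target = [(-1.0, -0.5), (0.0, 1.5), (1.0, 3.5)]

# 1) Pretrain on the source data.
w_src, b_src = train(source, 0.0, 0.0, lr=0.05, steps=500)
# 2) Fine-tune: start from the pretrained weights, take a few steps on the target data.
w_ft, b_ft = train(target, w_src, b_src, lr=0.05, steps=20)
# Baseline: the same few steps on the target data, but starting from scratch.
w_scratch, b_scratch = train(target, 0.0, 0.0, lr=0.05, steps=20)

print("pretrained only:", mse(target, w_src, b_src))
print("fine-tuned:     ", mse(target, w_ft, b_ft))
print("from scratch:   ", mse(target, w_scratch, b_scratch))
```

With the same small training budget, the fine-tuned model ends up with a lower target error than both the untouched pretrained model and the model trained from scratch.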
Fine-Tuning Example
- Kaggle iMaterialist Challenge
- Labeling of household items (glass, chair, etc.)
- Multiclass classification problem
- 128 classes
- 190k training images
Typical Image Classifier
Feature Extractor -> Label Classifier
Fine-Tuning
Feature Extractor -> Label Classifier -> 128 outputs
Results
- Fine-tuning a ResNet50 model pretrained on ImageNet
  - 84% Accuracy - Top 20%
- Avg Ensemble of 11 different fine-tuned pretrained models
  - 89% Accuracy - First Place
- ResNet50 takes 4 days to train on a GTX 1060
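The average ensemble mentioned in the results can be sketched as follows. This is a minimal illustration assuming each model outputs one probability per class. The three probability vectors are made up and cover 3 classes instead of 128.

```python
def average_ensemble(prob_lists):
    """Average the per-class probabilities of several models for one image."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


def predict(probs):
    """Index of the class with the highest probability."""
    return max(range(len(probs)), key=lambda c: probs[c])


# Hypothetical outputs of three fine-tuned models for the same image.
model_outputs = [
    [0.50, 0.40, 0.10],
    [0.20, 0.70, 0.10],
    [0.45, 0.35, 0.20],
]

avg = average_ensemble(model_outputs)
label = predict(avg)
print(avg, label)
```

Two of the three models narrowly favor class 0, but the averaged probabilities favor class 1, because averaging also takes into account how confident each model is.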
Bottlenecking - Poor Man's Fine-Tuning
- Idea: train a model on source data -> use the model to extract features for the target data -> train a new model on the extracted features
input -> FC layer -> 128 outputs
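The bottlenecking pipeline above can be sketched in a few lines. This is a toy illustration under stated assumptions: `frozen_extractor` is a hypothetical stand-in for a pretrained network with its head removed, the "images" are two-number tuples, and the new model is a logistic regression with 2 classes rather than an FC layer with 128 outputs.

```python
import math


def frozen_extractor(x):
    """Stand-in for a pretrained network without its classifier head (never trained here)."""
    return [x[0] + x[1], x[0] - x[1]]


def train_head(features, labels, lr=0.5, epochs=200):
    """Logistic regression trained with SGD on the cached features only."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(features, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            w = [wi - lr * (p - y) * fi for wi, fi in zip(w, f)]
            b -= lr * (p - y)
    return w, b


# Made-up target data: four "images" with binary labels.
images = [(0.0, 0.0), (0.2, 0.1), (1.0, 0.9), (0.8, 1.0)]
labels = [0, 0, 1, 1]

# Step 1: one pass through the frozen extractor; the features are cached.
bottlenecks = [frozen_extractor(x) for x in images]
# Step 2: train the small new model on the cached features.
w, b = train_head(bottlenecks, labels)


def classify(x):
    """Extract features with the frozen model, then apply the trained head."""
    f = frozen_extractor(x)
    z = sum(wi * fi for wi, fi in zip(w, f)) + b
    return 1 if z > 0 else 0


print([classify(x) for x in images])
```

The extractor runs once per image during training, and only the small head is trained afterwards. That split is why training is cheap compared with full fine-tuning, and the cached features are usually much smaller than the raw images.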
Results
- Initial feature extraction takes 6 hours on a GTX 1060
- Training a model takes 20 min
- Reduces the data from 30 GB to 2-3 GB
- 82% Accuracy
- Avg Ensemble of 5 bottleneck models
- Top 38% Place - 159/436
CODE DEMO
Multitask Learning
Multitask Learning (2)
- Why consider it?
  - Attention focusing
  - Eavesdropping
  - Regularization
- Lots of architectures
  - Cross-stitch Networks
  - Fully-Adaptive Feature Sharing
  - etc.
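The simplest multitask setup, hard parameter sharing, can be sketched as below. It is not one of the architectures listed above but the plain baseline they build on, and all numbers are made up. One shared weight feeds two task-specific heads. Each head receives gradients only from its own task, while the shared weight receives the sum of both.

```python
def joint_loss(data, shared, head_a, head_b):
    """Sum of the squared errors of both tasks over all samples."""
    total = 0.0
    for x, ya, yb in data:
        h = shared * x  # shared representation
        total += (head_a * h - ya) ** 2 + (head_b * h - yb) ** 2
    return total


def train_multitask(data, shared=1.0, head_a=0.0, head_b=0.0, lr=0.01, epochs=200):
    """SGD on the joint loss; the shared weight is updated by both tasks."""
    for _ in range(epochs):
        for x, ya, yb in data:
            h = shared * x
            err_a = head_a * h - ya
            err_b = head_b * h - yb
            grad_head_a = 2 * err_a * h  # task A only
            grad_head_b = 2 * err_b * h  # task B only
            grad_shared = 2 * err_a * head_a * x + 2 * err_b * head_b * x  # both tasks
            shared -= lr * grad_shared
            head_a -= lr * grad_head_a
            head_b -= lr * grad_head_b
    return shared, head_a, head_b


# Made-up samples (x, label for task A, label for task B): A is 3x, B is -1.5x.
data = [(1.0, 3.0, -1.5), (2.0, 6.0, -3.0)]

shared, head_a, head_b = train_multitask(data)
print(shared, head_a, head_b, joint_loss(data, shared, head_a, head_b))
```

Because the shared weight has to serve both heads, each task constrains the representation used by the other, which is where the regularization effect comes from.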
Multiple Language Translation
Domain Adversarial Training
- Scenario
  - Lots of labeled source data
  - Lots of unlabeled target data
- Goal: Train a model which performs well on the unlabeled target data.
- Goal 2.0: The distributions of the features extracted from source and target data are similar.
Target / Source
Domain Adversarial Training
Transfer Learning (Hung-yi Lee)
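Domain-adversarial training (Ganin et al.) pushes the feature distributions of source and target together with a gradient reversal layer placed between the feature extractor and a domain classifier. The sketch below shows only that layer and how the two gradients combine at the feature extractor. The class name, the `lam` weight, and the example gradient values are assumptions for illustration.

```python
class GradientReversal:
    """Identity in the forward pass; multiplies gradients by -lam in the backward pass."""

    def __init__(self, lam):
        self.lam = lam

    def forward(self, features):
        return list(features)

    def backward(self, grad_from_domain_classifier):
        return [-self.lam * g for g in grad_from_domain_classifier]


def feature_gradient(grad_label, grad_domain, grl):
    """Gradient reaching the feature extractor: label loss gradient plus reversed domain gradient."""
    reversed_grad = grl.backward(grad_domain)
    return [gl + gr for gl, gr in zip(grad_label, reversed_grad)]


grl = GradientReversal(lam=0.5)

features = [1.0, -2.0]
passed_on = grl.forward(features)  # the domain classifier sees unchanged features

# Hypothetical gradients of the two losses with respect to the features.
grad_label = [0.2, 0.4]    # from the label predictor (source data only)
grad_domain = [1.0, -2.0]  # from the domain classifier (source and target data)

print(feature_gradient(grad_label, grad_domain, grl))
```

The domain classifier itself is still trained to tell source from target. Only the feature extractor sees the flipped sign, so it learns features that help the label predictor while making the two domains harder to tell apart.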
Domain Adversarial Training Example
No Adapt - 87% Acc
With Adapt - 91% Acc
Zero Shot Learning
- Scenario
  - Lots of labeled source data
  - Unlabeled target data
- Goal: Train a model which performs well on the target data.
- How? Inference through attributes, metadata, etc.
Zero Shot Learning Attributes
Multimodal Embeddings
Zero Shot Learning Attributes
- Can classify unseen classes
  - If a 1:1 mapping can be ensured
- Ex: Weimaraner Dog
  - gray, has tail, is on land, small
  - [0,1,1,1]
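Attribute-based zero-shot classification can be sketched as a nearest-signature lookup. Everything here is an illustrative assumption: the attribute order is chosen so that the slide's vector [0,1,1,1] fits the Weimaraner, the other classes and the predicted scores are made up, and the 1:1 mapping is read as "every class has a unique attribute vector".

```python
# Assumed attribute order; the slide does not state which position is which.
ATTRIBUTES = ("small", "gray", "has_tail", "on_land")

# Hypothetical class signatures; only the Weimaraner vector comes from the slide.
CLASS_SIGNATURES = {
    "weimaraner": (0, 1, 1, 1),
    "mouse": (1, 1, 1, 1),
    "dolphin": (0, 1, 1, 0),
}


def has_unique_signatures(signatures):
    """True if no two classes share the same attribute vector (1:1 mapping)."""
    return len(set(signatures.values())) == len(signatures)


def zero_shot_classify(predicted_attributes, signatures):
    """Return the class whose signature is closest to the predicted attributes."""
    def distance(sig):
        return sum((p - s) ** 2 for p, s in zip(predicted_attributes, sig))
    return min(signatures, key=lambda name: distance(signatures[name]))


# Scores that an attribute predictor, trained only on seen classes, might output
# for an image of a class it has never seen.
scores = (0.1, 0.9, 0.8, 0.95)

assert has_unique_signatures(CLASS_SIGNATURES)
print(zero_shot_classify(scores, CLASS_SIGNATURES))
```

The classifier never needs a training image of the unseen class. It only needs that class's attribute vector, which is why two classes with identical vectors could not be told apart.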
Zero Shot Learning Attributes Wikipedia
- Use Wikipedia articles, embedded with Word2Vec/GloVe, as object descriptions
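With text embeddings as class descriptions, classification becomes a similarity search in the embedding space. The sketch below assumes the image has already been mapped into the same space as the class vectors. The 3-dimensional vectors are made up, and real Word2Vec/GloVe vectors have far more dimensions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical class embeddings, e.g. derived from each class's Wikipedia article.
class_vectors = {
    "dog": [1.0, 0.2, 0.0],
    "cat": [0.8, 0.6, 0.0],
    "car": [0.0, 0.1, 1.0],
}

# Hypothetical image embedding after projection into the text embedding space.
image_embedding = [0.9, 0.25, 0.05]

best = max(class_vectors, key=lambda name: cosine(image_embedding, class_vectors[name]))
print(best)
```

A new class can be added by inserting its text embedding into `class_vectors`, without retraining the image model.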
Image example of zero shot learning
Google Translate - Zero shot Learning
Backup
Pretrained models
https://modeldepot.io/
http://pretrained.ml/
https://keras.io/
https://github.com/Cadene/pretrained-models.pytorch
https://nlp.stanford.edu/projects/glove/
https://github.com/pumpikano/tf-dann
References
Weiss, Karl, Taghi M. Khoshgoftaar, and DingDing Wang. "A survey of transfer learning." Journal of Big Data 3.1 (2016): 9.
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Dong, Daxiang, et al. "Multi-task learning for multiple language translation." Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Vol. 1. 2015.
Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks." The Journal of Machine Learning Research 17.1 (2016): 2096-2030.
Johnson, Melvin, et al. "Google's multilingual neural machine translation system: enabling zero-shot translation." arXiv preprint arXiv:1611.04558 (2016).
Zhang, Ziming, and Venkatesh Saligrama. "Zero-shot learning via semantic similarity embedding." Proceedings of the IEEE international conference on computer vision. 2015.
Adrian Spataru
Data Scientist at Know-Center
[email protected]
www.fb.me/adrian.spataru.5
Thank you for your Attention!