AINL 2016: Moskvichev
Data Augmentation Method for the Image Sentiment Analysis
Alexander Rakovsky¹, Arseny Moskvichev², Andrey Filchenkov¹
¹ ITMO University, Saint Petersburg, Russia
² Saint Petersburg State University, Saint Petersburg, Russia
Image sentiment analysis
Positiveness: 0.9
Positiveness: 0.01
Why is it important?
Two words:
Social networks
How do we approach it?
1. Collect lots of labeled images
2. Train a convolutional neural network
3. ???
4. Profit
How do we approach it?
1. Collect lots of labeled images
2. Train a convolutional neural network
3. ???
4. Profit
Problem!
Solution
Data augmentation.
1. Get a few manually labeled images with corresponding hashtags
2. Learn to reconstruct labels from hashtags
3. Collect as much labeled data as you need!
Details
• Collecting data through the Flickr API (using keywords)
• Assessors evaluate the emotional colouring (positiveness) of each image
• Converting each image's hashtags to vector representations (word2vec) and averaging them
• Using machine learning to predict the assessors’ ratings
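The last two steps can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the tiny `TOY_VECTORS` table stands in for pretrained word2vec vectors, the labeled sample is invented, and the helper names (`average_vector`, `knn_predict`) and the Euclidean distance are assumptions made for the example. kNN regression is used here because the results slide mentions it.

```python
import math

# Hypothetical 3-dimensional "embeddings"; a real pipeline would load
# pretrained word2vec vectors instead of this toy table.
TOY_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "sunshine": [0.8, 0.2, 0.1],
    "love": [0.9, 0.0, 0.2],
    "sad": [-0.8, 0.1, 0.0],
    "funeral": [-0.9, 0.2, 0.1],
    "rain": [-0.5, 0.3, 0.2],
}


def average_vector(hashtags, vectors=TOY_VECTORS):
    """Average the vectors of the hashtags that have an embedding."""
    known = [vectors[tag] for tag in hashtags if tag in vectors]
    if not known:
        return None  # no usable hashtag, so no prediction is possible
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def knn_predict(query, train_points, train_scores, k=3):
    """kNN regression: mean score of the k nearest training points."""
    nearest = sorted(
        zip(train_points, train_scores),
        key=lambda pair: math.dist(query, pair[0]),  # Euclidean distance
    )[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Hypothetical assessor-labeled sample: (hashtags, positiveness in [0, 1]).
labeled = [
    (["happy", "sunshine"], 0.9),
    (["love"], 0.95),
    (["sad", "rain"], 0.1),
    (["funeral"], 0.05),
]
train_points = [average_vector(tags) for tags, _ in labeled]
train_scores = [score for _, score in labeled]

# Label a new, unlabeled image from its hashtags alone.
new_point = average_vector(["love", "sunshine", "unknowntag"])
predicted = knn_predict(new_point, train_points, train_scores, k=2)
```

Hashtags without an embedding are skipped rather than treated as errors, so an image is only left unlabeled when none of its hashtags are known.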
(Preliminary!) Results
• kNN accuracy on classification task: 0.95
• Average correlation between assessors: 0.86
• Between the kNN regression and assessors: 0.83
• Using this algorithm is almost as good as hiring one more assessor!
• Suspiciously good...
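One way such a comparison could be computed is sketched below. The slides do not say which correlation coefficient or averaging scheme was used, so Pearson correlation, averaging over assessor pairs, and all ratings in the example are assumptions made purely for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical positiveness ratings of the same five images.
assessors = {
    "a1": [0.9, 0.1, 0.8, 0.2, 0.5],
    "a2": [0.8, 0.2, 0.9, 0.1, 0.6],
    "a3": [1.0, 0.0, 0.7, 0.3, 0.4],
}
# Hypothetical model predictions for the same images.
model = [0.85, 0.15, 0.75, 0.25, 0.5]

# Average correlation over all pairs of assessors.
pairs = list(combinations(assessors.values(), 2))
inter_assessor = sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Average correlation between the model and each assessor.
model_vs_assessors = sum(
    pearson(model, ratings) for ratings in assessors.values()
) / len(assessors)
```

If `model_vs_assessors` comes out close to `inter_assessor`, the model agrees with the assessors about as well as they agree with each other, which is the sense of the "one more assessor" comparison.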
Details
• Collecting data through the Flickr API (using keywords)
• Assessors evaluate the emotional colouring (positiveness) of each image
• Converting each image's hashtags to vector representations (word2vec) and averaging them
• Using machine learning to predict the assessors’ ratings
Nonrepresentative sample!
Pros
• Easy to use (no word preprocessing)
• Good results* (compared to dictionary-based solutions)
Cons
• Needs pre-training and an initial manually labeled sample
Conclusions
• The proposed method affords a simple and efficient hashtag-based data augmentation solution for image sentiment analysis.
• More work is to be done to estimate the method’s performance on a general set of images.
Thank you!