Transcript
Page 1: Automatic Image Annotation (AIA)

Seminar Report Presented to:

Dr. Shanbehzadeh

Presented by: Farzaneh Rezaei

November 2015

Page 2: Automatic Image Annotation (AIA)

2

What is the goal of computer vision?

Perceive the story behind the picture

See the world!!
But what exactly does it mean to see?

Source: Wall-e Movie: Pixar, Walt Disney Pictures

Page 3: Automatic Image Annotation (AIA)

3

Outline

Introduction to Image Annotation
• What?
• Why?

Story Behind AIA
• Components of AIA
• Progress of AIA
• Issues & Conclusions

Going Deeper!
• Feature Extraction
• Learning Methods
• Deep Learning
• Conclusions

Useful Information
• Recent Articles
• Toolbox
• Databases
• Authors

Conclusions
• References

Page 5: Automatic Image Annotation (AIA)

5

What is Automatic Image Annotation?

Automatic image annotation is the task of automatically assigning words to an image that describe the content of the image.

Munirathnam Srikanth, et al. Exploiting ontologies for automatic image annotation

Source: Personalizing Automated Image Annotation Using Cross-Entropy: https://ivi.fnwi.uva.nl/isis/publications/bibtexbrowser.php?key=LiICM2011&bib=all.bib

Page 6: Automatic Image Annotation (AIA)

6

What is Automatic Image Annotation? (Cont.)

Source: MS COCO Captioning Challenge: http://mscoco.org/dataset/#captions-challenge2015

Page 7: Automatic Image Annotation (AIA)

7

Why is image annotation important?

Recently, we have witnessed an exponential growth of user-generated videos and images, due to the boom in social networks such as Facebook and Flickr.

3,000 Photos Are Uploaded Every Second to Facebook

Source: http://petapixel.com/2012/02/01/3000-photos-are-uploaded-every-second-to-facebook/

Page 8: Automatic Image Annotation (AIA)

8

Why is image annotation important? (Cont.)

Source: Barriuso, A., & Torralba, A. (2012). Notes on image annotation

• Applications, e.g. photo organizer apps
• Image classification systems

Page 9: Automatic Image Annotation (AIA)

9

[Chart: Number of articles per year with “Automatic Image Annotation” in the title of the article, 2000 to 2015. Horizontal axis: Year. Vertical axis: number of articles (scale 0 to 70). Reported by: Google Scholar]

Page 11: Automatic Image Annotation (AIA)

11

How do you annotate these images?

Page 12: Automatic Image Annotation (AIA)

12

What are the components of an Automatic Image Annotation system?

Page 13: Automatic Image Annotation (AIA)

13

How to classify images?

What are the components of an Automatic Image Annotation system?

Page 14: Automatic Image Annotation (AIA)

14

What are the components of an Automatic Image Annotation system?

• Feature Extraction
• Classification Methods

Page 16: Automatic Image Annotation (AIA)

16

What are the components of an Automatic Image Annotation system?

• Feature Extraction
• Classification Methods

Pattern Recognition!!
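
The two components named on this slide can be sketched as a tiny pattern-recognition pipeline. This is only an illustration under stated assumptions: the "image" is a flat list of grey levels, the feature is a coarse histogram, the classifier is 1-nearest-neighbour, and the function names and training data are hypothetical, not taken from the slides.

```python
def extract_features(pixels, bins=4):
    """Feature extraction: a coarse, normalized grey-level histogram.
    `pixels` is a flat list of intensities in 0..255 (a toy stand-in for an image)."""
    hist = [0] * bins
    for p in pixels:
        hist[min(p * bins // 256, bins - 1)] += 1
    return [h / len(pixels) for h in hist]


def classify(features, training_set):
    """Classification: return the label of the nearest training feature vector."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(training_set, key=lambda item: dist(features, item[0]))
    return nearest[1]


def annotate(pixels, training_set):
    """Annotation = feature extraction followed by classification."""
    return classify(extract_features(pixels), training_set)


# Hypothetical training data: (feature vector, word)
training_set = [
    (extract_features([10, 20, 30, 40]), "night"),
    (extract_features([200, 220, 240, 250]), "snow"),
]

print(annotate([15, 25, 35, 60], training_set))
```

A real system would replace both stages (richer features, a trained classifier), but the data flow stays the same: image, then feature vector, then word.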

Page 17: Automatic Image Annotation (AIA)

17

Slide Credit

Page 18: Automatic Image Annotation (AIA)

18

An Example of classical approaches in AIA

Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013

Page 19: Automatic Image Annotation (AIA)

19

Issues of classical approaches

Theoretical Limitations of Shallow Architectures*

Functions that can be compactly represented by a depth k architecture might require an exponential number of computational elements to be represented by a depth k − 1 architecture.

*Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning

Page 20: Automatic Image Annotation (AIA)

20

Issues of classical approaches (Cont.)

Theoretical Limitations of Shallow Architectures

• Shallow? Deep?

• Functions?

• Compact?

• Depth?

• Computational Elements?

logic circuit

Page 21: Automatic Image Annotation (AIA)

21

Issues of classical approaches (Cont.)

Picture Source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning

Depth 4 Depth 3

Page 22: Automatic Image Annotation (AIA)

22

Issues of classical approaches (Cont.)

Theoretical Limitations of Shallow Architectures

• Linear regression and logistic regression have depth 1, i.e., have a single level.
• Ordinary multi-layer neural networks, with the most common choice of one hidden layer, have depth two.
• Decision trees can also be seen as having two levels.
• Boosting (Freund & Schapire, 1996) usually adds one level to its base learners: that level computes a vote or linear combination of the outputs of the base learners.
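
As a rough illustration of what "depth" counts, the sketch below builds a depth-1 model (logistic regression) and a depth-2 model (one hidden layer) from the same unit. The weights are hand-picked assumptions chosen so that the depth-2 network computes XOR, a function that no single depth-1 unit of this kind can represent; nothing here is learned, and the names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_regression(x, w, b):
    """Depth 1: a single level (weighted sum followed by a squashing function)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def one_hidden_layer_net(x, hidden, output):
    """Depth 2: a level of hidden units feeding one output unit.
    `hidden` is a list of (weights, bias) pairs; `output` is one (weights, bias) pair."""
    h = [logistic_regression(x, w, b) for w, b in hidden]
    w_out, b_out = output
    return logistic_regression(h, w_out, b_out)


# Hand-picked weights (assumption): hidden unit 1 behaves like OR, unit 2 like AND,
# and the output unit fires when OR is on but AND is off, which is XOR.
hidden = [([20.0, 20.0], -10.0), ([20.0, 20.0], -30.0)]
output = ([20.0, -20.0], -10.0)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, round(one_hidden_layer_net(x, hidden, output)))
```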

Page 23: Automatic Image Annotation (AIA)

23

Issues of classical approaches (Cont.)

Theoretical Limitations of Shallow Architectures

• Shallow? Deep?

• Functions

• Compact

• Depth

• Computational Elements

Page 25: Automatic Image Annotation (AIA)

25

Issues of classical approaches (Cont.)

• A two-layer circuit of logic gates can represent any boolean function (Mendelson, 1997).
• With depth two logical circuits, most boolean functions require an exponential number of logic gates (Wegener, 1987) to be represented (with respect to input size).
• There are functions computable with a polynomial-size logic gates circuit of depth k that require exponential size when restricted to depth k − 1 (Hastad, 1986). The proof of this theorem relies on earlier results (Yao, 1985) showing that d-bit parity circuits of depth 2 have exponential size.
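
The parity example can be made concrete with a small count. The sketch below is an illustration, not a proof of the cited theorems: it enumerates the AND terms of the canonical OR-of-ANDs (depth-2) form of d-bit parity, whose number doubles with each extra input bit, and compares that with a chain of d − 1 two-input XOR gates. Treating XOR as a single gate is an assumption made here for simplicity.

```python
from functools import reduce
from itertools import product


def parity_dnf_terms(d):
    """AND terms of the canonical depth-2 (OR of ANDs) form of d-bit parity:
    one term per input pattern containing an odd number of ones."""
    return [bits for bits in product((0, 1), repeat=d) if sum(bits) % 2 == 1]


def parity_shallow(bits, terms):
    """Evaluate the depth-2 form: OR over terms, each term an exact-match AND."""
    return int(any(bits == term for term in terms))


def parity_deep(bits):
    """Parity as a chain of d - 1 two-input XOR gates (depth grows with d)."""
    return reduce(lambda a, b: a ^ b, bits)


for d in (2, 4, 8):
    terms = parity_dnf_terms(d)
    # Both circuits compute the same function on every input.
    assert all(parity_shallow(b, terms) == parity_deep(b)
               for b in product((0, 1), repeat=d))
    print(f"d={d}: depth-2 form uses {len(terms)} AND terms, XOR chain uses {d - 1} gates")
```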

Page 26: Automatic Image Annotation (AIA)

26

Issues of classical approaches (Cont.)

• One might wonder whether these computational complexity results for boolean circuits are relevant to machine learning.
• See Orponen (1994) for an early survey of theoretical results in computational complexity relevant to learning algorithms.
• Interestingly, many of the results for boolean circuits can be generalized to architectures whose computational elements are linear threshold units (also known as artificial neurons (McCulloch & Pitts, 1943)), which compute:

$$f(x) = \mathbf{1}_{w^{\top} x + b \ge 0} \tag{1}$$

with parameters w and b, where the indicator equals 1 when the condition holds and 0 otherwise.
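
A linear threshold unit is short enough to write out directly. The sketch below implements the unit of Eq. (1) and, as an assumed example, picks weights by hand so that a single unit behaves like an AND gate or an OR gate, which is one way to see why results about logic circuits carry over to such units.

```python
def threshold_unit(x, w, b):
    """Linear threshold unit: returns 1 if w . x + b >= 0, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0


# Hand-picked parameters (assumption): the same unit acts as AND or OR
# depending only on the bias.
AND = ((1, 1), -2)
OR = ((1, 1), -1)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "AND:", threshold_unit(x, *AND), "OR:", threshold_unit(x, *OR))
```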

Page 27: Automatic Image Annotation (AIA)

27

Issues of classical approaches (Cont.)

1 Theoretical Limitations of Shallow Architectures

2 Theoretical Advantages of Deep Architectures

Which one ?? !

Page 28: Automatic Image Annotation (AIA)

28

Slide Credit

Page 29: Automatic Image Annotation (AIA)

29

Slide Credit

Page 30: Automatic Image Annotation (AIA)

30

How to assign a word to an image?

What are the components of an Automatic Image Annotation system?

• Feature Extraction
• Classification Methods

Pattern Recognition!!

Components of AIA

Classical or Shallow Structure Issues

Page 31: Automatic Image Annotation (AIA)

31

http://graffiti-artist.net/corporate-offices/ny-facebook-office-graffiti/

Page 33: Automatic Image Annotation (AIA)

33

Going Deeper!

Feature Extraction & Representation
• Color
• Texture
• Shape
• Segmentation

Learning Methods
• ANN
• SVM
• Bayes
• Metadata

Page 34: Automatic Image Annotation (AIA)

34

Feature Extraction

Color
• Color Histogram
• Color Moments
• Color Coherence Vector
• Color Correlogram
• Scalable Color Descriptor
• Color Structure Descriptor
• Dominant Color Descriptor

Texture
• Spatial: Statistical, Structural, Model-based
• Spectral: FT, DCT, Wavelet, ..

Page 35: Automatic Image Annotation (AIA)

35

Color

Page 36: Automatic Image Annotation (AIA)

36

Color

Page 37: Automatic Image Annotation (AIA)

37

Color: Comparisons

Color method | Pros | Cons
Histogram | Simple to compute, intuitive | High dimension, no spatial info, sensitive to noise
CM | Compact, robust | Not enough to describe all colors, no spatial info
CCV | Spatial info | High dimension, high computation cost
Correlogram | Spatial info | Very high computation cost, sensitive to noise, rotation and scale

Page 38: Automatic Image Annotation (AIA)

38

Color: Comparisons (Cont.)

Color method | Pros | Cons
DCD | Compact, robust, perceptual meaning | Need post-processing for spatial info
CSD | Spatial info | Sensitive to noise, rotation and scale
SCD | Compact on need, scalability | No spatial info, less accurate if compact
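
Two rows of the color comparison (Histogram and CM) can be illustrated with a short sketch. It assumes an image is just a list of (R, G, B) tuples, uses only the first two color moments per channel for brevity, and uses hypothetical function names. It shows why the histogram grows quickly in dimension while moments stay compact, and why neither carries spatial information: reordering the pixels changes nothing.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Joint RGB histogram: quantize each channel, count pixels per bin, normalize.
    The dimension is bins_per_channel ** 3, so it grows quickly with finer bins."""
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        index = ((r * n // 256) * n + (g * n // 256)) * n + (b * n // 256)
        hist[index] += 1
    return [h / len(pixels) for h in hist]


def color_moments(pixels):
    """Mean and standard deviation of each channel: 6 numbers in total."""
    moments = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        moments += [mean, variance ** 0.5]
    return moments


# Toy image (assumption): two red pixels followed by two blue pixels.
image = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]
shuffled = list(reversed(image))  # same pixels, different spatial layout

print(len(color_histogram(image)), len(color_moments(image)))
print(color_histogram(image) == color_histogram(shuffled))
```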

Page 39: Automatic Image Annotation (AIA)

39

Spatial Texture: Comparisons

Texture method | Pros | Cons
Texton | Intuitive | Sensitive to noise, rotation and scale, difficult to define textons
GLCM based method | Intuitive, compact, robust | High computation cost, not enough to describe all textures
Tamura | Perceptually meaningful | Too few features
SAR | Compact, robust, rotation invariant | High computation cost, difficult to define pattern size
FD | Compact, perceptually meaningful | High computation cost, sensitive to scale
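
To make the GLCM row more tangible, here is a minimal sketch of a grey-level co-occurrence matrix with two commonly used statistics, contrast and energy. It assumes the image is a small 2-D list of already quantized grey levels and uses a single horizontal offset; the function names and toy images are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized grey-level co-occurrence matrix for the pixel offset (dx, dy).
    Entry [i][j] is the fraction of pixel pairs with levels i (first) and j (second)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Large when neighbouring pixels differ strongly in grey level."""
    return sum(((i - j) ** 2) * p[i][j] for i in range(len(p)) for j in range(len(p)))


def energy(p):
    """Large when a few co-occurrence pairs dominate (uniform texture)."""
    return sum(v * v for row in p for v in row)


stripes = [[0, 1, 0, 1], [0, 1, 0, 1]]  # vertical stripes
flat = [[0, 0, 0, 0], [0, 0, 0, 0]]     # no texture

print(contrast(glcm(stripes, 2)), contrast(glcm(flat, 2)))
print(energy(glcm(stripes, 2)), energy(glcm(flat, 2)))
```

Even on this toy input the matrix has levels × levels entries per offset, which hints at the computation cost noted in the table.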

Page 40: Automatic Image Annotation (AIA)

40

Spectral Texture: Comparisons (Cont.)

Texture method | Pros | Cons
FT/DCT | Fast computation | Sensitive to scale and rotation
Wavelet | Fast computation, multi-resolution | Sensitive to rotation, limited orientations
Gabor | Multi-scale, multi-orientation, robust | Need rotation normalisation, loss of spectral information due to incomplete cover of spectrum plane
Curvelet | Multi-resolution, multi-orientation, robust | Need rotation normalisation

Page 41: Automatic Image Annotation (AIA)

41

Shape

Chart Source: [Zhang and Lu 2004]

Page 42: Automatic Image Annotation (AIA)

42

Chart Source: [M. Yang, K. Kpalma, J. Ronsin 2008]

Shape (Cont.)

Page 43: Automatic Image Annotation (AIA)

43

Shape (Cont.)

Contour Based
• Calculate shape features only from the boundary of the shape

Region Based
• Extract features from the entire region

Page 44: Automatic Image Annotation (AIA)

44

Shape (Cont.)

• Contour-based techniques are more sensitive to noise than region-based techniques.
• Therefore, color image retrieval usually employs region-based shape features.
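
The sensitivity claim can be illustrated on a toy binary mask. The sketch below is only a hand-made example under stated assumptions: area stands in for a region-based feature, the count of exposed pixel edges stands in for a contour-based feature, and the "noise" is a single missing boundary pixel. It is not a general proof.

```python
def area(mask):
    """Region-based feature: number of foreground pixels."""
    return sum(sum(row) for row in mask)


def perimeter(mask):
    """Contour-based feature: number of foreground pixel edges that touch
    background or the image border (4-connectivity)."""
    rows, cols = len(mask), len(mask[0])
    edges = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                outside = not (0 <= ny < rows and 0 <= nx < cols)
                if outside or not mask[ny][nx]:
                    edges += 1
    return edges


# A 5x5 square inside a 7x7 image, and a copy with a one-pixel notch in its top edge.
square = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)] for y in range(7)]
noisy = [row[:] for row in square]
noisy[1][3] = 0

print("area:", area(square), "->", area(noisy))                  # changes by 4%
print("perimeter:", perimeter(square), "->", perimeter(noisy))   # changes by 10%
```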

Page 45: Automatic Image Annotation (AIA)

45

Learning Methods
• SVM
• ANN
• Tree
• Parametric
• Non-Parametric

Page 46: Automatic Image Annotation (AIA)

46

Learning Methods: Comparisons

Annotation method | Pros | Cons
SVM | Small sample, optimal class boundary, non-linear classification | Single labelling, one class per time, expensive trial and run, sensitive to noisy data, prone to over-fitting
ANN | Multiclass outputs, non-linear classification, robust to noisy data, suitable for complex problem | Single labelling, sub-optimal, expensive training, complex and black box classification
DT | Intuitive, semantic rules, multiclass outputs, fast, allow missing values, handle both categorical and numerical values | Single labelling, sub-optimal, need pruning, can be unstable

Page 47: Automatic Image Annotation (AIA)

47

Learning Methods: Comparisons (Cont.)

Annotation method | Pros | Cons
Non-parametric | Multi-labelling, model free, fast | Large number of parameters, large sample, sensitive to noisy data
Parametric | Multi-labelling, small sample, good approximation of unknown distribution | Predefined distribution, expensive training, approximated boundary
Metadata | Use of both textual and visual features | Difficult to relate visual features with textual features, difficult textual feature extraction
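
As one possible reading of the "Non-parametric" row, the sketch below annotates an image by nearest-neighbour label transfer: no model is fitted, and several labels can be returned at once. The feature vectors, labels, and function name are hypothetical; it is a minimal illustration, not a method taken from the cited review.

```python
from collections import Counter


def knn_annotate(query, training_set, k=3, n_labels=2):
    """Find the k training images closest to the query (squared Euclidean
    distance on feature vectors) and return the n_labels most frequent words
    among their annotations."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    neighbours = sorted(training_set, key=lambda item: dist(query, item[0]))[:k]
    votes = Counter(word for _, words in neighbours for word in words)
    return [word for word, _ in votes.most_common(n_labels)]


# Hypothetical annotated training images: (feature vector, words)
training_set = [
    ([0.9, 0.1], ["sky", "sea"]),
    ([0.8, 0.2], ["sky", "plane"]),
    ([0.7, 0.2], ["sky", "sea"]),
    ([0.1, 0.9], ["grass", "tiger"]),
]

print(knn_annotate([0.85, 0.15], training_set))
```

The whole training set is consulted at query time, which matches the table's note that such methods need a large sample and are sensitive to noisy data.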

Page 48: Automatic Image Annotation (AIA)

48

Deep Learning
• Deep belief networks
• Deep Boltzmann machines
• Deep convolutional neural networks
• Deep recurrent neural networks
• Hierarchical temporal memory

Source: https://en.wikipedia.org/wiki/List_of_machine_learning_concepts

Page 49: Automatic Image Annotation (AIA)

49

Deep Learning (Cont.)

Source: Ranzato, 4 October 2013, Slides

Page 50: Automatic Image Annotation (AIA)

50

Deep Learning (Cont.)

• A potential problem with deep learning?*
• Optimization task
• See:
• Bengio’s articles
• Popular videos about deep learning on YouTube
• Ranzato, 4 October 2013: https://www.youtube.com/watch?v=clgMTk5V2Sk

*: Ranzato, 4 October 2013, Slides

Page 52: Automatic Image Annotation (AIA)

52

2009, Shallow

Source: Venkatesh N. Murthy, S. Maji, R. Manmatha, Automatic Image Annotation using Deep Learning Representations, 2015

Useful Information: Recent Articles

Page 53: Automatic Image Annotation (AIA)

53

Which one ?? !

1 Theoretical Limitations of Shallow Architectures

2 Theoretical Advantages of Deep Architectures

Page 54: Automatic Image Annotation (AIA)

54

Source: B. Klein, G. Lev, G. Sadeh, and L. Wolf, Fisher Vectors Derived from Hybrid Gaussian-Laplacian Mixture Models for Image Annotation 2015

Useful Information: Recent Articles (Cont.)

Page 55: Automatic Image Annotation (AIA)

55

Useful Information: Toolbox

MatConvNet
• MatConvNet is a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs. Several example CNNs are included to classify and encode images.

Caffe
• Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.

Page 56: Automatic Image Annotation (AIA)

56

Useful Information: Databases

Corel5k: An important benchmark for keyword-based image retrieval and image annotation. 5000 images manually annotated with 1 to 5 keywords. The vocabulary contains 260 words.

ESP Game: This data set is obtained from an online game where two players, who cannot communicate outside the game, gain points by agreeing on words describing the image.

IAPR TC12: This set of 20,000 images accompanied by descriptions in several languages was initially published for cross-lingual retrieval.

Page 57: Automatic Image Annotation (AIA)

57

Useful Information: Databases (Cont.)
• Other databases:
• Flickr8, 10, 30

Table Source: M. Guillaumin, T. Mensink, J. Verbeek and C. Schmid, TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation

Page 58: Automatic Image Annotation (AIA)

58

Useful Information: Authors

Cordelia Schmid
• Research director, INRIA
• Computer vision, object recognition, video recognition, learning

Li Fei-Fei
• Professor, Stanford University
• Artificial Intelligence, Machine Learning, Computer Vision, Neuroscience

Yoshua Bengio
• Professor, U. Montreal, Computer Sc.
• Machine learning, deep learning, artificial intelligence

Reported by: Google Scholar

Page 59: Automatic Image Annotation (AIA)

59

Useful Information: Authors (Cont.)

Richard Socher
• MetaMind
• Deep learning, machine learning, natural language processing, computer vision
• Recursive Deep Learning for Natural Language Processing and Computer Vision, PhD Thesis, Computer Science Department, Stanford University
• 2014 Arthur L. Samuel Best Computer Science PhD Thesis Award

Reported by: Google Scholar

Page 61: Automatic Image Annotation (AIA)

61

How to assign a word to an image?

What are the components of an Automatic Image Annotation system?

• Feature Extraction
• Classification Methods

Pattern Recognition!!

Components of AIA

Classical or Shallow Structure Issues

Conclusions!!!

Page 62: Automatic Image Annotation (AIA)

62

Conclusions (Cont.)

1. High dimensional feature analysis.
2. How to build an effective annotation model?
3. Currently, annotation and ranking are done online simultaneously in the multiple labelling annotation approaches. This is not efficient for image retrieval.
4. Lack of standard vocabulary and taxonomy.
5. There is no commonly acceptable image database.
6. Insufficient depth of architectures, and locality of estimators [Bengio, 2009].

Picture Source: Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends® in Machine Learning

Source: Zhang, D., Islam, M. M., & Lu, G. (2012). A review on automatic image annotation techniques. Pattern Recognition, 45(1), 346–362. doi:10.1016/j.patcog.2011.05.013

Page 63: Automatic Image Annotation (AIA)

63

References

