Exploiting Big Data via Attributes (Offline Contd.)

Upload: edward-grisson

Post on 01-Apr-2015


Page 1: Exploiting Big Data via Attributes (Offline Contd.)

Exploiting Big Data via Attributes (Offline Contd.)

Page 2: Exploiting Big Data via Attributes (Offline Contd.)

Recap - Attributes

What are attributes?

Slide Credit: Devi Parikh

Page 3: Exploiting Big Data via Attributes (Offline Contd.)

Recap - Attributes

Rich Understanding

Image Credit: Ali Farhadi

Page 4: Exploiting Big Data via Attributes (Offline Contd.)

Recap - Annotations

Zero-shot learning

Frogs are green, have heads and legs. What is this?

Image Credit: Olga Russakovsky

Page 5: Exploiting Big Data via Attributes (Offline Contd.)

Recap - Annotations

Attributes help elicit richer descriptions from annotators.

Image Credit: Devi Parikh

Page 6: Exploiting Big Data via Attributes (Offline Contd.)

Understanding Single Image Or

Learning a Classifier (w/ Human Feedback)

Single or Few images to Big Data

Slide Credit: Abhinav Gupta

Page 7: Exploiting Big Data via Attributes (Offline Contd.)

Big Data

90% of web data is visual!

142 billion images, 6 billion added monthly

6 billion images

72 hours of video uploaded every minute

How can attributes help in learning from big data?

Slide Credit: Abhinav Gupta

Page 8: Exploiting Big Data via Attributes (Offline Contd.)

What this part is about

Semi-supervised Learning

Slide Credit: Abhinav Gupta

Page 9: Exploiting Big Data via Attributes (Offline Contd.)

What this part is about

Before the start of the debate, Mr. Obama and Mrs. Clinton met with the moderators, Charles Gibson, left, and George Stephanopoulos, right, of ABC News.

A officer on the left of car checks the speed of other cars on the road.

Weakly-labeled Learning

Slide Credit: Abhinav Gupta

Page 10: Exploiting Big Data via Attributes (Offline Contd.)

Key insight

Attributes can couple the learning of multiple classifiers and hence provide constraints for joint learning.

Goal: Learn multiple classifiers simultaneously.

[Figure labels: Amphitheatre, Auditorium, Banquet, Bedroom]

Slide Credit: Abhinav Gupta

Page 11: Exploiting Big Data via Attributes (Offline Contd.)

Slide Credit: Abhinav Gupta

Semi-supervised Learning

Shrivastava et al., 2012

Page 12: Exploiting Big Data via Attributes (Offline Contd.)

SEMI-SUPERVISED

[Zhu, TR, 2005], [Chunsheng Fang, Slides, 2009]

Slide Credit: Abhinav Gupta

Page 13: Exploiting Big Data via Attributes (Offline Contd.)

Labeled Seed Examples

Amphitheatre

Unlabeled Data

Select Candidates

Train Models

Add to Labeled Set

Retrain Models

Amphitheatre

BOOTSTRAPPING

Slide Credit: Abhinav Gupta
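The loop on this slide (train, select candidates, add to the labeled set, retrain) can be sketched as below. This is only a minimal illustration under stated assumptions, not the method from any cited paper: images are reduced to a single hypothetical 1-D feature, the "model" is just the mean of the labeled set, and the names `train_centroid` and `bootstrap` are made up for the example.

```python
from statistics import mean

def train_centroid(labeled):
    """Train a trivial 'model': the mean feature value of the labeled set."""
    return mean(labeled)

def bootstrap(seeds, unlabeled, per_round=2, rounds=3):
    """Self-training: repeatedly promote the unlabeled points closest to the model."""
    labeled = list(seeds)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        centroid = train_centroid(labeled)            # train / retrain models
        pool.sort(key=lambda x: abs(x - centroid))    # select candidates
        promoted, pool = pool[:per_round], pool[per_round:]
        labeled.extend(promoted)                      # add to labeled set
    return labeled, train_centroid(labeled)

# Hypothetical 1-D features: amphitheatre-like images sit near 1.0,
# auditorium-like distractors near 2.0.
seeds = [1.0, 1.1]
unlabeled = [0.9, 1.2, 1.4, 1.6, 1.8, 2.0, 2.1]
labeled, centroid = bootstrap(seeds, unlabeled)
```

In this toy run the labeled set gradually absorbs points near 2.0 and the model mean moves away from the seeds, which is a small-scale picture of the semantic drift that unconstrained bootstrapping is prone to.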

Page 14: Exploiting Big Data via Attributes (Offline Contd.)

BOOTSTRAPPING

Retrain Models

Labeled Seed Examples

Amphitheatre

Unlabeled Data

Select Candidates

Add to Labeled Set

Amphitheatre

25th Iteration

[Curran et al., PACL 2007]

Semantic Drift

Amphitheatre + Auditorium

Slide Credit: Abhinav Gupta

Page 15: Exploiting Big Data via Attributes (Offline Contd.)

GRAPH-BASED METHODS

[Ebert et al., ECCV 2010] [Fergus et al., NIPS 2009]

Slide Credit: Abhinav Gupta

Page 16: Exploiting Big Data via Attributes (Offline Contd.)

CONSTRAINED BOOTSTRAPPING

Amphitheatre

Auditorium

Slide Credit: Abhinav Gupta

Page 17: Exploiting Big Data via Attributes (Offline Contd.)

Amphitheatre

Auditorium

Amphitheatre

Auditorium

Joint Learning

[Carlson et al., NAACL HLT Workshop on SSL for NLP 2009]

Share Data

CONSTRAINED BOOTSTRAPPING

Slide Credit: Abhinav Gupta

Page 18: Exploiting Big Data via Attributes (Offline Contd.)

Amphitheatre, Auditorium, Banquet Hall, Conference Room

Binary Attributes (BA)

Indoor, Man-made, Tables and Chairs, Large Seating Capacity

[Farhadi et al., CVPR 2009] [Lampert et al., CVPR 2009]

Slide Credit: Abhinav Gupta

Page 19: Exploiting Big Data via Attributes (Offline Contd.)

Slide Credit: Abhinav Gupta

Binary Attributes (BA)

Scene classes: Conference Room, Banquet Hall, Auditorium, Amphitheatre

Attributes: Tables and Chairs, Indoor, Large Seating Capacity, Man-made

[Patterson and Hays, CVPR 2012]

Page 20: Exploiting Big Data via Attributes (Offline Contd.)

Slide Credit: Abhinav Gupta

Auditorium: Indoor, Has Seat Rows

Page 21: Exploiting Big Data via Attributes (Offline Contd.)

Sharing via Dissimilarity

Amphitheatre Auditorium

Has Larger Circular Structures

[Parikh and Grauman, ICCV 2011] [Gupta and Davis, ECCV 2008]

Slide Credit: Abhinav Gupta

Page 22: Exploiting Big Data via Attributes (Offline Contd.)

Amphitheatre Auditorium

Has Larger Circular Structures?

Slide Credit: Abhinav Gupta

Page 23: Exploiting Big Data via Attributes (Offline Contd.)

Amphitheatre Auditorium

Has Larger Circular Structures

Slide Credit: Abhinav Gupta

Page 24: Exploiting Big Data via Attributes (Offline Contd.)

Dissimilarity

Has Larger Circular Structures

[Parikh and Grauman, ICCV 2011] [Gupta and Davis, ECCV 2008]

COMPARATIVE ATTRIBUTES

Slide Credit: Abhinav Gupta

Page 25: Exploiting Big Data via Attributes (Offline Contd.)

• Similar to Relative Attributes.

• Uses pairs of images as data points during learning.

• Instead of predicting a real-valued score, it uses a binary classifier.

COMPARATIVE ATTRIBUTES

Slide Credit: Abhinav Gupta
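The pairwise setup described in the bullets above can be illustrated with a small sketch. Several things here are assumptions made for the example: each ordered image pair is represented by the difference of two hypothetical 2-D feature vectors, and a plain perceptron stands in for the classifier (a different slide names a boosted decision tree, which is not reproduced here). The function names are illustrative only.

```python
def pair_features(a, b):
    """Represent an ordered image pair by the difference of its feature vectors."""
    return [x - y for x, y in zip(a, b)]

def train_pair_classifier(pairs, labels, epochs=50):
    """Perceptron over pairs: label +1 means the first image has more of the attribute."""
    w = [0.0] * len(pairs[0])
    for _ in range(epochs):
        for x, y in zip(pairs, labels):
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:                        # misclassified pair: update weights
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w

def has_more(w, a, b):
    """Binary decision: does image a have more of the attribute than image b?"""
    return sum(wi * xi for wi, xi in zip(w, pair_features(a, b))) > 0

# Hypothetical features per image: [circular_structure_size, unrelated_feature].
train = [[0.1, 0.5], [0.3, 0.2], [0.5, 0.4], [0.7, 0.1], [0.9, 0.3]]
pairs, labels = [], []
for a in train:
    for b in train:
        if a is not b:
            pairs.append(pair_features(a, b))
            labels.append(1 if a[0] > b[0] else -1)   # ground truth uses the first feature only

w = train_pair_classifier(pairs, labels)
```

The classifier never outputs a real-valued attribute strength for a single image; it only answers a yes/no question about an ordered pair, which is the distinction the slide draws from relative attributes.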

Page 26: Exploiting Big Data via Attributes (Offline Contd.)

COMPARATIVE ATTRIBUTES

Dissimilarity

Has Larger Circular Structures

[Parikh and Grauman, ICCV 2011] [Gupta and Davis, ECCV 2008]

Features

• GIST

• RGB (Tiny Image)

• Line Histogram of: Length, Orientation

• LAB histogram

Slide Credit: Abhinav Gupta

Page 27: Exploiting Big Data via Attributes (Offline Contd.)

COMPARATIVE ATTRIBUTES

Dissimilarity

[Parikh and Grauman, ICCV 2011] [Gupta and Davis, ECCV 2008]

Classifier: Boosted Decision Tree [Hoiem et al., IJCV 2007]

Output: ✗ or Has Larger Circular Structures

Slide Credit: Abhinav Gupta

Page 28: Exploiting Big Data via Attributes (Offline Contd.)

Comparative Attributes

[Parikh and Grauman, ICCV 2011] [Gupta and Davis, ECCV 2008]

Is More Open: Amphitheatre > Barn; Amphitheatre > Conference Room; Desert > Barn

Has Taller Structures: Church (Outdoor) > Cemetery; Barn > Cemetery

Slide Credit: Abhinav Gupta

Page 29: Exploiting Big Data via Attributes (Offline Contd.)

Amphitheatre

Auditorium

Amphitheatre

Auditorium

Labeled Seed Examples Bootstrapping

Slide Credit: Abhinav Gupta

Page 30: Exploiting Big Data via Attributes (Offline Contd.)

Labeled Seed Examples

Amphitheatre

Auditorium

Amphitheatre

Auditorium

Bootstrapping

Amphitheatre

Auditorium

Constrained Bootstrapping

Attributes: Indoor, Has Seat Rows

Comparative Attributes: Has Larger Circular Structures

Slide Credit: Abhinav Gupta
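One way to picture the difference between plain and constrained bootstrapping is as a filter on which candidates may be promoted. The sketch below is a simplification under stated assumptions: the classifier outputs are hypothetical precomputed values, the constraints are applied as hard filters (a joint scoring scheme is equally plausible), and all field and function names are invented for the example.

```python
# Each unlabeled image carries hypothetical, precomputed classifier outputs.
candidates = [
    {"id": "img1", "scene_score": 0.90, "indoor": True,  "has_seat_rows": True,  "circular_size": 0.2},
    {"id": "img2", "scene_score": 0.95, "indoor": False, "has_seat_rows": True,  "circular_size": 0.9},
    {"id": "img3", "scene_score": 0.80, "indoor": True,  "has_seat_rows": True,  "circular_size": 0.3},
    {"id": "img4", "scene_score": 0.85, "indoor": True,  "has_seat_rows": False, "circular_size": 0.1},
]

# Binary attribute constraints for the class "Auditorium" (Indoor, Has Seat Rows).
binary_constraints = {"indoor": True, "has_seat_rows": True}

# Comparative constraint: Amphitheatre "has larger circular structures" than Auditorium,
# so an auditorium candidate should fall below every labeled amphitheatre (assumed values).
amphitheatre_circular_sizes = [0.8, 0.7, 0.9]

def satisfies_constraints(c):
    """Check a candidate against the binary and comparative attribute constraints."""
    binary_ok = all(c[name] == value for name, value in binary_constraints.items())
    comparative_ok = all(c["circular_size"] < s for s in amphitheatre_circular_sizes)
    return binary_ok and comparative_ok

def select(cands, k, use_constraints):
    """Promote the k highest-scoring candidates, optionally after constraint filtering."""
    pool = [c for c in cands if satisfies_constraints(c)] if use_constraints else list(cands)
    pool.sort(key=lambda c: c["scene_score"], reverse=True)
    return [c["id"] for c in pool[:k]]

plain = select(candidates, 2, use_constraints=False)    # ['img2', 'img1']
c_boot = select(candidates, 2, use_constraints=True)    # ['img1', 'img3']
```

Here `img2` has the highest scene score but is rejected once the attribute constraints are applied, because it looks outdoor and has large circular structures, i.e. it resembles an amphitheatre more than an auditorium.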

Page 31: Exploiting Big Data via Attributes (Offline Contd.)

Labeled Data: Banquet, Bedroom

Unlabeled Data

Training Pairwise Data: has more space, has larger structures

Promoted Instances: Conference Room, Banquet Hall

[Gupta and Davis, ECCV 2008]

Comparative Attribute Classifiers: more space, larger structures

Attribute Classifiers: indoor, has grass

Scene Classifiers: bedroom, banquet hall

Slide Credit: Abhinav Gupta

Page 32: Exploiting Big Data via Attributes (Offline Contd.)

[Figure: Amphitheatre. Rows: Seed Images, Bootstrapping, BA Constraints, C-Bootstrapping]

Page 33: Exploiting Big Data via Attributes (Offline Contd.)

[Figure: Bridge. Rows: Seed Images, Bootstrapping, BA Constraints, C-Bootstrapping]

Slide Credit: Abhinav Gupta

Page 34: Exploiting Big Data via Attributes (Offline Contd.)

Attributes help improve Recall

Slide Credit: Abhinav Gupta

Page 35: Exploiting Big Data via Attributes (Offline Contd.)

[Figure: Banquet Hall. Seed Images and results at iterations 1, 10 and 40]

Slide Credit: Abhinav Gupta

Page 36: Exploiting Big Data via Attributes (Offline Contd.)

[Figure: Bedroom. Seed Images; Bootstrapping at Iteration-1 and Iteration-60; C-Bootstrapping at Iteration-1 and Iteration-60]

Page 37: Exploiting Big Data via Attributes (Offline Contd.)

Scene Classification

Eigen Functions: [Fergus et al., NIPS 2009]

Slide Credit: Abhinav Gupta

Page 38: Exploiting Big Data via Attributes (Offline Contd.)

Co-training (Large Scale)

• 15 scene categories, 25 seed images per category

• Unlabeled set: 1 million images (SUN Database + ImageNet), >95% distractors

SUN Database: [Xiao et al., CVPR 2010]

ImageNet: [Deng et al., CVPR 2009]

Improves 12 out of 15 scene classifiers

Slide Credit: Abhinav Gupta

Page 39: Exploiting Big Data via Attributes (Offline Contd.)

LIMITATIONS

C-bootstrapping uses semantic attributes and needs manually specified relationships

Amphitheatre > Barn

Amphitheatre > Conference Room

Desert > Barn

Is More Open

Can we learn the relationships?

Slide Credit: Abhinav Gupta

Page 40: Exploiting Big Data via Attributes (Offline Contd.)

Choi et al., Adding Unlabeled Samples to Categories by Learned Attributes, CVPR 2013

Framework for jointly learning visual classifiers and noun-attribute mapping.

Page 41: Exploiting Big Data via Attributes (Offline Contd.)

Formulation

• A joint optimization for

– Learning a classifier in visual feature space (w_c^v)

– Learning a classifier in attribute space (w_c^a)

– Together with finding the samples (I)

• Non-convex

– Mixed integer program: NP-complete problem

– Solution: block coordinate descent

[Equation annotations: learning a classifier on visual feature space; learning a classifier on attribute space with a selection criterion; mutual exclusion; not convex (discrete and continuous variables)]

Slide Credit: Jonghyun Choi
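The alternation behind block coordinate descent on a problem with discrete and continuous variables can be sketched as follows. This is not the objective from the cited paper: as a loudly simplified stand-in, the two "classifiers" are centroids in a hypothetical visual space and a hypothetical attribute space, the cost is a sum of squared distances, and the mutual exclusion term is omitted. All names and data are invented for illustration.

```python
def centroid(points):
    """Mean vector of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def block_coordinate_descent(labeled, unlabeled, k=2, max_iters=10):
    """Alternate between the continuous block (models) and the discrete block (selection)."""
    selected = set()                                   # discrete variables: chosen sample indices
    for _ in range(max_iters):
        # Continuous block: refit both models with the selection held fixed.
        train = labeled + [unlabeled[i] for i in sorted(selected)]
        w_visual = centroid([s["visual"] for s in train])
        w_attr = centroid([s["attr"] for s in train])

        # Discrete block: with the models fixed, pick the k samples with the lowest joint cost.
        def cost(i):
            return (sq_dist(unlabeled[i]["visual"], w_visual)
                    + sq_dist(unlabeled[i]["attr"], w_attr))

        new_selected = set(sorted(range(len(unlabeled)), key=cost)[:k])
        if new_selected == selected:                   # fixed point: neither block changes
            break
        selected = new_selected
    return sorted(selected), w_visual, w_attr

# Hypothetical data: sample 1 looks right visually but has the wrong attribute value.
labeled = [{"visual": [0.0, 0.0], "attr": [1.0]}, {"visual": [0.2, 0.0], "attr": [1.0]}]
unlabeled = [
    {"visual": [0.1, 0.1], "attr": [0.9]},
    {"visual": [0.1, 0.0], "attr": [0.0]},
    {"visual": [0.3, 0.1], "attr": [1.0]},
    {"visual": [2.0, 2.0], "attr": [0.1]},
]
selected, w_visual, w_attr = block_coordinate_descent(labeled, unlabeled)
```

Each block is easy to solve when the other is frozen, which is why the alternation is attractive even though the joint problem is non-convex; the procedure only reaches a local solution.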

Page 42: Exploiting Big Data via Attributes (Offline Contd.)

Overview Diagram

Initial Labeled-Samples

Build Attribute Space

Project

Find Useful Attributes

Unlabeled Samples

Project

Choose Confident Examples To Add

Auxiliary data

Slide Credit: Jonghyun Choi

Page 43: Exploiting Big Data via Attributes (Offline Contd.)

Example Qualitative Results

• Categorical: common traits of a category

Selected by Categorical Attributes

Initial Labeled Training Examples

Dotted

Animal-like shape

Slide Credit: Jonghyun Choi

Page 44: Exploiting Big Data via Attributes (Offline Contd.)

Slide Credit: Abhinav Gupta

Weakly-Labeled Learning

Gupta et al., 2008

Page 45: Exploiting Big Data via Attributes (Offline Contd.)

Captions - Bag of Nouns

Learning Classifiers involves establishing correspondence.

A officer on the left of car checks the speed of other cars on the road.

officer, car, road

Slide Credit: Abhinav Gupta

Page 46: Exploiting Big Data via Attributes (Offline Contd.)

Correspondence - Co-occurrence Relationship

Bear

Water

Bear

Field

Water

Bear

Field

Slide Credit: Abhinav Gupta

Page 47: Exploiting Big Data via Attributes (Offline Contd.)

Co-occurrence Relationship (Problems)

[Figure: images each annotated with the labels Road and Car]

Hypothesis 1

Hypothesis 2

Car Road

Slide Credit: Abhinav Gupta

Page 48: Exploiting Big Data via Attributes (Offline Contd.)

Beyond Nouns – Exploit Relationships

Use annotated text to extract nouns and relationships between nouns.

A officer on the left of car checks the speed of other cars on the road.

On (car, road), Left (officer, car)

car, officer, road

Constrain the correspondence problem using the relationships

On (Car, Road)

Road

Car

Road

Car

More Likely

Less Likely

Key insight: Solve the correspondence problem jointly using constraints!

Slide Credit: Abhinav Gupta
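The way a relationship such as On (car, road) prunes the correspondence hypotheses can be sketched with two regions and two nouns. Everything concrete here is an assumption for illustration: the regions are reduced to hypothetical centre points, and "on" is approximated by a crude test that the first region's centre lies above the second's.

```python
from itertools import permutations

# Hypothetical regions: (x, y) is the region centre; y grows downwards as in image coordinates.
regions = {
    "r1": {"center": (50, 40)},   # upper region
    "r2": {"center": (50, 80)},   # lower region
}
nouns = ["car", "road"]
relationships = [("on", "car", "road")]   # parsed from "On (car, road)"

def on(a, b):
    """Crude spatial test (an assumption): a is 'on' b if a's centre lies above b's centre."""
    return a["center"][1] < b["center"][1]

PREDICATES = {"on": on}

def score(assignment):
    """Count how many caption relationships a noun-to-region assignment satisfies."""
    return sum(
        PREDICATES[rel](regions[assignment[x]], regions[assignment[y]])
        for rel, x, y in relationships
    )

# Enumerate every one-to-one assignment of nouns to regions.
hypotheses = [dict(zip(nouns, perm)) for perm in permutations(regions)]
best = max(hypotheses, key=score)   # {'car': 'r1', 'road': 'r2'}
```

With co-occurrence alone both hypotheses are equally good; the relationship makes the assignment with the car above the road the more likely one.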

Page 49: Exploiting Big Data via Attributes (Offline Contd.)

Relationships

• Prepositions: A preposition usually indicates the temporal, spatial or logical relationship of its object to the rest of the sentence.

• The most common prepositions in English are "about," "above," "across," "after," "against," "along," "among," "around," "at," "before," "behind," "below," "beneath," "beside," "between," "beyond," "but," "by," "despite," "down," "during," "except," "for," "from," "in," "inside," "into," "like," "near," "of," "off," "on," "onto," "out," "outside," "over," "past," "since," "through," "throughout," "till," "to," "toward," "under," "underneath," "until," "up," "upon," "with," "within," and "without". Those shown in bold on the original slide (the vast majority) have clear utility for the analysis of images and video.

• Comparative attributes: relating to color, size, movement, e.g. "larger", "smaller", "taller", "heavier", "faster", …

Goal: Learn models of nouns, prepositions, comparative attributes simultaneously from weakly-labeled data.

Slide Credit: Abhinav Gupta

Page 50: Exploiting Big Data via Attributes (Offline Contd.)

Learning the Model – Chicken Egg Problem

Chicken-Egg Problem: We treat assignment as missing data and formulate an EM approach.

Road

Car

Car

Road

Assignment Problem Learning Problem

On (car, road)

Slide Credit: Abhinav Gupta

Page 51: Exploiting Big Data via Attributes (Offline Contd.)

EM Approach - Learning the Model

• E-Step: Compute the noun assignment for a given set of object and relationship models from the previous iteration.

• M-Step: For the noun assignment computed in the E-step, find the new ML parameters by learning both relationship and object classifiers.

• To initialize the EM approach, any image annotation approach with localization can be used, such as the translation-based model described in [1].

[1] Duygulu, P., Barnard, K., Freitas, N., Forsyth, D.: Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. ECCV (2002)
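The E-step/M-step alternation above can be illustrated with a toy hard-EM loop. This is a sketch under stated assumptions rather than the authors' model: each region has one hypothetical appearance value and a vertical position, noun models are plain means, the On (car, road) relationship is modeled as a mean vertical offset, the weight `lam` is arbitrary, and the initial models are rough guesses instead of the output of the method in [1].

```python
from itertools import permutations
from statistics import mean

# Each image: two regions as (appearance, vertical position); every caption says On (car, road).
# Vertical position grows downwards, so a car on a road has the smaller value.
images = [
    [(0.9, 40.0), (0.1, 80.0)],
    [(0.2, 85.0), (0.8, 45.0)],
    [(0.5, 82.0), (0.5, 42.0)],   # appearance is ambiguous; only the relationship helps
]

def cost(car, road, mu, offset, lam=0.001):
    """Squared error of an assignment under the noun models and the relationship model."""
    appearance = (car[0] - mu["car"]) ** 2 + (road[0] - mu["road"]) ** 2
    relation = ((car[1] - road[1]) - offset) ** 2
    return appearance + lam * relation

def e_step(mu, offset):
    """Hard assignment: for each image pick the (car, road) ordering with the lowest cost."""
    return [min(permutations(regs), key=lambda p: cost(p[0], p[1], mu, offset))
            for regs in images]

def m_step(assignments):
    """Re-estimate noun models and the relationship model as means over the assignments."""
    mu = {"car": mean(c[0] for c, _ in assignments),
          "road": mean(r[0] for _, r in assignments)}
    offset = mean(c[1] - r[1] for c, r in assignments)
    return mu, offset

mu, offset = {"car": 0.6, "road": 0.4}, 0.0   # rough initial models (assumed)
for _ in range(5):
    assignments = e_step(mu, offset)
    mu, offset = m_step(assignments)
```

In the first iteration the third image is a tie and is assigned the wrong way round; once the relationship model has been re-estimated from the other two images, the next E-step flips it so that the car lies above the road.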

Page 52: Exploiting Big Data via Attributes (Offline Contd.)

Relationships modeled

• Most relationships are learned “correctly”

– Above, behind, below, left, right, beside, bluer, greener, nearer, more-textured, smaller, larger, brighter

• But some are associated with the wrong features

– In (topological relationships not captured by color, shape and location)

– on-top-of

– taller (most tall objects are thin and the segmentation algorithm tends to fragment them)

Slide Credit: Abhinav Gupta

Page 53: Exploiting Big Data via Attributes (Offline Contd.)

Resolution of Correspondence Ambiguities

[2] Barnard, K., Fan, Q., Swaminathan, R., Hoogs, A., Collins, R., Rondot, P., Kaufold, J.: Evaluation of localized semantics: data, methodology and experiments. Univ. of Arizona, TR-2005 (2005)

Duygulu et al. [1]

Our Approach

[1] Duygulu, P., Barnard, K., Freitas, N., Forsyth, D.: Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. ECCV (2002)

below(birds,sun) above(sun, sea) brighter(sun,sea) below(waves,sun)

above(statue,rocks);ontopof(rocks, water); larger(water,statue)

below(flowers,horses); ontopof(horses,field); below(flowers,foals)

Slide Credit: Abhinav Gupta

Page 54: Exploiting Big Data via Attributes (Offline Contd.)

Summary

• Attributes can help in exploiting big data.

• Attributes represent how class A is similar to class B, and how class B is different from class A…

• These relationships can help formulate a joint-learning problem and improve learning from large unlabeled and weakly labeled data.

Slide Credit: Abhinav Gupta

Page 55: Exploiting Big Data via Attributes (Offline Contd.)

Slide Credit: Abhinav Gupta