Content Based Image Retrieval
Romit Das · Ryan Scotka
Google Image Search (GIS) Problems
• Search based on filename
  – Verbatim match
  – Noun replacement
• Potential for Abuse (Google Hack)
Possible Solutions
• Metadata
  – Standards
  – Re-index existing images
• Manual Classification
  – Time
• Content-based Classification
CBIR – Training
1. Choose features to distinguish images.
2. Extract said features.
3. Apply statistical method to model features.
4. Categorize based on textual description.
Example
Dimensions
Color Frequencies
Spatial Distribution
200 x 200 + Mostly flesh tones + Flesh tones concentrated in the center = baby
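To make the example concrete, here is a minimal sketch of how the three kinds of features (dimensions, color frequencies, spatial distribution) might be computed for a tiny image. The `is_flesh` rule, its thresholds, and the "middle half" definition of the center are all hypothetical choices made for illustration; they are not taken from the slides.

```python
def is_flesh(pixel):
    """Hypothetical, very crude flesh-tone test on an (r, g, b) tuple."""
    r, g, b = pixel
    return r > 150 and r > g > b


def describe(image):
    """Summarize an image given as a list of rows of (r, g, b) tuples."""
    height, width = len(image), len(image[0])

    def in_center(y, x):
        # Assumption: "center" means the middle half in each direction.
        return (height / 4 <= y < 3 * height / 4
                and width / 4 <= x < 3 * width / 4)

    flesh = [(y, x) for y in range(height) for x in range(width)
             if is_flesh(image[y][x])]
    centered = sum(1 for y, x in flesh if in_center(y, x))
    return {
        "dimensions": (width, height),                      # Dimensions
        "flesh_fraction": len(flesh) / (width * height),    # Color frequency
        "flesh_in_center": centered / len(flesh) if flesh else 0.0,  # Spatial
    }
```

A classifier would then map such a description to a label such as "baby"; the sketch stops at feature extraction.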
Author’s Feature Set
• Feature Set (6 dimensions):
  – Color averages (LUV)
  – High-frequency energy bands
    • “Effectively discern local texture”
    • Wavelet transform on 4x4 blocks
    • Use HL, LH, and HH “high energy bands”
    • Use the LL for lower resolution analysis
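As a rough illustration of the six-dimensional feature vector, the sketch below applies a one-level Haar transform to a 4x4 block and takes the root-mean-square of each high-frequency band. The Haar filter, the use of the L channel for texture, the RMS energy measure, and the function names are assumptions made here for simplicity; the slides do not specify them. HL/LH naming conventions also vary between texts.

```python
import math


def haar_level1(block):
    """One-level 2D Haar transform of a 4x4 block.

    Returns the LL, HL, LH and HH bands, each as a 2x2 list.
    """
    ll = [[0.0] * 2 for _ in range(2)]
    hl = [[0.0] * 2 for _ in range(2)]
    lh = [[0.0] * 2 for _ in range(2)]
    hh = [[0.0] * 2 for _ in range(2)]
    for i in range(2):
        for j in range(2):
            a, b = block[2 * i][2 * j], block[2 * i][2 * j + 1]
            c, d = block[2 * i + 1][2 * j], block[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # local average (low resolution)
            hl[i][j] = (a - b + c - d) / 2  # change across columns
            lh[i][j] = (a + b - c - d) / 2  # change across rows
            hh[i][j] = (a - b - c + d) / 2  # diagonal change
    return ll, hl, lh, hh


def band_energy(band):
    """Root-mean-square of the four coefficients in a 2x2 band."""
    return math.sqrt(sum(v * v for row in band for v in row) / 4)


def texture_features(block):
    """Energies of the HL, LH and HH bands of a 4x4 block."""
    _, hl, lh, hh = haar_level1(block)
    return [band_energy(hl), band_energy(lh), band_energy(hh)]


def block_features(l_block, u_block, v_block):
    """Six features: mean L, U, V plus three texture energies (from L)."""
    def mean(block):
        return sum(sum(row) for row in block) / 16
    return [mean(l_block), mean(u_block), mean(v_block)] + texture_features(l_block)
```

A flat block yields zero texture energy in all three bands, while a block of vertical stripes puts all of its energy into the band that measures change across columns.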
Author’s Implementation
• Statistical Modeling
  – Use machine learning to build concepts
Concept = Paris
Training Set =
Markov Models
• Take known facts
• Deduce hidden/unknown data
Markov Model Example
• Given:
  – Queues of people, shelves, price labels, disgruntled workers
• Possible Results:
  – Post office
  – Supermarket
  – Record Store
Markov Model Example
• Given:
  – Queues of people, shelves, price labels, disgruntled workers, food products
• Possible Results:
  – Post office
  – Supermarket
  – Record Store
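The idea of deducing a hidden fact (the place) from known facts (the observations) can be sketched numerically. The code below is a simplified stand-in, not a full Markov model: it multiplies per-observation likelihoods as if they were independent and normalizes the result. Every probability in the table is made up for illustration.

```python
# Hypothetical P(observation | place); the numbers are invented.
LIKELIHOOD = {
    "post office": {
        "queues of people": 0.8, "shelves": 0.6, "price labels": 0.6,
        "disgruntled workers": 0.7, "food products": 0.05,
    },
    "supermarket": {
        "queues of people": 0.8, "shelves": 0.9, "price labels": 0.9,
        "disgruntled workers": 0.6, "food products": 0.95,
    },
    "record store": {
        "queues of people": 0.5, "shelves": 0.9, "price labels": 0.9,
        "disgruntled workers": 0.6, "food products": 0.05,
    },
}


def posterior(observations):
    """Normalized score of each place, assuming equal priors."""
    scores = {}
    for place, table in LIKELIHOOD.items():
        score = 1.0
        for obs in observations:
            score *= table[obs]  # independence assumption
        scores[place] = score
    total = sum(scores.values())
    return {place: score / total for place, score in scores.items()}


base = ["queues of people", "shelves", "price labels", "disgruntled workers"]
before = posterior(base)                     # no clear winner
after = posterior(base + ["food products"])  # one place dominates
```

With these invented numbers, the first set of observations leaves all three places plausible, while adding "food products" makes the supermarket by far the most likely result.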
Ninja Model
Person, outdoors
Ninja Model
People, ninjas, outdoors
Ninja Model
People, ninjas, weapons, outdoors
Ninja Markov Model
Person, outdoors
People, ninjas, outdoors
People, ninjas, outdoors, weapons, class photo
Creating Concepts
• Training Concept
  – Created from hand-picked images
  – Must choose statistically significant training size
• Resulting Concept
  – Used in automatic cataloging of future images
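The train-then-catalog workflow can be sketched as follows. This is not the authors' Markov-model method: as a deliberately simpler substitute, each concept is modeled by a per-dimension mean and variance (a diagonal Gaussian) over the feature vectors of its hand-picked training images, and a new image is tagged with the best-scoring concepts. The function names and the `top_k` parameter are hypothetical.

```python
import math


def train_concept(feature_vectors):
    """Fit per-dimension mean and variance to a concept's training vectors."""
    n = len(feature_vectors)
    dims = len(feature_vectors[0])
    means = [sum(v[d] for v in feature_vectors) / n for d in range(dims)]
    variances = [
        # Floor the variance so a constant dimension cannot divide by zero.
        max(sum((v[d] - means[d]) ** 2 for v in feature_vectors) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(model, vector):
    """Log-density of a feature vector under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )


def catalog(models, vector, top_k=2):
    """Return the top_k concept names for a new image's feature vector."""
    ranked = sorted(models,
                    key=lambda name: log_likelihood(models[name], vector),
                    reverse=True)
    return ranked[:top_k]
```

Returning several concepts per image, rather than a single best match, reflects the fact that one image can belong to more than one concept.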
Observations
• Images are associated with multiple concepts.
• Not foolproof
• Example:
People, ninjas, outdoors, weapons, class photo
Advantages
• Automatic categorization
Disadvantages
• False positives
  – Concepts may require a vast number of images
    • Increases training time
• Dissimilar images needed for training of a concept
Future Additions
• Further refinement of conflicting semantics
• Weights assigned to classifications
Our Implementation
• Perform classification with alternate learners (Weka)