An MPEG-7 Based Content-An MPEG-7 Based Content-aware Album System for aware Album System for Consumer Photographs Consumer Photographs
2003/12/182003/12/18
Chen-Hsiu Huang, Chih-Hao Shen, Chun-Hsiang Huang Chen-Hsiu Huang, Chih-Hao Shen, Chun-Hsiang Huang and Ja-Ling Wuand Ja-Ling Wu
Communication and Multimedia Laboratory,Communication and Multimedia Laboratory,National Taiwan University,National Taiwan University,
E-mail: {chenhsiu,shen,bh,wjl}@cmlab.csie.ntu.edu.twE-mail: {chenhsiu,shen,bh,wjl}@cmlab.csie.ntu.edu.tw
IntroductionIntroduction
It’s ease for consumers to shoot pictures but not It’s ease for consumers to shoot pictures but not trivial when it comes to deal with many of them.trivial when it comes to deal with many of them. Contents that we can not handle or manage Contents that we can not handle or manage
are of no values.are of no values. Many album system are designed to solve this Many album system are designed to solve this
by using by using EXIF informationEXIF information or or textual metadatatextual metadata, , but we think that’s not quite straight forward.but we think that’s not quite straight forward.
An ideal album system should be able to identify An ideal album system should be able to identify the difference between photographs and realize the difference between photographs and realize some semantic information about the content; some semantic information about the content;
It should be a content-aware album system.It should be a content-aware album system.
Core FunctionalitiesCore Functionalities LocatingLocating: Query images by face: Query images by face
Face detection & recognitionFace detection & recognition
AdaptationAdaptation: Smart Thumbnail: Smart Thumbnail Photo Focus identificationPhoto Focus identification
BrowsingBrowsing: Photo Similarity: Photo Similarity Find relevant photos with Find relevant photos with
similarity calculationsimilarity calculation
Query Images by FaceQuery Images by Face
Steps for querying photos by face:Steps for querying photos by face:
PS: We use Intel OpenCV Library as face detection & recognition module
Photo FocusPhoto Focus
Before thumbnailing, we should first identify what’s the Before thumbnailing, we should first identify what’s the focus in photosfocus in photos
For photos with people, human faces are surely our For photos with people, human faces are surely our focus when viewing.focus when viewing.
The user attention model has applied to find some The user attention model has applied to find some saliency points:saliency points: RedRed: Intensity based: Intensity based GreenGreen: Color based: Color based BlueBlue: Skin color based: Skin color based
Smart ThumbnailSmart Thumbnail
Focus Based Adaptive Selection
Direct Scale
Traditional way of creating thumbnail
Cropping the focus region first, then scalingBetter then direct scaling, but not so good
A weighting function was applied to calculate its importance.User can select the cropping ratio, the cropping region is adaptive decided according to the weighting value
Adaptive SelectionAdaptive Selection
For all the visual objects For all the visual objects (faces, saliency points), (faces, saliency points), calculate its importance calculate its importance by:by:
When adaptive selection, When adaptive selection, sort those visual objects sort those visual objects by importanceby importance, dropping , dropping the least import object to the least import object to achieve the goal cropping achieve the goal cropping ratio.ratio.
chwi dFRFRW /)( 2
Photo SimilarityPhoto Similarity
Borrowed from MPEG-7 standard: Borrowed from MPEG-7 standard: Color Layout DescriptorColor Layout Descriptor
Spatial distribution of colorsSpatial distribution of colors Dominant Color DescriptorDominant Color Descriptor
The representative colors in imageThe representative colors in image Face Number DescriptorFace Number Descriptor
The number of faces detected in imageThe number of faces detected in image By using the faces information and MPEG-7 By using the faces information and MPEG-7
descriptors, we can calculate the similarities descriptors, we can calculate the similarities between images.between images.
Similarity ModelingSimilarity Modeling
Distance of face number descriptor between photos is Distance of face number descriptor between photos is defined as: defined as:
Similarity modeling with descriptor distance combinationSimilarity modeling with descriptor distance combination
),max( ji
ji
FND FNFN
FNFNdist
3/)( FNDDCDCLDij distdistdistSim
System DiagramSystem Diagram
Face detection & reorganization
User attention modelSaliency Map
MPEG-7 Visual Descriptors
Query by Face
Photo Focus &Smart Thumbnail
Photo Similarity
We can get more semantic meanings from low level features by combining those kernel modules.
We can get more semantic meanings from low level features by combining those kernel modules.
In the FutureIn the Future
The album system can be improved both systematic side The album system can be improved both systematic side and component side:and component side:
System aspectSystem aspect: : The album syntax should be fully conform to the MPEG-7 The album syntax should be fully conform to the MPEG-7
standard.standard. The album should be able to process other media type such as The album should be able to process other media type such as
audio and video.audio and video.
Component aspectComponent aspect: : More low level features or descriptors in MPEG-7 standard will More low level features or descriptors in MPEG-7 standard will
be used and combined for further semantic meaning extraction.be used and combined for further semantic meaning extraction. The face detection & recognition library could be fine tuned to The face detection & recognition library could be fine tuned to
meet the needs of album system.meet the needs of album system.