multimedia databases: where are we? 2005 knowledge fusion workshop vincent oria department of...
Post on 27-Dec-2015
214 Views
Preview:
TRANSCRIPT
Multimedia Databases: Multimedia Databases: Where are we?Where are we?
2005 Knowledge Fusion Workshop2005 Knowledge Fusion Workshop
Vincent Oria
Department of Computer Science New Jersey Institute of Technology
2005 knowledge Fusion Workshop2
Multimedia databases: ChallengesMultimedia databases: Challenges
Retrieval and Indexing “Give me all video sequences showing Steve
laughing” Storage, Communication and
performances Standards (MPEG) Streaming and Resource Scheduling
Continious data such as audio and video need continious transport (QoS)
Authoring
Presentation
2005 knowledge Fusion Workshop3
OutlineOutline
Mono or Multi-media databases (or data repositories)
Image Content Analysis and Description Image Querying Multimedia Metadata Standards Conclusion
2005 knowledge Fusion Workshop4
Multimedia Database ArchitectureMultimedia Database Architecture
Multimedia Data Preprocessing System
Database Processing
MM DataPre-
processor
AdditionalInformation
<!ELEMENT ..>.....<!ATTLIST...>
Multimedia DBMS
Users
Que
ry I
nter
face
MM DataInstance
<article>.....</article>
Recognized components
MM DataInstance
MM Data
Meta-Data
2005 knowledge Fusion Workshop5
Document Database ArchitectureDocument Database Architecture
Document Processing System
Database Processing
DTD filesD
TD
Par
ser<!ELEMENT ..>
.....<!ATTLIST...>
DTDManager
TypeGenerator
Multimedia DBMS
Users
Que
ry I
nter
faceDTD
Document content
SGML/XMLDocuments
XML or SGMLDocumentInstance
ParseTree
<article>.....</article>
<!ELEMENT ..>.....<!ATTLIST...> DTD
C++ Types
C++ Objects
SGML/XMLParser
InstanceGenerator
2005 knowledge Fusion Workshop6
Image Database Architecture Image Database Architecture
Image Processing System
Database Processing
Image Analysis and Pattern Recognition
ImageAnnotation
Multimedia DBMS
Users
Que
ry I
nter
face
Image ContentDescription
Image
Image
SyntacticObjects
SemanticObjects
<article>.....</article>
Meta-Data
2005 knowledge Fusion Workshop8
Multimedia at the Confluence of Many Multimedia at the Confluence of Many DisciplinesDisciplines
Database Systems Artificial Intelligence
Visualization
DBMSModelingQuery processing
…
Machine LearningNeural NetworksAgentsKnowledge Representation…
Computer graphicsHuman Computer Interaction3D representation…
Information Retrieval
mathmatics
Networks
CodingStatistical andMathematical Modeling…
Other
CommunicationTimingQoS…
IndexingInverted files…
2005 knowledge Fusion Workshop9
Image Content AnalysisImage Content Analysis
Image content analysis can be categorized in 2 groups: Low-level features: vectors in a multi-
dimensional space Color Texture Shape
Mid- to high-level features: Try to infer semantics
Semantic Gap
2005 knowledge Fusion Workshop10
Image Content Analysis: ColorImage Content Analysis: Color
Color space: Multidimensional space A dimension is a color component Examples of color space: RGB, CIELAB, CIEL*u*v*,
YCbCr, YIQ, YUV, HSV RGB space: A color is a linear combination of 3 primary
colors (Red, Green and Blue) Color Quantization
Used to reduce the color resolution of an image Three widely used color features
Global color histogram Local color histogram Dominant color
2005 knowledge Fusion Workshop11
Color HistogramsColor Histograms
Color histograms indicate color distribution without spatial information Color histogram distance metrics
0
10
20
30
40
50
Red Orange Yellow Green Blue Indigo Violet
2005 knowledge Fusion Workshop12
Image Content Analysis: TextureImage Content Analysis: Texture
Refers to visual patterns with properties of homogeneity that do not result from the presence of only a single color
Examples of texture: Tree barks, clouds, water, bricks and fabrics
Texture features: Contrast, uniformity, coarseness, roughness, frequency, density and directionality
Two types of texture descriptors Statistical model-based
Explores the gray level spatial dependence of texture and extracts meaningful statistics as texture representation
Transform-based DCT transform, Fourier-Mellin transform, Polar Fourier
transform, Gabor and wavelet transform
2005 knowledge Fusion Workshop13
Image Content Analysis: ShapeImage Content Analysis: Shape
Object segmentation Approaches:
Global threshold-based approach Region growing, Split and merge approach, Edge detection app
Still a difficult problem in computer vision. Generally speaking it is difficult to achieve perfect segmentation
2005 knowledge Fusion Workshop14
Salient Objects vs. Salient PointsSalient Objects vs. Salient Points
Original images Segmented images with region boundaries
Extracted salient points
Generic low-level description of images into salient objects and salient points
2005 knowledge Fusion Workshop15
Modeling Images – PrinciplesModeling Images – Principles
Support for multiple representations of an image
Support for user-defined categorization of images
Well-defined set of operations on images An image can have (semantic, functional,
spatial) relationships with other images (or documents) which should be represented in the DBMS
An image is composed of salient objects (meaningful image components)
2005 knowledge Fusion Workshop16
Salient Object ModelingSalient Object Modeling
Multiple representations of a salient object (grid, vector) are allowed
A salient object O is of a particular type which belongs to a user defined salient object types hierarchy
An image component may have some (semantic, functional, spatial) relationships with other salient objects
2005 knowledge Fusion Workshop17
General Approach to Similarity General Approach to Similarity QueriesQueries
A descriptor is a point in a multidimensional space
Querying consists in defining a metric in the space and computing distances between a query point and the points in the space.
2005 knowledge Fusion Workshop18
The vector-based Similarity The vector-based Similarity SearchesSearches
1) extract from each object N numerical features and map objects into points of a N-dimensional space2) use a suitable distance (e.g., Euclidean) over such a space, and
search for “close” objects using a multi-dimensional (“spatial”) index(low distance = high similarity)
2005 knowledge Fusion Workshop19
User-defined similarityUser-defined similarity
Using the same distance function is not always appropriate
Example: retrieve (only) black points
2005 knowledge Fusion Workshop21
Integration of Sub-Query ResultsIntegration of Sub-Query Results
Image similarity queries are processed by splitting the overall query into sub-queries. How to obtain an effective and efficient integrating result ?
f0,1 f0,3f0,2
f1,1 f1,3f1,2 Feature
Extraction
Scoring Function
s1
s1,1 s1,3s1,2Feature
Comparison
I1I0
query
The common approach is to define a monotonic scoring function (e.g. min and avg) that associates to each object Ii a numeric value (overall score) si
An object is better (preferred to) than another iff its s is higher
2005 knowledge Fusion Workshop22
Common Approach: EfficiencyCommon Approach: Efficiency
In order to minimize the number of DB accesses, “middleware” algorithms are applied to return only the Top-k highest scored objects TA [FLN01] MEDRANK [FKS03] Upper [BGM02]
I2 S2,TI2 S2,C
I1 S1,TI3 S3,C
I3 S3,TI4 S4,C
I5 S5,TI1 S1,C
I4 S4,TI5 S5,C
TEXTURECOLOR
sub-query C
C T
sub-query T
STOP!
Top-k
… …
2005 knowledge Fusion Workshop23
Skyline OperatorSkyline Operator
The Skyline filters out the set of objects that are not dominated by any other object
“An object dominates another object if it is equal or better in all its dimensions and better in at least one dimension”
Hotel name
Distance to beach (km)
Price ($)
Opera 0.5 300
Chariot 2.5 100
Star 3 50
Hilton 3.5 350
t1
t2
t3
t4
t3
t2
t1
t4
t3 t4
t2
t1t4
t4
2005 knowledge Fusion Workshop24
Best OperatorBest Operator
Skyline filters out only the “first ” better results…but what happens if the cardinality of Skyline is
low and the user want more results?
The Best operator has been recently proposed Independent from the selected partial order Given a PO and the level n (n>0), Best computes
the best results (level 1) the “second choices” (level 2) … the “n-th choices” (level n)
t1t2t3
t4
2005 knowledge Fusion Workshop25
MM Example: BestMM Example: Best
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Color
Text
ure
…
…
2005 knowledge Fusion Workshop26
MPEG-7 ObjectivesMPEG-7 Objectives
MPEG-7, formally called ”Multimedia Content Description Interface”, will standardise: A language to specify description schemes, i.e. a
Description Definition Language (DDL). A set of Description Schemes and Descriptors A
scheme for coding the description
http://www.cselt.it/mpeg/standards/mpeg-7/mpeg-7.htm
2005 knowledge Fusion Workshop27
Content Context and ObjectivesContent Context and Objectives
Specification of a “Multimedia Content Description Interface”
Developed by the International Standard Organization and the International Electrotechnical Commission (IEC)
Standardized representation of multimedia metadata in XML (XML Schema Language)
Describes audio-visual content as a number of levels (features, structures, semantics, models,…)
2005 knowledge Fusion Workshop30
DescriptionGeneration
MPEG7Description
MPEG7Coded
Description
User
Encoder Decoder
Search /QueryEngine
MPEG7 DescriptionDefinition Language
(DDL)
MPEG7 DescriptionSchemes (DS) &Descriptors (D)
MM ContentMM Content
MPEG-7 FrameworkMPEG-7 Framework
FilterAgents
2005 knowledge Fusion Workshop32
MPEG-7 Example (cont)MPEG-7 Example (cont)
<object_hierarchy type="PHYSICAL"> <!-- Physical hierarchy -->
<object_node id="10" object_ref="0"> <!-- Portrait --> <object_node id="11" object_ref="1"> <!-- Father --> <object_node id="12" object_ref="4"/> <!-- Father’s face --> </object_node> <object_node id="13" object_ref="2"> <!-- Mother --> <object_node id="14" object_ref="5"/> <!-- Mother’s face --> </object_node> </object_node> </object_hierarchy> <object_hierarchy type="LOGICAL"> <!-- Logical hierarchy:
faces in the image --> <object_node id="15" object_ref="3"> <!-- Faces --> <object_node id="16" object_ref="4"/> <!-- Father’s face --> <object_node id="17" object_ref="5"/> <!-- Mother’s face --> </object_node> </object_hierarchy>
2005 knowledge Fusion Workshop33
MPEG-21MPEG-21
MPEG-7 does not take into account the aspect of the organization of the infrastructure of distributed multimedia systems
Initiated in 2000 to provide mechanisms for distributed multimedia systems design and associated services
A new distribution entity is proposed and validated: the Digital item
2005 knowledge Fusion Workshop34
Conclusion: Towards more SemanticsConclusion: Towards more Semantics
Semantic Representation MPEG-7
Semantic Acquisition Automated approach for all purpose images
and videos have failed Manual interpretation is tedious
How to minimize human interventions?
2005 knowledge Fusion Workshop35
Towards more Semantics: Towards more Semantics: ChallengesChallenges
“A picture is worth a thousand words”
“It is not so much that a picture is worth a thousand words, for many fewer words can describe a still picture for most retrieval purpose, the issue has more to do with the fact that those words vary from one person to another”
``In spite of almost fifty years of research, design of a general-purpose machine pattern recognizer remains an elusive goal....The best pattern recognizers in most instances are humans, yet we do not understand how humans recognize patterns".
top related