CS257 Modelling Multimedia Information: Lecture 6

Post on 21-Dec-2015


Page 1: CS257 Modelling Multimedia Information LECTURE 6

CS257 Modelling Multimedia Information

LECTURE 6

Page 2: CS257 Modelling Multimedia Information LECTURE 6

Introduction

• See beginning of Lecture 5…

Page 3: CS257 Modelling Multimedia Information LECTURE 6

Queries to Video Databases

• Users may want to query for a particular event involving particular people, e.g. “find me video with Bill hitting Tom”. Why not use a list of keywords [hit, Bill, Tom] both for the query and to represent film content? Because a keyword list loses the roles: it matches Tom hitting Bill equally well.

• We need more structured descriptions of what’s happening (both for queries and for video metadata), i.e. who is doing what to whom, with what, and why. [More on this in PART 1]
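
The difference between a keyword list and a structured description can be sketched in a few lines of Python. This is only an illustration: the `Event` class, its field names (`action`, `agent`, `patient`) and the two clips are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A structured description: who (agent) does what (action) to whom (patient)."""
    action: str
    agent: str
    patient: str


# Two hypothetical clips showing opposite events.
clips = {
    "clip1": Event("hit", "Bill", "Tom"),   # Bill hits Tom
    "clip2": Event("hit", "Tom", "Bill"),   # Tom hits Bill
}


def keywords(event):
    """Flatten an event into an unordered keyword set, losing the roles."""
    return {event.action, event.agent, event.patient}


def keyword_search(query_terms):
    """Return clips whose keyword set contains every query term."""
    return sorted(c for c, e in clips.items() if set(query_terms) <= keywords(e))


def structured_search(query):
    """Return clips whose event matches the query role by role."""
    return sorted(c for c, e in clips.items() if e == query)


keyword_hits = keyword_search(["hit", "Bill", "Tom"])
structured_hits = structured_search(Event("hit", "Bill", "Tom"))
print(keyword_hits)     # ['clip1', 'clip2']
print(structured_hits)  # ['clip1']
```

The keyword query returns both clips because both contain the same three words; only the role-based query separates them.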

Page 4: CS257 Modelling Multimedia Information LECTURE 6

Queries to Video Databases

• Users may want to specify a temporal sequence of events, e.g. “find me video where this happens then this happens while that happens”

[More on this in PART 2]

Page 5: CS257 Modelling Multimedia Information LECTURE 6

Queries to Video Databases

• How to express queries / How to describe content – can be considered two sides of the same coin; both require dealing with the same kinds of issues

Page 6: CS257 Modelling Multimedia Information LECTURE 6

Creating Metadata for Video Data

• Content-descriptive metadata for video often needs to be manually annotated

• However, in some cases the process can be (partially) automated by:
– Video segmentation
– Feature recognition, e.g. to detect faces, explosions, etc.
– Extracting keywords from time-aligned collateral texts, e.g. subtitles and audio description
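
As a rough sketch of the last option, keywords can be pulled from subtitle cues and indexed by the time interval in which they occur. The cue format, the sample text and the tiny stopword list below are assumptions made for illustration, not part of the lecture.

```python
import re
from collections import defaultdict

# Hypothetical subtitle cues: (start_seconds, end_seconds, text).
subtitles = [
    (12.0, 15.5, "Bill, put the briefcase down!"),
    (15.5, 18.0, "Tom grabs the briefcase and runs."),
    (40.0, 43.0, "The car explodes."),
]

# Illustrative stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "and", "a", "put", "down"}


def index_keywords(cues):
    """Map each keyword to the list of (start, end) intervals in which it occurs."""
    index = defaultdict(list)
    for start, end, text in cues:
        # Use a set so a word repeated within one cue is indexed once.
        for word in set(re.findall(r"[a-z]+", text.lower())):
            if word not in STOPWORDS:
                index[word].append((start, end))
    return index


index = index_keywords(subtitles)
print(index["briefcase"])  # [(12.0, 15.5), (15.5, 18.0)]
```

Because the cues carry timestamps, each keyword hit points directly at an interval of video, which is what makes collateral text useful as metadata.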

Page 7: CS257 Modelling Multimedia Information LECTURE 6

Overview of LECTURE 6

• PART 1: Need to be able to formally describe video content in terms of objects and events in order to make a query to a video database, e.g. specify who is doing what.
– Subrahmanian’s Video SQL

• PART 2: May wish to specify temporal and/or causal relationships between events, e.g. X happens before Y, A causes B to happen.
– Allen’s temporal logic
– Roth’s system for video browsing by causal links

• LAB: Bring coursework questions

Page 8: CS257 Modelling Multimedia Information LECTURE 6

PART 1: Querying Video Content

Four kinds of retrieval according to Subrahmanian (1998)

Segment Retrieval: “find all video segments where an exchange of a briefcase took place at John’s house”

Object Retrieval: “find all the people in the video sequence (v,s,e)”

Activity Retrieval: “what was happening in the video sequence (v,s,e)”

Property-based Retrieval: “find all segments where somebody is wearing a blue shirt”

Page 9: CS257 Modelling Multimedia Information LECTURE 6

Querying Video Content

• Subrahmanian (1998) proposes an extension to SQL in order to express a user’s information need when querying a video database
– Based on video functions

• Recall that SQL is a database query language for relational databases; queries are expressed in terms of:
SELECT (which attributes)
FROM (which table)
WHERE (these conditions hold)

Page 10: CS257 Modelling Multimedia Information LECTURE 6

Subrahmanian’s Video Functions

FindVideoWithObject(o)

FindVideoWithActivity(a)

FindVideoWithActivityandProp(a,p,z)

FindVideoWithObjectandProp(o,p,z)

Page 11: CS257 Modelling Multimedia Information LECTURE 6

Subrahmanian’s Video Functions (continued)

FindObjectsInVideo(v,s,e)

FindActivitiesInVideo(v,s,e)

FindActivitiesAndPropsInVideo(v,s,e)

FindObjectsAndPropsInVideo(v,s,e)
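
One way to picture what these functions compute is to run them over a small in-memory annotation store. The sketch below is not Subrahmanian's implementation: the `SEGMENTS` layout, the sample annotations, and the choice to treat a segment as "in" (v,s,e) when the intervals overlap are all assumptions made for illustration.

```python
# Hypothetical annotations: each segment (v, s, e) lists its objects and
# activities, each with a dict of property -> value.
SEGMENTS = [
    {"seg": ("vid1", 0, 120),
     "objects": {"Denis Dopeman": {"type": "Person"},
                 "Jane Shady": {"type": "Person"},
                 "Briefcase": {"type": "Thing"}},
     "activities": {"ExchangeObject": {"Giver": "Jane Shady",
                                       "Receiver": "Denis Dopeman",
                                       "Item": "Briefcase"}}},
    {"seg": ("vid1", 121, 300),
     "objects": {"Denis Dopeman": {"type": "Person"}},
     "activities": {"Walking": {"Who": "Denis Dopeman"}}},
]


def _overlaps(seg, v, s, e):
    """True if the annotated segment is in video v and intersects [s, e]."""
    sv, ss, se = seg
    return sv == v and ss <= e and s <= se


def find_video_with_object(o):
    return [x["seg"] for x in SEGMENTS if o in x["objects"]]


def find_video_with_activity(a):
    return [x["seg"] for x in SEGMENTS if a in x["activities"]]


def find_video_with_activity_and_prop(a, p, z):
    return [x["seg"] for x in SEGMENTS if x["activities"].get(a, {}).get(p) == z]


def find_video_with_object_and_prop(o, p, z):
    return [x["seg"] for x in SEGMENTS if x["objects"].get(o, {}).get(p) == z]


def find_objects_in_video(v, s, e):
    return sorted({o for x in SEGMENTS if _overlaps(x["seg"], v, s, e)
                   for o in x["objects"]})


def find_activities_in_video(v, s, e):
    return sorted({a for x in SEGMENTS if _overlaps(x["seg"], v, s, e)
                   for a in x["activities"]})


def find_objects_and_props_in_video(v, s, e):
    return {o: props for x in SEGMENTS if _overlaps(x["seg"], v, s, e)
            for o, props in x["objects"].items()}


def find_activities_and_props_in_video(v, s, e):
    return {a: props for x in SEGMENTS if _overlaps(x["seg"], v, s, e)
            for a, props in x["activities"].items()}


print(find_video_with_object("Jane Shady"))       # [('vid1', 0, 120)]
print(find_activities_in_video("vid1", 200, 250))  # ['Walking']
```

The first four functions go from content to segments; the last four go from a segment to its content, which is why a query language needs both directions.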

Page 12: CS257 Modelling Multimedia Information LECTURE 6

A Query Language for Video

SELECT may contain
    Vid_Id : [s,e]

FROM may contain
    video : <source>

WHERE condition allows statements like
    term IN func_call

(term can be a variable, object, activity or property value; func_call is a video function)

Page 13: CS257 Modelling Multimedia Information LECTURE 6

EXAMPLE 1

“Find all video sequences from the library CrimeVidLib1 that contain Denis Dopeman”

SELECT vid : [s,e]
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman)

Page 14: CS257 Modelling Multimedia Information LECTURE 6

EXAMPLE 2

“Find all video sequences from the library CrimeVidLib1 that show Jane Shady giving Denis Dopeman a briefcase”

Page 15: CS257 Modelling Multimedia Information LECTURE 6

EXAMPLE 2

SELECT vid : [s,e]
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
      (vid,s,e) IN FindVideoWithObject(Jane Shady) AND
      (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Item, Briefcase) AND
      (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Giver, Jane Shady) AND
      (vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Receiver, Denis Dopeman)

Page 16: CS257 Modelling Multimedia Information LECTURE 6

EXAMPLE 3

“Which people have been seen with Denis Dopeman in CrimeVidLib1?”

Page 17: CS257 Modelling Multimedia Information LECTURE 6

EXAMPLE 3

SELECT vid : [s,e], Object
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
      Object IN FindObjectsInVideo(vid,s,e) AND
      Object ≠ Denis Dopeman AND
      type of (Object, Person)
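
The way such a WHERE clause is evaluated can be sketched as a nested comprehension: find the segments containing Denis Dopeman, enumerate the objects in each, and keep the other people. The library contents, the helper names and the simple name-to-type table below are hypothetical; the sketch excludes Denis Dopeman himself, since the information need asks who was seen with him.

```python
# Hypothetical annotation table for one library: segment -> {object: type}.
CRIME_VID_LIB1 = {
    ("vid1", 0, 120): {"Denis Dopeman": "Person",
                       "Jane Shady": "Person",
                       "Briefcase": "Thing"},
    ("vid1", 121, 300): {"Denis Dopeman": "Person"},
    ("vid2", 0, 60): {"Jane Shady": "Person", "Bob Smith": "Person"},
}


def find_video_with_object(lib, o):
    """Segments in which object o appears."""
    return [seg for seg, objs in lib.items() if o in objs]


def find_objects_in_video(lib, seg):
    """Objects annotated in the given segment."""
    return list(lib[seg])


def type_of(lib, seg, obj, t):
    """True if obj is annotated with type t in the given segment."""
    return lib[seg][obj] == t


target = "Denis Dopeman"
results = [
    (seg, obj)                                                  # SELECT vid : [s,e], Object
    for seg in find_video_with_object(CRIME_VID_LIB1, target)   # (vid,s,e) IN ...
    for obj in find_objects_in_video(CRIME_VID_LIB1, seg)       # Object IN ...
    if obj != target                                            # exclude Denis himself
    and type_of(CRIME_VID_LIB1, seg, obj, "Person")             # people only
]
print(results)  # [(('vid1', 0, 120), 'Jane Shady')]
```

Note that the segment in vid2 never appears in the result, because Denis Dopeman is not in it, and the briefcase is filtered out by the type test.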

Page 18: CS257 Modelling Multimedia Information LECTURE 6

Exercise 6-1

Given a video database of old sports broadcasts, called SportsVidLib, express the following users’ information needs using the extended SQL as well as you can. You should comment on how well the extended SQL is able to capture each user’s information need and discuss alternative ways of expressing the information need more fully.

• Bob wants to see all the video sequences with Michael Owen kicking a ball

• Tom wants to see all the video sequences in which Vinnie Jones is tackling Paul Gascoigne

• Mary wants to see all the video sequences in which Roy Keane is arguing with the referee, because Jose Reyes punched Gary Neville, while Thierry Henry scores a goal, and then Roy Keane is sent off.

Page 19: CS257 Modelling Multimedia Information LECTURE 6

Bob wants to see all the video sequences with Michael Owen kicking a ball

Page 20: CS257 Modelling Multimedia Information LECTURE 6

Tom wants to see all the video sequences in which Vinnie Jones is tackling Paul Gascoigne

Page 21: CS257 Modelling Multimedia Information LECTURE 6

Mary wants to see all the video sequences in which Roy Keane is arguing with the referee, because Jose Reyes punched Gary Neville, while Thierry Henry scores a goal, and then Roy Keane is sent off.

Page 22: CS257 Modelling Multimedia Information LECTURE 6

Think about…

What metadata would be required in order to execute these kinds of video query?

How could this be stored and searched most efficiently?

Page 23: CS257 Modelling Multimedia Information LECTURE 6

PART 2: Enriching Video Data Models and Queries

• More sophisticated queries to video databases can be supported by considering:
– Temporal relationships between video intervals
– Causal relationships between events

Need to be able to describe temporal relationships between intervals formally and make inferences about temporal sequences…

Page 24: CS257 Modelling Multimedia Information LECTURE 6

Temporal Relationships between Intervals

• Allen’s (1983) work on temporal logic is often discussed in the video database literature (and in other computing disciplines)

• Allen defines 13 temporal relationships that cover every way in which two temporal intervals (e.g. intervals or events in video) can be related; these can be used to formulate video queries

• A transitivity table allows a system to infer which relationships can hold between A and C if the relationship between A and B and the relationship between B and C are known (where A, B, C are intervals)

SEE MODULE WEB-PAGE FOR EXTRA NOTES ON THIS

Page 25: CS257 Modelling Multimedia Information LECTURE 6

Relation        Symbol  Inverse  Example
X equal Y       =       =        XXXXX
                                 YYYYY
X before Y      <       >        XXXX  YYYY
X meets Y       m       mi       XXXXYYYY
X overlaps Y    o       oi       XXXXX
                                   YYYYY
X during Y      d       di          XXX
                                 YYYYYYYYY
X starts Y      s       si       XXXX
                                 YYYYYYYY
X finishes Y    f       fi            XXXXX
                                 YYYYYYYYYY
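
The 13 relationships can be told apart purely by comparing interval endpoints. The function below is a minimal sketch, assuming intervals are given as (start, end) pairs with start < end; the function name and the pair representation are choices made here, not something the lecture prescribes.

```python
def allen_relation(x, y):
    """Return the symbol of Allen's relation holding between intervals x and y."""
    xs, xe = x
    ys, ye = y
    if xs >= xe or ys >= ye:
        raise ValueError("intervals must satisfy start < end")
    if xe < ys:
        return "<"      # X before Y
    if ye < xs:
        return ">"      # X after Y
    if xe == ys:
        return "m"      # X meets Y
    if ye == xs:
        return "mi"     # X met by Y
    if xs == ys and xe == ye:
        return "="      # X equal Y
    if xs == ys:
        return "s" if xe < ye else "si"   # X starts Y / X started by Y
    if xe == ye:
        return "f" if xs > ys else "fi"   # X finishes Y / X finished by Y
    if ys < xs and xe < ye:
        return "d"      # X during Y
    if xs < ys and ye < xe:
        return "di"     # X contains Y
    return "o" if xs < ys else "oi"       # X overlaps Y / X overlapped by Y


print(allen_relation((0, 4), (6, 10)))  # <
print(allen_relation((3, 6), (0, 9)))   # d
```

With segment annotations stored as (start, end) pairs, a temporal condition in a query reduces to one call of this kind per pair of candidate segments.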

Page 26: CS257 Modelling Multimedia Information LECTURE 6

Temporal Relationships between Intervals

• Crucial aspect of Allen’s work is the transitivity table that enables inferences to be made about temporal sequences

• Inferences take the form:

If A r1 B, and B r2 C, then only r3, r4, r5… may hold between A and C

For example:

If A < B and B < C, then A < C

Page 27: CS257 Modelling Multimedia Information LECTURE 6

Another Example

• If A “contains” B, and B < C then what relationships can hold between A and C?

A: AAAAAAAAAAAAA
B:    BBBBB
C:          CC             (A “contains” C)
C:          CCCC           (A “is finished by” C)
C:          CCCCCCC        (A “overlaps” C)
C:              CCCCC      (A “meets” C)
C:                CCCCC    (A < C)

Possibilities: A < C; A “overlaps” C; A “meets” C; A “contains” C; A “is finished by” C
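
An entry of the transitivity table can be checked by brute force: enumerate small integer intervals, keep the triples (A, B, C) that satisfy the two given relationships, and collect what holds between A and C. This is a sketch for checking inferences, not Allen's table-lookup algorithm; the endpoint-sign encoding and the names `relation` and `compose` are assumptions made here. Six distinct time points are enough because three intervals have at most six endpoints.

```python
from itertools import combinations, product


def sign(a, b):
    """-1, 0 or 1 according to whether a is less than, equal to or greater than b."""
    return (a > b) - (a < b)


# Each relation is identified by the signs of four endpoint comparisons:
# (xs vs ys, xs vs ye, xe vs ys, xe vs ye).
RELATIONS = {
    (-1, -1, -1, -1): "<", (1, 1, 1, 1): ">",
    (-1, -1, 0, -1): "m", (1, 0, 1, 1): "mi",
    (-1, -1, 1, -1): "o", (1, -1, 1, 1): "oi",
    (1, -1, 1, -1): "d", (-1, -1, 1, 1): "di",
    (0, -1, 1, -1): "s", (0, -1, 1, 1): "si",
    (1, -1, 1, 0): "f", (-1, -1, 1, 0): "fi",
    (0, -1, 1, 0): "=",
}


def relation(x, y):
    """Allen relation between intervals x = (xs, xe) and y = (ys, ye)."""
    (xs, xe), (ys, ye) = x, y
    return RELATIONS[(sign(xs, ys), sign(xs, ye), sign(xe, ys), sign(xe, ye))]


# All intervals (start < end) over six time points.
INTERVALS = list(combinations(range(6), 2))


def compose(r1, r2):
    """Relations possible between A and C, given A r1 B and B r2 C."""
    return {relation(a, c)
            for a, b, c in product(INTERVALS, repeat=3)
            if relation(a, b) == r1 and relation(b, c) == r2}


print(compose("<", "<"))   # {'<'}
print(compose("di", "<"))  # the five relations <, o, m, di, fi, in some order
```

Here `di` is the symbol for “contains”, so the second call reproduces the five possibilities listed for the example above.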

Page 28: CS257 Modelling Multimedia Information LECTURE 6

Modelling the Relationships between Entities and Events in Film

• Some temporal relationships might be interpreted as causal relationships

• Roth (1999) proposed the use of a semantic network to represent the relationships between entities and events in a movie – including causal relations

• The user can then browse between scenes in a movie, e.g. if they are watching the scene of an explosion, they may browse to the scene in which a bomb was planted, via the semantic network (an extra note on semantic networks will be on the module website).
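
Browsing by causal links can be pictured as walking a small graph. The sketch below is not Roth's system: it keeps only a list of hypothetical (cause, effect) pairs between scenes and follows them backwards from the scene being watched.

```python
from collections import deque

# Hypothetical causal links between scenes: (cause scene, effect scene).
CAUSES = [
    ("bomb planted", "explosion"),
    ("villain escapes prison", "bomb planted"),
    ("explosion", "building collapses"),
]


def causes_of(scene):
    """Scenes directly linked as causes of the given scene."""
    return [cause for cause, effect in CAUSES if effect == scene]


def causal_chain_back(scene):
    """Breadth-first walk along 'caused by' links, nearest causes first."""
    seen = {scene}
    order = []
    queue = deque([scene])
    while queue:
        current = queue.popleft()
        for cause in causes_of(current):
            if cause not in seen:
                seen.add(cause)
                order.append(cause)
                queue.append(cause)
    return order


print(causal_chain_back("explosion"))  # ['bomb planted', 'villain escapes prison']
```

A full semantic network would also carry other link types (e.g. which entities take part in which events); causal links are simply the ones used for this kind of "why did that happen?" browsing.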

Page 29: CS257 Modelling Multimedia Information LECTURE 6
Page 30: CS257 Modelling Multimedia Information LECTURE 6

Organising and Querying Video Content

• Should consider…
– Which aspects of the video are likely to be of interest to the users who access the video archive?
– How to store relevant information about the video efficiently?
– How to express and process queries?
– What scope is there for automatic content extraction?

Page 31: CS257 Modelling Multimedia Information LECTURE 6

EXERCISE 6-2

• For a video database application domain of your choosing, write five video queries that use some of Allen’s 13 temporal relationships

• If event A is ‘before (<)’ event B, and event B is ‘during’ event C, then what relationships could hold between A and C?

• How do you think such reasoning about temporal relationships could be used in a video database?

Page 32: CS257 Modelling Multimedia Information LECTURE 6

LECTURE 6: LEARNING OUTCOMES

After the lecture, you should be able to:

• Express a user’s query to a video database using Subrahmanian’s VideoSQL and discuss the limitations of this formalism

• Explain how and why temporal and causal relationships between events are represented in metadata for video databases

Page 33: CS257 Modelling Multimedia Information LECTURE 6

OPTIONAL READING

Dunckley (2003), pages 38-39; 393-395.

For details of the extended video SQL, see:
Subrahmanian (1998). Principles of Multimedia Databases, pages 191-195. IN LIBRARY ARTICLE COLLECTION

For temporal relationships:
Allen (1983). J. F. Allen, ‘Maintaining Knowledge About Temporal Intervals.’ Communications of the ACM 26 (11), pp. 832-843. Especially Figure 2 for the 13 relationships and Figure 4 for the full transitivity table. [In Library – on shelf]

For causal relationships:
Roth (1999). Volker Roth, ‘Content-based retrieval from digital video.’ Image and Vision Computing 17, pp. 531-540. [Available online through library eJournals]