Transcript
Page 1: Understanding Near-Duplicate Videos: A User-Centric Approach

Near-Duplicate Videos

Page 2: Understanding Near-Duplicate Videos: A User-Centric Approach

Let’s say you’re looking for the Bush attack video…

Page 3: Understanding Near-Duplicate Videos: A User-Centric Approach

…and you get

11,100 results.

Page 4: Understanding Near-Duplicate Videos: A User-Centric Approach

…after 40 minutes…

watching the videos listed on the first page you notice

> 50% are similar, i.e. NDVC

27% on average [Wu et al., 2007]

Page 5: Understanding Near-Duplicate Videos: A User-Centric Approach

NDVC technical definition

• Identical or approximately identical videos that differ in some feature:
– file formats, encoding parameters
– photometric variations (color, lighting changes)
– overlays (caption, logo, audio commentary)
– editing operations (frames add/remove)
– semantic similarity

NDVC are videos that are “essentially the same”

Page 6: Understanding Near-Duplicate Videos: A User-Centric Approach

…like this

Page 7: Understanding Near-Duplicate Videos: A User-Centric Approach

Two challenges:

1. There is no agreement on a single definition of NDVC

2. NDVC are mostly considered redundant content that has to be removed from the system

Page 8: Understanding Near-Duplicate Videos: A User-Centric Approach

Human Perception of Near Duplicate Videos

Mauro Cherubini
Rodrigo de Oliveira
Nuria Oliver

Page 9: Understanding Near-Duplicate Videos: A User-Centric Approach

What kind of NDVC?

Malicious (i.e., spam produced by a single user)

Copyright infringement (e.g., pirated music videos)

User-edited content: videos that complement the original material with additional information

Page 10: Understanding Near-Duplicate Videos: A User-Centric Approach

Recently

NDVC detection algorithm

Page 11: Understanding Near-Duplicate Videos: A User-Centric Approach

Recently

NDVC detection algorithm

Page 12: Understanding Near-Duplicate Videos: A User-Centric Approach

Why not?

NDVC detection algorithm

?

Page 13: Understanding Near-Duplicate Videos: A User-Centric Approach

Methodology

• 2 large-scale online surveys (n=1003)
• 7 pairs of NDVC (differing in 1 feature)
• Subjects were asked about:
– Similarity
– Preference

Page 14: Understanding Near-Duplicate Videos: A User-Centric Approach

NDVC technical definition

• Identical or approximately identical videos that differ in some features:
– photometric variations (color, lighting changes)
– overlays (caption, logo, audio commentary)
– editing operations (frames add/remove)
And …
– semantic similarity (e.g., two deer grazing grass in two different forests)

Page 15: Understanding Near-Duplicate Videos: A User-Centric Approach

Audio Quality

NDVC

Preference

Stereo, 44 kHz

Mono, 11 kHz

Page 16: Understanding Near-Duplicate Videos: A User-Centric Approach

Image Quality

NDVC

Preference

Page 17: Understanding Near-Duplicate Videos: A User-Centric Approach

Audio content (overlay)

Preference

NDVC

Page 18: Understanding Near-Duplicate Videos: A User-Centric Approach

Visual + audio content (length)

Preference

Not NDVC

Page 19: Understanding Near-Duplicate Videos: A User-Centric Approach

Visual content (editing)

Not NDVC

Want both

Page 20: Understanding Near-Duplicate Videos: A User-Centric Approach

Similar semantics, different videos (similar visual info)

NDVC

Want both

Page 21: Understanding Near-Duplicate Videos: A User-Centric Approach

Similar semantics, different videos (similar audio info)

Not NDVC

Preference

Page 22: Understanding Near-Duplicate Videos: A User-Centric Approach

Implications for Design

1. User-centric NDVC definition

NDVC are approximately identical videos that might differ in audio/image quality or overlays. Conversely, identical videos with relevant complementary information (changing clip length or scenes) are not considered NDVC.

Furthermore, users perceive as near-duplicates videos that are not alike but that are visually similar and semantically related.
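
The user-centric definition can be read as a lookup from the kind of difference between two videos to the survey outcome. The sketch below only restates the NDVC / Not NDVC labels from the result slides; the enum and function names are hypothetical and not part of the study.

```python
from enum import Enum


class Difference(Enum):
    """Kinds of difference between two videos (names are illustrative)."""
    AUDIO_QUALITY = "audio quality"
    IMAGE_QUALITY = "image quality"
    AUDIO_OVERLAY = "audio content (overlay)"
    CLIP_LENGTH = "visual + audio content (length)"
    SCENE_EDITING = "visual content (editing)"
    SIMILAR_VISUALS = "similar semantics, similar visual info"
    SIMILAR_AUDIO = "similar semantics, similar audio info"


# Labels copied from the result slides: True means users saw the pair as NDVC.
PERCEIVED_AS_NDVC = {
    Difference.AUDIO_QUALITY: True,
    Difference.IMAGE_QUALITY: True,
    Difference.AUDIO_OVERLAY: True,
    Difference.CLIP_LENGTH: False,
    Difference.SCENE_EDITING: False,
    Difference.SIMILAR_VISUALS: True,
    Difference.SIMILAR_AUDIO: False,
}


def is_user_perceived_ndvc(difference: Difference) -> bool:
    """Return whether a pair differing only in `difference` was seen as NDVC."""
    return PERCEIVED_AS_NDVC[difference]
```

The table covers pairs that differ in a single feature only, as in the surveys; combinations of differences are listed under future work and are not modeled here.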

Page 23: Understanding Near-Duplicate Videos: A User-Centric Approach

Implications for Design

2. Clustering
– Groups sharing video, audio, semantic content
– Ranking based on user-submitted query
– Highlight the most representative
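
One possible way to sketch the clustering idea is a greedy grouping over the result list. This is an illustration, not the method proposed in the talk: the `Video` class, the `relevance` score, and the `is_near_duplicate` callback are all assumptions, and "most representative" is taken here to mean the most query-relevant member of each group.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Video:
    video_id: str
    relevance: float  # hypothetical relevance to the user-submitted query


def cluster_results(
    videos: List[Video],
    is_near_duplicate: Callable[[Video, Video], bool],
) -> List[List[Video]]:
    """Group results greedily; each cluster's first item is its representative."""
    clusters: List[List[Video]] = []
    # Visit videos from most to least relevant, so every cluster is opened
    # by its most relevant member and clusters come out ranked by that member.
    for video in sorted(videos, key=lambda v: v.relevance, reverse=True):
        for cluster in clusters:
            if is_near_duplicate(cluster[0], video):
                cluster.append(video)
                break
        else:
            clusters.append([video])
    return clusters
```

An interface could then show `cluster[0]` for each group and fold the remaining members behind it, rather than removing them from the results.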

Page 24: Understanding Near-Duplicate Videos: A User-Centric Approach

Implications for Design

3. Feature and user adaptation
– Boost ranking based on general observations
• More content
• Better image/audio quality
• …
– Boost ranking based on personalization
• Abilities (e.g., auditory skills)
• Task (e.g., video producer vs. movie enthusiast)
• Search query
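
A ranking boost of this kind could be sketched as a multiplicative bonus on a base relevance score. The formula, the weights, and the assumption that content and quality are normalized to [0, 1] are all hypothetical; the talk does not specify how the boost should be computed.

```python
def boosted_score(
    relevance: float,
    extra_content: float,
    quality: float,
    content_weight: float = 0.1,
    quality_weight: float = 0.1,
) -> float:
    """Scale a base relevance score by content and quality bonuses.

    `extra_content` and `quality` are assumed to lie in [0, 1].
    The weights are illustrative defaults, not values from the study.
    """
    bonus = content_weight * extra_content + quality_weight * quality
    return relevance * (1.0 + bonus)
```

Personalization would then amount to choosing the weights per user or task, for example passing a larger `quality_weight` for a video producer than for a casual viewer.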

Page 25: Understanding Near-Duplicate Videos: A User-Centric Approach

Future Work

• NDVCs differing in more than 1 low-level feature
• Propose ways to visualize the NDVCs
• Study effects of user’s goals while searching videos

Page 26: Understanding Near-Duplicate Videos: A User-Centric Approach

A Human-Centric stance in Multimedia research

Biomimetics

Crowdsourcing

Psychophysical experiments

Page 27: Understanding Near-Duplicate Videos: A User-Centric Approach

Thank you!

Mauro Cherubini
Rodrigo de Oliveira
Nuria Oliver
