untangling the semantic structure in a broadcast video archive

Untangling the semantic structure in a broadcast video archive. Ichiro IDE (Nagoya University, Japan; University of Amsterdam, The Netherlands). December 7, 2010. Workshop on Interactive Information Access: Untangling Tasks and Technologies.

Upload: ichiroide

Post on 13-Nov-2014

DESCRIPTION

Invited talk Workshop on Interactive Information Access: Untangling Tasks and Technologies At Centrum voor Wiskunde en Informatica (CWI), Amsterdam, The Netherlands On Dec. 6, 2010

TRANSCRIPT

Page 1: Untangling the semantic structure in a broadcast video archive

Untangling the semantic structure in a broadcast video archive

Ichiro IDE
Nagoya University, Japan
University of Amsterdam, The Netherlands

December 7, 2010

Workshop on Interactive Information Access: Untangling Tasks and Technologies

Page 2: Untangling the semantic structure in a broadcast video archive

Introduction

• Online digital video archives are becoming a reality
  – Efficient retrieval and browsing
  – Effective reuse

• Our aims:
  – Extract the semantic structure between video data
  – Rearrange video segments and generate new contents
    • Provide a browsing / editing interface based on the extracted semantic structure
    • (Semi-)automatic rearrangement of retrieved results for answering queries

Page 3: Untangling the semantic structure in a broadcast video archive

NII news video archive

• Program: NHK News7
• Period: March 16, 2001– (1,700– hours)

[System diagram]
• PCs for capturing: MPEG-1/2 decoder, closed-caption decoder
• Servers with RAID disk: video archiving server, DBMS server, data processing server
• Metadata: closed-caption text [79 MB], story boundary [46 k stories]
• Video archive: MPEG-1 video [970 GB], MPEG-2 video [5.9 TB]
• Client PCs

Page 4: Untangling the semantic structure in a broadcast video archive

Overview of the talk

• Exploring news stories along the topic thread structure
• Cross-language detection of related news stories by text and near-duplicate video segments
• Structuring a broadcast video archive based on near-duplicate video segments

Page 5: Untangling the semantic structure in a broadcast video archive

Exploring news stories along the topic thread structure

I. Ide, H. Mo, N. Katayama, S. Satoh: “Exploiting topic thread structures in a news video archive for the semi-automatic generation of video summaries”, 2006 IEEE Int. Conf. on Multimedia and Expo (ICME2006), July 2006

I. Ide, T. Kinoshita, T. Takahashi, S. Satoh, H. Murase: “mediaWalker: A video archive explorer based on time-series semantic structure”, 15th ACM Int. Multimedia Conf. Demo Session, Sept. 2007

I. Ide, T. Kinoshita, T. Takahashi, H. Mo, N. Katayama, S. Satoh, H. Murase: “Exploiting the chronological semantic structure in a large-scale broadcast news video archive for its efficient exploration”, APSIPA Annual Summit and Conf. (ASC) 2010, to appear in Dec. 2010

Page 6: Untangling the semantic structure in a broadcast video archive

Semantic structures in news video: Intra- & Inter-video structure

[Diagram: Video-1 to Video-5, …, shown as intra-video structured videos divided into stories (Story-1, Story-2, Story-3, Story-5); stories in different videos are linked into Thread-1 and Thread-2, labeled as the inter-video structure]

• Story tracking / Topic threading

Reveals the semantic structure throughout the archive

Page 7: Untangling the semantic structure in a broadcast video archive

Example of a topic thread structure

[Cluster-view]

Origin: May 1, 2003, Story #1
Period: 100 days

Page 8: Untangling the semantic structure in a broadcast video archive

Contents of a topic thread structure

[Thread diagram; story labels:]
• SARS outbreak in Beijing
• Spreads in mainland China
• WHO sends a mission to Beijing
• Slows down in mainland China, spreads in Taiwan
• Chinese gov. worries about the spread in rural areas
• Chinese gov. watches the spread in rural areas
• Anti-SARS conference held in Beijing
• Taiwanese doctor found infected after traveling in Japan
• Search for infection in Japan
• Calms down in mainland China, reports from Toronto
• Calms down in Taiwan
• WHO declares the cease

Page 9: Untangling the semantic structure in a broadcast video archive

Browsing news video by the thread structure: mediaWalker

Demo

Page 10: Untangling the semantic structure in a broadcast video archive

Towards Video Story-Telling

• Generate a summarized video that explains how the story developed between two news stories
  – Select a path (semi-)automatically
  – Summarize the video streams along the path

[Diagram: “From here” … “To here”: “I want to know how it developed”]

Currently under work with Frank
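The path-selection step can be pictured with a small sketch. The slides do not say how the path is chosen, so the code below swaps in a plain breadth-first search over a directed graph of stories as a stand-in; the story IDs (`S1` … `S6`), the `threads` dictionary, and the function name `find_path` are all hypothetical.

```python
from collections import deque

def find_path(threads, start, goal):
    """Breadth-first search for a shortest chain of stories from start to goal.

    threads maps a story ID to the IDs of the later stories it links to.
    This is an illustrative stand-in, not the method used in the talk.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in threads.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the goal story is not reachable from the start story

# Hypothetical toy thread structure: each story links to its follow-up stories.
threads = {
    "S1": ["S2", "S3"],
    "S2": ["S4"],
    "S3": ["S4", "S5"],
    "S4": ["S6"],
    "S5": ["S6"],
}

path = find_path(threads, "S1", "S6")
```

The stories along `path` would then be the candidates to summarize; a semi-automatic variant could let the user pin intermediate stories and search each leg separately.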

Page 11: Untangling the semantic structure in a broadcast video archive

Cross-language detection of related news stories by text and near-duplicate video segments

A. Ogawa, T. Takahashi, I. Ide, H. Murase: “Cross-lingual retrieval of identical news events by near-duplicate video segment detection”, 14th Intl. Multimedia Modeling Conference (MMM2008), Jan. 2008

Page 12: Untangling the semantic structure in a broadcast video archive

Cross-language news story detection

• Definition
  – Detect news stories in different channels (especially in different languages) discussing the same event
• Problem
  – Text-based approach
    • Low MT * ASR quality (though recently improving…)
    • Different view-point, culture
• Proposed method
  – Detect near-duplicate video segments to complement text information

[Figure: near-duplicate]

Page 13: Untangling the semantic structure in a broadcast video archive

Comparison of news video streams

• An identical event should be broadcast at close timings
  – Compare news programs broadcast within +/- 24 hours
• Cope with color differences by histogram averaging
• Compare only the center part to avoid super-imposed captions
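A minimal sketch of two of these ideas, the +/- 24 hour window and the center-only comparison, is given below. The frame representation (a list of rows of grayscale values), the `keep` fraction, and all function names are assumptions made for illustration. The mean-removal step is only a simple stand-in for coping with brightness differences and is not the histogram averaging mentioned on the slide.

```python
from datetime import datetime, timedelta, timezone

def within_window(t_a, t_b, hours=24):
    """True if two broadcast times are at most `hours` apart."""
    return abs(t_a - t_b) <= timedelta(hours=hours)

def center_crop(frame, keep=0.5):
    """Keep the central `keep` fraction of a frame (list of rows of grayscale values)."""
    h, w = len(frame), len(frame[0])
    dh, dw = int(h * (1 - keep) / 2), int(w * (1 - keep) / 2)
    return [row[dw:w - dw] for row in frame[dh:h - dh]]

def mean_removed(frame):
    """Flatten a frame and subtract its mean intensity (assumed stand-in step)."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]

def frame_distance(a, b, keep=0.5):
    """Mean absolute difference between the mean-removed center regions of two frames."""
    va = mean_removed(center_crop(a, keep))
    vb = mean_removed(center_crop(b, keep))
    return sum(abs(x - y) for x, y in zip(va, vb)) / len(va)

# Two broadcast times in different time zones (19:01 at GMT+9, 22:03 at GMT-5).
t_jp = datetime(2004, 11, 9, 19, 1, tzinfo=timezone(timedelta(hours=9)))
t_us = datetime(2004, 11, 8, 22, 3, tzinfo=timezone(timedelta(hours=-5)))
comparable = within_window(t_jp, t_us)

# Toy frames: same center content, different border (captions) and brightness offset.
frame_a = [[0, 0, 0, 0], [0, 10, 20, 0], [0, 30, 40, 0], [0, 0, 0, 0]]
frame_b = [[255, 255, 255, 255], [255, 15, 25, 255], [255, 35, 45, 255], [255, 255, 255, 255]]
distance = frame_distance(frame_a, frame_b)
```

In this toy case the differing borders are cropped away and the uniform brightness offset cancels, so the two frames compare as identical in the center.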

Page 14: Untangling the semantic structure in a broadcast video archive

Example of news stories on the same event

Nov 9, 2004, Story #1, 19:01 (GMT+9) --
<<Keywords>> operation [25], US army [20], Fallujah [18], military force [12], troops [7], military strategy [7], attack [5], Iraqi army [5], general citizens [5], Iraq [4], …

Nov 8, 2004, Story #1, 22:03 (GMT-5) --
<<Keywords>> city [9], Jean [6], Aaron [6], Iraqi [4], phone, call [3], army forces [3], casualties [3], …

Page 15: Untangling the semantic structure in a broadcast video archive

Cross-language news browsing interface: topicTraveller

Demo

Page 16: Untangling the semantic structure in a broadcast video archive

Result

• Dataset
  – 18 pairs of (JP: 1 US: 2)
  – Ground truth: manually given

Advantage of using image information

            Text only     Sum of text and image   Image only
Recall      83% (38/46)   96% (44/46)             43% (20/46)
Precision   72% (38/53)   90% (44/49)             77% (20/26)
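The percentages follow from the counts in parentheses: recall is correct detections over the 46 ground-truth pairs, and precision is correct detections over all detections. The sketch below recomputes them; the variable and function names are illustrative, and the combined case is taken as 44 correct detections, as its precision entry (44/49) indicates.

```python
def precision_recall(correct, detected, relevant):
    """Return (precision, recall) as fractions from raw counts."""
    return correct / detected, correct / relevant

RELEVANT = 46  # ground-truth related story pairs, from the table denominators

# (correct detections, total detections) per method, read from the table
counts = {
    "text only": (38, 53),
    "text + image": (44, 49),
    "image only": (20, 26),
}

results = {}
for method, (correct, detected) in counts.items():
    p, r = precision_recall(correct, detected, RELEVANT)
    results[method] = (round(p * 100), round(r * 100))
    print(f"{method}: precision {p:.0%}, recall {r:.0%}")
```

Image information alone finds fewer of the related stories, but adding it to text raises both figures at once, which is the advantage the slide points to.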

Page 17: Untangling the semantic structure in a broadcast video archive

Structuring a broadcast video archive based on near-duplicate video segments

I. Ide, Y. Shamoto, D. Deguchi, T. Takahashi, H. Murase: “Classification of near-duplicate video segments based on their appearance patterns”, 20th Int. Conf. on Pattern Recognition (ICPR2010), Aug. 2010.

Page 18: Untangling the semantic structure in a broadcast video archive

Structuring a broadcast video archive

• Structure?
  – For browsing / retrieval
  – Differs among programs / genres
• Applications
  – Advertisement database
  – Related contents detection
    • Related news, …
  – Periodic contents detection
    • Sub-program structure

Handle in a unified framework

Page 19: Untangling the semantic structure in a broadcast video archive

Example of appearance patterns

• Different distributions for different types
  – Advertisement
  – Related news
  – Sub-program

Demo

Page 20: Untangling the semantic structure in a broadcast video archive

Classes of near-duplicate segment types

1) Advertisement  2) Related news  3) Sub-program
4) Rebroadcast  5) Similar framing  6) Extracted segment

Page 21: Untangling the semantic structure in a broadcast video archive

Near-duplicate detection experiment

• Data set
  – 1 week of broadcast from 6 channels in Tokyo area
  – Total: 1,008 hours
• Computer environment
  – Cluster computer
    • 40 CPU (Intel Xeon 3.4 GHz, Main Memory: 1.0 GB)
• Computation cost
  – CPU time: 133 days
  – Actual time: 4 days
• Result
  – 3,597,943 pairs (40,928 unique segments)
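The figures on this slide can be cross-checked with simple arithmetic. The sketch below only restates numbers from the slide; the "ideal" wall-clock time assumes perfect parallel scaling over the 40 CPUs, which is an assumption and not something the slide claims.

```python
# Data set size: 1 week of continuous broadcast from 6 channels.
hours_per_week = 7 * 24
channels = 6
total_hours = hours_per_week * channels  # matches the reported 1,008 hours

# Computation cost: CPU time spread over the cluster.
cpu_days = 133
cpus = 40
ideal_wall_days = cpu_days / cpus  # lower bound under perfect scaling (assumption)

print(total_hours, ideal_wall_days)
```

Under that assumption the lower bound is a little over three days, which is in line with the reported actual time of 4 days once scheduling and I/O overhead are allowed for.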

Page 22: Untangling the semantic structure in a broadcast video archive

Automatic classification of classes

• Classification rules
  – Features of near-duplicate video segments within a unique segment set
    • Appearance period
    • Appeared channels
    • Appearance interval
    • Length of the segment
    • Periodic or not
    • Extracted segment or not

[Diagram: a unique ND segment set is classified as Rebroadcast, Advertisement, Sub-program, Similar framing, Extracted segment (extracted segment vs. original segment), or Related news]
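To make the idea of rule-based classification concrete, here is a purely illustrative sketch. The slide lists the features but not the rules themselves, so every threshold, the rule order, the `SegmentSet` type, and the `classify` function below are hypothetical. The "Similar framing" class is left out because the listed features alone do not obviously identify it.

```python
from dataclasses import dataclass

@dataclass
class SegmentSet:
    """Features of one unique near-duplicate segment set (field names are assumed)."""
    period_hours: float         # time between first and last appearance
    channels: int               # number of distinct channels it appeared on
    mean_interval_hours: float  # average gap between consecutive appearances
    length_seconds: float       # length of the segment
    periodic: bool              # appears at regular intervals
    extracted: bool             # is a part cut out of a longer original segment

def classify(s: SegmentSet) -> str:
    """Hypothetical decision rules; thresholds are guesses, not the authors' values."""
    if s.extracted:
        return "Extracted segment"
    if s.length_seconds >= 600:
        return "Rebroadcast"       # long identical content repeated
    if s.periodic and s.channels == 1:
        return "Sub-program"       # e.g. a recurring opening on one channel
    if s.length_seconds <= 60 and s.channels >= 2 and s.period_hours > 48:
        return "Advertisement"     # short, on several channels, over many days
    if s.period_hours <= 48 and s.mean_interval_hours <= 24:
        return "Related news"      # reused footage within a short time span
    return "Misc."

ad = SegmentSet(150, 5, 3, 15, False, False)
news = SegmentSet(20, 3, 5, 10, False, False)
label_ad, label_news = classify(ad), classify(news)
```

A cascade like this is easy to inspect and adjust per genre, at the price of hand-tuned thresholds; sets that match no rule fall into a miscellaneous bin.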

Page 23: Untangling the semantic structure in a broadcast video archive

Evaluation

• 100 unique segment sets per class (61 sets for rebroadcast)

Rows: automatic classification. Columns: manual classification.

                       1)    2)    3)    4)    5)    6)    Misc.
1) Advertisement       92%   1%    2%    0%    0%    0%    5%
2) Related news        1%    51%   7%    0%    5%    17%   19%
3) Sub-program         0%    0%    65%   0%    2%    0%    33%
4) Rebroadcast         0%    0%    0%    36%   0%    0%    64%
5) Similar framing     0%    0%    0%    0%    63%   6%    31%
6) Extracted segment   1%    49%   2%    0%    0%    35%   13%

Accuracy: 57%  Cover rate: 77%
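As a consistency check, the diagonal of the table (cases where automatic and manual labels agree) can be averaged over the six classes. The slide does not define how accuracy was computed; the sketch below merely observes that the unweighted mean of the diagonal comes to the reported 57%. The cover rate is not recomputed, since its definition is not given. Variable names are illustrative.

```python
# Rows: automatic class 1)..6); columns: manual class 1)..6) and Misc., in percent.
matrix = [
    [92, 1, 2, 0, 0, 0, 5],
    [1, 51, 7, 0, 5, 17, 19],
    [0, 0, 65, 0, 2, 0, 33],
    [0, 0, 0, 36, 0, 0, 64],
    [0, 0, 0, 0, 63, 6, 31],
    [1, 49, 2, 0, 0, 35, 13],
]

# Each row should sum to 100%.
row_sums = [sum(row) for row in matrix]

# Agreement per class is the diagonal entry; average it without weighting.
diagonal = [matrix[i][i] for i in range(len(matrix))]
mean_agreement = sum(diagonal) / len(diagonal)

print(row_sums, mean_agreement)
```

The table also shows where the errors concentrate: nearly half of the sets labeled "Extracted segment" automatically were judged "Related news" manually, and most automatic "Rebroadcast" sets fell under Misc.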

Page 24: Untangling the semantic structure in a broadcast video archive

Future directions

• Now we have structured the archives in various ways
  – Consider how to exploit the structure
• Reorganize the video data based on an external “scenario”
  – News video archive
    • Wikipedia description
    • (Semi-)automatic documentary generation
  – Cooking video archive
    • Plain recipe text
    • Multimedia supplementation to a text recipe
  – …

Page 25: Untangling the semantic structure in a broadcast video archive

Summary

• Introduced works on analyzing the semantic structures in large-scale news video archives and interfaces for efficient understanding of their contents.

• Thanks to:
  – Nagoya Univ.: Profs. Hiroshi Murase, Daisuke Deguchi; Akira Ogawa, Yuji Shamoto, Tomoki Okuoka
  – NII: Profs. Shin’ichi Satoh, Norio Katayama, Hiroshi Mo
  – Gifu Shotoku Gakuen Univ.: Prof. Tomokazu Takahashi
  – NetCompass Ltd.: Tomoyoshi Kinoshita, Takeharu Haraigawa

• Funded by:
  – JSPS, MEXT, MRI Inc., Kayamori Information Science Fund, Hoso Bunka Foundation