A Study of Awareness in Multimedia Search
Robert Villa, Nick Gildea, Joemon Jose
Information Retrieval Group
April 2008
Overview
• Introduction
• Collaboration and awareness in search
• Research questions
• Experimental study
  – Inducing awareness: a game scenario
  – Video retrieval
  – Demo of multimedia retrieval system
  – Some results
• Conclusions
Information Retrieval
• Deals with practical and theoretical models of searching unstructured collections of documents
  – Idealised aim: supply the system with a natural language description of your need
  – The system returns a ranked list of the documents relevant to your need
• The process is naturally probabilistic and uncertain
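The ranked-list idea can be illustrated with a toy TF-IDF scorer (a minimal sketch; the function and corpus are invented for illustration and are not the system described in this talk):

```python
import math
from collections import Counter

def rank(query, docs):
    """Toy ranked retrieval: score documents by summed TF-IDF of query terms."""
    n = len(docs)
    tokenised = [doc.lower().split() for doc in docs]
    # document frequency: in how many documents each term occurs
    df = Counter(term for toks in tokenised for term in set(toks))
    scores = []
    for i, toks in enumerate(tokenised):
        tf = Counter(toks)
        score = sum(tf[t] * math.log(n / df[t])
                    for t in query.lower().split() if t in df)
        scores.append((score, i))
    # return document indices, best-scoring first
    return [i for score, i in sorted(scores, reverse=True)]

docs = ["dramatic arrival of the president",
        "weather report for the weekend",
        "president greets arrival of delegates"]
print(rank("president arrival", docs))  # the weather report ranks last
```

The uncertainty mentioned above shows up here too: the scores are only evidence of relevance, not a guarantee of it.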
Video retrieval
• Video retrieval systems index and search collections of videos
  – Like traditional IR, the indexing of the video data is assumed to be automatic
    • Extraction of visual or audio features
    • Use of automatic speech recognition
  – Queries are typically textual or by example
Example interface
Video retrieval
• Videos are automatically split into ‘shots’ (shot segmentation)
  – Shot boundaries are determined using the visual content of video frames
  – Each shot is a short element of a video
    • In TRECVID 2006, typically 2 to 3 seconds, although there are some much longer shots
• Shots are the unit of retrieval
  – Where a text system retrieves documents, video retrieval systems retrieve shots
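One common way to detect the shot boundaries described above is to threshold the colour-histogram distance between consecutive frames. A minimal sketch (hypothetical function and synthetic data, not TRECVID's segmentation):

```python
def shot_boundaries(frame_histograms, threshold=0.5):
    """Toy shot segmentation: flag a boundary wherever the colour-histogram
    distance between consecutive frames exceeds a threshold."""
    boundaries = []
    for i in range(1, len(frame_histograms)):
        prev, curr = frame_histograms[i - 1], frame_histograms[i]
        # L1 distance between normalised histograms, in [0, 2]
        dist = sum(abs(a - b) for a, b in zip(prev, curr))
        if dist > threshold:
            boundaries.append(i)
    return boundaries

# Two synthetic "shots": frames 0-2 are reddish, frames 3-4 greenish
frames = [[0.9, 0.05, 0.05]] * 3 + [[0.1, 0.8, 0.1]] * 2
print(shot_boundaries(frames))  # a cut is detected at frame 3
```

Real systems use more robust features and adaptive thresholds, but the principle is the same: a large visual discontinuity marks a cut.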
Example of a shot
• Every shot has an associated text transcript
  – E.g. “A dramatic arrival”
• Generated by Automatic Speech Recognition (ASR)
• The transcript can often be very wrong
Collaborative Retrieval
• Most current search systems assume searching is a solitary activity
  – Is this always the case, or can collaborative searching with one or more others be effective?
• Rather than focus on collaboration in general, we decided to look at only one aspect of collaboration: awareness
Awareness
• Awareness enables an “understanding of the activities of others”, an important aspect of collaboration
• Paul Dourish and Victoria Bellotti. “Awareness and Coordination in Shared Workspaces”, CSCW'92
• Scenario:
  – Two users are searching on the same task at the same time in different places
  – Synchronous and remote
Previous work – collaborative search
• Cerchiamo (FXPAL, Fuji Xerox)
  – Adcock et al., TRECVID 2007
  – Two people collaborating, one a “gatherer” and the other a “reviewer”
• SearchTogether (Microsoft)
  – Morris, M. R. (2007)
  – Provides a messaging system, recommendation of web pages to the other user, query awareness, etc.
• Físchlár-DiamondTouch (DCU)
  – Smeaton et al. (2007)
  – A table-top display which allows two people to work around it
Research question
• Can awareness of another searcher aid a user when carrying out a multimedia search?
  – Will their performance increase?
  – Will less effort be needed to reach a given performance?
    • Shots played, browsing required
  – Will the user’s search behaviour change?
    • Number of queries executed, shots found independently
Competitive game scenario
• We wanted to evaluate the effect of awareness in a “best case” scenario
  – i.e. a situation where there was some benefit to users in being aware of another’s actions
• A competitive game scenario was used, where pairs of users competed to “win” the search tasks
Aim of the ‘game’
• The aim of the ‘game’ was to find as many relevant shots as possible for the task
  – The domain was video retrieval, where users had to search a video collection for ‘shots’
  – Whoever found the most shots ‘won’
  – A monetary award was given to the winner
System
• Our existing video retrieval system was modified to allow collaboration
  – Each user could be given a view of the other user’s search screen
  – This was designed to work with two monitors:
    • The user’s own search interface on one screen
    • The other screen optionally showing the other user’s search screen
  – We supported 4 different situations
• “Mutually Aware” – A can see B’s screen and B can see A’s screen
• “A aware of B” – A can watch B’s screen while B cannot watch A
• “B aware of A” – B can watch A’s screen while A cannot watch B
• “Independent” – neither A nor B can watch the other
System interface
[Screenshot: the local search interface, showing the text query box, the search results, the shots selected for relevance feedback, and the user’s own result shots]
Remote search interface
• The user cannot see the other user’s final results, only a count of the number of shots currently marked by that user
• This screen does not update automatically; the user must press the “Refresh” button to update it
Video browser
• A simple video browser pops up when the user clicks a keyframe
• Allows the user to view the shot, and to move backwards and forwards in the video
Conditions
• From the point of view of an individual user:
  – Working independently
  – Cannot watch the other user, and knows that the other user can watch him/her
  – Can watch the other user, and knows that the other user cannot watch him/her
  – Can watch the other user, and knows that the other user can watch him/her
TRECVID 2006 Collection
• Almost 260 hours of mostly news data from the end of 2005
  – CNN, LBC, CCTV, etc.
  – Multilingual (English, Chinese and Arabic)
• Has a standard shot segmentation
• ASR transcripts provided
  – For Chinese and Arabic video, also automatically translated into English
TRECVID 2006 Topics
• 24 topics, of which we used the 4 worst performing overall from the interactive track
  – Hoped that these would present a similar challenge for our users
  – Adcock et al. (2007) found that users collaborated better on difficult tasks
Topics

Topic  Median MAP  Topic description
0189   0.038       Find shots of a group including at least four people dressed in suits, seated, and with at least one flag
0173   0.037       Find shots with one or more emergency vehicles in motion (e.g., ambulance, police car, fire truck)
0175   0.034       Find shots with one or more people leaving or entering a vehicle
0192   0.030       Find shots of a greeting by at least one kiss on the cheek
Experimental design
• A within-user study was carried out
• Latin square design
  – 4 tasks
  – 4 conditions
  – 24 users (12 pairs)
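A 4×4 Latin square rotation of tasks over conditions can be generated with a simple cyclic construction (a generic sketch, not necessarily the exact ordering used in the study):

```python
def latin_square(items):
    """Cyclic Latin square: row i is the item list rotated left by i,
    so every item appears exactly once per row and per column."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

conditions = ["Independent", "Watched", "Watching", "Mutual"]
# one row per task: the condition each user pair gets on that task
for row in latin_square(conditions):
    print(row)
```

Rotating rows like this balances order effects: each condition is tried in every task position across the design.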
Procedure
• Users took part in pairs
• Users had 15 minutes to find as many shots as possible
• At the end of the 4 tasks, the “winner” was announced
  – Each user was paid £10
  – The winner got an extra £5, shared if there was a draw
Results
• 12 competitive runs
  – 11 wins and 1 draw
• And there was an immediate issue with one of the users ...
Search performance

                      Independent  Watched  Watching  Mutual
MAP             Mean  0.0163       0.0199   0.0222    0.0243
                SD    0.0150       0.0163   0.0165    0.0204
Precision@10    Mean  0.3083       0.3750   0.4167    0.4667
                SD    0.2569       0.2996   0.3158    0.3332
Search Performance
• No significant difference was found between the level of performance in the four different conditions
  – Overall performance was very low (typical in video IR, for these hard topics)
  – Performance does vary widely across the four tasks
    • Tasks 189 and 192 performed worst
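The two measures reported above, MAP and precision at 10 shots, can be computed per topic roughly as follows (a standard textbook sketch, not the TRECVID evaluation code):

```python
def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k retrieved shots that are relevant."""
    return sum(1 for shot in ranked[:k] if shot in relevant) / k

def average_precision(ranked, relevant):
    """Mean of precision at each rank where a relevant shot appears;
    MAP is this value averaged over topics."""
    hits, total = 0, 0.0
    for rank, shot in enumerate(ranked, start=1):
        if shot in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

ranked = ["s1", "s2", "s3", "s4"]   # system output, best first
relevant = {"s1", "s3"}             # ground-truth relevant shots
print(precision_at_k(ranked, relevant, k=2))  # 0.5
print(average_precision(ranked, relevant))    # (1/1 + 2/3) / 2 ≈ 0.83
```

Average precision rewards placing relevant shots early in the ranking, which is why it suits the ranked-list retrieval described earlier.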
Search behaviour: queries
                   Ind      Watched  Watching  Mutual
Total queries      603      501      473       570
Queries per task
  Mean (SD)        25.13    20.88    19.71     23.75
                   (14.72)  (13.61)  (9.34)    (13.35)

• Do users execute more queries when searching alone?
• A significant difference was found between the Watching and Independent conditions
Number of shots found independently
                     Ind     Watched  Watching  Mutual
Total shots found    155     188      244       222
Shots found per task
  Mean (SD)          6.46    7.83     10.17     9.25
                     (4.11)  (7.17)   (9.31)    (8.07)

• A significant difference was found between the Independent and Watching conditions
Changes in search behaviour
• Users searched less when watching someone else
  – Also less searching in the watched condition (not significant)
• Users found more shots themselves when watching someone else
  – Also found more shots in the mutual and watched conditions, but not significantly
Search terms used
• One possible way awareness may help is by providing a user with new terms to use in queries
• Did users copy search terms from the remote user?
  – We could not directly record this in the logs (terms are easily retyped)
Estimating copied search terms
• Search terms which could have been copied were derived from the logs
• Method:
  – Found the set of terms common to both users
  – Found who used each term first
  – Checked for a click of the “Refresh” button by the user who was second
  – Assumed that the second user could then have copied that term
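The copying heuristic above can be sketched as a single pass over a time-ordered log (the event format here is hypothetical, not the study's actual log schema):

```python
def estimate_copied_terms(events):
    """events: time-ordered (time, user, kind, payload) tuples, where kind is
    'query' (payload = query string) or 'refresh' (payload ignored).
    A term counts as potentially copied if one user issues it after the other
    user first used it AND has pressed Refresh since that first use."""
    first_use = {}      # term -> (user who used it first, time of first use)
    last_refresh = {}   # user -> time of their most recent refresh
    copied = set()
    for time, user, kind, payload in events:
        if kind == "refresh":
            last_refresh[user] = time
        elif kind == "query":
            for term in payload.lower().split():
                if term not in first_use:
                    first_use[term] = (user, time)
                else:
                    origin_user, origin_time = first_use[term]
                    if (user != origin_user
                            and last_refresh.get(user, -1) >= origin_time):
                        copied.add(term)
    return copied

log = [(1, "A", "query", "emergency vehicle"),
       (2, "B", "refresh", None),
       (3, "B", "query", "emergency siren"),
       (4, "B", "query", "vehicle")]
# both 'emergency' and 'vehicle' are flagged as potentially copied by B
print(estimate_copied_terms(log))
```

As the slides note, this only gives an upper bound: seeing the term and retyping it independently are indistinguishable in the log.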
Copied terms
                     Watching  Mutual
Total unique terms   355       388
Total terms copied   44        40
% terms copied       12%       10%

• Suggests that a user is able to reuse search terms used by the other user
Searcher effort

                                 Independent  Watched  Watching  Mutual
Play events per task      Mean   187.08       192.21   170.08    184.79
                          SD     116.51       112.91   101.28    117.52
Next-shot events          Mean   115.04       121.08   97.04     104.13
per task                  SD     106.64       99.23    96.35     107.01
Previous-shot events      Mean   16.88        20.71    12.33     12.29
per task                  SD     23.90        22.20    18.77     15.86
• Recorded three types of events to gauge searcher effort
  – Play events, when a user clicks a shot
  – Move to next shot in video
  – Move to previous shot in video
• Only significant relationship:
  – Between the Watching and Watched conditions, for move to previous shot
Where did a user’s final results come from?
• From the interface, we logged the user dragging and dropping shots between the different parts of the interface
  – We could record when someone copied a shot from the other user
• Using this, we can estimate where users got their final results
  – (roughly!)
Conclusions
• Despite the game scenario, users didn’t copy other people’s shots much
  – This came as something of a surprise
• There was no significant increase in a user’s performance
  – Only a trend ...
• There is evidence that users do reuse search terms
  – 10 and 12% of terms were potentially copied
Conclusions
• Results on searcher effort were unclear
  – Only one event type showed significantly less interaction
The End