hippocampus: answering memory queries using transactive search

Post on 10-May-2015

Hippocampus: Answering Memory Queries using Transactive Search

Michele @pirroh Catasta, Alberto Tonon, Djellel Eddine Difallah, Gianluca Demartini, Karl Aberer, and Philippe Cudre-Mauroux

“A transactive memory system is a mechanism through which groups collectively encode, store, and retrieve knowledge.”
– Wikipedia

“[…] a memory system that is more complex and potentially more effective than that of any of its individual constituents.”
– Wikipedia

“[…] it is a property of a group. This unique quality of transactive memory brings with it the realization that we are speaking of a constructed system, a mode of group operation that is built up over time by its individual constituents.”
– Daniel M. Wegner

A transactive search system discovers and aggregates the information stored in a transactive memory.

INFORMATION NEED: reconstruct the attendee list of the 86th Academy Awards (2014)

MISTAKES: not all the nominees attend the ceremony

PRECISION :-(

MISSING ENTRIES: what about all the people working “behind the scenes”?

RECALL :-(
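
The two failure modes above correspond to the standard precision and recall measures. A minimal sketch, assuming the reconstructed list and the ground truth are both available as sets of names; the function name and the sample names are hypothetical placeholders, not data from the talk:

```python
def precision_recall(retrieved, ground_truth):
    """Return (precision, recall) of a retrieved set against a ground-truth set."""
    retrieved = set(retrieved)
    ground_truth = set(ground_truth)
    true_positives = len(retrieved & ground_truth)
    # Precision drops when the list contains people who did not attend.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall drops when actual attendees are missing from the list.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical example: one nominee did not attend, one crew member is missing.
nominees = {"nominee_a", "nominee_b", "nominee_c"}
attendees = {"nominee_a", "nominee_b", "crew_member_x"}
precision, recall = precision_recall(nominees, attendees)
```

Using the nominee list as the answer hurts both measures at once: the absent nominee lowers precision, and the missing crew member lowers recall.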

FROM THE IDEA…

• for data that is stored in the memories of a group of people, the current query strategies are suboptimal

• we need a new form of human computation, different from standard crowdsourcing (i.e., no anonymous crowd)

Navigational: The immediate intent is to reach a particular Web site.

Informational: The intent is to acquire some information assumed to be present on one or more Web pages.

Transactional: The intent is to perform some Web-mediated activity.

Transactive: The intent is to acquire some information that can be reconstructed only by an [ephemeral] social network.

“A taxonomy of Web Search”

— A. Broder (2002)

…TO THE TESTING ENVIRONMENT

• We want to reconstruct the attendee lists of two Semantic Web conferences, ISWC 2012 and ISWC 2013

• We were given access to the ground truth but, in general, such lists are not publicly available

• Additional data sources: author lists (first author, last author, etc.) and mentions in online social networks

EXPERIMENT ARCHITECTURE

• tailored Web UI + results aggregator

• iterative reconstruction: every time a new person was mentioned, Hippocampus sent her an invitation to contribute to the attendees list
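
The iterative reconstruction can be sketched as a breadth-first expansion over mentions. This is only an illustration under stated assumptions, not the Hippocampus implementation: `ask` is a hypothetical callback standing in for the Web UI and messaging step, and the `memories` data is invented for the example.

```python
from collections import deque


def reconstruct_attendees(seeds, ask):
    """Breadth-first expansion: invite each newly mentioned person exactly once.

    `ask(person)` is a hypothetical callback; it returns the names that
    `person` remembers as attendees.
    """
    invited = set(seeds)
    queue = deque(seeds)
    mentions = {}  # name -> number of times the name was mentioned
    while queue:
        contributor = queue.popleft()
        for name in ask(contributor):
            mentions[name] = mentions.get(name, 0) + 1
            if name not in invited:
                # A newly mentioned person receives an invitation to contribute.
                invited.add(name)
                queue.append(name)
    return invited, mentions


# Hypothetical memories, for illustration only.
memories = {
    "alice": ["bob", "carol"],
    "bob": ["carol", "dave"],
    "carol": [],
    "dave": ["alice"],
}
invited, mentions = reconstruct_attendees(["alice"], lambda p: memories.get(p, []))
```

The mention counts are kept alongside the invited set because a results aggregator could use them to weigh how many contributors agree on a name; that weighting is an assumption of this sketch.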

[Figure: Hippocampus architecture, with three components: discovery (Web UI + messaging), results aggregator, storage layer]

MACHINE LEARNING APPROACHES

• we collected the proceedings information and all the tweets with the conference hashtags

• we trained state-of-the-art classifiers with these features:

not possible without the ground truth!

ML + CROWDSOURCING APPROACHES

• Uncertain cases (precision): we asked the crowd to revise the low-confidence results of the ML classifier (e.g., people who didn’t attend the conference but tweeted about it)

• Unseen cases (recall): we asked the crowd to actively look for attendees not included in the authors list (e.g., organizers mentioned on the Web site)
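
The hybrid strategy described above can be sketched as a confidence-based routing step followed by a merge. This is a minimal sketch under assumptions: the thresholds, the function names, and the sample scores are illustrative and do not come from the talk.

```python
def route_predictions(scores, low=0.3, high=0.7):
    """Split classifier scores into accepted, rejected, and crowd-review sets.

    `scores` maps a candidate name to the classifier's estimated probability
    of attendance. The thresholds are illustrative assumptions.
    """
    accepted, rejected, to_crowd = set(), set(), set()
    for name, score in scores.items():
        if score >= high:
            accepted.add(name)
        elif score <= low:
            rejected.add(name)
        else:
            # Uncertain case: ask crowd workers to verify it.
            to_crowd.add(name)
    return accepted, rejected, to_crowd


def merge_with_crowd(accepted, crowd_confirmed, crowd_discovered):
    """Combine confident ML positives with crowd-verified and crowd-found names."""
    return set(accepted) | set(crowd_confirmed) | set(crowd_discovered)


# Hypothetical scores, for illustration only.
scores = {"author_a": 0.9, "tweeter_b": 0.5, "author_c": 0.1}
accepted, rejected, to_crowd = route_predictions(scores)
# Suppose the crowd rejects the uncertain case and finds one unseen organizer.
final = merge_with_crowd(accepted, crowd_confirmed=set(), crowd_discovered={"organizer_d"})
```

Here the uncertain route targets precision, while `crowd_discovered` stands in for the unseen cases that target recall.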

the crowd has access only to public data on the Web!

Transactive vs ML & Crowdsourcing ISWC 2013

Authors and Tweets: baseline (exhaustive list of authors and twitterers)

Machine Learning: SVM, M5P Regression

Machine Learning + Crowdsourcing: Hybrid_(uncertain, unseen, uncertain_unseen)

[Figure: attendees found over time with Transactive Search. X axis: day (10 Dec to 16 Dec). Y axis: retrieved results (0 to 450). Series: retrieved attendees 2012, duplicate names 2012, retrieved attendees 2013, duplicate names 2013.]

Transactive Memory Graph: in green, two isolated “components” discovered by top contributors
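
Isolated components of such a graph can be found with a plain graph traversal. The sketch below assumes that each mention is treated as an undirected edge between two people; that modeling choice, the function name, and the sample edges are assumptions made for illustration.

```python
def connected_components(edges):
    """Group people into connected components of an undirected mention graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Depth-first traversal collects everyone reachable from `start`.
        component, stack = set(), [start]
        while stack:
            person = stack.pop()
            if person in component:
                continue
            component.add(person)
            stack.extend(neighbors[person] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical mentions, for illustration only.
edges = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")]
components = connected_components(edges)
```

A component that is reachable only through one contributor's mentions would otherwise stay invisible to the rest of the group, which is one way to read the role of the top contributors in the figure.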

CONCLUSIONS

• for a specific class of queries, our Transactive Search performs up to 46% better than the best alternative approach (i.e., Machine Learning + Crowdsourcing)

• we will explore incentives for Hippocampus, as it is currently two orders of magnitude slower than the alternative approaches

• we reported some initial evidence that, as human memories fade with time, our approach works best with recent events
