Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data

Laurens De Vocht, Christian Beecks, Ruben Verborgh, Erik Mannens, Thomas Seidl, Rik Van de Walle

Upload: laurens-de-vocht

Post on 16-Feb-2017


Page 1: Effect of Heuristics on Serendipity in Path-Based Storytelling with Linked Data

Laurens De Vocht, Christian Beecks, Ruben Verborgh, Erik Mannens, Thomas Seidl, Rik Van de Walle

Page 2

Introduction

Pathfinding

Semantic Distance

Evaluation

Conclusions & Next Steps

Page 3

Introduction

Page 4

[Figure: two nodes, A and B, with a question mark]

Page 5

How to consistently improve and tailor existing pathfinding approaches? [pathfinding]

How well do heuristics affect user expectations, so that users can make discoveries while feeling confident about the relevance of the story facts? [serendipity]

Is semantic distance between facts a good criterion for optimizing the paths forming a story? [user judgments]

Page 6

Serendipity

[Figure: serendipity diagram with the labels trivial (-), randomness (-), familiarity, surprise, sense making, and discovery (+)]

Page 7

Pathfinding

Page 8

Original Core Algorithm: A*-based

A*

h = Jaccard Distance

w = Common Node Degree

Optimizations
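The slide names an A* search with a Jaccard distance heuristic (h) and a node-degree-based weight (w), but gives no formulas. The following is a minimal sketch of how such a search could be wired together, not the authors' implementation. The function names `a_star`, `jaccard_distance`, and `degree_weight`, the exact form of the weight, and the toy graph are all assumptions made for illustration.

```python
import heapq
import math
from typing import Callable, Dict, List, Optional, Set

Graph = Dict[str, Set[str]]
Score = Callable[[Graph, str, str], float]


def jaccard_distance(graph: Graph, a: str, b: str) -> float:
    """Heuristic h: 1 - |N(a) & N(b)| / |N(a) | N(b)| over neighbour sets."""
    union = graph[a] | graph[b]
    if not union:
        return 0.0
    return 1.0 - len(graph[a] & graph[b]) / len(union)


def degree_weight(graph: Graph, a: str, b: str) -> float:
    """Weight w (assumed form): hops into highly connected nodes cost more."""
    return 1.0 + math.log(1 + len(graph[b]))


def a_star(graph: Graph, start: str, goal: str,
           h: Score, w: Score) -> Optional[List[str]]:
    """Return a low-cost path from start to goal, or None if unreachable."""
    best_cost = {start: 0.0}
    parent: Dict[str, Optional[str]] = {start: None}
    counter = 0  # tie-breaker so the heap never compares beyond this field
    frontier = [(h(graph, start, goal), counter, 0.0, start)]
    while frontier:
        _, _, cost, node = heapq.heappop(frontier)
        if cost > best_cost[node]:
            continue  # stale entry: a cheaper route to this node was found
        if node == goal:
            path: List[str] = []
            current: Optional[str] = node
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in graph[node]:
            new_cost = cost + w(graph, node, nxt)
            if new_cost < best_cost.get(nxt, math.inf):
                best_cost[nxt] = new_cost
                parent[nxt] = node
                counter += 1
                estimate = new_cost + h(graph, nxt, goal)
                heapq.heappush(frontier, (estimate, counter, new_cost, nxt))
    return None


# Hypothetical undirected toy graph, stored as adjacency sets.
toy_graph: Graph = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "C": {"A", "B"},
    "D": {"A", "B", "E"},
    "E": {"B", "D"},
}
print(a_star(toy_graph, "A", "B", jaccard_distance, degree_weight))
```

Passing `h` and `w` as parameters mirrors the way the deck treats heuristics and weights as interchangeable. The Jaccard distance is not guaranteed to be an admissible heuristic, so the sketch should be read as returning a good path rather than a provably cheapest one.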

Page 9

Improved Algorithm Wraps Core Algorithm

Core Algorithm (h, w)

Domain Delineation

Iterative Refinement to increase semantic relatedness

Page 10

Heuristics [h]

Jaccard

Normalized DBpedia Distance

Confidence

Page 11

Weights [w]

Jaccard

Jiang-Conrath Distance (JCW)

Common Node Degree (CND)
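For reference, the Jiang-Conrath distance is commonly defined over information content (IC) as IC(a) + IC(b) - 2 · IC(lcs), where lcs is the most specific common ancestor of the two concepts. The sketch below shows that general definition only. How the deck maps it onto linked-data resources is not stated on the slide, so the function names and the probabilities used here are assumptions.

```python
import math


def information_content(probability: float) -> float:
    """IC(c) = -log p(c); rarer concepts carry more information."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log(probability)


def jiang_conrath_distance(p_a: float, p_b: float, p_lcs: float) -> float:
    """Distance between concepts a and b given their occurrence
    probabilities and that of their most specific common ancestor."""
    return (information_content(p_a)
            + information_content(p_b)
            - 2.0 * information_content(p_lcs))


# Hypothetical probabilities, chosen only to exercise the formula.
print(jiang_conrath_distance(p_a=0.01, p_b=0.02, p_lcs=0.1))
```

The distance is zero when both concepts coincide with their common ancestor and grows as the ancestor becomes more general relative to the two concepts.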

Page 12

Semantic Distance

Page 13

Semantic Distances

[Figure: graph linking Einstein and Newton through the nodes Physics and Hume, with edge labels :influences, :discipline, :birthPlace, and :deathPlace. Semantic distance: 0.62 via Physics, 0.45 via Hume.]

Page 14

Normalized Web Search Distance, e.g. Google Distance, Bing Distance…
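The Normalized Google Distance, which the slide gives as an example, is usually defined from hit counts: f(x) and f(y) for each term, f(x, y) for both terms together, and N for the size of the index. A minimal sketch of that published formula follows. The function name and the counts are hypothetical, and whether the deck's DBpedia variant uses exactly this form is an assumption.

```python
import math


def normalized_search_distance(fx: int, fy: int, fxy: int, n: int) -> float:
    """(max(log fx, log fy) - log fxy) / (log n - min(log fx, log fy)).

    fx, fy: number of documents containing each term
    fxy:    number of documents containing both terms
    n:      total number of indexed documents (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0:
        raise ValueError("each term must occur at least once")
    if fxy <= 0:
        return math.inf  # the terms never co-occur
    log_x, log_y, log_xy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_x, log_y) - log_xy) / (math.log(n) - min(log_x, log_y))


# Hypothetical counts, not taken from any real search index.
print(normalized_search_distance(fx=1000, fy=100, fxy=10, n=10**6))
```

Terms that always occur together get a distance of 0, and the value grows as co-occurrence becomes rarer relative to the individual counts.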

Page 15

Motivating Example

Page 16

Page 17

Page 18

Semantic Distances (continued)

Page 19

Evaluation

Page 20

Serendipity – Semantic Distance

Page 21

Serendipity – User Judgments

Page 22

Serendipity – User Judgments

Least agreement (high standard deviation): Carl Linnaeus and Albert Einstein [weight: JCW, heuristic: Jaccard]

Carl Linnaeus and Baruch Spinoza are Expert, Intellectual and Scholar

Baruch Spinoza’s and Albert Einstein’s are both Pantheists Intellectuals and Jewish Philosophers

Most relevant and consistent: Charles Darwin and Carl Linnaeus [weight: CND, heuristic: Jaccard]

Copley Medal’s the award of Alfred Russel Wallace and Charles Darwin

Alfred Russel Wallace’s and Charles Darwin’s awards are Royal Medal and Copley Medal

Alfred Russel Wallace and Charles Darwin are known for their Natural selection

Carl Linnaeus and Alfred Russel Wallace have as subject ‘Fellows of the Royal Society’

Carl Linnaeus and Alfred Russel Wallace are Biologists and Colleagues

Page 23

Conclusions & Next Steps

Page 24

Conclusions

Reducing the number of arbitrary resources/facts revealed for a story.

DBpedia example: telling a story with better link estimation, in cases where the original algorithm did not make optimal choices of links.

The most consistent output was generated with the Jaccard distance used both as weight and heuristic; or as heuristic in combination with the Jiang-Conrath distance as weight.

The most arbitrary facts occur in a story when using the common node degree as weight with the Jaccard distance as heuristic, both in the optimized and the original algorithm.

User judgments confirm the findings for the Jiang-Conrath weight, original algorithm and for the Jaccard distance used as weight and heuristic in terms of discovery.

Page 25

Next Steps

Validate the correlation between the effect of the link estimation on the arbitrariness as perceived by users and computational semantic relatedness measures such as SemRank.

Measure the scalability of the approach by implementing the algorithms: (i) solely on the client, (ii) completely on the server, and (iii) in a distributed client/server architecture.

Page 26

Additional questions?

@[email protected]

slideshare.net/laurensdv