Transcript
Page 1: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

Summarizing archival collections using storytelling techniques

Yasmin AlNoamany, Michele C. Weigle, Michael L. Nelson

Old Dominion University
Web Science & Digital Libraries Research Group

www.cs.odu.edu/~mln/
@phonedude_mln

Research Funded by IMLS LG-71-15-0077-15

Dodging the Memory Hole Los Angeles, CA, 2016-10-14

Page 2: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

2

Archive-It, a subscription-based service, allows creation of collections

> 3,000 collections

~340 institutions

> 10B archived pages

Page 3: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

3

Collection title

Collection categorization based on the curator

Seed URI

Metadata about the collection

Text search box

The group that the resource belongs to

List of the seed URIs

Timespan of the resource and the number of times it has been captured

Page 4: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

4

Collection understanding and collection summarization are not currently supported

It is not easy to answer “what’s in that collection?” or “how is this collection different from others?”

Page 5: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

5

There is more than one collection about “Egyptian Revolution”

• “2010-2011 Arab Spring” https://archive-it.org/collections/3101
• “North Africa & the Middle East 2011-2013” https://archive-it.org/collections/2349
• “Egypt Revolution and Politics” https://archive-it.org/collections/2358

Page 6: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

6

Page 7: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

7

Page 8: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

8

Page 9: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

9

Our early attempts at collection understanding tried to include everything…

“Visualizing digital collections at Archive-It”, JCDL 2012.
http://ws-dl.blogspot.com/2012/08/2012-08-10-ms-thesis-visualizing.html

Page 10: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

10

1000s of Seeds X 1000s of archived pages == Conventional Vis Methods Not Applicable

Page 11: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

11

Idea: Storytelling

Page 12: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

12

Stories in literature

Story elements: setting, characters, sequence, exposition, conflict, climax, resolution

Once upon a time

http://www.learner.org/interactives/story/

Page 13: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

13

Stories in social media

“It's hard to define a story, but I know it when I see it” (Alexander, 2008)

basically, just arranging web pages in time

Page 14: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

14

“Storytelling” is becoming a popular technique in social media

Page 15: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

15

What are the limitations of storytelling services?

Page 16: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

16

The Egyptian Revolution on Storify

Page 17: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

17

Bookmarking, not preserving!

Page 18: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

18

Despite these limitations, how do we combine storytelling & archives?

Page 19: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

19

Use an interface people already know how to use to summarize collections

Archived collections

Storytelling services

Archived enriched stories

Page 20: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

20

We sample k mementos from N (k << N) pages of the collection to create a summary story

[Figure: The Web; Archive-It Collections (Collection X, Collection Y, Collection Z, each with seeds S1, S2, S3, ...); Story]

Page 21: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

21

Yasmin hand-crafted stories to summarize the Egyptian Revolution collection for her son, Yousof

https://storify.com/yasmina_anwar/the-egyptian-revolution-on-archive-it-collection

https://storify.com/yasmina_anwar/the-story-of-the-egyptian-revolution-from-archive-

Page 22: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

22

How do we generate this automatically?

Page 23: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

23

Collections have two dimensions: {Fixed, Sliding} X {Page, Time}

[Figure: grid of archived resources R11 ... Rkn; axes: URI and Time (t1 ... tk)]

Page 24: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

24

Fixed Page, Fixed Time

A desktop Chrome user-agent
http://www.cnn.com/2014/02/24/world/africa/egypt-politics/index.html?hpt=wo_c2

Android Chrome user-agent
http://www.cnn.com/2014/02/24/world/africa/egypt-politics/index.html?hpt=wo_c2

Schneider and McCown, “First Steps in Archiving the Mobile Web: Automated Discovery of Mobile Websites”, JCDL 2013.
Kelly et al., “A Method for Identifying Personalized Representations in Web Archives”, D-Lib Magazine 2013.

Page 25: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

25

Fixed Page, Sliding Time

[Figure: captures dated Feb 1, Feb 1, Feb 2, Feb 4, Feb 5, Feb 7, Feb 9, Feb 11, Feb 11]

Page 26: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

26

Sliding Page, Fixed Time

Feb. 11, 2011: Mubarak resigns

Page 27: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

27

Sliding Page, Sliding Time

[Figure: captures dated Jan 25, Jan 27, Jan 31, Feb 2, Feb 4, Feb 7, Feb 10, Feb 11, Feb 11]

Page 28: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

28

The Dark and Stormy Archives (DSA) framework

Establish a baseline
• Characteristics of human-generated stories
• Characteristics of Archive-It collections

Reduce the candidate pool of archived pages
• Exclude duplicates
• Exclude off-topic pages
• Exclude non-English language pages

Select good representative pages
• Dynamically slice the collection
• Cluster the pages in each slice
• Select high-quality pages from each cluster
• Order pages by time
• Visualize

https://pbs.twimg.com/media/BQcpj7ACMAAHRp4.jpg

Page 29: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

29

Establish a baseline of social media stories

“Characteristics of Social Media Stories”, TPDL 2015, IJDL 2016.

Page 30: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

30

What is the length of a story (the number of resources per story)?

This story has 31 resources


Page 31: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

31

What are the types of resources that compose a story?

Quotes

Video

This story has:
• 19 quotes
• 8 images
• 4 videos

Page 32: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

32

What are the most frequently used domains?

This story has:
• 90% twitter.com
• 7% instagram.com
• 3% facebook.com

Page 33: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

33

The top 25 domains represent 92% of all domains

Page 34: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

34

What differentiates a popular story? (popular = stories with the top 25% of views)

19,795 views 64 views

Page 35: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

35

The distributions for the features of the stories

• Based on the Kruskal-Wallis test, at the p ≤ 0.05 significance level, the popular and the unpopular stories differ in most of the features
• Compared with the unpopular stories, popular stories tend to have:
  • more web elements (medians of 28 vs. 21)
  • a longer timespan (5 hours vs. 2 hours)

Page 36: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

36

Do popular stories have a lower decay rate?

The 75th percentile of decay rate per popular story is 10% of the resources, while it is 15% in the unpopular stories

Page 37: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

37

We found that 28 mementos, the median number of web elements in the popular stories, is a good number of resources for a story.

Page 38: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

38

Establish a baseline of current Archive-It collections

"Characteristics of Social Media Stories. What makes a good story?", International Journal on Digital Libraries 2016.

Page 39: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

39

The mean and median number of URIs in a collection

This collection has 435 seed URIs

Page 40: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

40

The mean and median number of mementos per URI

This seed URI has 16 mementos

Page 41: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

41

The most frequently used domains

abcnews.go.com

blogspot.com

This collection has 30% abcnews.com, 10% blogspot.com, 3% facebook.com

Page 42: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

42

Archive-It top 25 is fundamentally different than Storify top 25

Page 43: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

43

Archive-It top 25 is fundamentally different than Storify top 25

Twitter is #10 not #1

Page 44: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

44

What we archive and what we put in our stories are different subsets of the web

Page 45: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

45

Detecting off-topic pages

“Detecting Off-Topic Pages in Web Archives”, TPDL 2015, IJDL 2016.

Page 46: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

46

Archive-It provides its partners with tools that allow them to build themed collections

Page 47: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

47

Archive-It tools are about HTTP events / mechanics, not “content”

Page 48: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

48

Over 60% of archived versions of hamdeensabahy.com are off-topic

May 13, 2012: The page started as on-topic.

May 24, 2012: Off-topic due to a database error.

Mar. 21, 2013: Not working because of financial problems.

May 21, 2013: On-topic again.

June 5, 2014: The site has been hacked.

Oct. 10, 2014: The domain has expired.

http://wayback.archive-it.org/2358/*/http://hamdeensabahy.com

Page 49: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

49

How do we automatically detect off-topic pages?

Page 50: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

50

We investigated 6 similarity metrics

• Textual content
  • cosine similarity of TF-IDF
  • intersection of the 20 most frequent terms
  • Jaccard similarity coefficient
• Semantics
  • Web-based kernel function using a search engine (SE)
• Structural
  • the change in number of words
  • the change in content length
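The three textual metrics can be sketched in a few lines of Python. This is a minimal illustration, not the DSA code: the function names and the tokenizer are hypothetical, the cosine uses raw term frequencies rather than TF-IDF to stay dependency-free, and the normalization of the top-term intersection (dividing by the size of the first top-k set) is an assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_tf(a, b):
    """Cosine similarity of raw term-frequency vectors (no IDF weighting)."""
    ta, tb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm_a = math.sqrt(sum(v * v for v in ta.values()))
    norm_b = math.sqrt(sum(v * v for v in tb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard coefficient of the two token sets."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def tf_intersection(a, b, k=20):
    """Share of a's k most frequent terms that are also in b's top k."""
    top_a = {t for t, _ in Counter(tokenize(a)).most_common(k)}
    top_b = {t for t, _ in Counter(tokenize(b)).most_common(k)}
    return len(top_a & top_b) / len(top_a) if top_a else 0.0
```

Each function compares a later memento's text with the text of the first memento of the same URI; a page that has turned into an error message shares no terms with the original and scores 0 on all three.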

Page 51: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

51

Textual content: cosine similarity, intersection of the most frequent terms, Jaccard similarity

Method           Similarity
cosine           0.7
TF-Intersection  0.6
Jaccard          0.5

Page 52: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

52

Textual content: cosine similarity, intersection of the most frequent terms, Jaccard similarity

Method           Similarity
cosine           0.7
TF-Intersection  0.6
Jaccard          0.5

Method           Similarity
cosine           0.0
TF-Intersection  0.0
Jaccard          0.0

Page 53: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

53

Semantics of the text: Web-based kernel function using a search engine (SE)

Sahami and Heilman, A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets, WWW 2006

Page 54: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

54

Semantics of the text: Web-based kernel function using a search engine (SE)

Method     Similarity
SE-Kernel  0.7

Sahami and Heilman, A Web-based Kernel Function for Measuring the Similarity of Short Text Snippets, WWW 2006

Page 55: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

55

Structural methods: number of words, content length

Word count: 100 vs. 109

Method     % change
WordCount  0.09

Page 56: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

56

Structural methods: number of words, content length

Word count: 100 vs. 109

Method     % change
WordCount  0.09

Word count: 100 vs. 5

Method     % change
WordCount  -0.95
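The word-count figures on this slide are consistent with a relative change measured against the first memento: (109 − 100) / 100 = 0.09 and (5 − 100) / 100 = −0.95. A minimal sketch, assuming that reading (the function name is hypothetical):

```python
def relative_change(first_count, later_count):
    """Relative change of a later memento's count against the first memento's.

    Negative values mean the page shrank; -1.0 means it became empty.
    """
    if first_count == 0:
        raise ValueError("the first memento has no words to compare against")
    return (later_count - first_count) / first_count


print(relative_change(100, 109))
print(relative_change(100, 5))
```

The same function would apply to content length in bytes, the other structural measure.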

Page 57: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

57

We built a gold standard data set to evaluate the methods

Page 58: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

58

We manually labeled 15,760 mementos

Collection                              URI-Rs  URI-Ms  Off-topic URI-Ms
Egypt Revolution and Politics           136     6,886   384
Occupy Movement                         255     6,570   458
Columbia Univ. Human Rights collection  198     2,304   94

Page 59: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

59

Evaluated 6 methods at 21 thresholds

• Assumed the first memento was on-topic
• Combined two methods ('OR') to find the best combination method
  • 15 combinations
  • 6,615 tests (15 combinations x 21 thresholds x 21 thresholds)
• Averaged the results at each threshold over the three collections
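The test count follows from choosing 2 of the 6 methods and trying every pair of thresholds. A short check, using the method labels from the results table as plain strings:

```python
from itertools import combinations

# Labels taken from the results table; they are only used for counting here.
METHODS = ["Cosine", "TF-Intersection", "Jaccard", "SEKernel", "WordCount", "Bytes"]
N_THRESHOLDS = 21

pairs = list(combinations(METHODS, 2))                # unordered pairs of methods
pair_tests = len(pairs) * N_THRESHOLDS * N_THRESHOLDS  # one threshold per method

print(len(pairs), pair_tests)
```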

Page 60: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

60

Cosine Similarity performed well

Similarity Measure    Threshold     FP  FN   FP+FN  ACC    F1     AUC
(Cosine,WordCount)    (0.10,-0.85)  24  10   34     0.987  0.906  0.968
(Cosine,SEKernel)     (0.10,0.00)   6   35   40     0.990  0.901  0.934
Cosine                0.15          31  22   53     0.983  0.881  0.961
(WordCount,SEKernel)  (-0.80,0.00)  14  27   42     0.985  0.818  0.885
WordCount             -0.85         6   44   50     0.982  0.806  0.870
SEKernel              0.05          64  83   147    0.965  0.683  0.865
Bytes                 -0.65         28  133  161    0.962  0.584  0.746
Jaccard               0.05          74  86   159    0.962  0.538  0.809
TF-Intersection       0.00          49  104  153    0.967  0.537  0.740

Page 61: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

61

Average precision of 0.89 on 18 Archive-It collections

(Cosine,WordCount) with (0.10,-0.85) thresholds
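A sketch of how the combined (Cosine,WordCount) rule might be applied to one memento. The 'OR' combination and the thresholds come from the slides; the direction of the comparisons (a score below its threshold flags the page) and the strict inequality are assumptions, and the names are hypothetical.

```python
COSINE_THRESHOLD = 0.10      # similarity to the first memento
WORDCOUNT_THRESHOLD = -0.85  # relative change in word count


def is_off_topic(cosine_sim, wordcount_change):
    """Flag a memento if either measure falls below its threshold ('OR')."""
    return cosine_sim < COSINE_THRESHOLD or wordcount_change < WORDCOUNT_THRESHOLD


# The two example pages from the earlier slides:
print(is_off_topic(0.7, 0.09))   # similar text, similar length
print(is_off_topic(0.0, -0.95))  # no shared terms, 95% fewer words
```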

Page 62: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

62

Detecting duplicates in a TimeMap

Page 63: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

63

9 mementos for news.egypt.com, but 5 are duplicates
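The slides do not say how duplicates are detected. One simple possibility, shown here purely as an assumption, is to hash each memento's extracted text and keep only the first memento of every run of identical content. The function name and sample data are hypothetical.

```python
import hashlib


def drop_duplicates(timemap):
    """Keep the first memento of each run of identical content.

    timemap is a list of (timestamp, text) pairs sorted by timestamp.
    """
    kept = []
    previous_digest = None
    for timestamp, text in timemap:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest != previous_digest:
            kept.append((timestamp, text))
        previous_digest = digest
    return kept


# Hypothetical TimeMap: 9 captures, only 4 distinct consecutive versions.
sample = list(enumerate(["a", "a", "a", "b", "b", "c", "c", "c", "d"]))
unique = drop_duplicates(sample)
print(len(sample) - len(unique), "duplicates removed")
```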

Page 64: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

64

How do we dynamically divide the collections into appropriate slices?

Page 65: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

65

We expected to see more like this…

The Global Food Crisis collection at Archive-It

Page 66: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

66

This is what we found

Egypt Revolution and Politics

Human Rights April 16 Archive Virginia Tech Shooting

Jasmine Revolution 2011 Wikileaks Document Release

Page 67: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

67

Selecting representative pages for generating stories (skipping clustering details, but goal is k=28)

Page 68: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

68

Quality metrics for selecting mementos

• In the DSA, memento quality Mq is calculated as follows: Mq = (1 − wm*Dm) + wql*Sql + wqc*Sqc
• Dm is the memento damage (Brunelle, JCDL 2014)
• Sql is the snippet quality based on the URI level
• Sqc is the snippet quality based on the URI category
• wm, wql, wqc are the weights of memento damage, level, and category
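The quality formula translates directly into code. The slides give no weight values, so the defaults below are placeholders chosen only for illustration, and the function and parameter names are hypothetical.

```python
def memento_quality(damage, level_quality, category_quality,
                    w_m=1.0, w_ql=1.0, w_qc=1.0):
    """Mq = (1 - wm*Dm) + wql*Sql + wqc*Sqc.

    damage is Dm, level_quality is Sql, category_quality is Sqc.
    The default weights are placeholders, not values from the talk.
    """
    return (1 - w_m * damage) + w_ql * level_quality + w_qc * category_quality


# Within a cluster, the memento with the highest score would be selected.
candidates = {"memento_a": (0.6, 0.2, 0.5), "memento_b": (0.1, 0.8, 0.5)}
best = max(candidates, key=lambda name: memento_quality(*candidates[name]))
print(best)
```

Lower damage and higher snippet scores both raise Mq, so a lightly damaged deep link outranks a heavily damaged homepage under any positive weights.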

Page 69: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

69

We prefer a higher quality memento (Dm)

http://wayback.archive-it.org/2358/20110201231457/http://news.blogs.cnn.com/category/world/egypt-world-latest-news/

http://wayback.archive-it.org/2358/20110201231622/http://www.bbc.co.uk/news/world/middle_east/

Brunelle et al. Not All Mementos Are Created Equal: Measuring The Impact Of Missing Resources, JCDL 2014

Page 70: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

70

We consider the page that gives an attractive snippet

https://wayback.archive-it.org/2358/20110207193404/http://news.blogs.cnn.com/2011/02/07/egypt-crisis-country-to-auction-treasury-bills/

https://wayback.archive-it.org/2358/20110207194425/http://www.cnn.com/2011/WORLD/africa/02/07/egypt.google.executive/index.html?hpt=T1

Page 71: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

71

We prefer deep links over high level domains (Sql)

Feb. 11, 2011: the homepage of BBC on Storify

Feb. 11, 2011: the homepage of BBC Middle East section on Storify

Feb. 11, 2011: the article of BBC on Storify

https://wayback.archive-it.org/2358/20110211191429/http://www.bbc.co.uk/

https://wayback.archive-it.org/2358/20110211192204/http://www.bbc.co.uk/news/world-middle-east-12433045

https://wayback.archive-it.org/2358/20110211191942/http://www.bbc.co.uk/news/world/middle_east/

Page 72: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

72

Social media pages may not produce good snippets (Sqc)

http://wayback.archive-it.org/1784/20100131023240/http:/twitter.com/Haitifeed/

http://wayback.archive-it.org/2358/20141225080305/https:/www.facebook.com/elshaheeed.co.uk

Page 73: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

73

Visualizing stories in Storify

Page 74: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

74

Remember Yasmin’s hand-crafted stories?

Page 75: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

75

Remember Yasmin’s hand-crafted stories?

Page 76: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

76

We extract the metadata of the pages and order them chronologically

{
  "elements": [
    {
      "permalink": "http://wayback.archive-it.org/694/20070523182134/http://www.usatoday.com/news/nation/2007-04-16-virginia-tech_N.htm",
      "type": "link",
      "source": {"href": "http://www.usatoday.com", "name": "www.usatoday.com @ 23, May 2007"}
    },
    {
      "permalink": "http://wayback.archive-it.org/694/20070530182159/http://www.time.com/time/specials/2007/vatech_victims",
      "type": "link",
      "source": {"href": "http://www.time.com", "name": "www.time.com @ 30, May 2007"}
    },
    {
      "permalink": "http://wayback.archive-it.org/694/20070530182206/http://www.collegiatetimes.com/",
      "type": "link",
      "source": {"href": "http://www.collegiatetimes.com", "name": "www.collegiatetimes.com @ 30, May 2007"}
    },
    {
      "permalink": "http://wayback.archive-it.org/694/20070606234248/http://hokies416.wordpress.com/",
      "type": "link",
      "source": {"href": "http://hokies416.wordpress.com", "name": "hokies416.wordpress.com @ 06, Jun 2007"}
    },
    …
    {
      "permalink": "http://wayback.archive-it.org/694/20070620234329/http://www.hokiesports.com/april16/",
      "type": "link",
      "source": {"href": "http://www.hokiesports.com", "name": "www.hokiesports.com @ 20, Jun 2007"}
    }
  ],
  "description": "This is an automatically generated story from Archive-It collection.",
  "title": "April 16 Archive"
}

We override the default metadata to generate more attractive snippets
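A sketch of how such a JSON object could be assembled from a list of memento URIs. The field names and the "host @ day, month year" label mirror the example on this slide; the helper names and the URI pattern are assumptions, and the month abbreviation depends on the default (English) locale.

```python
import re
from datetime import datetime
from urllib.parse import urlsplit

# Archive-It memento URI: /<collection id>/<14-digit timestamp>/<original URI>
MEMENTO_RE = re.compile(r"^https?://wayback\.archive-it\.org/(\d+)/(\d{14})/(.+)$")


def story_element(memento_uri):
    """Build one story element with a dated source label."""
    match = MEMENTO_RE.match(memento_uri)
    if match is None:
        raise ValueError("not an Archive-It memento URI: " + memento_uri)
    _collection_id, timestamp, original_uri = match.groups()
    captured = datetime.strptime(timestamp, "%Y%m%d%H%M%S")
    host = urlsplit(original_uri).netloc
    return {
        "permalink": memento_uri,
        "type": "link",
        "source": {
            "href": "http://" + host,
            "name": f"{host} @ {captured.strftime('%d, %b %Y')}",
        },
    }


def build_story(memento_uris, title, description):
    """Order mementos by capture timestamp and wrap them in a story object."""
    ordered = sorted(memento_uris, key=lambda u: MEMENTO_RE.match(u).group(2))
    return {
        "elements": [story_element(u) for u in ordered],
        "description": description,
        "title": title,
    }
```

Sorting on the 14-digit timestamp string works because fixed-width timestamps sort lexicographically in chronological order. The result can be serialized with `json.dumps` from the standard library.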

Page 77: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

77

Example of an automatically generated story

Notice the good metadata: images, titles with dates, favicons

Page 78: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

78

Evaluating the Dark and Stormy Archive framework

Page 79: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

79

What a successful evaluation looks like!

• We use Amazon's Mechanical Turk to compare the following stories:
  • Human-generated stories
  • DSA (automatically) generated stories
  • Randomly generated stories
• Successful evaluation should result in:
  • Human and DSA stories are indistinguishable
  • Human and DSA stories are better than Random

Page 80: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

80

Our guidelines for expert archivists at Archive-It for generating stories from the collections

Page 81: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

81

We received 23 stories for 10 Archive-It collections

SPST is “Sliding Page, Sliding Time”
SPFT is “Sliding Page, Fixed Time”
FPST is “Fixed Page, Sliding Time”

Page 82: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

82

https://storify.com/mturk_exp/3649b1s-57218803f5db94d11030f90b

• Generated by domain experts
• Sliding Page, Sliding Time
• The Boston Marathon Bombing collection

Page 83: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

83

Automatically generated stories from archived collections

1. Obtain the seed list and the TimeMap of URIs from the front-end interface of Archive-It
2. Extract the HTML of the mementos from the WARC files (locally hosted at ODU) and download the collections that we do not have in the ODU mirror from Archive-It
3. Extract the text of the page using the Boilerpipe library
4. Eliminate the off-topic pages based on the best-performing method, (Cosine, WordCount), with the suggested thresholds (0.1, −0.85)
5. Exclude the duplicates of each TimeMap
6. Eliminate the non-English language pages
7. Slice the collection dynamically and then cluster the mementos of each slice using the DBSCAN algorithm
8. Apply the quality metrics to select the best representative pages
9. Sort the selected mementos chronologically, then put them and their metadata in a JSON object

Page 84: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

84

https://storify.com/mturk_exp/3649b0s

• Automatically generated story
• Sliding Page, Sliding Time
• The Boston Marathon Bombing collection

Page 85: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

85

Random stories

28 mementos were randomly selected from each collection before excluding off-topic and duplicate pages

Page 86: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

86

https://storify.com/mturk_exp/3649b2s-57227227bb79048c2d0388dc

• Randomly generated story
• Sliding Page, Sliding Time
• The Boston Marathon Bombing collection

Page 87: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

87

https://storify.com/mturk_exp/3649bads

If someone prefers this story, we exclude their results

• Poorly generated story
• Sliding Page, Sliding Time
• The Boston Marathon Bombing collection

Page 88: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

88

MT experiment setup

• Three HITs for each story (69 HITs to evaluate 23 stories); two comparisons per HIT:
  • HIT1: human vs. automatic, human vs. poor
  • HIT2: human vs. random, human vs. poor
  • HIT3: random vs. automatic, automatic vs. poor
• 15 distinct turkers with the Master qualification (a high acceptance rate) for each HIT
• We rejected submissions that preferred the poorly generated stories and HITs that were completed in less than 10 seconds (mean time per HIT = 7 minutes)
• 989 of the 1,035 (69 x 15) submitted HITs were valid

• We awarded the turker $0.50 per HIT
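The counts on this slide can be checked with simple arithmetic; the variable names below are only for illustration.

```python
STORIES = 23
HITS_PER_STORY = 3
TURKERS_PER_HIT = 15
VALID_SUBMISSIONS = 989

hits = STORIES * HITS_PER_STORY            # distinct HITs
submissions = hits * TURKERS_PER_HIT       # one submission per turker per HIT
valid_share = VALID_SUBMISSIONS / submissions

print(hits, submissions, f"{valid_share:.1%}")
```

So roughly 96% of the submissions survived the two rejection rules.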

https://www.mturk.com/mturk/help?helpPage=worker#what_is_master_worker

Page 89: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

89

A sample HIT

Page 90: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

90

DSA == Human

(Human, DSA) > Random

Page 91: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

91

Automatic versus Human

Sliding Page, Sliding Time Sliding Page, Fixed Time Fixed Page, Sliding Time

Page 92: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

92

Human versus Random

Sliding Page, Sliding Time Sliding Page, Fixed Time Fixed Page, Sliding Time

Page 93: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

93

Automatic versus Random

Sliding Page, Sliding Time Sliding Page, Fixed Time Fixed Page, Sliding Time

Page 94: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

94

Success!

DSA-generated stories are just as good as stories generated by human experts

Page 95: Nelson, Michael: Summarizing Archival Collections Using Storytelling Techniques

95

Use an interface people already know how to use to summarize collections

Archived collections

Storytelling services

Archived enriched stories

All the code, datasets, papers, slides, etc.: http://bit.ly/YasminPhD

