Robust Linking to Web Resources

@mart1nkle1n

DtMH 2017, 11/15/2017, San Francisco, CA

Robust Linking to Web Resources

http://robustlinks.mementoweb.org/

Martin Klein

@mart1nkle1n

Research Library

Los Alamos National Laboratory

Acknowledgements:

Herbert Van de Sompel, LANL

Harihar Shankar, LANL

Michael L. Nelson, ODU

Mark Graham, Internet Archive

Slide by Herbert Van de Sompel, 2017

A Managed Collection Desires Reliable Outlinks

Slide by Herbert Van de Sompel, 2017

Links to another Managed Collection

Slide by Herbert Van de Sompel, 2017

Links to Web at Large Resources

Link Rot

https://web.archive.org/web/20140101072007/http://netpreserve.org/general-assembly/2013/overview

IIPC

2013

http://netpreserve.org/general-assembly/2013/overview

IIPC

today

Content Drift

https://web.archive.org/web/20161228184110/https://www.epa.gov/climatechange

EPA

12/2016

https://www.epa.gov/sites/production/files/signpost/cc.html

EPA

today

• On the web, all links are subject to reference rot

• Reference rot hinders our ability to follow links as they were intended when they were put in place

• Link rot: a link stops working altogether

• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked

Problem
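
The two failure modes above can be told apart mechanically. The following is a minimal sketch, assuming a copy of the page body was kept when the link was created. The name `classify_reference` and the exact-hash comparison are illustrative assumptions, not something taken from the talk.

```python
import hashlib
from typing import Optional


def classify_reference(status_code: Optional[int],
                       body_then: bytes,
                       body_now: Optional[bytes]) -> str:
    """Classify one link as 'link rot', 'content drift', or 'intact'.

    status_code: HTTP status of today's request (None if the request failed).
    body_then:   page body saved when the link was put in place.
    body_now:    page body retrieved today (None if nothing came back).
    """
    # Link rot: the link no longer resolves to any content.
    if status_code is None or status_code >= 400 or body_now is None:
        return "link rot"
    # Content drift: the link resolves, but the content is no longer the same.
    # Assumption: byte-for-byte equality via a hash stands in for "same content".
    if hashlib.sha256(body_then).digest() != hashlib.sha256(body_now).digest():
        return "content drift"
    return "intact"
```

An exact hash is deliberately strict: a rotating advertisement or an embedded timestamp would already count as drift, so a practical check would likely need a more tolerant similarity measure.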

http://dx.doi.org/10.1371/journal.pone.0115253

http://dx.doi.org/10.1371/journal.pone.0167475

Reference Rot in Scholarly Communication

Link Rot in Scholarly Articles

Reference Rot Over Time - arXiv

• On the web, all links are subject to reference rot

• Reference rot hinders our ability to follow links as they were intended when they were put in place

• Link rot: a link stops working altogether

• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked

How can we:

1. Make links more robust?

2. Make them actionable for humans and machines?

Problem

Robust Links

Robust Links

1. Create a snapshot of referenced resources in a public web archive

Why multiple archives? They aren’t magic web sites! They’re just web sites.

If you used Mummify, you’re now left with a bunch of defunct, shortened links like:

https://mummify.it/XbmcMfE3

Slide by Michael L. Nelson, 2016

Robust Links

1. Create a snapshot of referenced resources in a publicly available web archive

2. Decorate links with:

• URI of archived snapshot

• datetime of archiving

• resource’s original URI

Link Decoration with Standard HTML

<a href="http://web.archive.org/web/20171108053054/http://sfgov.org/"
   data-originalurl="http://sfgov.org/"
   data-versiondate="2017-11-08">City and County of San Francisco</a>

http://robustlinks.mementoweb.org/spec
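
The decoration shown on this slide can also be generated programmatically. The sketch below is only an illustration: the function name `robust_link` and its parameters are hypothetical, and it emits exactly the three pieces of information used in the slide's example (snapshot URI in `href`, original URI in `data-originalurl`, archiving date in `data-versiondate`).

```python
from html import escape


def robust_link(snapshot_uri: str, original_uri: str,
                version_date: str, link_text: str) -> str:
    """Build an <a> element decorated like the example on the slide.

    snapshot_uri: URI of the archived snapshot (goes into href).
    original_uri: the resource's original URI (data-originalurl).
    version_date: datetime of archiving, e.g. "2017-11-08" (data-versiondate).
    link_text:    visible anchor text.
    """
    # Escape every value so quotes or ampersands cannot break the markup.
    return (
        f'<a href="{escape(snapshot_uri, quote=True)}" '
        f'data-originalurl="{escape(original_uri, quote=True)}" '
        f'data-versiondate="{escape(version_date, quote=True)}">'
        f'{escape(link_text)}</a>'
    )


if __name__ == "__main__":
    print(robust_link(
        "http://web.archive.org/web/20171108053054/http://sfgov.org/",
        "http://sfgov.org/",
        "2017-11-08",
        "City and County of San Francisco",
    ))
```

Because only `data-*` attributes are added, the result remains ordinary HTML that any browser renders as a normal link.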

Link Decoration via API

http://robustlinks.mementoweb.org/api/json/http://web.archive.org/web/20171108053054/http://sfgov.org/

• Submit URI of an archived snapshot

• Retrieve Robust Links HTML snippet

• Copy and paste into your application

http://robustlinks.mementoweb.org/api/json/{URI-of-archived-snapshot}
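
As a rough sketch of the first step, the request URI can be built from the template above by appending the snapshot URI as-is, which is how the slide's own example is formed. The helper names `api_request_uri` and `fetch_raw` are hypothetical. The slides do not describe the structure of the JSON response, so the sketch returns the raw bytes and does not assume any field names; the service may also no longer be reachable.

```python
from urllib.request import urlopen

# Base of the URI template shown on the slide.
API_BASE = "http://robustlinks.mementoweb.org/api/json/"


def api_request_uri(snapshot_uri: str) -> str:
    """Fill the slide's URI template with the URI of an archived snapshot."""
    # Assumption: the snapshot URI is appended unencoded, as in the slide's example.
    return API_BASE + snapshot_uri


def fetch_raw(snapshot_uri: str, timeout: float = 10.0) -> bytes:
    """Request the decoration for a snapshot and return the unparsed response body."""
    with urlopen(api_request_uri(snapshot_uri), timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    # Only builds and prints the URI; call fetch_raw() to actually contact the service.
    print(api_request_uri(
        "http://web.archive.org/web/20171108053054/http://sfgov.org/"))
```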

Robust Links

1. Create a snapshot of referenced resources in a publicly available web archive

2. Decorate links with:

• URI of archived snapshot

• datetime of archiving

• resource’s original URI

Benefits:

• Can visit archived, immutable version of referenced resource

• Original URI & capture datetime allow finding versions in other web archives

• Uniform, machine-actionable
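
To illustrate what "machine-actionable" can mean in practice, here is a minimal sketch that pulls the decoration back out of a page using only Python's standard library. The class and function names are hypothetical, and the sketch assumes links are decorated as in the slide's example, with `data-originalurl` and `data-versiondate` on the `<a>` element.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class RobustLinkParser(HTMLParser):
    """Collect every <a> element that carries Robust Links decoration."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Dict[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        # Undecorated links are skipped; only fully decorated ones are kept.
        if "data-originalurl" in attributes and "data-versiondate" in attributes:
            self.links.append({
                "snapshot": attributes.get("href"),
                "original": attributes["data-originalurl"],
                "versiondate": attributes["data-versiondate"],
            })


def extract_robust_links(html_text: str) -> List[Dict[str, Optional[str]]]:
    """Return the decorated links found in an HTML document."""
    parser = RobustLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

With the original URI and the archiving date in hand, a client is no longer tied to the one archive named in `href`; it has what it needs to look for a capture of the same resource from around the same time elsewhere.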

Robust Links for Machines

1. JavaScript

2. Browser extensions

a. Memento for Chrome

b. IA Chrome Extension

Robust Links in Action - JavaScript

http://dx.doi.org/10.1045/november2015-vandesompel

Robust Links in Action – Memento for Chrome

https://chrome.google.com/webstore/detail/memento-time-travel/jgbfpjledahoajcppakbgilmojkaghgm

Robust Links in Action – Memento for Chrome

http://robustlinks.mementoweb.org/demo/uri_references.html

Robust Links in Action – IA Chrome Extension

https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak

Take-Aways

• Links on the web are subject to reference rot

• “Robustifying” them (manually or via API calls) can help alleviate the problem

• Link decorations as proposed by Robust Links are

• based on HTML standards

• machine-actionable

• Organizations such as the Internet Archive, Wikipedia, and news publishers can help with adoption
