Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia Olaf Janssen, National Library of the Netherlands Digital Cultural Heritage, Berlin, 30-08-2017 [email protected] - @ookgezellig - slideshare.net/OlafJanssenNL

Upload: koninklijke-bibliotheek

Post on 21-Jan-2018

TRANSCRIPT

Page 1: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia

Olaf Janssen, National Library of the Netherlands

Digital Cultural Heritage, Berlin, 30-08-2017

[email protected] - @ookgezellig - slideshare.net/OlafJanssenNL

Page 2: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://www.4en5meiamsterdam.nl/attachment/47454

Page 3: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

During WW2 the Dutch resistance issued many underground newspapers.

In every shape & form…

http://www.4en5meiamsterdam.nl/attachment/47454

Page 4: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://resolver.kb.nl/resolve?urn=ddd:010436323

http://resolver.kb.nl/resolve?urn=ddd:010442948

http://resolver.kb.nl/resolve?urn=ddd:010447825

http://resolver.kb.nl/resolve?urn=ddd:010450508

From well-organized, ‘professional’ big titles…

(including Parool, Vrij Nederland, Trouw, de Waarheid)

Page 6: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

After the war, 1,300 newspaper titles were collected & preserved at the NIOD…

https://commons.wikimedia.org/wiki/File:Verzetskrant_in_archiefdozen_bij_het_NIOD.jpg – CC-BY-SA - OlafJanssen

The national Institute for War, Holocaust and Genocide Studies in Amsterdam

Page 7: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://opac-gonext.oclc.org:8180/DB=8/XMLPRS=Y/PPN?PPN=107123223

… and were described in formal library catalogues

(1,300 titles)

Bibliographic metadata

Underground students’ newspaper from The Hague

Page 8: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

In 2010 these WW2 newspapers were digitized…

Page 9: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

www.delpher.nl/kranten

… into full texts in Delpher…

(1,300 titles)

The Dutch national aggregator for historic full texts:
• Newspapers
• Books
• Magazines

Page 10: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

In Delpher you can read and word-search these newspapers…

Page 12: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

But say, I want to know more about this newspaper:
• What sort of illegal newspaper was it?
• What is the history of this newspaper?
• Who wrote it?
• Where was this newspaper printed?
• How was it distributed?
• Were there any relations with other underground newspapers or resistance groups?
• Etc…

Page 13: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

You can’t answer these questions from Delpher

Page 14: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Big drawback of Delpher:

No contextual information about WW2 underground newspapers

https://thejungleisneutral.files.wordpress.com/2013/11/lost.jpg

Page 15: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Where would many people go to find contextual information about historic newspapers?

Probably Wikipedia (via Google)

Page 17: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

Page 18: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

Page 20: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

Information on Dutch underground newspapers was distributed across multiple, unconnected sources

1. Descriptions (metadata in library catalogue, 1,300 titles)
2. Content (full-text in Delpher, 1,300 titles)
3. Context (in Wikipedia… at least…)

Page 24: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany
Page 25: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

This Wikipedia article is a carefully chosen exception

Page 26: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

1. Very few illegal newspapers had their own WP articles

2. The inventory of these newspapers on WP:NL was far from complete

(far fewer than 1,300 titles)

Page 27: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

We tackled both problems!

Page 28: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Wikiproject

“Systematically and uniformly describe all 1,300 Dutch underground newspapers from WW2 on Wikipedia”

tinyurl.com/verzetskranten

Page 29: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Reach big audiences

Page 30: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

https://thejungleisneutral.files.wordpress.com/2013/11/lost.jpg

We badly needed contextual information about the newspapers. Where did we get it?

De Ondergrondse Pers 1940-1945

Lydia E. Winkel, H. de Vries, 1989

This paper book contains entries about all 1,300 illegal newspapers

Page 31: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Entry 199 – De Geus; (onder studenten)

Page 32: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Unique ID (within the book)

Page 33: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Place of publication

Newspaper Place name

Page 34: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Context

Raw material for Wikipedia article!

Page 35: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Person names

Newspaper Persons

Page 36: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

IDs of related students’ newspapers

This newspaper Other newspapers
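
The entry fields highlighted on the preceding slides (unique ID within the book, place of publication, person names, IDs of related newspapers) map naturally onto a small record type. The following is a minimal sketch in Python, not the project's actual data model: the class name, field names, and helper function are assumptions, and all sample values other than entry number 199 and its title are made up.

```python
from dataclasses import dataclass, field


@dataclass
class NewspaperEntry:
    """One entry from the book, reduced to the fields shown on the slides."""
    entry_id: int                      # unique ID within the book
    title: str
    place: str                         # place of publication
    persons: list[str] = field(default_factory=list)
    related_ids: list[int] = field(default_factory=list)


def related_titles(entry, index):
    """Resolve related-entry IDs to titles, skipping IDs missing from the index."""
    return [index[i].title for i in entry.related_ids if i in index]


# Only entry 199 and its title come from the slides; everything else is hypothetical.
entries = [
    NewspaperEntry(199, "De Geus; (onder studenten)", "Place A", ["Person X"], [200, 999]),
    NewspaperEntry(200, "Hypothetical related title", "Place B"),
]
index = {e.entry_id: e for e in entries}
print(related_titles(index[199], index))
```

Keeping related newspapers as IDs rather than copied titles means each title is stored once and every cross-reference can be checked against the index.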

Page 38: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

We OCRed this book into PDF (CC-BY-SA)

http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)

Available online (PDF, flat file)

Open license (CC-BY-SA)

Convert PDF into structured database.
• Link titles to places, persons, other titles
• Link titles to KB-catalogue (metadata) and Delpher (full-text)
• Link titles, persons and places to external sources
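
One way to picture these linking steps is as subject, predicate, object statements, the basic shape of linked data. The sketch below is only an illustration under stated assumptions: the `ex:` identifiers, the predicate names, and the `example.org` URLs are invented for this example and do not reflect the vocabulary or identifiers used in the actual database.

```python
def entry_to_triples(entry_id, title, place, persons, related_ids, external_links):
    """Express one book entry as (subject, predicate, object) triples."""
    subject = f"ex:newspaper/{entry_id}"
    triples = [
        (subject, "ex:title", title),
        (subject, "ex:publishedIn", f"ex:place/{place}"),
    ]
    # Links from the title to persons and to other titles.
    triples += [(subject, "ex:involvesPerson", f"ex:person/{p}") for p in persons]
    triples += [(subject, "ex:relatedTo", f"ex:newspaper/{r}") for r in related_ids]
    # Links to the catalogue record, the full text, or other external sources.
    triples += [(subject, predicate, url) for predicate, url in external_links]
    return triples


# Hypothetical sample values; only the entry number and title appear in the slides.
triples = entry_to_triples(
    199,
    "De Geus; (onder studenten)",
    "PlaceA",
    ["PersonX"],
    [200],
    [
        ("ex:catalogueRecord", "https://example.org/catalogue/199"),
        ("ex:fullText", "https://example.org/fulltext/199"),
    ],
)
for triple in triples:
    print(triple)
```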

Page 43: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Convert PDF into structured database.

• Link titles to places, persons, other titles
• Link titles to library catalogue (metadata) and Delpher (full-text)
• Link titles, persons and places to external sources

LOD & database expert: Gerard Kuys

Page 45: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

VIAF

Page 46: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany
Page 48: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Summer 2016

This LOD database is unique in the Netherlands.

It was the first time that data about underground newspapers was systematically collected and linked online!

https://www.pinterest.com/freethewronged/world-war-ii/

Page 49: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Wikiproject

“Systematically and uniformly describe all 1,300 Dutch underground newspapers from WW2 on Wikipedia”

Page 50: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

We have: LOD database

Using an article template, we generated 1,300 uniform and interlinked Wikipedia stubs
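
Generating uniform, interlinked stubs from a database with an article template can be sketched roughly as follows. This is an illustration, not the template the project used: the template wording, the record layout, and the function name are assumptions, and all sample values other than entry 199's title are hypothetical.

```python
from string import Template

# Hypothetical stub wording; the real article template is not shown in the slides.
STUB_TEMPLATE = Template(
    "'''$title''' was an underground newspaper published in [[$place]] "
    "during World War II.$related\n"
)


def render_stub(record, titles_by_id):
    """Fill the template for one record; related entries become wiki links."""
    links = [f"[[{titles_by_id[i]}]]" for i in record["related_ids"] if i in titles_by_id]
    related = f" Related titles: {', '.join(links)}." if links else ""
    return STUB_TEMPLATE.substitute(
        title=record["title"], place=record["place"], related=related
    )


records = {
    199: {"title": "De Geus; (onder studenten)", "place": "Place A", "related_ids": [200]},
    200: {"title": "Hypothetical title", "place": "Place B", "related_ids": []},
}
titles_by_id = {entry_id: r["title"] for entry_id, r in records.items()}
stubs = {entry_id: render_stub(r, titles_by_id) for entry_id, r in records.items()}
print(stubs[199])
```

Because every stub comes from the same template, the articles are uniform, and because related entries are rendered as wiki links, the stubs point at each other from the start.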

https://c1.staticflickr.com/9/8281/7699231918_11a7356c38_b.jpg

Page 51: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

Page 52: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Grey = Wikipedia article stub, automatically generated from the database using the article template

Page 53: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

Non-grey = Wikipedia article stub, automatically generated from the database using the article template

Page 54: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

This bit was added manually to expand the stub into a full article

Crowdsourcing by Dutch Wikipedia community

https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

Page 55: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Wikipedia volunteers are expanding the 1,300 stubs… gradually creating more and more full articles.

By Sebastiaan ter Burg [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

Page 56: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Before the project

Page 57: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

The number of articles is growing steadily…

Page 58: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

… making Dutch people wiser and happier!

http://www.formerdays.com/2011/05/dutch-liberation.html

Page 59: Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany

Thank you very much!

[email protected] - @ookgezellig

tinyurl.com/verzetskranten