Preserving Born-Digital News Panel, JCDL 2016
Collecting, Analyzing, and Linking TV News and Social Media Collections
@mart1nkle1n #jcdl2016 Newark, NJ, 06/21/2016
Panel: Preserving Born-Digital News
Collecting, Analyzing, and Linking TV News and Social Media Collections
Peter Broadwell @peterbroadwell
Martin Klein @mart1nkle1n
University of California, Los Angeles Research Library
International Digitizing Ephemera Project
http://digital.library.ucla.edu/dep/
• Iranian Green Movement
• Tahrir Square Unrest
• Zanzibar Riots
• Israel, South Africa, Argentina, Cuba, Armenia, Ukraine
Collecting Social Media - Tweets
• Tahrir Square Egypt & Libya unrest, 2011
• Tōhoku earthquake and tsunami, Japan, 2011
• AirAsia 8501 crash, December 2014
• Charlie Hebdo shooting, January 2015
• GOP and Democratic Party presidential debates 2015/16
Social Feed Manager: http://social-feed-manager.readthedocs.org/
Collecting TV News - NewsScape
• 289,174 hours of TV news archived digitally
• Recorded 2005-present, ca. 145 shows/day
• 46 networks, 13 countries, 9 languages
• Searchable by captions, official transcripts, on-screen text
• 3.55 billion words indexed
Linking TV News and Social Media
CNN, 09/16/2015, 05:22pm
Twitter, 09/16/2015, 06:22pm
Linking via Automated Entity Detection
• Discover and highlight commonalities and relationships between disjoint collections on related news events
• Link to authorities
• Address problem of disambiguation
• Improve discoverability and reusability
Experimental Exploration
• Apply DBpedia Spotlight Named Entity Recognition (NER) software to collections on second GOP presidential primary debate on 09/16/2015
• Twitter: 800,000 tweets
• TV: CNN coverage of debate
• Minute granularity
• Persons, Organizations, Places
Results:
• Linked entities with URIs to DBpedia resources
• Visualization of correlations between entities
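The per-minute entity counting described above could be sketched roughly as follows. This is an illustration, not the presenters' pipeline: it assumes each tweet or caption line has already been annotated, and that the annotation JSON follows the shape of DBpedia Spotlight's annotate output (a "Resources" list with "@URI", "@surfaceForm", and comma-separated "@types"). The function names, the type labels in WANTED_TYPES, and the sample record are hypothetical and hand-written for the example; they are not real collection data.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Entity classes named on the slide, written as DBpedia ontology type labels.
# The exact label strings are an assumption made for this sketch.
WANTED_TYPES = {"DBpedia:Person", "DBpedia:Organisation", "DBpedia:Place"}


def extract_entities(spotlight_json):
    """Return (uri, surface form) pairs for resources of a wanted type."""
    entities = []
    for resource in spotlight_json.get("Resources", []):
        types = set(resource.get("@types", "").split(","))
        if types & WANTED_TYPES:
            entities.append((resource["@URI"], resource["@surfaceForm"]))
    return entities


def count_per_minute(annotated_items):
    """Count entity URIs per minute.

    annotated_items: iterable of (datetime, spotlight_json) pairs,
    one pair per tweet or caption line.
    """
    counts = defaultdict(Counter)
    for timestamp, spotlight_json in annotated_items:
        # Truncate to the minute to get "minute granularity".
        minute = timestamp.replace(second=0, microsecond=0)
        for uri, _surface in extract_entities(spotlight_json):
            counts[minute][uri] += 1
    return counts


# Hand-written sample annotation (illustrative only, not real output).
sample = {
    "Resources": [
        {
            "@URI": "http://dbpedia.org/resource/Ronald_Reagan",
            "@surfaceForm": "Reagan",
            "@types": "Schema:Person,DBpedia:Person",
        }
    ]
}
items = [
    (datetime(2015, 9, 16, 17, 22, 5), sample),
    (datetime(2015, 9, 16, 17, 22, 40), sample),
    (datetime(2015, 9, 16, 17, 23, 1), sample),
]
counts = count_per_minute(items)
```

Running the same counting separately over the Twitter and TV caption collections would give two per-minute series per entity URI, which is the kind of data a side-by-side graph of the two sources could be drawn from.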
Persons
http://sologlo.library.ucla.edu/visualizations/gop_debate/rickshaw/graphs/gop_persons.html
Graph panels: Twitter, NewsScape
Places
http://sologlo.library.ucla.edu/visualizations/gop_debate/rickshaw/graphs/gop_places.html
Graph panels: Twitter, NewsScape
Organizations
http://sologlo.library.ucla.edu/visualizations/gop_debate/rickshaw/graphs/gop_orgs.html
Graph panels: Twitter, NewsScape
http://dbpedia.org/resource/Ronald_Reagan
Hashtags
http://sologlo.library.ucla.edu/visualizations/gop_debate/rickshaw/graphs/gop_orgs_hashtags.html
#DebateWithBernie
http://sologlo.library.ucla.edu/visualizations/gop_debate/rickshaw/graphs/gop_orgs_hashtags.html