Web Science and Web Archive Research @ L3S Wolfgang Nejdl L3S Research Center Hannover, Germany

Post on 28-Dec-2015

Page 1:

Web Science and Web Archive Research

@ L3S

Wolfgang Nejdl

L3S Research Center, Hannover, Germany

Page 2:

L3S @ Hannover

Page 3:

Web Science @ L3S

Computer Science and interdisciplinary research on all aspects of the Web

• Internet: Communication and Networks

• Information: Accessing information and knowledge on and through the Web

• Community: Supporting communities and groups on the Web, for research, education, production and entertainment

• Society: Requirements (technological, social, legal) for the Web

Selected projects

• LivingKnowledge: Diversity, opinion and bias on the Web

• CUbRIK: Searching by computers and humans

• Glocal: Event-based Search for Networked Media

• Privacy and clinical research

• Arcomem: Social Web & Archiving

• ForgetIT: Concise Preservation via Managed Forgetting

Page 4:

Spam

Attack on Copts

Gun running from Sudan

Are we losing the past of the Web?

Page 5:

Are we losing the past of the Web?

Library of Congress

• In April 2010, LoC and Twitter signed an agreement to archive all tweets since 2006

• January 2013: It is clear that technology to allow for scholarship access to large data sets is lagging behind technology for creating and distributing such data. The Library is now pursuing partnerships to allow some limited access capability in reading rooms.

German National Library

• Based on a law of June 22, 2006, the GNL should collect, enrich, catalog and archive Web publications

Internet Archive

• Archiving the Web (3 Petabyte) since 1996

• Access possible through the URL

National Archives in Denmark, Portugal, etc.

Relevant Projects @ L3S

• Web Archiving: LiWA, ARCOMEM, ForgetIT

• Web Search: PHAROS, CUbRIK

• Web Analysis: EUMSSI

• ERC Advanced Grant: ALEXANDRIA (2014–2018, 2.5 million euros)

Cooperations

• German National Library, British Library, Internet Archive, Rutgers University, et al.
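The Internet Archive entry on this slide notes that archived pages are reached through their URL. As a rough illustration only, the sketch below builds such an address, assuming the Wayback Machine's public pattern `https://web.archive.org/web/<timestamp>/<original URL>`. The function name `wayback_url` and the example site are hypothetical and do not come from the slides.

```python
from datetime import datetime
from urllib.parse import urlsplit

# Assumed public address prefix of the Internet Archive's Wayback Machine.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, when: datetime) -> str:
    """Build a Wayback Machine address for `original_url` near time `when`.

    Hypothetical helper: it only formats a string and makes no network call.
    """
    # Add a scheme if the caller passed a bare host name such as "www.l3s.de".
    if not urlsplit(original_url).scheme:
        original_url = "http://" + original_url
    # The timestamp is written as 14 digits: YYYYMMDDhhmmss.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


if __name__ == "__main__":
    print(wayback_url("http://www.l3s.de/", datetime(2013, 1, 15)))
```

The service typically redirects such a request to the capture closest to the requested time, so the timestamp does not have to match a crawl exactly.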

Page 6:

ERC Grant ALEXANDRIA: Temporal Information Retrieval, Exploration and Analytics in Web Archives

Page 7:

ALEXANDRIA Test Beds

Temporal Wikipedia

• English, German, Italian Wikipedia with all revisions

• Links to news archives (NYTimes, Times, Zeit) and web content

• Entity extraction and evolution, time and entity aware retrieval

Academic Web Archive

• Academic content in Germany and UK

• BibSonomy and FreeSearch/DBLP data

• Time-aware entity extraction and linking, collaborative exploration and analytics

Politics on the Web

• Political web sites: German and UK Web content (together with the British Library, German National Library and Internet Archive), Stanford US collections, new crawls, blogs, social media

• Social stream aggregation, collaborative analytics, as well as the other research questions

Page 8:

Web Observatory and eHumanities

Multidisciplinary Research Questions:

• How to decide which Web content to capture, in order to enable relevant analysis by the eHumanities? How to document the selection and collection process?

• How can combining distributed Web Observatories help to cover multiple perspectives, disciplines and tasks (for selection)?

• How does the Web influence collective and individual remembering and language? How to systematically capture Web evolution and the evolution of observed processes and social realities?

• What are relevant multidisciplinary methods for a comprehensive analysis of Web content and the (changing) social realities reflected by it?

• How to deal with legal, commercial and privacy aspects of Web Archiving?

Collective remembering & collective memory in the Web Age: "Web Memory / Archive"

Web as reflection of social processes and practices, language, culture: "Web (Archive) as Memory"

Web Observatory with focus on eHumanities: "Web Gedächtnis" (German for "Web memory")