WebArchiv - Archive of the Czech Web

Upload: jaroslav-kvasnica

Post on 12-Jul-2015

WebArchiv - Archive of the Czech Web

5. 6. 2014

WebArchiv

• a digital archive of Czech web resources

• purposes of web archiving:

  • growth of electronic online resources
  • long-term preservation
  • at-risk content on the web

Department of Web Archiving

History

• project started in 2000

• first document harvested on 3. 9. 2001

• IIPC member since 2007

• part of the National Digital Library since 2008

Today

• 87 TB of archived data

• whole archive accessible in the library

• only selective harvests accessible online

• more than 4000 archived websites with online access

• 3 people in the department + 1 IT guy

• focus on long-term preservation

Legal Issues

• Legal deposit act - doesn’t cover online-born documents

• Copyright act - only the library licence, which allows a library to reproduce a work for its own archiving or conservation purposes

• Online access - based on contracts with publishers or on Creative Commons licences

Web Archive Content

1. Comprehensive harvests

2. Selective harvests

3. Topic collections

Comprehensive harvests

• contract with the Czech domain provider CZ.NIC

• a once-a-year crawl of the whole .cz domain

• accessible only in the library

• a maximum of 5000 harvested files per site
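
The scope and per-site cap above can be illustrated with a minimal sketch. It assumes that a "site" is identified by its host name and that the check runs before each download; neither detail is stated on the slide, and the class and function names are hypothetical. In a real deployment the limit would be configured in the crawler rather than written by hand.

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_FILES_PER_SITE = 5000  # cap stated on the slide


def in_cz_domain(url: str) -> bool:
    """Return True if the URL's host is under the .cz top-level domain."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".cz")


class SiteBudget:
    """Counts accepted files per host (assumption: one host = one site)."""

    def __init__(self, limit: int = MAX_FILES_PER_SITE) -> None:
        self.limit = limit
        self.counts = Counter()

    def accept(self, url: str) -> bool:
        """Accept a URL only if it is in scope and its host is under the cap."""
        if not in_cz_domain(url):
            return False
        host = urlsplit(url).hostname
        if self.counts[host] >= self.limit:
            return False
        self.counts[host] += 1
        return True
```

A crawl loop would call `accept(url)` for each candidate URL and skip those for which it returns `False`.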

Selective Harvests

• selective approach:
  • territory
  • language
  • authorship
  • topic/content

• curated resources
  • crawled periodically (several frequencies)
  • communication with publishers
  • online access
  • cataloging
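
Periodic crawling at several frequencies can be pictured as a simple due-date check. The sketch below is an illustration only: the slide does not name the actual frequencies or how they are stored, so the `CuratedResource` record, its `interval_days` field, and the example intervals are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class CuratedResource:
    url: str
    interval_days: int                  # hypothetical harvest frequency
    last_harvest: Optional[date] = None  # None means never harvested


def is_due(resource: CuratedResource, today: date) -> bool:
    """A resource is due if it was never harvested or its interval has elapsed."""
    if resource.last_harvest is None:
        return True
    return today - resource.last_harvest >= timedelta(days=resource.interval_days)


def due_for_harvest(resources: List[CuratedResource], today: date) -> List[str]:
    """Return the URLs that should be included in today's selective harvest."""
    return [r.url for r in resources if is_due(r, today)]
```

Running `due_for_harvest` once a day would yield the seed list for that day's selective crawl.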

Topic Collections

• collections of resources related to a certain event or topic

• for example:
  • presidential elections
  • floods
  • Olympic Games

Workflow

• selecting and evaluating

• contracting with publishers

• harvesting

• access and quality assurance

Software

• crawler: Heritrix

• access: OpenWayback

• web curator tool: WA admin

• https://github.com/WebArchivCZ/
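
Wayback-style access tools conventionally address an archived copy by a 14-digit capture timestamp followed by the original URL. The sketch below builds such an address. The access-point prefix and the capture time are placeholders, not the archive's real values, and `replay_url` is a hypothetical helper, not part of OpenWayback.

```python
from datetime import datetime


def replay_url(access_point: str, captured: datetime, original_url: str) -> str:
    """Build a Wayback-style replay URL: <prefix>/<YYYYMMDDhhmmss>/<original URL>."""
    timestamp = captured.strftime("%Y%m%d%H%M%S")
    return f"{access_point.rstrip('/')}/{timestamp}/{original_url}"


# Placeholder access point and capture time, for illustration only.
example = replay_url(
    "http://wayback.example.org/wayback/",
    datetime(2001, 9, 3, 12, 0, 0),
    "http://www.webarchiv.cz/",
)
```

Keeping the timestamp in the path lets one original URL resolve to many captures over time.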

Thank you for your attention.

Barbora Bjačková [email protected]

Jaroslav Kvasnica [email protected]

http://www.webarchiv.cz
