P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Upload: infoclioch

Post on 16-Jan-2015

DESCRIPTION

Presentation by Peter Doorn (Data Archiving and Networked Services, DANS) given at the infoclio.ch conference in Bern on 16 September 2010.

TRANSCRIPT

Page 1: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Building national and international data infrastructures for humanities research

Peter Doorn - director, Data Archiving and Networked Services (DANS); co-ordinator, “Preparing DARIAH” (Digital Research Infrastructure for the Arts and Humanities)

Presentation for Digitale Forschungsinfrastrukturen für die Geschichtswissenschaften, Bern, 16 September 2010

Driven by data

Page 2: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Contents:

1. What is a data/research infrastructure?

2. The changing needs of the researchers

3. Setting up data infrastructures in the Netherlands, 1964 – 2010

4. The next steps

5. DARIAH and other international initiatives

Page 3: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

1. What is a data/research infrastructure?

In the natural sciences: something concrete, something physical....

A building, a telescope, a particle accelerator, a nuclear icebreaker....

Page 4: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities
Page 5: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities
Page 6: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Research Infrastructures (R.I.)

• R.I. in general: permanent and physical
• R.I. for the arts and humanities?
– Cultural heritage in all forms is the main source of humanities research
– Libraries, archives and museums are the traditional “laboratories” for the humanities
• In the digital age, essential for innovative humanities research is:
– Access to digitised heritage data (databases, text corpora, speech, image collections, etc.)
– Tools to process this information
• The most important new research infrastructure for the humanities is therefore a digital one

Page 7: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

What kind of infrastructure do humanities scholars need?

Page 8: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

From Humanities computing to e-humanities

Roots go back to the 1960s:
• text analysis, e.g. bible studies
• quantitative social and economic history
• computer linguistics
• digital archaeology

E-humanities as analogy of e-science: ‘science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, large-scale computing resources and high performance visualisation.’

Page 9: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

3. Setting up data infrastructures in the Netherlands, 1964 – 2010

Page 10: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

1989: Netherlands Historical Data Archive

• Initiative by Low Countries Association for History and Computing

• Started with feasibility study, followed by inventory of databases and a pilot

• Until 1995 on a project basis, supported with digitization projects

• Organizational form flexible

Page 11: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

2004: Electronic Depot of Dutch Archaeology
• Idea came up at a conference of historians and archaeologists in 2003
• 1980: computer used during excavation
• Initiative by university archaeologists, data archive and state archaeological service
• Started as a series of projects, since 2005 hosted by DANS

Page 12: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

DANS created in 2005
• Merger of earlier existing data infrastructures
• Serving humanities and social sciences
• Mission: providing permanent access to research data
• Funded by the Royal Netherlands Academy of Arts and Sciences (KNAW) and the Netherlands Organisation for Scientific Research (NWO)
• Budget: 2.5 M€ + 1 M€ projects. Staff grew from 10 to almost 40 (including projects).

Page 13: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

What do we do?
• Archive data and provide access
• Data projects in connection with researchers
• Data Seal of Approval
• Persistent Identifiers
• Symposia and Publications
• Subsidize “small data projects”

Three sections:
• Archive
• Infrastructure
• Software Development

Page 14: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Datasets according to disciplines

Page 15: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

www.dans.knaw.nl

Page 16: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Electronic Archiving System for searching and depositing data

Page 17: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Dataset description

Page 18: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Download data after login

Data

Documentation

Publications

Page 19: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Download statistics visible to all registered users

Page 20: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Digitization of Population Censuses

Page 21: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

www.volkstellingen.nl

Page 22: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Spreadsheets are look-alikes of original published tables

Page 23: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Mapping the census data for the Dutch municipalities

Page 24: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Shipping in the “Golden Age”

Page 25: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Journal entries, 26-29 September 1758

Ship’s name: Noordbeveland
Month: September
Year: 1758
Day: Tuesday
Date: 26th
Weather on board
Wind
Peculiarities
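
The fields of the journal page above could be captured in a small record type. The following is only a minimal sketch: the class name `LogbookEntry` and its field names are hypothetical and are not taken from the CLIWOC project or from any DANS schema. The weekday is derived from the date rather than stored, so it can be compared with the day written in the journal.

```python
from dataclasses import dataclass
from datetime import date

# Weekday names indexed the way date.weekday() counts (Monday = 0).
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


@dataclass
class LogbookEntry:
    """Hypothetical record for one day of a ship's journal."""
    ship_name: str
    entry_date: date
    weather_on_board: str = ""   # free text, left empty when not transcribed
    wind: str = ""
    peculiarities: str = ""

    @property
    def weekday(self) -> str:
        # Computed in the Gregorian calendar used by Python's date type.
        return WEEKDAYS[self.entry_date.weekday()]


# The entry shown on the slide: Noordbeveland, 26 September 1758.
entry = LogbookEntry(ship_name="Noordbeveland", entry_date=date(1758, 9, 26))
print(entry.ship_name, entry.entry_date.isoformat(), entry.weekday)
```

Deriving the weekday gives a simple consistency check between a transcribed date and the day name recorded in the source.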

Page 26: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Dutch Shipping Routes 1750-1850

Courtesy of CLIWOC project, KNMI

Page 27: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

The research data:
• can be found on the Internet
• are accessible (clear rights and licenses)
• are in a usable format
• are reliable
• can be referred to (persistent identifier)

www.datasealofapproval.org

5 criteria, 16 guidelines
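
The five criteria listed on this slide can be read as a simple checklist. The sketch below is an illustration only: `DatasetRecord`, its fields and `unmet_criteria` are hypothetical names, and it covers just the five criteria named here, not the 16 guidelines or the actual assessment procedure of the Data Seal of Approval.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical record; field names are assumptions, not a DANS schema."""
    found_on_internet: bool
    rights_and_licences_clear: bool
    usable_format: bool
    reliable: bool
    persistent_identifier: Optional[str] = None


def unmet_criteria(record: DatasetRecord) -> List[str]:
    """Return the criteria from the slide that the record does not satisfy."""
    checks = [
        ("can be found on the Internet", record.found_on_internet),
        ("is accessible (clear rights and licences)",
         record.rights_and_licences_clear),
        ("is in a usable format", record.usable_format),
        ("is reliable", record.reliable),
        ("can be referred to (persistent identifier)",
         bool(record.persistent_identifier)),
    ]
    return [label for label, ok in checks if not ok]


# A dataset that meets every criterion except having a persistent identifier.
example = DatasetRecord(True, True, True, True, persistent_identifier=None)
missing = unmet_criteria(example)
print(missing)
```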

Page 28: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

4. The next steps

Broaden DANS into a discipline-independent data organisation.

Many DANS activities are independent of discipline:

• Data Quality Guidelines: Data Seal of Approval

• Resolver for Persistent Identifiers

• Selection criteria for data preservation

• Deposit and Access Licenses, Intellectual Property Rights, Privacy

• Standards (Archival file formats, metadata)

• Storage, conversion, backup, documentation services

Page 29: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

New DANS strategy

– In line with National Coalition for Digital Preservation

– Build bridges between e-science and digital humanities

– Connect to other data infrastructures and initiatives in the technical, natural and life sciences

– Step by step approach

– Many large-scale facilities on the National Roadmap have a data function

Page 30: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

www.dariah.eu

5. DARIAH and other international initiatives

European infrastructure challenges

• In spite of some achievements, existing research infrastructures are primarily national... if they are there at all!

• European activities are until now funded on a project basis and carried out as voluntary activities by national partners

• Stable, pan-European research infrastructures for the arts and humanities hardly exist

• Increasing internationalisation of humanities research places new requirements on such infrastructures

• DARIAH is the only ESFRI proposal for the arts and humanities

Page 31: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Science Case for DARIAH
• Changing research practice in a networked world:
• Digital resources (data & tools) form the laboratory of the scholar in the arts and humanities
• Computational technologies and methods of analysis
• Resources on the web are highly distributed
• The scale of research goes up: networked projects
• European projects have no continuity
• The existing structures are too weak (ad hoc networks, no permanence) and national in scope
• Answer: strong European data infrastructure, providing continuity and support for digital A&H research and access to digital resources

Page 32: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

DARIAH Mission

The mission of DARIAH is to enhance and support digitally enabled research across the humanities and arts. DARIAH aims to develop and maintain an infrastructure in support of ICT-based research practices, working with communities of practice to:

• Explore and apply ICT-based methods and tools to enable new research questions to be asked and old questions to be answered in new ways

• Link and provide access to distributed digital source materials of many kinds

• Exchange knowledge, expertise, methodologies and practices across domains and disciplines

Page 33: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

DARIAH Partners

• 14 members in 10 countries: Croatia, Cyprus, Denmark, France, Germany (2), Greece (2), Ireland, Netherlands, Slovenia, United Kingdom (3)

• Associate members: Italy, Spain, Sweden

• Aspiring partners: Austria, Switzerland

• Other prospective partners in: Bulgaria, FYROM (Macedonia), Hungary, Lithuania, Norway, Serbia, Romania

Map legend: Members, Associate, Aspiring, Prospective

Page 34: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Preparation Project: Overview of the Work Packages

1. Project management
2. Dissemination
3. Strategic work
4. Financial work
5. Governance and logistical work
6. Legal work
7. Technical reference architecture
8. Technical: Conceptual modelling

Page 35: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Preparing DARIAH: time schedule

Timeline, 2007 to 2010:

• October 2006: Publication of ESFRI Roadmap
• December 2006: Publication of relevant FP7 call
• May 2007: Deadline, Capacities call for ESFRI projects
• Q3 2008: Agreement on EC funding
• Q4 2008: Start of “Preparing DARIAH”
• Q4 2009: Funders’ meeting
• Financial commitment?
• Q3 2010: DARIAH conference
• Q1 2011: Start of construction of DARIAH

Page 36: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

DARIAH Virtual Competency Centers (Hubs)

Research & Education: supporting research groups and centres in the 'digital humanities'; knowledge exchange and education, postgraduate programmes and researcher exchange

e-Infrastructure: service provision, systems & tools, connecting resources

Advocacy & Promotion (& Management): PR, encourage collaboration, community building, website, administration, demonstrate value and impact

Content & Legal: supporting scholarly data creation, access, curation and preservation; rights management, IPR licences, quality assurance

Page 37: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

The VCC concept

Page 38: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

DARIAH Governance and Costs in Construction Phase

Governance structure of ERIC

Page 39: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

Relations to other projects and networks

Page 40: P. Doorn (DANS Netherlands) - Building Data Infrastructures for Humanities

19-21 October, Vienna: SDH, DARIAH-CLARIN conference
