Opening up data: a UK perspective – Jisc and CNI conference 10 July 2014


DESCRIPTION

Kevin Ashley, director, Digital Curation Centre

TRANSCRIPT

Page 1: Opening up data: a UK perspective – Jisc and CNI conference 10 July 2014

Opening up data: A UK perspective

Kevin Ashley
Digital Curation Centre

www.dcc.ac.uk
@kevingashley

Reusable with attribution: CC-BY The DCC is supported by Jisc

Page 2

A summary

• Policy background
• The end point – why it matters
• UK reaction & developments
• Infrastructure
• Costs
• Joining up internationally
• More than data…

2014-07-10 Kevin Ashley -Jisc/CNI 2014 - CC-BY

Page 3

My home – the DCC

• Mission – to increase capability and capacity for research data services in UK institutions
• Not just a UK problem – an international one
• Training, shared services, guidance, policy, standards, futures

Page 4

SWEDEN

DENMARK

CANADA

Page 5

Data reuse stories

• The palaeontologist who saved years of work with archaeological data
• The 19th-century ships’ logs that help us model climate change
• The ‘noise’ from research radar that mapped dust from Eyjafjallajökull

Page 6

Data reuse - messages

Often your data tells stories that your publications do not

Not all data comes from other researchers

One person’s noise is another person’s signal

Discipline-bounded data discovery doesn’t give us all we need or want

Page 7

Why does this matter?

• Research quality
  – How close can we get to the truth?
• Research speed
  – How quickly can we get to the truth?
• Research finance
  – How much does the truth cost?
• Improving one or more of these is of interest to all actors:
  • Researchers as data creators
  • Researchers as data reusers
  • Research institutions
  • Funders – hence government and society

Page 8

The Policy

Page 9

G8 UK - Endorses OA

Open Data Charter

Policy Paper

18 June 2013

Page 10

Funder requirements

• UK – RCUK (generic), NERC, STFC, ESRC, BBSRC, EPSRC, MRC
• USA – NSF, NEH, NIH
• Europe
  • Denmark, Germany, Netherlands…
• Most place burden on researcher – some on the institution

http://www.epsrc.ac.uk/about/standards/researchdata/Pages/policyframework.aspx

Page 11

RCUK policy - The 1-minute version

• Research data are a public good – make openly available in timely & responsible way
• Have policies & plans. Data with long-term value should be preserved & usable
• Metadata for discovery & reuse. Link publications & data
• Sometimes law, ethics get in the way. We understand.
• Limited embargos OK. Recognition is important – always cite data sources
• OK to use public money to do this. Do it efficiently.

Page 12

EPSRC policy points

• Awareness of regulatory environment
• Data access statement
• Policies and processes
• Data storage
• Structured metadata descriptions
• Permanent identifiers for data
• Securely preserved for a minimum of 10 years from last use

Compliance expected by 2015

Page 13

DCC Policy Summary

http://www.dcc.ac.uk/resources/policy-and-legal

Page 14

The Response

Page 15

DCC guidance

Page 16

Roles and Responsibilities

What data to keep

Page 17

Compliance

Benefits

Page 18

DCC ‘institutional engagement’

[Diagram labels]

Assess needs

Make the case

Develop support and services

RDM policy development

Customised Data Management Plans

DAF & CARDIO assessments

Guidance and training

Workflow assessment

DCC support team

Advocacy with senior management

Institutional data catalogues

Pilot RDM tools

…and support policy implementation

Page 19

Who (in the UK) is leading on RDM?

Library

IT

Research Office

Page 20

Survey of UK HE RDM readiness

• 61 of 69 responded (> 10% funding from research)
• 90% using internal funding for staff, training
• 57% filling all or most roles through restructuring
• Russell Group: 4.7 FTE -> 9.5 FTE within a year
• Others: 2.6 FTE -> 3 FTE
• Lack of clarity on staff outside central services

[Pie chart. Values as extracted: 31%, 38%, 14%, 17%. Legend: Research Support & Commercialisation; Library or Information Service; IT/Research computing; Others]

Data & charts from Angus Whyte, DCC
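
The headline figures on this slide can be turned into rates with a little arithmetic. The sketch below is an illustration only: the helper name `percent_change` is hypothetical, and it assumes the two FTE pairs are before/after staffing levels over the same period.

```python
def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Figures quoted on the slide.
responded, surveyed = 61, 69
response_rate = responded / surveyed * 100.0

# FTE figures quoted for each group (before -> after).
russell_growth = percent_change(4.7, 9.5)
others_growth = percent_change(2.6, 3.0)

print(f"Response rate: {response_rate:.1f}%")
print(f"Russell Group FTE growth: {russell_growth:.1f}%")
print(f"Other institutions FTE growth: {others_growth:.1f}%")
```

On these numbers the response rate is about 88%, Russell Group staffing roughly doubled, and staffing elsewhere grew by about 15%.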

Page 21

Drivers – UK institutions

[Bar chart: % agreeing, axis 0 to 100. Categories: UK Research Council data policies; Government policy on open data; Governance of research integrity / academic conduct; Strategy to expand support for research; EU Horizon2020 policy on data management. Values as extracted: 92, 57, 54, 54, 53]

Page 22

Least progress

[Bar chart: % indicating piloting or live, axis 0 to 25. Categories: Business planning & sustainability; Digital preservation & continuity planning; Governance of data access & reuse]

Page 23

What kind of external support is needed?

• Advice on retention, selection
• Advice on metadata creation for discovery
• Specifying tools & infrastructure
• Costing
• Advocacy to senior management
• Developing data catalogues/registers

Page 24

EPSRC asked its researchers…

• 75% know of funder’s policy (25% in detail)
• 55% know their institution has a policy
• 70% are not aware of institutional training or services for RDM
• Some contradictory responses

Thanks to Ben Ryan, EPSRC, for quotes & data

Page 25

Services researchers are aware of

• Help with data management planning
• Help with metadata creation
• Training
• Backup of research data
• Dedicated storage

Page 26

Some selected observations

The nature of my work is such that it generates no data that doesn't end up in my papers, so I'm unlikely to know about these policies.

This is irrelevant to me. I deal with no sensitive data

RDM sounds like a gigantic waste of time and I intend to spend as little time on it as possible

I am on the point of retiring so taking less interest in these things

Page 27

Infrastructure


Page 30

Data centres are good value!

• See Jisc reports on ADS, BADC, UKDA:

http://www.jisc.ac.uk/whatwedo/programmes/di_directions/strategicdirections/badc.aspx

Returns on investment between 400% and 1200%

Page 31

Research Data Registry & Discovery Service

• Modelled on Research Data Australia
• Gain visibility of small data collections
• Help drive home distinction between discoverable data & open data
• Get evidence on which metadata items deliver reuse potential
• Idea from UKRDS report in 2010
• RDA working group coordinating international work


Page 34

Pimp your data – make it findable & reusable

gking.harvard.edu/data

Page 35

On costs

• Costs of data curation relatively simple to measure: see work of 4C (4cproject.eu)
• Charging and payment are more complex
• Funder rules can lead to perverse, inefficient payment systems
• Fundamental question is ‘who pays’. This changes the answer to ‘what does it cost’

Page 36

Commercial services

Page 37

What it means

• Project funding can only be spent during projects on direct project costs
• Project funding comes with overheads, which universities must use for research infrastructure
• Ongoing (‘QR’) money is continuous, relates to research ranking
• Important to distinguish business-as-usual from exceptional requirements

Page 38

A research lifecycle

[Diagram: resources plotted against time. Labels: Exceptional zone; Normal zone; Project end point; Business as usual threshold; Eligible for project funding]

Page 39

Being clever with costs

• Ongoing costs beyond project end cannot be charged to a grant, but…
• ‘Pay once, store forever’ charges are acceptable.
• Thus, incentive to outsource long-term curation
• Yet universities are only acting as last-resort option in any case – discipline data archives preferred

Many of these are run by funders

Page 40

What stops data reuse

• Loss
• Destruction
• Pride
• Gluttony
• Ineptitude
• Concealment
• Bureaucracy
• Complexity
• Procrastination
• Lack of potential

Page 41

“Departments don’t have guidelines or norms for personal back-up and researcher procedure, knowledge and diligence varies tremendously. Many have experienced moderate to catastrophic data loss”

Incremental Project Report, June 2010

http://www.flickr.com/photos/mattimattila/3003324844/

Page 42

Excuses – and responses

• “People will ask questions”
  – So use a data centre or repository
• “It will be misinterpreted”
  – Stuff happens. Also, openness encourages correction
• “It’s not interesting”
  – Let others be the judge – your noise is my signal
• “I might get another paper out of it”
  – Up to a point. We might get more research out of it
• “I don’t have permission”
  – A real problem. But solvable at senior level
• “It’s too bad/complicated” – see above
• “It’s not a priority”
  – Unfortunately, funders are making it so. But if you looked at the evidence, it would be your priority as well

See e.g. Carly Strasser’s blog: http://datapub.cdlib.org/2013/04/24/closed-data-excuses-excuses/

Page 43

Citability

• Making data available increases citations
• Everyone – academic, funder, institution – loves citations
• Want evidence?
  – Alter, Pienta, Lyle – 240%, social sciences *
  – Piwowar, Vision – 9% (microarray data) †
  – Henneken, Accomazzi – 20% (astronomy) #

* Amy Pienta, George Alter, Jared Lyle, (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. http://hdl.handle.net/2027.42/78307

† Piwowar H, Vision TJ. (2013) Data reuse & the open data citation advantage. PeerJ PrePrints 1:e1v1 http://dx.doi.org/10.7287/peerj.preprints.1v1

# Edwin Henneken, Alberto Accomazzi, (2011) Linking to Data - Effect on Citation Rates in Astronomy. http://arxiv.org/abs/1111.3618

Page 44

Open scholarly communication

• It’s not just publications and/or data
• Software, methods, workflows, instruments…
• Need to resist the urge to make everything look like a publication