Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)


Upload: scott-edmunds

Post on 15-Jul-2015


Category:

Science



TRANSCRIPT

Page 1: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

0000-0001-6444-1436

@SCEdmunds

[email protected]@opendatahk.com

ODHK Open Science Working Group

Open Science Data = Open Data (a rant in e-minor)

Page 2: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Open Data is about better decision making

Open Science Data = Open Data

As is Open Access, Open Hardware, Open Environmental Data, Open Scholarship…

• To make decisions

• You need good ideas

• Which are based on relevant information

• Supported by valuable data

• Captured by accurate measures

Page 3: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

What is Open (Science) Data?

• Something very very very geeky

• Free & open access to data about the world around us

  o Searchable, findable

  o Machine-readable, app-makeable, Excel-usable

  o Without restrictions/limitations

• This (examples)

Page 4: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Open Science Data = Data Journalism

http://www.sciencedirect.com/science/article/pii/S2405471215000022

Page 5: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Open Science Data = Transparency

• Evidence-based policy making needed on drugs, environment, GMOs, etc.

• Casualties of Politics v Science (advisors): UK Gov v David Nutt, EU (Greenpeace) v Anne Glover

• Sensitivity over air/water pollution data in China

• Sensitivity over radiation data in Japan

Page 6: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Why Open Science Data is the most important open data*

*(I may be biased though)

Climate change, global hunger, pollution, radioactivity, cancer, disease outbreaks…

http://www.nature.com/news/data-sharing-make-outbreak-research-open-access-1.16966

Page 7: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as:

Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001

In contrast to our example:

To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.

Page 8: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)
Page 9: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

1.3 The power of intelligently open data

The benefits of intelligently open data were powerfully illustrated by events following an outbreak of a severe gastro-intestinal infection in Hamburg in Germany in May 2011. This spread through several European countries and the US, affecting about 4000 people and resulting in over 50 deaths. All tested positive for an unusual and little-known Shiga-toxin–producing E. coli bacterium. The strain was initially analysed by scientists at BGI-Shenzhen in China, working together with those in Hamburg, and three days later a draft genome was released under an open data licence. This generated interest from bioinformaticians on four continents. 24 hours after the release of the genome it had been assembled. Within a week two dozen reports had been filed on an open-source site dedicated to the analysis of the strain. These analyses provided crucial information about the strain’s virulence and resistance genes – how it spreads and which antibiotics are effective against it. They produced results in time to help contain the outbreak. By July 2011, scientists published papers based on this work. By opening up their early sequencing results to international collaboration, researchers in Hamburg produced results that were quickly tested by a wide range of experts, used to produce new knowledge and ultimately to control a public health emergency.

Page 10: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

OKFn Open Science Working Group

Page 11: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Open Science Survey (Index?)

Page 12: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Growing number of government OA/OD mandates

Page 13: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Hong Kong: still some work to go with OA

…China, Singapore, India beat us

Page 14: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

How much does closed data cost us?

More profitable than a gold mine

See: http://alexholcombe.wordpress.com/2013/01/09/scholarly-publishers-and-their-high-profits/

Page 15: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Hong Kong: still some work to go with OA

Dear Mr. Edmunds,

Thank you for your email dated 27 April 2014.

Please be informed that the requested information is not maintained in our database system. In addition, as the bulk of the University Grants Committee's recurrent grants are disbursed to institutions in the form of a block grant to provide institutions with flexibility in internal deployment, we do not possess the information on funding / spending for journal subscription. Since the requested information does not exist in our Department, you may wish to approach institutions directly on your request.

Regards,
University Grants Committee Secretariat

Hong Kong Code on Access to Information request on Elsevier spending:

Q. How much are we spending on closed access?

Page 16: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Hong Kong: still some work to go with OA

Q. How much are we spending? What we do know:

• Hong Kong University Grants Committee (UGC) yearly budget for grants = 17.5 Billion HKD (4% of Government spending).

• HKU library budget = ~$200M HKD, 78.4% of acquisition budget spent on electronic journals.

• In 2011-2012 the 8 funded institutions published 16,594 papers (including conference papers and non-refereed work).

• HKU and PolyU have OA self-archiving policies, but no enforcement or open data policies (yet…)

Page 17: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Hong Kong: still some work to go with OA

Q. How much are we wasting? What we do know:

OA boosts impact 50% and open data leads to a 9% citation boost.

Dryad estimates that spending $400,000 to archive 2,500 datasets per year contributes to more than 1,000 papers within 4 years.

Reproducibility crisis for published research: >50% of the time due to lack of open data. Ioannidis estimates that 85% of research funding is wasted because of this.

= ~15 Billion HKD wasted

http://www.ecs.soton.ac.uk/~harnad/Temp/research-australia.doc
https://peerj.com/articles/175/
http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001747
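The ~15 Billion HKD figure can be reproduced with a few lines of arithmetic. The sketch below is only a rough check, and it assumes the figure comes from applying the 85% waste estimate to the whole 17.5 Billion HKD UGC grants budget quoted on page 16; the slides do not spell that step out. The per-dataset and per-paper Dryad costs are derived from the quoted numbers in the same way, and all names in the code are illustrative.

```python
# Figures quoted on the slides. How they are combined below is an assumption,
# not something the slides state explicitly.
UGC_BUDGET_HKD = 17.5e9    # UGC yearly budget for grants (page 16)
WASTED_FRACTION = 0.85     # Ioannidis estimate of wasted research funding
DRYAD_COST_USD = 400_000   # Dryad yearly archiving spend
DRYAD_DATASETS = 2_500     # datasets archived per year
DRYAD_PAPERS = 1_000       # "more than 1,000 papers", so a lower bound


def wasted_funding(budget: float, fraction: float) -> float:
    """Return the part of a budget that is lost for a waste fraction in [0, 1]."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return budget * fraction


wasted_hkd = wasted_funding(UGC_BUDGET_HKD, WASTED_FRACTION)

# Archiving cost per dataset, and an upper bound on cost per resulting paper
# (upper bound because the paper count is itself a lower bound).
cost_per_dataset_usd = DRYAD_COST_USD / DRYAD_DATASETS
max_cost_per_paper_usd = DRYAD_COST_USD / DRYAD_PAPERS

if __name__ == "__main__":
    print(f"Estimated waste: {wasted_hkd / 1e9:.1f} Billion HKD")
    print(f"Dryad cost per dataset: ${cost_per_dataset_usd:,.0f}")
    print(f"Dryad cost per paper (at most): ${max_cost_per_paper_usd:,.0f}")
```

Under these assumptions the waste comes to roughly 14.9 Billion HKD, which the slide rounds to ~15 Billion, while the Dryad numbers work out to $160 per archived dataset and at most $400 per resulting paper.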

Page 18: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

“Faked research is endemic in China”

If not Open Data, what are we focussing on instead?

Nature 475, 267 (2011)

New Scientist, 17th Nov 2012: http://www.newscientist.com/article/mg21628910.300-fraud-fighter-faked-research-is-endemic-in-china.html
Nature, 29th September 2010: http://www.nature.com/news/2010/100929/full/467511a.html
Science, 29th November 2013: http://www.sciencemag.org/content/342/6162/1035.full
Nature, 20th July 2011: http://www.nature.com/news/2011/110720/full/475267a.html

“Wide distribution of information is key to scientific progress, yet traditionally, Chinese scientists have not systematically released data or research findings, even after publication.”

“There have been widespread complaints from scientists inside and outside China about this lack of transparency.”

“Usually incomplete and unsystematic, [what little supporting data released] are of little value to researchers and there is evidence that this drives down a paper's citation numbers.”

Page 19: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Chinese Paper Mills:

Attempts to “game the peer-review system on an industrial scale”

1. http://www.scientificamerican.com/article/for-sale-your-name-here-in-a-prestigious-science-journal/
2. http://www.grassley.senate.gov/sites/default/files/about/upload/Senator-Grassley-Report.pdf

Companies offering authorship of papers made to order by “paper mills”¹. Ghostwriting of medical papers by pharma is common².

Guaranteed publication in a JIF (Journal Impact Factor) journal, often using fake referees, ID theft, etc.

Page 20: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

Chinese Paper Mills: Attempts to “game the peer-review system on an industrial scale”

1. http://www.scientificamerican.com/article/for-sale-your-name-here-in-a-prestigious-science-journal/
2. http://www.grassley.senate.gov/sites/default/files/about/upload/Senator-Grassley-Report.pdf

Page 21: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

What could we be doing with open science data?

Mojave Solar Farms v Desert Tortoise

Page 22: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

What could we be doing with open science data?

Page 23: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

What could we be doing with open science data?

Hong Kong-Zhuhai-Macau Bridge v Pink Dolphins

Page 24: Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in e-minor)

ODHK Open Science Working Group