COUNTER standards for Open Access: The value of measuring/the measuring of value

LIBER 2017, Patras, 6 July

Joseph Greene
Research Repository Librarian
University College Dublin
[email protected]
http://researchrepository.ucd.ie

UCD Library, University College Dublin, Belfield, Dublin 4, Ireland
Leabharlann UCD, An Coláiste Ollscoile, Baile Átha Cliath, Belfield, Baile Átha Cliath 4, Eire


Page 1: COUNTER standards for Open Access: The value of measuring ...liber2017.lis.upatras.gr/wp-content/uploads/sites/6/2017/04/6.2.pdf · OA Citation advantage •At least 40 separate studies



Introduction

Defining success, defining value


Call: define success

“…[we] too often conflate several rather different objectives for transforming scholarly communications...”

https://scholarlykitchen.sspnet.org/2017/05/23/open-access-scholarly-communication-defining-success/


First principles: BOAI 2002

“…the world-wide electronic distribution of the peer-reviewed journal literature and completely free and unrestricted access to it...”


Measure distribution

Tipping point: in 2014, more than 50% of recent papers (2011-2013) were found to be Open Access

Archambault, E. et al. (2014). Proportion of Open Access Papers Published in Peer-Reviewed Journals at the European and World Levels: 1996–2013 (41p.). Produced for the European Commission DG Research & Innovation.


Measuring ‘free and unrestricted access’

Defining value, measuring value


OA Citation advantage

• At least 40 separate studies show that Open Access increases citations [1, 2]

• Wide variations between disciplines

  • 35% increase in mathematics [2]

  • 500% increase in citations in physics/astronomy [2]

• Most recent study: 3.3 million papers [3]

  • Average: OA = 50% more citations

  • (Green is overall the better strategy)

[1] Wagner, B. (2010) ‘Open Access Citation Advantage: An Annotated Bibliography’. DOI: 10.5062/F4Q81B0W

[2] Swan, A. (2010) ‘The Open Access citation advantage: Studies and results to date’. https://eprints.soton.ac.uk/268516/

[3] Archambault, E. (2016) ‘Research impact of paywalled versus open access papers’. www.1science.com/oanumbr.html


OACitationAdvantage {
    if (papers_are_OA) {
        papers_are_accessible = true;
        citationAdvantage();
    }
}

citationAdvantage {
    if (papers_are_accessible) {
        ++papers_read;
        ++chance_of_citation;
    }
}


Measuring access

Usage data as metric


BOAI15

'Means should exist that will permit having some idea of the value and quality of each document, for example, a number of metrics having to do with views, downloads, comments, corrections'

Guédon, Jean-Claude (2017-02). Open Access: Toward the Internet of the Mind. http://www.budapestopenaccessinitiative.org/open-access-toward-the-internet-of-the-mind


European Commission

'Usage metrics are highly relevant for open-science'

Recommend 'making better use of existing metrics for open science' including usage metrics

Directorate-General for Research and Innovation (2017-03). Next-generation metrics: Responsible metrics and evaluation for open science. DOI: 10.2777/337729


Coalition for Networked Information

'Researchers and librarians at several universities are working to make analytics on use of items in IRs more reliable'

But 'statistics generated by the systems are poor and do not demonstrate impact'

CNI Executive Roundtable (2017-04). Rethinking Institutional Repository Strategies. https://www.cni.org/topics/publishing/rethinking-institutional-repository-strategies


OA usage statistics


Usage data are not perfect

• Up to 85% of OA repository downloads come from non-human agents [1]

• At least 40% of OA journal downloads are not human [2]

• Even with robot detection, there is room for improvement [3]

  • DSpace stats: 62% human

  • EPrints stats: 55% human

  • U. Minho DSpace stats: 59-73% human

[1] Greene, J. (2016) 'Web robot detection in scholarly Open Access institutional repositories'. Library Hi Tech, 34(3), 500-520

[2] Huntington, P., Nicholas, D., & Jamali, H. R. (2008) 'Web robot detection in the scholarly information environment'. Journal of Information Science, 34(5), 726-741

[3] Greene, J. (2016) 'How Accurate are IR Usage Statistics?'. Open Repositories (OR2016), Dublin, 13-16 June 2016


Creating standards

Raw data to empirical knowledge


Problems

• Many ways to do robot detection

  • (At least 23 in the literature, not to mention combinations)

• Nothing resembling a standard available

• Cross-platform comparison and aggregation impossible


Addressing the problem

• COUNTER Robots Working Group

  • Joseph Greene, UCD, RIAN (chair)

  • Lorraine Estelle, Project COUNTER

  • Paul Needham, IRUS-UK/COUNTER

  • Representatives from EBSCO, Elsevier, Wiley, ScholarlyIQ, DSpace, EPrints, DigitalCommons, OpenAIRE, Base Bielefeld and Open Journal Systems

“…to devise ‘adaptive filtering systems’ that will allow publishers/repositories/services to follow a common set of rules to dynamically identify and filter out unusual usage and robot activity”


Usage data sources

• Source: Bielefeld/OJS (x3). Lines: 233,000

• Source: IRUS-UK (97 IRs). Lines: 1.9 million

• Source: Wiley. Lines: several million

• File formats shown: .csv, .csv, .txt

• PostgreSQL database: several million rows

• Period: 3-9 October 2016


Robot detection

• Simple random sample taken

  • 202-204 downloads for each dataset

  • 95% certainty

• 12 syntactic variables from SQL queries or added manually

  • E.g. IP address, agent, IP owner

• 12-13 behavioural variables added using SQL queries or API calls

  • E.g. number of downloads by user, number of items downloaded, dates/times seen
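As a rough illustration of the behavioural variables mentioned above, the sketch below derives three of them (number of downloads by user, number of items downloaded, days seen) from raw download rows. It is not the working group's code: the row layout, the function name `behavioural_variables`, and the choice to treat an IP address plus user agent pair as one "user" are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime


def behavioural_variables(rows):
    """Summarise downloads per user.

    rows: iterable of (ip, agent, item_id, timestamp) tuples (hypothetical layout).
    A "user" is assumed here to be an (ip, agent) pair.
    """
    stats = defaultdict(lambda: {"downloads": 0, "items": set(), "dates": set()})
    for ip, agent, item_id, timestamp in rows:
        entry = stats[(ip, agent)]
        entry["downloads"] += 1               # number of downloads by user
        entry["items"].add(item_id)           # distinct items downloaded
        entry["dates"].add(timestamp.date())  # calendar days on which the user was seen
    return {
        user: {
            "downloads": entry["downloads"],
            "items": len(entry["items"]),
            "days_seen": len(entry["dates"]),
        }
        for user, entry in stats.items()
    }


# Made-up rows within the study period (3-9 October 2016)
example_rows = [
    ("192.0.2.1", "agent-a", "item-1", datetime(2016, 10, 3, 9, 0)),
    ("192.0.2.1", "agent-a", "item-2", datetime(2016, 10, 3, 9, 1)),
    ("192.0.2.1", "agent-a", "item-1", datetime(2016, 10, 4, 9, 0)),
    ("192.0.2.7", "agent-b", "item-1", datetime(2016, 10, 5, 14, 30)),
]
summary = behavioural_variables(example_rows)
```

In the study these variables were added with SQL queries or API calls; the same grouping could be written as a `GROUP BY` over the download table.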


Self-declared robots

Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

Mozilla/5.0 (compatible; spbot/5.0.3; +http://OpenLinkProfiler.org/bot )

Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)

gsa-crawler (Enterprise; T4-BLNCV2FADUSTW; [email protected])

Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

RePEc link checker (http://EconPapers.repec.org/check/)

Jakarta Commons-HttpClient/3.0.1

Betsie
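Agents that declare themselves, like those listed above, can be caught by matching the user-agent string against a list of patterns. The sketch below shows the idea. The pattern list is made up for this example from the strings on this slide and is not the COUNTER robots list; `is_self_declared_robot` is a hypothetical helper name.

```python
import re

# Illustrative patterns only (an assumption for this sketch),
# NOT the official COUNTER robots list.
SELF_DECLARED_PATTERNS = [
    "bot",
    "spider",
    "crawler",
    "link checker",
    "httpclient",
    "betsie",
]

# One case-insensitive regular expression that matches any of the patterns
ROBOT_RE = re.compile(
    "|".join(re.escape(p) for p in SELF_DECLARED_PATTERNS),
    re.IGNORECASE,
)


def is_self_declared_robot(user_agent: str) -> bool:
    """Return True if the user-agent string matches one of the patterns."""
    return bool(ROBOT_RE.search(user_agent))
```

Matching of this kind only catches agents that identify themselves. Robots that present a browser-like user agent need behavioural checks as well.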


Undeclared but obvious behaviour


Testing filters

• Test existing COUNTER robots list

• Test existing COUNTER double-click filter

• Rate of requests

• Volume of requests

• User agents per IP address

• Requests where requested item = referring URL
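Two of the candidate filters above can be sketched in a few lines. This is a simplified illustration, not the COUNTER specification: the `Download` record, the 30-second double-click window, the choice to keep the first request of a burst, and the `max_agents` threshold are all assumptions for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import NamedTuple


class Download(NamedTuple):
    # Hypothetical record layout for one download request
    ip: str
    agent: str
    item: str
    time: datetime


def drop_double_clicks(downloads, window=timedelta(seconds=30)):
    """Keep one download per (ip, agent, item) for each burst of repeats.

    A repeat arriving within `window` of the previous request for the same
    key is dropped. Keeping the first of the burst is a simplification.
    """
    last_seen = {}
    kept = []
    for d in sorted(downloads, key=lambda d: d.time):
        key = (d.ip, d.agent, d.item)
        previous = last_seen.get(key)
        if previous is None or d.time - previous > window:
            kept.append(d)
        last_seen[key] = d.time  # a chain of quick repeats counts as one burst
    return kept


def ips_with_many_agents(downloads, max_agents=3):
    """Return IP addresses seen with more than `max_agents` distinct user agents."""
    agents_by_ip = defaultdict(set)
    for d in downloads:
        agents_by_ip[d.ip].add(d.agent)
    return {ip for ip, agents in agents_by_ip.items() if len(agents) > max_agents}
```

Rate and volume filters follow the same pattern: group requests by user, count them over a time window, and flag users above a chosen threshold.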


Testing filters

• Simulate a set of filters on the datasets

• Assign true/false positives, true/false negatives compared with manual determination

• Calculate:

  • Recall, precision (excluded stats)

  • Inverse recall, inverse precision (reported stats)

• Find best combination of filters, balance of practicality and accuracy
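The four measures above follow directly from the confusion-matrix counts. The sketch below assumes that a "positive" is a download the filter excludes as a robot, so recall and precision describe the excluded stats while inverse recall and inverse precision describe the reported (human) stats. The function names are hypothetical.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def filter_metrics(tp, fp, tn, fn):
    """Evaluate a robot filter against the manual determination.

    tp: robots correctly excluded      fp: humans wrongly excluded
    tn: humans correctly reported      fn: robots wrongly reported
    """
    return {
        "recall": ratio(tp, tp + fn),             # share of robots that were excluded
        "precision": ratio(tp, tp + fp),          # share of exclusions that were robots
        "inverse_recall": ratio(tn, tn + fp),     # share of humans that were reported
        "inverse_precision": ratio(tn, tn + fn),  # share of reported downloads that were human
    }
```

Inverse precision is the figure that matters most to a reader of the reported statistics, since it states how much of the reported usage is human.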


Results: COUNTER Code of Practice Release 5, 2017

[email protected]