Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories
Dr. Nancy Pontika & Dr. Petr Knoth
COnnecting Repositories (CORE)
Open University, UK
LIBER 2015, 24 – 26 June, London
Mission of CORE
Aggregate all open access content distributed across different systems worldwide, enrich this content and provide access to it through a set of services …
[Source: http://core.ac.uk/about#mission]
Need for a UK aggregator
Bringing the UK’s open access research outputs together:
• Feasibility study commissioned by Jisc, published June 2014
• Referred to as “Open Mirror”
[Source: https://repository.jisc.ac.uk/5570/1/JISC_REPORT_open_mirror_090514_FINAL_WEB.pdf]
Three levels of support
Programmable Data Access
- CORE API - CORE Data Dumps
- Researchers - Developers - Companies
Transaction Information Access
- CORE Portal - CORE Mobile - CORE Plugin
- Researchers - Students - Lifelong learners
Analytical Information Access
- CORE Policy - CORE Compliance Analytics - CORE Dashboard
- Funders - Governments - Data Providers
[Source: http://www.dlib.org/dlib/november12/knoth/11knoth.html]
CORE Statistics
• Content: 20M+ records, 600+ repositories, 1.8M+ full-texts
• The UK national aggregator - Jisc
• Full-text aggregator (not just metadata)
• Placed among Top 10 search engines for research that go beyond Google [Jisc, 2013]
• Listed among Top 100 Thesis and Dissertation Resources
• Part of Jisc’s Repositories Shared Services Project (RSSP)
Aggregation process
• Metadata download, extraction and cleaning
• Full-text harvesting
• Text extraction
• Language detection
• Extraction of citation references from text
• Identification of related content
• Detection of duplicate items
• Parsing of author names
• Indexing
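Two of the steps listed above, parsing of author names and detection of duplicate items, can be illustrated with a minimal Python sketch. This is not CORE's implementation: the `Record` class, the function names and the title-normalisation heuristic are all assumptions made for illustration, and a production aggregator would need far more robust matching.

```python
from __future__ import annotations

import re
import unicodedata
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical, simplified metadata record (not CORE's schema)."""
    identifier: str
    title: str
    authors: list[str]


def parse_author(name: str) -> tuple[str, str]:
    """Split an author string into (family, given).

    Handles only "Family, Given" and "Given Family"; real names
    need much more care than this.
    """
    name = " ".join(name.split())  # collapse runs of whitespace
    if "," in name:
        family, given = (part.strip() for part in name.split(",", 1))
    else:
        given, _, family = name.rpartition(" ")
    return family, given


def title_key(title: str) -> str:
    """Normalise a title so trivially different copies compare equal."""
    text = unicodedata.normalize("NFKD", title).casefold()
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", text).strip()


def deduplicate(records: list[Record]) -> list[Record]:
    """Keep the first record seen for each normalised title."""
    seen: set[str] = set()
    unique: list[Record] = []
    for record in records:
        key = title_key(record.title)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Matching on the normalised title alone is a deliberately crude choice; it shows the shape of the step, not a recommended similarity measure.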
CORE Applications
• CORE Portal – Search engine providing open access content
• CORE Mobile – Android and iOS apps
• CORE Plugin – For repositories and journals
• CORE API – Programmable access to millions of resources
• CORE Dashboard – Tool for repository managers
CORE Dashboard : purpose
• Harvested Records
• Metadata
• Harvesting Process
• Standards
• Repository Managers
• Funders
• Repositories
• Journals
Data Providers
Collaboration
Quality
Transparency
Institution main page
Edit repository information
Invitations
Content
Manage record visibility status
Take down
Take up
Update metadata records
• Asynchronous process
• Item is queued in the CORE system
• Record is updated within 12 hours
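The queue-then-update behaviour described above can be sketched as follows. The sketch assumes a simple in-memory first-in, first-out queue; `UpdateRequest`, `UpdateQueue` and their methods are hypothetical names, and only the 12-hour window comes from the slide.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta

# From the slide: a queued record is updated within 12 hours.
UPDATE_WINDOW = timedelta(hours=12)


@dataclass
class UpdateRequest:
    """Hypothetical request to refresh one metadata record."""
    record_id: str
    requested_at: datetime

    @property
    def due_by(self) -> datetime:
        """Latest time by which the update should have been applied."""
        return self.requested_at + UPDATE_WINDOW


class UpdateQueue:
    """Hypothetical in-memory FIFO queue of pending updates."""

    def __init__(self) -> None:
        self._pending = deque()

    def __len__(self) -> int:
        return len(self._pending)

    def enqueue(self, record_id: str, now: datetime) -> UpdateRequest:
        """Accept the request immediately; the work happens later."""
        request = UpdateRequest(record_id, now)
        self._pending.append(request)
        return request

    def process_next(self, apply_update):
        """Apply the oldest pending update, or return None if idle."""
        if not self._pending:
            return None
        request = self._pending.popleft()
        apply_update(request.record_id)
        return request
```

The caller gets a response as soon as the item is queued, which is what makes the process asynchronous; a background worker would call `process_next` repeatedly.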
Statistics
Issues : 3 types
When harvesting your repository/document we encountered an error that we couldn't resolve. These errors need to be fixed in order to harvest your repository/document.
We encountered an error but we were still able to harvest the repository/document. We strongly recommend that these issues are resolved as they may lead to incompatibility problems in the future.
This may not be a problem but it may be a clue for misconfiguration or future incompatibilities.
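The three issue types described above could be modelled as a small enumeration. The slides do not name the types, so `ERROR`, `WARNING` and `NOTICE`, as well as the `classify` helper and its parameters, are assumptions made for this sketch.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical labels for the three issue types on the slide."""
    ERROR = "could not harvest; must be fixed"
    WARNING = "harvested despite the problem; fix strongly recommended"
    NOTICE = "possibly not a problem; may hint at misconfiguration"


def classify(harvested: bool, problem_confirmed: bool) -> Severity:
    """Map a harvesting outcome to one of the three issue types."""
    if not harvested:
        return Severity.ERROR
    if problem_confirmed:
        return Severity.WARNING
    return Severity.NOTICE
```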
Issues : good news
Issues : bad news…
Issues: Robots.txt
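The slides do not show how the robots.txt check is performed. As a rough illustration, a harvester could test whether a repository's robots.txt permits it to fetch a document using Python's standard `urllib.robotparser`. The user agent `ExampleHarvester`, the sample rules and the URLs below are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def can_harvest(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: full texts are blocked for crawlers in
# general, but one named harvester is allowed everywhere.
ROBOTS = """\
User-agent: *
Disallow: /bitstream/

User-agent: ExampleHarvester
Allow: /
"""
```

With rules like these, a repository manager can keep general crawlers away from full-text files while still letting a trusted aggregator harvest them.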
Issues: Document Issues
Issues: Malformed PDF url
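The slide does not define what counts as a malformed URL. A minimal sketch of the kind of structural check that could flag such links, using Python's standard `urllib.parse`, is given here; the function name and the specific rules are assumptions.

```python
from urllib.parse import urlsplit


def pdf_url_problems(url: str) -> list:
    """Return a list of structural problems found in a document URL."""
    problems = []
    if url != url.strip() or " " in url:
        problems.append("contains whitespace")
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some inputs, e.g. unbalanced IPv6 brackets.
        return problems + ["cannot be parsed"]
    if parts.scheme not in ("http", "https"):
        problems.append("missing or unsupported scheme")
    if not parts.netloc:
        problems.append("missing host")
    return problems
```

An empty list means only that the URL is well formed; it says nothing about whether the link resolves to a PDF.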
Dashboard benefits
- Increased and simplified collaboration between aggregators and content providers
- Improved control of the content provider over the harvested content
- Reduction of scepticism and fear of sharing content with other systems
- Improvement of the harvesting process
- Broadening of the open access content discoverability and thus reuse of the open access content where permitted
Would you like to take a look?
Dashboard still in BETA but we welcome volunteer testers
Email me at [email protected]
Many thanks to…
CORE developers:
• Matteo Cancellieri
• Samuel Pearce
• Drahomira Herrmannova
• Lucas Anastasiou
Volunteer testers:
• Chris Biggs, Metadata & Repository Specialist, Open University
• Nick Sheppard, Repository Developer, Leeds Beckett University
Thank you
Questions
CORE Contacts:
Nancy Pontika [email protected]
Petr Knoth [email protected]
Website: http://core.ac.uk
Twitter: @oacore