Harvesting Repositories: DPLA, Europeana, & Other Case Studies
![Page 1: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/1.jpg)
Harvesting Repositories: DPLA, Europeana, and Other Case Studies
ALA Conference June 25, 2016
![Page 2: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/2.jpg)
Introductions
Erin Tripp, Business Development
Staff librarian since 2011. Erin delivers Islandora training at events worldwide and has managed more than 40 digital repository projects.
Contact Details
● Email: [email protected]
● Twitter: @eeohalloran or @discgarden
● Hashtags: #islandora #ALAAC16
![Page 3: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/3.jpg)
Agenda
Objectives Overview
By Show of Hands & Introductions
Why Should We Care?
Repository Requirements
OAI-PMH Overview
Case Studies
Top Takeaways
![Page 4: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/4.jpg)
Objectives for Today
Learn a thing or two about:
● OAI-PMH
● Common Harvesters
● Who to ask for help
● What questions to ask
● Confidence to continue learning or try a new tool
![Page 5: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/5.jpg)
By Show of Hands...
Who is interested in a
● National Harvester,
● State Harvester,
● Subject Harvester, or
● Proprietary Discovery Service Harvester?
Who has already been involved in a harvesting project?
Who has experience using
● XSLTs
● OAI-PMH
● REPOX?
![Page 6: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/6.jpg)
Why should we care? Discoverability.
![Page 7: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/7.jpg)
Why should we care? Discoverability.
● In February 2015, LITA panelists said Top Technology Trends include enhancing discoverability (Enis, 2015)
● Making content accessible where the search originates (e.g. Google, Google Scholar, WorldCat, DPLA, Europeana) creates value for digital libraries and users
● Repositories contributing to aggregators can see site visits increase by 55 to 109 per cent (DPLA, n.d.)
![Page 8: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/8.jpg)
Why should we care? Discoverability.
Increased exposure
● Through blogs, social media and Wikipedia
● Provide richer context and increase the visibility of your collections
● Make your collections available for re-use by other services (Europeana, n.d.)
Access to valuable skills
● Data modelling
● Copyright and licensing
● Reporting on access usage analytics (Europeana, n.d.)
![Page 9: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/9.jpg)
Why should we care? Discoverability.
Using open source
● Linking up to thousands of other collections
● Interoperable (no vendor lock-in or proprietary formats)
● Access to Wikimedia Commons (Europeana, n.d.)
Expanding your network
● Connect with like-minded industry professionals
● Identify potential partners and joint funding opportunities
● Reach out to other sectors: creatives, education, tourism and more (Europeana, n.d.)
![Page 10: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/10.jpg)
Why should we care? Discoverability.
Anecdotally, repository harvesting can:
● Act as an incentive for people to deposit content into the repository and build buy-in from stakeholders
● Prompt clean-up and normalization of metadata, resulting in better raw material to support discovery
![Page 11: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/11.jpg)
OAI-PMH Overview
![Page 12: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/12.jpg)
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)
Low-barrier mechanism for repository interoperability
OAI-PMH is a set of six requests (aka verbs or services) that are invoked within HTTP
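As a minimal sketch of what "invoked within HTTP" means in practice: each verb is simply a query-string argument on the repository's base URL. The base URL `https://example.org/oai2` and the helper name `build_request` are illustrative assumptions, not part of any particular repository or library.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH.
OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request(base_url, verb, **arguments):
    """Return the HTTP GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"Unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments)
    return f"{base_url}?{urlencode(query)}"


# Hypothetical base URL, used only for illustration.
BASE_URL = "https://example.org/oai2"

print(build_request(BASE_URL, "Identify"))
print(build_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

Pasting either printed URL pattern into a browser against a real repository is usually enough to see the XML response.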
![Page 13: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/13.jpg)
Providers
Data Providers are repositories that expose structured metadata via OAI-PMH = Repository
Service Providers then make OAI-PMH service requests to harvest that metadata = Harvester
![Page 14: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/14.jpg)
Vocabulary
Request / Verb / Service
The action that the service provider (harvester) is requesting from the data provider (repository)
Response Size
The maximum number of records to issue per response
![Page 15: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/15.jpg)
Vocabulary… continued
Resumption Token
When a request matches more records than the response size allows, a resumptionToken is issued so that the service provider can resume harvesting from where it left off
Identify
This request is used to retrieve information about a repository. Some of the information returned is required as part of the OAI-PMH.
Example: YourSite/oai2?verb=Identify
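The resumption-token behaviour can be sketched as a loop that keeps requesting pages until the repository stops issuing a token. This is a hedged illustration: `harvest_identifiers`, `fake_fetch`, and the canned two-page responses are hypothetical stand-ins so the example runs without a network. A real harvester would replace `fake_fetch` with an HTTP call to the repository.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest_identifiers(fetch, first_query):
    """Collect record identifiers, following resumptionTokens until exhausted.

    `fetch` is any callable that takes a query dict and returns response XML text.
    """
    identifiers = []
    query = dict(first_query)
    while True:
        root = ET.fromstring(fetch(query))
        for header in root.iter(f"{OAI_NS}header"):
            identifiers.append(header.findtext(f"{OAI_NS}identifier"))
        token = root.find(f".//{OAI_NS}resumptionToken")
        # An absent or empty token means the list is complete.
        if token is None or not (token.text or "").strip():
            return identifiers
        # Follow-up requests carry only the verb and the token.
        query = {"verb": first_query["verb"], "resumptionToken": token.text.strip()}


# Canned responses standing in for a repository with a response size of two.
PAGES = {
    None: """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>
        <header><identifier>oai:example.org:1</identifier></header>
        <header><identifier>oai:example.org:2</identifier></header>
        <resumptionToken>page-2</resumptionToken></ListIdentifiers></OAI-PMH>""",
    "page-2": """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListIdentifiers>
        <header><identifier>oai:example.org:3</identifier></header>
        <resumptionToken/></ListIdentifiers></OAI-PMH>""",
}


def fake_fetch(query):
    """Hypothetical fetcher: picks a canned page by resumptionToken."""
    return PAGES[query.get("resumptionToken")]


first = {"verb": "ListIdentifiers", "metadataPrefix": "oai_dc"}
print(harvest_identifiers(fake_fetch, first))
```

Passing the fetcher in as an argument keeps the paging logic separate from the transport, which also makes it easy to add delays between requests if server load is a concern.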
![Page 16: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/16.jpg)
Vocabulary… continued
ListMetadataFormats
This request is used to retrieve the metadata formats available from a repository.
Example: YourSite/oai2?verb=ListMetadataFormats
ListRecords
This request is used to harvest records from a repository. Optional arguments permit selective harvesting of records based on set membership and/or datestamp.
Example: YourSite/oai2?verb=ListRecords&metadataPrefix=oai_dc
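A rough sketch of what a harvester does with a ListRecords response: build the request (optionally narrowed with the protocol's `set`, `from`, and `until` arguments) and pull the header and Dublin Core fields out of each record. The sample response, the identifiers in it, and the function name `parse_records` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Selective harvesting: one set, changed since a given datestamp (example values).
selective_query = urlencode({
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",
    "set": "demo_collection",
    "from": "2016-01-01",
})

# Hand-written sample response; a real one comes back from the repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:demo_1</identifier>
        <datestamp>2016-06-01</datestamp>
        <setSpec>demo_collection</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample photograph</dc:title>
          <dc:creator>Unknown</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per record with its identifier, datestamp and titles."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "datestamp": record.findtext("oai:header/oai:datestamp", namespaces=NS),
            "titles": [t.text for t in record.iterfind(".//dc:title", NS)],
        })
    return records


print(selective_query)
print(parse_records(SAMPLE_RESPONSE))
```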
![Page 17: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/17.jpg)
Vocabulary… continued
ListSets
This request is used to retrieve the set structure of a repository, which is useful for selective harvesting.
All Collections Example: YourSite/oai2?verb=ListSets
Specific Collection Example: YourSite/oai2?verb=ListRecords&metadataPrefix=oai_dc&set=ir_citationCollection
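To illustrate how ListSets feeds selective harvesting, the sketch below reads the set structure and then builds one ListRecords request per set. The sample response is hand-written: `ir_citationCollection` comes from the example above, while the set names, the second set, and the base URL `https://example.org/oai2` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample; set names here are invented for the example.
SAMPLE_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>ir_citationCollection</setSpec><setName>Citations</setName></set>
    <set><setSpec>demo_photos</setSpec><setName>Photographs</setName></set>
  </ListSets>
</OAI-PMH>"""


def list_sets(xml_text):
    """Map each setSpec in a ListSets response to its human-readable setName."""
    root = ET.fromstring(xml_text)
    return {
        s.findtext("oai:setSpec", namespaces=OAI): s.findtext("oai:setName", namespaces=OAI)
        for s in root.iterfind(".//oai:set", OAI)
    }


def records_url_for_set(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build the ListRecords URL that harvests a single set."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return f"{base_url}?{query}"


for spec, name in list_sets(SAMPLE_SETS).items():
    print(name, "->", records_url_for_set("https://example.org/oai2", spec))
```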
![Page 18: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/18.jpg)
Repository Requirements
Accessible on the web
Stores standards-based, XML descriptive metadata
The ability to apply additional metadata mapping if needed (whether in or external to the repository)
Access to documentation and XSLTs used for metadata mapping
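Metadata mapping of this kind is normally written in XSLT. As a stand-in that runs with only the Python standard library, the sketch below applies a deliberately simplified MODS to Dublin Core crosswalk. The three mapping rules, the sample record, and the function name `mods_to_dc` are illustrative assumptions, not the mapping used by any particular repository or aggregator.

```python
import xml.etree.ElementTree as ET

MODS = "http://www.loc.gov/mods/v3"
DC = "http://purl.org/dc/elements/1.1/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Simplified, illustrative crosswalk: MODS path -> Dublin Core element name.
CROSSWALK = [
    ("m:titleInfo/m:title", "title"),
    ("m:name/m:namePart", "creator"),
    ("m:abstract", "description"),
]


def mods_to_dc(mods_xml):
    """Return an oai_dc element built from a MODS record, skipping empty values."""
    mods = ET.fromstring(mods_xml)
    dc_root = ET.Element(f"{{{OAI_DC}}}dc")
    for path, dc_name in CROSSWALK:
        for source in mods.iterfind(path, {"m": MODS}):
            if source.text and source.text.strip():
                ET.SubElement(dc_root, f"{{{DC}}}{dc_name}").text = source.text.strip()
    return dc_root


# Made-up MODS record with an empty abstract to show the clean-up step.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Harbour at dusk</title></titleInfo>
  <name><namePart>Doe, Jane</namePart></name>
  <abstract>  </abstract>
</mods>"""

dc = mods_to_dc(SAMPLE_MODS)
print([(child.tag.split("}")[1], child.text) for child in dc])
```

Keeping the crosswalk as a small table makes it easy to document and review, which is the same reason access to the XSLTs themselves matters.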
![Page 19: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/19.jpg)
Repository Requirements
Pass XML metadata to service provider from the:
1. Preservation (storage) component or
2. Discovery (index) component
Provide a method to harvest a thumbnail (TN) and link back to the repository
Accommodate customization
![Page 20: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/20.jpg)
Repository Requirements … Continued
For example: The University of South Carolina video content model is tiered for preservation, media production and streaming web access. We only want to harvest one of the three possible records.
![Page 21: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/21.jpg)
Case Study: Europeana
![Page 22: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/22.jpg)
Europeana
Our material comes from all over Europe and the scope of the collections is really quite astonishing. [...]
http://www.europeana.eu/
http://pro.europeana.eu/
![Page 23: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/23.jpg)
Intermediate Aggregator
The Digibess repository stores digitized objects from 18 Economic and Social Sciences libraries in Italy
Europeana requires an intermediate aggregator: a national harvester such as Cultura Italia
Cultura Italia harvests the custom “Pico” metadata format from Digibess and is then harvested by Europeana
![Page 24: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/24.jpg)
Harvesting Tools
Digibess pre-dated the Islandora OAI module and the REPOX aggregator
Used Proai servlet oaiprovider-1.2.2
The harvest led to an examination of both the general needs and the specific applications of the protocol
![Page 25: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/25.jpg)
Digibess on Europeana
![Page 26: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/26.jpg)
REPOX
Since the Digibess project, a new intermediate aggregator called REPOX has been released. It aims to provide [...] Europeana partners a simple solution to import, convert and expose their bibliographic data via OAI-PMH
http://repox.sysresearch.org/
![Page 27: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/27.jpg)
Case Study: Digital Public Library of America (DPLA)
![Page 28: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/28.jpg)
DPLA
The Digital Public Library of America brings together the riches of America’s libraries, archives, and museums, and makes them freely available to the world.
https://dp.la/info/
![Page 29: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/29.jpg)
Service Hub
Empire State Digital Network (ESDN) is the New York State service hub for the DPLA
Hosted and administered by the Metropolitan New York Library Council in conjunction with eight allied regional library councils working collectively in New York State as the ESLN
Liaises with partners for data aggregation, mapping and licensing
![Page 30: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/30.jpg)
Mapping & Testing
Harvests from partners using OAI-PMH
○ Provides all partner metadata to DPLA through one OAI-PMH feed from REPOX
Undertakes data review and QA prior to exposing feed to DPLA for harvest
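A data review step like this can be partly automated. The sketch below flags harvested records that lack a value for a required field. The list of required fields, the record layout (field name mapped to a list of strings), and the sample records are illustrative assumptions only, not the actual rules applied by ESDN or DPLA.

```python
# Assumed required fields, for illustration only.
REQUIRED_FIELDS = ("title", "rights", "identifier")


def review_record(record):
    """Return a list of human-readable problems found in one record dict."""
    problems = []
    for field in REQUIRED_FIELDS:
        values = [v for v in record.get(field, []) if v and v.strip()]
        if not values:
            problems.append(f"missing {field}")
    return problems


# Made-up harvested records: field name -> list of values.
harvested = {
    "oai:example.org:1": {
        "title": ["Harbour at dusk"],
        "rights": ["http://rightsstatements.org/vocab/InC/1.0/"],
        "identifier": ["https://example.org/object/1"],
    },
    "oai:example.org:2": {
        "title": ["  "],
        "identifier": ["https://example.org/object/2"],
    },
}

for oai_id, record in harvested.items():
    problems = review_record(record)
    print(oai_id, "OK" if not problems else "; ".join(problems))
```

A report like this gives the hub something concrete to send back to a partner before the feed is exposed for harvest.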
![Page 31: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/31.jpg)
ESDN on DPLA
![Page 32: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/32.jpg)
Case Study: Other Discovery Services
![Page 33: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/33.jpg)
Other Discovery Services
WorldCat, Summon, & Primo are commercial discovery services
Local discovery layers can also collocate resources for discovery
OAI-PMH modules within your repository framework can allow these services to harvest your repository
![Page 34: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/34.jpg)
Everyone is Harvesting Everyone
Connecticut State Library is aggregating data to Research It
State Library harvests University of Connecticut Archives and Special Collections, ILS and other
University of Connecticut Library harvests to Summon/Primo and will be harvested by DPLA
![Page 35: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/35.jpg)
Creating Lots of Portals
University of Connecticut Library started harvesting in mid-2014
Notable increases in access to digital content since harvest (one of many factors)
Access statistics available at CTDA Statistics
![Page 36: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/36.jpg)
University of Connecticut on Research It - EBSCOhost
![Page 37: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/37.jpg)
Harvesting Top Takeaways
![Page 38: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/38.jpg)
Top Takeaways - Data Providers
● Server Load/ Application Load
● Permissions / Copyright
● Relationships with Service Providers
● Repository Buy-in
● Increased Discovery
● Metadata Normalization
![Page 39: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/39.jpg)
Top Takeaways - Service Providers
● Knowledge of
○ XSLT,
○ OAI-PMH, and
○ Metadata Schemas (DC, MODS, QDC, MARC XML)
● Technical staff to set up and maintain the aggregator & write scripts to transform harvested metadata
● Relationships with Data Providers
![Page 40: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/40.jpg)
Harvesting Discussion
![Page 41: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/41.jpg)
Discussion
● What are your biggest challenges?
● What resources do you find helpful?
● What was your "aha!" moment?
● What was most useful in this presentation?
![Page 42: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/42.jpg)
Harvesting Demonstration
![Page 43: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/43.jpg)
Demonstration
To follow along or try it at home, navigate to:
http://sandbox.discoverygarden.ca/ OR
http://islandora.ca/downloads
Click Islandora > Islandora Utility Modules > Islandora OAI
![Page 44: Harvesting Repositories: DPLA, Europeana, & Other Case Studies](https://reader031.vdocuments.mx/reader031/viewer/2022022414/587520421a28ab3f098b4591/html5/thumbnails/44.jpg)
Questions? Contact us at: [email protected]