829 tdwg-2015-nicolson-kew-strings-to-things
![Page 1: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/1.jpg)
Strings to things: a user-friendly framework for data reconciliation
Nicky Nicolson, RBG Kew
@nickynicolson
Biodiversity Information Standards (TDWG) annual meeting
Nairobi, Kenya / 28th September – 1 October 2015
![Page 2: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/2.jpg)
Reconciliation
• Turns a string representation of an entity into an actionable identifier.
e.g.: Tahina spectabilis
Will reconcile to: http://ipni.org/urn:lsid:ipni.org:names:77086615-1
![Page 3: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/3.jpg)
![Page 4: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/4.jpg)
Maximise reuse: a two-stage process
1. Standardise data
- Package of 40-plus “transformers”
- All accept a string input and produce a string output
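A minimal sketch of the string-in, string-out idea, written in Python for illustration. The function names (`strip_accents`, `collapse_whitespace`, `lowercase`, `apply_all`) are hypothetical and are not taken from the framework's own package of transformers.

```python
import re
import unicodedata
from typing import Callable, List

# A transformer is any function from string to string (all names here are hypothetical).
Transformer = Callable[[str], str]

def strip_accents(value: str) -> str:
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def collapse_whitespace(value: str) -> str:
    """Squeeze runs of whitespace into single spaces and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()

def lowercase(value: str) -> str:
    """Fold the string to lower case."""
    return value.lower()

def apply_all(value: str, transformers: List[Transformer]) -> str:
    """Run the transformers in order, feeding each output to the next."""
    for transform in transformers:
        value = transform(value)
    return value

standardised = apply_all(
    "  Tahína   SPECTABILIS ",
    [strip_accents, collapse_whitespace, lowercase],
)
print(standardised)  # tahina spectabilis
```

Because every transformer shares the same signature, they can be chained in any order and reused across columns and services.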
![Page 5: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/5.jpg)
Examples of transformers
![Page 6: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/6.jpg)
Open Refine screenshot
![Page 8: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/8.jpg)
Maximise reuse: a two-stage process
2. Match the data
- Package of 20-plus “matchers”
- All accept two inputs and return a flag indicating whether they match
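The matcher contract (two inputs in, one flag out) can be sketched in the same way. This is an illustration only: `exact_match`, `edit_distance` and `within_edit_distance` are hypothetical names, and the Levenshtein-based matcher is just one plausible kind of fuzzy matcher, not necessarily one the framework ships.

```python
from typing import Callable

# A matcher takes two strings and returns True when they are considered a match.
Matcher = Callable[[str, str], bool]

def exact_match(a: str, b: str) -> bool:
    """Match only when the two strings are identical."""
    return a == b

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]

def within_edit_distance(max_edits: int) -> Matcher:
    """Build a matcher that tolerates up to `max_edits` single-character edits."""
    def matcher(a: str, b: str) -> bool:
        return edit_distance(a, b) <= max_edits
    return matcher

fuzzy = within_edit_distance(1)
print(exact_match("spectabilis", "spectabilus"))  # False
print(fuzzy("spectabilis", "spectabilus"))        # True
```

Running matchers on already-standardised strings keeps each matcher simple: case, spacing and accents are handled once, in the transformer stage.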
![Page 9: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/9.jpg)
Configuring a service
1) Read tabular data (file or DB)
2) Configure transformers
3) Configure matchers
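The three configuration steps can be sketched end to end as follows. This is an assumption-laden illustration, not the framework's actual configuration format: `ColumnRule`, `reconcile`, the column names and the record ids `rec-1` and `rec-2` are all hypothetical, and an in-memory CSV stands in for a file or database.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Transformer = Callable[[str], str]
Matcher = Callable[[str, str], bool]

@dataclass
class ColumnRule:
    """Hypothetical per-column configuration: a chain of transformers, then one matcher."""
    matcher: Matcher
    transformers: List[Transformer] = field(default_factory=list)

    def standardise(self, value: str) -> str:
        for transform in self.transformers:
            value = transform(value)
        return value

def reconcile(query: Dict[str, str],
              records: List[Dict[str, str]],
              rules: Dict[str, ColumnRule]) -> List[str]:
    """Return the ids of records whose configured columns all match the query."""
    hits = []
    for record in records:
        if all(
            rule.matcher(rule.standardise(query[column]),
                         rule.standardise(record[column]))
            for column, rule in rules.items()
        ):
            hits.append(record["id"])
    return hits

# 1) Read tabular data (placeholder rows with placeholder ids).
AUTHORITY_CSV = """id,genus,species
rec-1,Tahina,spectabilis
rec-2,Quercus,robur
"""
records = list(csv.DictReader(io.StringIO(AUTHORITY_CSV)))

# 2) and 3) Configure transformers and a matcher for each column.
rules = {
    "genus": ColumnRule(matcher=lambda a, b: a == b,
                        transformers=[str.strip, str.lower]),
    "species": ColumnRule(matcher=lambda a, b: a == b,
                          transformers=[str.strip, str.lower]),
}

matches = reconcile({"genus": " TAHINA", "species": "Spectabilis "}, records, rules)
print(matches)  # ['rec-1']
```

Both the query and the stored record pass through the same transformers before matching, so differences in case or padding do not prevent a match.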
![Page 10: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/10.jpg)
Run it…
1) Service description
2) Three service endpoints
3) JavaScript query interface
![Page 11: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/11.jpg)
IPNI Reconciliation Service
![Page 12: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/12.jpg)
3 service endpoints
![Page 13: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/13.jpg)
IPNI Reconciliation Service
![Page 14: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/14.jpg)
Flexible web service
• Open Refine compatible
• But underneath it’s JSON over HTTP
• … so call it from any programming language
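As a rough illustration of calling such a service from code, the sketch below builds a batch request and reads a response, assuming the service follows the OpenRefine reconciliation API conventions (a `queries` parameter holding a JSON object keyed `q0`, `q1`, ..., and a `result` list per key). The endpoint URL and helper names are hypothetical, and the sample response, including its score, is made up for the example; no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the URL of a real reconciliation service.
SERVICE_URL = "http://example.org/reconcile/names"

def build_query_url(names, limit=3):
    """Encode a batch of names as the `queries` parameter of a GET request."""
    queries = {
        f"q{i}": {"query": name, "limit": limit}
        for i, name in enumerate(names)
    }
    return SERVICE_URL + "?" + urlencode({"queries": json.dumps(queries)})

def best_matches(response_text):
    """Map each query key to the id of its top-scoring candidate, or None."""
    response = json.loads(response_text)
    best = {}
    for key, payload in response.items():
        candidates = payload.get("result", [])
        top = max(candidates, key=lambda c: c.get("score", 0), default=None)
        best[key] = top["id"] if top else None
    return best

url = build_query_url(["Tahina spectabilis"])

# Illustrative response in the OpenRefine shape; values are invented for this sketch.
sample_response = json.dumps({
    "q0": {"result": [
        {"id": "urn:lsid:ipni.org:names:77086615-1",
         "name": "Tahina spectabilis",
         "score": 100,
         "match": True},
    ]}
})
print(best_matches(sample_response))
```

To query a live service, the URL could be fetched with `urllib.request.urlopen(url)` and the decoded body passed to `best_matches`.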
![Page 15: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/15.jpg)
Service metadata
![Page 16: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/16.jpg)
Service call
![Page 17: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/17.jpg)
Service response
![Page 18: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/18.jpg)
List of reconciliation services
https://github.com/OpenRefine/OpenRefine/wiki/Reconcilable-Data-Sources
![Page 19: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/19.jpg)
Open source
https://github.com/RBGKew/Reconciliation-and-Matching-Framework
![Page 20: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/20.jpg)
What we’ll work on in the future
![Page 21: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/21.jpg)
Reconciliation services on different data types
• Specimens
– Add DwCA as a readable data store
– Collections-focussed transformers & matchers
– Resolve & link specimen duplicates
• People
• Trait glossaries
![Page 22: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/22.jpg)
Integration with GitHub
![Page 23: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/23.jpg)
![Page 24: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/24.jpg)
Thanks to:
• Biodiversity Informatics team (Abigail Barker, Matt Blissett, James Crowe, John Iacona, Rob Turner, Alecs Gueder)
• Plant & fungal name curation team (Christine Barker / Irina Belyaeva / Katherine Challis / Rafael Govaerts / Paul Kirk / Heather Lindon / Emma Williams)
• Data improvement team (Anna Lynch, Rachel Witherow, Malin Rivers, Esther Wainwright-Deri)
![Page 25: 829 tdwg-2015-nicolson-kew-strings-to-things](https://reader035.vdocuments.mx/reader035/viewer/2022070518/58ea8eaf1a28ab983e8b59d9/html5/thumbnails/25.jpg)
@nickynicolson / [email protected]
http://bit.ly/k-names-service
http://github.com/RBGKew