Linking Scientific Metadata (presented at DC2010)
DESCRIPTION
Linked entity data in metadata records builds a foundation for the Semantic Web. Even though metadata records contain rich entity data, there is no linking between associated entities such as persons, datasets, projects, publications, or organizations. We conducted a small experiment using the dataset collection from the Hubbard Brook Ecosystem Study (HBES), in which we converted the entities and their relationships into RDF triples and linked the URIs contained in the RDF triples to the corresponding entities in the Ecological Metadata Language (EML) records. Through a transformation program written in Extensible Stylesheet Language (XSL), we turned a plain EML record display into an interlinked semantic web of ecological datasets. The experiment suggests the methodological feasibility of incorporating linked entity data into metadata records. The paper also argues for the need to change the scientific as well as the general metadata paradigm.
TRANSCRIPT
![Page 1: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/1.jpg)
Linking Entities in Scientific Metadata
Jian Qin, Miao Chen, Xiaozhong Liu, & Andrea Wiggins
School of Information Studies, Syracuse University
![Page 2: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/2.jpg)
The context: Islands of research information
04/10/2023
Linking Entities in Scientific Metadata -- DC2010 2
Data
Projects
Publications
Research interest
Researchers
![Page 3: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/3.jpg)
Unlinked entities
Same entity!
![Page 4: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/4.jpg)
Duplication of entity data entry
Seamless Daily Precipitation for the Conterminous United States
Metadata:
• Identification_Information
• Data_Quality_Information
• Spatial_Data_Organization_Information
• Spatial_Reference_Information
• Entity_and_Attribute_Information
• Distribution_Information
• Metadata_Reference_Information
![Page 5: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/5.jpg)
What’s lacking in scientific metadata?
• Standards focus on describing datasets, not entities
• No mechanism is provided for linking entities
– It is considered an implementation issue
• Islands of entities lead to duplication of data entry for the same entity
– Increased costs and time in creating metadata
– Effect on resource discovery and browsing
![Page 6: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/6.jpg)
Defining the research problem
How can we build an interlinked network of entities for a scientific domain?
How can we associate the linked entities with their corresponding metadata records?
![Page 7: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/7.jpg)
Linked Data: A solution
[Diagram]
• Relational database containing entities and relationships. Problem: not related to metadata records
• Metadata records in XML format. Problem: lack relationships between entities
• Solution: convert the relational database to RDF triples (Resource, PropertyType, Value) and embed the RDF triples into the metadata records
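As a rough illustration of the Resource / PropertyType / Value structure named on this slide, the sketch below models two triples and serializes them as N-Triples lines. The base URI `http://example.org/hbes/`, the `vocab/` property names, and the dataset identifier are assumptions made for the example and do not come from the HBES system.

```python
from typing import NamedTuple

BASE = "http://example.org/hbes/"  # hypothetical base URI, not the real HBES one


class Triple(NamedTuple):
    resource: str               # subject URI
    property_type: str          # predicate URI
    value: str                  # object: another URI or a plain literal
    value_is_uri: bool = False


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.value_is_uri:
        obj = f"<{t.value}>"
    else:
        # Escape backslashes, quotes, and newlines inside literals
        escaped = (
            t.value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        obj = f'"{escaped}"'
    return f"<{t.resource}> <{t.property_type}> {obj} ."


person = BASE + "people/tsiccama"
dataset = BASE + "dataset/1"  # hypothetical dataset identifier

triples = [
    # A literal-valued property of a person
    Triple(person, BASE + "vocab/surName", "Siccama"),
    # A link between two entities: the value is itself a URI
    Triple(dataset, BASE + "vocab/creator", person, value_is_uri=True),
]

lines = [to_ntriples(t) for t in triples]
for line in lines:
    print(line)
```

The second triple is the kind of statement that connects otherwise separate islands: its value is the URI of another entity rather than a repeated text string.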
![Page 8: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/8.jpg)
Linked data: How it works
![Page 9: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/9.jpg)
Linked data is
“…a recommended best practice for exposing, sharing, and connecting
pieces of data, information, and knowledge on the Semantic Web
using URIs and RDF.”
--Wikipedia, http://en.wikipedia.org/wiki/Linked_Data
![Page 10: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/10.jpg)
A case study
Dataset collection search interface at HBES (http://hubbardbrook.org/data/dataset_search.php)
![Page 11: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/11.jpg)
Hubbard Brook Ecosystem Study (HBES)
• Long-term ecological research site since the 1960s
• 3,160 hectare reserve
• Six principal organizations & 10 other participants:
– USDA Forest Service
– Cornell
– Dartmouth
– Syracuse
– Yale
– the Institute of Ecosystem Studies (IES)
– the U.S. Geological Survey
• Over 300 datasets available and 2000 publications
![Page 12: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/12.jpg)
HBES Data Collection
• Focused on entities on the HBES site:
– Projects
– Persons
– Publications
– Subject interests
– Datasets
– Events
• Verified Person and Project information against the Long-Term Ecological Research (LTER) directory if necessary
• Stored the entities in a relational database
• Metadata records in EML format
![Page 13: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/13.jpg)
Ecological Metadata Language (EML) Structure and Modules
![Page 14: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/14.jpg)
Conditions required for interlinking
• URI-identified entities
• Relationships between these entities
• Relationships between the entities and metadata records
![Page 15: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/15.jpg)
Experiment stage 1: Data prep
• Two sets of data:
– Entities and their relationships
    • Person, subject interest, project, dataset, and paper
    • Many-to-many relations between the entities
– Sample EML records in XML format
    • Downloaded from HBES website
    • Entity URIs added to the corresponding XML files to be used as semantic identifiers and hyperlinks to the entities
    • 126 XML files in total
![Page 16: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/16.jpg)
Entity relationships
![Page 17: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/17.jpg)
Experiment stage 2: Converting to RDF
• Toolkit: D2R, a service for converting relational databases into RDF triples and publishing them on the web
– Turn each table into a class
– Turn each column into a class property
– Make each value in a column an instance
– Assign a URI to each class, property, and instance
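The table-to-class and column-to-property mapping listed above can be sketched in a few lines of Python. This is not D2R itself and does not use its mapping language; it is a minimal stand-in built on the standard-library `sqlite3` module. The base URI, the `person` table, and its columns are assumptions for illustration, and the sketch takes one common reading of the mapping in which each row becomes an instance whose column values become property values.

```python
import sqlite3

BASE = "http://example.org/hbes/"  # hypothetical base URI
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def table_to_triples(conn, table, key_column):
    """Yield (subject, predicate, object) tuples for every row of `table`.

    The table name is interpolated into SQL, so it must come from trusted code.
    """
    class_uri = f"{BASE}vocab/{table}"            # table -> class
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        instance_uri = f"{BASE}{table}/{record[key_column]}"  # row -> instance URI
        yield (instance_uri, RDF_TYPE, class_uri)
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            # column -> property; the cell value becomes the property value
            yield (instance_uri, f"{BASE}vocab/{table}_{column}", str(value))


# Tiny in-memory database standing in for the entity store
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id TEXT PRIMARY KEY, givenName TEXT, surName TEXT)"
)
conn.execute("INSERT INTO person VALUES ('tsiccama', 'Thomas G.', 'Siccama')")

triples = list(table_to_triples(conn, "person", "id"))
for t in triples:
    print(t)
```

Many-to-many relations such as person-to-dataset would live in a join table; under the same assumptions, each row of that table would yield one triple whose object is another instance URI rather than a literal.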
![Page 18: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/18.jpg)
![Page 19: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/19.jpg)
Experiment stage 3: Incorporating URIs into XML records
• Add the URIs generated by the D2R software to their corresponding entities in EML records by using an XSL program
• Transform the EML records with inserted URIs into HTML format for display in a browser
![Page 20: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/20.jpg)
Example of name with URI inserted
Original EML record without URI:

<individualName>
  <givenName>Thomas G</givenName>
  <surName>Siccama</surName>
</individualName>

URI added to individual name element:

<individualName>
  <givenName>Thomas G.</givenName>
  <surName>Siccama</surName>
  <personURI>page/people/tsiccama</personURI>
</individualName>
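The experiment performed this insertion with an XSL program. As a stand-in, the Python sketch below shows the same idea with the standard-library `xml.etree.ElementTree`: look up each `individualName` and append a `personURI` child. The lookup table `PERSON_URIS`, the name normalization, and the wrapping `creator` element are assumptions for the example; only the element names and the URI path come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup from (given name, surname) to the URI path for that person
PERSON_URIS = {("Thomas G", "Siccama"): "page/people/tsiccama"}


def normalize(text):
    """Trim whitespace and a trailing period so 'Thomas G.' matches 'Thomas G'."""
    return (text or "").strip().rstrip(".")


def add_person_uris(root, uris):
    """Append <personURI> to each matching <individualName>; return how many were added."""
    added = 0
    for name in root.iter("individualName"):
        key = (
            normalize(name.findtext("givenName")),
            normalize(name.findtext("surName")),
        )
        uri = uris.get(key)
        # Skip unknown people and names that already carry a URI
        if uri is not None and name.find("personURI") is None:
            ET.SubElement(name, "personURI").text = uri
            added += 1
    return added


eml = ET.fromstring(
    "<creator><individualName>"
    "<givenName>Thomas G</givenName><surName>Siccama</surName>"
    "</individualName></creator>"
)
count = add_person_uris(eml, PERSON_URIS)
result = ET.tostring(eml, encoding="unicode")
print(result)
```

Matching on name strings is fragile for real collections; a production version would more likely key on a stable identifier from the entity database.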
![Page 21: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/21.jpg)
Original display of EML record vs. RDF-enabled display of EML record
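The difference between the two displays can be sketched as a rendering rule: a name with a `personURI` becomes a hyperlink, and a name without one stays plain text. The experiment did this in XSL; the Python version below is only an illustration, and `BASE_URL` is a hypothetical site root that the relative URI path is appended to.

```python
import html
import xml.etree.ElementTree as ET

BASE_URL = "http://example.org/"  # hypothetical site root for the entity pages


def render_name(name, base_url=BASE_URL):
    """Return an HTML fragment for one <individualName> element."""
    given = (name.findtext("givenName") or "").strip()
    sur = (name.findtext("surName") or "").strip()
    label = html.escape(f"{given} {sur}".strip())
    uri = (name.findtext("personURI") or "").strip()
    if not uri:
        return label  # original display: plain text only
    href = html.escape(base_url + uri, quote=True)
    return f'<a href="{href}">{label}</a>'  # linked display


plain = ET.fromstring(
    "<individualName><givenName>Thomas G</givenName>"
    "<surName>Siccama</surName></individualName>"
)
linked = ET.fromstring(
    "<individualName><givenName>Thomas G.</givenName>"
    "<surName>Siccama</surName>"
    "<personURI>page/people/tsiccama</personURI></individualName>"
)

print(render_name(plain))
print(render_name(linked))
```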
![Page 22: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/22.jpg)
Discussion
• Methodology for transforming islands of entities into linked scientific metadata
• A larger-scale data set is needed to test its scalability
• Potentials:
– Reducing duplicate entity data entry
– Applicable to legacy metadata generated using older data models
– Linking semantic data already published on the web
– Facilitating data/metadata visualization??
![Page 23: Linking Scientific Metadata (presented at DC2010)](https://reader036.vdocuments.mx/reader036/viewer/2022070304/54bc2be64a7959df5d8b4575/html5/thumbnails/23.jpg)
DEMO
http://sdl.syr.edu/eml/