key integrating concepts groups formal community groups ad-hoc special purpose/ interest groups

1
Key integrating concepts Groups Formal Community Groups Ad-hoc special purpose/ interest groups Fine-grained access control and membership Linked All content can be tagged at a variety of level of detail Incentives to contribute content Content is visible in key reports – accepted by sponsors Progress in Open-World, Integrative, Collaborative Science Data Platforms Peter Fox 1 ([email protected] ) ( 1 Rensselaer Polytechnic Institute 110 8 th St., Troy, NY, 12180 United States – see Acknowledgements) Glossary: RPI – Rensselaer Polytechnic Institute TWC – Tetherless World Constellation at Rensselaer Polytechnic Institute CKAN – Comprehensive Knowledge Archive Network VIVO - Research Collaboration Network model and software GHS – Global Handle System PROV – W3 Provenance Model and Ontology Acknowledgments: DCO Data Science Team W3C Provenance Working Group Sponsors: AP Sloan Foundation National Science Foundation Tetherless World Constellation ABSTRACT As collaborative, or network science spreads into more Earth and space science fields, both the participants and their funders have expressed a very strong desire for highly functional data and information capabilities that are a) easy to use, b) integrated in a variety of ways, c) leverage prior investments and keep pace with rapid technical change, and d) are not expensive or time-consuming to build or maintain. At a conceptual level, science networks (even small ones) deal with people, and many intellectual artifacts produced or consumed in research, organizational and/our outreach activities, as well as the relations among them. Increasingly these networks are modeled as knowledge networks, i.e. graphs with named and typed relations among the 'nodes'. Nodes can be people, organizations, datasets, events, presentations, publications, videos, meetings, reports, groups, and more. In this heterogeneous ecosystem, it is also important to use a set of common informatics approaches to co-design and co-evolve the needed science data platforms based on what real people want to use them for. Community Data and Groups Integrate Semantic Applications into a Data Science Platform Information Models and Leveraged Ontology Community Dashboard and Reporting Progress Solution: Leverage Open-Source Semantic Technologies Triples tie it together Integrate terms from VIVO, FoaF, BIBO, O&M, VSTO, BCO-DMO, TWC and others Starting with their information models where possible Progress = integration at the application level to bring several key functions together in a Web-based environment oovering most research needs HIDE ALL OF “IT” FROM THE USERS!! Identify Everything • DCO-ID Content negotiation Respect other locations, names USE: W3C Provenance Ontology (PROV-O) Developed by W3C Provenance Working Group W3C 2013 Recommendation - http ://www.w3.org/TR/prov-o / AGUFM13 - IN11A-1509 (MS Hall A-C) CKAN VIVO GHS – Handle.net VIVO - represents academic research communities • Every person, organization, or other data entity in VIVO has a unique identifier • VIVO enables the discovery of research and scholarship across disciplines at one institution or across many • Records are both human-readable and machine-readable • We’ve extended (yes, ontologies) VIVO to the science network – datasets, instruments, sites, etc. • Feeding this back to VIVO Knowledge network – implements both the collaboration and the integration Many means of population User generation Machine generation Substantially contributing these enhancements back to open-source community (CKAN, VIVO, GHS) Information models provide domain level view and logical models implemented in ontology leverage a wide variety of vocabularies and encoding schemes Alignment performed using … RDFS subClassOf assertions SKOS broadMatch statements

Upload: tawny

Post on 22-Jan-2016

34 views

Category:

Documents


0 download

DESCRIPTION

AGUFM13 - IN11A-1509 (MS Hall A-C). Integrate Semantic Applications into a Data Science Platform. VIVO - represents academic research communities Every person, organization, or other data entity in VIVO has a unique identifier - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Key integrating concepts Groups Formal Community Groups Ad-hoc special purpose/ interest groups

Key integrating concepts Groups

Formal Community Groups Ad-hoc special purpose/ interest groups Fine-grained access control and membership

Linked All content can be tagged at a variety of level of detail

Incentives to contribute content Content is visible in key reports – accepted by sponsors

Progress in Open-World, Integrative, Collaborative Science Data Platforms

Peter Fox1 ([email protected])

(1Rensselaer Polytechnic Institute 110 8th St., Troy, NY, 12180 United States – see Acknowledgements)

Glossary:RPI – Rensselaer Polytechnic InstituteTWC – Tetherless World Constellation at Rensselaer Polytechnic InstituteCKAN – Comprehensive Knowledge Archive NetworkVIVO - Research Collaboration Network model and softwareGHS – Global Handle SystemPROV – W3 Provenance Model and Ontology

Acknowledgments:DCO Data Science TeamW3C Provenance Working Group

Sponsors:AP Sloan FoundationNational Science FoundationTetherless World Constellation

ABSTRACT

As collaborative, or network science spreads into more Earth and space science fields, both the participants and their funders have expressed a very strong desire for highly functional data and information capabilities that are a) easy to use, b) integrated in a variety of ways, c) leverage prior investments and keep pace with rapid technical change, and d) are not expensive or time-consuming to build or maintain.

At a conceptual level, science networks (even small ones) deal with people, and many intellectual artifacts produced or consumed in research, organizational and/our outreach activities, as well as the relations among them. Increasingly these networks are modeled as knowledge networks, i.e. graphs with named and typed relations among the 'nodes'. Nodes can be people, organizations, datasets, events, presentations, publications, videos, meetings, reports, groups, and more. In this heterogeneous ecosystem, it is also important to use a set of common informatics approaches to co-design and co-evolve the needed science data platforms based on what real people want to use them for.

Community Data and Groups

Integrate Semantic Applications into a Data Science Platform

Information Models and Leveraged Ontology

Community Dashboard and Reporting Progress

Solution: Leverage

Open-Source

Semantic

Technologies

Triples tie it together

Integrate terms from VIVO, FoaF, BIBO, O&M,

VSTO, BCO-DMO, TWC and others Starting with their information models where

possible

Progress = integration at the application level

to bring several key functions together in a

Web-based environment oovering most

research needs

HIDE ALL OF “IT” FROM THE USERS!!

Identify Everything• DCO-ID

• Content negotiation

• Respect other locations, names

USE: W3C Provenance Ontology (PROV-

O)• Developed by W3C Provenance Working Group

• W3C 2013 Recommendation - http://www.w3.org/TR/prov-o/

AGUFM13 - IN11A-1509 (MS Hall A-C)

CKAN VIVO

GHS – Handle.net

VIVO - represents academic research communities• Every person, organization, or other data entity in

VIVO has a unique identifier• VIVO enables the discovery of research and

scholarship across disciplines at one institution or across many

• Records are both human-readable and machine-readable

• We’ve extended (yes, ontologies) VIVO to the science network – datasets, instruments, sites, etc.

• Feeding this back to VIVO

Knowledge network – implements

both the collaboration and the

integration Many means of population

User generation Machine generation

Substantially contributing these

enhancements back to open-source

community (CKAN, VIVO, GHS)

Information models provide domain level view

and logical models implemented in ontology

leverage a wide variety of vocabularies and

encoding schemes Alignment performed using …

RDFS subClassOf assertions SKOS broadMatch statements