247th ACS Meeting: Experiment Markup Language (ExptML)

Experiment Markup Language: A Combined Markup Language and Ontology to Represent Science. Stuart J. Chalk, Department of Chemistry, University of North Florida. [email protected]. 2014 Spring ACS Meeting, CINF Paper

Upload: stuchalk

Post on 22-Jan-2015

DESCRIPTION

To integrate science into the semantic web it is important to capture the context of research as it is done. ExptML is designed to store information and workflows from the scientific process.

TRANSCRIPT

  • 1. Experiment Markup Language: A Combined Markup Language and Ontology to Represent Science. Stuart J. Chalk, Department of Chemistry, University of North Florida. [email protected]. 2014 Spring ACS Meeting, CINF Paper 19

  • 2. Outline
      - Digital Representation of Science
      - Electronic Notebooks
      - The Eureka Research Workbench
      - Experiment Markup Language
      - ExptML Schema and Files
      - Semantic Data and Ontologies
      - File Storage
      - Eureka Interface
      - Web Interface
      - Conclusion

  • 3. Digital Representation of Science
      - Most research on digital science is focused on the data
      - Standards exist for the digital representation of:
          - Data: individual measurements, time series, spectra
          - Molecules
          - Chemical reactions
      - Context is important!
          - Context can be added ad hoc
          - It needs to be added systematically to be searchable
      - We need a digital representation of the scientific process

  • 4. Eureka Research Workbench
      - Conceptualized in 2006
      - Need a way to store research activities, laboratory resources, and data
      - Need to capture the workflow of scientists, not define it
      - Writing in a lab notebook is equivalent to blogging, but the context of the entries is important and varies
      - Many data types, so how to capture information? Experiment Markup Language (ExptML)

  • 5. Experiment Markup Language (ExptML)
      - A specification (written in XML) that describes different types of information recorded during the scientific process (http://exptml.sourceforge.net)
      - Annotation, Api, Calculation, Chemical, Citation, Customer, Data, Dataset, Definition, Element, Equipment, Event, Experiment, Group, Message, Project, Protocol, Quote, Report, Result, Sample, Solution, Space, Specimen, Substance, Task, Template, Timeline, User, Vendor

  • 6. ExptML Chemical Schema

  • 7. ExptML Chemical Schema

  • 8. ExptML Chemical (Instance)

  • 9. Linking ExptML Files
      - To allow ExptML to capture a scientific workflow, an ontology is needed to represent the structure
      - It needs to be:
          - Flexible: able to be used in a wide variety of areas
          - Logical: the links make sense in the context of science
          - Searchable: so we can find research done in a similar way
          - Comprehensive! This is the BIG problem
      - Many existing ontologies

  • 10. ExptML Ontology
      - In computer science, an ontology formally represents knowledge as a set of concepts within a domain, and the relationships between those concepts. It can be used to model a domain and support reasoning about concepts.*
      - In essence, an ontology allows us to define relationships between concepts and assertions about them
      - For samples represented in ExptML we define:
          - isSample (assertion)
          - hasSample (relationship)
          - isSampleOf (relationship)
      - *https://en.wikipedia.org/wiki/Ontology_(information_science)

  • 11. ExptML Ontology

  • 12. Developments in ExptML
      - XML is good for storing, archiving, and transmitting information, but it is not so easy to use in software
      - There are many XML readers, but each has its own syntax
      - XML can be cumbersome to deal with in software because of:
          - File size (XML is verbose)
          - Namespaces
          - Data types (e.g. string, decimal)
      - So the solution is...

  • 13. JavaScript Object Notation (JSON)
      - JSONize it!
      - Compact string representation of arrays of data
      - Used in AJAX requests in web browsers

        {
          "exptmlid": "exptml:ann1",
          "anntype": "comment",
          "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
          "date": "2011-11-25T11:05:17-04:00"
        }

      - [XML version of the same annotation (type, text, date); element markup not recoverable from the transcript]

  • 14. JSON-LD
      - JSON-based Serialization for Linked Data
      - Current W3C Recommendation*
      - Allows us to define a specification for the JSON data
      - @context is equivalent to an XML Schema
      - *http://www.w3.org/TR/json-ld

        {
          "@context": {
            "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
            "anntype": "http://www.w3.org/2001/XMLSchema#string",
            "text": "http://www.w3.org/2001/XMLSchema#string",
            "date": "http://www.w3.org/2001/XMLSchema#dateTime"
          }
        }

  • 15. JSON-LD

        {
          "@context": {
            "exptmlid": "http://www.w3.org/2001/XMLSchema#string",
            "anntype": "http://www.w3.org/2001/XMLSchema#string",
            "text": "http://www.w3.org/2001/XMLSchema#string",
            "date": "http://www.w3.org/2001/XMLSchema#dateTime"
          },
          "exptmlid": "exptml:ann1",
          "anntype": "comment",
          "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
          "date": "2011-11-25T11:05:17-04:00"
        }

  • 16. JSON-LD
      - @id represents an Internationalized Resource Identifier (IRI)
      - The IRI identifies a node and allows this data to be linked

        {
          "@context": "http://exptld.org/annotation.jsonld",
          "@id": "https://eureka.coas.unf.edu/exptml:ann1",
          "anntype": "comment",
          "text": "Had to wait for the biochemistry lab to finish using the spectrophotometer before I could get on it. The standards sat around for 1 hr 30 minutes before I could run them.",
          "date": "2011-11-25T11:05:17-04:00"
        }

  • 17. Developments in the Ontology
      - Currently the ontology defines generic relationships
      - It should be expanded to provide additional context
      - has solution: indicates that an experiment makes use of a particular solution
      - has buffer: indicates that an experiment makes use of a buffer (solution)
      - has reagent: indicates that an experiment makes use of a reagent (solution)
      - has calibration standard: indicates that an experiment makes use of a calibration standard
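
The annotation from the JSON-LD slides and the "has ..." relationships from the last slide can be illustrated with a minimal Python sketch. It is not ExptML or Eureka code: the names make_annotation, find_subjects and SPECIALIZES, the camelCase relation names (modelled on hasSample), and the identifiers exptml:exp1, exptml:sol1 and so on are all hypothetical. Treating "has buffer" and "has reagent" as special cases of "has solution" is an assumption drawn from the "(solution)" notes on the slide. The @context simply repeats the mapping shown in the transcript, and no JSON-LD processor is involved.

```python
import json
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"


def make_annotation(node_iri, anntype, text, date):
    """Build one annotation as a JSON-LD style dict (hypothetical helper)."""
    # Check the timestamp is ISO 8601; raises ValueError otherwise.
    datetime.fromisoformat(date)
    return {
        # Term mapping copied from the transcript's @context example.
        "@context": {
            "anntype": XSD + "string",
            "text": XSD + "string",
            "date": XSD + "dateTime",
        },
        # The IRI that identifies this node so other data can link to it.
        "@id": node_iri,
        "anntype": anntype,
        "text": text,
        "date": date,
    }


# Assumption: the more specific relationships specialize "has solution".
SPECIALIZES = {"hasBuffer": "hasSolution", "hasReagent": "hasSolution"}


def find_subjects(triples, relation, target):
    """Return subjects linked to target by relation or a specialization of it."""
    wanted = {relation} | {r for r, p in SPECIALIZES.items() if p == relation}
    return sorted({s for s, r, o in triples if r in wanted and o == target})


if __name__ == "__main__":
    ann = make_annotation(
        "https://eureka.coas.unf.edu/exptml:ann1",
        "comment",
        "The standards sat around for 1 hr 30 minutes before I could run them.",
        "2011-11-25T11:05:17-04:00",
    )
    print(json.dumps(ann, indent=2))

    # Hypothetical experiment and solution identifiers.
    triples = [
        ("exptml:exp1", "hasBuffer", "exptml:sol1"),
        ("exptml:exp2", "hasSolution", "exptml:sol1"),
        ("exptml:exp3", "hasReagent", "exptml:sol2"),
    ]
    # Searching on the generic relationship also finds the buffer link.
    print(find_subjects(triples, "hasSolution", "exptml:sol1"))
    # ['exptml:exp1', 'exptml:exp2']
```

Searching on the generic relationship returns experiments linked through any of its specializations, while searching on "hasBuffer" returns only the narrower set. That is one way the added context could make research "done in a similar way" easier to find.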