Overview of the Analytical Information Markup Language (AnIML)
Post on 16-Jul-2015
Overview of the Analytical Information Markup Language
Stuart J. Chalk, Department of Chemistry, University of North Florida
schalk@unf.edu
ACS Meeting Denver 2015
Data Formats
Goals for Data Handling
Introduction to AnIML
Sections of an AnIML file
AnIML Schemas and Files
AnIML Technique Definitions
Publishing Instrument Data
Referencing Data Elements
Calculations on Data
Future Developments
Conclusion
Overview
Native Data Formats
Proprietary formats
"Metadata" separated from result data
Metadata and data in multiple files
Metadata not available electronically
No way to link metadata with result data
Interchange Data Formats
Available for only a few techniques
ANDI — GC, LC, MS
JCAMP-DX — IR/FTIR, NMR, UV/Vis, IMS
Fixed order, fixed syntax, immutable formats
Content limitations
Inconsistent implementations
Data Formats
Extensible
Easy to add new elements without breaking existing applications
Flexible
Useful for diverse needs: Interchange, Interconversion, Archiving...
Useable & Maintainable
Easy to create, use, adapt, maintain...
Readily available tools
Acceptable
Use standard mechanisms accepted by mainstream computing
Human readable
eXtensible Markup Language
Goals for Data Handling
Extensible Markup Language (XML) specification
Development under ASTM E13.15 ‘AnIML Task Group’
Data standard to:
“Develop an analytical data standard that can be used to store data from any analytical instrument”
Introduction to AnIML
http://animl.sourceforge.net
JCAMP-DX http://www.jcamp-dx.org/
ANDI (netCDF)
ThermoML (NIST)
SpectroML
Nguyen, A. D. T., Arslan, A., Travis, J., Smith, M., Schafer, R., & Kramer, G. W. (2004) ‘Molecular Spectrometry Data Interchange Applications for NIST's SpectroML’, JALA 9 (6), 346-354. doi:10.1016/j.jala.2004.09.001
Generalized Analytical Markup Language (GAML) http://www.gaml.org/
First official meeting March 23, 2003 @ ASTM
Brief History of AnIML
Broad scope
Different types of data
Size of data sets
Everyone calls a ‘widget’ something different
Need for metadata dictionaries
One size does not fit all
Getting broad community involvement
Domain experts
User communities
What format?
Challenges for AnIML
AnIML XML elements are ‘pigeon holes’ for metadata
Minimal ‘required’ information
If it’s not required you don’t have to include the element
Extensible
Store raw data, not processed data (except for FT techniques)
Support for legacy data
Record of changes
Validatable
Signable (digital sense)
AnIML Design Philosophy
Access
Reference
Search
Visualize
Export
Manipulate
Process
Contextualize
Leverage XML tools/formats
AnIML in an ELN
Expose an AnIML file at a URL
Optional: Define a DOI for that URL
Use XPath to reference a specific data point in an AnIML file
//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]
Encode the XPath expression so it can be part of the URL
Referencing Instrument Data
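The referencing steps above can be sketched in Python. This is a minimal illustration, not the presenter's implementation: the sample document is a hypothetical, heavily simplified stand-in for a real AnIML file (no XML namespace, only the elements named in the XPath), and the host `example.org` and the `xpath` query parameter are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote

# Hypothetical, simplified document. Real AnIML files declare an XML
# namespace and carry many more elements than shown here.
SAMPLE = """<AnIML>
  <ExperimentStepSet>
    <ExperimentStep name="UV/Vis scan">
      <Method>
        <Author>
          <Name>Jane Doe</Name>
        </Author>
      </Method>
    </ExperimentStep>
  </ExperimentStepSet>
</AnIML>"""

# The XPath expression from the slide.
XPATH = "//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]"


def resolve(xml_text: str, xpath: str) -> str:
    """Return the text of the first node matched by a simple XPath."""
    root = ET.fromstring(xml_text)
    # ElementTree supports only a subset of XPath and expects a relative
    # path, so a leading "//" is rewritten as ".//".
    path = "." + xpath if xpath.startswith("//") else xpath
    node = root.find(path)
    if node is None:
        raise LookupError(f"no node matches {xpath}")
    return (node.text or "").strip()


def reference_url(file_url: str, xpath: str) -> str:
    """Append the percent-encoded XPath to the file URL as a query value."""
    # safe="" also encodes "/", "[" and "]", which are reserved in URLs.
    return f"{file_url}?xpath={quote(xpath, safe='')}"


if __name__ == "__main__":
    url = reference_url("https://example.org/data/sample.animl", XPATH)
    print(url)
    # A consumer decodes the query value and resolves it against the file.
    decoded = unquote(url.split("xpath=", 1)[1])
    print(resolve(SAMPLE, decoded))
```

A full XPath 1.0 engine would be needed for expressions beyond the subset ElementTree understands, and a namespaced file would need a prefix map passed to `find`.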
Calculations with Instrument Data
Extract data from files using XPath
XML data to JSON conversion using XSLT*
Browser based JavaScript functions to
Smooth: moving window, Savitzky-Golay
Integrate: summation
Conversion: Absorbance <-> %T
Linear regression
*http://www.bramstein.com/projects/xsltjson/
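The calculations listed above are small enough to sketch directly. The slides describe browser-based JavaScript functions; the versions below are written in Python purely for illustration, and the function names, the truncated-window edge handling, the rectangle-rule reading of "summation", and the fixed 5-point quadratic Savitzky-Golay window are all assumptions rather than details taken from the talk.

```python
import math


def moving_average(y, window=3):
    """Smooth with a centered moving window; the ends use a shorter window."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    out = []
    for i in range(len(y)):
        chunk = y[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def savitzky_golay_5(y):
    """5-point quadratic Savitzky-Golay smooth.

    Uses the standard coefficients (-3, 12, 17, 12, -3) / 35. The two
    points at each end are returned unchanged (an assumption).
    """
    coeffs = (-3, 12, 17, 12, -3)
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(coeffs)) / 35
    return out


def integrate_by_summation(x, y):
    """Rectangle-rule area: each y value times the spacing to the next x."""
    return sum(y[i] * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def absorbance_to_percent_t(a):
    """%T = 100 * 10^(-A)."""
    return 100.0 * 10.0 ** (-a)


def percent_t_to_absorbance(percent_t):
    """A = 2 - log10(%T)."""
    return 2.0 - math.log10(percent_t)


def linear_regression(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    print(moving_average([1, 2, 3, 4, 5]))
    print(percent_t_to_absorbance(absorbance_to_percent_t(0.5)))
    print(linear_regression([0, 1, 2, 3], [1, 3, 5, 7]))
```

The inputs are plain lists of numbers, which is the shape the data would take after extraction with XPath and conversion to JSON.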
AnIML 1.0 Deliverables
Core Schema - Fundamental framework for AnIML documents
Technique Schema - Fundamental framework for technique definition and extension documents
AnIML Technique Definition Documents (ATDD) - Rules for content of specific technique file
AnIML Naming and Design Rules - Specifies rules about data element structure for interoperability
Standard Practice for AnIML Files - Describes how the specification is supposed to work
How to Create a Technique Definition Document - Guidelines for creating new technique definition documents
Other documents
Draft Requirements Specification for AnIML Version 1.0
Requirements and Goals of the Analytical Information Markup Language
AnIML Specification
http://animl.sourceforge.net
Documentation
Core specification
Technique and extension specification
Naming and design rules
Annotated technique definitions (UV/Vis, IR, 1D NMR, MS, Chromatography)
Balloting through ASTM (end of 2015)
Vendor, User, Developer extensions
Semantic extension
Ontological reference to AnIML metadata items
Future Developments
Conclusion
AnIML is a great solution for storing instrument data
Human readable (plain text - UTF-8)
Platform neutral
Archivable
Validatable
Being XML-based, AnIML leverages the extensive ecosystem of XML tools, most of which are free
Software designers are familiar with XML because of its well-defined and stable architecture
schalk@unf.edu
Phone: 904-620-1938
Skype: stuartchalk
LinkedIn/SlideShare: https://www.linkedin.com/in/stuchalk
ORCID: http://orcid.org/0000-0002-0703-7776
ResearcherID: http://www.researcherid.com/rid/D-8577-2013
Questions?