This presentation introduces the ISO Metadata Developer’s Toolkit developed by the Alaska Data Integration working group (ADIwg). It covers the purpose, implementation, and capabilities of the eight tools in the toolkit.
ISO Metadata Developer’s Toolkit
A collection of open-source software tools designed to help individuals and organizations create metadata for their research projects and data.
6/10/2015 www.adiwg.org 2
GitHub Repositories: https://github.com/adiwg
Presenter
Presentation Notes
-- the toolkit supports ISO 19115-2, 19110, and HTML at present
-- architecture supports extension to other ISO and non-ISO metadata standards
-- 15-minute overview followed by a 15-minute demo
• State of Alaska
  o University of Alaska (UAF, UAS)
  o Geographic Information Network of Alaska (GINA)
  o International Arctic Research Center (IARC)
• Non-Governmental Organizations (NGOs)
  o Alaska Ocean Observing System (AOOS)
  o Arctic Research Mapping Application (ARMAP) - Nunatech Consulting
  o North Pacific Research Board (NPRB)
  o North Slope Science Initiative (NSSI)
• Cooperatives/Joint-Ventures
  o Arctic LCC
Presenter
Presentation Notes
-- ADIwg (Alaska Data Integration working group)
-- team consists of technical representatives from 14 organizations in Alaska working on climate science
-- ADIwg decided in 2012 to adopt the ISO 19115-2 metadata standard for distributing our data metadata, and later to include project metadata
-- because this was a sizable task, beyond the capacity of our smaller organizations, ADIwg took on ISO implementation as a joint development effort
Project Objectives
• Make it easier for organizations to achieve ISO compliance:
  o Integrate ISO support into local applications and services
  o Implement custom web services with ISO metadata capabilities
• Support both project and data metadata
• Offer as open-source, extensible software architecture
• Eliminate the need for users to learn the ISO 19115 family of standards to produce metadata
• Support individual researchers and large organizations
• Host a public web service for generation of ISO metadata records
• Host a public web app for researchers to enter and edit metadata content
Presenter
Presentation Notes
-- ADIwg set these objectives at the outset
-- at this time all objectives, other than the web app for researchers, have been achieved
Prioritize ISO Content
Polled members for priority data types and usage
Presenter
Presentation Notes
-- rather than implement the entire ISO standard at once, determine the core elements and extend over time
-- ADIwg polled members to assess data use patterns
-- the summary slide shows that tabular and geospatial data make up the majority of our research data
-- we started with support for these data types
Supported ISO Fields
• ADIwg supported fields
• ~80 classes
• ~350 attributes
• 70% of full standard
Presenter
Presentation Notes
-- selected ISO fields to support tabular and geospatial data
-- built an object class model to document the field selection
-- we estimate the selected ISO fields will cover 80% or more of our data products
-- not a small subset (see stats)
ADIwg Profile in JSON
• Why JSON and not XML?
  o Wide support from programming languages
  o Easy to read by both humans and machines
  o Focus is on the data, less markup
  o Native to JavaScript - browsers
  o Validation against schema definition
• Support multiple standards (primarily ISO)
• Support custom fields
• Support complete data dictionaries
• Support complex geography in GeoJSON
• Flexibility to extend profile
Presenter
Presentation Notes
-- designed a JSON structure to hold the metadata content specified in our “Supported ISO Fields” model
-- we chose JSON as an intermediate metadata content holder primarily because it is easy to manipulate in most modern development languages
-- this also gave us independence from any single established metadata standard
-- we use the same JSON input to create all the supported output standards: 19115-2, 19110, and HTML
-- requirements for 19115-1 were also considered when designing the JSON structure
-- we named this JSON profile mdJson
JavaScript Object Notation
Presenter
Presentation Notes
-- an example of JSON
-- this is the top part of a minimal mdJson file
-- note the similarity to ISO 19115
mdJSON
  Version
  Contacts
    Individual
    Organization
  Metadata
    Metadata Info
    Resource Info
      Citation
      Keywords
      Extents
      ...
    Distribution
    Associated Resources
    Additional Docs
  Data Dictionary
Presenter
Presentation Notes
-- high-level view of mdJson’s hierarchical structure
-- note the similarities and differences to the ISO structure
-- similarity:
---- the metadata section closely follows ISO
-- difference:
---- contacts are normalized into a contact array; when a contact is required in the metadata sections it is referenced by an id and associated with a role
-- difference:
---- the data dictionary section is included with the metadata record; the data dictionary organization closely resembles SQL syntax
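The normalized contact array can be illustrated with a short sketch. The key names used here (`contact`, `contactId`, `responsibleParty`, `role`) and the sample values are assumptions made for illustration; they are not taken from the mdJson specification. The idea shown is only that each contact is stored once and that metadata sections point to it by id and attach a role.

```ruby
require 'json'

# Hypothetical mdJson-like fragment; key names and values are illustrative only.
MD_JSON = <<~JSON
  {
    "contact": [
      { "contactId": "c1", "individualName": "Jane Doe" },
      { "contactId": "c2", "organizationName": "Example Research Org" }
    ],
    "metadata": {
      "resourceInfo": {
        "citation": {
          "title": "Example dataset",
          "responsibleParty": [
            { "contactId": "c1", "role": "pointOfContact" },
            { "contactId": "c2", "role": "publisher" }
          ]
        }
      }
    }
  }
JSON

# Build an id => contact lookup, then expand each reference into
# the full contact plus the role it plays in this citation.
def resolve_parties(record)
  contacts = record.fetch('contact').to_h { |c| [c.fetch('contactId'), c] }
  parties = record.dig('metadata', 'resourceInfo', 'citation', 'responsibleParty') || []
  parties.map do |ref|
    contact = contacts.fetch(ref.fetch('contactId')) do |id|
      raise KeyError, "unknown contactId: #{id}"
    end
    contact.merge('role' => ref.fetch('role'))
  end
end

resolve_parties(JSON.parse(MD_JSON)).each do |party|
  puts "#{party['role']}: #{party['individualName'] || party['organizationName']}"
end
```

Storing contacts once means a change to a contact's details is made in one place, while the same contact can appear under several roles.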
mdJson-schema
• Complete structural validation of JSON
• Latest IETF draft (version 4)
• Validation engines available in many languages
• http://json-schema.org/
Presenter
Presentation Notes
-- JSON files can be validated using a ‘json-schema’ definition
-- a json-schema fills a role similar to that of an XSD for XML
-- the inset shows a portion of the schema definition for citation
-- IETF: Internet Engineering Task Force
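To give a feel for what structural validation checks, here is a minimal hand-rolled sketch that handles only the `type`, `required`, and `properties` keywords of JSON Schema. It is not a JSON Schema validation engine and is not how the toolkit validates; a real project would use one of the existing engines. The citation schema fragment below is hypothetical and is not copied from mdJson-schemas.

```ruby
require 'json'

# Hypothetical schema fragment for a citation; not copied from mdJson-schemas.
CITATION_SCHEMA = {
  'type' => 'object',
  'required' => %w[title date],
  'properties' => {
    'title' => { 'type' => 'string' },
    'date' => { 'type' => 'array' }
  }
}.freeze

# Ruby classes accepted for each JSON Schema primitive type used in this sketch.
TYPE_CLASSES = {
  'object' => [Hash],
  'array' => [Array],
  'string' => [String]
}.freeze

# Returns a list of error messages; an empty list means the value passed.
def check(value, schema, path = '#')
  expected = schema['type']
  if expected && TYPE_CLASSES.fetch(expected).none? { |klass| value.is_a?(klass) }
    return ["#{path}: expected #{expected}"]
  end
  return [] unless value.is_a?(Hash)

  errors = []
  (schema['required'] || []).each do |key|
    errors << "#{path}: missing required property '#{key}'" unless value.key?(key)
  end
  (schema['properties'] || {}).each do |key, subschema|
    errors.concat(check(value[key], subschema, "#{path}/#{key}")) if value.key?(key)
  end
  errors
end

good = JSON.parse('{"title": "Example dataset", "date": [{"date": "2015-06-10"}]}')
bad = JSON.parse('{"title": 42}')
puts check(good, CITATION_SCHEMA).inspect
puts check(bad, CITATION_SCHEMA).inspect
```

The second call reports both a missing required property and a type mismatch, which is the kind of feedback a schema-driven validator gives before any translation is attempted.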
-- the mdTranslator is the core of the ISO Metadata Developer’s Toolkit
-- it accepts input, reads it into the internal store, and generates metadata in the requested standard
-- in this example…
---- mdJson is sent to the translator
---- the input is validated using the toolkit component ‘mdJson-schemas’
---- if the input passes validation, it is sent to the mdJson reader, which loads the metadata content into the ‘internal data store’
---- if the loading succeeds, control passes to the ISO 19115-2 writer
---- properly formatted ISO 19115-2 metadata is passed back to the requester
-- other supported metadata writer standards are shown in blue
-- planned metadata writer standards are shown in light blue
-- readers and writers shown in gray areas are being considered
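The validate, read, and write flow described in these notes can be sketched in a few lines. Everything in this example is a hypothetical stand-in: the module, method names, and the toy "text" and "html" writers are not from the mdTranslator code base, and the real writers produce ISO 19115-2, ISO 19110, and HTML. The sketch only shows how a failure at one stage stops the pipeline before the next stage runs.

```ruby
require 'json'

# Hypothetical stand-in for the translator stages; names are illustrative only.
module SketchTranslator
  # Toy writers that render the internal store in different output formats.
  WRITERS = {
    'html' => ->(store) { "<h1>#{store[:title]}</h1>" },
    'text' => ->(store) { "Title: #{store[:title]}" }
  }.freeze

  # Stage 1: structural check. Returns the parsed document, or nil on failure.
  def self.validate(input)
    doc = JSON.parse(input)
    doc.is_a?(Hash) && doc.key?('metadata') ? doc : nil
  rescue JSON::ParserError
    nil
  end

  # Stage 2: the reader loads content into a format-neutral internal store.
  def self.read(doc)
    { title: doc.dig('metadata', 'resourceInfo', 'citation', 'title') }
  end

  # Run the stages in order and stop at the first one that fails.
  def self.translate(input, writer:)
    write = WRITERS.fetch(writer) { raise ArgumentError, "unsupported writer: #{writer}" }
    doc = validate(input)
    return { ok: false, stage: :validation } if doc.nil?

    store = read(doc)
    return { ok: false, stage: :reader } if store[:title].nil?

    # Stage 3: the requested writer renders the internal store.
    { ok: true, output: write.call(store) }
  end
end

input = '{"metadata": {"resourceInfo": {"citation": {"title": "Example dataset"}}}}'
puts SketchTranslator.translate(input, writer: 'text')[:output]
puts SketchTranslator.translate(input, writer: 'html')[:output]
```

Because writers only see the internal store, adding an output standard in this arrangement means adding a writer, without touching the reader or the input format.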
mdCodes
• Developed for metadata content editors (mdEditor) to load codelist values
• Contains all ISO codelists needed by ADIwg Profile
• Codes current with 19115-2, 19115-1, including some ADIwg extensions
• Each codelist is maintained as a YAML file
  o “Yet Another Markup Language” or “YAML Ain’t Markup Language”
  o Suited for text editing structured data
  o Supported in Ruby, Python, Perl, grep
• Will generate an ISO CT_CodelistCatalogue for codelists
Presenter
Presentation Notes
-- mdCodes was designed to support loading metadata content editors with valid ISO codes
-- and to support extension of codelists without needing to modify the translator or editing applications
-- the mdCodes module also generates the appropriate ISO CT_CodelistCatalogue on request
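As a rough illustration of why YAML codelists are easy to consume, the sketch below loads a codelist with Ruby's standard YAML library and uses it to check a value. The file layout (`codelistName`, `codes`, `code`, `description`) is an assumption made for this example; the actual mdCodes files may be organized differently.

```ruby
require 'yaml'

# Hypothetical codelist layout; the real mdCodes files may differ.
ROLE_CODELIST_YAML = <<~YAML
  codelistName: role
  codes:
    - code: author
      description: party who authored the resource
    - code: pointOfContact
      description: party who can be contacted about the resource
    - code: publisher
      description: party who published the resource
YAML

# Parse the YAML text and return just the code values, in file order.
def load_codes(yaml_text)
  codelist = YAML.safe_load(yaml_text)
  codelist.fetch('codes').map { |entry| entry.fetch('code') }
end

# An editor could use this to populate a pick list or reject unknown values.
def valid_code?(codes, value)
  codes.include?(value)
end

codes = load_codes(ROLE_CODELIST_YAML)
puts codes.join(', ')
puts valid_code?(codes, 'publisher')
puts valid_code?(codes, 'janitor')
```

Extending a codelist in this arrangement is a text edit to the YAML file; the loading code does not change.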
ISO Toolkit Components
• mdJson
  o Standard for encoding project and data metadata
• mdTranslator
  o Provides translation to established metadata standards
• mdTools
  o Groups documentation, validation, and translator interface tools
• mdEditor
  o Online preparation and editing of mdJson files
• mdBook
  o Online documentation for all tools in the ISO Metadata Developer’s Toolkit
• mdCodes
  o Standard ISO codelists for populating metadata editors
• mdJson-schemas
  o Schema definition for mdJson for validating mdJson file structure and content
• mdTranslator-rails
  o Ruby on Rails website for public access to hosted mdTranslator
Code available on GitHub: https://github.com/adiwg/
Presenter
Presentation Notes
-- these are the 8 components of the ISO Metadata Developer’s Toolkit
-- each has its own GitHub repository
-- all are open source
-- the next series of slides illustrates how the tools stack and interact
-- for individuals and organizations that wish to customize any of the tools
-- establish a Ruby environment and clone or fork the mdTranslator code repository from GitHub
-- then write a simple Ruby program to pass your mdJson to the mdTranslator and catch the result
mdTranslator as a gem
Ruby code
Ruby Gem
Ruby install + program
Presenter
Presentation Notes
-- for individuals and organizations that wish to integrate the mdTranslator into local systems without customization
-- establish a Ruby environment and use RubyGems to install the adiwg-mdtranslator gem
-- then write a simple Ruby program to pass your mdJson to the mdTranslator and catch the result
-- an advantage of installing with RubyGems is that it automatically installs all other mdTranslator dependencies
mdTranslator as a web service
Ruby code
Ruby Gem
mdTranslator-rails
Public hosted web service
Web Application
Presenter
Presentation Notes
-- for individuals and organizations that do not need to integrate the mdTranslator with local applications
-- write a simple web page that will POST your mdJson file to the hosted mdTranslator API; no Ruby development environment is required
-- use POST to avoid the URL length limit (about 2K) that some browsers place on GET requests
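The notes describe posting from a web page; for readers who prefer a script, the same idea can be sketched with Ruby's standard `net/http` library. The endpoint URL, the form field names (`file`, `reader`, `writer`), and the writer value are all placeholders assumed for this example and are not the documented interface of the hosted service. The request is built but not sent, since the endpoint is not real.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical endpoint; replace with the real address of the hosted service.
ENDPOINT = URI('https://example.org/api/translator')

# Build a form-encoded POST so the mdJson travels in the request body
# rather than in the URL, which avoids GET length limits.
# The field names below are assumptions, not the service's documented ones.
def build_translate_request(md_json, writer:)
  request = Net::HTTP::Post.new(ENDPOINT)
  request.set_form_data('file' => md_json, 'reader' => 'mdJson', 'writer' => writer)
  request
end

md_json = JSON.generate(
  'metadata' => { 'resourceInfo' => { 'citation' => { 'title' => 'Example dataset' } } }
)
request = build_translate_request(md_json, writer: 'iso19115_2')
puts request.method
puts request['Content-Type']
puts request.body.bytesize

# Sending is left commented out because the endpoint above is a placeholder:
# response = Net::HTTP.start(ENDPOINT.host, ENDPOINT.port, use_ssl: true) do |http|
#   http.request(request)
# end
# puts response.body
```

Because the payload is in the body, its size is not bounded by URL length, which matters for metadata records that include data dictionaries or complex geometry.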
mdTranslator from mdTools
Ruby code
Ruby Gem
mdTranslator-rails
Document mdJson
Validate mdJson
Submit & Capture
Public hosted web service
Browser
Presenter
Presentation Notes
-- individuals and organizations that wish to interact with the mdTranslator through a pre-built service can use mdTools
-- mdTools can validate your mdJson file, POST it, and catch the result
-- no development is required
mdJson from mdEditor
Ruby code
Ruby Gem
mdTranslator-rails
Public hosted web service
Browser
Presenter
Presentation Notes
-- for individuals and organizations that want a service to help organize their metadata content into an mdJson file
-- use the web-accessible metadata content editor, mdEditor
-- not ready yet; planned for release in fall 2015
-- mdEditor will run as client-side JavaScript; all metadata content will remain local until submitted to the hosted mdTranslator API