update and thoughts on directions for metadata work

Update and Thoughts on Directions for Metadata Work Carol Hert March 17, 2003


Page 1: Update and Thoughts on Directions for Metadata Work

Update and Thoughts on Directions for Metadata Work

Carol Hert

March 17, 2003

Page 2: Update and Thoughts on Directions for Metadata Work

Our Metadata Activities

User study to understand metadata necessary for integration tasks (we’re finding needs for metadata not available in agencies)
Ongoing efforts to understand DDI and ISO11179 for deploying in end-user tools
Identification of a host of other relevant standards (open archives, business XML, Z39.50, …)
Marked-up tables using DDI
Attempting to acquire particular metadata

Page 3: Update and Thoughts on Directions for Metadata Work

Metadata Aspects for GovStat

Conceptual Tasks
  Determining elements and attributes to be used in wrapping data and contextual info (an XML DTD presumably)
  User study et al. to determine appropriate content
  “Thought” experiments with implementations related to elements, attributes, and their values
  Developing conceptual metadata model for SKN

Practical Tasks
  Finding the actual metadata content to be “wrapped” via the elements
  Finding data with metadata to port into tools
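As a rough illustration of what "wrapping" data and contextual info in XML elements and attributes could look like, here is a minimal Python sketch. Every element name, attribute name, and value in it (`statRecord`, `variable`, `datum`, the placeholder number) is a hypothetical stand-in and is not taken from DDI, ISO11179, or any GovStat DTD.

```python
import xml.etree.ElementTree as ET


def wrap_value(value, variable, agency, period, definition):
    """Wrap one data value together with its contextual metadata.

    All element and attribute names are hypothetical placeholders.
    """
    record = ET.Element("statRecord", attrib={"agency": agency})
    var = ET.SubElement(record, "variable", attrib={"name": variable})
    ET.SubElement(var, "definition").text = definition
    datum = ET.SubElement(record, "datum", attrib={"period": period})
    datum.text = str(value)
    return record


# Placeholder values only; they do not come from any real data set.
record = wrap_value(
    value=100.0,
    variable="example_price_index",
    agency="BLS",
    period="2003-01",
    definition="(placeholder definition text)",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The point of the sketch is only that the data value and its context travel in one structure; deciding which elements and attributes belong there is the conceptual task named above.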

Page 4: Update and Thoughts on Directions for Metadata Work

Today’s Presentation

Focus on the Conceptual Tasks
Status report on potentially relevant standards and projects
Considering the user tools and the public intermediary
Start strategizing on directions to pursue further

Page 5: Update and Thoughts on Directions for Metadata Work

Concept. Task 1: Identifying Elements, Attributes, and Values

Current Contenders for Elements, Attributes (and some values)
DDI (and its implementations)
ISO11179 (and its implementations)
Hybrids
  Corporate Metadata Repository (CMR) from Oracle
  Data cubes for Tables from NESSTAR, DDI

Page 6: Update and Thoughts on Directions for Metadata Work

DDI

Data set is the basic element
Data archives perspective: designed primarily for people who archive data sets and those who will retrieve and reuse those data sets
Does capture information on variables, values, etc.
Still actively working on specifications for tables (see Ryssevik memo 3/6/2003)

Page 7: Update and Thoughts on Directions for Metadata Work

DDI Issues

Doesn’t have a good mechanism for relating surveys and instances of those surveys; each data set is considered stand-alone
Hard to compare across variables and time series
Elements for tables still in development; other data presentations (such as news releases, graphics) not well developed
Currently working backwards to a conceptual model for the metadata

Page 8: Update and Thoughts on Directions for Metadata Work

DDI Implementations of Note

Counting California
Virtual Data Center (Harvard/MIT)
NESSTAR/FASTER
  Developed CRISTAL datacubes and FasterCubes
Minnesota Population Center
  Developed WendyCubes for data cubes
  WendyCubes and FasterCubes being merged
DataFerrett (Census)

Page 9: Update and Thoughts on Directions for Metadata Work

ISO11179

Written from the data producers’ perspective (Dan argues that it doesn’t take any perspective)

Able to relate survey instances, etc.

Isn’t capable of handling the full range of metadata we might need, nor can it handle data representations such as news releases, webpages, etc. (same problem with DDI)

Page 10: Update and Thoughts on Directions for Metadata Work

ISO11179 Implementations

StatCanada
  Dan G. has reservations about this implementation and feels it doesn’t meet the standard (more as I understand the problem better)

Page 11: Update and Thoughts on Directions for Metadata Work

Is CMR the answer?

CMR as a registry to describe data, data processes, and data quality, and which links to datasets and data
CMR incorporates all of ISO11179 and DDI; in addition, it can support a variety of metadata types (those news releases)
CMR not open source, cost unknown (software cost and Oracle consultants)
Two good contacts for us
  Dan has gotten for BLS
  Sarah Nusser acquiring for Iowa State

Page 12: Update and Thoughts on Directions for Metadata Work

Segue to Conceptual Task 2

My original goal was to determine what metadata elements would be necessary for a given end-user tool (e.g. the SIG) and determine which standard(s) could provide necessary functionality (enabling metadata to get from agencies to the user tools)

I started by looking at the SIG and also at DDI implementations to see what functionalities we could acquire

Page 13: Update and Thoughts on Directions for Metadata Work

The Plot Thickens

Two new questions emerged from these activities
  What functions/information (data & metadata) would be necessary in SKN?
  What other standards efforts should be considered in creating the SKN?

Page 14: Update and Thoughts on Directions for Metadata Work

The SKN Architecture

[Figure: SKN architecture diagram. Labelled components: agencies with multiple metadata repositories and agency backend data and metadata, behind a firewall; agency data with integrated metadata; a distributed public intermediary (variable/concept level, XML-based, incorporating ISO11179 and DDI, providing Java-based statistical literacy tools to the user interface); a statistical ontology; domain ontologies; domain experts; end-user communities; user interfaces; end users, who interact with data from an information/concept perspective, not just an agency perspective.]

Page 15: Update and Thoughts on Directions for Metadata Work

 

Standards, projects and their functions

INTERNAL TO AGENCIES: Agency data production
  CMR
  Proprietary metadata repositories
  Presentation formats (html, xml, pdf, etc.)
  Database formats (ACCESS, ALMIS)
  DDI Datacubes NESSTAR/Faster CRISTAL
  XML for Analysis
  Common Warehouse Metadata Model
  Statistical disclosure (SDC in Nesstar)
  StatCan ISO imp.

INTERNAL TO AGENCIES: Data archives
  DDI (and DDI for datacubes)
  NESSTAR/Faster CRISTAL

PUBLIC INTERMEDIARY
  Middleware (whatever that includes)
  NEOOM from Nesstar/Faster
  From Virtual Data Center (VDC): federated metadata harvesting, repository exchange and caching, federated authentication and authorization, naming

POSSIBLE SKN USER TOOLS/FUNCTIONS
  Searching: Z39.50
  Data analysis, bookmarking, downloading datasets (Nesstar)
  Cataloging, archiving functions (VDC)
  Online search, data conversion, exploration, data analysis (VDC)
  Glossary (The Neuchatel Group)
  Statistical Interactive Glossary (SIG, our project)
  Ontologies (ISI/Columbia for gas)
  Relation Browsers
  Online Help

TRANSFERS
  Z39.50 (used by VDC)
  Open Archives (VDC)
  DC, MARC, DDI metadata import and export (VDC)
  SOAP, HTTP, RDF (Nesstar)
  ASN.1

Page 16: Update and Thoughts on Directions for Metadata Work

Columns: Information/Metadata Needed; Task(s) using this metadata; ISO11179 or DDI map; Comments; Source of metadata

Information/Metadata Needed: Term name(s)
  Task(s) using this metadata: Search, presentation, and anchor for linking presentation
  ISO11179 or DDI map: ISO11179 data element name (if term is a variable or concept)
  Source of metadata: Agency content, GovStat ontology

Information/Metadata Needed: Definition
  Task(s) using this metadata: Provide content
  ISO11179 or DDI map: ISO11179 data element definition (if term is variable)
  Source of metadata: Agency documentation, statistics experts, statistics texts, GovStat ontology

Information/Metadata Needed: Examples, demonstrations, etc.
  Task(s) using this metadata: Provide content
  ISO11179 or DDI map: None within ISO11179 or DDI (though data elements in both might be usable)
  Comments: Values: audio, video, static text/graphic; under user control; links to more specific agency documentation
  Source of metadata: Agency content supplemented by designer

Information/Metadata Needed: Context specificity level of definition (e.g. statistic, table, agency)
  Task(s) using this metadata: Provides specific explanations
  ISO11179 or DDI map: None within ISO11179 or DDI
  Comments: Under user control; needs context information
  Source of metadata: User’s current webpage or table, GovStat ontology

Information/Metadata Needed: Format
  Task(s) using this metadata: Type of presentation
  ISO11179 or DDI map: None within ISO11179 or DDI
  Comments: Some terms may be better explained in some formats; user control
  Source of metadata: Current user interaction or preset preferences, computing capabilities, GovStat ontology
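One way to picture the information listed in this table is as a single record per glossary term. The Python sketch below is an assumption for illustration only: the class name `GlossaryEntry`, its field names, and the default values are hypothetical and do not come from the SIG, ISO11179, or DDI.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GlossaryEntry:
    """Hypothetical record holding the kinds of information listed in the table."""

    term_names: List[str]                  # used for search, presentation, linking
    definition: str                        # content shown to the user
    context_level: str = "statistic"       # e.g. statistic, table, agency
    presentation_format: str = "static text"
    examples: List[str] = field(default_factory=list)
    # Filled in only when the term is a variable or concept.
    iso11179_element_name: Optional[str] = None

    def matches(self, term: str) -> bool:
        """Case-insensitive match against any of the entry's term names."""
        return term.lower() in [name.lower() for name in self.term_names]

    @staticmethod
    def unmapped_fields() -> List[str]:
        """Fields the table marks as having no ISO11179 or DDI counterpart."""
        return ["examples", "context_level", "presentation_format"]


entry = GlossaryEntry(
    term_names=["index", "indices"],
    definition="(placeholder definition text)",
)
print(entry.matches("Indices"))
print(GlossaryEntry.unmapped_fields())
```

Keeping the unmapped fields explicit makes it easy to see which parts of such a record would need a source other than the two standards.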

Page 17: Update and Thoughts on Directions for Metadata Work

New Strategic Direction for Us?

Specification of metadata necessary throughout SKN?
  Will require specification of interactions among components of SKN
  And perhaps the specification of specific standards

Page 18: Update and Thoughts on Directions for Metadata Work

An example of a possible interaction

User, via interface: “I want data on gasoline price indices in the state of MD”

Query transferred to intermediary.

Intermediary query agent has a business rule requiring a check of terms, so it forwards the term “indices” to the SIG

Page 19: Update and Thoughts on Directions for Metadata Work

Example continued

SIG responds with 3 definitions of index (specificity of definition) and multiple display options
Intermediary business rule indicates to take the most general and to use the term “index” in queries sent to agency data sources
Etc.
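The interaction in this example can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any existing SKN component: `sig_lookup`, `most_general`, `rewrite_query`, the numeric `specificity` scale, and the placeholder definition texts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Definition:
    term: str         # canonical form of the term
    specificity: int  # 0 = most general; larger = more specific
    text: str


def sig_lookup(term: str) -> List[Definition]:
    """Hypothetical stand-in for the SIG: returns definitions at several levels."""
    if term.lower() in ("index", "indices"):
        return [
            Definition("index", 2, "(placeholder agency-level definition)"),
            Definition("index", 0, "(placeholder general definition)"),
            Definition("index", 1, "(placeholder table-level definition)"),
        ]
    return []


def most_general(definitions: List[Definition]) -> Definition:
    """Business rule from the example: take the most general definition."""
    return min(definitions, key=lambda d: d.specificity)


def rewrite_query(words: List[str], terms_to_check: List[str]) -> List[str]:
    """Check flagged terms with the SIG and substitute the canonical term."""
    rewritten = []
    for word in words:
        definitions = sig_lookup(word) if word in terms_to_check else []
        rewritten.append(most_general(definitions).term if definitions else word)
    return rewritten


user_query = "gasoline price indices MD".split()
agency_query = rewrite_query(user_query, terms_to_check=["indices"])
print(agency_query)  # ['gasoline', 'price', 'index', 'MD']
```

Even this toy version needs an agreed query shape, an agreed response shape, and a rule for choosing among responses, which are the kinds of things a specification of interactions would have to pin down.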

Page 20: Update and Thoughts on Directions for Metadata Work

New Strategic Direction for Us?

Specification of functions (and related information) necessary throughout SKN?
  Will require specification of interactions among components of SKN (possible queries, acceptable responses, bindings among agents, etc.)
  And perhaps the specification of specific standards