Meta-data Resources in Europe: within NSIs and from DOSIS Projects
Wilfried Grossmann
Department of Statistics and Decision Support Systems
University of Vienna
29.3.2000 Metadata Resources in Europe 2
Contents
Introduction
Contents of Meta-data
IT Structures for Meta-data
Processing Meta-data
Conclusions
Introduction
Continuing hot topics in the meta-data discussion
Content-orientation versus IT-orientation
There is a lack of communication between these two groups
Introduction
Meta-data providers versus meta-data users
Who provides which type of information for whom?
Contents of Meta-data
What kind of objects should be documented?
Basic statistical structures, variables, values, data sets
Statistical output, statistical systems, statistical processing
Contents of Meta-data
Approaches towards meta-data content
The template oriented approach
The data warehouse approach
The process oriented approach
Contents of Meta-data
The template oriented approach
Templates defined by a number of working groups
For micro data and data sets: DDI, Dublin Core
For (economic) macro data: OECD, IMF, ECE (Internet)
Contents of Meta-data
The template oriented approach
The OECD Template:
Concepts and sources
Data Collection
Data manipulation by national source
Data quality
Data Transmission
International Standards
Data Storage and Manipulation by OECD
Output preparation and delivery by OECD
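A template of this kind can be treated as a simple checklist. The following is a minimal sketch, assuming a documentation record is held as a dictionary keyed by section name; the function name and the sample record are hypothetical, and only the section names come from the slide.

```python
# Section names follow the OECD template listed above.
# The record layout and the helper name are assumptions made for illustration.
OECD_TEMPLATE_SECTIONS = [
    "Concepts and sources",
    "Data collection",
    "Data manipulation by national source",
    "Data quality",
    "Data transmission",
    "International standards",
    "Data storage and manipulation by OECD",
    "Output preparation and delivery by OECD",
]


def missing_sections(document):
    """Return the template sections that are absent or empty in a record."""
    return [s for s in OECD_TEMPLATE_SECTIONS if not document.get(s, "").strip()]


# Invented sample record with two of the eight sections filled in.
example = {
    "Concepts and sources": "Quarterly labour force survey",
    "Data quality": "Sampling errors are published with each release",
}

print(missing_sections(example))
```

Such a check only shows whether a section has been filled in, not whether its content is adequate.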
Contents of Meta-data
The template oriented approach
The IMF Template:
Coverage
Periodicity
Timeliness
Quality of disseminated data
Integrity of disseminated data
Access by the public
Contents of Meta-data
The template oriented approach
Although the OECD approach seems more reliable from a statistical point of view, the IMF template is currently favoured by international organisations (EUROSTAT)
Contents of Meta-data
The warehouse approach
Integration of the data inside the NSIs in a data warehouse
Output and dissemination as a first step
Meta-data are oriented towards the needs of the data warehouse
Contents of Meta-data
The warehouse approach
Projects in this direction in many NSIs
Best documentation: Australian Office
Definitional meta-data
Procedural meta-data
Operational meta-data
Systems meta-data
Datasets meta-data
Contents of Meta-data
The process oriented approach
Combines statistical and IT considerations
Statistical data are considered not as final products but as the result of a process chain
More detailed consideration of statistical terminology
Contents of Meta-data
The process oriented approach
Starting point was the SCB-DOC model
(Rosen and Sundgren, 1991)
• A sequence of templates accompanying the statistical production process
• Ongoing activities at Statistics Sweden
• A number of NSIs want to adopt the model
Contents of Meta-data
The process oriented approach
The IDARESA model
Object oriented representation based on
SCB-DOC with emphasis on possible semi-automatic processing
Contents of Meta-data
The process oriented approach
The US Bureau of the Census model
(Gillman, Appel et al., ongoing project):
A statistical system is defined as an identifiable process .... to produce one or more deliverables
Contents of Meta-data
Summary
The process oriented approach seems favourable for a number of reasons
Two Examples:
Classification servers
Data Quality
Contents of Meta-data
Summary: Classification server
A classification server should
Support unified use of terminology inside NSIs or international organisations
Support harmonisation between (international) standard classifications and locally defined (adapted) classifications
Contents of Meta-data
Summary: Classification server
Requirements for a classification server
• A data base supporting easy and user friendly manipulation of hierarchy trees
• A mapping tool supporting the definition of correspondence tables between classifications
• A management strategy for implementation
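The first two requirements can be illustrated with a small data structure. This is a minimal sketch, assuming a classification is stored as a code-to-parent mapping and a correspondence table as a plain dictionary; the class, the function, and all codes are hypothetical and do not describe any existing classification server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Classification:
    """A classification held as a hierarchy tree: code -> parent code (None for a root)."""
    name: str
    parent: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, code: str, parent: Optional[str] = None) -> None:
        # Refuse to attach a code to a parent that has not been defined yet.
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent code {parent!r}")
        self.parent[code] = parent

    def ancestors(self, code: str) -> List[str]:
        """Walk from a code up to its root and return the codes passed on the way."""
        path = []
        current = self.parent[code]
        while current is not None:
            path.append(current)
            current = self.parent[current]
        return path


def translate(codes: List[str], correspondence: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Map local codes to standard codes; unmapped codes are reported, not guessed."""
    mapped = {c: correspondence[c] for c in codes if c in correspondence}
    unmapped = [c for c in codes if c not in correspondence]
    return mapped, unmapped


# Invented local classification and correspondence table.
local = Classification("local industry codes")
local.add("A")
local.add("A1", parent="A")
local.add("A1x", parent="A1")
correspondence = {"A1": "01", "A1x": "01.1"}

mapped, unmapped = translate(["A1", "A1x", "A"], correspondence)
print(local.ancestors("A1x"), mapped, unmapped)
```

Reporting unmapped codes separately keeps gaps in the correspondence table visible, which matters when a locally adapted classification is harmonised with a standard one.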
Contents of Meta-data
Summary: Classification server
Up to now, only a few successful implementations of partial solutions
EUROSTAT (SIMONE-Server)
New Zealand
Contents of Meta-data
Summary: Data Quality
Data quality: the criteria for quality of statistics are well known (relevance, accuracy, timeliness, accessibility, comparability, coherence, completeness)
The problem
• Achieve quality in the production process
• Document quality by appropriate meta-data
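Documenting quality as part of the production process could look roughly as follows. This is a minimal sketch, assuming one quality record is written per production step; the criteria names come from the slide, while the record layout, function names, and sample notes are hypothetical.

```python
# The seven criteria are those listed above; everything else is an assumption.
QUALITY_CRITERIA = (
    "relevance",
    "accuracy",
    "timeliness",
    "accessibility",
    "comparability",
    "coherence",
    "completeness",
)


def record_quality(step, **notes):
    """Create a quality record for one production step, rejecting unknown criteria."""
    unknown = set(notes) - set(QUALITY_CRITERIA)
    if unknown:
        raise ValueError(f"unknown quality criteria: {sorted(unknown)}")
    return {"step": step, "quality": {c: notes.get(c) for c in QUALITY_CRITERIA}}


def undocumented(record):
    """List the criteria for which the step has no quality note yet."""
    return [c for c, note in record["quality"].items() if note is None]


# Invented sample: an editing step with two criteria documented.
editing = record_quality(
    "data editing",
    accuracy="3% of records corrected by automatic edits",
    completeness="item non-response imputed for income",
)
print(undocumented(editing))
```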
Contents of Meta-data
Summary: Data Quality
Experience shows that the quality of documentation becomes rather poor as soon as documentation is separated from the production process
Example of an integration project: the SIDI approach by ISTAT
IT Structures for Meta-data
The Internet and data warehouses offer new opportunities for
Meta-data and data repositories
Meta-data access and exchange
They lead towards a more open policy in data dissemination
IT Structures for Meta-data
Meta-data repositories
Approaches towards repositories
The thesaurus approach
The template oriented approach
The Data Warehouse oriented approach
IT Structures for Meta-data
Meta-data repositories
Example of a thesaurus oriented approach
EUROSTAT servers for concepts and definitions
• Advantage: available on the Internet
• Problem: navigation is not easy
IT Structures for Meta-data
Meta-data repositories
• Contents
– Descriptions (dictionaries)
– Semantic (coverage, standard classifications, coherence of information)
– Administration (responsible persons)
– Selection (keywords, search facilities)
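The four content groups listed above map naturally onto the fields of a repository entry. This is a minimal sketch, assuming one entry per documented concept and a simple keyword lookup; the class, field names, and sample entries are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RepositoryEntry:
    """One entry of a hypothetical meta-data repository."""
    name: str
    description: str                                            # Descriptions (dictionary text)
    classifications: List[str] = field(default_factory=list)    # Semantic
    responsible: str = ""                                       # Administration
    keywords: Set[str] = field(default_factory=set)             # Selection


def search(entries, keyword):
    """Return the names of all entries tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [e.name for e in entries if wanted in {k.lower() for k in e.keywords}]


# Invented sample entries.
entries = [
    RepositoryEntry(
        name="unemployment rate",
        description="Unemployed persons as a share of the labour force",
        classifications=["ISCO"],
        responsible="Labour statistics unit",
        keywords={"Labour", "Unemployment"},
    ),
    RepositoryEntry(
        name="turnover",
        description="Total invoiced sales of an enterprise",
        classifications=["NACE"],
        responsible="Business statistics unit",
        keywords={"Business"},
    ),
]

print(search(entries, "labour"))
```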
IT Structures for Meta-data
Meta-data repositories
Example of the template oriented approach
StatBase: supporting access to meta-data as well as data and reports
• Meets the requirements of the OECD data template quite well
• No direct connection between data and meta-data
IT Structures for Meta-data
Meta-data repositories
Example of the warehouse oriented approach
StatLine (CBS): based on data access from multidimensional tables (cubes)
• Accompanying meta-information is available only in Dutch
• Extraction of special meta-data items is not as easy as in StatBase
IT Structures for Meta-data
Meta-data access and exchange
Ongoing work in access and exchange
New Standards for access and exchange
Accessing distributed sources
Combination of information
IT Structures for Meta-data
Meta-data access and exchange
Current trends in standardization
• Traditional standards for data and meta-data exchange, such as GESMES or CLASET, will probably switch to an XML platform.
• New standards from the Object Management Group (OMG)
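To illustrate what an XML based exchange of meta-data involves, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`dataset`, `variable`, `name`) are invented for this example and are not taken from GESMES, CLASET, or any OMG specification.

```python
import xml.etree.ElementTree as ET


def metadata_to_xml(dataset_name, variables):
    """Serialise variable definitions to an XML string (hypothetical element names)."""
    root = ET.Element("dataset", name=dataset_name)
    for var_name, definition in variables.items():
        var = ET.SubElement(root, "variable", name=var_name)
        var.text = definition
    return ET.tostring(root, encoding="unicode")


def xml_to_metadata(text):
    """Parse the XML string back into a dataset name and a dictionary of definitions."""
    root = ET.fromstring(text)
    variables = {v.get("name"): v.text for v in root.findall("variable")}
    return root.get("name"), variables


# Invented sample meta-data, written out and read back.
xml_text = metadata_to_xml("household survey", {"age": "Age in completed years"})
print(xml_text)
print(xml_to_metadata(xml_text))
```

A round trip like this only shows the mechanics; an exchange standard additionally has to fix the vocabulary and structure that both sides agree on.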
IT Structures for Meta-data
Meta-data access and exchange
Example MOF (Meta Object Facility)
– Extensible Framework for meta-data model definition
– Programming interface for storage and access of meta-data
– Integration facilities across domains
But note: this is a general approach for warehouses, not necessarily tied to statistics
IT Structures for Meta-data
Meta-data access and exchange
Example of accessing and processing distributed sources
ADDSIA: Accessing and processing distributed sources for analysis purposes
• Minimum requirements for standardisation in advance
• Orientation towards statistical problems
Processing Meta-data
Goal: Data and meta-data are processed together
<OldDataSets, OldMetadataSets> → <NewData, NewMetadata>
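The idea of processing data and meta-data as a pair can be sketched with a single operation that returns both. This is a minimal illustration, assuming data are lists of dictionaries and meta-data are plain dictionaries with `name` and `variables` entries; the function and the layout are hypothetical and do not reproduce any of the prototypes discussed here.

```python
def aggregate_by(data, metadata, key, value):
    """Sum `value` within each `key` group and update the meta-data in the same step."""
    totals = {}
    for row in data:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    new_data = [{key: k, value: v} for k, v in totals.items()]

    new_metadata = {
        "name": metadata["name"] + "_by_" + key,
        "derived_from": metadata["name"],
        "variables": {
            # The grouping variable keeps its definition.
            key: metadata["variables"][key],
            # The aggregated variable gets an amended definition.
            value: metadata["variables"][value] + " (sum by " + key + ")",
        },
    }
    return new_data, new_metadata


# Invented sample data set with its meta-data.
old_data = [
    {"region": "N", "turnover": 10},
    {"region": "S", "turnover": 5},
    {"region": "N", "turnover": 7},
]
old_metadata = {
    "name": "firms",
    "variables": {"region": "Region of registration", "turnover": "Annual turnover"},
}

new_data, new_metadata = aggregate_by(old_data, old_metadata, "region", "turnover")
print(new_data)
print(new_metadata)
```

Because the meta-data are produced by the same call as the data, the documentation cannot silently fall behind the data it describes.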
Processing Meta-data
Advantages
• Reduction of documentation effort
• More consistency in meta-data
Requirements
• Software tools supporting this view
• Operational models for meta-data
Processing Meta-data
Up to now, only prototypes exist, with emphasis on different aspects of processing
The planning approach
The throughput approach
The transformation approach
Processing Meta-data
The planning approach
Develop software tools (workbench) for setting up meta-data documentation
BRIDGE/IMIM:
• A desktop for planning surveys and statistical production
• Meta-data generated in the planning phase are managed by the system
• No data are processed
Processing Meta-data
The planning approach
Improvement and adaptation of meta-data models for new tasks such as quality and the use of administrative sources
SIDI (Statistics Italy)
• Integration of quality in the statistical production process
• Standardization of the production process
Processing Meta-data
The throughput approach
Use as much meta-data as possible from OldMeta-data to obtain NewMeta-data
CBS (ongoing work):
• Use BLAISE meta-data as input
• Produce StatLine meta-data as output
Processing Meta-data
The transformation approach
Define meta-data algorithms for all types of data algorithms
• Throughput meta-data
• Modified meta-data
• New meta-data
• Meta-data summarization
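A meta-data algorithm attached to a data algorithm might look as follows for a simple row selection. This is a minimal sketch under one possible reading of the four terms above (throughput as copied unchanged, modified as adjusted, new as created by the operation, summarization as a condensed description of the result); the meta-data layout and all names are hypothetical.

```python
def select_rows(data, metadata, variable, allowed):
    """Keep the rows whose `variable` is in `allowed` and derive matching meta-data."""
    new_data = [row for row in data if row[variable] in allowed]
    values = sorted(allowed)
    new_metadata = {
        # Throughput: variable definitions are carried over unchanged.
        "variables": dict(metadata["variables"]),
        # Modified: the population description is narrowed by the selection.
        "population": f'{metadata["population"]}, restricted to {variable} in {values}',
        # New: information that exists only because of this operation.
        "operation": {"type": "selection", "variable": variable, "values": values},
        # Summarization: a condensed description of the resulting data set.
        "record_count": len(new_data),
    }
    return new_data, new_metadata


# Invented sample data set with its meta-data.
households = [
    {"country": "AT", "size": 1},
    {"country": "SE", "size": 2},
    {"country": "AT", "size": 3},
]
households_meta = {
    "variables": {"country": "Country of residence", "size": "Household size"},
    "population": "all households",
}

selected, selected_meta = select_rows(households, households_meta, "country", {"AT"})
print(selected)
print(selected_meta)
```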
Processing Meta-data
The transformation approach
IDARESA project
Meta-data algorithms for elementary data base operations
ISMIS
Identification of added value in meta-data (new meta-data)
Tracking of the production process inside EUROSTAT
Processing Meta-data
The transformation approach
[Figure: process chain of the transformation approach. Every data set is paired with its meta-data: input data 1 to 7 with input meta-data 1 to 7, interim data 1 to 5 with interim meta-data 1 to 5, and the output data with the output meta-data.]
Conclusions
Is there progress in meta-data research and development?
Yes, but rather slowly, because
• There is a lack of co-ordination in research (probably to be improved by a forthcoming meta-data working group)
• There is an information gap between meta-data research groups and NSIs
• NSIs seem to prefer their own solutions