Standards landscape for micro and aggregated data

How the standards-based industrialization of statistical production fits into the picture (SDMX, DDI, GSBPM, GSIM,…)

11-13 September 2013, SDMX Global Conference 2013, OECD, Paris

Marco Pellegrino, Eurostat

Outline

1. Where are we?

2. The landscape: a portfolio of used standards

3. SDMX and DDI: a set of use cases

4. Conclusions and way forward

Where are we?

Dramatic changes in the environment of official statistics producers (e.g. data deluge)

Modernization of statistical information systems is seen as a question of survival for the sector of official statistics

Standardization is viewed as a key enabler for modernization

“Standards-based” industrialization of statistical production

Standardization

• Without a standardized concept of statistical production, we will not see:

– Economies of scale across statistical institutes internationally (shared solutions)

– Good vendor support for the industry

– Harmonization of statistical data (leading to more comparable data)

– Reusable, interoperable data for users

• Some major standards have emerged:

– Statistical Data and Metadata Exchange (SDMX)

– Data Documentation Initiative (DDI)

– RDF, Linked Open Data (LOD)

A portfolio of standards

SDMX: Preferred standard for exchange and sharing of data and metadata in the global statistical community (UNSC, 2008). Widely used in the ESS for aggregated data.

DDI (Data Documentation Initiative): Standard for the documentation of data, initially focused on archiving micro-data in the area of social sciences. Widely used in national data archives; extended to support the full life-cycle of data.

RDF: W3C standard for web-based discovery, dissemination, and linking; an alternative to XML. RDF vocabularies based on SDMX (data cube), DDI, and the Neuchatel classification model.

JSON: Web-developer-friendly alternative to XML. JSON version of SDMX.

XBRL: Standard for reporting accounting information and banking supervision data.

Characterizing the Standards: SDMX

• Describes the structure of aggregate/dimensional data (“structural metadata”)

• Provides formats for the dimensional data

• Provides a model of data reporting and dissemination

• Provides a way of describing and formatting stand-alone metadata sets (“reference metadata”)

• Provides standard registry interfaces, providing a catalogue of resources

• Provides guidelines for deploying standard web services for SDMX resources

• Provides a way of describing statistical processes
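
As a rough illustration of what "structural metadata" for dimensional data means, the sketch below models a structure definition as a set of dimensions with allowed codes and checks observation keys against it. It is plain Python, not the SDMX information model or any SDMX library; the class `DataStructure`, the dimension names `FREQ`, `GEO`, `UNIT`, and all codes are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass
class DataStructure:
    """Toy stand-in for structural metadata: dimensions and their allowed codes."""
    dimensions: dict  # dimension id -> set of allowed codes (hypothetical code lists)

    def validate(self, key: dict) -> list:
        """Return the problems found in an observation key (empty list if valid)."""
        problems = []
        for dim, codes in self.dimensions.items():
            if dim not in key:
                problems.append(f"missing dimension {dim}")
            elif key[dim] not in codes:
                problems.append(f"invalid code {key[dim]!r} for {dim}")
        for dim in key:
            if dim not in self.dimensions:
                problems.append(f"unknown dimension {dim}")
        return problems

    def series_key(self, key: dict) -> str:
        """Join the codes in dimension order into a dotted key."""
        return ".".join(key[dim] for dim in self.dimensions)


# Hypothetical structure and observation keys.
dsd = DataStructure({"FREQ": {"A", "Q"}, "GEO": {"FR", "DE"}, "UNIT": {"PC"}})
good = {"FREQ": "A", "GEO": "FR", "UNIT": "PC"}
bad = {"FREQ": "M", "GEO": "FR"}

print(dsd.validate(good))    # no problems reported
print(dsd.series_key(good))  # codes joined in dimension order
print(dsd.validate(bad))     # one invalid code, one missing dimension
```

The point of the sketch is only that the structure is described once, separately from the data, so that any sender or receiver can validate and address observations in the same way.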

Characterizing the Standards: DDI

DDI Lifecycle can provide a very detailed set of metadata covering:

– The study or series of studies

– Many aspects of data collection, including surveys and processing of microdata

– The structure of data files, including hierarchical files and those with complex relationships

– The lifecycle events and archiving of data files and their metadata

– The tabulation and processing of data into tables (NCubes)

• Allows for a link between the microdata variables and the resulting aggregates
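
The link between microdata variables and the resulting aggregates can be pictured with a small tabulation sketch. This is plain Python under stated assumptions, not DDI markup or a DDI tool: the records, the variable names (`region`, `sex`), and the `tabulate` helper are all hypothetical. The only idea it carries over from the slide is that the aggregate keeps a reference to the microdata variables it was derived from.

```python
from collections import Counter

# Hypothetical microdata records; variable names are illustrative only.
microdata = [
    {"person_id": 1, "region": "North", "sex": "F"},
    {"person_id": 2, "region": "North", "sex": "M"},
    {"person_id": 3, "region": "South", "sex": "F"},
    {"person_id": 4, "region": "North", "sex": "F"},
]


def tabulate(records, variables):
    """Count records per combination of the given microdata variables.

    The result records which source variables span the table, so the
    aggregate stays traceable to the microdata it was computed from.
    """
    cells = Counter(tuple(record[v] for v in variables) for record in records)
    return {
        "source_variables": list(variables),
        "measure": "count",
        "cells": dict(cells),
    }


cube = tabulate(microdata, ["region", "sex"])
for cell, count in sorted(cube["cells"].items()):
    print(cell, count)
```

Each cell of `cube` is addressed by the values of the source variables, which is the kind of traceability the bullet above refers to.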

Characterizing the Standards: RDF

• Allows for after-the-fact linking of any type of resource on the Web

– Data can be linked without either source knowing of the other’s existence

– Example: linking press releases and speeches with relevant data from a statistical organization

• Based on “triples” of subject, predicate, and object, enabling data to be linked

• Powerful query language (SPARQL) for distributed searches on the web

• Very popular with “open data” and “open government” initiatives
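
A minimal sketch of the triple idea, assuming nothing beyond plain Python: two sets of (subject, predicate, object) triples are published independently and then linked after the fact by taking their union. No RDF library is used, and every identifier (the `ex:` names, the predicates, the value) is made up for the example.

```python
# Two independently published sets of (subject, predicate, object) triples.
# All identifiers and the value are hypothetical.
stats_office = {
    ("ex:unemployment-2013Q2", "ex:refArea", "ex:FR"),
    ("ex:unemployment-2013Q2", "ex:value", "10.5"),
}
press_office = {
    ("ex:press-release-42", "ex:mentions", "ex:unemployment-2013Q2"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Linking after the fact: neither publisher had to know about the other.
merged = stats_office | press_office


def data_behind(document):
    """Follow 'mentions' links from a document to the values of the data it cites."""
    results = []
    for _, _, dataset in match(merged, s=document, p="ex:mentions"):
        for _, _, value in match(merged, s=dataset, p="ex:value"):
            results.append((dataset, value))
    return results


print(data_behind("ex:press-release-42"))
```

The pattern matching in `match` is a very reduced stand-in for what a query language over triples does; a real deployment would use shared URIs and a proper triple store.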

Characterizing the Standards: XBRL

• XML-based standard using linked taxonomies of various types

• No formal model

– Communities standardise the taxonomies to support their needs

– Mapping to other models requires an understanding of the implied model of the community

• Good tools are required to hide this complexity

SDMX and DDI together

• People have been discussing the use of SDMX and DDI together for some time (many technical similarities)

• Now, we are at the stage where implementations are being investigated and prototyped

• This is done in the context of the Generic Statistical Business Process Model (GSBPM)

– Idea of “industrialized” statistical production

– Strong emphasis on process management

[Figure: diagram labelled with DDI and SDMX; the image is not preserved in this transcript]

SDMX-DDI dialogue

Launched in 2010 with 3 goals:

1. To avoid duplication of efforts and thus avoid confusion about which standards should be used for specific types of applications

2. To provide reassurance to the user communities of DDI and SDMX that the end-to-end statistical process can be managed, and that standards bodies are considering the needs of users

3. To provide specific technical guidance about the use cases and implementation of the standards for specific purposes

Endorsed by DDI Alliance and SDMX Sponsors / Secretariat

Analysis of use cases for SDMX and DDI

Set of use cases where the two standards are compared:

1. Survey data collection

2. Administrative and register data

3. Combined use of DDI and SDMX

4. Micro-data access and on-demand tabulation of micro-data

5. Metadata and quality reporting

SDMX experts (TWG) and national experts are involved

ESS cross-cutting project on Information Models and Standards (IMS)

DDI and SDMX

• DDI offers a very rich model for the documentation of micro-data

• SDMX offers a very integrated exchange platform for statistical outputs (IT architectures, tools, web services)

The combined use of both standards could allow a higher level of integration of the complete production process

But: The devil is in the detail!

Generic Statistical Information Model (GSIM)

[Figure: “Common Generic Industrialised Statistics” diagram. Conceptual level: GSBPM (Business Concepts) and GSIM (Information Concepts). Practical level: Methods (Statistical HowTo) and Technology (Production HowTo). Also labelled: SDMX, DDI, RDF, ISO-11179, etc.]

Other relevant standards

[Figure: GSIM shown as the conceptual model, with DDI and SDMX as implementation standards]

Summary

• Standards are the key to enabling modernized statistical production

• Standards at different levels are being used in an increasingly coherent way

• GSBPM and GSIM provide conceptual models and facilitate communication

• SDMX, DDI and other standards provide implementation models which can be used in a coordinated way

• There are now more technologies than just GESMES and XML: a coherent overall model is critical
