Eurostat – Directorate B: Corporate statistical and IT services

SDMX Basics Training – 2013

SDMX basics

Marco Pellegrino

Eurostat, Directorate B

Purpose of this training session

At the end of this session you will:

– Know the basics of the SDMX model

– Understand the techniques to identify the structure of data

– Identify the concepts in a simple data set

– Be able to develop simple data structure definitions using SDMX tools

– Be familiar with the main IT architecture and tools used by Eurostat for SDMX implementation projects

World Bank, UNSD

Statistical Data and Metadata eXchange

SDMX: ISO IS 17369

What is a standard?

According to ISO, a standard is a document, established by consensus and approved by a recognized body, that provides, for common and repeated use, rules, guidelines or characteristics for activities or their results, aimed at the achievement of the optimum degree of order in a given context.

Lack of standardisation in data exchanges or across organisations

Different formats of data and metadata

– EDIFACT
– Structured files
– XML
– Paper form

Different places to store data and metadata

Different media

– Email
– File upload
– Web form
– Removable media
– Dial-up
– Paper

WHAT SDMX IS

This is what SDMX provides and enables

A model to describe statistical data and metadata

A standard for automated communication from machine to machine

A technology supporting standardised IT tools

In order to take advantage of all this:

Statisticians agree to use a common description for data and metadata

The data exchange process is then driven by the common description

Data descriptions are made available for everybody who wants to understand and reuse the data

From version 1.0 to version 2.1 to…?

– September 2004: Version 1.0 (GESMES/TS)
– ISO/TS 17369
– November 2005: Version 2.0 (SDMX-EDI, SDMX-ML, SDMX Registry)
– February 2008: SDMX accepted at UN level; recognised and supported as the preferred standard
– April 2011: SDMX 2.1

All good standards change…

All standards change over time, and are released in a series of versions

Changes always have some impact on users

– Users are not expected to always use the latest version of a standard

– Standards organisations (like SDMX) have to provide support for several versions of the standard, all of which are in use

Change management

Danger (1): too much change may discourage adoption

Danger (2): not giving users the functionalities they want will discourage adoption

Need to find a balance

THE SDMX COMPONENTS

Technical Specifications

– The SDMX Information Model

Guidelines to Harmonise Content

– Content-oriented Guidelines (COG)

Tools

– IT Architectures for data exchange
– SDMX compliant tools

Models?

A model is a partial analogy of a system

René Magritte

“This is not a pipe”

The analogy between the model and the represented reality is partial.

The properties of the model are not identical to the properties of the reality.

I can’t smoke with this pipe!

The four meta-modelling levels

Real data (e.g. BOP, ESA)

Data model: concepts, codes, DSD

SDMX metamodel

A model represents a system and conforms to a metamodel

The Generic Statistical Business Process Model

Overall integration of methods and techniques

[Figure: process phases Design, Build, Collect, Process, Disseminate, Use. Labels: DATA, DEFINITIONS, Software Services, Administrator, User, Information Model]

The role of the Information model

A user-level formal language to:

• express, agree and design information needs

• give specifications to reporting agents

• communicate with IT people

• drive the software (which doesn’t change)

• document the system

User autonomy

Flexible information system, evolving fast & cheaply

SDMX Information Model (“metamodel”)

Provides a way of modelling data, metadata and exchange processes

[Figure labels: Dataset, Structure, Data, Structural Metadata, Identify/Describe]

Data Structure Definition (DSD)

– Dimensions (ex: country, variable/topic, year)
– Attributes (ex: unit of measure): metadata about an individual value, a time series or a group of time series
– Code lists
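The Data Structure Definition outlined above (dimensions, attributes and code lists) can be sketched as plain Python data classes. This is a minimal illustration only, not the SDMX Information Model itself: the class names, the identifiers (DEMO_DSD, CL_AREA, CL_UNIT) and the codes are all hypothetical, and OBS_VALUE is assumed as the measure name.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CodeList:
    """A named set of allowed codes, each with a human-readable label."""
    id: str
    codes: Dict[str, str]  # code -> label


@dataclass
class Component:
    """A concept used in a structure, optionally restricted to a code list."""
    concept_id: str
    code_list: Optional[CodeList] = None


@dataclass
class DataStructureDefinition:
    """Simplified DSD: dimensions identify a value, attributes qualify it."""
    id: str
    dimensions: List[Component] = field(default_factory=list)
    attributes: List[Component] = field(default_factory=list)
    measure: str = "OBS_VALUE"  # assumed name for the observation value

    def code_lists(self) -> List[str]:
        """Return the ids of all code lists referenced by the structure."""
        seen: List[str] = []
        for comp in self.dimensions + self.attributes:
            if comp.code_list and comp.code_list.id not in seen:
                seen.append(comp.code_list.id)
        return seen


# Hypothetical code lists and structure, mirroring the slide's examples
# (country, variable/topic, year as dimensions; unit of measure as attribute).
cl_area = CodeList("CL_AREA", {"FR": "France", "IT": "Italy"})
cl_unit = CodeList("CL_UNIT", {"EUR": "Euro", "PC": "Percentage"})

dsd = DataStructureDefinition(
    id="DEMO_DSD",
    dimensions=[
        Component("COUNTRY", cl_area),
        Component("TOPIC"),
        Component("YEAR"),
    ],
    attributes=[Component("UNIT_MEASURE", cl_unit)],
)

print([d.concept_id for d in dsd.dimensions])
print(dsd.code_lists())
```

The point of the sketch is that the structure is itself data: software can read it to learn which concepts and code lists a data set uses, without being changed for each new data set.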

Describing the data exchange

Who?

What?

When?

Where?

How?

Data Structure Definition: Concept Usage

– Topic (Dimension)
– Country (Dimension)
– Stock/Flow (Dimension)
– Time/Frequency (Dimension, Dimension)
– Unit (Attribute)
– Unit Multiplier (Attribute)
– Observation (Measure)

Data Structure Definition: Defining Multi-dimensional Structures

• Comprises

– Concepts that identify the observation value (Dimensions)
– Concepts that add additional metadata about the observation value (Attributes)
– Concept that is the observation value (Measure)
– Any of these may be coded, text, date/time, number, etc. (Representation)
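The roles and representations listed above can be illustrated with a small validation sketch. It assumes a hypothetical structure in which each concept has a role (dimension, attribute or measure) and a representation (a set of codes, text, or a number); the concept names, the codes and the rule that attributes are optional are assumptions made for this example. The dimension values are joined with dots to form a series key, which is the convention used for keys in SDMX REST queries.

```python
# Hypothetical structure: concept -> (role, representation).
# A set means "coded"; a type (or tuple of types) means free text or number.
STRUCTURE = {
    "FREQ": ("dimension", {"A", "Q", "M"}),
    "REF_AREA": ("dimension", {"FR", "IT"}),
    "TOPIC": ("dimension", str),
    "UNIT": ("attribute", {"EUR", "USD"}),
    "OBS_VALUE": ("measure", (int, float)),
}


def conforms(value, representation) -> bool:
    """Check one value against its representation."""
    if isinstance(representation, set):
        return value in representation  # coded: must be a known code
    return isinstance(value, representation)  # text or number


def validate(observation: dict) -> list:
    """Return a list of problems; an empty list means the observation fits."""
    errors = []
    for concept, (role, representation) in STRUCTURE.items():
        if concept not in observation:
            # Simplification: attributes are treated as optional here.
            if role != "attribute":
                errors.append(f"missing {role} {concept}")
            continue
        if not conforms(observation[concept], representation):
            errors.append(
                f"invalid value for {concept}: {observation[concept]!r}"
            )
    return errors


def series_key(observation: dict) -> str:
    """Join the dimension values, in structure order, with dots."""
    dims = [c for c, (role, _) in STRUCTURE.items() if role == "dimension"]
    return ".".join(str(observation[c]) for c in dims)


good = {"FREQ": "A", "REF_AREA": "FR", "TOPIC": "GDP",
        "UNIT": "EUR", "OBS_VALUE": 2.5}
print(validate(good))
print(series_key(good))
```

Only the dimensions take part in the key, because they are the concepts that identify the observation; attributes and the measure describe or carry the value.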

Use of cross-domain concepts

Domain 1

Cross-domain concepts and code lists

FREQ

REF. AREA

Domain 2

Set of used concepts

Cross-domain concepts

COMPARABILITY
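The comparability argument above can be made concrete with a short sketch: two domains that draw some of their concepts from a common cross-domain pool can be compared along exactly those shared concepts. FREQ and REF_AREA come from the slide; every other concept name and both domain definitions are hypothetical.

```python
# Illustrative pool of cross-domain concepts (only FREQ and REF_AREA
# appear on the slide; the others are assumed for the example).
CROSS_DOMAIN = {"FREQ", "REF_AREA", "UNIT_MEASURE", "OBS_STATUS"}

# Hypothetical sets of concepts used by two statistical domains.
domain_1 = {"FREQ", "REF_AREA", "BOP_ITEM"}
domain_2 = {"FREQ", "REF_AREA", "ACTIVITY", "UNIT_MEASURE"}


def shared_concepts(a: set, b: set) -> list:
    """Concepts along which data from the two domains can be compared."""
    return sorted(a & b)


def domain_specific(domain: set) -> list:
    """Concepts a domain defines on its own, outside the shared pool."""
    return sorted(domain - CROSS_DOMAIN)


print(shared_concepts(domain_1, domain_2))
print(domain_specific(domain_1))
print(domain_specific(domain_2))
```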

Content-Oriented guidelines

Recommendations to harmonise implementations

– Cross-domain concepts and code lists
– Statistical subject-matter domains (based on the UNECE Classification of International Statistical Activities)
– Metadata Common Vocabulary

Organisation 1, Organisation 2, Organisation 3: interoperability

Some benefits from SDMX standards

SDMX provides support for things that are essential to statisticians, but are often difficult for them to achieve

International standard for holding all of the elements involved in the statistical process together in a clear information model

Approach that maximises the amount of information on the statistical context that can be passed through to users, and the capacity of linking statistics from similar or different sources

Automation of processes: SDMX enables the development of common tools that can be used by all statistical organisations to improve their activities

Benefits from SDMX standards (2)

SDMX Reference Infrastructure

Statistical Organisation

Web services enable query, visualisation, and automated loading of data and metadata. SDMX tools allow querying a database, or a file system, for the creation of tables, charts, and graphs from the results of the query.

SDMX is also an advanced standard for data discovery using web-based services
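As an illustration of querying data through web services, the sketch below builds a data query URL in the style of the SDMX 2.1 RESTful interface: a data resource, a dataflow identifier, a dot-separated key with "+" for alternatives and an empty position as a wildcard, and optional startPeriod and endPeriod parameters. The base URL, the dataflow id DEMO_FLOW and the function name are assumptions; the sketch only builds the URL and does not contact any service.

```python
from urllib.parse import quote, urlencode


def build_data_query(base_url, flow, key_parts,
                     start_period=None, end_period=None):
    """Build a data query URL from dimension selections.

    key_parts holds one list per dimension, in structure order:
    several values are OR-ed with '+', an empty list selects everything.
    """
    key = ".".join("+".join(values) for values in key_parts)
    url = f"{base_url.rstrip('/')}/data/{quote(flow)}/{quote(key, safe='.+')}"
    params = {
        name: value
        for name, value in (("startPeriod", start_period),
                            ("endPeriod", end_period))
        if value
    }
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical service and dataflow: annual data for France or Italy,
# any topic, for the years 2010 to 2012.
url = build_data_query(
    "https://example.org/sdmx/rest",
    "DEMO_FLOW",
    [["A"], ["FR", "IT"], []],
    start_period="2010",
    end_period="2012",
)
print(url)
```

Because the key follows the order of the dimensions in the data structure definition, a client that has read the structure can build such queries without any service-specific code.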

by end of June

Organisation scheme

Concepts

Codelists

Concept Schemes

Provision Agreement

SDMX describes the data and metadata exchange

DSD

maintainer

SDMX Registry

Data Repository (Warehousing) Architecture

NSI

Eurostat

Pull Requestor

eDAMIS

Data Input

SDMX Registry

Intermediate storage

Verification / Conversion to SDMX

Received data in SDMX-ML

Loader

register

Warehouse storage

Eurobase

query

Dissemination

XSL for SDMX-ML

PULL

PUSH

The SDMX Hub

[Figure: the SDMX Hub. Labels: Data Providing Organizations, Data collector Organizations, Users, Data warehouse, SDMX-RI (web service), Data Hub, SDMX messages]

SDMX progress, 2011 to 2015

Standards Development: April 2011, SDMX 2.1 Technical Standards released at sdmx.org

May 2011: SDMX Global Conference in Washington, D.C.

Next: 11-13 September 2013 (OECD, Paris)

Self-learning tutorials comprising video, textbook and self-test.

Governance: Creation of two SDMX Working Groups (Technical Working Group and Statistical Working Group)

Action Plan 2011 to 2015

How to know more about SDMX

http://sdmx.org/

http://epp.eurostat.ec.europa.eu/portal/page/portal/pgp_ess/news/ess_news_detail?id=112774074&pg_id=2417&cc=ESTAT_EUROSTAT

https://webgate.ec.europa.eu/fpfis/mwikis/sdmx

Training courses on SDMX

SDMX basics (for statisticians and IT staff): held at Eurostat. Aimed at people in charge of managing SDMX-based transmission and dissemination of data and metadata.

SDMX advanced course (for IT developers): held at Eurostat. Targeted at IT developers and offered in two versions:

– Java programmers
– .NET programmers

ESTP course on “Advanced technologies for data collection and transmission” (external)

For more information

http://www.sdmx.org (SDMX web site)

https://webgate.ec.europa.eu/fpfis/mwikis/sdmx (Eurostat Info Space)

[email protected] (General info on SDMX)

[email protected] (Eurostat implementation projects)

[email protected]