ESCWA SDMX Workshop
Session: SDMX and Data
Session Objectives
• At the end of this session you will:
  – Know the SDMX model of a data structure definition
  – Understand the techniques to identify the structure of data
  – Identify the concepts in a simple data set
  – Be able to develop simple data structure definitions using SDMX tools
Data Set
Data Set: Structure
Data Set Structure
• Computers need to know the structure of data in terms of:
  – Concepts
  – Code Lists
  – Dimensionality
  – Additional metadata
First: Identify the Concepts
• A concept is a unit of knowledge created by a unique combination of characteristics (SDMX Information Model)
Unit Multiplier
Unit
Topic
Time/Frequency
Country
Stock/Flow
Data Set Structure: Concepts
Data Set Structure: Code Lists
Code Lists
TOPIC
A Brady Bonds
B Bank Loans
C Debt Securities
COUNTRY
AR Argentina
MX Mexico
ZA South Africa
STOCK/FLOW
1 Stock
2 Flow
CONCEPTS
Topic
Country
Flow
Concepts
16457
Q,ZA,B,1,1999-06-30=16457
Data Makes Sense
Data Set Structure: Defining Multi-dimensional Structures
• Comprises
  – Concepts that identify the observation value (Dimensions)
  – Concepts that add additional metadata about the observation value (Attributes)
  – Concept that is the observation value (Measure)
  – Any of these may be (Representation)
    • coded
    • text
    • date/time
    • number
    • etc.
Data Set Structure: Concept Usage
• Unit Multiplier (Attribute)
• Unit (Attribute)
• Topic (Dimension)
• Time (Dimension)
• Frequency (Dimension)
• Country (Dimension)
• Stock/Flow (Dimension)
• Observation (Measure)
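The concept usage above can be written down as a small data structure. The following is a minimal Python sketch, not part of any SDMX tool: the class name `Component`, its fields and the helper `by_role` are hypothetical names chosen for illustration, and the roles are those shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical illustration: a concept together with the role it plays."""
    concept: str  # the concept the component takes its meaning from
    role: str     # "dimension", "attribute" or "measure"


# Concept usage as shown on the slide.
COMPONENTS = [
    Component("Frequency", "dimension"),
    Component("Country", "dimension"),
    Component("Topic", "dimension"),
    Component("Stock/Flow", "dimension"),
    Component("Time", "dimension"),
    Component("Unit", "attribute"),
    Component("Unit Multiplier", "attribute"),
    Component("Observation", "measure"),
]


def by_role(role: str) -> list[str]:
    """Return the concepts that play the given role, in declaration order."""
    return [c.concept for c in COMPONENTS if c.role == role]


dimensions = by_role("dimension")
attributes = by_role("attribute")
measures = by_role("measure")
print(dimensions, attributes, measures)
```

Keeping the role next to the concept makes the split explicit: five concepts identify the observation, two describe it, and one is the observed value.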
Data Structure Definition
[Diagram: the Data Structure Definition model]
• Dimensions: concepts that identify the observation
• Group Key: concepts that identify groups of keys
• Attributes: concepts that add metadata
• Measures: concepts that are observed phenomena
• Each dimension, attribute and measure takes its semantic from a Concept and has a format (its Representation), which is either coded (has a Code List) or non-coded
• Example concepts: Topic, Country, Flow
• Example code list: TOPIC (A Brady Bonds, B Bank Loans, C Debt Securities)
Data Makes Sense
16457
Q,ZA,B,1,1999-06-30=16457
Frequency,Country,Topic,Stock/Flow,Time=Observation
Quarterly, South Africa, Bank Loans, Stocks, 2nd quarter 1999
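The decoding of the key above can be sketched in Python. This is a minimal illustration under stated assumptions, not an SDMX tool: the topic, country and stock/flow code lists are copied from the slides, the frequency code (Q = Quarterly) is inferred from the decoded example, and the function name `decode_key` is hypothetical.

```python
# Code lists from the slides; the frequency code is inferred from the example.
CODE_LISTS = {
    "Frequency": {"Q": "Quarterly"},
    "Country": {"AR": "Argentina", "MX": "Mexico", "ZA": "South Africa"},
    "Topic": {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"},
    "Stock/Flow": {"1": "Stock", "2": "Flow"},
}

# Order of the dimensions in the key: Frequency,Country,Topic,Stock/Flow,Time
KEY_ORDER = ["Frequency", "Country", "Topic", "Stock/Flow", "Time"]


def decode_key(record: str) -> dict:
    """Split 'key=value' and translate each coded dimension via its code list."""
    key, value = record.split("=")
    codes = key.split(",")
    if len(codes) != len(KEY_ORDER):
        raise ValueError(f"expected {len(KEY_ORDER)} key parts, got {len(codes)}")
    decoded = {}
    for concept, code in zip(KEY_ORDER, codes):
        code_list = CODE_LISTS.get(concept)
        # Time has no code list (non-coded): keep the date as written.
        decoded[concept] = code_list[code] if code_list else code
    decoded["Observation"] = float(value)
    return decoded


result = decode_key("Q,ZA,B,1,1999-06-30=16457")
print(result)
```

Without the key order and the code lists, the number 16457 carries no meaning; with them, every part of the record can be read back as a concept value.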
Identifying Concepts
• Identifying Concepts - Sources
  – Existing data set tables
    • From website
    • From applications
  – Data Collection Instruments
    • Questionnaires
    • Excel spreadsheets
  – Regulations, Handbooks, User Guides
    • Labour Statistics Convention, 1985 (No. 160), Recommendation, 1985 (No. 170)
    • Council Regulation No: 311/76/EEC of 09/02/1976; OJ: L039 of 14/02/1976; Compilation of statistics on foreign workers
  – Database Tables
  – Existing Data Structure Definitions
    • From other organisations
Identify Concepts – from website
Source: FAO proof of concept project
Measurement = 1,000 Kg
Concepts
Reference Region
Commodity
Frequency and Time
Observation Value
Measure Type
Unit and Unit Multiplier
Measurement = 1,000 Kg
Exercise: Identify Concept Role
Concept Role: Reminder
• Dimensions
  – Are the concepts that identify the observation value
• Attributes
  – Are the concepts that add additional metadata about the observation value
• Measure
  – Is the concept that is the observation value
Exercise: Concept Role
• Reference Region (Dimension)
• Commodity (Dimension)
• Frequency and Time (Dimensions)
• Observation Value (Measure)
• Measure Type (Dimension)
• Unit and Unit Multiplier (Attributes)
Measurement = 1,000 Kg
Data Set and Structure
Dimension Concept
FREQ
REF_AREA_REG
COMMODITY
MEASURE_TYPE
TIME
Measure Concept
OBS_VALUE
Attribute Concept
OBS_STATUS
OBS_CONF
UNIT
UNIT_MULTIPLIER
Identify/Define Code Lists
• Purpose of a Code List
  – Constrains the value domain of concepts when used in a structure like a data structure definition
  – Defines a shortened, language-independent representation of the values
  – Gives semantic meaning to the values, possibly in multiple languages
• Agreeing on harmonised code lists is the most difficult aspect of defining a data structure definition
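The three purposes of a code list can be illustrated with a short Python sketch. The stock/flow codes are taken from the earlier slide; the French labels, the dictionary layout and the function names `is_valid` and `label` are assumptions made for this example only.

```python
# Each code maps to labels in several languages.
# The French labels are illustrative and not part of the workshop material.
CL_STOCK_FLOW = {
    "1": {"en": "Stock", "fr": "Stock"},
    "2": {"en": "Flow", "fr": "Flux"},
}


def is_valid(code: str) -> bool:
    """The code list constrains the value domain: only listed codes are allowed."""
    return code in CL_STOCK_FLOW


def label(code: str, lang: str = "en") -> str:
    """The code is language independent; its meaning is given per language."""
    return CL_STOCK_FLOW[code][lang]


print(is_valid("2"), is_valid("3"))
print(label("2", "en"), label("2", "fr"))
```

The data itself only ever carries the short code, so the same data set can be displayed in any language for which labels exist.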
Code Lists Required
Source: FAO proof of concept project
Reference Region
Commodity
Frequency
Measure Type
Unit and Unit Multiplier
Measurement = 1,000 Kg
Code Lists
Code Lists
Code Lists (CL_)
For Time Series the SDMX Cross Domain Concepts recommend all observations have a status code (Concept = OBS_STATUS) and a confidentiality code (Concept = OBS_CONF)
Data Structure Definition - Reminder
[Diagram: the Data Structure Definition model. Dimensions (concepts that identify the observation), Group Keys (concepts that identify groups of keys), Attributes (concepts that add metadata) and Measures (concepts that are observed phenomena) each take their semantic from a Concept and have a format, their Representation, which is either coded (has a Code List) or non-coded.]
Data Structure Definition - Agriculture
Data Structure Definition: AGRICULTURE_COMMODITY
• Dimensions: FREQ, REF_AREA_REG, COMMODITY, MEASURE_TYPE, TIME
• Measure: OBS_VALUE
• Attributes: OBS_STATUS, OBS_CONF, UNIT, UNIT_MULT
• Code lists for coded dimensions: CL_FREQ, CL_AREA_CTY, CL_COMMODITY, CL_MEASURE_ELEMENT
• Code lists for coded attributes: CL_OBS_STATUS, CL_OBS_CONF, CL_UNIT, CL_UNIT_MULT
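Once the components of the agriculture structure are known, a record can be checked against them. The sketch below is a hypothetical Python illustration, not an SDMX validation tool: the component identifiers come from the slide, while the dictionary layout, the function names and every value in the sample record are invented for the example.

```python
# Component identifiers as listed on the slide.
DSD = {
    "id": "AGRICULTURE_COMMODITY",
    "dimensions": ["FREQ", "REF_AREA_REG", "COMMODITY", "MEASURE_TYPE", "TIME"],
    "measure": "OBS_VALUE",
    "attributes": ["OBS_STATUS", "OBS_CONF", "UNIT", "UNIT_MULT"],
}


def missing_components(record: dict) -> list[str]:
    """List the dimensions and the measure that a record fails to supply."""
    required = DSD["dimensions"] + [DSD["measure"]]
    return [name for name in required if name not in record]


def unknown_components(record: dict) -> list[str]:
    """List record fields that the structure does not define."""
    known = set(DSD["dimensions"]) | set(DSD["attributes"]) | {DSD["measure"]}
    return [name for name in record if name not in known]


# Hypothetical record: the codes and the value are invented for illustration.
record = {
    "FREQ": "A",
    "REF_AREA_REG": "XX",
    "COMMODITY": "C1",
    "TIME": "2000",
    "OBS_VALUE": 123.0,
    "UNIT": "KG",
}

missing = missing_components(record)
unknown = unknown_components(record)
print(missing, unknown)
```

Here the record lacks MEASURE_TYPE, so it cannot identify an observation: every dimension must be present for the key to be complete.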
© Metadata Technology
SDMX and Data Formats
Exercise: Identify Concepts
Exercise: Identify Concepts – from collection instrument
Source: UNESCO Institute for Statistics
Data Entry - Table 2.1
Data Entry - Table 2.2
Exercise: Identify Dimension Concepts – from website
Source: International Labour Organization
Identify Concepts: Table 2A
Identify Concepts: Table 2B
Identify Concepts: Table 2C
Identify Concepts: Table 2D
Identify Concepts: Table 2E
Dimension Concept
Identify Concepts: Table 2A
• Reference Area
• Sex
• Time Period
• Frequency
• Measure Type
Identify Concepts: Table 2B
• Economic Activity
• Measure Type
Identify Concepts: Table 2C
• Occupation
• Measure Type
Identify Concepts: Table 2D
• Status in Employment
• Measure Type
Identify Concepts: Table 2E
• Measure Type
Exercise: Identify Concepts – from collection instrument
Source: UNESCO Institute for Statistics
Dimension Concepts - Tables 2.1/2.2
• Time
• Reference Area
• Education Level
• Sex
• Institution Type
• Measure Type
• Work Mode
• Programme Orientation
Labor Statistics: Data Structure Definition (Incomplete)
Dimension Concept                              Representation
Frequency (FREQ)                               CL_FREQ
Reference Area (REF_AREA)                      CL_REF_AREA
Education level (EDUC_LEVEL)                   CL_EDUCATLVTYP
Sex (SEX)                                      CL_SEX
Programme Orientation (PROG_ORIENTATION)       CL_PROG_ORIENTATION
Institution Type (INSTITUTION_TYPE)            CL_INSTITUTION_TYPE
Work Mode (WORK_MODE)                          CL_WORK_MODE
Measure Type (MEASURE_TYPE)                    CL_MEASURE_TYPE
Time (TIME)                                    Date/Time
Measure Concept                                Representation
Observation Value (OBS_VAL)                    Numeric
Education Statistics: Data Structure Definition (Incomplete)
Attribute Concept                              Assignment Status   Attachment    Representation
Observation Status (OBS_STATUS)                M(andatory)         Observation   CL_OBS_STATUS
Observation Confidentiality (OBS_CONF)         C(onditional)       Observation   CL_OBS_CONF
Unit (UNIT)                                    M                   Series        CL_UNIT
Unit Multiplier (UNIT_MULTIPLIER)              M                   Series        CL_UNIT_MULT
Education Statistics: Data Structure Definition (Incomplete)
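The assignment status and attachment level of the attributes can be turned into a simple check. The following Python sketch assumes a hypothetical dictionary layout and function name (`missing_mandatory`); the attribute identifiers, statuses and attachment levels are those in the attribute table, and the sample attribute values are invented for illustration.

```python
# Attribute definitions from the table: (assignment status, attachment level).
# "M" = mandatory, "C" = conditional.
ATTRIBUTES = {
    "OBS_STATUS": ("M", "Observation"),
    "OBS_CONF": ("C", "Observation"),
    "UNIT": ("M", "Series"),
    "UNIT_MULTIPLIER": ("M", "Series"),
}


def missing_mandatory(level: str, supplied: dict) -> list[str]:
    """Return mandatory attributes attached at `level` that are not supplied."""
    return [
        name
        for name, (status, attachment) in ATTRIBUTES.items()
        if attachment == level and status == "M" and name not in supplied
    ]


# Hypothetical attribute values, invented for illustration.
series_attributes = {"UNIT": "PERSONS"}
observation_attributes = {"OBS_STATUS": "A"}

series_gaps = missing_mandatory("Series", series_attributes)
observation_gaps = missing_mandatory("Observation", observation_attributes)
print(series_gaps, observation_gaps)
```

Attaching UNIT and UNIT_MULTIPLIER at series level means they are stated once per series rather than repeated on every observation, while OBS_STATUS is reported for each observation; a conditional attribute such as OBS_CONF may be left out without failing the check.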
Identify Concepts from User Guide