ecuador - adp | accelerated data...
Content
I. Executive Summary
II. Section 1: DDI Assessment
III. Section 2: NADA Assessment
IV. Section 3: Status on Dissemination
    Data Access and User Management
    Status on the Dissemination Policy & Statistics Act
V. Section 4: Institutional Progress & Relation with Users
    Estimated Budget of Documentation & Dissemination of a Survey in INEC
VI. Section 5: Key Performance Indicators
VII. Section 6: Innovation & Data Revolution
VIII. Section 7: Conclusions & Recommendations
IX. Section 8: Looking Forward
X. ANNEXES
    Annex 1: Score by metadata category for INEC’s NADA Catalogue DDIs
    Annex 2: Summary of ADP implementation in ECUADOR
Executive Summary
The Accelerated Data Program (ADP) was implemented in Ecuador through the national statistics office (NSO), the
Instituto Nacional de Estadística y Censos (INEC). Engagement with the country began in 2010 through the Andean
Community meetings and other awareness activities; the program was formally implemented in 2011, when the
first documentation began.
Figure 1: Timeline of the Implementation of ADP in the Country
Since its inception, the ADP has funded 4 specific country interventions as direct country-level support (see Annex 2). The
specific activities break down as follows:
• Documentation of statistical operations at the NSO, using the DDI & Dublin Core Standards
• Extending the documentation process to the National Statistical System agencies
• Setting up a Central National Data Archive to disseminate metadata & anonymized microdata
The INEC has shown great interest in moving toward electronic government and has initiated an ambitious plan to
certify the quality of the microdata and metadata produced in the National Statistics System (NSS). The plan is
included in the Code of Best Practices of the national statistical program, approved by the National Statistics Council
in 2013 and based on the Statistical Production Model that the INEC is implementing. The INEC remains at the
forefront of developing approaches to administrative records and big data. Among the countries evaluated, Ecuador
ranked third, mainly because it is still implementing the new statistical production model, which makes the
documentation of statistical projects using the DDI & Dublin Core Standards a mandatory task. The INEC has
documented a large number of statistical projects (129 statistical studies), covering almost all of its statistical
production. Because it has successfully internalized the activities that the ADP promotes, it continues to expand and
improve data access and transparency and to extend the DDI & Dublin Core documentation standards across the
National Statistics System.
The INEC has fully institutionalized documentation and made it part of the design of statistical operations. The INEC
estimates that it spends approximately USD 1,801.67 of its own budget per documented survey to maintain the
process. The INEC is considered a mature country program in terms of documentation and dissemination, and it is
able to adapt proactively to the new data environment. These activities resulted in more than 179 surveys documented
by December 2015, with over 50% of projects having microdata available for direct download online from the
National Data Archive (NADA).
It is also important to acknowledge the INEC’s initiative regarding new approaches to managing mobile phone data
(big data) and administrative records. Ecuador is the fourth country in the region, alongside Mexico, Costa Rica and
Colombia, to introduce through its National Statistical Program a mandate that establishes the curation process, the
terminology and the standard processes for generating official statistics. The mandate formally adopts the National
Data Archive (NADA) and the DDI/Dublin Core Standard for all statistical metadata production in the National
Statistics System, as part of the Statistics Production Model, giving the program’s sustainability a legal basis.
Figure 1 timeline:
• 2010: INEC joins the ADP Initiative & adopts DDI & DC Standards
• 2011: Documentation of 94 statistical operations with coverage of 6 years in INEC
• 2013: NADA launched with 94 projects documented; Regional Training Workshop on NADA 4.0
• 2014: DDI Expansion Workshop for NSS agencies
• 2015: DDI & DC Standard and NADA as the official dissemination tools for the NSS within INEC's Quality Certification Initiative for the NSS
Section 1: DDI Assessment
The Data Documentation Initiative (DDI) is a metadata standard adapted by the World Bank. It was originally intended
as a common standard for researchers to exchange information on research projects. For this reason, custodianship of
the standard remains in large part with the ICPSR. The tools developed for the assessment evaluate the main metadata
elements and are designed to check the completeness and coherence of the metadata, so that a researcher can
capitalize on the available information.
The main areas evaluated in the DDI Checklist are:
1. Description of the document: Key descriptive elements to define the document
2. Description of study: Overview, Coverage, Producers & Sponsors, Sampling, Data Collection, Data Processing,
Data Appraisal, Data Access.
3. Description of the data file & Variables: Content, producer, version, literal questions text, universe, variable labels
and categories, methods of derivation and imputation and confidentiality, etc.
4. External Reference Materials: A summarized description, using the Dublin Core Standard, of questionnaires,
coding information, technical and analytical reports, interviewer manuals, data processing and analysis
software, photos, maps, etc.
5. Resources related to the study: Metadata, Citation and Use, IHSN Catalogue.
Table 1 summarizes the scores of the INEC’s National Data Archive as of July 2015:
Table 1: DDI Assessment Summary of the NADA Catalog of the INEC – ECUADOR
RESULTS OF ECUADOR | SCORE
Regional Ranking | Third
Number of statistical projects in NADA | 129
Average Score | 67.4%
Highest Score | 93.2%
Lowest Score | 37.6%
Critical Categories | Data Files; Variable; Metadata, Citation And Use; IHSN Catalog
Difficulty to Access Data | 50% of projects don’t have microdata available.
Main Findings | No databases were documented in 64 projects; the documentation is not available in PDF format; only 26 projects are available in the Central IHSN Catalog
For a disaggregated breakdown of the scores, please refer to Annex 1.
The INEC ranked third in the regional ranking of the assessment, achieving an overall quality score of 67.4% for its
published metadata. Nonetheless, 4 specific areas presented issues: Data Files; Variables; Metadata, Citation and Use;
and IHSN Catalog. The main information gaps in the documentation were found at the variable level in each dataset:
because the dissemination of certain datasets was not approved, the documentation of those datasets was not
provided either. A similar situation applies to the description of the datasets and citations. Finally, several DDIs found
in the INEC’s NADA Catalog are not present in the Central IHSN Catalog.
Table 2: DDI Assessment Summary of NADA Catalog of the INEC – Ecuador
ADP Implementation in ECUADOR by July 2015 | NSO – INEC | NSS | TOTAL COUNTRY
Number of Surveys Documented (Available on NADA) | 129 | 0 | 129
Number of Microdata available in NADA | 65 | 0 | 65
Figure 2 plots the individual scores ranked from highest to lowest. The pattern can give some indication as to the
reasons for decreasing quality. The detail can be seen below with the INEC’s review scores breakdown (See circled
grouping).
Figure 2: Assessed studies by score
(Under/Above of average score and the Minimal DDI Quality)
Section 2: NADA Assessment
The NADA catalog site was assessed by applying a Comprehensive Review of the NADA, which consists of 41 items
grouped into 7 categories. The evaluation covered the following aspects: Configuration and Visibility, Registration,
Search and Filters, Data Dissemination Policy, Data Access, User and Citations, and Innovations (see Annex 2).
Table 3 shows the average score for each of the seven categories into which the 41 questions on the content and
operation of the NADA catalog site are grouped. Because the maximum score differs by category, scores are
expressed as a percentage of the total possible score for each category.
Table 3. Score of NADA assessment by category
NADA Category | Score (%)
TOTAL SCORE | 53.3%
NADA visibility | 45%
NADA Registration | 75%
Search and Filters | 75%
Data dissemination policy | 0%
Data access | 100%
Metadata and Citations | 0%
Innovation | 0%
Days to respond | 100%
[Figure 2 chart: Score by study, with the studies on the horizontal axis and the scores on the vertical axis. Reference lines: Average score (67.4) and Minimal DDI Quality (75).]
41% of the DDIs show that they comply with the minimal requirement to be considered high quality documentation
(above 75), whereas 59% don’t meet this minimal standard and the metadata quality needs improvement. This
notorious gap is due to the lack of documented datasets at a variable level. Ecuador can improve all documentation
published by completing the metadata in the datasets, since this category represents a high weight in the DDI
assessment.
[Figure chart: Ecuador: NADA assessment Score by category. Values shown: 53.3%, 45%, 75%, 75%, 0%, 100%, 0%, 0%, 100%.]
The overall score achieved by the INEC’s NADA Catalog was 53.3%. The areas that presented the most problems
were Visibility of the Website (owing to the absence of a specific tab for policies and procedures and of contact
information), Citations, and Innovation. Furthermore, since the data dissemination policy is still being drafted, it is not
available on the INEC’s webpage. Nonetheless, a Code of Best Practices is available that explains, in general terms,
the Statistical Process Model adopted by the INEC. The Code reinforces the use of the NADA Catalog as the
preferred dissemination tool, formally adopts the DDI and Dublin Core Standards for metadata production in the
National Statistics System of Ecuador, and promotes online access to anonymized microdata as part of the move
toward e-government. The categories in which the INEC excelled in the NADA assessment were Data Access and the
number of days taken to respond to a user’s request.
Section 3: Status on Dissemination
Data Access and User Management
In July 2015, when the statistical projects in the INEC’s NADA catalog were assessed, there were 129 published
studies, all of which were covered by this assessment. It is important to mention that the INEC is one of the NSOs
that provides microdata access to a large share of its datasets (50%); it shares the first position with Peru in this
category of the assessment.
Table 4: Users Statistics in Ecuador’s Central NADA Catalogue
Number of registered users (active and non-active) in NADA | 985
Number of active registered users | 515
Number of deactivated users (blocked) | 465
Number of licensed requests (total) | -
Number of pending licensed requests up to 24/10/2015 | -
Number of citations on-line | -
Number of surveys with data unavailable in-country but available elsewhere (DHS, MICS) | 2
The INEC’s NADA catalog currently uses only three types of data access: direct data access, public use files, and no
data available. The licensed data access and data enclave types are not currently in use. Direct Data Access and Public
Use Files can be downloaded easily in an appropriate format once the user has completed the registration. Of the two
options not in use, licensed data files require users to be registered and to fill out a data request form accepting the
terms and conditions of the data access agreement; if the request is granted, the system immediately shows a
confirmation message and provides a link to access the data. For data enclave access, or access available on request at
the INEC headquarters, the NADA catalog should describe the process and the conditions under which such
consultation is carried out and provide an email account to which users can send their specific request. Although this
option is not detailed in the NADA, it is stipulated in the mandate of the Code of Best Practices within the Statistics
Production Model of the INEC – Ecuador.
Figure 3: Studies by type of data access
Overall, microdata are available for 50% of the 129 studies and not available for 49.6%. This availability is higher
than in any other country in the assessment except Peru, mainly because the INEC has a formal approach of making
all data accessible online in order to achieve electronic government in the future. Nonetheless, almost half of the
datasets are not available, mainly because the anonymization of several datasets is still in progress and because of
strict transfer agreements for the exchange of information, which must be fully justified.
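The availability shares quoted in this section can be cross-checked against the counts reported in Tables 1 and 2 (65 studies with microdata in NADA and 64 projects with no databases documented, out of 129 studies). The following is only an illustrative sketch: treating these two counts as the "available" and "not available" groups is an assumption, and the names used are hypothetical.

```python
# Counts taken from Tables 1 and 2 of this report (July 2015).
TOTAL_STUDIES = 129
WITH_MICRODATA = 65      # "Number of Microdata available in NADA"
WITHOUT_MICRODATA = 64   # "No databases were documented in 64 projects"


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


available = share(WITH_MICRODATA, TOTAL_STUDIES)
not_available = share(WITHOUT_MICRODATA, TOTAL_STUDIES)
print(f"available: {available}%, not available: {not_available}%")
```

Under this reading, the two groups account for all 129 studies, and the "not available" share agrees with the 49.6% reported for Figure 3.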
Only recently (December 2015) did a couple of datasets from the NSS become available in the NADA, since the NSS
started the documentation process only at the end of 2014. The INEC needs to sign specific agreements with each
NSS institution in order to publish its microdata, even though a Central NADA Catalog is in place and each agency
has the capacity to manage data access for its statistical projects individually. Data can also be accessed by requesting
a consultation or the processing of a query, but confidential datasets, especially in economics, are neither
documented nor available for download.
Status on the Dissemination Policy & Statistics Act
The dissemination policy and its Regulation have not yet been officially approved by the NSO (INEC) board of
directors. The approach that the INEC is pursuing is to have a general dissemination policy for all National Statistics
System agencies through guidelines and a Code of Best Practices that certify the quality of statistics, including
metadata and data aspects, by establishing norms on standardization and harmonization. This process has recently
been approved, and its implementation was being coordinated at the time of this assessment. The INEC has a unit
working on the dissemination policy. The internal regulation currently in place is the “Standard for Statistical
Confidentiality and Good Use of Information”, INEC Resolution 1, Official Gazette 449, 02-March-2015.
[Figure 3 chart: Percentage of studies by type of data access. Direct data access: 47%; Public use files: 3.10%; Data not available: 49.60%.]
Table 6: STATUS OF DISSEMINATION POLICY & STATISTICS ACT
ECUADOR | NSO | NSS | COMMENTS & PROGRESS
Statistics Act allows Microdata Access | YES | YES | In December 2012, the National Assembly passed the draft Law on Statistics, which will replace the existing legal body in place since 1976. It states that the INEC’s main function is to coordinate the development of the National Policy of Statistics and the National Statistical Plan, which will be subject to the approval of CONACES (National Council of Statistics). The issue of Statistical Secrecy is maintained and the statistical information generated by public agencies will become part of the National Information System upon certification by the INEC or the Central Bank of Ecuador, according to their nature. Free access to statistical information is assured, provided it is delivered digitally. The cost of reproduction, where applicable, will be regulated by the Law. In a period not exceeding 120 days from the enactment of the Act, the Executive shall issue the implementing regulation. On September 23, 2013, the National Statistical Program was approved by the National Statistics Council; this instrument contains guidelines for the entities subject to the National Statistical System in their statistical research activities, in order to meet the information needs for national planning.
Is the Statistics Act available online (provide link) | NO | NO | Only available as the Statistics Act of 1976
Dissemination Policy allows Microdata Access | YES | YES | The Dissemination Policy will be approved once the Draft Law of Statistics is approved; it will allow free digital access to microdata, shared at the level of the National Statistics System.
Is the dissemination policy available online? Provide link | NO | NO | In its place, there is only an internal regulation, “Resolución Nro. 022 DIREJ-DIJU-NI–2012”, that provides data access guidelines.
Regulation of Dissemination Policy includes Calendar of Publications, Type of Users, Type of Data Access per survey, NADA & DDI Standard | YES | NO | In its place, there is only an internal regulation, the “Code of Best Practices”, that provides guidelines regarding data access, but explicitly includes the DDI/Dublin Core Standard and the NADA Catalog in its recommendations.
Other channels to Microdata Access available | YES | | Through explicit request to the INEC headquarters.
% Data Accessible through NADA Catalog | 50% | 1% | The dissemination policy states that all microdata disseminated must be anonymized. This process is still ongoing for most economic statistics.
Comments and Progress of Dissemination Policy (if not approved/updated) | 60% | 60% | The INEC has completed 60% of the Dissemination Policy & the Regulation draft. For the NSS agencies, the process will start after INEC’s coordination is concluded.
Section 4: Institutional Progress & Relation with Users
In the INEC, the survey budget at the NSO includes the documentation of microdata, and there is a separate process
manual for data documentation. A specific unit monitors data dissemination and metadata quality and is also assigned
to microdata documentation and NADA Catalogue management. Within this specialized unit, there are staff members
with the expertise to facilitate data documentation workshops by themselves and to maintain and update the NADA
Catalog independently. The INEC has staff proficient in the use of most IT applications related to data management
and statistical packages, with the exception of geospatial data, but it does not budget for training on any of these
applications. The INEC has participated in at least 2 international training workshops for NSO staff members, in
addition to one local workshop for 20 INEC staff members and an awareness conference for 25 staff members of the
NSS agencies to introduce and showcase the NADA Catalog.
The documentation of microdata is irregular and is not scheduled to coincide with data collection. Nonetheless, the
documentation and dissemination processes are fully institutionalized and are completed for each statistical project
within a year, since they are part of the job description of each technician involved in the process who works
permanently in the agency. Regarding user management, the INEC keeps records of registered users and
communicates with them. Feedback from users on data dissemination is generally gathered by email, although the
INEC has other channels to contact users. There are, however, no specific procedures for taking user feedback into
account in future survey design or in improving the data dissemination system. As of September 2015, there were
985 users registered in the NADA Catalog, of whom 515 were active.
Estimated Budget of Documentation & Dissemination of a Survey in INEC
Based on the INEC’s experience and the time spent on the documentation and dissemination of a standard survey,
the estimated average cost of documenting a survey is approximately USD 1,801.67. Similarly, the budget for
implementing the international DDI & Dublin Core standards in the National Statistical System (NSS) during 2016
was estimated at USD 131,522.26, to document 73 Statistical Operations in 28 NSS institutions. This estimate
includes IT staff for cleaning datasets; analysts to use the IHSN manager tools; staff for processing, validation and
publication of information; an expert in metadata management; and a local DDI trainer. Specifically:
• USD 5,480.14 for training provided by the INEC and installation of the Microdata Management Tools
• USD 65,761.08 for monitoring and accompanying the documentation
• USD 55,896.96 for quality control of the documented metadata
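The 2016 NSS figures above lend themselves to a quick arithmetic check. The sketch below divides the quoted total by the 73 statistical operations and sums the three itemized components. All amounts come from the text; the variable names are hypothetical. The per-operation average coincides with the per-survey estimate of USD 1,801.67, although the source does not state that this is how that figure was derived, and it does not explain the part of the total that the three components leave uncovered.

```python
from decimal import Decimal

# Figures quoted in the text (USD, 2016 NSS estimate).
NSS_BUDGET_2016 = Decimal("131522.26")
STATISTICAL_OPERATIONS = 73

# Components itemized in the text.
itemized = {
    "training and tool installation": Decimal("5480.14"),
    "monitoring and accompaniment": Decimal("65761.08"),
    "metadata quality control": Decimal("55896.96"),
}

# Average cost per statistical operation, rounded to cents.
per_operation = (NSS_BUDGET_2016 / STATISTICAL_OPERATIONS).quantize(Decimal("0.01"))

# Sum of the itemized components and the part of the total they do not cover.
itemized_total = sum(itemized.values(), Decimal("0"))
not_itemized = NSS_BUDGET_2016 - itemized_total

print(per_operation, itemized_total, not_itemized)
```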
Table 7: Estimated Budget of Documentation & Dissemination of a Survey in the INEC (in USD)
INEC’s Curation Estimated Budget | Yes/No | Amount USD | Comments
Is there a dedicated line in the NSO budget for documentation? | No | |
Is there a dedicated amount in each survey for documenting? | No | |
Are consultants hired to document the surveys using internal sources? | No | |
Are there dedicated personnel that document as part of their Job Description? If Yes, how many and the amount of time dedicated. | Yes | 884 | Coordination & training for 1 week to be ready to document a survey, approx. 114 hrs.
 | Yes | 243 | Full-time staff to document a survey for a dataset of 78 variables, approx. 32 hrs.
 | Yes | 332 | Internal review of survey metadata and validation with the project head, 40 hrs.
 | Yes | 210 | Publication of DDI by IT staff, 20 hrs.
Are there staff who review the DDIs? How many and time dedicated? | Yes | 84 | The ADP Coordination Unit, dedicated full time to DDI validation, 8 hrs.
How much does a survey’s metadata cost? | | 1,753 | Total cost of documenting a 78-variable survey
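As a cross-check of Table 7, the line items can be summed to confirm the stated total and to obtain the total staff time. This is a minimal sketch using only the amounts and hours listed in the table; the activity labels are shortened and the variable names are hypothetical. The resulting total of USD 1,753 differs from the USD 1,801.67 average quoted in the text, and the source does not explain the difference.

```python
# (activity, amount in USD, hours) as listed in Table 7.
line_items = [
    ("Coordination & training", 884, 114),
    ("Documentation by full-time staff", 243, 32),
    ("Internal review and validation", 332, 40),
    ("Publication of DDI by IT staff", 210, 20),
    ("DDI validation by ADP Coordination Unit", 84, 8),
]

# Total cost and total staff time for one 78-variable survey.
total_usd = sum(usd for _, usd, _ in line_items)
total_hours = sum(hours for _, _, hours in line_items)

print(f"total: USD {total_usd:,} over {total_hours} hours")
```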
Since institutionalization is one of the most important aspects for evaluating the effectiveness of the activities, a
SWOT analysis was undertaken. On balance, the INEC has fully institutionalized the activities that the ADP
promotes and is capable of ensuring their sustainability over time. The hardest part will be extending this standard to
all NSS agencies, since their statistical capacity is not as developed as the INEC’s and internalizing the tools within
their regular activities will take time.
Figure 5: Strengths and Weaknesses of ADP institutionalization in the Country
STRENGTHS:
• The documentation process is handled internally as part of the Job Description of the staff involved.
• There is a designated Unit in charge of coordinating ADP & NADA activities in the INEC.
• The documentation & dissemination activities are budgeted for each year within the INEC.
• The general level of motivation and personnel capacity is high, since the internalization of the ADP has been achieved in the production areas of statistics.
• The INEC formally adopted DDI & Dublin Core and recognizes the NADA Catalog as the preferred dissemination platform.
WEAKNESSES:
• Potential overcommitment to too many new initiatives, which may harm the integrity of data management.
• There is no formal dissemination policy, limiting microdata access.
• A great deal of follow-up and monitoring time is needed to expand the ADP activities in the NSS sustainably.
• The excessive workload of staff members might sideline the documentation tasks.
• There are no funds for regular DDI training.
OPPORTUNITIES:
• The Draft Law of Statistics will provide the legal basis to expand the DDI Standard into all the NSS agencies.
• The issue of data accessibility has become more relevant since the SDGs, giving the INEC a window of opportunity to expand the ADP activities to all NSS agencies.
• INEC Ecuador is considering requesting to be considered an OECD member and has initiated the procedures to comply with the requirements.
THREATS:
• The implementation of parallel metadata standards is not compliant with international guidelines.
• External training for staff members of any NSS agency requires approval by the central government; the additional processing time may be a disincentive.
• There isn't a statistical unit in each NSS agency, which complicates the ADP extension.
• The rotation of staff in different agencies forces the INEC to re-train staff constantly to maintain the program.
Axes of the figure: Internalization of Metadata Management Process (Data Curation); Sustainability of the ADP
Figure 6 shows the curation process that takes place in the INEC – Ecuador. A survey’s documentation continues
until its release is approved. The areas that generate the data are in charge of the documentation process, which goes
through two stages of validation. The first validation is conducted by the DDI supervisors within the INEC to verify
compliance with the DDI & Dublin Core Standard; the second is conducted by the Head of the statistical operation to
check the concepts, descriptions and accuracy of the documentation. The final stage is the publication of the
metadata in the NADA Catalog, which is the responsibility of the IT Area of the INEC.
DATA CURATION PROCESS IN INEC – ECUADOR
(Process by staff involved in documenting a survey)
DOCUMENTATION (Generating Areas)
• START
• Designation of the person in charge of documenting the survey within the generating areas of the statistical project
• Compiles or modifies the documentation of the survey, datasets and reference materials
• Validates the documentation using the Nesstar Publisher tools
• Generates a first draft of the PDF file to check spelling and composition
• Requests the validation of the DDI supervisors of the survey, annexing a PDF file
REVIEW & VALIDATION (DDI supervisors)
• Reviews, validates and complements the survey documentation
• Is the documentation correct? NO / YES
• Uploads and/or modifies the documentation on the preproduction site of NADA and closes the process with the project manager
• DDI supervisors train staff before, during and after the documentation
RELEASE (Head of the Survey)
• Reviews the documentation of the survey
• Is the documentation correct? NO / YES
• The Head of the generating area validates and submits the approved files for dissemination
PUBLICATION (IT Area)
• Once the metadata is approved, generates the XML and RDF files and submits them to the National Data Archive (NADA Catalogue)
• Publication of the metadata in NADA Catalogue
• Notifies all areas involved in the survey
• END
Figure 6: Curation process of ADP activities in INEC – Ecuador
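The four-stage flow described for Figure 6 can be sketched as a small state machine. This is only an illustration under stated assumptions: the stage names follow the figure, but the `Stage`, `next_stage` and `run` names are hypothetical, and sending a rejected survey back to the documentation stage is an assumption, since the source does not show where the "NO" branches lead.

```python
from enum import Enum


class Stage(Enum):
    DOCUMENTATION = "Generating Areas"
    REVIEW_VALIDATION = "DDI supervisors"
    RELEASE = "Head of the Survey"
    PUBLICATION = "IT Area"
    END = "Process closed"


def next_stage(stage: Stage, approved: bool = True) -> Stage:
    """Return the stage that follows `stage`.

    Assumption: a "NO" at either validation returns the survey
    to the generating area for further documentation.
    """
    if stage is Stage.DOCUMENTATION:
        return Stage.REVIEW_VALIDATION
    if stage is Stage.REVIEW_VALIDATION:
        return Stage.RELEASE if approved else Stage.DOCUMENTATION
    if stage is Stage.RELEASE:
        return Stage.PUBLICATION if approved else Stage.DOCUMENTATION
    return Stage.END


def run(decisions):
    """Walk the process, consuming one decision per validation stage."""
    decisions = iter(decisions)
    stage = Stage.DOCUMENTATION
    path = [stage]
    while stage is not Stage.END:
        needs_decision = stage in (Stage.REVIEW_VALIDATION, Stage.RELEASE)
        approved = next(decisions) if needs_decision else True
        stage = next_stage(stage, approved)
        path.append(stage)
    return path


# Example: the DDI supervisors reject the first draft, then both validations pass.
path = run([False, True, True])
print(" -> ".join(s.name for s in path))
```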
Section 5: Key Performance Indicators
Ultimately, the objective of the assessment is to evaluate the performance of the country based on key performance
indicators (KPIs) constructed around 4 dimensions: Metadata Quality, Relationship with Users, Productivity and
Institutional Management. Each dimension is composed of 3 KPIs, where each KPI has a maximum score of 8.33.
After running the assessment on all four dimensions, the INEC ranked third in the regional ranking of the quality
assessment, achieving an overall quality score of 77.25% for the metadata published in its National Data Archive.
It is worth noting that metadata has a Web presence; hence, poor-quality documentation and poor performance of the
National Data Archive website result in misuse of information, generating confusion and misinterpretation among
users. For a detailed breakdown of the performance indicator scores, refer to Figure 7.
Figure 7: Key Performance Indicators: ECUADOR
[Figure 7 chart: Key Performance Indicators: ECUADOR (each category scored out of 8.33). Series: Max. Score (8.33) and Ecuador. Indicators: Metadata Quality; Metadata Quality above 60; National Data Archive; Response to users; % Data Accessible; Days Website is online; Agents trained per survey; Number of surveys added last month; Total Number of surveys added; Institutional Index; ADP inclusion in NSO budget; Workshops organized by NSO. Data labels shown: 5.58, 3.62, 4.41, 4.38, 6.94, 2.78. Average KPI score of Ecuador: 77%.]
THE BEST:
• Days Website is Online
• Appropriation of ADP
• Organization of training workshops
• Response to users
• Agents trained per survey
• Updating the NADA Catalogue within a year
THE WORST:
• 50% of data not accessible.
• A considerable number of studies don’t reach a quality score above 60.
• No regular upload of surveys in NADA.
• Institutional Index reduced mainly due to lack of user management.
Section 6: Innovation & Data Revolution
In the context of the country's statistical preparation for the post-2015 development agenda, especially for the SDGs, a
two-part questionnaire was administered to the NSO. In the first part, each of the GSBPM (see Figure 8) phases and
sub-processes was listed, and the NSOs were asked to indicate which processes posed the most challenges to their
organisation and for which support from external agencies and donors was needed. The second part focused on IT
technologies that are considered important for the Data Revolution, asking the NSO to indicate which specific
technologies required the most support.
The main purpose of this questionnaire was to gain a detailed understanding of the problems faced by countries in the
full life-cycle of statistical data (and metadata) processing. The survey is built around the processes and sub-processes
of the Generic Statistical Business Process Model (GSBPM). These processes cover the phases of designing and
building surveys, and the collection, analysis and dissemination of data. The information collected from NSOs was
analyzed to see how new technological innovations, existing software solutions and other global standards can be
used to help overcome these problems.
The comments of the INEC on each process that presents problems are detailed below:
Design:
o Design outputs: There are no specialized systems and tools to design the products to be distributed.
o Design Variable Descriptions: There is no accurate documentation showing compliance with a
prior design of the variables or indicators to be generated, since design and construction are done
at the same time.
o Design Collection: New methods of data collection are neither designed nor evaluated, since there is no
specialized staff for this purpose, but there is innovation in the design of the physical or digital
questionnaires or the application of new information and communication technologies.
o Design processing and analysis: There are no specialized personnel or efficient programs for
designing a validation grid. There is a lack of knowledge of the procedures that may be used for
the imputation of information.
Build:
o Build or enhance process components: There are problems identifying the necessary improvements
to the processing components.
o Build or enhance dissemination components: There is no expertise to identify how to perform an
effective dissemination of the statistical operations information.
o Test production system: There are inconveniences in doing an effective test pilot that allows INEC
to properly test all systems that intervene in routines such as critic & coding.
o Finish production system: More training is needed for staff members regarding how to implement
the process of statistical production.
Collect:
o Set up collection: There are drawbacks in planning times for collection; in addition, the training of
interviewers is not always sufficiently effective. It is important to receive assistance on how to
inform and raise awareness in the target population effectively so the research responds to the
objectives set.
o Finalize collection: Novel methods of automatic data entry involving typing, synchronization
and extraction.
Process:
o Classify & Code: Better use of classifications and knowledge of coding systems for automated entry,
enabling a reduction in the execution times of statistical operations.
o Edit and impute: Training is required to know which methods of imputing information are
accepted and valid for statistical offices.
o Finalize Datasets: How to apply dissemination controls to the final dataset resulting from
information processing.
Analyze:
o Interpret and explain outputs: Taking into account that INEC only does three types of information
analysis (descriptive, comparative and evolutional), it is important to get to know other types of
analysis that can be applied to processed information to strengthen the statistical
operation.
o Apply disclosure controls (anonymization): Know what other dissemination control mechanisms
can be applied to the dataset under analysis, given that there are different kinds of controls at the
stages of processing, analysis and dissemination, all to avoid breaching the confidentiality of
information.
Disseminate:
o Update Output Systems: Determine which types of formats for data and metadata are more
practical and user friendly.
o Produce dissemination products: Determine how to properly check compliance with the
rules of publication.
o Manage release of dissemination products: Identify whether to establish a dissemination policy, or a
policy on access to microdata or confidential data for groups of authorized users such as researchers. Is this a
sufficient instrument to ensure access, or what other types of tools would allow compliance with
this process?
o Manage User Support: Determine how to provide support to the user of statistical information,
taking into account that it is a process of policies, procedures and methods.
INEC is capable of offering support to other countries/agencies in the region. As coordinator of
statistics, INEC may provide technical assistance to the entities that comprise the National Statistical System or to external
producers of statistics. In this capacity, INEC has provided support in field work operations, has played an
advisory and counselling role in pilot tests, and has provided support in planning prior to collection and in the construction of
sampling frames and samples.
Figure 8: The Generic Statistical Business Process Model (GSBPM)
Specify Needs: 1.1 Identify needs; 1.2 Consult & confirm needs; 1.3 Establish output objectives; 1.4 Identify concepts; 1.5 Check data availability; 1.6 Prepare business case
Design: 2.1 Design outputs; 2.2 Design variable descriptions; 2.3 Design collection; 2.4 Design frame & sample; 2.5 Design processing & analysis; 2.6 Design production systems & workflow
Build: 3.1 Build collection instrument; 3.2 Build or enhance process components; 3.3 Build or enhance dissemination components; 3.4 Configure workflows; 3.5 Test production system; 3.6 Test statistical business process; 3.7 Finalise production system
Collect: 4.1 Create frame & select sample; 4.2 Set up collection; 4.3 Run collection; 4.4 Finalise collection
Process: 5.1 Integrate data; 5.2 Classify & code; 5.3 Review & validate; 5.4 Edit & impute; 5.5 Derive new variables & units; 5.6 Calculate weights; 5.7 Calculate aggregates; 5.8 Finalise data files
Analyse: 6.1 Prepare draft outputs; 6.2 Validate outputs; 6.3 Interpret & explain outputs; 6.4 Apply disclosure control; 6.5 Finalise outputs
Disseminate: 7.1 Update output systems; 7.2 Produce dissemination products; 7.3 Manage release of dissemination products; 7.4 Promote dissemination products; 7.5 Manage user support
Evaluate: 8.1 Gather evaluation inputs; 8.2 Conduct evaluation; 8.3 Agree an action plan
Overarching: QUALITY MANAGEMENT / METADATA MANAGEMENT; DATA CURATION PROCESSES
Section 7: Conclusions & Recommendations
Objective 1. Quality Assessment of metadata published in INEC's NADA Catalogue
• During the metadata quality assessment, only a few gaps in documentation were identified, in the following sections: sampling, data collection and data processing. These are mainly due to field description instructions that are not entirely accurate, or to the fact that documenting these elements in the metadata template isn't mandatory.
• The documentation of the data access policy is not consistent with the criteria of the assessment template in the following elements: official policies and procedures to disseminate data, the structure for citing datasets and the copyright statement. This results in low scores in the assessment of this category.
• Documenting different time periods of the same project as separate studies is not exactly an error, but it causes the NADA Catalog to show duplicate documentation even when the conceptual design and structure of the datasets have not changed over time.
• The default settings of INEC's NADA catalog don't show elements that were documented in the metadata editor, such as the questionnaire description, the geographic unit and the ending date of the time period.
• It is recommended not to use the same documentation for labels and literal questions (e.g. administrative records).
• For 64 projects, the datasets, documentation or database structure were omitted because it was decided that the microdata would not be publicly accessible. Even if the dataset is not available, its metadata should be present and available in the NADA Catalog.
• 42 studies were identified that do not correspond to projects of basic statistics. This leads to inconsistencies in documentation and to metadata gaps, because the metadata templates were created to document basic statistical projects only, such as censuses, surveys and administrative records data.
Objective 2. Status on the Statistics Act and Dissemination Policy of Microdata
• There is a Draft Law of Statistics, passed in December 2012 and approved on Sept. 23, 2013, that allows free access to statistical information, provided it is delivered digitally, and establishes INEC's main function, which is to coordinate the development of the National Policy of Statistics and the National Statistical Plan.
• The dissemination policy and its Regulation have not been officially approved by the board of directors of the NSO (INEC). The internal regulation currently in place is the Standard for Statistical Confidentiality and Good Use of Information (INEC Resolution 1, Official Gazette 449, 02-March-2015).
• The dissemination policy and its Regulation should be available on the website of the National Data Archive.
Objective 3. Assessment of the functionality of NADA Catalogue and quality of the service to microdata users
• The assessment of INEC's National Data Archive (NADA) made it evident that the website layout differs from that of NADAs in other regions: it lacked content in some sections that should be present, such as mission and purposes, citations, policies and procedures, innovations and contacts. This caused low scores in the NADA assessment.
• The NADA Catalog will render greater benefits when it is formalized as the preferred method of dissemination for the National Statistics System, even though it is already in use at INEC under an official internal institutional regulation.
• 50% of the microdata is accessible through the NADA Catalog. This share is considerably high in relation to other countries, although for the other 50% there is no data access or the anonymization of datasets is still in progress.
• A new category of "other access to data" needs to be included in the Data Access classification in the next version of the NADA Catalog, since the data access types available don't reflect all forms of access to data at INEC.
Objective 4. ADP activities sustainability and User engagement
• The sustainability of ADP activities is assured, as it depends on three fundamental aspects that INEC has successfully implemented:
  a) The inclusion of documentation and dissemination tasks within the job descriptions of the staff and the budget of the institution.
  b) A plan of internal training in the use of IHSN microdata tools, managed by INEC's training area and delivered on demand, based on the presentations and tools provided by the ADP. There is no funding for more external training at INEC, although there is an individual training budget for each staff member.
  c) INEC has initiated awareness conferences with NSS agencies and users to showcase the NADA. It is important to expand outreach activities between users and producers to establish a relationship and take users' feedback into account to improve survey design and outputs.
Section 8: Looking Forward
In an effort to consider the NSO's needs in setting priorities for the PARIS21 Agenda post-2015, each country was
asked to fill out a form stating the priority of 15 products that PARIS21 is considering developing.
The results for INEC are presented in Figure 9, where SDMX and Innovation & Data Visualization take first
place, followed by NSDS Data Module & SDGs, Harmonization of Data from different sources, Dissemination Policy
Design and User Management, Data Curation Manuals and Big Data. By offering assistance in the first 3 topics, 63%
of Ecuador's demand would be covered.
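The accumulated coverage quoted above is a running total of the per-topic priority shares. The minimal sketch below reproduces that arithmetic, assuming the six percentages printed on the Figure 9 chart (30, 17, 17, 14, 11, 11) are the per-topic shares in descending order; the variable names are illustrative only.

```python
from itertools import accumulate

# Priority shares (%) as printed on the Figure 9 chart, highest first.
# Assumption: these six printed values are the per-topic series.
shares = [30, 17, 17, 14, 11, 11]

# Running total: share of expressed demand covered by assisting
# with the first k topics.
cumulative = list(accumulate(shares))

for k, covered in enumerate(cumulative, start=1):
    print(f"first {k} topic(s): {covered}% of expressed demand")
```

Because the printed shares are rounded, the running total can differ by about one point from the accumulated series on the chart (64 here for the first three topics, against the reported 63%).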
Figure 9: ECUADOR's Prioritization of PARIS21 Activities
Taking into account the needs the country expressed in the survey drafted by the 'Informing a Data Revolution Initiative', the
main areas where specific technical needs were identified are Big Data sources, techniques and guidelines, through a
regional Sandbox supported by the UNECE Sandbox, to give INEC access to tools and methods to produce
Official Statistics from Big Data sources. INEC is currently developing a process map while conducting a
comprehensive study of administrative reorganization, which will conclude in 2016.
ECUADOR's PRIORITIZATION OF PARIS21 ACTIVITIES
Series: % priority by topic; % priority by topic, accumulated
Topics: Innovation Inventory, Data Visualization, Nodal Analysis for Data Production & Portal Management; SDMX; NSDS Data Module & SDGs; Harmonizing Data; Dissemination Policy, Managing Users & Data Use; Data Curation Manual & Statistical Revisions; Big Data & Mobile Phones Data
Value labels shown on the chart: 30%, 17%, 17%, 14%, 11%, 11% (by topic); 46%, 63%, 77%, 89%, 100% (accumulated)
ANNEXES
Annex 1: Score by metadata category for INEC’s NADA Catalogue DDIs
Category | Score (%) | Main comment for improvement
IDENTIFICATION | 97.5 | The reference year should not appear in the subtitle
VERSION | 76.6 | It is suggested that the filling condition to document version of data files changes to mandatory
OVERVIEW | 98.3 | Avoid documentation that does not contribute to the understanding or use of data
COVERAGE | 100 | It is necessary to modify the filling instructions to clearly establish differences between coverage and geographical unit
PRODUCERS AND SPONSORS | 100 | -
SAMPLING | 86.1 | It is necessary to modify the filling instructions to clearly establish criteria for targeted sampling and sampling frame for administrative records data
DATA COLLECTION | 60.9 | It is necessary to modify the filling instructions to document information about pilot testing, conformation of enumeration team and conceptual basis for data collection formats
DATA PROCESSING | 82.5 | It is necessary to modify the filling instructions to document information about kind of data entry and software used to edit data.
DATA APPRAISAL | 96.9 | -
DATA ACCESS | 76.7 | It is necessary to modify the filling instructions to document better information about data access policies, format to citations and copyright statements
DATA FILES | 43.6 | Incorporate and document the structure of the database, even if the files are not available for download
VARIABLE | 42.0 | Incorporate and document the structure of the database, even if the files are not available for download
EXTERNAL RESOURCES 1 | 95.3 | The labels should be clear and precise, avoiding duplicate documentation of literal questions and labels.
EXTERNAL RESOURCES 2 | 91.0 | Questionnaires must be integrated in all projects
METADATA, CITATION AND USE | 50.2 | Verify that the links of external repositories for downloading files work properly. Favoring downloading in PDF format.
IHSN CATALOG | 50.4 | Ensure that the version of the NADA allows download documentation in PDF and XML.
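To see at a glance where improvement effort might be concentrated, the category scores in Annex 1 can be ranked from lowest to highest. The sketch below is only an illustration: the scores are copied from the table, while the helper name `weakest` and the choice of showing four categories are assumptions, not part of the assessment method.

```python
# Category scores (%) copied from Annex 1.
scores = {
    "IDENTIFICATION": 97.5,
    "VERSION": 76.6,
    "OVERVIEW": 98.3,
    "COVERAGE": 100.0,
    "PRODUCERS AND SPONSORS": 100.0,
    "SAMPLING": 86.1,
    "DATA COLLECTION": 60.9,
    "DATA PROCESSING": 82.5,
    "DATA APPRAISAL": 96.9,
    "DATA ACCESS": 76.7,
    "DATA FILES": 43.6,
    "VARIABLE": 42.0,
    "EXTERNAL RESOURCES 1": 95.3,
    "EXTERNAL RESOURCES 2": 91.0,
    "METADATA, CITATION AND USE": 50.2,
    "IHSN CATALOG": 50.4,
}


def weakest(category_scores, n=4):
    """Return the n lowest-scoring categories, lowest first (hypothetical helper)."""
    return sorted(category_scores.items(), key=lambda item: item[1])[:n]


for category, score in weakest(scores):
    print(f"{category}: {score}")
```

With the Annex 1 values, the four lowest categories are VARIABLE, DATA FILES, METADATA, CITATION AND USE and IHSN CATALOG, which matches the comments about undocumented database structure and download formats.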
Annex 2: Summary of ADP implementation in ECUADOR
ADP implementation in ECUADOR up to July/2015 | NSO – INEC | NSS – SEN | TOTAL COUNTRY
Year ADP Started | 2010 | 2014 | 2008 - 2015
Number ADP interventions in the Country | 3 | 1 | 4
Metadata Production Workshops | 1 | 1 | 2
National Data Archive (NADA) | 1 | 0 | 1
User Outreach Workshops | 0 | 0 | 0
Regional Workshops | 1 | 0 | 1
Number of Institutions Trained in Data Documentation | 1 | 12 | 13
Number of Institutions attending User Outreach Workshops | 0 | 0 | 0
Number of persons attending ADP training events | 20 | 56 | 76
Number of women trained in documentation | 32
Number of Surveys Documented (Available on NADA) | 129 | 0 | 129
Number of Microdata available in NADA | 65 | 0 | 65
Average score: 67.4%
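The TOTAL COUNTRY column in Annex 2 should equal the sum of the NSO and NSS columns for every count row. The following sketch is one way to cross-check that, assuming the three-value rows are simple counts; the "Year ADP Started" row and the single-value "women trained" row are left out because they are not sums. The variable names are illustrative.

```python
# Count rows copied from Annex 2 as (NSO - INEC, NSS - SEN, TOTAL COUNTRY).
rows = {
    "Number ADP interventions in the Country": (3, 1, 4),
    "Metadata Production Workshops": (1, 1, 2),
    "National Data Archive (NADA)": (1, 0, 1),
    "User Outreach Workshops": (0, 0, 0),
    "Regional Workshops": (1, 0, 1),
    "Number of Institutions Trained in Data Documentation": (1, 12, 13),
    "Number of Institutions attending User Outreach Workshops": (0, 0, 0),
    "Number of persons attending ADP training events": (20, 56, 76),
    "Number of Surveys Documented (Available on NADA)": (129, 0, 129),
    "Number of Microdata available in NADA": (65, 0, 65),
}

# Collect any row whose reported total is not NSO + NSS.
mismatches = [
    name for name, (nso, nss, total) in rows.items() if nso + nss != total
]

if mismatches:
    print("Rows with inconsistent totals:", mismatches)
else:
    print("All totals equal NSO + NSS.")
```

With the values in the table, no row is flagged, so the reported totals are internally consistent.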