Uganda – October 2009
Census Data Collection & Processing
John Gomersall
Agenda
Census data collection – Enumeration
Processing – Data Capture phase
Decisions on the methods for both
Chronologically, enumeration comes before data capture, and we will consider them in that order.
However, decisions on one will inevitably influence the other. The most obvious example of this is the design of the Questionnaire.
What is possible & what is practical?
- cost, infrastructure, skills
Choose a Method for enumeration
Options
Field based / Remote
Enumerator / Self / Internet / Telephone
Paper / PDA
1st decision made!
Data capture
Enumeration
Questionnaire Design
Including:
Census Preparations
CSPro
Census Results
Data capture?
Not so good news ……
No panacea !
Good news …… technology can help
& the choices of technology – are relatively few.
(Not to be confused with the potential number of suppliers of these technologies - which may be many !)
Choose a Method for enumeration
Data Capture Methods - Sub Saharan Africa
Field based
Enumerator
Paper
Pen / Pencil / Pen & Pencil
Second decision ???
Choose a Method for Data Capture
Data Capture Methods Sub Saharan Africa
Field based / Enumerator / Paper
Checking / Coding (?) / Organising
Data Capture
Keying / Scanning + Keying
Considerations:
Timeframe for producing results
Anticipated condition of Questionnaires
Form preparation requirements
Accuracy ??
If Questionnaires are really bad, and/or sufficient low-cost keying capacity is available and manageable to meet acceptable timescales ... keying may be the best method!
Keying or Scanning + Keying ?
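Whether keying capacity can "meet acceptable timescales" is a matter of simple arithmetic. The sketch below is illustrative only: the function name and every input figure (number of questionnaires, minutes per form, operators, shift length) are placeholder assumptions, not figures from any census.

```python
import math

def keying_days(questionnaires, minutes_per_form, operators,
                hours_per_shift, shifts_per_day=1):
    """Working days needed to key every questionnaire once.

    All arguments are planning assumptions supplied by the user.
    """
    total_minutes = questionnaires * minutes_per_form
    minutes_per_day = operators * hours_per_shift * 60 * shifts_per_day
    # Round up: a part-used day is still a working day.
    return math.ceil(total_minutes / minutes_per_day)

# Placeholder inputs, for illustration only.
days = keying_days(questionnaires=1_000_000, minutes_per_form=5,
                   operators=200, hours_per_shift=6)
print(days)
```

If the resulting number of days does not fit the timeframe for producing results, either more keying capacity is needed or Scanning + Keying becomes the more attractive option.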
Choose a Method for Data Capture
Data Capture Methods Sub Saharan Africa
Field based / Enumerator / Paper
Checking / Coding (?) / Organising
Data Capture
Keying Timescales / Questionnaire “Preparation” OK
Scanning + Keying
Decision made !!!
Scanning + Keying Options 4
Scanning + Keying Options 3
ICR
OMR
Main Pro Characteristics:
OMR
Quickest, simplest , most accurate data capture method for well completed forms
Enumerator only needs to mark with HB Pencil + Eraser
Appropriate level of technology for current IT resources
Cost competitive solution – known / one time
ICR
Questionnaire design & Printing
Potential for technology transfer
Ethiopia 2007
Sudan 2008
Malawi 2008
Scanning + Keying Options 2
The alternatives:
In both cases, an image of the form will be captured, and the information which has been filled in on the questionnaire will be automatically collected using
OMR (Optical Mark Reading) and/or
ICR (Intelligent Character Recognition).
In either case, this will be followed by:
validation, and key correction of “irregularities”
So, scanning is not the worry !!
So what is the worry?
The worries are that:
having captured the images, there may be problems with extracting the data from those images,
and following from this
integrity of the data – is the form in the correct EA; is all data for a household together?
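The two integrity questions above can be checked mechanically once each scanned form carries an identifier. The following is a minimal sketch, assuming each form record has a form ID, the EA of the batch it was scanned in, the EA read from the form itself, and a household ID. The field names and sample records are hypothetical.

```python
from collections import defaultdict

def integrity_problems(forms):
    """Return (identifier, problem) pairs for a list of form records.

    Each record is a dict with the hypothetical keys:
    form_id, batch_ea, form_ea, household_id.
    """
    problems = []
    batches_by_household = defaultdict(set)
    for f in forms:
        # Is the form in the correct EA?
        if f["form_ea"] != f["batch_ea"]:
            problems.append((f["form_id"], "form EA differs from batch EA"))
        batches_by_household[(f["form_ea"], f["household_id"])].add(f["batch_ea"])
    # Is all data for a household together?
    for (ea, household), batches in sorted(batches_by_household.items()):
        if len(batches) > 1:
            problems.append((f"{ea}/{household}", "household split across batches"))
    return problems

# Hypothetical sample records, for illustration only.
sample_forms = [
    {"form_id": "F1", "batch_ea": "EA01", "form_ea": "EA01", "household_id": "H1"},
    {"form_id": "F2", "batch_ea": "EA01", "form_ea": "EA02", "household_id": "H7"},
    {"form_id": "F3", "batch_ea": "EA01", "form_ea": "EA01", "household_id": "H2"},
    {"form_id": "F4", "batch_ea": "EA03", "form_ea": "EA01", "household_id": "H2"},
]

for identifier, problem in integrity_problems(sample_forms):
    print(identifier, "-", problem)
```

Checks of this kind only work if the questionnaire carries a reliable barcode or unique printed number, which is one reason the questionnaire design matters so much to data capture.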
So what are the considerations – how do we minimise the worry?
Data Capture Methods Sub Saharan Africa
Field based
Paper
Enumerator Data Capture
Keying Scanning + Keying
Scanning
Extracting the data
Priorities:
Identify an appropriate data capture solution – hardware + software.
Identify an appropriate data capture solution for you. That may be:
Procuring Tools – if you have experienced in-house software development expertise,
or Partnering with an experienced supplier with a customised, tested software product, with effective on-site / remote support.
(Kenya – with USBC, Ethiopia, Sudan, Malawi – with DRS, Egypt, Morocco)
Basically:
– The software needs to work. This means that, in addition to OMR / ICR recognition, does it provide elements for batch control, logic validation, key correction, quality control?
– Will the required support & guidance be available?
There are partners out there who have the technology!
Priorities
Implement effective processes & logistics
Identify an appropriate data capture solution.
Implement effective processes & logistics:
Enumerator Training & Job definition
Supervisor Training & Job definition
Questionnaire flow
Checking & Batching – by EA
Preparation for Scanning – Separation
Define scanning checks
In good time – process a comprehensive Pilot
Priorities
1. Make it easy for the enumerator to do a good job
2. Implement effective processes & logistics
3. Identify an appropriate data capture software product and supplier.
3. IT
2. CSO
1. ENUMERATORS & SUPERVISORS
Make it easy for the enumerator to do a good job.
Unfortunate realities!
Scanning cannot make a “bad form” good
Bad forms will make any scanning technology look ineffective
The message to remember!
The main concern should be the training of Enumerators and Supervisors to provide Questionnaires in good condition & which have been well completed, then the scanning will take care of itself!
Questionnaire Considerations:
Paper – Environment
Size
Scanning
Design
Barcode/unique identifier
Ease of completion
The single most important factor for accurate data capture is to make sure
‘the forms are filled in correctly & are returned in good condition’
Data Capture
Considerations – Data types
Data Capture
Tick box data
Constrained handwritten data
Unconstrained handwritten data
Barcode & Unique printed number
It is easier to train 30,000 enumerators to mark in pencil
than it is to train them to correctly write,
& accurately position characters – in ink!
Data Capture Methods - Sub Saharan Africa
Field based
Enumerator
Paper
Pen / Pencil / Pen & Pencil
Priorities
1. Make it easy for the enumerator to do a good job
2. Implement effective processes & logistics
3. Identify an appropriate data capture software product and supplier.
Data Capture Methods Sub Saharan Africa
Field based
Paper
Enumerator Data Capture
Keying Scanning + Keying
Scanning
Extracting the data
Validation & Key Correction
Validation & Key Correction I
The scanning process will capture what the enumerators have filled in on the Questionnaires – mistakes included!
The purpose of running the validation is to identify where the enumerator has not completed the Questionnaire according to the rules set out in the Enumerator’s manual, and to flag those errors for potential key correction.
These errors will include those questions that are:
expected to have been filled in but are blank
partially marked
multi marked
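The three error types listed above can be illustrated with a small sketch. It assumes the OMR software reports a darkness value between 0 and 1 for each answer bubble of a question. The thresholds, function names and sample readings are hypothetical and do not reflect any particular supplier's product.

```python
# Hypothetical thresholds: real values would be tuned during the Pilot.
MARK_THRESHOLD = 0.6    # at or above this, a bubble counts as marked
FAINT_THRESHOLD = 0.25  # between the two thresholds, the mark is doubtful

def classify_question(darkness, expected=True):
    """Classify one question from the darkness readings of its bubbles."""
    marked = [d for d in darkness if d >= MARK_THRESHOLD]
    faint = [d for d in darkness if FAINT_THRESHOLD <= d < MARK_THRESHOLD]
    if len(marked) > 1:
        return "multi marked"
    if faint:
        return "partially marked"
    if not marked:
        # A blank is only an error if the question should have been answered.
        return "blank" if expected else "ok"
    return "ok"

def flag_for_key_correction(form):
    """Return {question: status} for every question that is not 'ok'."""
    flags = {}
    for question, darkness in form.items():
        status = classify_question(darkness)
        if status != "ok":
            flags[question] = status
    return flags

# Hypothetical readings for one questionnaire.
sample_form = {
    "sex": [0.02, 0.91],
    "age_tens": [0.0, 0.0, 0.0],
    "age_units": [0.1, 0.4, 0.0],
    "marital_status": [0.8, 0.9, 0.0],
}
print(flag_for_key_correction(sample_form))
```

Questions flagged in this way would be presented to an operator for key correction against the scanned image, rather than being corrected automatically.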
& a key correction screen something like this
Operator enters “2” as it can be clearly seen that the correct age is 27
Validation & Key Correction II
Following validation and verification/correction, all captured data will be exported to CSPro, or other processing software.
The end of the beginning !
Data Capture
Summary/Recommendations
– Choose the method that best suits your organisation/infrastructure and is the best fit to your requirements
– If choosing a new method, plan well ahead, consult with others & test with a pilot
– Plan for your staffing and logistics
– If you wish to use your data spatially you will need to geo-reference your data
– Choose a partner with experience and one that will support your process
Thank you for listening