sumandro_the_art_of_nsso_data_5thelephant_20120728

The art of NSSO data sumandro chattapadhyay ajantriks.net | @ajantriks

Upload: sumandro

Post on 28-Nov-2014

DESCRIPTION

Presentation made at The Fifth Elephant conference, organised by HasGeek in July 2012, on the structure of data published by the National Sample Survey Office (NSSO), Govt. of India, and its extraction using R.

TRANSCRIPT

Page 1: sumandro_the_art_of_nsso_data_5thelephant_20120728

The art of NSSO data

sumandro chattapadhyay
ajantriks.net | @ajantriks

Page 2: sumandro_the_art_of_nsso_data_5thelephant_20120728

Per Capita Floor Area (Sq. Mt.) by MPCE Quintile Class

MPCE Quintile Class    Pucca    Semi-Pucca    Katcha    All
0-20                    5.84          5.03      4.32     5.63
20-40                   6.72          7.37      5.16     6.75
40-60                   7.98          7.85      6.88     7.96
60-80                  10.13          9.32      6.50    10.09
80-100                 16.83         14.70     20.15    16.83
All                     9.77          6.55      4.97     9.45

Page 3: sumandro_the_art_of_nsso_data_5thelephant_20120728

- History
- Concepts
- Data Organisation

Structure

Page 5: sumandro_the_art_of_nsso_data_5thelephant_20120728

1862 Statistical Committee constituted, publication of the first Statistical Abstract of British India (1840-65)

1881 First Decennial Population Census begins

1914 Directorate of Statistics established, later became the Directorate of Commercial Intelligence and Statistics

1939 Wholesale Price Index collection and calculation begins

History

Page 6: sumandro_the_art_of_nsso_data_5thelephant_20120728

1947 P.C. Mahalanobis appointed as the Honorary Statistical Advisor

1949 The Central Statistical Unit established

1951 Central Statistical Organisation (CSO) and Department of Statistics are established as nodal national data gathering institutions. Presently, the CSO is part of the Ministry of Statistics and Programme Implementation

History

Page 7: sumandro_the_art_of_nsso_data_5thelephant_20120728

- History
- Concepts
- Data Organisation

Structure

Page 8: sumandro_the_art_of_nsso_data_5thelephant_20120728

Round: Each annual cycle of data collection by NSSO

Schedule: Thematic focus for data collection, multiple schedules per Round

Thick Round: Major data collection rounds repeated every 5 years (hence called quinquennial rounds)

Thin Round: Minor data collection rounds

Concepts

Page 9: sumandro_the_art_of_nsso_data_5thelephant_20120728

State-Region: Usually a cluster of three or more districts in a state

Fixed-Width Data Format: Data files in text format specified by fixed column widths, pad character and left/right alignment.

Schedule File: Questionnaire for the survey concerned

Layout File: Description of organisation of variables

Concepts
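
To make the fixed-width idea concrete, here is a minimal sketch in Python (the talk itself used R; Python is swapped in here only for illustration). The helper `read_field` and the sample record are hypothetical and not taken from any NSSO layout file. It shows how column positions, the pad character and the alignment together determine a field's value.

```python
def read_field(line, start, end, pad=" ", align="right"):
    """Return the value held in 1-based, inclusive columns start..end,
    with the pad character stripped from the padded side."""
    raw = line[start - 1:end]
    # Right-aligned values are padded on the left; left-aligned ones on the right.
    return raw.lstrip(pad) if align == "right" else raw.rstrip(pad)

# Hypothetical 10-column record: a 4-column zero-padded code
# followed by a 6-column space-padded name.
record = "0042Delhi "
code = read_field(record, 1, 4, pad="0", align="right")
name = read_field(record, 5, 10, pad=" ", align="left")
print(code, name)
```

Note that stripping a zero pad from a field that is all zeros leaves an empty string, so numeric fields are often safer converted with `int()` on the raw slice instead.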

Page 10: sumandro_the_art_of_nsso_data_5thelephant_20120728

- History
- Concepts
- Data Organisation

Structure

Page 11: sumandro_the_art_of_nsso_data_5thelephant_20120728

Organisation of Raw Data:

- Fixed-width file (.txt)
- Binary coding of information

Supporting Files:

- Schedule file
- Layout file
- Readme file
- State and district codes

Data Organisation

Page 12: sumandro_the_art_of_nsso_data_5thelephant_20120728

Levels:

- Multi-row coding of information about the same entity
- Binary coding of information

Questions to households and individuals:

- Need to generate unique IDs at the household and at the individual levels
- Appropriate weightage (by household size)

Data Organisation
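
The two points about unique IDs and weightage can be sketched as follows, in Python rather than the R used in the talk. Everything here is an assumption made for illustration: the field names (`fsu`, `hh_no`, `person_no`, `size`, `floor_area`), the values, and the choice to build IDs by concatenating identifying fields. An actual NSSO layout file defines which fields identify a household, and real estimates would also need the survey's own multipliers, which are not covered here.

```python
# Hypothetical records; field names and values are made up for illustration.
households = [
    {"fsu": "10001", "hh_no": "01", "size": 4, "floor_area": 30.0},
    {"fsu": "10001", "hh_no": "02", "size": 2, "floor_area": 25.0},
]
persons = [
    {"fsu": "10001", "hh_no": "01", "person_no": "01"},
    {"fsu": "10001", "hh_no": "01", "person_no": "02"},
    {"fsu": "10001", "hh_no": "02", "person_no": "01"},
]

def household_id(rec):
    # Concatenate the fields that together identify one household.
    return rec["fsu"] + rec["hh_no"]

def person_id(rec):
    # A person is identified by the household ID plus a serial number within it.
    return household_id(rec) + rec["person_no"]

hh_ids = [household_id(h) for h in households]
person_ids = [person_id(p) for p in persons]

# Per capita floor area: weight each household's per capita value by its size,
# so that larger households count for more people.
total_people = sum(h["size"] for h in households)
weighted_mean = sum((h["floor_area"] / h["size"]) * h["size"] for h in households) / total_people

# For comparison: the plain average over households ignores household size.
unweighted_mean = sum(h["floor_area"] / h["size"] for h in households) / len(households)

print(hh_ids, person_ids, round(weighted_mean, 2), unweighted_mean)
```

The gap between the two means shows why a household-level value needs weighting by household size before it is read as a per-person figure.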

Page 13: sumandro_the_art_of_nsso_data_5thelephant_20120728

Schedule:
1. What is the serial no. of a person?
2. What is his/her age?
3. What is his/her daily wage?

Layout:
1. Serial Number : Column 1-3
2. Age : Column 4-5
3. Daily wage : Column 6-9

Data:
12121212121212
34343434343434

Data Organisation
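
As a rough illustration of how the layout above maps onto the data, the following Python sketch (the talk used R; Python is used here only as an example language) slices each record by the listed column ranges. It assumes the data string on the slide consists of two 14-character records, which is an inference from the transcript, not something the slide states. The names `LAYOUT` and `parse_record` are hypothetical.

```python
# Column ranges from the slide's layout, 1-based and inclusive.
LAYOUT = {"serial": (1, 3), "age": (4, 5), "daily_wage": (6, 9)}

def parse_record(line, layout):
    """Cut one fixed-width line into named fields according to the layout."""
    return {name: line[start - 1:end] for name, (start, end) in layout.items()}

data = "1212121212121234343434343434"
width = 14  # assumption: each record is 14 characters wide
rows = [data[i:i + width] for i in range(0, len(data), width)]

records = [parse_record(row, LAYOUT) for row in rows]
for rec in records:
    print(rec)
```

Columns 10-14 of each record are not described by this layout and are simply ignored by the slicing.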

Page 17: sumandro_the_art_of_nsso_data_5thelephant_20120728

Schedule:
1. What is the serial no. of a person?
2. What is his/her age?
3. What is his/her daily wage?

Layout:
1. Serial Number : Column 1-3
2. Level : Column 4
3. Age : Column 5-6 (if level = 2)
4. Daily wage : Column 5-8 (if level = 4)

Data:
12121212121212
12143434343434

Data Organisation
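
A minimal sketch of level-dependent parsing, written in Python rather than the R used in the talk. It assumes the data string on the slide is two 14-character records for the same person (serial 121), one at level 2 and one at level 4; that split is inferred from the layout, not stated on the slide. The names `COMMON`, `LEVEL_LAYOUTS`, `cut` and `parse` are hypothetical.

```python
# Fields present in every record, 1-based inclusive columns.
COMMON = {"serial": (1, 3), "level": (4, 4)}

# Fields whose position depends on the record's level.
LEVEL_LAYOUTS = {
    "2": {"age": (5, 6)},
    "4": {"daily_wage": (5, 8)},
}

def cut(line, span):
    start, end = span
    return line[start - 1:end]

def parse(line):
    """Read the common fields first, then the fields defined for that level."""
    rec = {name: cut(line, span) for name, span in COMMON.items()}
    for name, span in LEVEL_LAYOUTS.get(rec["level"], {}).items():
        rec[name] = cut(line, span)
    return rec

rows = ["12121212121212", "12143434343434"]  # assumed 14-character records

# Merge the multi-row information about the same person into one entry.
people = {}
for row in rows:
    rec = parse(row)
    fields = {k: v for k, v in rec.items() if k != "level"}
    people.setdefault(rec["serial"], {}).update(fields)

print(people)
```

Keying the merged entries on the serial number is what turns several rows about one entity into a single record; with real data the key would be the unique person ID rather than the serial number alone.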

Page 21: sumandro_the_art_of_nsso_data_5thelephant_20120728

sumandro chattapadhyay
ajantriks.net | @ajantriks