
The 6th Conference on Survey Sampling in Economic and Social Research

September 21-22, 2009, Katowice, Poland

Criticalities in Applying the Neyman’s Optimality in Business Surveys: a Comparison of Selected Allocation Methods

Paola M. Chiodini a,d, Rita Lima c, Giancarlo Manzi b,d, Bianca Maria Martelli c,*, Flavio Verrecchia d

[email protected]

a. Department of Statistics, Università di Milano-Bicocca, Milan, Italy
b. Department of Economics, Business and Statistics, Università degli Studi di Milano, Milan, Italy
c. ISAE, Rome, Italy
d. ESeC, Assago (MI), Italy

September, 21-22 2009, 6th Conference “Survey Sampling in Economic and Social Research “ , Katowice, Poland


AIM OF THE PAPER

DISCUSS POSSIBLE MORE EFFICIENT SAMPLE DESIGNS FOR THE ISAE BUSINESS TENDENCY SURVEY (BTS), CONSIDERING:

– BTS economic features

– BTS statistical features

– Operational bounds

TO MEET EVERYBODY’S NEEDS WHILE STRENGTHENING OUTCOME RELIABILITY (INDUSTRIAL CONFIDENCE)


BTS ECONOMIC FEATURES

• Business Tendency Surveys (BTS) investigate the CONFIDENCE of economic agents

• CONFIDENCE can be defined as the (positive) attitude of economic agents toward both firm-level (internal) and country-level (external) variables
– The corresponding true value in the universe is unknown

• To this purpose, BTS collect information about a wide range of variables selected for their capability, when analysed together, to give an overall picture of the industrial sector of the economy (OECD 2003)

• The survey asks entrepreneurs and managers for assessments of current trends and expectations for the near future, regarding both their own business and the general situation of the economy

• Business Tendency Surveys thus collect qualitative information, mainly on a three-option ordinal scale


CONFIDENCE

• Answers obtained from the survey are quantified in the form of “balances”, i.e. differences between the percentages of positive and negative answers

• The statistical series derived from business tendency surveys are particularly suitable for monitoring and forecasting business cycles

• The aggregation of selected series (order book level, production expectations and stocks) gives the confidence indicator

• Confidence indicators (and some single series too) often have leading capabilities and are widely used in the analysis of the economic cycle (recessions/expansions)
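
The quantification into balances can be sketched in a few lines of Python. This is only an illustration: the function names, the answer coding ("+", "=", "-") and the toy answers are assumptions, answers are left unweighted, and the confidence indicator is taken here as the plain average of the three balances with stocks entering with inverted sign, which follows the usual EC harmonised convention but is not spelled out on the slides.

```python
from collections import Counter

def balance(answers):
    """Percentage of positive answers minus percentage of negative answers."""
    if not answers:
        raise ValueError("no answers")
    counts = Counter(answers)
    return 100.0 * (counts["+"] - counts["-"]) / len(answers)

def confidence(order_books, production_expectations, stocks):
    """Assumed aggregation: mean of the three balances, stocks with inverted sign."""
    return (order_books + production_expectations - stocks) / 3.0

# Toy answers on a three-option ordinal scale (made-up data).
orders = balance(["+", "=", "-", "-", "="])   # 20% positive, 40% negative
expect = balance(["+", "+", "=", "-", "="])   # 40% positive, 20% negative
stocks = balance(["+", "=", "=", "=", "-"])   # 20% positive, 20% negative
indicator = confidence(orders, expect, stocks)
print(orders, expect, stocks, indicator)
```

In a production setting each answer would carry a weight (the deck mentions sample weights and size weights), so the percentages would be weighted shares rather than simple counts.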


SHORT SURVEY HISTORY

• The manufacturing survey began in 1959 on a quarterly basis and became monthly in 1962, with a limited number of questions (purposive panel)

• Over the years the survey was substantially modified to meet emerging needs:
– 1986: the sample was updated to provide information at the regional level as well, adopting a stratified (sector/region/size), partially random sample
– 1998: Neyman’s optimal allocation of the reporting units to sample strata, based on workforce variance, was introduced (Cochran 1977)
– 2003: data processing was upgraded with a two-stage weighting system (sample weights and size weights) in line with OECD (2003), ensuring full comparability between local and national data


GDP and CONFIDENCE

• Confidence tracks GDP shifts well
• In recent times (since April 2009) the survey has given positive signals (last available GDP figures, Q2 2009: very negative)

[Figure: GDP (% change on t-4, left-hand scale, -6 to 6) and Confidence (index, 2000=100, seasonally adjusted, right-hand scale, 60 to 105)]


EUROPEAN REFERENCE FRAME

• The survey is part of the Joint Harmonised Business and Consumer Survey (BCS) program of the European Commission

• The project began in 1962 and ISAE (formerly ISCO) was one of the founding members

• The principle of harmonisation underlying the project aims to produce a set of comparable data for all European countries (EC 2007)

• To achieve this goal institutes have to:
– Use the same harmonised questionnaire
– Strictly respect the Commission timetable in carrying out the survey and transmitting the results

• Institutes are relatively free to define any other aspect of the entire process (apart from a minimum sample size)


BTS STATISTICAL FEATURES: SAMPLE DESIGN

• FRAME: ASIA archive of Italian active firms (last update 2006)
– Advantage: complete universe of firms
– Drawback: relatively late update
– Choice: keep ASIA as frame

• QUESTIONNAIRE: fixed by the Commission; it can only be supplemented with additional questions

• DATA COLLECTION MODE: CATI (Computer-Assisted Telephone Interviewing), partly supplemented by fax (some CAWI foreseen)
– Choice: mixed mode


OPERATIONAL CONSTRAINTS

– EC:
  • Recommended SAMPLE SIZE of about 4000 units (firms/kind-of-activity units), bound to the country population size
  • Very strict TIMING CONSTRAINTS: monthly frequency, 12 days for data collection, 1 week for processing results

– NATIONAL: LOCAL INFORMATION
  • Governmental priority
  • Possible revenues

– ISAE: PRESERVING “LOYAL” FIRMS
  • Research purposes of longitudinal analyses
  • Conflicting with sampling theory (panel rotation)


BTS STATISTICAL FEATURES

As the total sample size is predetermined (about 4000 units), precision can mainly be increased by working on:

– Strata definition (partially predetermined and bound to economic and administrative settings)
– Units’ allocation to strata
– Panel maintenance
– Non-response handling
– Weighting


STRATA DEFINITION

STRATA defined according to:

• ECONOMIC SECTORS
– 19, close to the EC requirements, adapted to the Italian economy

• AREAS (NUTS1)
– 4, administrative classification, widely different in size

• FIRMS’ SIZE (by workforce)
– Small (10-49), Medium (50-249), Large (>=250). The distribution is right (positively) skewed because of the presence of few “large” establishments and many “small” units

• Minimum threshold of 10 employees
– Covers about 80% of total workforce

FIRMS BY STRATA

Columns: economic sector; then, for each area in the order Nord Ovest, Nord Est, Centro, Sud e Isole, the size classes 10-49, 50-249 and 250+; last column: Total. A “.” marks a cell with no value.

10-12. Manufacture of food, beverages and tobacco products 1496 233 56 1715 277 41 1082 90 12 1856 175 12 7045

13. Manufacture of textiles 1584 342 55 575 88 10 1047 82 4 272 32 3 4094

14. Manufacture of wearing apparel 1472 140 23 1856 151 23 1230 102 9 1298 96 6 6406

15. Manufacture of leather and related products 318 46 1 893 125 12 2053 141 9 625 52 3 4278

16-17. Manufacture of wood and paper products 1239 154 17 1404 168 15 860 89 10 750 44 3 4753

18. Printing and reproduction of recorded media 966 75 10 740 60 5 505 33 1 317 21 . 2733

19. Manufacture of coke and refined petroleum products 32 9 7 16 5 . 19 7 4 78 7 4 188

20-21. Manufacture of chemical and pharmaceutical products 680 304 90 328 111 14 227 67 32 221 27 1 2102

22. Manufacture of rubber and plastic products 1511 292 42 939 190 15 546 93 5 449 61 5 4148

23. Manufacture of other non-metallic mineral products 938 126 18 1265 226 49 857 111 15 1204 104 1 4914

24. Manufacture of basic metals 652 208 41 275 119 12 160 35 6 147 34 4 1693

25. Manufacture of fabricated metal products, except machinery and equipment 6199 622 41 4528 438 28 1949 175 11 1941 219 11 16162

26. Manufacture of computer, electronic and optical products 717 145 28 439 97 18 260 57 15 113 24 5 1918

27. Manufacture of electrical equipment 1050 206 35 808 161 29 362 54 15 194 27 4 2945

28. Manufacture of machinery and equipment n.e.c. 3243 692 82 2830 595 104 739 121 8 523 60 3 9000

29-30. Manufacture of transport vehicles 699 183 84 396 95 30 313 70 13 237 85 20 2225

31. Manufacture of furniture 927 95 4 1730 242 19 931 96 7 527 63 6 4647

32. Other manufacturing 619 82 12 719 111 8 506 36 4 185 6 1 2289

33. Repair and installation of machinery and equipment 1450 81 6 979 46 2 657 31 4 849 62 3 4170

Total 25792 4035 652 22435 3305 434 14303 1490 184 11786 1199 95 85710

TOTAL WORKFORCE BY STRATA

Columns: economic sector; then, for each area in the order Nord Ovest, Nord Est, Centro, Sud e Isole, the size classes 10-49, 50-249 and 250+; last column: Total. A “.” marks a cell with no value.

10-12. Manufacture of food, beverages and tobacco products 27893 23432 40923 31525 28191 32670 18935 8486 7621 32959 16070 5503 274208

13. Manufacture of textiles 32739 33020 26112 10710 7955 6052 18756 6527 1551 5062 2917 905 152306

14. Manufacture of wearing apparel 25793 13141 16920 32975 14787 11170 22033 8594 3734 24100 7785 3805 184838

15. Manufacture of leather and related products 5503 4124 386 17289 12460 4817 36840 10993 6341 11089 4621 1071 115532

16-17. Manufacture of wood and paper products 22614 15000 13008 25842 16333 7002 15504 7683 5124 12912 4316 1876 147213

18. Printing and reproduction of recorded media 17487 7378 4488 13377 5689 2794 8674 2984 2383 5279 1549 . 72081

19. Manufacture of coke and refined petroleum products 707 1334 4508 372 510 . 465 746 2259 1360 473 3419 16152

20-21. Manufacture of chemical and pharmaceutical products 14955 32746 58762 7149 11242 8071 4343 7667 32378 4106 2811 261 184492

22. Manufacture of rubber and plastic products 30241 27601 28723 19022 18094 5443 10691 8381 2428 9049 6394 3247 169315

23. Manufacture of other non-metallic mineral products 17815 12261 14026 24414 23525 28662 15779 10672 6612 21637 9113 2218 186734

24. Manufacture of basic metals 13577 21680 46749 5770 13307 8195 3262 3637 5906 2986 3352 1818 130240

25. Manufacture of fabricated metal products, except machinery and equipment 112440 55834 17454 84656 39819 12084 34463 15080 4608 35425 19366 3622 434851

26. Manufacture of computer, electronic and optical products 14252 14888 30018 8887 9793 9102 4951 5539 13192 2110 2744 4238 119713

27. Manufacture of electrical equipment 20542 20544 31418 16787 16037 22000 6837 5834 15590 3759 2206 2206 163760

28. Manufacture of machinery and equipment n.e.c. 64091 68542 45279 57244 58170 62515 14439 11284 7747 10022 5096 1063 405491

29-30. Manufacture of transport vehicles 14322 19757 113600 8376 9871 30848 6087 7599 10836 4722 8721 33591 268329

31. Manufacture of furniture 16196 9189 1547 32435 21947 7379 16910 8348 3290 9853 5070 4874 137038

32. Other manufacturing 11120 8104 3685 13714 10307 13520 8991 3137 1794 3199 612 326 78509

33. Repair and installation of machinery and equipment 24766 7353 4053 16932 3820 723 11118 2579 5974 15474 5411 1436 99639

Total 487054 395926 501659 427476 321858 273047 259079 135768 139368 215102 108625 75479 3340442


FIRMS POPULATION BY SIZE

[Figure: firms population by size class; panels: 10-49, 50-249, 250 and over, Total]


SIMULATION


UNIT ALLOCATION TO STRATA: SIMULATION SETTINGS

• REFERENCE POPULATION: ASIA INDUSTRIAL SECTOR
– 85710 ENTERPRISES
– 3340442 PERSONS EMPLOYED

• 3 DIMENSIONS
– AREAS (NUTS1)
– ECONOMIC SECTORS
– FIRMS’ SIZE

• 226 STRATA

• 500 REPLICATES

• SIMULATION TECHNIQUE: SEQUENTIAL UNIT SELECTION


UNITS ALLOCATION TO STRATA: ALTERNATIVE ALLOCATION METHODS

• UNIFORM (21 units per stratum)

• PROPORTIONAL (sampling fraction f_h = 4.4%)

• NEYMAN (x-optimal)

• ISAE (NEYMAN x-optimal on areas; winsorised 5%)

• AOSU(n1h): UNIFORM(n1h) + NEYMAN(n2h)
– n1h = 1, 2, …, 21
– n2h = nh - n1h
– so that:
  • if n1h = 0 then AOSU0 = NEYMAN
  • if n1h = 21 then AOSU21 = UNIFORM

• APSU(n1h): UNIFORM(n1h) + PROPORTIONAL(n2h)
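
A minimal sketch of how the proportional, Neyman, uniform and AOSU allocations listed above might be computed. It rests on stated assumptions: the textbook formulas n_h ∝ N_h (proportional) and n_h ∝ N_h·S_h (Neyman), a fixed overall sample size n from which the uniform base is subtracted before the Neyman share is distributed, and made-up toy strata. The function names are illustrative, allocations are left as unrounded floats, and the ISAE variant (winsorisation, allocation on areas) is not reproduced.

```python
def proportional(N, n):
    """n_h proportional to stratum size N_h (constant sampling fraction)."""
    total = sum(N)
    return [n * N_h / total for N_h in N]

def neyman(N, S, n):
    """n_h proportional to N_h * S_h, S_h being the stratum std of the auxiliary variable."""
    weights = [N_h * S_h for N_h, S_h in zip(N, S)]
    total = sum(weights)
    return [n * w / total for w in weights]

def uniform(N, n1):
    """Same number of units in every stratum, capped at the stratum size."""
    return [min(n1, N_h) for N_h in N]

def aosu(N, S, n, n1):
    """Assumed reading of AOSU: uniform base of n1 units plus a Neyman share of the rest."""
    base = uniform(N, n1)
    rest = n - sum(base)
    if rest < 0:
        raise ValueError("n is too small for the requested uniform base")
    extra = neyman(N, S, rest)
    return [b + e for b, e in zip(base, extra)]

# Toy strata (made-up figures); the last one has a single firm, hence zero variance.
N = [1000, 200, 20, 1]           # firms per stratum
S = [10.0, 50.0, 400.0, 0.0]     # std of workforce per stratum
n = 60

print(proportional(N, n))
print(neyman(N, S, n))           # the zero-variance stratum receives no units
print(aosu(N, S, n, n1=0))       # reduces to Neyman
print(aosu(N, S, n, n1=5))       # every stratum is covered
```

The toy example shows why a pure Neyman allocation can leave strata uncovered: a stratum with zero variance gets no units, whereas the uniform base in AOSU guarantees at least min(n1, N_h) units everywhere.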


UNIT ALLOCATION TO STRATA: SIMULATION METHOD

[Flowchart: START → RANDOM UNIT SELECTION (SEQUENTIALLY RANKED) → ALLOCATION METHODS (Neyman samples, ISAE samples, AOSU(n1) samples) → REPLICATION into the Simulation DW (if replicate < 500, loop; if replicate = 500, continue) → OUTPUT (OVERALL STATS, DOMAIN STATS) → INFERENCE → END]
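
One possible reading of the sequentially ranked selection is sketched below: in each replicate the units of every stratum are put in random order once, and each allocation method takes the first n_h units of that ranking, so all methods are compared on nested samples. This interpretation, the function name `replicate_estimates`, the expansion estimator of the total (N_h times the sample mean, summed over strata) and the toy data are all assumptions, not the authors' code.

```python
import random

def replicate_estimates(strata, allocations, replicates=500, seed=0):
    """strata: {stratum: [workforce per firm]}; allocations: {method: {stratum: n_h}}."""
    rng = random.Random(seed)
    out = {method: [] for method in allocations}
    for _ in range(replicates):
        # One random ranking per stratum, shared by all allocation methods.
        ranked = {h: rng.sample(units, len(units)) for h, units in strata.items()}
        for method, n_by_stratum in allocations.items():
            total = 0.0
            for h, units in ranked.items():
                n_h = min(n_by_stratum.get(h, 0), len(units))
                if n_h == 0:
                    continue  # uncovered stratum contributes nothing to the estimate
                sample = units[:n_h]
                total += len(units) * sum(sample) / n_h  # N_h times the sample mean
            out[method].append(total)
    return out

# Toy population (made-up): two strata with a true total workforce of 1682.
strata = {"small": [10, 12, 15, 20, 30, 45], "large": [250, 400, 900]}
allocations = {
    "census": {"small": 6, "large": 3},
    "two_each": {"small": 2, "large": 2},
}
estimates = replicate_estimates(strata, allocations, replicates=50, seed=1)
print(min(estimates["two_each"]), max(estimates["two_each"]))
```

Sharing the ranking across methods means that differences between methods in a given replicate come from the allocation alone, not from different random draws.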


DISTRIBUTION OF REPLICATIONS (Total workforce)

[Figure: one panel per allocation method: Neyman, ISAE, AOSU3, AOSU9, UNIFORM, PROP.]


OVERALL POPULATION


REPLICATION BOX PLOT (Total workforce)


STATISTICS

• Bias = $N\mu - N\mu_r$

• Total Error (TE) = $|\mathrm{Bias}| + N\sigma_r$

• Relative Total Error (RTE) = $\mathrm{TE} / (N\mu_r)$

• Range = $\max(N\mu_r) - \min(N\mu_r)$, taken over the replicates

• Where:
– $\mu$: Population mean
– $\mu_r$: Replication mean
– $\sigma_r$: Replication STD
– $N$: # Enterprises
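
As a hedged illustration, the overall statistics can be computed from a list of replicate estimates of the total (N times the replicate mean). The function name and toy numbers are assumptions, and the population form of the standard deviation (`pstdev`) is used because the slides do not say which form was applied.

```python
from statistics import mean, pstdev

def overall_stats(estimates, true_total):
    """estimates: one estimated total per replicate; true_total: population total."""
    centre = mean(estimates)                 # N times the replication mean
    bias = true_total - centre
    std = pstdev(estimates)                  # N times the replication STD
    te = abs(bias) + std                     # Total Error
    return {
        "bias": bias,
        "std": std,
        "te": te,
        "rte": te / centre,                  # Relative Total Error
        "range": max(estimates) - min(estimates),
    }

# Toy replicate estimates (made-up numbers).
stats = overall_stats([98, 102, 100, 104, 96], true_total=101)
print(stats)
```

The same arithmetic can be checked against the ISAE row of the overall statistics table: |BIAS| = 283 and STD = 22158 give TE = 22441, and dividing by a total workforce of roughly 3.34 million gives an RTE of about 0.0067.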


UNIT ALLOCATION: Statistics

Method          |BIAS|      STD       TE     RANGE
isae               283    22158    22441    124361
neyman             135    21337    21472    114837
aosu1              520    21648    22168    128626
aosu3              922    22253    23176    126329
aosu9              345    23568    23914    129781
uniform            141    60956    61096    326455
apsu3             6370   123260   129630    648787
proportional     11093   177073   188166   1017149


REPLICATION BOUNDED BOX PLOT (Total workforce)


BOUNDED UNIT ALLOCATION: Statistics

Bounds:

• Maximum 50% allocation per stratum

• Minimum 3 units per stratum

Method                  |BIAS|      STD       TE    RANGE
aosu3                     1513    46444    47957   256410
aosu9                     2724    46362    49086   269084
aosu24 (i.e. uniform)      803    59688    60491   362685
apsu3                     6462   123244   129706   644813
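
The two bounds on this slide could be applied to an unconstrained allocation roughly as follows. This is a sketch under assumptions: "50% allocation" is read as a sampling fraction of at most half the stratum population, the minimum takes precedence in very small strata, and the overall sample size is not rebalanced after clipping. The function name is hypothetical.

```python
import math

def bound_allocation(n_h, N_h, min_units=3, max_fraction=0.5):
    """Clip a stratum allocation to [min_units, max_fraction * N_h], never above N_h."""
    lower = min(min_units, N_h)                  # cannot sample more firms than exist
    upper = math.floor(max_fraction * N_h)
    upper = max(upper, lower)                    # assumption: the minimum wins in tiny strata
    return max(lower, min(round(n_h), upper))

print(bound_allocation(0, 40))      # uncovered stratum is raised to the minimum
print(bound_allocation(35.2, 40))   # over-allocated stratum is capped at half of N_h
print(bound_allocation(10.4, 40))   # allocation inside the bounds is only rounded
print(bound_allocation(5, 2))       # tiny stratum: capped at its population
```

Because clipping changes the stratum sizes independently, a full implementation would need a redistribution step to restore the target total; that step is left out here.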


DOMAIN ANALYSIS


STRATA COVERAGE

• AOSU, UNIF, PROP: 0 strata with 0% allocation

• NEYMAN: 12 strata with 0% allocation

• ISAE: 7 strata with 0% allocation


STRATA STATISTICS

• $rCV_s = \sigma_{rs} / \mu_{rs}$

• $\mathrm{Bias}_s = \mu_s - \mu_{rs}$

• Total Error: $TE_s = |\mathrm{Bias}_s| + \sigma_{rs}$

• Relative Total Error: $RTE_s = TE_s / \mu_{rs} = (|\mathrm{Bias}_s| / \mu_{rs}) + rCV_s$

Where:
– $\mu_s$: Strata population mean
– $\mu_{rs}$: Strata replication mean
– $\sigma_{rs}$: Strata replication STD
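
A small sketch of the stratum-level statistics and of the worst-case (maximum over strata) summaries reported for the domain analysis. The helper names, the dictionary layout and the toy replicate means are assumptions; the population form of the standard deviation is used, as the slides do not specify one.

```python
from statistics import mean, pstdev

def stratum_stats(replicate_means, population_mean):
    """Relative bias, rCV and RTE of one stratum from its replicate means."""
    mu_rs = mean(replicate_means)            # strata replication mean
    sigma_rs = pstdev(replicate_means)       # strata replication STD
    bias_s = population_mean - mu_rs
    return {
        "rel_bias": abs(bias_s) / mu_rs,
        "rCV": sigma_rs / mu_rs,
        "RTE": (abs(bias_s) + sigma_rs) / mu_rs,
    }

def worst_case(stats_by_stratum):
    """Maximum of each statistic over all strata."""
    keys = ("rel_bias", "rCV", "RTE")
    return {k: max(s[k] for s in stats_by_stratum.values()) for k in keys}

# Toy data (made-up): stratum B is estimated without error in every replicate.
replicates = {"A": [18, 22, 20, 24, 16], "B": [5, 5, 5, 5, 5]}
population = {"A": 21, "B": 5}
per_stratum = {h: stratum_stats(replicates[h], population[h]) for h in replicates}
summary = worst_case(per_stratum)
print(summary)
```

Note that the three maxima may come from different strata, so the maximum RTE need not equal the sum of the maximum relative bias and the maximum rCV.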


STRATA BOX PLOT: |Bias| by strata (|Biass|)


STRATA BOX PLOT: CV of replication means by strata (rCVs)


STRATA BOX PLOT: Relative Total Error by strata (RTEs)


UNIT ALLOCATION TO STRATA: Statistics

Method          RTE      Max(|BIASs|/μrs)   Max(rCVs)   Max(RTEs)
isae            0.0067   0.0315             0.5664      0.5979
neyman          0.0064   0.0315             0.5664      0.5979
aosu1           0.0066   0.0244             0.4250      0.4491
aosu3           0.0069   0.0202             0.2775      0.2778
aosu9           0.0072   0.0141             0.1549      0.1624
uniform         0.0183   0.0226             0.4042      0.4152
apsu3           0.0388   0.0582             1.0052      1.0094
proportional    0.0563   0.1033             1.6645      1.6713


CONCLUDING REMARKS AND OPEN QUESTIONS

Strata allocation: the best proposals seem to be:

• Overall population: Neyman

• Domain analysis: approaches based on Neyman combined with strata representativeness constraints
– The AOSU(n1) family
– ISAE

These approaches strike a balance between theory and practical constraints


THANK YOU FOR YOUR ATTENTION!