textile fibre sampling

Length-Biased Sampling: A Review of Applications. Termeh Shafie, Department of Statistics, Umeå University, [email protected]

Upload: nirmala-last

Post on 14-Jun-2015


TRANSCRIPT

Page 1: Textile Fibre Sampling

Length-Biased Sampling: A Review of Applications

Termeh Shafie
Department of Statistics
Umeå University
[email protected]

Page 2: Textile Fibre Sampling

Outline

1. Length-Biased Sampling & the Estimation Problem

2. Applications & Suggested Solutions

3. Simulation under Misspecified Sampling Inclusion Probabilities

Page 3: Textile Fibre Sampling

Length-Biased Sampling

The probability of sample inclusion of a population unit is related to the value of the variable measured.

Cox (1969): Textile fibre sampling. A simple illustration of the problem when estimating the population mean.

Page 4: Textile Fibre Sampling

The Estimation Problem

Assume there is a population with elements $x_1, \ldots, x_N$.

The mean of the population is

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Page 5: Textile Fibre Sampling

The Estimation Problem

Suppose observations $X_1, \ldots, X_n$ form a sample with sample mean

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{N} D_i x_i$$

where $D_i = 1$ if individual $i$ is sampled and $D_i = 0$ otherwise.

Page 6: Textile Fibre Sampling

The Estimation Problem

The expected value of the sample mean is

$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{N} E(D_i)\, x_i = \frac{1}{n}\sum_{i=1}^{N} \pi_i x_i$$

where $\pi_i = P(D_i = 1)$ are the inclusion probabilities of the population units.

Page 7: Textile Fibre Sampling

The Estimation Problem

Using simple random sampling,

$$\pi_i = P(D_i = 1) = n/N$$

and thus

$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{N} \pi_i x_i = \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, x_i = \mu$$

Page 8: Textile Fibre Sampling

The Estimation Problem

However, in general $\pi_i$ is unknown and need not equal $n/N$, and thus

$$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{N} \pi_i x_i \neq \mu$$

The sample mean becomes a biased estimator of the population mean.
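As a rough numerical illustration of the formula $E(\bar{X}) = \frac{1}{n}\sum_i \pi_i x_i$, the sketch below evaluates it for a small made-up population under two designs. The population values, the helper name `expected_sample_mean`, and the choice of inclusion probabilities proportional to $x_i$ are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def expected_sample_mean(x, pi, n):
    """E(X-bar) = (1/n) * sum_i pi_i * x_i for inclusion probabilities pi."""
    x = np.asarray(x, dtype=float)
    pi = np.asarray(pi, dtype=float)
    return float(np.sum(pi * x) / n)

# Hypothetical population of N = 5 fibre lengths (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
N, n = len(x), 2
mu = x.mean()

# Simple random sampling: pi_i = n / N for every unit.
pi_srs = np.full(N, n / N)

# Assumed length-biased design: pi_i proportional to x_i, scaled to sum to n.
pi_lb = n * x / x.sum()

e_srs = expected_sample_mean(x, pi_srs, n)
e_lb = expected_sample_mean(x, pi_lb, n)
print(mu, e_srs, e_lb)
```

Under simple random sampling the expectation matches the population mean, while the length-proportional design pulls it toward the longer units.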

Page 9: Textile Fibre Sampling

Cox (1969)

Derived the length-biased or weighted pdf and looked at the estimation of the population mean from a length-biased sample.

Assume $X_1, \ldots, X_n$ is a random sample with pdf

$$g(x) = \frac{x f(x)}{\mu}, \quad x \ge 0$$

Page 10: Textile Fibre Sampling

Cox (1969)

It can be shown that

$$E_g\left(\frac{1}{X}\right) = \frac{1}{\mu}$$

An unbiased estimator of $1/\mu$ is

$$\frac{1}{\tilde{\mu}} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{X_i}$$
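The identity for $E_g(1/X)$ follows in one line from the weighted pdf $g(x) = x f(x)/\mu$. A short derivation, assuming $f$ is a density on the positive half-line with finite mean $\mu$:

$$E_g\left(\frac{1}{X}\right) = \int_0^\infty \frac{1}{x}\,\frac{x f(x)}{\mu}\,dx = \frac{1}{\mu}\int_0^\infty f(x)\,dx = \frac{1}{\mu}$$

Averaging $1/X_i$ over the length-biased sample is therefore unbiased for $1/\mu$, although its reciprocal is not in general unbiased for $\mu$ itself.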

Page 11: Textile Fibre Sampling

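Cox (1969)

with variance

$$\mathrm{Var}_g\left(\frac{1}{\tilde{\mu}}\right) = \frac{\mu\, E_f(1/X) - 1}{n \mu^2}$$

where $E_f$ denotes expectation under the original pdf $f$.

Note:

$$\tilde{\mu} \sim N\left(\mu,\ \frac{\mu^2\left(\mu\, E_f(1/X) - 1\right)}{n}\right)$$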

Page 12: Textile Fibre Sampling

Cox (1969)

Relation between the moments of g(x) and f(x):

$$E_g(X) = \mu\left(1 + \frac{\sigma^2}{\mu^2}\right)$$

The relative bias is thus

$$rb = \frac{\sigma^2}{\mu^2}$$
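The relative bias of the sample mean and the behaviour of Cox's estimator can be checked by simulation. The sketch below is a minimal example under an assumed model that is not in the slides: the original density $f$ is Gamma with shape `k` and scale `theta`, in which case the length-biased density $g(x) = x f(x)/\mu$ is Gamma with shape `k + 1` and the same scale.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed original density f: Gamma(shape=k, scale=theta), so mu = k * theta
# and sigma^2 = k * theta^2. Its length-biased version g(x) = x f(x) / mu is
# Gamma(shape=k + 1, scale=theta).
k, theta = 4.0, 0.5
mu = k * theta
rel_bias_theory = (k * theta**2) / mu**2  # sigma^2 / mu^2 = 1 / k

n = 200_000
sample = rng.gamma(shape=k + 1.0, scale=theta, size=n)  # draws from g

naive = sample.mean()              # targets mu * (1 + sigma^2 / mu^2)
cox = 1.0 / np.mean(1.0 / sample)  # reciprocal of the mean of 1 / X_i

rel_bias_naive = (naive - mu) / mu
rel_bias_cox = (cox - mu) / mu
print(rel_bias_theory, rel_bias_naive, rel_bias_cox)
```

With these settings the naive relative bias should sit near $\sigma^2/\mu^2 = 0.25$, while the relative error of Cox's estimator should be close to zero.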

Page 13: Textile Fibre Sampling
Page 14: Textile Fibre Sampling

2. APPLICATIONS

Technical/Industrial Sampling

Cox (1969): Sampling textile fibres and the estimation of fibre length distribution.

Page 15: Textile Fibre Sampling

Marketing

Shopping Center Sampling & Mall Intercept Surveys:

- Keillor et al. (2001): Global consumer tendencies.
- Sudman (1980): Quota sampling techniques and weighting procedures to correct for frequency bias.
- Nowell et al. (1991): Correction techniques for length-biased sampling in two situations: when total length of stay is known or estimated, and when only the recurrence time is known.

Page 16: Textile Fibre Sampling

Epidemiology

Sampling procedures for the collection of positive-valued or lifetime data are length-biased (Simon 1980, Zelen et al. 1969).

Wang (1996): Statistical analysis of length-biased data under the proportional hazards model. A pseudo-likelihood approach for estimating the parameters from length-biased data is presented.

Page 17: Textile Fibre Sampling

Resource Economics

On-site sampling:

- Deriving demand functions for a recreational site (Bockstael 1990, Ovaskainen et al. 2001)
- Charting trip-taking behavior (Bowker 1998)
- Travel cost models of recreational demand (Moons et al. 2001)
- Contingent valuation surveys for the elicitation of non-market goods (Cameron et al. 1987, Nowell et al. 1988)

Page 18: Textile Fibre Sampling

Resource Economics

Shaw (1988): Three problems with on-site samples' regression:

1. Non-negative integers

2. Truncation

3. Endogenous stratification

Page 19: Textile Fibre Sampling

Resource Economics

Shaw (1988): Recreational demand modeling under two assumptions about the dependent variable's distribution:

1. Normal distribution

2. Poisson distribution:

$$P(Y = y \mid Y > 0) = \frac{\exp(-\lambda)\,\lambda^{y-1}}{(y-1)!}, \quad y = 1, 2, \ldots$$
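The on-site Poisson pmf above can be read as a Poisson pmf reweighted by $y / E(Y)$, which is what endogenous stratification amounts to if on-site inclusion is proportional to the number of visits. The sketch below checks this numerically. The rate `lam = 2.5` and the function names are hypothetical choices for illustration.

```python
import math

def poisson_pmf(y, lam):
    """Ordinary Poisson pmf P(Y = y)."""
    return math.exp(-lam) * lam**y / math.factorial(y)

def onsite_poisson_pmf(y, lam):
    """On-site pmf for y = 1, 2, ...: exp(-lam) * lam**(y - 1) / (y - 1)!."""
    return math.exp(-lam) * lam**(y - 1) / math.factorial(y - 1)

lam = 2.5  # hypothetical rate
ys = range(1, 60)

# Reweighting the Poisson pmf by y / E(Y) should reproduce the on-site pmf.
max_gap = max(
    abs(y * poisson_pmf(y, lam) / lam - onsite_poisson_pmf(y, lam)) for y in ys
)

total = sum(onsite_poisson_pmf(y, lam) for y in ys)            # close to 1
mean_onsite = sum(y * onsite_poisson_pmf(y, lam) for y in ys)  # close to lam + 1
print(max_gap, total, mean_onsite)
```

The on-site mean comes out as $\lambda + 1$, which is why subtracting one from the on-site sample mean recovers $\lambda$ under this model.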

Page 20: Textile Fibre Sampling

Resource Economics

Englin & Shonkwiler (1995):

- The Negative Binomial Model

The truncated, stratified model is

$$P(Y = y \mid Y > 0) = \frac{y\,\Gamma(y + 1/\alpha)\,\alpha^{y}\,\lambda^{y-1}\,(1 + \alpha\lambda)^{-(y + 1/\alpha)}}{\Gamma(y + 1)\,\Gamma(1/\alpha)}, \quad y = 1, 2, \ldots$$
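As a sanity check on the truncated, stratified negative binomial pmf, the sketch below compares it with an ordinary negative binomial pmf reweighted by $y / \lambda$. It assumes the usual parameterisation with mean $\lambda$ and dispersion $\alpha$. The function names and the values `lam = 3.0`, `alpha = 0.5` are hypothetical.

```python
import math

def negbin_pmf(y, lam, alpha):
    """Negative binomial pmf with mean lam and dispersion alpha."""
    inv = 1.0 / alpha
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + y * math.log(alpha * lam)
             - (y + inv) * math.log1p(alpha * lam))
    return math.exp(log_p)

def onsite_negbin_pmf(y, lam, alpha):
    """Truncated, endogenously stratified pmf for y = 1, 2, ..."""
    inv = 1.0 / alpha
    log_p = (math.log(y) + math.lgamma(y + inv) - math.lgamma(y + 1)
             - math.lgamma(inv) + y * math.log(alpha)
             + (y - 1) * math.log(lam)
             - (y + inv) * math.log1p(alpha * lam))
    return math.exp(log_p)

lam, alpha = 3.0, 0.5  # hypothetical mean and dispersion
ys = range(1, 400)

total = sum(onsite_negbin_pmf(y, lam, alpha) for y in ys)
max_gap = max(
    abs(y * negbin_pmf(y, lam, alpha) / lam - onsite_negbin_pmf(y, lam, alpha))
    for y in ys
)
mean_onsite = sum(y * onsite_negbin_pmf(y, lam, alpha) for y in ys)
print(total, max_gap, mean_onsite)
```

Working on the log scale with `math.lgamma` avoids overflow in the gamma functions for large counts. Under this parameterisation the on-site mean is $1 + \lambda + \alpha\lambda$.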

Page 21: Textile Fibre Sampling

Resource Economics

Nunes (2003): Binary Choice Models

The count variable is described by a Poisson distribution with an unobservable heterogeneity term correlated with the error term in a probit binary choice model.

Page 22: Textile Fibre Sampling

3. Misspecification of Sampling Probabilities: A Simulation

Aim:

To see whether the effect of misspecified sampling probabilities is large.

What happens if time per visit is correlated with frequency of visits when estimating the expected number of visits?

Page 23: Textile Fibre Sampling

Misspecification of Sampling Probabilities: A Simulation

Time is modeled as a function of frequency of visits when estimating the population mean.

$X_i \sim$ Poisson

$t_i \sim$ Exponential, with a parameter depending on $X_i$

$T_i \sim$ Gamma, with parameters depending on $X_i$

The inclusion probabilities are proportional to the time spent at the site:

$$\pi_i = \frac{T_i}{\sum_j T_j}$$

Page 24: Textile Fibre Sampling

Misspecification of Sampling Probabilities: A Simulation

The three estimators used for the simulation are:

The sample mean:

$$\tilde{\mu}_{\bar{x}} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

Shaw's estimator:

$$\tilde{\mu}_{Shaw} = \frac{1}{n}\sum_{i=1}^{n} X_i - 1$$

Cox's estimator:

$$\frac{1}{\tilde{\mu}_{Cox}} = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{X_i}$$
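The three estimators are simple to code, and a small Monte Carlo run shows how they behave when inclusion is proportional to time at the site. The sketch below is one possible setup, not the design used in the talk: the Poisson rate, the population and sample sizes, the exponent `gamma_exp`, and the choice of total time as Gamma with shape $X_i$ and scale $X_i^{\texttt{gamma\_exp}}$ are all assumptions made here for illustration.

```python
import numpy as np

def sample_mean(x):
    return float(np.mean(x))

def shaw_estimator(x):
    """Sample mean minus one (on-site Poisson correction)."""
    return float(np.mean(x) - 1.0)

def cox_estimator(x):
    """Reciprocal of the mean of 1 / X_i."""
    return float(1.0 / np.mean(1.0 / np.asarray(x, dtype=float)))

def simulate_bias(gamma_exp, lam=2.0, N=5000, n=50, reps=200, seed=0):
    """Average bias of each estimator under one hypothetical design.

    Assumptions (not from the slides): X_i ~ Poisson(lam); total time
    T_i ~ Gamma(shape=X_i, scale=X_i**gamma_exp), with T_i = 0 when X_i = 0;
    units are drawn without replacement with probability proportional to T_i.
    """
    rng = np.random.default_rng(seed)
    X = rng.poisson(lam, size=N)
    visited = X > 0
    T = np.zeros(N)
    T[visited] = rng.gamma(shape=X[visited],
                           scale=X[visited].astype(float) ** gamma_exp)
    p = T / T.sum()
    mu = X.mean()
    est = np.zeros((reps, 3))
    for r in range(reps):
        idx = rng.choice(N, size=n, replace=False, p=p)
        xs = X[idx]  # every sampled unit has X_i >= 1 because T_i > 0
        est[r] = sample_mean(xs), shaw_estimator(xs), cox_estimator(xs)
    return est.mean(axis=0) - mu

bias_mean, bias_shaw, bias_cox = simulate_bias(gamma_exp=0.0)
print(bias_mean, bias_shaw, bias_cox)
```

With `gamma_exp=0.0` the expected time at the site is proportional to the number of visits, so Shaw's correction should roughly remove the bias of the sample mean. Other exponents make time per visit depend on visit frequency and can be tried by changing the argument.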

Page 25: Textile Fibre Sampling

Simulation Results

                      60,5                           10,1
                      2        0        2            2        0        2
Sample mean           0.689    0.964    1.058        0.780    0.983    1.118
                     (0.481)  (0.939)  (1.131)      (0.656)  (1.016)  (1.301)
Shaw's estimator     -0.311   -0.036    0.058       -0.220   -0.017    0.118
                     (0.103)  (0.011)  (0.015)      (0.096)  (0.050)  (0.065)
Cox's estimator       0.398    0.567    0.642        0.155    0.036    0.176
                     (0.162)  (0.327)  (0.419)      (0.100)  (0.081)  (0.112)

Page 26: Textile Fibre Sampling

Summary

If the probabilities of sample inclusion of population units are related to the values of the variable measured, the parameter estimates will be biased and inconsistent.

Thus correctly specified sampling inclusion mechanisms should not be neglected!

Page 27: Textile Fibre Sampling

References

Bockstael, N.E., Strand, I.E., McConnell, K.E., Arsanjani, F., 1990. Sample Selection Bias in the Estimation of Recreational Demand Functions: An Application to Sportfishing. Land Economics 66(1), 40-49.
Bowker, J.M., Leeworthy, V.R., 1998. Accounting for Ethnicity in Recreation Demand: A Flexible Count Data Approach. Journal of Leisure Research 30(1), 64-78.
Bush, A.J., Hair, J.F., 1985. An Assessment of the Mall Intercept as a Data Collection Method. Journal of Marketing Research 22, 158-67.
Cameron, T.A., James, M.D., 1987. Efficient Estimation Methods for "Close-Ended" Contingent Valuation Surveys. The Review of Economics and Statistics 69, 269-276.
Cox, D.R., 1969. "Some Sampling Problems in Technology" in New Developments in Survey Sampling, N. L. Johnson and H. Smith, eds. New York: Wiley Interscience.
Englin, J., Shonkwiler, J.S., 1995. Estimating Social Welfare Using Count Data Models: An Application to Long-Run Recreation Demand under Conditions of Endogenous Stratification and Truncation. Review of Economics and Statistics 77, 104-112.
Keillor, B.D., D'Amico, M., Horton, V., 2001. Global Consumer Tendencies. Psychology and Marketing 18, 1-19.
Laitila, T., 1998. Estimation of Combined Site-Choice and Trip-Frequency Models of Recreational Demand using Choice-based and On-Site Samples. Economics Letters 64, 17-23.
Moons, E., Loomis, J., Proost, S., Eggermont, K., Hermy, M., 2001. Travel Cost and Time Measurement in Travel Cost Models. Faculty of Economics and Applied Economic Sciences, Working Paper series, no 2001-22.
Nakanishi, M., 1978. Frequency Bias in Shopper Surveys, in Proceedings of the American Marketing Association Educators' Conference. Chicago: American Marketing Association, 67-70.
Nowell, C., Evans, M.A., McDonald, L., 1988. Length-Biased Sampling in Contingent Valuation Studies. Land Economics 64 (November), 367-71.
Nowell, C., Stanley, L.R., 1991. Length-Biased Sampling in Mall Intercept Surveys. Journal of Marketing Research 28, 475-479.
Nunes, L.C., 2003. Estimating Binary Choice Models With On-Site Samples. Faculdade de Economia, Universidade Nova de Lisboa.
Ovaskainen, V., Mikkola, J., Pouta, E., 2001. Estimating Recreation Demand with On-Site Data: An Application of Truncated and Endogenously Stratified Count Data Models. Journal of Forest Economics 7:2, 125-144.
Santos Silva, J.M.C., 1997. Unobservables in Count Data Models for On-Site Samples. Economics Letters 54, 217-220.
Satten, G.A., Kong, F., Wright, D.J., Glynn, S.A., Schreiber, G.B., 2004. How Special is a 'Special' Interval: Modeling Departure from Length-Biased Sampling in Renewal Processes. Biostatistics 5(1), 145-151.
Shaw, D., 1988. On-Site Samples' Regression: Problems of Non-negative Integers, Truncation, and Endogenous Stratification. Journal of Econometrics 37, 211-223.
Simon, R., 1980. Length-Biased Sampling in Etiological Studies. Am. J. Epidem. 111, 444-452.
Sudman, S., 1980. Improving the Quality of Shopping Center Sampling. Journal of Marketing Research 17, 423-431.
Wang, M-C., 1996. Hazards Regression Analysis for Length-Biased Data. Biometrika 2, 343-354.
Zelen, M., Feinleib, M., 1969. On the Theory of Screening for Chronic Diseases. Biometrika 56, 601-614.

Page 28: Textile Fibre Sampling

And finally she stops…