A Three-Stage Optional Randomized Response Model

Journal of Statistical Theory and Practice. Publisher: Taylor & Francis. Published online: 10 Aug 2012.
To cite this article: Samridhi Mehta, B. K. Dass, Javid Shabbir & Sat Gupta (2012) A Three-Stage Optional Randomized Response Model, Journal of Statistical Theory and Practice, 6:3, 417-427, DOI: 10.1080/15598608.2012.695558
To link to this article: http://dx.doi.org/10.1080/15598608.2012.695558


Journal of Statistical Theory and Practice, 6:417–427, 2012
Copyright © Grace Scientific Publishing, LLC
ISSN: 1559-8608 print / 1559-8616 online
DOI: 10.1080/15598608.2012.695558

A Three-Stage Optional Randomized Response Model

SAMRIDHI MEHTA,1 B. K. DASS,1 JAVID SHABBIR,2 AND SAT GUPTA3

1 Department of Mathematics, University of Delhi, Delhi, India
2 Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan
3 Department of Mathematics and Statistics, University of North Carolina at Greensboro, Greensboro, North Carolina, USA

In Gupta et al. (2010; 2011), it was observed that introduction of a truth element in an optional randomized response model can improve the efficiency of the mean estimator. However, a large value of the truth parameter (T) may be needed if the underlying question is highly sensitive. This can jeopardize respondent cooperation. In what we call a “three-stage optional randomized response model,” a known proportion (T) of the respondents is asked to tell the truth, another known proportion (F) of the respondents is asked to provide a scrambled response, and the remaining respondents are instructed to provide a response following the usual optional randomized response strategy, where a respondent provides a truthful response (or a scrambled response) depending on whether he/she considers the question nonsensitive (or sensitive). This is done anonymously based on color-coded cards that the researcher cannot see. In this article we show that a three-stage model may turn out to be more efficient than the corresponding two-stage model, and with a smaller value of T. Greater respondent cooperation will be an added advantage of the three-stage model.

AMS Subject Classification: 62D05.

Keywords: Quantitative sensitive variable; Randomized response; Split sample; Three-stage model.

1. Introduction

Gupta et al. (2002) introduced an optional randomized response technique (RRT) model in which respondents decide for themselves whether to tell the truth or to scramble their true response, depending on whether they perceive the question being asked as nonsensitive or sensitive. The proportion of respondents who consider the question sensitive is called the sensitivity level of the question and is usually denoted by W. Optional RRT models exist in both the binary response and the quantitative response framework. Here we focus on quantitative response RRT models. Several authors, including Eichhorn and Hayre (1983), Saha (2008), and Chaudhuri (2012), have also worked on quantitative response RRT models. The Gupta et al. (2002) model used multiplicative scrambling, which required some approximation at the parameter estimation stage. The Gupta et al. (2006) one-stage optional RRT model and the Gupta et al. (2010) two-stage optional RRT model were both based on additive scrambling, which eliminated the need for any approximation by using a split-sample approach. The role of multiplicative scrambling is further questioned in Gupta et al. (2012), where it was observed that additive scrambling performs better than the linear combination scrambling of Huang (2010). With this background in mind, we use additive scrambling in the current work. In the two-stage Gupta et al. (2010) model, a known proportion of respondents (T) is asked to provide a true response to the sensitive question while maintaining anonymity, and the remaining proportion of respondents (1 − T) respond using additive optional scrambling. It was shown in that paper that a two-stage optional RRT model will always perform better than the corresponding one-stage optional RRT model (where T = 0), except that a larger value of T may be needed if the question is highly sensitive. Gupta et al. (2011) presented a method for selecting an optimum value of T.

Received April 11, 2011; accepted January 24, 2012.
Address correspondence to: Samridhi Mehta, Department of Mathematics, University of Delhi, Delhi, 110007, India. Email: [email protected]

Obviously, the Gupta et al. (2010) two-stage model has the potential of respondent non-cooperation if the value of T is too high, even if the element of truth is introduced in the model maintaining complete anonymity. This is our motivation for introducing a three-stage optional RRT model. In this model, a known proportion (T) of the respondents is asked to tell the truth, another known proportion (F) of the respondents is asked to provide a scrambled response, and the remaining respondents are asked to provide a response following the usual optional strategy. This is done anonymously based on color-coded cards that the researcher will not see. Clearly, respondent cooperation is not jeopardized by introducing F. The main focus of this paper is to show that in many cases a three-stage model can achieve the same or better efficiency as compared to a two-stage model but with a smaller value of T. This is done in two steps. First, in section 2, we introduce a different type of two-stage model where we introduce F but not T. We show that in many cases, this model can do better than the one-stage model, offering a more respondent-friendly alternative to the Gupta et al. (2010) model. Then in section 3 we introduce the three-stage model, for which the efficiency can be improved even more by using a smaller value of T in conjunction with F.

2. A Different Two-Stage Model

In this section, we propose a model where a predetermined proportion (F) of respondents is instructed to scramble their response to the sensitive question and the remaining respondents use the optional RRT strategy. This model clearly improves respondent trust.

The sample of size n is split into two subsamples of sizes n1 and n2 (n1 + n2 = n), with each subsample using a different scrambling device. In each subsample, a fixed predetermined proportion of respondents (F) is instructed to scramble their response, and the remaining proportion of respondents (1 − F) have the option of scrambling their response additively if they consider the question to be sensitive, or else they can report their true response X if they consider the question nonsensitive. For F = 0, the model is the same as the Gupta et al. (2006) model.

Let $X$ be the true response with unknown mean $\mu_X$ and unknown variance $\sigma_X^2$. Let $S_i$ ($i = 1, 2$) be the scrambling variable associated with the $i$th subsample. Let the known mean of $S_i$ be $\theta_i$ and the known variance of $S_i$ be $\sigma_{S_i}^2$. Let $W$ denote the sensitivity level of the underlying question. Assume that $X$, $S_1$, and $S_2$ are mutually independent. Let $Z_i$ ($i = 1, 2$) be the reported response in the $i$th subsample. Thus,

$$
Z_i =
\begin{cases}
X & \text{with probability } (1 - F)(1 - W) \\
X + S_i & \text{with probability } F + (1 - F)W
\end{cases}
\tag{1}
$$

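From Eq. (1), for $i = 1, 2$,

$$
E(Z_i) = \mu_X + \theta_i\,[F + (1 - F)W] \tag{2}
$$

On solving the preceding equations for $\mu_X$ and $W$, we get

$$
\mu_X = \frac{\theta_2 E(Z_1) - \theta_1 E(Z_2)}{\theta_2 - \theta_1}, \quad \theta_1 \neq \theta_2 \tag{3}
$$

$$
W = \left(\frac{1}{1 - F}\right)\left(\frac{E(Z_2) - E(Z_1)}{\theta_2 - \theta_1} - F\right), \quad \theta_1 \neq \theta_2,\ F \neq 1 \tag{4}
$$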
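The step from Eq. (2) to Eqs. (3) and (4) is a two-equation elimination. The worked derivation below spells it out; the shorthand $p = F + (1 - F)W$ is introduced here only for convenience and is not notation used in the paper.

```latex
\begin{align*}
E(Z_1) &= \mu_X + \theta_1 p, \qquad E(Z_2) = \mu_X + \theta_2 p, \qquad p = F + (1 - F)W \\
\theta_2 E(Z_1) - \theta_1 E(Z_2) &= (\theta_2 - \theta_1)\,\mu_X
  \;\Longrightarrow\; \mu_X = \frac{\theta_2 E(Z_1) - \theta_1 E(Z_2)}{\theta_2 - \theta_1} \\
E(Z_2) - E(Z_1) &= (\theta_2 - \theta_1)\,p
  \;\Longrightarrow\; p = \frac{E(Z_2) - E(Z_1)}{\theta_2 - \theta_1} \\
W &= \frac{p - F}{1 - F}
  = \left(\frac{1}{1 - F}\right)\left(\frac{E(Z_2) - E(Z_1)}{\theta_2 - \theta_1} - F\right)
\end{align*}
```

Both divisions are the reason for the side conditions $\theta_1 \neq \theta_2$ and $F \neq 1$.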

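Unbiased estimators $\hat{\mu}_X$ and $\hat{W}$ for $\mu_X$ and $W$, respectively, can be obtained by estimating $E(Z_i)$ by the subsample mean $\bar{Z}_i$ ($i = 1, 2$). It is easy to verify that the estimators

$$
\hat{\mu}_X = \frac{\theta_2 \bar{Z}_1 - \theta_1 \bar{Z}_2}{\theta_2 - \theta_1}, \quad \theta_1 \neq \theta_2 \tag{5}
$$

$$
\hat{W} = \left(\frac{1}{1 - F}\right)\left(\frac{\bar{Z}_2 - \bar{Z}_1}{\theta_2 - \theta_1} - F\right), \quad \theta_1 \neq \theta_2,\ F \neq 1 \tag{6}
$$

are unbiased. Also, for $\theta_1 \neq \theta_2$, $\hat{\mu}_X \sim AN(\mu_X, V_1)$, where

$$
V_1 = \frac{1}{(\theta_2 - \theta_1)^2}\left(\theta_2^2\,\frac{\sigma_{Z_1}^2}{n_1} + \theta_1^2\,\frac{\sigma_{Z_2}^2}{n_2}\right), \quad \theta_1 \neq \theta_2, \tag{7}
$$

$$
\sigma_{Z_1}^2 = \sigma_X^2 + \sigma_{S_1}^2\,[F + (1 - F)W] + \theta_1^2\,[F + (1 - F)W]\,\{1 - [F + (1 - F)W]\} \tag{8}
$$

and

$$
\sigma_{Z_2}^2 = \sigma_X^2 + \sigma_{S_2}^2\,[F + (1 - F)W] + \theta_2^2\,[F + (1 - F)W]\,\{1 - [F + (1 - F)W]\} \tag{9}
$$

Similarly, for $\theta_1 \neq \theta_2$, $F \neq 1$, $\hat{W} \sim AN(W, V_2)$, where

$$
V_2 = \frac{1}{(\theta_2 - \theta_1)^2 (1 - F)^2}\left(\frac{\sigma_{Z_1}^2}{n_1} + \frac{\sigma_{Z_2}^2}{n_2}\right), \quad \theta_1 \neq \theta_2,\ F \neq 1 \tag{10}
$$

and $\sigma_{Z_1}^2$ and $\sigma_{Z_2}^2$ are as given in Eqs. (8) and (9), respectively.

Let $\mathrm{Var}(\hat{\mu}_X)$ and $\mathrm{Var}(\hat{\mu}_X)_F$ respectively denote the variances of the mean estimator for the one-stage Gupta et al. (2006) model and the new two-stage model (involving F).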
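Eqs. (7) to (10) are simple enough to evaluate directly. The following is a minimal Python sketch, not the authors' code: the function names and the `TABLE1` dictionary are hypothetical, and the inputs assume the Poisson setting reported with Table 1 (so each variance equals the corresponding mean).

```python
def var_reported(p, var_x, theta, var_s):
    """Variance of a reported response Z_i (Eqs. 8-9); p is the scrambling probability."""
    return var_x + var_s * p + theta**2 * p * (1.0 - p)


def var_mean_estimator(F, W, n1, n2, var_x, theta1, var_s1, theta2, var_s2):
    """V1 of Eq. (7) for the two-stage model with forced-scrambling proportion F."""
    p = F + (1.0 - F) * W
    vz1 = var_reported(p, var_x, theta1, var_s1)
    vz2 = var_reported(p, var_x, theta2, var_s2)
    return (theta2**2 * vz1 / n1 + theta1**2 * vz2 / n2) / (theta2 - theta1) ** 2


def var_sensitivity_estimator(F, W, n1, n2, var_x, theta1, var_s1, theta2, var_s2):
    """V2 of Eq. (10); requires F != 1 and theta1 != theta2."""
    p = F + (1.0 - F) * W
    vz1 = var_reported(p, var_x, theta1, var_s1)
    vz2 = var_reported(p, var_x, theta2, var_s2)
    return (vz1 / n1 + vz2 / n2) / ((theta2 - theta1) ** 2 * (1.0 - F) ** 2)


# Assumed inputs: X ~ Poisson(4), S1 ~ Poisson(2), S2 ~ Poisson(5), n1 = n2 = 500.
TABLE1 = dict(n1=500, n2=500, var_x=4.0, theta1=2.0, var_s1=2.0, theta2=5.0, var_s2=5.0)

for F in (0.0, 0.1, 0.9):
    row = [round(var_mean_estimator(F, W, **TABLE1), 5) for W in (0.1, 0.5, 0.9)]
    print(F, row)
```

Setting F = 0 recovers the one-stage Gupta et al. (2006) variance, so the same function covers both quantities compared in Theorem 1.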


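Theorem 1.

$$
\mathrm{Var}(\hat{\mu}_X)_F \le \mathrm{Var}(\hat{\mu}_X) \iff F \ge \left(\frac{1}{1 - W}\right)\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W\right] = F^*
$$

Proof.

$$
\begin{aligned}
\mathrm{Var}(\hat{\mu}_X)_F \le \mathrm{Var}(\hat{\mu}_X)
&\iff \frac{\theta_2^2}{n_1}\left\{(\sigma_{S_1}^2 + \theta_1^2)[F + (1 - F)W - W] - \theta_1^2\left[[F + (1 - F)W]^2 - W^2\right]\right\} \\
&\qquad + \frac{\theta_1^2}{n_2}\left\{(\sigma_{S_2}^2 + \theta_2^2)[F + (1 - F)W - W] - \theta_2^2\left[[F + (1 - F)W]^2 - W^2\right]\right\} \le 0 \\
&\iff [F + (1 - F)W - W]\left[n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2 + n\theta_1^2\theta_2^2\{1 - [F + (1 - F)W + W]\}\right] \le 0 \\
&\iff [F + (1 - F)W - W]\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + \{1 - [F + (1 - F)W + W]\}\right] \le 0 \\
&\iff [F(1 - W)]\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - [F(1 - W) + 2W]\right] \le 0
\end{aligned}
$$

As the first term in the preceding expression is nonnegative,

$$
\begin{aligned}
\mathrm{Var}(\hat{\mu}_X)_F \le \mathrm{Var}(\hat{\mu}_X)
&\iff \left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - [F(1 - W) + 2W]\right] \le 0 \\
&\iff \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W \le F(1 - W) \\
&\iff F \ge \left(\frac{1}{1 - W}\right)\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W\right] = F^* \qquad \blacksquare
\end{aligned}
$$

Note that

$$
F^* \le 1 \iff \frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2} \le W
$$

This condition is very likely to hold true when W is large. Recall that this is exactly the case where the Gupta et al. (2010) model has to rely on a larger value of T.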
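As an illustration of Theorem 1, the threshold F* can be computed for the Poisson setting used in the paper's tables. This is a sketch under that assumption; `scrambling_ratio` and `f_star` are hypothetical helper names.

```python
def scrambling_ratio(n1, n2, theta1, var_s1, theta2, var_s2):
    """The fraction (n2*theta2^2*var_s1 + n1*theta1^2*var_s2) / (n*theta1^2*theta2^2)."""
    n = n1 + n2
    return (n2 * theta2**2 * var_s1 + n1 * theta1**2 * var_s2) / (n * theta1**2 * theta2**2)


def f_star(W, ratio):
    """Smallest F for which the two-stage (F) model is at least as efficient (Theorem 1)."""
    return (ratio + 1.0 - 2.0 * W) / (1.0 - W)


# Assumed inputs: S1 ~ Poisson(2), S2 ~ Poisson(5), n1 = n2 = 500.
ratio = scrambling_ratio(500, 500, 2.0, 2.0, 5.0, 5.0)
for W in (0.1, 0.3, 0.5, 0.7, 0.9):
    threshold = f_star(W, ratio)
    print(W, round(threshold, 4), "attainable" if threshold <= 1.0 else "not attainable")
```

A threshold above 1 means no admissible F improves on the one-stage model, and a negative threshold means every F does; by the note above, the attainable case is exactly `ratio <= W`.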

Table 1 provides a numerical comparison of the variances of the mean estimator for the new two-stage model and the Gupta et al. (2006) model for various values of F and W.

It can be observed from Table 1 that for smaller values of W, the value of $V(\hat{\mu}_X)_F$ may initially increase with an increase in F before it starts decreasing. For larger values of W, it begins to decrease right away. Thus, for more sensitive questions, introducing F helps. In view of the observations in Gupta et al. (2010; 2011) and Table 1, an appropriate strategy might be to use the Gupta et al. (2010) model with a small value of T for less sensitive questions and to use this new two-stage model for highly sensitive questions.

Table 1
$V(\hat{\mu}_X)_F$ (rows labeled V_F; in bold in the original) and $V(\hat{\mu}_X)$ (rows labeled V) for various values of W and F, with n = 1,000, n1 = n2 = 500, X ∼ Poisson(4), S1 ∼ Poisson(2) and S2 ∼ Poisson(5)

F     Row    W = 0.1    W = 0.3    W = 0.5    W = 0.7    W = 0.9
0     V_F    0.03133    0.03978    0.04467    0.04600    0.04378
      V      0.03133    0.03978    0.04467    0.04600    0.04378
0.1   V_F    0.03557    0.04189    0.04533    0.04589    0.04357
      V      0.03133    0.03978    0.04467    0.04600    0.04378
0.3   V_F    0.04189    0.04482    0.04600    0.04544    0.04314
      V      0.03133    0.03978    0.04467    0.04600    0.04378
0.5   V_F    0.04533    0.04600    0.04578    0.04467    0.04267
      V      0.03133    0.03978    0.04467    0.04600    0.04378
0.7   V_F    0.04589    0.04544    0.04467    0.04357    0.04216
      V      0.03133    0.03978    0.04467    0.04600    0.04378
0.9   V_F    0.04357    0.04314    0.04267    0.04216    0.04162
      V      0.03133    0.03978    0.04467    0.04600    0.04378

We now compare this new two-stage model with the Gupta et al. (2010) model. Let $\mathrm{Var}(\hat{\mu}_X)_T$ denote the variance of the mean estimator for the two-stage Gupta et al. (2010) model. The following theorem compares the variances of the two two-stage models under the assumption that the scrambling variables of both models have the same mean and the same variance.

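Theorem 2.

$$
\mathrm{Var}(\hat{\mu}_X)_F \le \mathrm{Var}(\hat{\mu}_X)_T \iff F \ge \left(\frac{1}{1 - W}\right)\left[[1 + W(T - 2)] + \frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2}\right] = F^{**}
$$

Proof.

$$
\begin{aligned}
\mathrm{Var}(\hat{\mu}_X)_F \le \mathrm{Var}(\hat{\mu}_X)_T
&\iff F(1 - W) + W(1 - T) + W \ge 1 + \left(\frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2}\right) \\
&\iff F(1 - W) \ge [1 + W(T - 2)] + \left(\frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2}\right) \\
&\iff F \ge \left(\frac{1}{1 - W}\right)[1 + W(T - 2)] + \left(\frac{1}{1 - W}\right)\left(\frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2}\right) = F^{**} \qquad \blacksquare
\end{aligned}
$$

Note that

$$
\begin{aligned}
F^{**} \le 1
&\iff [1 + W(T - 2)] + \frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2} \le 1 - W \\
&\iff \frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2} \le W(1 - T) \\
&\iff T \le 1 - \frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{nW\theta_1^2\theta_2^2}
\end{aligned}
$$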
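As a worked illustration of Theorem 2, the threshold can be evaluated for the Table 2 setting (W = 0.9, n1 = n2 = 500, S1 ∼ Poisson(2), S2 ∼ Poisson(5)), taking each Poisson variance equal to its mean. The arithmetic below is a sketch under those assumed inputs.

```latex
\begin{align*}
\frac{n_1\theta_1^2\sigma_{S_2}^2 + n_2\theta_2^2\sigma_{S_1}^2}{n\theta_1^2\theta_2^2}
  &= \frac{500 \cdot 4 \cdot 5 + 500 \cdot 25 \cdot 2}{1000 \cdot 4 \cdot 25} = 0.35 \\
F^{**} &= \frac{1 + 0.9\,(T - 2) + 0.35}{1 - 0.9} = 9T - 4.5 \\
F^{**} \le 1 &\iff T \le 1 - \frac{0.35}{0.9} \approx 0.611
\end{align*}
```

At T = 0.5 this gives F** = 0, which agrees with Table 2, where the two variances coincide (0.04378) at F = 0 and T = 0.5. For T = 0.7 or T = 0.9 the threshold exceeds 1, so no admissible F beats the Gupta et al. (2010) model, again as in Table 2.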

Table 2 provides a numerical comparison of the values of $V(\hat{\mu}_X)$ for the new two-stage optional RRT model (involving F) and the Gupta et al. (2010) model for various values of F and T for W = 0.9.

In Gupta et al. (2011), it was observed that for questions with a small sensitivity level, any value of T would make the Gupta et al. (2010) model perform better than the Gupta et al. (2006) model, but for questions with a larger sensitivity level, a larger value of T was needed, which is not recommended as it affects respondent cooperation. So instead of using a large value of T, one should rather opt for the new two-stage model. For example, one can note from Table 2 (for W = 0.9) that $V(\hat{\mu}_X)$ is 0.04378 when F = 0 and T = 0.5. However, a smaller value of $V(\hat{\mu}_X)$ can be obtained with the new model where T = 0.

3. A Three-Stage Optional Randomized Response Model

In the previous section, it was observed that under some reasonably mild conditions the new two-stage model (involving F) can perform as well as or even better than the Gupta et al. (2006) model and the Gupta et al. (2010) model. In this section we consider a model that uses both T and F simultaneously.

Table 2
$V(\hat{\mu}_X)_F$ (rows labeled V_F; in bold in the original) and $V(\hat{\mu}_X)_T$ (rows labeled V_T) for various values of T and F, with W = 0.9, n = 1,000, n1 = n2 = 500, X ∼ Poisson(4), S1 ∼ Poisson(2) and S2 ∼ Poisson(5)

F     Row    T = 0      T = 0.1    T = 0.3    T = 0.5    T = 0.7    T = 0.9
0     V_F    0.04378    0.04378    0.04378    0.04378    0.04378    0.04378
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082
0.1   V_F    0.04357    0.04357    0.04357    0.04357    0.04357    0.04357
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082
0.3   V_F    0.04314    0.04314    0.04314    0.04314    0.04314    0.04314
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082
0.5   V_F    0.04267    0.04267    0.04267    0.04267    0.04267    0.04267
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082
0.7   V_F    0.04216    0.04216    0.04216    0.04216    0.04216    0.04216
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082
0.9   V_F    0.04162    0.04162    0.04162    0.04162    0.04162    0.04162
      V_T    0.04378    0.04522    0.04594    0.04378    0.03874    0.03082


In the proposed model, the sample of size n is again split into two subsamples of sizes n1 and n2 (n1 + n2 = n). In each subsample, a fixed predetermined proportion (T) of respondents is instructed to tell the truth and a fixed predetermined proportion (F) of respondents is instructed to scramble their response. The remaining proportion (1 − T − F) of respondents have the option of scrambling their responses additively if they consider the question to be sensitive, or else they can report their true response X. For F = 0, the model is the same as the Gupta et al. (2010) model. For both T and F equal to zero, the model is the same as the Gupta et al. (2006) model.

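Let $X$ denote the true response with unknown mean $\mu_X$ and unknown variance $\sigma_X^2$. Let $S_i$ ($i = 1, 2$) be the scrambling variable associated with the $i$th subsample. Let the known mean of $S_i$ be $\theta_i$ and the known variance of $S_i$ be $\sigma_{S_i}^2$. Let $W$ denote the sensitivity level of the sensitive characteristic. Assume that $X$, $S_1$, and $S_2$ are mutually independent. Let $Z_i$ ($i = 1, 2$) be the reported response in the $i$th subsample. Thus,

$$
Z_i =
\begin{cases}
X & \text{with probability } T + (1 - T - F)(1 - W) \\
X + S_i & \text{with probability } F + (1 - T - F)W
\end{cases}
\tag{11}
$$

For $i = 1, 2$,

$$
E(Z_i) = \mu_X + \theta_i\,[F + (1 - T - F)W] \tag{12}
$$

It can be seen that

$$
\mu_X = \frac{\theta_2 E(Z_1) - \theta_1 E(Z_2)}{\theta_2 - \theta_1}, \quad \theta_1 \neq \theta_2 \tag{13}
$$

$$
W = \left(\frac{1}{1 - T - F}\right)\left(\frac{E(Z_2) - E(Z_1)}{\theta_2 - \theta_1} - F\right), \quad \theta_1 \neq \theta_2,\ T + F \neq 1 \tag{14}
$$

Unbiased estimators $\hat{\mu}_X$ and $\hat{W}$ for $\mu_X$ and $W$, respectively, can be obtained by estimating $E(Z_i)$ by the subsample mean $\bar{Z}_i$ ($i = 1, 2$), and they are given here:

$$
\hat{\mu}_X = \frac{\theta_2 \bar{Z}_1 - \theta_1 \bar{Z}_2}{\theta_2 - \theta_1}, \quad \theta_1 \neq \theta_2 \tag{15}
$$

$$
\hat{W} = \left(\frac{1}{1 - T - F}\right)\left(\frac{\bar{Z}_2 - \bar{Z}_1}{\theta_2 - \theta_1} - F\right), \quad \theta_1 \neq \theta_2,\ T + F \neq 1 \tag{16}
$$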
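The response mechanism of Eq. (11) and the estimators of Eqs. (15) and (16) can be tried out in a small simulation. The sketch below is illustrative only and is not the authors' procedure: all function names, the seed, and the subsample sizes are assumptions; the Poisson distributions follow the setting of the paper's tables; the color-coded card draw is modeled as a uniform random number; and each respondent's sensitivity is drawn independently of X. Only the Python standard library is used, with Knuth's multiplication method standing in for a Poisson generator.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def reported_response(rng, T, F, W, mean_x, theta):
    """One reported response Z_i under Eq. (11)."""
    x = poisson(rng, mean_x)
    card = rng.random()
    if card < T:                 # card instructs the respondent to tell the truth
        scramble = False
    elif card < T + F:           # card instructs the respondent to scramble
        scramble = True
    else:                        # optional: scramble only if the question feels sensitive
        scramble = rng.random() < W
    return x + poisson(rng, theta) if scramble else x


def estimate(z1, z2, theta1, theta2, T, F):
    """Estimators of Eqs. (15) and (16) from the two subsamples."""
    zbar1 = sum(z1) / len(z1)
    zbar2 = sum(z2) / len(z2)
    mu_hat = (theta2 * zbar1 - theta1 * zbar2) / (theta2 - theta1)
    w_hat = ((zbar2 - zbar1) / (theta2 - theta1) - F) / (1.0 - T - F)
    return mu_hat, w_hat


def simulate(seed=0, n1=20000, n2=20000, T=0.1, F=0.3, W=0.8,
             mean_x=4.0, theta1=2.0, theta2=5.0):
    """Simulate both subsamples and return (mu_hat, w_hat)."""
    rng = random.Random(seed)
    z1 = [reported_response(rng, T, F, W, mean_x, theta1) for _ in range(n1)]
    z2 = [reported_response(rng, T, F, W, mean_x, theta2) for _ in range(n2)]
    return estimate(z1, z2, theta1, theta2, T, F)


print(simulate())
```

With subsamples this large, the two estimates should land close to the assumed true values (mean 4 and sensitivity 0.8); the subsample sizes are much larger than the n = 1,000 used in the paper's tables, purely to make the illustration stable.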

It may be verified that for the three-stage model,

$$
\sigma_{Z_1}^2 = \sigma_X^2 + \sigma_{S_1}^2\,[F + (1 - T - F)W] + \theta_1^2\,[F + (1 - T - F)W]\,\{1 - [F + (1 - T - F)W]\} \tag{17}
$$

and

$$
\sigma_{Z_2}^2 = \sigma_X^2 + \sigma_{S_2}^2\,[F + (1 - T - F)W] + \theta_2^2\,[F + (1 - T - F)W]\,\{1 - [F + (1 - T - F)W]\} \tag{18}
$$

Note that for $\theta_1 \neq \theta_2$, $\hat{\mu}_X \sim AN(\mu_X, V_3)$, where

$$
V_3 = \frac{1}{(\theta_2 - \theta_1)^2}\left(\theta_2^2\,\frac{\sigma_{Z_1}^2}{n_1} + \theta_1^2\,\frac{\sigma_{Z_2}^2}{n_2}\right), \quad \theta_1 \neq \theta_2, \tag{19}
$$

and for $\theta_1 \neq \theta_2$, $T + F \neq 1$, $\hat{W} \sim AN(W, V_4)$, where

$$
V_4 = \frac{1}{(\theta_2 - \theta_1)^2 (1 - T - F)^2}\left(\frac{\sigma_{Z_1}^2}{n_1} + \frac{\sigma_{Z_2}^2}{n_2}\right) \tag{20}
$$

The quantities $\sigma_{Z_1}^2$ and $\sigma_{Z_2}^2$ are given in Eqs. (17) and (18), respectively.

Let $\mathrm{Var}(\hat{\mu}_X)_3$ be the variance of the mean estimator for the three-stage model introduced in this section. The following theorem compares the variances of the three-stage model and the two-stage Gupta et al. (2010) model under the assumptions that the same value of T is used in both models and that the scrambling variables of both models have the same mean and the same variance.

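Theorem 3.

$$
\mathrm{Var}(\hat{\mu}_X)_3 \le \mathrm{Var}(\hat{\mu}_X)_T \iff F \ge \left(\frac{2}{1 - W}\right)\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{2n\theta_1^2\theta_2^2} + \frac{1}{2} - W(1 - T)\right]
$$

Proof.

$$
\begin{aligned}
\mathrm{Var}(\hat{\mu}_X)_3 \le \mathrm{Var}(\hat{\mu}_X)_T
&\iff F(1 - W) + 2W(1 - T) \ge 1 + \left(\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2}\right) \\
&\iff F \ge \left(\frac{1}{1 - W}\right)\left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W(1 - T)\right] = F^{\#} \qquad \blacksquare
\end{aligned}
$$

Further,

$$
\begin{aligned}
F^{\#} \le 1
&\iff \left[\frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W(1 - T)\right] \le 1 - W \\
&\iff \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2} + 1 - 2W + 2WT \le 1 - W \\
&\iff T \le \left(\frac{1}{2W}\right)\left[W - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{n\theta_1^2\theta_2^2}\right] \\
&\iff T \le \frac{1}{2} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{2Wn\theta_1^2\theta_2^2}
\end{aligned}
$$

In view of the preceding theorem, one can say that $\mathrm{Var}(\hat{\mu}_X)$ can be decreased further by using a three-stage model if the condition of Theorem 3 is satisfied.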
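The threshold in Theorem 3 can be cross-checked against the variance formula itself. The following is a minimal sketch assuming W = 0.8 and the Poisson setting of the paper's tables; `var_mean` and `f_sharp` are hypothetical helper names, and `var_mean` evaluates Eq. (19) with Eqs. (17) and (18) for a given probability p of a scrambled report.

```python
def var_mean(p, n1=500, n2=500, var_x=4.0, theta1=2.0, var_s1=2.0, theta2=5.0, var_s2=5.0):
    """Eq. (19) with Eqs. (17)-(18); p is the probability of a scrambled report."""
    vz1 = var_x + var_s1 * p + theta1**2 * p * (1.0 - p)
    vz2 = var_x + var_s2 * p + theta2**2 * p * (1.0 - p)
    return (theta2**2 * vz1 / n1 + theta1**2 * vz2 / n2) / (theta2 - theta1) ** 2


def f_sharp(W, T, n1=500, n2=500, theta1=2.0, var_s1=2.0, theta2=5.0, var_s2=5.0):
    """Threshold on F from the proof of Theorem 3."""
    n = n1 + n2
    ratio = (n2 * theta2**2 * var_s1 + n1 * theta1**2 * var_s2) / (n * theta1**2 * theta2**2)
    return (ratio + 1.0 - 2.0 * W * (1.0 - T)) / (1.0 - W)


W, T = 0.8, 0.2
threshold = f_sharp(W, T)
two_stage = var_mean((1.0 - T) * W)              # Gupta et al. (2010): p = (1 - T) W
for F in (0.1, 0.3, 0.5, 0.7):
    three_stage = var_mean(F + (1.0 - T - F) * W)  # three-stage: p = F + (1 - T - F) W
    print(F, round(three_stage, 5), three_stage <= two_stage, F >= threshold)
```

For each F on the grid, the direct variance comparison and the test F ≥ F# agree, which is what the theorem asserts; only values with T + F < 1 are meaningful.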

In the next theorem, we investigate whether, corresponding to the Gupta et al. (2010) model with any value of T, there exists a value T1 < T and a value of F (> 0) for which the three-stage model performs better than the Gupta et al. (2010) model.


Theorem 4. For the Gupta et al. (2010) model where

$$
T \le 1 - \frac{1}{2W} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{2nW\theta_1^2\theta_2^2}
$$

holds, there exists a value T1 < T and a value of F (> 0) for which the three-stage model performs better than the Gupta et al. (2010) model.

Proof. Consider the Gupta et al. (2010) model for any arbitrary value of $T \in (0, 1]$ and $W \in (0, 1]$. Let $\mathrm{Var}(\hat{\mu}_X)_{T^*,F}$ denote the variance of the mean estimator of the three-stage model at $T = T^*$ and $F$, and let $\mathrm{Var}(\hat{\mu}_X)_T$ denote the variance of the mean estimator of the Gupta et al. (2010) model. Then,

$$
\begin{aligned}
\mathrm{Var}(\hat{\mu}_X)_{T^*,F} \le \mathrm{Var}(\hat{\mu}_X)_T
&\iff \left(
\begin{aligned}
&\frac{\theta_2^2}{n_1}\left[(\sigma_{S_1}^2 + \theta_1^2)(F(1 - W) + W(T - T^*)) - \theta_1^2\{(F(1 - W) + W(1 - T^*))^2 - W^2(1 - T)^2\}\right] \\
&+ \frac{\theta_1^2}{n_2}\left[(\sigma_{S_2}^2 + \theta_2^2)(F(1 - W) + W(T - T^*)) - \theta_2^2\{(F(1 - W) + W(1 - T^*))^2 - W^2(1 - T)^2\}\right]
\end{aligned}
\right) \le 0 \\
&\iff [F(1 - W) + W(T - T^*)]\left(
\begin{aligned}
&\frac{\theta_2^2}{n_1}\left[(\sigma_{S_1}^2 + \theta_1^2) - \theta_1^2\{F(1 - W) + W(2 - T - T^*)\}\right] \\
&+ \frac{\theta_1^2}{n_2}\left[(\sigma_{S_2}^2 + \theta_2^2) - \theta_2^2\{F(1 - W) + W(2 - T - T^*)\}\right]
\end{aligned}
\right) \le 0
\end{aligned}
$$

The product of the terms is nonpositive if and only if the terms in the product are of opposite sign. Thus, $\mathrm{Var}(\hat{\mu}_X)_{T^*,F} \le \mathrm{Var}(\hat{\mu}_X)_T$ if and only if

$$
F(1 - W) + W(T - T^*) \le 0 \quad \text{and} \quad \frac{n_2\theta_2^2(\sigma_{S_1}^2 + \theta_1^2) + n_1\theta_1^2(\sigma_{S_2}^2 + \theta_2^2)}{n\theta_1^2\theta_2^2} \ge F(1 - W) + W(2 - T - T^*)
$$

or

$$
F(1 - W) + W(T - T^*) \ge 0 \quad \text{and} \quad \frac{n_2\theta_2^2(\sigma_{S_1}^2 + \theta_1^2) + n_1\theta_1^2(\sigma_{S_2}^2 + \theta_2^2)}{n\theta_1^2\theta_2^2} \le F(1 - W) + W(2 - T - T^*)
$$

that is, if and only if

$$
\begin{aligned}
T^* - T &\ge \max\left\{\frac{F(1 - W)}{W},\ \frac{F(1 - W)}{W} + 2(1 - T) - \frac{1}{W} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{nW\theta_1^2\theta_2^2}\right\} \\
\text{or} \quad T^* - T &\le \min\left\{\frac{F(1 - W)}{W},\ \frac{F(1 - W)}{W} + 2(1 - T) - \frac{1}{W} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{nW\theta_1^2\theta_2^2}\right\}
\end{aligned}
\tag{21}
$$

For the Gupta et al. (2010) model, if the value of T satisfies

$$
2(1 - T) - \frac{1}{W} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{nW\theta_1^2\theta_2^2} \ge 0,
\quad \text{or equivalently} \quad
T \le 1 - \frac{1}{2W} - \frac{n_2\theta_2^2\sigma_{S_1}^2 + n_1\theta_1^2\sigma_{S_2}^2}{2nW\theta_1^2\theta_2^2},
$$

then the minimum in the above expression is $F(1 - W)/W$, which is a nonnegative quantity. Hence T1 can be chosen to be any value less than T, and the inequality (21) will clearly hold, so that $\mathrm{Var}(\hat{\mu}_X)_{T_1,F} \le \mathrm{Var}(\hat{\mu}_X)_T$ for such a value. $\blacksquare$

The significance of this result is that if the condition of Theorem 4 holds, one can reduce the variance of the Gupta et al. (2010) mean estimator further by using a smaller value of T in conjunction with a value of F.

Table 3
$V(\hat{\mu}_X)_3$ (rows labeled V_3; in bold in the original) and $V(\hat{\mu}_X)_T$ (rows labeled V_T) for several combinations of T and F, with W = 0.8, n = 1,000, n1 = n2 = 500, X ∼ Poisson(4), S1 ∼ Poisson(2) and S2 ∼ Poisson(5). A dash marks an entry left blank in the original.

F     Row    T = 0      T = 0.1    T = 0.2    T = 0.3    T = 0.4    T = 0.5    T = 0.7    T = 0.9
0     V_3    0.04533    0.04594    0.04597    0.04544    0.04434    0.04267    0.03762    0.03029
      V_T    0.04533    0.04594    0.04597    0.04544    0.04434    0.04267    0.03762    0.03029
0.1   V_3    0.04509    0.04584    0.04602    0.04563    0.04467    0.04314    0.03837    -
      V_T    0.04533    0.04594    0.04597    0.04544    0.04434    0.04267    0.03762    -
0.3   V_3    0.04451    0.04554    0.04600    0.04589    0.04522    0.04397    -          -
      V_T    0.04533    0.04594    0.04597    0.04544    0.04434    0.04267    -          -
0.5   V_3    0.04378    0.04509    0.04584    0.04602    0.04563    -          -          -
      V_T    0.04533    0.04594    0.04597    0.04544    0.04434    -          -          -
0.7   V_3    0.04291    0.04451    0.04554    -          -          -          -          -
      V_T    0.04533    0.04594    0.04597    -          -          -          -          -
0.8   V_3    0.04242    0.04416    -          -          -          -          -          -
      V_T    0.04533    0.04594    -          -          -          -          -          -
0.9   V_3    0.04189    -          -          -          -          -          -          -
      V_T    0.04533    -          -          -          -          -          -          -
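Theorem 4 can also be checked numerically. The sketch below assumes W = 0.8 and the Poisson setting of the paper's tables, picks a value of T that satisfies the theorem's condition, and scans a grid of smaller values T1 and positive values F. The names `var_mean`, `theorem4_bound`, and `violations`, as well as the grid itself, are illustrative choices rather than anything from the paper.

```python
def var_mean(p, n1=500, n2=500, var_x=4.0, theta1=2.0, var_s1=2.0, theta2=5.0, var_s2=5.0):
    """Variance of the mean estimator for a given probability p of a scrambled report."""
    vz1 = var_x + var_s1 * p + theta1**2 * p * (1.0 - p)
    vz2 = var_x + var_s2 * p + theta2**2 * p * (1.0 - p)
    return (theta2**2 * vz1 / n1 + theta1**2 * vz2 / n2) / (theta2 - theta1) ** 2


def theorem4_bound(W, n1=500, n2=500, theta1=2.0, var_s1=2.0, theta2=5.0, var_s2=5.0):
    """Largest T allowed by the condition of Theorem 4."""
    n = n1 + n2
    ratio = (n2 * theta2**2 * var_s1 + n1 * theta1**2 * var_s2) / (n * theta1**2 * theta2**2)
    return 1.0 - 1.0 / (2.0 * W) - ratio / (2.0 * W)


W, T = 0.8, 0.15                      # T = 0.15 lies below the bound for W = 0.8
baseline = var_mean(W * (1.0 - T))    # Gupta et al. (2010) variance at this T
grid = [i / 20 for i in range(20)]    # 0.00, 0.05, ..., 0.95

violations = []
for T1 in grid:
    if T1 >= T:
        continue
    for F in grid:
        if F > 0 and T1 + F < 1:
            three_stage = var_mean(F + (1.0 - T1 - F) * W)
            if three_stage > baseline + 1e-12:
                violations.append((T1, F))

print("bound on T:", theorem4_bound(W))
print("violations:", violations)
```

An empty `violations` list is what the theorem predicts here: once T satisfies the condition, every smaller T1 combined with any admissible F gives a three-stage variance no larger than the Gupta et al. (2010) variance.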

4. Numerical Comparisons

Table 3 provides a comparison between the three-stage additive optional randomized response model and the two-stage Gupta et al. (2010) model for various combinations of T and F. In Table 3, we use W = 0.8. Blank (dashed) entries correspond to combinations of T and F for which T + F ≥ 1.

Note that when W = 0.8, the three-stage model performs better than or as well as the Gupta et al. (2010) model for many combinations of T and F values. For example, for T = 0.4, $V(\hat{\mu}_X)_T$ is 0.04434, but for F = 0.8 and T = 0.1, $V(\hat{\mu}_X)_3$ is 0.04416; one may even consider the case F = 0.5 and T = 0, for which $V(\hat{\mu}_X)_3$ is 0.04378. It may also be noted that the variance of the mean estimator for a specific value of T can be reduced further by introducing F, as was pointed out in Theorem 3. As an example, for T = 0.2, $V(\hat{\mu}_X)_T$ is 0.04597, but for F = 0.5 and T = 0.2, $V(\hat{\mu}_X)_3$ is 0.04584.

5. Conclusion

In Gupta et al. (2010; 2011), it was observed that introduction of T into the Gupta et al. (2006) model does not always decrease the variance of the mean estimator. In order to reduce this variance, T should be chosen carefully. However, for questions with a high sensitivity level, a larger value of T may be needed, which might adversely affect respondent cooperation. The results of this paper show that one may achieve the desired variance reduction with a smaller value of T by introducing F.

References

Chaudhuri, A. 2012. Unbiased estimation of a sensitive proportion in general sampling by three non-randomized response schemes. J. Stat. Theory Pract., 6(2), 376–381.
Eichhorn, B. H., and L. S. Hayre. 1983. Scrambled randomized response methods for obtaining sensitive quantitative data. J. Stat. Plan. Inference, 7, 307–316.
Gupta, S. N., B. C. Gupta, and S. Singh. 2002. Estimation of sensitivity level of personal interview survey questions. J. Stat. Plan. Inference, 100, 239–247.
Gupta, S. N., B. Thornton, J. Shabbir, and S. Singhal. 2006. A comparison of multiplicative and additive optional RRT models. J. Stat. Theory Appl., 5(3), 226–239.
Gupta, S. N., S. Mehta, J. Shabbir, and B. K. Dass. 2011. Some optimality issues in estimating two-stage optional randomized response models. Amer. J. Math. Management Sci., 31(1–2), 1–12.
Gupta, S. N., S. Mehta, J. Shabbir, and B. K. Dass. 2012. Generalized scrambling in quantitative optional randomized response models. Commun. Stat. Theory Methods, (in press).
Gupta, S. N., J. Shabbir, and S. Sehra. 2010. Mean and sensitivity estimation in optional randomized response models. J. Stat. Plan. Inference, 140, 2870–2874.
Huang, K. C. 2010. Unbiased estimators of mean, variance and sensitivity level for quantitative characteristics in finite population sampling. Metrika, 71, 341–352.
Saha, A. 2008. A randomized response technique for quantitative data under unequal probability sampling. J. Stat. Theory Pract., 2(4), 589–596.
