a new estimator of population mean in stratified sampling
This article was downloaded by: [Monash University Library]
On: 09 September 2013, At: 12:26
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Communications in Statistics - Theory and Methods
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20

A New Estimator of Population Mean in Stratified Sampling
Javid Shabbir (a) & Sat Gupta (b)
(a) Department of Statistics, Quaid-I-Azam University, Islamabad, Pakistan
(b) Department of Mathematical Sciences, University of North Carolina at Greensboro, North Carolina, USA
Published online: 22 Sep 2006.

To cite this article: Javid Shabbir & Sat Gupta (2006) A New Estimator of Population Mean in Stratified Sampling, Communications in Statistics - Theory and Methods, 35:7, 1201-1209, DOI: 10.1080/03610920600629112

To link to this article: http://dx.doi.org/10.1080/03610920600629112
Communications in Statistics - Theory and Methods, 35: 1201–1209, 2006
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print/1532-415X online
DOI: 10.1080/03610920600629112
Sampling Theory
A New Estimator of Population Mean in Stratified Sampling

JAVID SHABBIR¹ AND SAT GUPTA²

¹ Department of Statistics, Quaid-I-Azam University, Islamabad, Pakistan
² Department of Mathematical Sciences, University of North Carolina at Greensboro, North Carolina, USA
Kadilar and Cingi (2005) have suggested a new ratio estimator in stratified sampling. The efficiency of this estimator is compared with the traditional combined ratio estimator on the basis of mean square error (MSE). We propose another estimator by utilizing a simple transformation introduced by Bedi (1996). The proposed estimator is found to be more efficient than the traditional combined ratio estimator as well as the Kadilar and Cingi (2005) ratio estimator.
Keywords Bias; MSE; Ratio estimator; Stratified sampling.
Mathematics Subject Classification Primary 62D05; Secondary 62F10.
1. Introduction
Let a finite population having $N$ distinct and identifiable units be divided into $L$ strata. Let $n_h$ be the size of the sample drawn from the $h$th stratum of size $N_h$ by using simple random sampling without replacement. Let

$$\sum_{h=1}^{L} n_h = n \quad\text{and}\quad \sum_{h=1}^{L} N_h = N.$$
Let $y$ and $x$ be the response and auxiliary variables, respectively, assuming values $y_{hi}$ and $x_{hi}$ for the $i$th unit in the $h$th stratum. Let the stratum means be

$$\bar{Y}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} y_{hi} \quad\text{and}\quad \bar{X}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} x_{hi},$$

respectively.

Received August 12, 2005; Accepted November 23, 2005.
Address correspondence to Sat Gupta, Department of Mathematical Sciences, University of North Carolina at Greensboro, North Carolina, USA; E-mail: [email protected]

Let

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h \quad\text{and}\quad \bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h,$$
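where

$$\bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi} \quad\text{and}\quad \bar{x}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} x_{hi}$$

are the stratum sample means and $W_h = N_h/N$. To estimate $\bar{Y} = \sum_{h=1}^{L} W_h \bar{Y}_h$, we assume that $\bar{X} = \sum_{h=1}^{L} W_h \bar{X}_h$ is known.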
A commonly used estimator for $\bar{Y}$ is the traditional combined ratio estimator, defined as

$$\hat{\bar{Y}}_{CR} = \bar{y}_{st}\left(\frac{\bar{X}}{\bar{x}_{st}}\right). \tag{1}$$
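As a minimal sketch of how the stratified sample means and the combined ratio estimator in (1) fit together, the following Python snippet computes both from per-stratum samples. The function names and the toy data are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from statistics import mean


def stratified_mean(samples, stratum_sizes):
    """Weighted mean sum_h W_h * (sample mean in stratum h), with W_h = N_h / N."""
    total = sum(stratum_sizes)
    return sum(n_h / total * mean(s) for s, n_h in zip(samples, stratum_sizes))


def combined_ratio_estimate(y_samples, x_samples, stratum_sizes, x_pop_mean):
    """Combined ratio estimator: y_st * (Xbar / x_st), as in (1)."""
    y_st = stratified_mean(y_samples, stratum_sizes)
    x_st = stratified_mean(x_samples, stratum_sizes)
    return y_st * x_pop_mean / x_st


# Hypothetical toy data: two strata of sizes 40 and 60, with y = 2x exactly.
y_samples = [[10.0, 12.0], [20.0, 24.0, 22.0]]
x_samples = [[5.0, 6.0], [10.0, 12.0, 11.0]]
stratum_sizes = [40, 60]
x_pop_mean = 9.0  # assumed known population mean of x

estimate = combined_ratio_estimate(y_samples, x_samples, stratum_sizes, x_pop_mean)
print(estimate)
```

Because the toy data satisfy y = 2x exactly, the estimate should come out at about twice the assumed population mean of x.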
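The bias of $\hat{\bar{Y}}_{CR}$, to a first degree of approximation, is given by

$$\operatorname{Bias}(\hat{\bar{Y}}_{CR}) \cong \frac{1}{\bar{X}} \sum_{h=1}^{L} W_h^2 \gamma_h \left[ R S_{xh}^2 - \rho_h S_{yh} S_{xh} \right], \tag{2}$$

where $\gamma_h = \left(\frac{1}{n_h} - \frac{1}{N_h}\right)$, $R = \bar{Y}/\bar{X}$, and $\rho_h$, $S_{yh}$, and $S_{xh}$ are, respectively, the population correlation coefficient between $y$ and $x$, the population standard deviation of $y$, and the population standard deviation of $x$ in stratum $h$.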
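The MSE of $\hat{\bar{Y}}_{CR}$, to a first degree of approximation, is given by

$$\operatorname{MSE}(\hat{\bar{Y}}_{CR}) \cong \sum_{h=1}^{L} W_h^2 \gamma_h \left[ S_{yh}^2 + R^2 S_{xh}^2 - 2R\rho_h S_{yh} S_{xh} \right]. \tag{3}$$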
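The approximations (2) and (3) can be evaluated directly from stratum summaries. The sketch below does this for Example 1 using the values listed in Table 1 of the Appendix; the variable names are hypothetical, and the per-stratum factor is taken to be 1/n_h - 1/N_h as defined after (2). Since the tabulated inputs are rounded, the output should be expected to land near, but not exactly on, the combined ratio row of Table 4.

```python
# Stratum summaries for Example 1, copied from Table 1 in the Appendix.
N_h = [106, 106, 94, 171, 204, 173]
n_h = [9, 17, 38, 67, 7, 2]
S_xh = [49189, 57461, 160757, 285603, 45403, 18794]
S_yh = [6425, 11552, 29907, 28643, 2390, 946]
rho_h = [0.82, 0.86, 0.90, 0.99, 0.71, 0.89]
X_mean, Y_mean = 37600.0, 2930.0

N = sum(N_h)
R = Y_mean / X_mean

# a_h = W_h^2 * (1/n_h - 1/N_h): the common weight appearing in (2) and (3).
a_h = [(Nh / N) ** 2 * (1.0 / nh - 1.0 / Nh) for Nh, nh in zip(N_h, n_h)]
rows = list(zip(a_h, S_xh, S_yh, rho_h))

# First-order bias (2) and MSE (3) of the combined ratio estimator.
bias_cr = sum(a * (R * sx**2 - r * sy * sx) for a, sx, sy, r in rows) / X_mean
mse_cr = sum(a * (sy**2 + R**2 * sx**2 - 2 * R * r * sy * sx) for a, sx, sy, r in rows)

print(f"Bias ~ {bias_cr:.2f}, MSE ~ {mse_cr:.0f}")
```

With these inputs the bias comes out negative, which is consistent with Table 4 reporting absolute values of the bias.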
In this paper, we propose another estimator that performs better than the traditional combined ratio estimator and its modification proposed by Kadilar and Cingi (2005).
2. Kadilar and Cingi Estimator
Following Searls (1964), Kadilar and Cingi (2005) have suggested this modification of the combined ratio estimator:

$$\hat{\bar{Y}}_{KC} = K \hat{\bar{Y}}_{CR}, \tag{4}$$

where $K$ is a constant. The bias and MSE of $\hat{\bar{Y}}_{KC}$, to a first degree of approximation, are given by

$$\operatorname{Bias}(\hat{\bar{Y}}_{KC}) = (K - 1)\bar{Y} + K \operatorname{Bias}(\hat{\bar{Y}}_{CR}) \tag{5}$$
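and

$$\operatorname{MSE}(\hat{\bar{Y}}_{KC}) = (K - 1)^2 \bar{Y}^2 + K^2 \operatorname{MSE}(\hat{\bar{Y}}_{CR}). \tag{6}$$

The optimum value of $K$ so as to minimize $\operatorname{MSE}(\hat{\bar{Y}}_{KC})$ is given by

$$K^* = \frac{\bar{Y}^2}{\bar{Y}^2 + \operatorname{MSE}(\hat{\bar{Y}}_{CR})}.$$

The corresponding optimum MSE is given by

$$\operatorname{MSE}(\hat{\bar{Y}}_{KC})_{opt} = \frac{\bar{Y}^2 \operatorname{MSE}(\hat{\bar{Y}}_{CR})}{\bar{Y}^2 + \operatorname{MSE}(\hat{\bar{Y}}_{CR})}. \tag{7}$$

It is obvious from (7) that

$$\operatorname{MSE}(\hat{\bar{Y}}_{KC})_{opt} < \operatorname{MSE}(\hat{\bar{Y}}_{CR}).$$

Hence the estimator $\hat{\bar{Y}}_{KC}$ is more efficient than the combined ratio estimator $\hat{\bar{Y}}_{CR}$.

We would like to propose a similar modification, but of a different estimator, that is based on a transformed auxiliary variable.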
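As a rough numerical check of the optimum constant and of (5) and (7), the sketch below plugs in the Example 1 figures: the population mean of y from Table 1, and the MSE and absolute bias of the combined ratio estimator from Table 4. The function name is hypothetical, and the negative sign given to the bias is an assumption, since Table 4 reports absolute values only.

```python
def kc_optimum(y_mean, mse_cr, bias_cr):
    """Optimum K of Section 2, with the resulting bias (5) and MSE (7)."""
    k_opt = y_mean**2 / (y_mean**2 + mse_cr)
    bias_kc = (k_opt - 1.0) * y_mean + k_opt * bias_cr
    mse_kc = y_mean**2 * mse_cr / (y_mean**2 + mse_cr)
    return k_opt, bias_kc, mse_kc


# Example 1 inputs; the sign of the bias (-13.69) is assumed, not reported.
k_opt, bias_kc, mse_kc = kc_optimum(2930.0, 223496.77, -13.69)
print(round(k_opt, 4), round(abs(bias_kc), 2), round(mse_kc, 2))
```

Under that sign assumption the absolute bias and the MSE should agree, up to rounding, with the Kadilar and Cingi row for Example 1 in Table 4.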
3. Proposed Estimator
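Consider the following estimator, which is an adaptation of the estimator by Ray and Singh (1981):

$$\hat{\bar{Y}}_{RS} = \left[\bar{y} + b(\bar{X}^{\alpha} - \bar{x}^{\alpha})\right]\left(\frac{\bar{X}}{\bar{x}}\right)^{\beta}, \tag{8}$$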
where $\alpha$ and $\beta$ are constants and $b$ is the sample regression coefficient. Kadilar and Cingi (2004) considered a special case of this estimator by setting $\alpha = \beta = 1$. The modified estimator is given by

$$\hat{\bar{Y}}^{*}_{RS} = \left[\bar{y} + b(\bar{X} - \bar{x})\right]\left(\frac{\bar{X}}{\bar{x}}\right). \tag{9}$$
In stratified sampling, the dual of this estimator is given by

$$\hat{\bar{Y}} = \left[\bar{y}_{st} + b(\bar{X} - \bar{x}_{st})\right]\left(\frac{\bar{x}_{st}}{\bar{X}}\right). \tag{10}$$
This estimator can be further modified by using an appropriate transformation of the auxiliary variable. We consider the transformation by Bedi (1996). Let $Z_i = x_i + X$ $(i = 1, \ldots, N)$, where $X$ denotes the population total for the auxiliary variable. In stratified sampling, the above transformation becomes $Z_{hi} = x_{hi} + X$.
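Also $\bar{z}_h = \bar{x}_h + X$ and $\bar{Z}_h = \bar{X}_h + X$, so $\bar{z}_{st} = \sum_{h=1}^{L} W_h(\bar{x}_h + X) = \bar{x}_{st} + N\bar{X}$ and $\bar{Z} = \sum_{h=1}^{L} W_h(\bar{X}_h + X) = (N + 1)\bar{X}$. Now let

$$\varepsilon_0 = \frac{\bar{y}_{st} - \bar{Y}}{\bar{Y}}, \qquad \varepsilon_1 = \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}.$$

Then $E(\varepsilon_0) = E(\varepsilon_1) = 0$, $E(\varepsilon_0^2) = \sum_{h=1}^{L} W_h^2 \gamma_h C_{yh}^2$, $E(\varepsilon_1^2) = \sum_{h=1}^{L} W_h^2 \gamma_h C_{xh}^2$, and $E(\varepsilon_0 \varepsilon_1) = \sum_{h=1}^{L} W_h^2 \gamma_h \rho_h C_{yh} C_{xh}$, where $C_{yh} = S_{yh}/\bar{Y}$ and $C_{xh} = S_{xh}/\bar{X}$.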
Using Bedi's transformation, the estimator $\hat{\bar{Y}}$ in (10) can be modified to

$$\hat{\bar{Y}}_M = \left[\bar{y}_{st} + b(\bar{X} - \bar{x}_{st})\right]\left(\frac{\bar{z}_{st}}{\bar{Z}}\right). \tag{11}$$
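The estimator $\hat{\bar{Y}}_M$ can also be written as

$$\hat{\bar{Y}}_M = \left[\bar{Y}(1 + \varepsilon_0) + b\left(\bar{X} - \bar{X}(1 + \varepsilon_1)\right)\right]\left(\frac{\bar{X}(1 + \varepsilon_1) + N\bar{X}}{(N + 1)\bar{X}}\right)$$

or

$$\hat{\bar{Y}}_M = \left[\bar{Y}(1 + \varepsilon_0) - b\bar{X}\varepsilon_1\right]\left(1 + \frac{\varepsilon_1}{N + 1}\right). \tag{12}$$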
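The rewriting of (11) as (12) is an exact algebraic identity, which can be checked numerically. In the sketch below, `e0` and `e1` stand for the relative errors of the stratified sample means of y and x defined before (11); the function names and all input numbers are hypothetical and serve only to exercise the two formulas.

```python
def y_m_direct(y_st, x_st, b, x_mean, n_pop):
    """Form (11), using z_st = x_st + N*Xbar and Zbar = (N + 1)*Xbar."""
    z_st = x_st + n_pop * x_mean
    z_mean = (n_pop + 1) * x_mean
    return (y_st + b * (x_mean - x_st)) * z_st / z_mean


def y_m_expanded(y_st, x_st, b, x_mean, y_mean, n_pop):
    """Form (12), written in terms of the relative errors e0 and e1."""
    e0 = (y_st - y_mean) / y_mean
    e1 = (x_st - x_mean) / x_mean
    return (y_mean * (1 + e0) - b * x_mean * e1) * (1 + e1 / (n_pop + 1))


# Hypothetical inputs, not taken from the paper's examples.
args = dict(y_st=17.6, x_st=8.8, b=1.9, x_mean=9.0, n_pop=100)
direct = y_m_direct(**args)
expanded = y_m_expanded(y_mean=18.0, **args)
print(direct, expanded)
```

The two printed values should agree up to floating-point error. Note how weak the ratio adjustment is: the last factor differs from 1 only by `e1 / (N + 1)`.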
We now propose an estimator of the type discussed in (4) by modifying the estimator in (11). The proposed estimator is

$$\hat{\bar{Y}}_P = \lambda \hat{\bar{Y}}_M, \tag{13}$$
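where $\lambda$ is a constant to be determined later. By (12) and (13), we have

$$\hat{\bar{Y}}_P = \lambda\left[\bar{Y}\left(1 + \varepsilon_0 + \frac{\varepsilon_1}{N + 1} + \frac{\varepsilon_0\varepsilon_1}{N + 1}\right) - b\bar{X}\left(\varepsilon_1 + \frac{\varepsilon_1^2}{N + 1}\right)\right]$$

or

$$\hat{\bar{Y}}_P - \bar{Y} = (\lambda - 1)\bar{Y} + \lambda\left[\bar{Y}\left(\varepsilon_0 + \frac{\varepsilon_1}{N + 1} + \frac{\varepsilon_0\varepsilon_1}{N + 1}\right) - b\bar{X}\left(\varepsilon_1 + \frac{\varepsilon_1^2}{N + 1}\right)\right]. \tag{14}$$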
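From (14), the bias of $\hat{\bar{Y}}_P$ is given by

$$\operatorname{Bias}(\hat{\bar{Y}}_P) = E(\hat{\bar{Y}}_P - \bar{Y}) = (\lambda - 1)\bar{Y} + \lambda\bar{Y}E\left(\frac{\varepsilon_0\varepsilon_1}{N + 1}\right) - \lambda\beta\bar{X}E\left(\frac{\varepsilon_1^2}{N + 1}\right), \tag{15}$$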
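where

$$\beta = \frac{\sum_{h=1}^{L} W_h^2 \gamma_h \rho_h S_{yh} S_{xh}}{\sum_{h=1}^{L} W_h^2 \gamma_h S_{xh}^2}$$

is the population regression coefficient. Substituting for $\beta$ in (15), we get the bias as

$$\operatorname{Bias}(\hat{\bar{Y}}_P) = (\lambda - 1)\bar{Y}. \tag{16}$$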
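The cancellation that leads to (16) can be written out explicitly. The lines below are a sketch of that step; the symbols $\varepsilon_0, \varepsilon_1$ for the relative errors, $\gamma_h = 1/n_h - 1/N_h$, $\beta$ for the population regression coefficient, and $\lambda$ for the constant in (13) are notational assumptions made here for readability.

$$
\begin{aligned}
\bar{Y}\,E(\varepsilon_0\varepsilon_1) &= \frac{1}{\bar{X}}\sum_{h=1}^{L} W_h^2\gamma_h\rho_h S_{yh}S_{xh},\\
\beta\bar{X}\,E(\varepsilon_1^2) &= \frac{\beta}{\bar{X}}\sum_{h=1}^{L} W_h^2\gamma_h S_{xh}^2 = \frac{1}{\bar{X}}\sum_{h=1}^{L} W_h^2\gamma_h\rho_h S_{yh}S_{xh}.
\end{aligned}
$$

The two expectation terms in (15) are therefore equal and enter with opposite signs, so only $(\lambda - 1)\bar{Y}$ remains.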
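Also from (14), $\operatorname{MSE}(\hat{\bar{Y}}_P)$ is given by

$$\operatorname{MSE}(\hat{\bar{Y}}_P) = E(\hat{\bar{Y}}_P - \bar{Y})^2 = E\left[(\lambda - 1)\bar{Y} + \lambda\bar{Y}\left(\varepsilon_0 + \frac{\varepsilon_1}{N + 1}\right) - \lambda b\bar{X}\varepsilon_1\right]^2$$

or

$$\operatorname{MSE}(\hat{\bar{Y}}_P) = (\lambda - 1)^2\bar{Y}^2 + \lambda^2\left[\bar{Y}^2 E\left(\varepsilon_0^2 + \frac{\varepsilon_1^2}{(N + 1)^2} + \frac{2\varepsilon_0\varepsilon_1}{N + 1}\right) + \beta^2\bar{X}^2 E(\varepsilon_1^2)\right] - 2\lambda^2\beta\bar{Y}\bar{X}E\left(\varepsilon_0\varepsilon_1 + \frac{\varepsilon_1^2}{N + 1}\right). \tag{17}$$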
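Again, substitution for $\beta$ gives

$$\operatorname{MSE}(\hat{\bar{Y}}_P) = (\lambda - 1)^2\bar{Y}^2 + \lambda^2\sum_{h=1}^{L} W_h^2\gamma_h\left[S_{yh}^2(1 - \rho_c^2) + \frac{R^2 S_{xh}^2}{(N + 1)^2}\right], \tag{18}$$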
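where

$$\rho_c = \frac{\sum_{h=1}^{L} W_h^2\gamma_h\rho_h S_{yh}S_{xh}}{\sqrt{\sum_{h=1}^{L} W_h^2\gamma_h S_{xh}^2}\,\sqrt{\sum_{h=1}^{L} W_h^2\gamma_h S_{yh}^2}}$$

is the combined correlation coefficient in stratified sampling across all strata.

For $\lambda = 1$, the expressions in (16) and (18) give the bias and variance of $\hat{\bar{Y}}_M$ as

$$\operatorname{Bias}(\hat{\bar{Y}}_M) = 0 \tag{19}$$

$$\operatorname{Var}(\hat{\bar{Y}}_M) = \sum_{h=1}^{L} W_h^2\gamma_h\left[S_{yh}^2(1 - \rho_c^2) + \frac{R^2 S_{xh}^2}{(N + 1)^2}\right]. \tag{20}$$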
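The reduction of (17) to (18) can be checked term by term. The sketch below uses the shorthand $a_h = W_h^2\gamma_h$ with $\gamma_h = 1/n_h - 1/N_h$, and writes $\varepsilon_0, \varepsilon_1, \beta$ for the relative errors and the population regression coefficient; these symbols are notational assumptions introduced only for this check.

$$
\begin{aligned}
\bar{Y}^2 E(\varepsilon_0^2) &= \sum_h a_h S_{yh}^2, \qquad
\frac{\bar{Y}^2 E(\varepsilon_1^2)}{(N+1)^2} = \frac{R^2\sum_h a_h S_{xh}^2}{(N+1)^2},\\
\frac{2\bar{Y}^2 E(\varepsilon_0\varepsilon_1)}{N+1} - \frac{2\beta\bar{Y}\bar{X}E(\varepsilon_1^2)}{N+1} &= \frac{2R}{N+1}\left[\sum_h a_h\rho_h S_{yh}S_{xh} - \beta\sum_h a_h S_{xh}^2\right] = 0,\\
\beta^2\bar{X}^2E(\varepsilon_1^2) - 2\beta\bar{Y}\bar{X}E(\varepsilon_0\varepsilon_1) &= -\beta^2\sum_h a_h S_{xh}^2 = -\rho_c^2\sum_h a_h S_{yh}^2.
\end{aligned}
$$

Adding the surviving terms gives $\sum_h a_h\left[S_{yh}^2(1 - \rho_c^2) + R^2S_{xh}^2/(N+1)^2\right]$, which is the bracketed sum in (18) and (20).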
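However, we seek an optimum value of $\lambda$ by minimizing $\operatorname{MSE}(\hat{\bar{Y}}_P)$. Setting $\partial\operatorname{MSE}(\hat{\bar{Y}}_P)/\partial\lambda = 0$ in (18), we get

$$\lambda = \frac{\bar{Y}^2}{\bar{Y}^2 + \operatorname{Var}(\hat{\bar{Y}}_M)} = \lambda_{opt}\ \text{(say)}, \qquad 0 < \lambda_{opt} < 1.$$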
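The minimization step is short enough to spell out. Writing $\lambda$ for the constant in (13) and $V$ for the variance in (20) (both symbols are notational assumptions made here), (18) reads $(\lambda - 1)^2\bar{Y}^2 + \lambda^2 V$, so that

$$
\frac{\partial}{\partial\lambda}\left[(\lambda - 1)^2\bar{Y}^2 + \lambda^2 V\right] = 2(\lambda - 1)\bar{Y}^2 + 2\lambda V = 0
\quad\Longrightarrow\quad
\lambda = \frac{\bar{Y}^2}{\bar{Y}^2 + V}.
$$

The second derivative is $2(\bar{Y}^2 + V) > 0$, so this stationary point is a minimum, and it lies strictly between 0 and 1 whenever $V > 0$ and $\bar{Y} \neq 0$.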
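We obtain the optimum bias and MSE of $\hat{\bar{Y}}_P$ after substituting the optimum value of $\lambda$ in (16) and (18). These are given by

$$\operatorname{Bias}(\hat{\bar{Y}}_P)_{opt} = -\frac{\bar{Y}\operatorname{Var}(\hat{\bar{Y}}_M)}{\bar{Y}^2 + \operatorname{Var}(\hat{\bar{Y}}_M)} \tag{21}$$

and

$$\operatorname{MSE}(\hat{\bar{Y}}_P)_{opt} = \frac{\bar{Y}^2\operatorname{Var}(\hat{\bar{Y}}_M)}{\bar{Y}^2 + \operatorname{Var}(\hat{\bar{Y}}_M)}. \tag{22}$$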
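One way to see that (21) and (22) hang together is to run them against the Example 1 entries of Table 4. The sketch below backs out the variance of the unshrunk estimator by inverting (22) from the tabulated optimum MSE, then recomputes the optimum constant, bias, and MSE. The function and variable names are hypothetical, and the backed-out variance is a derived quantity, not a figure reported in the paper.

```python
def proposed_optimum(y_mean, var_m):
    """Optimum shrinkage constant with the bias (16) and MSE (18) evaluated at it."""
    lam = y_mean**2 / (y_mean**2 + var_m)
    bias = (lam - 1.0) * y_mean
    mse = (lam - 1.0) ** 2 * y_mean**2 + lam**2 * var_m
    return lam, bias, mse


y_mean = 2930.0          # population mean of y, Table 1
mse_table = 213877.86    # optimum MSE of the proposed estimator, Table 4

# Invert (22) to recover the variance in (20) implied by the tabulated MSE.
var_m = y_mean**2 * mse_table / (y_mean**2 - mse_table)

lam, bias, mse = proposed_optimum(y_mean, var_m)
print(round(lam, 4), round(bias, 2), round(mse, 2))
```

The recomputed MSE should reproduce the tabulated value, and the magnitude of the bias should match the absolute bias listed for the proposed estimator in Table 4, since (21) and (22) imply that the optimum bias equals minus the optimum MSE divided by the population mean of y.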
4. Efficiency Comparison
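We now compare the proposed estimator $\hat{\bar{Y}}_P$ with the traditional combined ratio estimator $\hat{\bar{Y}}_{CR}$ and the Kadilar and Cingi (2005) estimator $\hat{\bar{Y}}_{KC}$.

(i) From (3) and (22),

$$\operatorname{MSE}(\hat{\bar{Y}}_{CR}) - \operatorname{MSE}(\hat{\bar{Y}}_P)_{opt} = \frac{\bar{Y}^2\left[\operatorname{MSE}(\hat{\bar{Y}}_{CR}) - \operatorname{Var}(\hat{\bar{Y}}_M)\right] + \operatorname{MSE}(\hat{\bar{Y}}_{CR})\operatorname{Var}(\hat{\bar{Y}}_M)}{\bar{Y}^2 + \operatorname{Var}(\hat{\bar{Y}}_M)}.$$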
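Thus $\operatorname{MSE}(\hat{\bar{Y}}_{CR}) - \operatorname{MSE}(\hat{\bar{Y}}_P)_{opt} > 0$ if $\operatorname{MSE}(\hat{\bar{Y}}_{CR}) > \operatorname{Var}(\hat{\bar{Y}}_M)$. This will be true if

$$\sum_{h=1}^{L} W_h^2\gamma_h R^2 S_{xh}^2\left[\left(1 - \frac{\rho_h S_{yh}}{R S_{xh}}\right)^2 + \left(\frac{S_{yh}}{R S_{xh}}\right)^2(\rho_c^2 - \rho_h^2) - \frac{1}{(N + 1)^2}\right] > 0$$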
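or

$$(N + 1)^2 > \frac{A}{B + C},$$

where

$$A = \sum_{h=1}^{L} W_h^2\gamma_h S_{xh}^2, \qquad B = \sum_{h=1}^{L} W_h^2\gamma_h S_{xh}^2\left(1 - \frac{\rho_h S_{yh}}{R S_{xh}}\right)^2,$$

and

$$C = \frac{\sum_{h=1}^{L} W_h^2\gamma_h S_{yh}^2(\rho_c^2 - \rho_h^2)}{R^2}.$$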
The condition $(N + 1)^2 > A/(B + C)$ is likely to hold true always because of the term $(N + 1)$ on the left-hand side.
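(ii) From (7) and (22),

$$\operatorname{MSE}(\hat{\bar{Y}}_{KC})_{opt} - \operatorname{MSE}(\hat{\bar{Y}}_P)_{opt} = \frac{\bar{Y}^4\left[\operatorname{MSE}(\hat{\bar{Y}}_{CR}) - \operatorname{Var}(\hat{\bar{Y}}_M)\right]}{\left[\bar{Y}^2 + \operatorname{MSE}(\hat{\bar{Y}}_{CR})\right]\left[\bar{Y}^2 + \operatorname{Var}(\hat{\bar{Y}}_M)\right]} > 0$$

if $\operatorname{MSE}(\hat{\bar{Y}}_{CR}) > \operatorname{Var}(\hat{\bar{Y}}_M)$. But this will be true if $(N + 1)^2 > A/(B + C)$, as in (i).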
Hence the proposed estimator $\hat{\bar{Y}}_P$ is more efficient than both the combined ratio estimator $\hat{\bar{Y}}_{CR}$ and the estimator $\hat{\bar{Y}}_{KC}$ by Kadilar and Cingi (2005) if $(N + 1)^2 > A/(B + C)$, a condition likely to hold true almost always.
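The efficiency condition can be evaluated from stratum summaries alone. The sketch below computes A, B, and C for Example 1 from the values in Table 1 of the Appendix, taking the per-stratum factor to be 1/n_h - 1/N_h and recomputing the combined correlation coefficient from the data rather than using the tabulated one. Variable names are hypothetical. Because B + C is a small difference of large terms, the ratio is sensitive to rounding in the inputs, so it should be expected to match the paper's reported figure only in order of magnitude.

```python
from math import sqrt

# Stratum summaries for Example 1, copied from Table 1 in the Appendix.
N_h = [106, 106, 94, 171, 204, 173]
n_h = [9, 17, 38, 67, 7, 2]
S_xh = [49189, 57461, 160757, 285603, 45403, 18794]
S_yh = [6425, 11552, 29907, 28643, 2390, 946]
rho_h = [0.82, 0.86, 0.90, 0.99, 0.71, 0.89]
X_mean, Y_mean = 37600.0, 2930.0

N = sum(N_h)
R = Y_mean / X_mean
a_h = [(Nh / N) ** 2 * (1.0 / nh - 1.0 / Nh) for Nh, nh in zip(N_h, n_h)]
rows = list(zip(a_h, S_xh, S_yh, rho_h))

A = sum(a * sx**2 for a, sx, sy, r in rows)
S_yy = sum(a * sy**2 for a, sx, sy, r in rows)
S_xy = sum(a * r * sy * sx for a, sx, sy, r in rows)
rho_c = S_xy / sqrt(A * S_yy)  # combined correlation coefficient

B = sum(a * sx**2 * (1 - r * sy / (R * sx)) ** 2 for a, sx, sy, r in rows)
C = sum(a * sy**2 * (rho_c**2 - r**2) for a, sx, sy, r in rows) / R**2

ratio = A / (B + C)
condition_holds = (N + 1) ** 2 > ratio
print((N + 1) ** 2, round(ratio, 1), condition_holds)
```

The left-hand side is in the hundreds of thousands while the ratio is of order one hundred, so the condition holds with a very wide margin for this example.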
5. Data Description and Results
We use the following three examples for comparison. Data summaries and results are in Tables 1–4 in the Appendix.
Example 1 [Source: Kadilar and Cingi (2005)]. y is apple production amount in 854 villages of Turkey in 1999, and x is the number of apple trees in 854 villages of Turkey in 1999. The data are stratified by region of Turkey, and from each stratum villages are selected randomly using the Neyman allocation

$$n_h = \frac{n N_h S_h}{\sum_{h=1}^{L} N_h S_h}.$$
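The Neyman allocation for Example 1 can be reproduced from Table 1. The sketch below assumes the standard form of the allocation, in which stratum h receives a share of the total sample size n proportional to N_h S_h, and it assumes that S_h is the stratum standard deviation of y (the S_yh row of Table 1); neither choice is stated explicitly in the text. The function name is hypothetical.

```python
def neyman_allocation(n, stratum_sizes, stratum_sds):
    """Unrounded Neyman allocation: n_h proportional to N_h * S_h."""
    weights = [N_h * S_h for N_h, S_h in zip(stratum_sizes, stratum_sds)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Example 1 values from Table 1; using S_yh as S_h is an assumption.
N_h = [106, 106, 94, 171, 204, 173]
S_yh = [6425, 11552, 29907, 28643, 2390, 946]

allocation = neyman_allocation(140, N_h, S_yh)
rounded = [round(a) for a in allocation]
print(rounded)
```

Under these assumptions the rounded allocation agrees with the n_h row of Table 1.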
Example 2 [Source: Murthy (1967, p. 228)]. y is factories in region; x is fixed capital. From 80 factories, the data have been classified arbitrarily into four strata on the basis of x values. The strata are $x \le 500$, $500 < x \le 1000$, $1000 < x \le 2000$, and $x > 2000$, respectively. We have randomly selected samples from each stratum by using the proportional allocation, $n_h = nN_h/N$, using a total sample size of $n = 45$.
Example 3 [Source: Murthy (1967, p. 228)]. y is factories in region; x is number of workers. From 80 factories, the data have been classified arbitrarily into four strata on the basis of x values. The strata are $x < 100$, $100 \le x < 200$, $200 \le x < 500$, and $x \ge 500$, respectively. Again, we use the same procedure of selecting the samples from each stratum as we did in Example 2. We use a total sample size of $n = 45$.
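The proportional allocation used in Examples 2 and 3 is easy to reproduce from the stratum sizes in Tables 2 and 3. The sketch below rounds each share to the nearest integer; the rounding rule and the function name are assumptions, since the text does not say how fractional sizes were handled.

```python
def proportional_allocation(n, stratum_sizes):
    """Proportional allocation n_h = n * N_h / N, rounded to the nearest integer."""
    N = sum(stratum_sizes)
    return [round(n * N_h / N) for N_h in stratum_sizes]


example_2 = proportional_allocation(45, [19, 32, 14, 15])  # N_h from Table 2
example_3 = proportional_allocation(45, [25, 23, 16, 16])  # N_h from Table 3
print(example_2, example_3)
```

With this rounding the results agree with the n_h rows of Tables 2 and 3. In general, rounded shares need not add up to n, so a real design may need a final adjustment step.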
From data summaries in Tables 1–3, it is easy to verify that the condition $(N + 1)^2 > A/(B + C)$ is satisfied comfortably for all three examples.

Condition verification:

(i) Example 1: $(N + 1)^2 = 731025 > A/(B + C) = 93.295$
(ii) Example 2: $(N + 1)^2 = 6561 > A/(B + C) = 3.911$
(iii) Example 3: $(N + 1)^2 = 6561 > A/(B + C) = 1.800$
Results in Table 4 clearly show gains in efficiency when using the proposed estimator.
6. Concluding Remarks
The new estimator $\hat{\bar{Y}}_P$ proves more efficient than the traditional combined ratio estimator $\hat{\bar{Y}}_{CR}$ and its modification $\hat{\bar{Y}}_{KC}$ proposed by Kadilar and Cingi (2005), since the condition $(N + 1)^2 > A/(B + C)$ is likely to hold true comfortably. For the three examples discussed here, there is not much difference between the Kadilar and Cingi estimator $\hat{\bar{Y}}_{KC}$ and the traditional combined ratio estimator $\hat{\bar{Y}}_{CR}$.
Appendix
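Table 1
Example 1 summaries

| Total | Stratum → | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| $N = 854$ | $N_h$ | 106 | 106 | 94 | 171 | 204 | 173 |
| $n = 140$ | $n_h$ | 9 | 17 | 38 | 67 | 7 | 2 |
| $\bar{X} = 37600$ | $\bar{X}_h$ | 24375 | 27421 | 72409 | 74365 | 26441 | 9844 |
| $\bar{Y} = 2930$ | $\bar{Y}_h$ | 1536 | 2212 | 9384 | 5588 | 967 | 404 |
| $S_x = 144794$ | $S_{xh}$ | 49189 | 57461 | 160757 | 285603 | 45403 | 18794 |
| $S_y = 17106$ | $S_{yh}$ | 6425 | 11552 | 29907 | 28643 | 2390 | 946 |
| $\rho = 0.92$ | $\rho_h$ | 0.82 | 0.86 | 0.90 | 0.99 | 0.71 | 0.89 |
| $R = 0.07793$ | $\gamma_h$ | 0.102 | 0.049 | 0.016 | 0.009 | 0.138 | 0.4942 |
| $\rho_c = 0.82629$ | $W_h^2$ | 0.015 | 0.015 | 0.012 | 0.04 | 0.057 | 0.041 |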
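Table 2
Example 2 summaries

| Total | Stratum → | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $N = 80$ | $N_h$ | 19 | 32 | 14 | 15 |
| $n = 45$ | $n_h$ | 11 | 18 | 8 | 8 |
| $\bar{X} = 1126.46$ | $\bar{X}_h$ | 349.684 | 706.594 | 1539.57 | 2620.53 |
| $\bar{Y} = 5182.64$ | $\bar{Y}_h$ | 2967.95 | 4657.63 | 6537.21 | 7843.67 |
| $S_x = 845.61$ | $S_{xh}$ | 109.449 | 109.222 | 277.181 | 370.972 |
| $S_y = 1835.66$ | $S_{yh}$ | 757.089 | 669.127 | 416.113 | 645.688 |
| $\rho = 0.9413$ | $\rho_h$ | 0.9364 | 0.9260 | 0.9835 | 0.9692 |
| $R = 4.6008$ | $\gamma_h$ | 0.0383 | 0.0243 | 0.0536 | 0.0583 |
| $\rho_c = 0.77693$ | $W_h^2$ | 0.05641 | 0.16 | 0.03063 | 0.03516 |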
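Table 3
Example 3 summaries

| Total | Stratum → | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| $N = 80$ | $N_h$ | 25 | 23 | 16 | 16 |
| $n = 45$ | $n_h$ | 14 | 13 | 9 | 9 |
| $\bar{X} = 284.75$ | $\bar{X}_h$ | 71.0 | 140.696 | 362.937 | 749.5 |
| $\bar{Y} = 5182.64$ | $\bar{Y}_h$ | 3156.64 | 4766.22 | 6334.19 | 7795.31 |
| $S_x = 270.495$ | $S_{xh}$ | 14.6116 | 28.0364 | 91.3823 | 174.463 |
| $S_y = 1835.66$ | $S_{yh}$ | 740.012 | 515.697 | 501.399 | 653.09 |
| $\rho = 0.9144$ | $\rho_h$ | 0.8167 | 0.8231 | 0.9582 | 0.9805 |
| $R = 18.201$ | $\gamma_h$ | 0.03143 | 0.03344 | 0.04861 | 0.04861 |
| $\rho_c = 0.67079$ | $W_h^2$ | 0.09766 | 0.08266 | 0.04 | 0.04 |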
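Table 4
Results based on data in Tables 1, 2, and 3

| $\hat{\bar{Y}}$ | Example 1 $\lvert\text{Bias}\rvert$ | Example 1 MSE | Example 1 Eff | Example 2 $\lvert\text{Bias}\rvert$ | Example 2 MSE | Example 2 Eff | Example 3 $\lvert\text{Bias}\rvert$ | Example 3 MSE | Example 3 Eff |
|---|---|---|---|---|---|---|---|---|---|
| $\hat{\bar{Y}}_{CR}$ | 13.69 | 223496.77 | 100.00 | 0.99 | 4232.68 | 100.00 | 3.73 | 16456.65 | 100.00 |
| $\hat{\bar{Y}}_{KC}$ | 87.69 | 217825.95 | 102.60 | 0.17 | 4232.01 | 100.02 | 0.55 | 16446.57 | 100.06 |
| $\hat{\bar{Y}}_P$ | 73.00 | 213877.86 | 104.50 | 0.31 | 1633.77 | 259.07 | 0.40 | 2057.73 | 799.75 |

Efficiency = Eff = $\dfrac{\operatorname{MSE}(\hat{\bar{Y}}_{CR})}{\operatorname{MSE}(\hat{\bar{Y}}_z)_{opt}} \times 100$, where $z = KC, P$.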
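The efficiency column can be recomputed from the MSE entries alone. The sketch below applies the definition under Table 4 to the tabulated MSE values; the dictionary layout and names are hypothetical.

```python
# MSE values copied from Table 4 (CR: combined ratio, KC: Kadilar and Cingi, P: proposed).
mse = {
    "Example 1": {"CR": 223496.77, "KC": 217825.95, "P": 213877.86},
    "Example 2": {"CR": 4232.68, "KC": 4232.01, "P": 1633.77},
    "Example 3": {"CR": 16456.65, "KC": 16446.57, "P": 2057.73},
}

# Eff = MSE(CR) / MSE(estimator) * 100, rounded to two decimals as in the table.
efficiency = {
    example: {name: round(100.0 * values["CR"] / value, 2) for name, value in values.items()}
    for example, values in mse.items()
}

for example, effs in efficiency.items():
    print(example, effs)
```

The recomputed figures agree with the Eff columns of Table 4 and make the contrast visible: the gain over the combined ratio estimator is modest in Example 1 and large in Examples 2 and 3.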
Acknowledgments
The authors would like to thank the referees for their constructive suggestions that helped improve the presentation of the paper. Also, the first author wishes to acknowledge with thanks the facilities made available by UNCG during his visiting assignment in the summer of 2005.
References
Bedi, P. K. (1996). Efficient utilization of auxiliary information at estimation stage. Biomet. J. 38(8):973–976.
Kadilar, C., Cingi, H. (2004). Ratio estimator in simple random sampling. Appl. Math. Comput. 151:893–904.
Kadilar, C., Cingi, H. (2005). A new ratio estimator in stratified sampling. Comm. Statist. Theory Meth. 34:1–6.
Murthy, M. N. (1967). Sampling Theory and Methods. India: Statistical Publishing Society.
Ray, S. K., Singh, R. K. (1981). Difference cum product type estimators. J. Ind. Statist. Asso. 19:147–151.
Searls, D. T. (1964). Utilization of known coefficient of kurtosis in the estimation procedure of variance. J. Am. Statist. Asso. 59:1225–1226.