

Cairo University

Faculty of Economics and Political Science Department of Statistics

ESTIMATION OF THE INTERCLASS CORRELATION

COEFFICIENT USING MINQUE APPROACH

WITH APPLICATIONS

Prepared by Sohier Kotb Ahmed

Supervised by

Prof. Heba El-Laithy
Department of Statistics
Faculty of the Economics and Political Science
Cairo University

Prof. Sahar El-Tawela
Department of Statistics
Faculty of the Economics and Political Science
Cairo University

A Thesis Submitted to the

Department of Statistics

In Partial Fulfillment of the Requirements

For the Degree of Master of Science

In the Faculty of the Economics and Political Science

Cairo University 2012

CAIRO UNIVERSITY

FACULTY OF THE ECONOMICS AND POLITICAL SCIENCE

DEPARTMENT OF STATISTICS

The undersigned hereby certify that they have read and recommend to the

faculty of Economics and Political Science for acceptance a thesis entitled

“Estimation of The Interclass Correlation Coefficient Using MINQUE

Approach with Applications” by Sohier Kotb Ahmed

in partial fulfillment of the requirements for the degree of Master of Science.

Dated: Nov. 21, 2012

Research Supervisor: _____________________________

Prof. Heba El-Laithy

External Examiner: ______________________________

Prof. Amany Mousa

Internal Examiner:

______________________________

Prof. Dina Magdy

To my family

Mother

Father

Husband

My daughters

Safaa, Sara & Hend


Abstract

This study is concerned with the estimation of the interclass correlation. A MINQUE estimator for the interclass correlation is derived.

We apply this estimator to data from the ‘Egypt Demographic and Health Survey, 2000’ (EDHS-2000) to study the relationship between a mother’s education and her children’s education. Relevant data from the EDHS-2000 file were used to construct the different indices in SPSS, and a MATLAB program was used to estimate the interclass correlation from these data.

Key words: Familial data; Variance-covariance components; The interclass correlation;

Point estimation; Analysis of variance; Minimum Norm Quadratic Unbiased Estimation

(MINQUE); The 2000 Egypt Demographic and Health Survey (2000 EDHS).

Supervised by

Prof. Heba El-Laithy
Department of Statistics
Faculty of the Economics and Political Science
Cairo University

Prof. Sahar El-Tawela
Department of Statistics
Faculty of the Economics and Political Science
Cairo University


Name: Sohier Kotb Ahmed Eldahshan

Nationality: Egyptian.

Date and place of birth: 27/9/1966, Cairo.

Degree: Master.

Specialization: Statistics.

Supervisors: Prof. Dr. Heba El-Laithy and Prof. Sahar El-Tawela

Department of Statistics

Faculty of the Economics and Political Science

Cairo University

Title of the thesis: Estimation of the Interclass Correlation Coefficient Using MINQUE Approach with Applications.

Summary

Estimating the degree of resemblance among family members, and in particular the interclass correlation, has long been a topic of interest to a large number of researchers.

The present study is mainly concerned with the estimation of the interclass correlation. A MINQUE estimator for the interclass correlation is derived. We apply this estimator to data from the "Egypt Demographic and Health Survey, 2000" (EDHS-2000) to study the relationship between a mother’s education and her children’s education. Relevant data from the EDHS-2000 file were used to construct the different indices in SPSS, and a MATLAB program was used to estimate the interclass correlation from the EDHS-2000 data.


The aim of this study can be summarized in two main objectives. The first is to estimate the interclass correlation using MINQUE. The second is to estimate the interclass correlation between a mother’s education and the level of education of her children, using the derived MINQUE estimator. Data from the "Egypt Demographic and Health Survey, 2000" were used for this purpose.

The present study is divided into five chapters, organized as follows:

Chapter one: contains an introduction to the study, which includes the objectives of the study, the data source, background information on both familial correlation and the MINQUE technique, a literature review of the relationship between the education of the mother and that of her children, and the structure of the study.

Chapter two: presents the principal concepts of the MINQUE approach.

Chapter three: gives an extensive review of the literature on methods for estimating the interclass correlation.

Chapter four: describes the data under consideration and their source, identifies the study variables, and presents the derivation of the interclass correlation estimator by the MINQUE method.

Chapter five: presents the conclusions of this research.

Finally, this study includes a list of references and two appendices.


Acknowledgements

My profound appreciation and gratitude go to Prof. Heba El-Laithy for her kind supervision, creative suggestions, valuable comments and great help throughout the accomplishment of this thesis.

I am also thankful to Dr. Zakaria Abdel-Wahed for his guidance through the early

years of confusion, and for the time he spent in revising the formulas appearing in the

thesis.

Special thanks go to Dr. Sahar El-Tawela for her constant support and kind help.

Many thanks go to Prof. Fatma El-Zanaty for providing me with the data used in the

empirical application.

I would like to thank my colleagues at the National Center of Social and Criminological Research, especially Prof. Magda Abdel-Ghani, and also my friend Dr. Abeer Saleh, for their great help.

Special thanks go to my husband and my daughters for their hard efforts and sacrifice to help me.

Of course, I am grateful to my parents for their patience, love and prayers.

Without them this work would never have come into existence.

Sohier Kotb Ahmed

Cairo, Egypt

--, 2012


Notations & Abbreviations

$N$ : the number of families.

$K$ : the total number of children in all $N$ families ($K = n_1 + n_2 + \cdots + n_N$).

$n_i$ : the total number of offspring in the $i$th family.

$y_i$ : the measurement made on the parent in the $i$th family.

$x_i$ : the vector of the measurements made on the offspring in the $i$th family, $x_i = (x_{i1}, x_{i2}, \ldots, x_{in_i})$.

$\mu_m$ : the mean of individual measurements of mothers.

$\mu_s$ : the mean of individual measurements of offspring.

$\rho_{ms}$ : the interclass correlation coefficient between a parent and offspring.

$\rho_{ss}$ : the intraclass correlation coefficient among siblings.

$\sigma_m^2$ : the variance of individual measurements of parents.

$\sigma_s^2$ : the variance of individual measurements of offspring.

MINQUE : the Minimum Norm Quadratic Unbiased Estimator.

MLE : the Maximum Likelihood Estimator.


Table of Contents

Chapter (1): Introduction
1-1 Familial data
1-2 Objectives of the study
1-3 Familial Correlation Literature Review
1-4 The MINQUE technique
1-5 Source of Data
1-6 The relationship between education of the mother and her children
1-7 Organization of the thesis

Chapter (2): The Minimum Norm Quadratic Unbiased Estimation
2-1 Introduction
2-2 The model
2-3 The Principles of MINQUE in the Linear Model
2-4 MINQUE with a priori weights

Chapter (3): Methods of estimating the interclass correlation
3.1 Introduction
3.2 The Model for Familial Data
3.3 Estimators of Interclass Correlation
3.3.1 The Pairwise Estimator
3.3.2 The Sib-Mean Estimator
3.3.3 The Random-Sib Estimator
3.3.4 The Ensemble Estimator
3.3.5 The Maximum Likelihood Estimator
3.3.6 The Weighted Sums of Squares Estimator
3.3.7 The MINQUE Estimator

Chapter (4): Estimation of interclass correlation between mother’s education and her children’s
4.1 Introduction
4.2 Source of Data
4.2.1 Correlates to children and mother's education
4.3 The Study Variables
4.3.1 Children’s Education Index
4.3.2 Mother’s Education Index
4.4 Derivation of interclass correlation by the MINQUE
4-5 The result

Chapter (5): Concluding remarks
Concluding remarks

References

Appendix (1)
Appendix (2)

Chapter one

Introduction


1-1 Familial data

Familial data are observed in many different fields of research, including epidemiology, genetics, heredity, and psychology. A common assumption about familial data is dependency between family members, as relatives tend to have similar attributes. Welson (2010) presented an extended history of research on estimating this dependency using familial correlations.

Formally, familial correlations measure the degree of resemblance between family members with respect to some specified quantitative characteristic such as height, weight, cholesterol, lung capacity, or blood pressure.

There are two types of familial correlation in familial data: the intraclass

correlation coefficient and interclass correlation coefficient.

The intraclass correlation measures the degree of resemblance between

members of the same group, for example: it might refer to the measure of

resemblance between the children of a family, the sons of a family, or the

daughters of a family.

The interclass correlation measures the degree of resemblance between

members of different groups, for example: it can refer to the measure of

resemblance between the parents and children of a family, the parents and

sons of a family, the parents and daughters of a family, or the sons and

daughters of a family.
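In symbols, the two coefficients can be sketched as follows. This is an illustration only: it uses the symbols from the Notations list and assumes the usual familial-data setting in which every parent measurement $y_i$ has variance $\sigma_m^2$ and every offspring measurement $x_{ij}$ has variance $\sigma_s^2$.

```latex
\rho_{ms} = \frac{\operatorname{Cov}(y_i, x_{ij})}{\sigma_m \, \sigma_s},
\qquad
\rho_{ss} = \frac{\operatorname{Cov}(x_{ij}, x_{ik})}{\sigma_s^{2}}, \quad j \neq k .
```

Here $\rho_{ms}$ relates a parent to any one of that parent's offspring (different groups), while $\rho_{ss}$ relates two distinct siblings of the same family (same group).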


These types of familial correlations have applications in several areas of

study. Estimation of interclass correlations is the main interest in the present

work.

1-2 Objectives of the study

This study is mainly concerned with the estimation of the interclass correlation using the MINQUE approach. Accordingly, a MINQUE estimator for the interclass correlation is derived.

We apply this estimator to data from the ‘Egypt Demographic and Health Survey, 2000’ (EDHS-2000) to investigate the relationship between a mother’s education and her children’s education. Relevant data from the EDHS-2000 file are used to construct the different indices in SPSS, and a MATLAB program is used to estimate the interclass correlation from these data. The aim of this study can be summarized in two main objectives:

The first is to derive an estimator of the interclass correlation using MINQUE.

The second is to estimate the interclass correlation between a mother’s education and the level of education of her children, using the derived MINQUE estimator.

Data from the "Egypt Demographic and Health Survey, 2000" were used for this purpose.


1-3 Familial Correlation Literature Review

Several estimators have been proposed for the interclass correlation coefficient. Some of these estimators have been discussed in detail by Rosner et al. (1977). The first is the Pairwise estimator, where each child in a family is paired with the mother of that family. The second is the Sib-Mean estimator, where the mean offspring score from a family is paired with the mother of that family. The third is the Random-Sib estimator, where a random offspring is chosen from each family and is paired with the mother of that family. The last is the Ensemble estimator, whereby an ‘expected value’ of the Random-Sib estimator is computed over all possible choices of random sibs from each family. For the Pairwise, Sib-Mean and Random-Sib estimators, an ordinary Pearson correlation is computed from the set of pairs formed over all families in the sample.
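The three Pearson-based estimators described above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis's MATLAB program: the function names and the toy `families` data are hypothetical, and each family is assumed to be stored as a pair (mother's score, list of children's scores). The Ensemble estimator is omitted because its formula is not given in this section.

```python
import random
from math import sqrt


def pearson(u, v):
    """Ordinary Pearson correlation of two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


def pairwise_estimate(families):
    """Pair every child with the mother of its family, then correlate."""
    mothers = [y for y, xs in families for _ in xs]
    children = [x for _, xs in families for x in xs]
    return pearson(mothers, children)


def sib_mean_estimate(families):
    """Pair each mother with the mean score of her children."""
    mothers = [y for y, _ in families]
    sib_means = [sum(xs) / len(xs) for _, xs in families]
    return pearson(mothers, sib_means)


def random_sib_estimate(families, rng):
    """Pair each mother with one randomly chosen child."""
    mothers = [y for y, _ in families]
    picks = [rng.choice(xs) for _, xs in families]
    return pearson(mothers, picks)


# Hypothetical toy data, not taken from the EDHS-2000 file:
# (mother's score y_i, [children's scores x_i1, ..., x_in_i])
families = [
    (2.0, [1.0, 3.0]),
    (4.0, [4.0]),
    (6.0, [5.0, 7.0, 6.0]),
    (8.0, [9.0, 7.0]),
]

print("pairwise  :", pairwise_estimate(families))
print("sib-mean  :", sib_mean_estimate(families))
print("random-sib:", random_sib_estimate(families, random.Random(0)))
```

Note that the Pairwise estimator repeats the mother's score once per child, so larger families carry more weight, whereas the Sib-Mean and Random-Sib estimators use exactly one pair per family. When every family has a single child, the three coincide.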

Under the assumption of normality, Srivastava (1984) derived iterative maximum likelihood estimators of the parent-child correlation using a canonical reduction of the data. He also proposed two sets of alternative estimators, based on the canonical reduction, that do not require an iterative procedure and have better distributional properties. All three sets of estimators allow families to have different numbers of children.

Srivastava and Keen (1988) derived a noniterative method, the weighted sums of squares technique, for estimating the interclass correlation.

Rosner et al. (1977) showed that the Pairwise and Ensemble estimators are superior to the Sib-Mean and Random-Sib estimators in terms of mean squared error, with the Pairwise estimator being superior when the intraclass correlation is low and the Ensemble estimator being superior when $\rho_{ss}$ is high.

Rosner (1979), in a further simulation study, showed that the Pairwise estimator is roughly equivalent in mean squared error to the maximum likelihood estimator for small values of $\rho_{ss}$. When every family has the same number of siblings, the Pairwise estimator is the maximum likelihood estimator. Accordingly, the pairwise procedure has generally been accepted as a reasonable approach for estimating the interclass correlation in most practical situations, especially since maximum likelihood estimation, in general, presents computational difficulties.

Srivastava and Katapa (1986) compared the asymptotic distributions of the maximum likelihood estimators and the alternative estimators proposed in Srivastava (1984).

Eliasziw et al. (1990) demonstrated that the estimator proposed by Srivastava (1984) is identical to the modified Sib-Mean estimator when the sib-sib correlation is estimated by the method of unweighted group means, and that it is only slightly more efficient than the Ensemble estimator. Their additional finite-sample Monte Carlo simulation results reaffirm that the Ensemble and Srivastava estimators are essentially indistinguishable in terms of mean squared error and bias.


1-4 The MINQUE technique

Hartley, J. N. K. Rao, and Kiefer (1969) proposed a new method of estimation for general linear models with heteroscedastic error variances. C. R. Rao (1970) named it MINQUE (minimum norm quadratic unbiased estimation or estimator(s), depending on the context). C. R. Rao (1971a, 1971b, 1972) generalized it to variance and covariance components models. J. N. K. Rao (1971, 1973) compared the MINQUE and modified MINQUE (which is just the average of the squared residuals) estimators of heteroscedastic variances with the usual sample variances in the case of replicated data.

P. S. R. S. Rao and Chaubey (1978) considered some modifications of MINQUE and gave generalizations. These authors made it possible to estimate the distinct elements of the covariance matrix using methods similar to MINQUE in univariate as well as multivariate situations.

The novelty of the MINQUE method is that it lays down a new optimality criterion for estimators and yields explicit estimators in complicated situations. Chaubey (1980b) used this method to estimate the variances and covariances arising from an unbalanced regression with residuals having a covariance matrix of intraclass form. Kleffe (1993) derived the explicit form of the MINQUE estimate of the interclass correlation.


1-5 Source of Data

The source of data used in this study is the 2000 Egypt Demographic and Health Survey (2000 EDHS). This survey interviewed a nationally representative sample of 15,573 ever-married women aged 15-49. It is the sixth in the series of Demographic and Health Surveys conducted in Egypt. In addition to its main purpose of obtaining data from the community on the current health situation, the 2000 EDHS included a special module collecting data on children’s education. From this module, we construct two indices: one for the mother’s education and another for her children’s education.

1-6 The relationship between education of the mother and her children

The majority of research on the relationship between parental education and child education focuses on the duration of child schooling as the primary outcome measure. Fewer studies have analyzed the relationship between parental education and children’s learning within school.

Brown (2003) discusses the landmark study of education in the United States known as the “Coleman Report” (United States National Center for Educational Statistics, 1966), which reported that family characteristics are more important determinants of educational achievement than school quality or teacher experience, particularly in the early stages of schooling.

The first objective of his paper is to understand how parental education affects investments in children’s human capital. Using a new survey of children, households, schools, and communities in Gansu, China, he estimated the demand for six education-related investments. He found that more educated parents provide higher levels of both education-related goods (e.g., the provision of children’s books) and education-related time (e.g., time spent reading to children). The study suggests that the perceived returns to education are higher for the children of more educated parents.

The second objective of his paper is to analyze the extent to which these investments explain the robust relationship between parental education and children’s learning described in the literature. To this end, he estimated the effect of parental education on children’s Chinese and mathematics test scores with and without controlling for individual investments; reductions in the estimated effect of parental education when controlling for investments are interpreted as the degree to which the particular investment explains the relationship between parental education and test scores. Parental education has a strong positive effect on children’s test scores. Even though the direction of causality is uncertain in some estimates, the paper shows an evident correlation between parental education and various investments in children’s human capital development.

Magnuson (2002) discusses the question "Does an increase in a mother’s education improve her young child’s academic performance?". Positive correlations between mothers’ educational attainment and children’s well-being, in particular children’s cognitive development and academic outcomes, are among the most replicated results from developmental studies. Yet surprisingly little is known about the causal nature of this relationship. Because conventional regression (e.g., OLS) and analysis of variance (e.g., ANOVA) approaches to estimating the effect of maternal schooling on child outcomes may be biased by omitted variables, the study uses experimentally induced differences in mothers’ education to estimate Instrumental Variable (IV) models. The data come from the National Evaluation of Welfare to Work Strategies Child Outcomes Study (NEWWS-COS).

The study found that increases in maternal education are significantly and positively associated with children’s academic school readiness, and negatively associated with children’s academic problems. The IV models produce larger, although less precise, estimates than the OLS models.

Jerrim and Micklewright (2009) focus on the socio-economic characteristics of each parent and the different influences they exert on boys and girls. Their data come from the Programme for International Student Assessment (2003 PISA).

The 2003 PISA tested children’s ability in one major domain (maths) and two ‘minor’ domains (reading and science). The authors focus in particular on the results for maths, testing the association between each parent’s education and their child’s cognitive skills at age 15 using regression models in which each parent’s years of education enter separately. It is not easy to make broad generalizations from their results about the relative importance of father’s and mother’s education for their children’s cognitive ability in secondary school and how this varies for sons and daughters. They attempted to present a summary of the general picture, focusing on the results for ability in maths.

First, it is more common for father’s education to have a greater effect than mother’s education (they use ‘effect’ without implying causality). Second, this seems to be particularly true of sons, but there are plenty of countries that are counterexamples for both sons and daughters. Third, they found more variation across countries in the differences between the effects of fathers and mothers than in the differences in the effect either parent has on sons and daughters. Fourth, there is some suggestion of a common pattern across countries that mothers have more effect on their daughters than on their sons, although the differences are small. Fifth, they frequently found complementarities between mothers’ and fathers’ education that warrant further attention.

1-7 Organization of the thesis

This thesis consists of five chapters. The first is an introductory chapter, which presents the importance and the objectives of the study. Chapter two presents the principles of the MINQUE approach. Chapter three reviews the literature on methods for estimating the interclass correlation. Chapter four describes the source of data and the study variables, and derives the MINQUE estimator for the interclass correlation. Finally, the conclusions are presented in the last chapter.