CHAPTER-6
RELIABILITY AND VALIDITY
6.0 Introduction
6.1 Meaning of Reliability
6.2 Methods of Estimating Reliability
6.2.1 Test-Retest Method
6.2.2 Alternate or Parallel Forms Method
6.2.3 Split-Half Method
6.2.4 Method of Rational Equivalence
6.3 Reliability of the Present Test
6.3.1 Test-Retest Method
6.3.2 Split-Half Method
6.3.3 Reliability by Rulon Formula
6.3.4 Method of Rational Equivalence
6.3.5 Standard Error of Measurement
6.3.6 Standard Error of Correlation
6.3.7 Comprehensive View of Reliability
6.4 Meaning of Validity
6.5 Methods of Validity
6.5.1 Face Validity
6.5.2 Content Validity
6.5.3 Criterion Validity
6.5.4 Construct Validity
6.5.5 Factorial Validity
6.6 Validity of the Present Test
6.6.1 Content Validity
6.6.2 Criterion Validity
6.6.3 Factorial Validity
6.6.4 Construct Validity
6.7 Conclusion
CHAPTER-6
RELIABILITY AND VALIDITY
6.0 Introduction
It is necessary to know the validity and reliability of a tool before
evaluating the results obtained with it. If the reliability and validity of the
instrument are not known, then the evaluation and interpretation of results
obtained with that instrument is meaningless. Different kinds of reliability and
validity are discussed in this chapter.
6.1 Meaning of Reliability
Reliability means consistency of test results: both the internal consistency
of the results and their consistency over a period of time.
According to McMillan and Schumacher (1989),
Reliability refers to the consistency of measurement, the extent to which
the results are similar over different forms of the same instrument or occasions of
data collection. The goal of developing reliable measures is to minimize the
influence of chance or other variables unrelated to the intent of the measure.
Reliability is mathematically defined as the ratio of true score variance to the
total variance of test scores (Gregory, 2005).
\[ r_{xy} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_e^2} \]
Where $r_{xy}$ is the reliability coefficient, $\sigma_e^2$ is the error variance,
$\sigma_T^2$ is the true variance and $\sigma_X^2$ is the total variance of the test
scores. There is no way to directly observe or calculate the true score;
therefore, a variety of methods are used to estimate the reliability of a test.
The correlation coefficient used as the estimate can lie between -1.0 and +1.0.
6.2 Methods of Estimating Reliability
There are four procedures in common use for computing the reliability
coefficient (sometimes called the self-correlation) of a test (Garrett, 1981).
These methods are described below.
6.2.1 Test-Retest Method
In this method, the same form of the test is administered twice to the same
group of individuals, with an interval of time between the two administrations.
The scores from the two administrations are correlated (Pearson product-moment
correlation), and the resulting correlation coefficient provides a measure of
stability: it indicates how stable the test results are over the given period of
time.
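The test-retest coefficient described above can be sketched as follows. This is a minimal illustration only: the function name `pearson_r` and the six pairs of scores are hypothetical and are not data from the present study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of six pupils on two administrations of the same test.
first_administration = [12, 25, 31, 40, 18, 47]
second_administration = [15, 22, 35, 38, 20, 50]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

A coefficient close to +1.0 would indicate that pupils keep nearly the same relative position on both occasions.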
6.2.2 Alternate or Parallel Forms Method
Two forms of the same test (alternate forms or parallel forms) are
independently constructed to meet the same specifications, often on an
item-by-item basis. Thus, alternate forms of a test incorporate similar content
and cover the same range and level of difficulty in items. The correlation
between the scores obtained on the two forms is taken as the reliability
coefficient. It is difficult to construct a parallel form of a test.
6.2.3 Split-Half Method
Reliability can also be estimated from a single administration of a single
form of a test. The test is administered to a group of pupils in the usual manner.
After scoring the responses, the test is divided into two halves; one half
contains the odd-numbered items and the other half contains the even-numbered
items. The two halves of the test should be equivalent. The correlation
coefficient between the scores obtained on the two halves is calculated, and
from it the reliability of the whole test is calculated. Split-half reliability
provides a measure of internal consistency. This coefficient indicates the
degree to which consistent results are obtained from the two halves of the test.
6.2.4 Method of Rational Equivalence
In this method, the test is treated as equivalent to a hypothetical parallel
form such that every item on each form is interchangeable (Garrett, 1981). The
Kuder-Richardson (K-R) formulas are used to correlate all items on a single test
with each other when each item is scored right or wrong. K-R reliability is thus
determined from a single administration of an instrument, but without having to
split the instrument into equivalent halves. This procedure assumes that all
items in an instrument or test are equivalent to each other, and it is
appropriate when the purpose of the test is to measure a single trait. If a test
or instrument has items of varying difficulty, or if it measures more than one
trait, the K-R estimate will usually be lower than the split-half reliability.
6.3 Reliability of the Present Test
The reliability of the present test was estimated by the following methods:
(1) Test-Retest method
(2) Split-Half method
(3) Method of Rational Equivalence
6.3.1 Test-Retest Method
To estimate reliability by this method, one school from a rural area and one
school from an urban area were randomly selected. The Reasoning Ability Test
(RAT) was administered to 107 students of these schools. After 29 days, the RAT
was given again to the students who had taken part in the first administration.
The sample selected for reliability is described in chapter-4. The scores
obtained on test and retest are shown in table 6.1.
Table 6.1
Scatter Diagram of Scores on Test-Retest
Test / Retest   11-15  16-20  21-25  26-30  31-35  36-40  41-45  46-50  51-55  56-60  fy
51-55 3(20) 3(25) 6
46-50 1(0) 1(8) 2
41-45 3(6) 4(9) 1(12) 1(15) 9
36-40 1(-8) 1(-4) 2(-2) 3(0) 6(2) 2(4) 2(6) 1 (10) 18
31-35 2(-1) 5(0) 4(1) 1(2) 1(4) 13
26-30 1(0) 1(0) 4(0) 2(0) 8
21-25 2(4) 4(3) 4(2) 3(1) 1(0) 1(-1) 2(-3) 17
16-20 3(8) 4(6) 4(4) 2(2) 3(0) 1(-8) 1(-10) 18
11-15 5(12) 3(9) 3(6) 1(3) 12
6-10 2(16) 1(12) 1(0) 4
fx 14 13 16 12 14 11 7 8 6 6 107
Cx = -0.336   Cy = 0.0841   Σx'y' = 484   σx = 2.668   σy = 2.384
Cx² = 0.1131   Cy² = 0.007   N = 107   SER = 0.046
The correlation coefficient between test and retest scores was 0.72, which
was significant at the 0.01 level. Hence the reliability of the RAT determined
by this method was high.
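The chapter does not spell out how the coefficient is obtained from the scatter diagram. A minimal sketch follows, assuming the usual assumed-mean formula for grouped data, r = (Σx'y'/N − Cx·Cy) / (σx·σy), where Cx and Cy are the corrections and σx and σy the standard deviations in class-interval units. The function name `grouped_r` is hypothetical; the inputs are the summary values printed under table 6.1, with N taken as the 107 students reported in the text.

```python
def grouped_r(sum_xy, n, cx, cy, sigma_x, sigma_y):
    """Correlation from scatter-diagram summaries (assumed-mean method).

    sum_xy           : sum of the products of deviations, Σx'y'
    n                : number of examinees
    cx, cy           : corrections for the assumed means, in interval units
    sigma_x, sigma_y : standard deviations, in interval units
    """
    return (sum_xy / n - cx * cy) / (sigma_x * sigma_y)


# Summary values reported with table 6.1.
r_test_retest = grouped_r(484, 107, -0.336, 0.0841, 2.668, 2.384)
print(round(r_test_retest, 2))  # 0.72
```

Under these assumptions the summaries reproduce the reported test-retest coefficient of 0.72.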
6.3.2 Split-Half Method
To determine the reliability by this method, the test was administered to
371 students of different schools. The test was divided into two equivalent
halves; one half contained the odd-numbered items and the other half contained
the even-numbered items. The correlation between the scores obtained on these
two halves was calculated, and then the Spearman-Brown prophecy formula was
applied to find the reliability of the whole test. The scatter diagram for the
correlation between scores on odd-numbered and even-numbered items is shown in
table 6.2.
Table 6.2
Scatter Diagram of Scores Obtained on Two Halves of the Test
Odd / Even   1-3  4-6  7-9  10-12  13-15  16-18  19-21  22-24  25-27  28-30  fy
28-30 2(8) 5(12) 11(16) 18
25-27 1(0) 2(3) 11(6) 12(9) 5(12) 31
22-24 1(0) 9(2) 12(4) 7(6) 4(8) 33
19-21 2(-1) 7(0) 18(1) 13(2) 5(3) 45
16-18 1(0) 7(0) 18(0) 16(0) 1(0) 43
13-15 1(4) 3(3) 11(2) 16(1) 16(0) 3(-1) 50
10-12 3(8) 14(6) 25(4) 4(2) 11(0) 3(-2) 60
7-9 9(12) 21(9) 10(6) 4(3) 2(0) 46
4-6 3(20) 8(16) 15(12) 5(8) 5(4) 36
1-3 2(25) 4(20) 3(15) 9
fx 5 25 56 52 38 56 51 38 29 21 371
Cx = -0.369   Cy = -0.595   Σx'y' = 1919   σx = 2.10   σy = 2.388
Cx² = 0.136   Cy² = 0.3548   N = 371   SER = 0.001   SEM = 1.94
The correlation between the two halves was 0.987. The reliability of the
whole test, found with the Spearman-Brown formula, was 0.99, which indicated
that the RAT was reliable.
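The step from the half-test correlation to the whole-test reliability can be sketched as below, assuming the standard Spearman-Brown prophecy formula for a test of doubled length, r_whole = 2·r_half / (1 + r_half). The function name is hypothetical.

```python
def spearman_brown(r_half):
    """Reliability of the full-length test from the correlation of its two halves."""
    return 2 * r_half / (1 + r_half)


# Correlation between odd and even halves reported for the RAT.
r_whole = spearman_brown(0.987)
print(round(r_whole, 2))  # 0.99
```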
6.3.3 Reliability by Rulon Formula
An alternative method for finding split-half reliability was developed by
Rulon (1939). It requires only the variance of the differences between each
person's scores on the two half-tests ($SD_d^2$) and the variance of total scores
($SD_x^2$); these two values are substituted in the following formula, which
yields the reliability of the whole test directly (Anastasi and Urbina, 2007):
\[ r_{tt} = 1 - \frac{SD_d^2}{SD_x^2} \]
Where
$r_{tt}$ = Reliability coefficient of the test,
d = Difference between a person's scores on the two half-tests,
$SD_d$ = S.D. of the differences between each person's scores on the two
half-tests
$SD_x$ = S.D. of total scores
Here,
$SD_d$ = 1.97, $SD_x$ = 13.73, N = 371
\[ r_{tt} = 1 - \frac{(1.97)^2}{(13.73)^2} = 0.98 \]
The reliability coefficient for the test was 0.98, which shows a high
value of reliability of the test.
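The substitution in the Rulon formula can be checked with a short sketch. The function name `rulon_reliability` is hypothetical; the two standard deviations are the values reported above.

```python
def rulon_reliability(sd_diff, sd_total):
    """Rulon split-half reliability: 1 - (variance of differences / variance of totals)."""
    return 1 - (sd_diff ** 2) / (sd_total ** 2)


# SD of half-test differences and SD of total scores reported for the RAT.
r_rulon = rulon_reliability(1.97, 13.73)
print(round(r_rulon, 2))  # 0.98
```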
6.3.4 Method of Rational Equivalence
To estimate reliability by this method, the test was administered to 371
students of grades 5 to 7. The proportions of students giving the right and the
wrong response to each item were then found, along with the standard deviation
of the test scores. The reliability of the test was computed with the KR-20 and
KR-21 formulas, which are as under.
\[ KR_{20}: \quad r_{tt} = \frac{n}{n-1} \cdot \frac{SD_t^2 - \sum pq}{SD_t^2} \]
\[ KR_{21}: \quad r_{tt} = \frac{n}{n-1} \left[ 1 - \frac{M(n-M)}{n \, SD_t^2} \right] \]
Where,
$r_{tt}$ = reliability coefficient of the whole test
n = number of items in the test
$SD_t$ = S.D. of total scores on the test
p = proportion of persons who pass each item
q = proportion of persons who do not pass each item
M = Mean of total scores
Here,
n = 60, $SD_t$ = 13.73, Σpq = 14.01, M = 31.30
KR20: $r_{tt}$ = 0.94
KR21: $r_{tt}$ = 0.94
If KR20 = KR21, the facility values of all items should be equal (Ambasana,
1999). The reliability coefficients calculated by KR20 and KR21 were both 0.94,
which indicates that the facility values of the items were nearly equal.
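The two Kuder-Richardson values can be reproduced from the reported summary statistics. The sketch below is illustrative; the function names `kr20` and `kr21` are hypothetical, and Σpq is taken as given because the item-level proportions are not listed in this chapter.

```python
def kr20(n_items, sd_total, sum_pq):
    """KR-20 from the number of items, the S.D. of total scores and the sum of p*q."""
    variance = sd_total ** 2
    return (n_items / (n_items - 1)) * (variance - sum_pq) / variance


def kr21(n_items, sd_total, mean_total):
    """KR-21, which needs only the mean and S.D. of total scores."""
    variance = sd_total ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean_total * (n_items - mean_total) / (n_items * variance)
    )


# Values reported for the RAT: n = 60, S.D. = 13.73, sum of pq = 14.01, M = 31.30.
r_kr20 = kr20(60, 13.73, 14.01)
r_kr21 = kr21(60, 13.73, 31.30)
print(round(r_kr20, 2), round(r_kr21, 2))  # 0.94 0.94
```

KR-21 replaces Σpq with a value computed from the mean alone, so it coincides with KR-20 only when all items are equally difficult.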
6.3.5 Standard Error of Measurement
According to Garrett (1981), the standard error of measurement of scores
is a better way of expressing the reliability of a test than the reliability
coefficient, as it takes into account the variability within the group as well
as the self-correlation of the test. The effect of variable or chance errors in
producing divergences of test scores from their true values is given by the
formula
\[ SEM = S.D. \sqrt{1 - r^2} \]
Where, S.D. = Standard deviation of the test scores
r = Reliability coefficient
SEM was calculated by the above formula for each reliability coefficient and is
shown in table 6.3.
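A minimal sketch of the SEM computation, following the formula exactly as stated in this chapter (with r squared under the root). Many textbooks instead write the standard error of measurement as S.D.·√(1 − r); the chapter's version is used here because, assuming S.D. = 13.73, it reproduces the split-half and KR values in table 6.3. The function name is hypothetical.

```python
from math import sqrt


def sem_as_in_chapter(sd, r):
    """SEM following the chapter's formula: S.D. * sqrt(1 - r^2)."""
    return sd * sqrt(1 - r ** 2)


# S.D. of total scores for the sample of 371 students.
sem_split_half = sem_as_in_chapter(13.73, 0.99)
sem_kr = sem_as_in_chapter(13.73, 0.94)
print(round(sem_split_half, 2), round(sem_kr, 2))  # 1.94 4.68
```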
6.3.6 Standard Error of Correlation
The reliability of the test can also be expressed in terms of the standard
error of correlation. The formula for estimating SER is as under.
\[ SE_R = \frac{1 - r^2}{\sqrt{N}} \]
Where, r = correlation coefficient
N = number of students
The standard error of each reliability coefficient was calculated and is shown
in table 6.3.
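The standard error of a correlation coefficient can be computed directly from the formula above. In this sketch the function name is hypothetical; the first call uses the split-half coefficient and sample size of the present test, and the second uses a correlation of 0.55 on 108 students as a further illustration.

```python
from math import sqrt


def standard_error_of_r(r, n):
    """Standard error of a correlation coefficient: (1 - r^2) / sqrt(N)."""
    return (1 - r ** 2) / sqrt(n)


ser_split_half = standard_error_of_r(0.99, 371)
ser_example = standard_error_of_r(0.55, 108)
print(round(ser_split_half, 3), round(ser_example, 3))  # 0.001 0.067
```

The standard error shrinks both as the coefficient approaches 1 and as the sample grows.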
6.3.7 Comprehensive View of Reliability
Table 6.3 shows each reliability coefficient (rtt) together with its standard
error values.
Table 6.3
Summary of Reliability
No.  Method of Reliability  Sample (n)  Reliability coefficient (rtt)  SEM   SER
1    Test-Retest            107         0.72                           8.69  0.046
2    Split-Half             371         0.99                           1.94  0.001
3    Rulon formula          371         0.98                           2.61  0.002
4    KR20 & KR21            371         0.94                           4.68  0.006
From table 6.3, we can see that the reliability coefficients range from 0.72
to 0.99, while the standard error of measurement ranges from 1.94 to 8.69 and
the standard error of the correlation coefficient ranges from 0.001 to 0.046. So
it can be said that the test is highly reliable.
6.4 Meaning of Validity
The merit of a psychological test is determined first by its reliability
but ultimately by its validity. The validity of a test concerns what the test
measures and how well it does so. It tells us what can be inferred from test
scores (Anastasi and Urbina, 2007). A test is valid to the extent that
inferences made from it are appropriate, meaningful and useful (Gregory, 2005).
The traditional ways of accumulating validity evidence are discussed here in the
context of the present study.
6.5 Methods of Validity
Gregory (2005) and Anastasi and Urbina (2007) have discussed various forms
of validity, which are
1. Face Validity
2. Content Validity
3. Criterion Validity
4. Construct Validity
5. Factorial Validity
6.5.1 Face Validity
Face validity is not validity in the technical sense; it refers not to what
the test actually measures, but to what it appears superficially to measure.
From the look of the test, an examinee can judge what it appears to measure.
6.5.2 Content Validity
Content related evidence is the extent to which the content of the test is
judged to be representative of some appropriate universe or domain of content. In
establishing content validity, experts typically examine the test items and indicate
whether the items measure predetermined criteria, objectives or content.
6.5.3 Criterion Validity
Whenever test scores are to be used to predict future performance or to
estimate current performance on some valued measure other than the test itself
(called a criterion), we are especially concerned with criterion-related
evidence. There are two different approaches to validity evidence subsumed under
criterion-related validity: (1) concurrent validity and (2) predictive validity
(Gregory, 2005).
(1) Concurrent Validity
It is concerned with the relation of test performance to some other current
measure of performance. We obtain both measures (the test score and the other
current criterion) at approximately the same time and correlate the results.
(2) Predictive Validity
In predictive validity, the correlation is found between the present test
performance and some other future measure of performance. The difference
between concurrent and predictive validity lies in the time relation between
the criterion and the test.
6.5.4 Construct Validity
A construct is a theoretical, intangible quality or trait in which individuals
differ (Messick, 1995; Cited in Psychological Testing, 2005). Construct validity is
appropriate for tests of psychological traits or qualities and requires both logical
arguments and empirical evidence.
6.5.5 Factorial Validity
Factor analysis is a specialized statistical technique that is particularly
useful for investigating construct validity. The purpose of factor analysis is to
identify the minimum number of determiners (factors) required to account for the
intercorrelations among a battery of tests (Gregory, 2005).
Factor analysis is a method for determining the number and nature of the
underlying variables among larger numbers of measures. It may also be called a
method for extracting common factor variances from sets of measures (Kerlinger,
2007; and C.R.Aldous, 2001). There are two forms or methods of factor analysis:
confirmatory and exploratory (principal components method) factor analysis. In
confirmatory factor analysis, the purpose is to confirm that test scores and
variables fit a certain pattern predicted by a theory. In exploratory factor
analysis, the purpose is to find the factors, or to reduce the number of
factors, underlying sets of measures.
6.6 Validity of the Present Test
Content validity, criterion validity, factorial validity and construct
validity were determined for the present test. To determine the criterion
validity (concurrent validity) of the present test, the correlation coefficients
of this test with other tests were found. Two types of factor analysis,
exploratory factor analysis and the centroid method, were performed on the test
scores.
6.6.1 Content Validity
During test construction, the test was sent to experts in different fields,
namely education, mathematics, research, psychology and primary education.
Components were selected on the basis of a content analysis of mathematics
textbooks, so the components selected for the test were related to reasoning.
Experts were asked to give their opinions about the test items, instructions and
components. The test was modified in the light of the experts' opinions. So, the
content of the test was representative of reasoning in mathematics.
6.6.2 Criterion Validity
To determine the criterion validity, the correlation coefficients between
scores on the RAT and scores on other similar tests which measure comparable
attributes were calculated. The following standardized tests and scores were
used.
1. Dr. S.R. Patel’s Verbal Reasoning Ability Test
2. Dr. R.S. Patel’s Numerical Ability Test
3. Dr. Jyotiben Desai’s IQ test
4. Mathematics achievement (Score in preliminary exam)
To determine criterion validity, the above tests and the RAT were
administered to 108 students from Adarsh Vidyalaya, Visnagar and Sawala
Primary School. In this sample, there were 53 boys and 55 girls. Thirty-six
students of each grade were selected.
(1) Correlation between RA and Verbal Reasoning Ability
Scatter diagram of scores on RAT and Dr. S.R.Patel’s Verbal Reasoning
Ability Test (VRAT) is shown in table 6.4.
Table 6.4
Scatter Diagram of Scores on RAT and Verbal Reasoning Ability Test
RAT (y) / VRAT (x)   9-15  16-22  23-29  30-36  37-43  44-50  51-57  58-64  65-71  72-78  fy
51-55 1(10) 5(25) 6
46-50 1(12) 1(16) 2
41-45 1(0) 2(6) 4(4) 1(12) 2(15) 10
36-40 1(-8) 1(-4) 1(-2) 1(0) 5(2) 3(4) 5(6) 1(10) 18
31-35 11(-2) 1(-1) 1(0) 4(1) 4(3) 1(4) 13
26-30 1(0) 1(0) 11(0) 1(0) 1(0) 1(0) 7
21-25 1(4) 11(3) 4(2) 3(1) 2(0) 3(-1) 2(-2) 1(-3) 18
16-20 1(8) 1(6) 4(4) 5(2) 2(0) 2(-2) 1(-4) 1(-6) 1(-8) 18
11-15 11(9) 2(6) 1(3) 2(0) 1(-3) 3(6) 2(-9) 13
6-10 1(16) 1(4) 1(-4) 3
fx 5 6 15 13 10 17 12 18 4 8 108
Cx = 0.6481   Cy = 0.111   Σx'y' = 355   σx = 2.46   σy = 2.38
Cx² = 0.42   Cy² = 0.0123   N = 108   SER = 0.067
The correlation coefficient between the scores on the RAT and the Verbal
Reasoning Ability Test was 0.55, which was higher than the expected value (0.24)
and significant at the 0.01 level. The standard error of the correlation (SER)
was 0.067, which was very low, so RA and Verbal Reasoning Ability were
correlated.
(2) Correlation between RA and Numerical Ability
Scatter diagram of scores on RAT and Dr. R.S.Patel’s Numerical Ability
Test (NAT) is shown in table 6.5.
Table 6.5
Scatter Diagram of Scores on RAT and Numerical Ability Test
RAT (y) / NAT (x)   4-6  7-9  10-12  13-15  16-18  19-21  22-24  fy
51-55 1(5) 2(10) 3(15) 6
46-50 1(-8) 1(4) 2
41-45 1(-6) 2(-3) 1(0) 2(3) 2(6) 1(9) 9
36-40 11(-6) 3(-2) 6(2) 6(4) 1(6) 18
31-35 1(-3) 5(-2) 3(-1) 2(0) 2(1) 13
26-30 3(0) 3(0) 1(0) 1(0) 8
21-25 3(3) 1(2) 9(1) 3(0) 2(-1) 18
16-20 5(6) 2(4) 7(2) 4(0) 18
11-15 5(6) 6(3) 1(0) 12
6-10 1(8) 2(4) 1(0) 4
fx 14 16 35 13 15 10 5 108
Cx = -0.5463   Cy = 0.07407   Σx'y' = 225   σx = 1.65   σy = 2.375
Cx² = 0.2984   Cy² = 0.0055   N = 108   SER = 0.068
The correlation coefficient between the scores on the RAT and the Numerical
Ability Test was 0.54, which was higher than the expected value (0.24) and
significant at the 0.01 level. The standard error of the correlation (SER) was
0.068, which was very low, so RA and numerical ability were correlated.
(3) Correlation between RA and IQ
Scatter diagram of scores on RAT and Dr. Jyotiben Desai’s IQ Test is shown
in table 6.6.
Table 6.6
Scatter Diagram of Scores on RAT and IQ Test
RAT (y) / IQ (x)   5-11  12-18  19-25  26-32  33-39  40-46  47-53  54-60  61-67  fy
51-55 1(-5) 2(15) 3(20) 6
46-50 1(-4) 1(12) 2
41-45 2(0) 2(3) 1(6) 2(9) 2(12) 9
36-40 1(-4) 3(0) 9(2) 4(4) 1(6) 18
31-35 3(-1) 2(0) 4(1) 1(2) 1(3) 2(4) 13
26-30 2(0) 3(0) 1(0) 1(0) 1(0) 8
21-25 5(3) 4(2) 5(1) 2(0) 1(-1) 1(-2) 18
16-20 1(8) 5(6) 5(4) 1(2) 2(0) 2(-2) 1(-4) 1(-8) 18
11-15 2(12) 2(6) 5(3) 2(-3) 1(-9) 12
6-10 1(12) 2(8) 1(4) 4
fx 3 13 17 18 11 21 8 9 8 108
Cx = 0.0648   Cy = 0.074   Σx'y' = 334   σx = 2.178   σy = 2.375
Cx² = 0.0042   Cy² = 0.0055   N = 108   SER = 0.061
The correlation coefficient between the scores on the RAT and the IQ test was
0.60, which was greater than the expected value (0.24) and significant at the
0.01 level. The standard error of the correlation (SER) was 0.061, which was
very low, so RA and IQ were correlated.
(4) Correlation between RA and Mathematics Achievement
To estimate the correlation between RAT scores and mathematics
achievement, the scores obtained in the preliminary examination in mathematics
were taken as the measure of mathematics achievement. The mathematics scores
were converted into T-scores, and then the correlation coefficient between these
T-scores of mathematics achievement and the scores on the RAT was calculated.
The scatter diagram of scores on the RAT and T-scores of mathematics achievement
is given in table 6.7.
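The conversion of examination marks into T-scores is not detailed in the chapter. A minimal sketch follows, assuming the usual linear T-score with mean 50 and standard deviation 10, T = 50 + 10·(X − M)/S.D. The marks below are hypothetical and are not data from the study; the function name is also hypothetical.

```python
from statistics import mean, pstdev


def to_t_scores(raw_scores):
    """Convert raw scores to linear T-scores (mean 50, standard deviation 10)."""
    mu = mean(raw_scores)
    sd = pstdev(raw_scores)  # S.D. of the group being converted
    return [50 + 10 * (x - mu) / sd for x in raw_scores]


# Hypothetical preliminary-exam marks in mathematics.
marks = [35, 48, 52, 61, 44, 70, 58]
t_scores = to_t_scores(marks)
print([round(t, 1) for t in t_scores])
```

Putting marks on a common scale in this way lets scores from different schools or papers be pooled before they are correlated with RAT scores.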
Table 6.7
Scatter Diagram of Scores on RAT and T-score of Mathematics Achievement
RAT (y) / T-Scores (x)   29-32  33-36  37-40  41-44  45-48  49-52  53-56  57-60  61-64  65-68  fy
51-55 1(15) 2(20) 3(25) 6
46-50 1(16) 1(20) 2
41-45 1(6) 4(9) 4(12) 9
36-40 2(-2) 2(0) 3(2) 5(4) 3(6) 1(8) 2(10) 18
31-35 3(-1) 1(0) 2(1) 4(2) 2(4) 1(5) 13
26-30 1(0) 3(0) 2(0) 2(0) 8
21-25 1(2) 2(1) 3(0) 7(-1) 2(-2) 1(-3) 2(-4) 18
16-20 2(8) 1(6) 4(4) 4(2) 4(0) 1(-2) 2(-4) 18
11-15 1(12) 4(9) 2(6) 3(0) 1(-3) 1(-6) 12
6-10 1(16) 1(8) 1(4) 1(0) 4
fx 4 5 8 13 17 16 17 9 12 7 108
Cx = 0.9259   Cy = 0.07407   Σx'y' = 441   σx = 2.344   σy = 2.375
Cx² = 0.8573   Cy² = 0.0055   N = 108   SER = 0.046
The correlation coefficient between the scores on the RAT and the T-scores of
mathematics achievement in the preliminary exam was 0.72, which was higher than
the expected value (0.24) and significant at the 0.01 level. The standard error
of the correlation (SER) was 0.046, which was very low, so RA and mathematics
achievement were correlated.
6.6.3 Factorial Validity
The factorial validity of the scores on the RAT was examined by the principal
component method (exploratory factor analysis) and Thurstone’s Centroid
method. A random sample of 371 students was selected for factor analysis.
6.6.3.1 Exploratory Factor Analysis
The exploratory factor analysis was done with the help of SPSS 17.0 (trial
version). The analysis can be done in four stages (George and Mallery,
2006), which are
(1) To confirm the appropriateness of the data for factor model to be used
(2) Extraction of the factors
(3) Rotation
(4) Calculation of factor scores
(1) Appropriateness of Factor Analysis Model
The question normally arises whether the data are appropriate for the
factor analysis model. To answer this question, the researcher studied the
descriptive statistics. Most of the correlations between items were found to be
positive (greater than 0.5) and significant at the 0.05 level or better. The
Kaiser-Meyer-Olkin measure of sampling adequacy was 0.899, which is near to 1.
The value of Bartlett’s test of sphericity was 7903.317, which was significant
(reported significance 0.000). Residuals were computed between observed and
reproduced correlations. There were 359 (20%) non-redundant residuals with
absolute values > 0.05. All these results show that the factor analysis model
was appropriate for the data.
(2) Extraction of the Factors
The common factors were extracted by factor analysis of the scores on the
RAT. The statistics of factor extraction are given in table 6.8.
Table 6.8
Factor Extraction and Statistics
Item No.   Communality   Factor   Eigen value   Percentage of variance   Cumulative percentage of variance
1 0.354 1 13.865 23.108 23.108
2 0.353 2 2.422 4.037 27.145
3 0.292 3 2.311 3.851 30.996
4 0.479 4 1.848 3.080 34.076
5 0.379 5 1.655 2.759 36.835
6 0.482 6 1.614 2.690 39.525
7 0.561 7 1.467 2.445 41.970
8 0.364 8 1.417 2.362 44.332
9 0.587 9 1.311 2.185 46.517
10 0.510 10 1.261 2.101 48.618
11 0.395 11 1.234 2.056 50.674
12 0.458 12 1.192 1.986 52.660
13 0.556 13 1.151 1.918 54.579
14 0.541 14 1.121 1.868 56.447
15 0.480 15 1.078 1.797 58.245
16 0.497 16 1.021 1.702 59.947
17 0.552 0.998
18 0.350 0.978
19 0.382 0.934
20 0.492 0.908
21 0.483 0.877
22 0.646 0.845
23 0.561 0.835
24 0.560 0.789
25 0.593 0.784
26 0.569 0.761
27 0.617 0.752
28 0.559 0.720
29 0.357 0.708
30 0.579 0.685
31 0.531 0.613
32 0.539 0.600
33 0.439 0.595
34 0.543 0.589
35 0.524 0.577
36 0.572 0.568
37 0.521 0.557
38 0.398 0.548
39 0.554 0.502
40 0.536 0.488
41 0.553 0.467
42 0.401 0.454
43 0.371 0.439
44 0.454 0.421
45 0.382 0.405
46 0.509 0.392
47 0.524 0.386
48 0.479 0.378
49 0.401 0.362
50 0.381 0.351
51 0.535 0.335
52 0.467 0.328
53 0.390 0.321
54 0.438 0.300
55 0.552 0.278
56 0.616 0.274
57 0.452 0.251
58 0.390 0.243
59 0.443 0.234
60 0.403 0.203
Table 6.8 shows that 16 factors were extracted, and that the mean and
standard deviation of the communalities were 0.48 and 0.08 respectively. By
default, the computer program extracts the factors whose Eigen values are
greater than one. The Eigen values of the first and the sixteenth factors were
13.865 and 1.021 respectively, which suggests that the first factor is much
stronger than the others. Out of the total variance, 59.9% was accounted for by
the 16 common factors. The Scree plot of factors and Eigen values is shown in
graph 6.1. The number of extracted factors was less than one third of the total
number of items.
[Graph 6.1: Scree plot. X-axis: Component Number (1 to 60); Y-axis: Eigenvalue (0 to 16).]
Graph 6.1
Scree Plot
The Eigen value of the first factor was much higher than those of the other
15 factors, so the graph takes an elbow shape, which indicates that the first
factor was strong.
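The relation between the Eigen values and the variance percentages in table 6.8 can be illustrated with a short sketch. It assumes the usual convention for a principal component analysis of a correlation matrix, in which the total variance equals the number of items (60 here), and it applies the Eigen-value-greater-than-one rule described in the text. The variable names are hypothetical; only the first 19 Eigen values from the table are listed.

```python
N_ITEMS = 60  # total variance of 60 standardized items

# First 19 Eigen values from table 6.8, in descending order.
eigenvalues = [
    13.865, 2.422, 2.311, 1.848, 1.655, 1.614, 1.467, 1.417,
    1.311, 1.261, 1.234, 1.192, 1.151, 1.121, 1.078, 1.021,
    0.998, 0.978, 0.934,
]

# Retain only the factors whose Eigen value exceeds one.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Percentage of total variance carried by each retained factor.
percent_of_variance = [100 * ev / N_ITEMS for ev in retained]
cumulative_percent = sum(percent_of_variance)

print(len(retained))                     # 16
print(round(percent_of_variance[0], 3))  # 23.108
print(round(cumulative_percent, 3))
```

The first factor alone carries more variance than the next eight retained factors together, which is what produces the elbow in the scree plot.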
(3) Rotation
The factor analysis was followed by a varimax rotation to achieve simple
structure and to assist the resolution of factors. Items that were correlated
with more than one factor before rotation were correlated with only one factor
after rotation. Before rotation, all items except item numbers 1, 2, 4, 48 and
57 (that is, 55 of the 60 items) were correlated with one factor, which was
Reasoning Ability in mathematics.
The result of the factor analysis indicates that the Reasoning Ability test in
mathematics is valid.
6.6.3.2 Thurstone’s Centroid Method
Factor analysis was also done by Thurstone's centroid method. The RAT was
constructed on the basis of five different components. The final form of the
test contained 60 items, with 12 items for each component. A random sample of
371 students was selected to find the correlations between components. The
correlation matrix with the first factor loadings is shown in table 6.9.
Table 6.9
Correlation Matrix and First Factor Loading
Component   A        B        C        D        E        Verification (Total)
A           (0.711)  0.702    0.711    0.554    0.620    3.298
B           0.702    (0.702)  0.648    0.573    0.662    3.287
C           0.711    0.648    (0.711)  0.687    0.710    3.467
D           0.554    0.573    0.687    (0.687)  0.591    3.092
E           0.620    0.662    0.710    0.591    (0.710)  3.293
Σ           3.298    3.287    3.467    3.092    3.293    16.437 = T1
a1 = mΣ     0.815    0.812    0.856    0.764    0.813    √T1 = 4.054
m = 0.247   Σa1 = 4.06
From table 6.9, the first factor loadings were calculated on the basis of the
correlation coefficients between the five components. On the basis of the first
factor loadings, the residual correlations and the second order factor loadings
were calculated. The residual correlation matrix and the second order factor
loadings are shown in table 6.10.
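The first step of the centroid method can be sketched from the correlations in table 6.9. The sketch assumes the usual centroid procedure: each diagonal cell holds the highest correlation in its column, the loading of a component is its column sum divided by the square root of the grand total, and a residual is the observed correlation minus the product of the two loadings. Variable names are hypothetical, and the results differ slightly in the third decimal from the table, where m was rounded to 0.247 before multiplying.

```python
from math import sqrt

# Correlations among components A to E (table 6.9); each diagonal cell
# holds the highest correlation found in its column.
R = [
    [0.711, 0.702, 0.711, 0.554, 0.620],
    [0.702, 0.702, 0.648, 0.573, 0.662],
    [0.711, 0.648, 0.711, 0.687, 0.710],
    [0.554, 0.573, 0.687, 0.687, 0.591],
    [0.620, 0.662, 0.710, 0.591, 0.710],
]
k = len(R)

column_sums = [sum(row[j] for row in R) for j in range(k)]
grand_total = sum(column_sums)           # T1 in the table
m = 1 / sqrt(grand_total)                # multiplier applied to each column sum
first_loadings = [m * s for s in column_sums]

# Residual correlations left after removing the first factor.
residuals = [
    [R[i][j] - first_loadings[i] * first_loadings[j] for j in range(k)]
    for i in range(k)
]

print(round(grand_total, 3))
print([round(a, 3) for a in first_loadings])
```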
Table 6.10
Residual Correlation Matrix and Second Order Factor Loading
a1      Component   A               B               C               D               E               Total
a1 →                0.815           0.812           0.856           0.764           0.813
0.815   A           0.069 (0.047)   0.040           0.013           -0.069          -0.043          -0.012
0.812   B           0.040           0.047 (0.043)   -0.047          -0.047          0.002           -0.009
0.856   C           0.013           -0.047          0.047 (-0.022)  0.033           0.014           -0.009
0.764   D           -0.069          -0.047          0.033           0.069 (0.103)   -0.030          -0.01
0.813   E           -0.043          0.002           0.014           -0.030          0.043 (0.049)   -0.008
        Σ0          -0.012          -0.009          -0.009          -0.01           -0.008          -0.048
        Σj2         -0.059          -0.052          0.013           -0.113          -0.057          -0.268
        Column D    0.079           0.042           -0.053          0.113           0.003           0.184
        Column C    0.053           0.136           0.053           0.179           -0.025          0.396
        Column E    0.139           0.132           0.081           0.119           0.025           0.496
        tj2         0.208           0.179           0.128           0.188           0.068           0.771 = T2
        a2          0.237           0.204           0.146           0.214           0.077           √T2 = 0.8781
Second order factor scores (a2), factor variances and communalities were
calculated manually for each component; details are given in table 6.11.
Table 6.11
Centroid-Factor Matrix
Test          Factor Scores       Factor Variance     Communality
(Component)   a1       a2         a1²      a2²        h²
A             0.815    0.237      0.664    0.056      0.720
B             0.812    0.204      0.659    0.042      0.701
C             0.856    -0.146     0.733    0.021      0.754
D             0.764    -0.214     0.584    0.046      0.630
E             0.813    -0.077     0.661    0.006      0.667
Total                             3.301    0.171      3.472
                                  95.07 %  4.93 %     100 %
As shown in table 6.11, the second order scores of tests (components) C, D
and E were negative because the signs of these columns were reflected during
extraction. We can see that 95.07 % of the total common variance was related to
the first order factor, which was Reasoning Ability in mathematics. The results
of factor analysis (exploratory and centroid methods) indicate that the RAT
measures only Reasoning Ability in mathematics, so the test is highly valid.
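The factor variances, communalities and variance shares in table 6.11 follow directly from the loadings. A minimal sketch, assuming that the communality of a component is the sum of its squared loadings on the two factors; variable names are hypothetical.

```python
# Loadings of components A to E on the first and second factors (table 6.11).
a1 = [0.815, 0.812, 0.856, 0.764, 0.813]
a2 = [0.237, 0.204, -0.146, -0.214, -0.077]

# Communality of each component: sum of its squared loadings.
communalities = [x ** 2 + y ** 2 for x, y in zip(a1, a2)]

first_factor_variance = sum(x ** 2 for x in a1)
second_factor_variance = sum(y ** 2 for y in a2)
total_common_variance = first_factor_variance + second_factor_variance

# Share of the common variance carried by the first factor, in percent.
first_factor_share = 100 * first_factor_variance / total_common_variance

print([round(h, 3) for h in communalities])
print(round(first_factor_share, 1))  # 95.1
```

Because a loading enters only through its square, the negative signs of C, D and E on the second factor do not affect the communalities.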
6.6.4 Construct Validity
Construct related evidence is important for instruments or tests that assess
a trait or theory that cannot be measured directly, such as when the purpose of the
instrument is to measure an unobservable trait. Many psychometric theorists
regard construct validity as the unifying concept for all types of validity evidence
(Cronbach, 1988; Guion, 1980; Messick, 1995; in Psychological Testing, 2005).
In the present study, construct validity was examined under the following
categories:
(1) Test Homogeneity
The RAT measures Reasoning Ability in mathematics (a single construct), so
the test items and components must be homogeneous. During test development, all
the items selected for the test were correlated with the whole test, and these
correlations were significant at the 0.01 level. The factor analysis also
indicated that the test measures only Reasoning Ability, and the internal
consistency reliability of the test was high. So, the items of the RAT are
homogeneous.
(2) Developmental Change
Many constructs can be assumed to show regular age-graded changes
from early childhood to mature adulthood and perhaps beyond (Gregory, 2005).
Piaget's theory of development indicates that the level of reasoning increases
with age and experience (four stages of development). In the present study, the
Reasoning Ability of 5th to 7th grade students was measured; the Reasoning
Ability of 5th grade students was found to be lower than that of 6th grade
students, and that of 6th grade students lower than that of 7th grade students.
These differences indicate that Reasoning Ability increased with age, experience
and knowledge of mathematics, as shown in graph 6.2.
[Graph 6.2: frequency curves for 5th grade, 6th grade and 7th grade students. X-axis: midpoint of class interval (3 to 68); Y-axis: frequency (0 to 200). Scale: on X-axis 0.9 cm = 5 units; on Y-axis 0.7 cm = 20 units.]
Graph-6.2
Comparison of Frequency Curves
It is evident from the graphical representation that the frequency curve of
the 7th grade students is shifted markedly towards the right, while the
frequency curve of the 5th grade students is shifted markedly towards the left.
The curve of the 6th grade students lies between the curves of the 5th and 7th
grade students.
(3) Correlation of the RAT Scores with Other Tests Scores
The correlation coefficients between the scores on the RAT and Verbal
Reasoning Ability Test scores, Numerical Ability Test scores, IQ test scores and
T-scores of Mathematics Achievement in the preliminary exam were found. All
the correlation coefficients were high and significant at the 0.01 level. The
standard errors of the correlation coefficients were also found to be very low.
These results indicate that the scores on the RAT are correlated with scores on
other similar tests.
(4) Factor analysis
Factor analysis was performed on the test scores. The results obtained
by exploratory factor analysis and by Thurstone's Centroid method indicated
that there was a single factor (Reasoning Ability) in the RAT.
6.7 Conclusion
In this chapter, the reliability and validity of the present test have been
discussed. The results show satisfactory values of reliability and validity for
the present test, so the final form of the test was ready for final data
collection. The next chapter presents the analysis of the data.