Page 1: Lecture 7: Factor Analysis

Lecture 7: Factor Analysis

Laura McAvinue

School of Psychology

Trinity College Dublin

Page 2: Lecture 7: Factor Analysis

The Relationship between Variables

• Previous lectures…

• Correlation

– Measure of the strength of association between two variables

• Simple linear regression

– Describes the relationship between two variables by expressing one variable as a function of the other, enabling us to predict one variable on the basis of the other

Page 3: Lecture 7: Factor Analysis

The Relationship between Variables

• Multiple Regression

– Describes the relationship between several variables, expressing one variable as a function of several others, enabling us to predict this variable on the basis of the combination of the other variables

• Factor Analysis

– Also a tool used to investigate the relationship between several variables

– Investigates whether the pattern of correlations between a number of variables can be explained by any underlying dimensions, known as ‘factors’

Page 4: Lecture 7: Factor Analysis

Uses of Factor Analysis

Test / questionnaire construction

– For example, you wish to design an anxiety questionnaire…

– Create 50 items, which you think measure anxiety

– Give your questionnaire to a large sample of people

– Calculate correlations between the 50 items & run a factor analysis on the correlation matrix

– If all 50 items are indeed measuring anxiety…

• All correlations will be high

• One underlying factor, ‘anxiety’

Verification of test / questionnaire structure

– Hospital Anxiety & Depression Scale

– Expect two factors, ‘anxiety’ & ‘depression’

Page 5: Lecture 7: Factor Analysis

Uses of Factor Analysis

Examining the structure of a psychological construct

– What is ‘attention’?

– A single ability? Several different abilities?

– Some neuropsychological evidence for the existence of different neural pathways for ‘selective’ & ‘sustained’ attention

– Administer tests measuring both aspects to a large sample & run a factor analysis

– One underlying factor? Two?

Page 6: Lecture 7: Factor Analysis

An example

Visual Imagery Ability

Two kinds of measure

– Self-report questionnaires vs. objective tests

Is self-reported imagery related to imagery measured by objective tests?

Do tests and questionnaires measure the same thing?

Page 7: Lecture 7: Factor Analysis

How does it work?

• Correlation Matrix

– Analyses the pattern of correlations between variables in the correlation matrix

– Which variables tend to correlate highly together?

– If variables are highly correlated, it is likely that they represent the same underlying dimension

• Factor analysis pinpoints the clusters of high correlations between variables and assigns a factor to each cluster

Page 8: Lecture 7: Factor Analysis

Correlation Matrix

• Q1-3 correlate strongly with each other and hardly at all with 4-6

• Q4-6 correlate strongly with each other and hardly at all with 1-3

• Two factors!

      Q1      Q2      Q3      Q4      Q5      Q6
Q1    1
Q2    .987    1
Q3    .801    .765    1
Q4    -.003   -.088   0       1
Q5    -.051   .044    .213    .968    1
Q6    -.190   -.111   .102    .789    .864    1
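The two clusters in the matrix above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `mean_abs_corr` and the variable names are illustrative and not part of the lecture. It compares the average absolute correlation within each group of items with the average between the groups.

```python
import numpy as np

# Lower triangle of the six-item correlation matrix from the slide.
lower = [
    [1.0],
    [0.987, 1.0],
    [0.801, 0.765, 1.0],
    [-0.003, -0.088, 0.0, 1.0],
    [-0.051, 0.044, 0.213, 0.968, 1.0],
    [-0.190, -0.111, 0.102, 0.789, 0.864, 1.0],
]

# Mirror the lower triangle to get the full symmetric matrix.
R = np.eye(6)
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r


def mean_abs_corr(R, rows, cols):
    """Mean absolute correlation in a block (hypothetical helper).

    When rows and cols are the same group, the diagonal of 1s is excluded.
    """
    block = np.abs(R[np.ix_(rows, cols)])
    if rows == cols:
        off_diagonal = ~np.eye(len(rows), dtype=bool)
        return block[off_diagonal].mean()
    return block.mean()


first, second = [0, 1, 2], [3, 4, 5]  # Q1-3 and Q4-6
within_first = mean_abs_corr(R, first, first)
within_second = mean_abs_corr(R, second, second)
between = mean_abs_corr(R, first, second)

print(f"within Q1-3: {within_first:.2f}")
print(f"within Q4-6: {within_second:.2f}")
print(f"between groups: {between:.2f}")
```

High within-group and low between-group averages are the pattern that suggests two factors.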

Page 9: Lecture 7: Factor Analysis

Factor Analysis

• Two main things you want to know…

– How many factors underlie the correlations between the variables?

– What do these factors represent?

• Which variables belong to which factors?

Page 10: Lecture 7: Factor Analysis

Steps of Factor Analysis

1. Suitability of the Dataset

2. Choosing the method of extraction

3. Choosing the number of factors to extract

4. Interpreting the factor solution

Page 11: Lecture 7: Factor Analysis

1. Suitability of Dataset

Selection of Variables

Sample Characteristics

Statistical Considerations

Page 12: Lecture 7: Factor Analysis

Selection of Variables

Are the variables meaningful?

• Factor analysis can be run on any dataset

• ‘Garbage in, garbage out’ (Cooper, 2002)

Psychometrics

• The field of measurement of psychological constructs

• Good measurement is crucial in Psychology

• Indicator approach

– Measurement is often indirect

– Can’t measure ‘depression’ directly; infer it on the basis of an indicator, such as a questionnaire

Based on some theoretical / conceptual framework, what are these variables measuring?

Page 13: Lecture 7: Factor Analysis

Selection of Variables, Example

Variables selected were measures of key aspects of imagery ability, according to theory

Questionnaires (Richardson, 1994): Vividness, Control, Preference

Objective Tests (Kosslyn, 1999): Generation, Inspection, Maintenance, Transformation, Visual STM

Page 14: Lecture 7: Factor Analysis

Sample Characteristics

Size

• At least 100 participants

Participant : Variable Ratio

• Estimates vary

• Minimum of 5 : 1, ideal of 10 : 1

Characteristics

• Representative of the population of interest?

• Contains different subgroups?

Page 15: Lecture 7: Factor Analysis

Sample Characteristics, Example

Size

• 101 participants

Participant : Variable Ratio

• 101 : 9 = 11.22 : 1

Characteristics

• Interested in the imagery ability of the general adult population, so took a mixed sample of males and females, varying widely in age, educational and employment backgrounds
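The sample-size guidelines above reduce to a short calculation. This is a small sketch only; the function names and default thresholds are illustrative choices that restate the rules of thumb from the previous slide (at least 100 participants, a ratio of at least 5 : 1).

```python
def participant_variable_ratio(n_participants, n_variables):
    """Participants per variable, e.g. 101 / 9."""
    return n_participants / n_variables


def meets_guidelines(n_participants, n_variables, min_n=100, min_ratio=5.0):
    """Check the rule-of-thumb minimums quoted in the lecture."""
    ratio = participant_variable_ratio(n_participants, n_variables)
    return n_participants >= min_n and ratio >= min_ratio


ratio = participant_variable_ratio(101, 9)
print(f"{ratio:.2f} : 1")          # ratio for the imagery example
print(meets_guidelines(101, 9))    # both minimums are satisfied
```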

Page 16: Lecture 7: Factor Analysis

Statistical Considerations

Assumptions of factor analysis regarding data

• Continuous

• Normally distributed

• Linear relationships

These properties affect the correlations between variables

Independence of variables

• Variables should not be calculated from each other

• e.g. Item 4 = Item 1 + Item 2 + Item 3
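One way to see why computed variables cause trouble (an added illustration, not a point made on the slide): when one item is an exact sum of the others, the correlation matrix becomes singular, so it has an eigenvalue of zero and cannot be inverted. The sketch below uses simulated data; the seed, sample size and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three simulated items for 200 hypothetical participants.
items = rng.normal(size=(200, 3))

# Item 4 is calculated from the others: Item 1 + Item 2 + Item 3.
item4 = items.sum(axis=1)
data = np.column_stack([items, item4])

R = np.corrcoef(data, rowvar=False)

# eigvalsh returns eigenvalues in ascending order for a symmetric matrix.
smallest_eigenvalue = np.linalg.eigvalsh(R)[0]
print(f"smallest eigenvalue: {smallest_eigenvalue:.2e}")  # effectively zero
```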

Page 17: Lecture 7: Factor Analysis

Statistical Considerations

Are there enough significant correlations (> .3) between the variables to merit factor analysis?

Bartlett Test of Sphericity

Tests H₀ that all correlations between variables = 0

If p < .05, reject H₀ and conclude there are significant correlations between variables so factor analysis is possible
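The slides do not give the formula for the test. A commonly used form is χ² = −(n − 1 − (2p + 5)/6) · ln|R| with p(p − 1)/2 degrees of freedom, where n is the sample size, p the number of variables and |R| the determinant of the correlation matrix. The sketch below assumes that form; the function name is illustrative.

```python
import numpy as np


def bartlett_sphericity(R, n):
    """Return (chi_square, df) for Bartlett's test of sphericity.

    R is a p x p correlation matrix and n is the number of participants.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    sign, log_det = np.linalg.slogdet(R)
    if sign <= 0:
        raise ValueError("correlation matrix must have a positive determinant")
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df


# With 9 variables the test has 36 degrees of freedom.
# An identity matrix (no correlations at all) gives a statistic of 0.
chi_square, df = bartlett_sphericity(np.eye(9), n=101)
print(chi_square, df)
```

The p-value would then be read from the upper tail of a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(chi_square, df)` if SciPy is installed.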

Page 18: Lecture 7: Factor Analysis

Statistical Considerations

Are there enough significant correlations (> .3) between the variables to merit factor analysis?

Kaiser-Meyer-Olkin Measure of Sampling Adequacy

• Quantifies the degree of inter-correlations among variables

• Value from 0 – 1, 1 meaning that each variable is perfectly predicted by the others

• Closer to 1 the better

• If KMO > .6, conclude there is a sufficient number of correlations in the matrix to merit factor analysis
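The slides do not show how KMO is calculated. The usual overall formula compares the squared correlations with the squared partial correlations: KMO = Σr² / (Σr² + Σq²), summed over all pairs of different variables, where q is the partial correlation between two variables controlling for all the others. A minimal sketch under that assumption, with an illustrative function name and a made-up three-variable matrix:

```python
import numpy as np


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inverse = np.linalg.inv(R)
    scale = np.sqrt(np.diag(inverse))
    # Partial correlations come from the scaled, sign-flipped inverse.
    partial = -inverse / np.outer(scale, scale)
    off_diagonal = ~np.eye(R.shape[0], dtype=bool)
    sum_r2 = np.sum(R[off_diagonal] ** 2)
    sum_q2 = np.sum(partial[off_diagonal] ** 2)
    return sum_r2 / (sum_r2 + sum_q2)


# Hypothetical example: three variables that all correlate .5 with each other.
example = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])
kmo_value = kmo(example)
print(f"KMO = {kmo_value:.3f}")
```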

Page 19: Lecture 7: Factor Analysis

Statistical Considerations, Example

• All variables

– Continuous

– Normally distributed

– Linear relationships

– Independent

• Enough correlations?

– Bartlett Test of Sphericity (χ² = 114.56; df = 36; p < .001)

– KMO = .734

Page 20: Lecture 7: Factor Analysis

2. Choosing the method of extraction

Two methods

• Factor Analysis

• Principal Components Analysis

Differ in how they analyse the variance in the correlation matrix

Page 21: Lecture 7: Factor Analysis

[Diagram: the variance of a variable divided into three parts]

• Specific variance: variance unique to the variable itself

• Error variance: variance due to measurement error or some random, unknown source

• Common variance: variance that a variable shares with other variables in a matrix

When searching for the factors underlying the relationships between a set of variables, we are interested in detecting and explaining the common variance

Page 22: Lecture 7: Factor Analysis

Principal Components Analysis

• Ignores the distinction between the different sources of variance

• Analyses total variance in the correlation matrix, assuming the components derived can explain all variance

• Result: Any component extracted will include a certain amount of error & specific variance

Factor Analysis

• Separates specific & error variance from common variance

• Attempts to estimate common variance and identify the factors underlying this

Which to choose?

• Different opinions

• Theoretically, factor analysis is more sophisticated but statistical calculations are more complicated, often leading to impossible results

• Often, both techniques yield similar solutions

Page 23: Lecture 7: Factor Analysis

2. Choosing the method of extraction, Example

Tried both

Chose Principal Components Analysis as Factor Analysis proved impossible (estimated communalities > 1)

Page 24: Lecture 7: Factor Analysis

3. Choosing the number of factors to extract

• Statistical Modelling

– You can create many solutions using different numbers of factors

• An important decision

– Aim is to determine the smallest number of factors that adequately explain the variance in the matrix

– Too few factors: second-order factors

– Too many factors: factors that explain little variance & may be meaningless

Page 25: Lecture 7: Factor Analysis

Criteria for determining Extraction

Theory / past experience

Latent Root Criterion

Scree Test

Percentage of Variance Explained by the factors

Page 26: Lecture 7: Factor Analysis

Latent Root Criterion (Kaiser-Guttman)

• Eigenvalues

– Expression of the amount of variance in the matrix that is explained by the factor

– Factors with eigenvalues > 1 are extracted

– Limitations

• Sensitive to the number of variables in the matrix

• More variables… eigenvalues inflated… overestimation of the number of underlying factors
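As an illustration of the latent root criterion, the sketch below applies it to the six-item correlation matrix from the earlier Correlation Matrix slide. It assumes NumPy; the variable names are illustrative. The eigenvalues of a correlation matrix always sum to the number of variables, which is why 1 (the variance of a single standardised variable) is used as the cut-off.

```python
import numpy as np

# Six-item correlation matrix (Q1 to Q6) from the Correlation Matrix slide.
R = np.array([
    [1.000, 0.987, 0.801, -0.003, -0.051, -0.190],
    [0.987, 1.000, 0.765, -0.088, 0.044, -0.111],
    [0.801, 0.765, 1.000, 0.000, 0.213, 0.102],
    [-0.003, -0.088, 0.000, 1.000, 0.968, 0.789],
    [-0.051, 0.044, 0.213, 0.968, 1.000, 0.864],
    [-0.190, -0.111, 0.102, 0.789, 0.864, 1.000],
])

# eigvalsh is meant for symmetric matrices; sort from largest to smallest.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser-Guttman rule: keep the factors whose eigenvalue exceeds 1.
n_factors = int(np.sum(eigenvalues > 1.0))

print(np.round(eigenvalues, 3))
print("factors retained:", n_factors)
```

Plotting `eigenvalues` against component number would give the scree plot described on the next slide.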

Page 27: Lecture 7: Factor Analysis

Scree Test (Cattell, 1966)

• Scree Plot

– Based on the relative values of the eigenvalues

– Plot the eigenvalues of the factors

– Cut-off point

• The last component before the slope of the line becomes flat (before the scree)

Page 28: Lecture 7: Factor Analysis

Scree Plot

[Figure: scree plot of the eigenvalues against component number (1 to 6)]

• Elbow in the graph

• Take the components above the elbow

Page 29: Lecture 7: Factor Analysis

Percentage of Variance

• Percentage of variance explained by the factors

– Convention

– Components should explain at least 60% of the variance in the matrix (Hair et al., 1995)

Page 30: Lecture 7: Factor Analysis

3. Choosing the number of factors to extract, Example

• Three components with eigenvalues > 1

• Explained 67.26% of the variance

[Figure: scree plot of eigenvalue against component number (1 to 9)]

Page 31: Lecture 7: Factor Analysis

4. Interpreting the Factor Solution

• Factor Matrix

– Shows the loadings of each of the variables on the factors that you extracted

– Loadings are the correlations between the variables and the factors

– Loadings allow you to interpret the factors

• Sign indicates whether the variable has a positive or negative correlation with the factor

• Size of loading indicates whether a variable makes a significant contribution to a factor (≥ .3)

Page 32: Lecture 7: Factor Analysis

Component 1 – Visual imagery tests

Component 2 – Visual imagery questionnaires

Component 3 – ?

Variables                Component 1   Component 2   Component 3
Vividness Qu             -.198         -.805         .061
Control Qu               .173          .751          .306
Preference Qu            .353          .577          -.549
Generate Test            -.444         .251          .543
Inspect Test             -.773         .051          -.051
Maintain Test            .734          -.003         .384
Transform (P&P) Test     .759          -.155         .188
Transform (Comp) Test    -.792         .179          .304
Visual STM Test          .792          -.102         .215

Page 33: Lecture 7: Factor Analysis

Factor Matrix

• Interpret the factors

• Communality of the variables– Percentage of variance in each variable that can be

explained by the factors

• Eigenvalues of the factors– Helps us work out the percentage of variance in the

correlation matrix that the factor explains

Page 34: Lecture 7: Factor Analysis

Communality of Variable 1 (Vividness Qu) = (-.198)² + (-.805)² + (.061)² = .69 or 69%

Eigenvalue of Comp 1 = (-.198)² + (.173)² + (.353)² + (-.444)² + (-.773)² + (.734)² + (.759)² + (-.792)² + (.792)² = 3.36

% Variance of Comp 1 = 3.36 / 9 = 37.3%

Variables                Component 1   Component 2   Component 3   Communality
Vividness Qu             -.198         -.805         .061          69%
Control Qu               .173          .751          .306          69%
Preference Qu            .353          .577          -.549         76%
Generate Test            -.444         .251          .543          55%
Inspect Test             -.773         .051          -.051         60%
Maintain Test            .734          -.003         .384          69%
Transform (P&P) Test     .759          -.155         .188          64%
Transform (Comp) Test    -.792         .179          .304          75%
Visual STM Test          .792          -.102         .215          69%
Eigenvalues              3.36          1.677         1.018         /
% Variance               37.3%         18.6%         11.3%         /
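The calculations on this slide can be reproduced from the loadings: communalities are row sums of squared loadings, eigenvalues are column sums of squared loadings, and the percentage of variance is each eigenvalue divided by the number of variables. A minimal sketch assuming NumPy, with illustrative variable names. Because the loadings are rounded to three decimals, the results can differ from the slide in the last digit.

```python
import numpy as np

# Unrotated loadings from the slide (rows: variables, columns: components).
loadings = np.array([
    [-0.198, -0.805, 0.061],   # Vividness Qu
    [0.173, 0.751, 0.306],     # Control Qu
    [0.353, 0.577, -0.549],    # Preference Qu
    [-0.444, 0.251, 0.543],    # Generate Test
    [-0.773, 0.051, -0.051],   # Inspect Test
    [0.734, -0.003, 0.384],    # Maintain Test
    [0.759, -0.155, 0.188],    # Transform (P&P) Test
    [-0.792, 0.179, 0.304],    # Transform (Comp) Test
    [0.792, -0.102, 0.215],    # Visual STM Test
])

squared = loadings ** 2
communalities = squared.sum(axis=1)   # one value per variable
eigenvalues = squared.sum(axis=0)     # one value per component
pct_variance = 100 * eigenvalues / loadings.shape[0]
total_pct = pct_variance.sum()

print(np.round(communalities, 2))
print(np.round(eigenvalues, 3))
print(np.round(pct_variance, 1), round(float(total_pct), 1))
```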

Page 35: Lecture 7: Factor Analysis

Factor Matrix

• Unrotated Solution

– Initial solution

– Can be difficult to interpret

– Factor axes are arbitrarily aligned with the variables

• Rotated Solution

– Easier to interpret

– Simple structure

– Maximises the number of high and low loadings on each factor

Page 36: Lecture 7: Factor Analysis

Factor Analysis through Geometry

• It is possible to represent correlation matrices geometrically

• Variables

– Represented by straight lines of equal length

– All start from the same point

– High correlation between variables, lines positioned close together

– Low correlation between variables, lines positioned further apart

– Correlation = Cosine of the angle between the lines

Page 37: Lecture 7: Factor Analysis

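[Figure: three lines V1, V2 and V3 drawn from the same point, with 30º between V1 and V2 and 60º between V2 and V3]

The smaller the angle, the bigger the cosine and the bigger the correlation

• V1 & V2: 30º angle, cosine = .866, r = .866

• V2 & V3: 60º angle, cosine = .5, r = .5

• V1 & V3: 90º angle, cosine = 0, no relationship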
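The angle-to-correlation conversion can be checked with the standard library. A small sketch; the dictionary names are illustrative. To three decimals, the cosine of 30º is .866.

```python
import math

# Angles between the variable lines, in degrees.
angles = {"V1 & V2": 30, "V2 & V3": 60, "V1 & V3": 90}

# Correlation = cosine of the angle between the two lines.
correlations = {
    pair: math.cos(math.radians(angle)) for pair, angle in angles.items()
}

for pair, r in correlations.items():
    print(f"{pair}: angle {angles[pair]} degrees, r = {r:.3f}")
```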

Page 38: Lecture 7: Factor Analysis

[Figure: variables V1 to V6 drawn as lines, with factor lines F1 and F2]

Factor Analysis

• Fits a factor to each cluster of variables

• Passes a factor line through the groups of variables

Factor Loading

• Cosine of the angle between each factor and the variable

Page 39: Lecture 7: Factor Analysis

Two Methods of fitting Factors

[Figure: two panels, each showing variables V1 to V6 with factor lines F1 and F2]

Orthogonal Solution

• Factors are at right angles

• Uncorrelated

Oblique Solution

• Factors are not at right angles

• Correlated

Page 40: Lecture 7: Factor Analysis

Two Step Process

1. Factors are fit arbitrarily

2. Factors are rotated to fit the clusters of variables better

[Figure: two panels showing variables V1 to V6 with factor lines F1 and F2, one for each step]

Page 41: Lecture 7: Factor Analysis

For example…

Unrotated Solution

Variables                C1       C2       C3
Vividness Qu             -.198    -.805    .061
Control Qu               .173     .751     .306
Preference Qu            .353     .577     -.549
Generate Test            -.444    .251     .543
Inspect Test             -.773    .051     -.051
Maintain Test            .734     -.003    .384
Transform (P&P) Test     .759     -.155    .188
Transform (Comp) Test    -.792    .179     .304
Visual STM Test          .792     -.102    .215

Solution following Orthogonal Rotation

Variables                C1       C2       C3
Vividness Qu             -.029    -.831    .008
Control Qu               .174     .744     .323
Preference Qu            -.010    .679     -.547
Generate Test            -.197    .112     .709
Inspect Test             -.717    -.103    .279
Maintain Test            .819     .116     .043
Transform (P&P) Test     .779     -.013    -.166
Transform (Comp) Test    -.599    -.01     .626
Visual STM Test          .813     .045     -.147

Page 42: Lecture 7: Factor Analysis

Factor Rotation

• Changes the position of the factors so that the solution is easier to interpret

• Achieves simple structure

– Factor matrix where variables have either high or low loadings on factors rather than lots of moderate loadings
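The slides do not say which rotation method was used. As an illustration only, the sketch below applies varimax, a common orthogonal rotation, to the unrotated loadings from the imagery example. It assumes NumPy, uses raw varimax without Kaiser normalisation, and the function name is illustrative, so its output need not match the rotated solution shown in the lecture. What it does demonstrate is that an orthogonal rotation changes the loadings while leaving each variable's communality unchanged.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns (rotated_loadings, rotation_matrix)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = L.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt  # closest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement
        criterion = new_criterion
    return L @ rotation, rotation


unrotated = np.array([
    [-0.198, -0.805, 0.061],
    [0.173, 0.751, 0.306],
    [0.353, 0.577, -0.549],
    [-0.444, 0.251, 0.543],
    [-0.773, 0.051, -0.051],
    [0.734, -0.003, 0.384],
    [0.759, -0.155, 0.188],
    [-0.792, 0.179, 0.304],
    [0.792, -0.102, 0.215],
])

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```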

Page 43: Lecture 7: Factor Analysis

Evaluating your Factor Solution

• Is the solution interpretable?

– Should you re-run and extract a bigger or smaller number of factors?

• What percentage of variance is explained by the factors?

– >60%?

• Are all variables represented by the factors?

– If the communality of one variable is very low, this suggests it is not related to the other variables; you should re-run and exclude it

Page 44: Lecture 7: Factor Analysis

For example…

First Solution

Variables                C1       C2       C3
Vividness Qu             -.029    -.831    .008
Control Qu               .174     .744     .323
Preference Qu            -.010    .679     -.547
Generate Test            -.197    .112     .709
Inspect Test             -.717    -.103    .279
Maintain Test            .819     .116     .043
Transform (P&P) Test     .779     -.013    -.166
Transform (Comp) Test    -.599    -.01     .626
Visual STM Test          .813     .045     -.147

Component 3 = ?

Second Solution

Variables                Component 1   Component 2
Vividness Qu             .013          -.829
Control Qu               -.023         .770
Preference Qu            .195          .648
Generate Test            -.493         .130
Inspect Test             -.760         -.146
Maintain Test            .711          .183
Transform (P&P) Test     .773          .042
Transform (Comp) Test    -.811         -.028
Visual STM Test          .792          .103

C1 = Efficiency of objective visual imagery

C2 = Self-reported imagery efficacy

Page 45: Lecture 7: Factor Analysis

References

• Cooper, C. (1998). Individual differences. London: Arnold.

• Kline, P. (1994). An easy guide to factor analysis. London: Routledge.