unit 10 slides

Psychology 3800, Lab 002 Factor Analysis

Upload: lveselka

Post on 03-Dec-2014


Page 1: Unit 10  slides

Psychology 3800, Lab 002

Factor Analysis

Page 2: Unit 10  slides

•  assignment #8: feedback

•  factor analysis: overview

•  example analysis in SPSS

•  final assignment

In Lab Today…

Page 3: Unit 10  slides

Assignment 8: Feedback

Check the lab blog for comments and suggestions:

http://uwo3800g.tumblr.com/post/81517223719/assignment-8-commonly-made-errors

Help posts also available on:

(1) Reading the Excluded Variables table: http://uwo3800g.tumblr.com/post/81332757555/faq-how-do-we-interpet-the-information-in-the-excluded

(2) Differentiating between r and R: http://uwo3800g.tumblr.com/post/81575779291/faq-what-is-the-difference-between-r-and-r-in

Page 4: Unit 10  slides

Factor Analysis: Overview

Page 5: Unit 10  slides

•  method of data reduction (i.e. reduces collection of variables to a smaller number of variable clusters)

•  determines the overarching structure of the data by identifying underlying factors within our data

•  variables that are highly intercorrelated will be sorted into the same factor

What Is Factor Analysis?

Note: factors are also called “dimensions”, “components” or “latent variables”

Page 6: Unit 10  slides

•  interested in variables contributing to exam grades in Psych 3800

•  measured 135 students on 9 variables
o  attitude toward math
o  class attendance
o  office hours visits
o  previous stats experience
o  time management skills
o  level of partying during the week before the exam
o  tendency to procrastinate
o  level of studying
o  effort put into lab assignments

•  want to identify whether this list can be reduced to a few broad factors

Example Analysis

Page 7: Unit 10  slides

Run a factor analysis in SPSS…

•  assesses which variables are highly intercorrelated (and uncorrelated with other variables)

•  groups variables into the minimum number of factors (parsimony) that account for a large portion of the variance in the data

•  outputs factor loadings (correlations of each variable with the factors)

•  researcher assesses factor loadings to decide on what the factors represent

General Analysis Approach

Page 8: Unit 10  slides

Example Output

Variables              Factor 1   Factor 2   Factor 3
Assignment effort        -.157       .629       .388
Class attendance          .258       .723       .124
Office hours visits       .063      -.663       .206
Math love                -.122      -.091       .864
Previous experience       .372       .370       .669
Partying                 -.742       .038      -.272
Procrastination          -.753       .296      -.187
Studying                  .720       .017      -.273
Time management           .596       .245      -.130

Page 9: Unit 10  slides

Example Output


Identifying which variables load most highly on which factor…
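The step of matching each variable to the factor it loads on most highly can be sketched in a few lines of Python. This is only an illustration of the reading-off step, using the loadings transcribed from the Example Output slide; the helper name `dominant_factor` is made up for this sketch and is not part of SPSS.

```python
# Rotated loadings transcribed from the "Example Output" slide
# (columns: Factor 1, Factor 2, Factor 3).
variables = [
    "Assignment effort", "Class attendance", "Office hours visits",
    "Math love", "Previous experience", "Partying",
    "Procrastination", "Studying", "Time management",
]
loadings = [
    [-0.157,  0.629,  0.388],
    [ 0.258,  0.723,  0.124],
    [ 0.063, -0.663,  0.206],
    [-0.122, -0.091,  0.864],
    [ 0.372,  0.370,  0.669],
    [-0.742,  0.038, -0.272],
    [-0.753,  0.296, -0.187],
    [ 0.720,  0.017, -0.273],
    [ 0.596,  0.245, -0.130],
]


def dominant_factor(row):
    """Return the 1-based number of the factor with the largest absolute loading."""
    best = max(range(len(row)), key=lambda j: abs(row[j]))
    return best + 1


# Map every variable to the factor it loads on most strongly (sign ignored).
assignment = {name: dominant_factor(row) for name, row in zip(variables, loadings)}

for name, factor in assignment.items():
    print(f"{name:20s} -> Factor {factor}")
```

The absolute value is used because a strong negative loading (e.g. office hours visits on Factor 2) ties a variable to a factor just as much as a strong positive one.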

Page 10: Unit 10  slides

Variables              Exam Prep   Coursework   Experience
Assignment effort        -.157        .629         .388
Class attendance          .258        .723         .124
Office hours visits       .063       -.663         .206
Math love                -.122       -.091         .864
Previous experience       .372        .370         .669
Partying                 -.742        .038        -.272
Procrastination          -.753        .296        -.187
Studying                  .720        .017        -.273
Time management           .596        .245        -.130

Example Output

Naming the factors to reflect the pattern of factor loadings…

Page 11: Unit 10  slides

So, the various variables assessed can be grouped into three overarching factors:

1)  level of preparation for the exam
2)  diligence with various aspects of the course
3)  general experience with math/stats

We can now use these latent variables to predict course success rather than assessing one variable at a time

•  each factor represents a broader underlying concept
•  each concept takes more than one variable into account

Example Conclusion

Page 12: Unit 10  slides

•  eigenvalues provide insight into the magnitude of each factor extracted in factor analysis (greater value = greater magnitude)

•  so, there will be an eigenvalue for each factor in the analysis

•  to calculate the eigenvalue for each factor:
o  square each factor loading
o  sum up the squared factor loadings within each factor (columns)

•  note: if we divide each eigenvalue by the number of variables in the factor analysis, we obtain a value representing the proportion of variance in the data accounted for by each factor

Proportion of Variance For Each Factor: Eigenvalues

Page 13: Unit 10  slides

Variables              Factor 1    Factor 2    Factor 3
Assignment effort      (-.157)²    (.629)²     (.388)²
Class attendance       (.258)²     (.723)²     (.124)²
Office hours visits    (.063)²     (-.663)²    (.206)²
Math love              (-.122)²    (-.091)²    (.864)²
Previous experience    (.372)²     (.370)²     (.669)²
Partying               (-.742)²    (.038)²     (-.272)²
Procrastination        (-.753)²    (.296)²     (-.187)²
Studying               (.720)²     (.017)²     (-.273)²
Time management        (.596)²     (.245)²     (-.130)²

Proportion of Variance For Each Factor: Eigenvalues

squaring each factor loading from original output…

Page 14: Unit 10  slides

Variables              Factor 1    Factor 2    Factor 3
Assignment effort      (-.157)²    (.629)²     (.388)²
Class attendance       (.258)²     (.723)²     (.124)²
Office hours visits    (.063)²     (-.663)²    (.206)²
Math love              (-.122)²    (-.091)²    (.864)²
Previous experience    (.372)²     (.370)²     (.669)²
Partying               (-.742)²    (.038)²     (-.272)²
Procrastination        (-.753)²    (.296)²     (-.187)²
Studying               (.720)²     (.017)²     (-.273)²
Time management        (.596)²     (.245)²     (-.130)²
Eigenvalue             Σ = 2.240   Σ = 1.653   Σ = 1.603

Proportion of Variance For Each Factor: Eigenvalues

magnitude of each factor

Page 15: Unit 10  slides

Variables                     Factor 1    Factor 2    Factor 3
Assignment effort             (-.157)²    (.629)²     (.388)²
Class attendance              (.258)²     (.723)²     (.124)²
Office hours visits           (.063)²     (-.663)²    (.206)²
Math love                     (-.122)²    (-.091)²    (.864)²
Previous experience           (.372)²     (.370)²     (.669)²
Partying                      (-.742)²    (.038)²     (-.272)²
Procrastination               (-.753)²    (.296)²     (-.187)²
Studying                      (.720)²     (.017)²     (-.273)²
Time management               (.596)²     (.245)²     (-.130)²
Eigenvalue / # of variables   .249        .184        .178

Proportion of Variance For Each Factor: Eigenvalues

The bottom row is the proportion of variance accounted for by each factor.

Note: here we have divided each eigenvalue by 9 (number of variables)
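The eigenvalue arithmetic on these slides can be checked with a short Python sketch: square every loading, sum down each column, then divide by the number of variables. The loadings are the rounded values from the Example Output slide, so the sums match the Σ row only to within rounding (Factor 2 comes out about .001 lower than the slide, presumably because SPSS works from unrounded loadings).

```python
# Rotated loadings transcribed from the "Example Output" slide
# (rows: the 9 variables; columns: Factor 1, Factor 2, Factor 3).
loadings = [
    [-0.157,  0.629,  0.388],   # Assignment effort
    [ 0.258,  0.723,  0.124],   # Class attendance
    [ 0.063, -0.663,  0.206],   # Office hours visits
    [-0.122, -0.091,  0.864],   # Math love
    [ 0.372,  0.370,  0.669],   # Previous experience
    [-0.742,  0.038, -0.272],   # Partying
    [-0.753,  0.296, -0.187],   # Procrastination
    [ 0.720,  0.017, -0.273],   # Studying
    [ 0.596,  0.245, -0.130],   # Time management
]

n_variables = len(loadings)
n_factors = len(loadings[0])

# Eigenvalue of a factor = sum of squared loadings down its column.
eigenvalues = [
    sum(row[j] ** 2 for row in loadings) for j in range(n_factors)
]

# Proportion of variance accounted for = eigenvalue / number of variables.
proportions = [ev / n_variables for ev in eigenvalues]

for j, (ev, prop) in enumerate(zip(eigenvalues, proportions), start=1):
    print(f"Factor {j}: eigenvalue = {ev:.3f}, proportion = {prop:.3f}")
```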

Page 16: Unit 10  slides

•  communalities tell us the proportion of variance in each variable that is explained by all of the factors you’ve included in your model

•  so, there will be a communality for each variable in the analysis

•  to calculate the communality for each variable:
o  square each factor loading
o  sum up the squared factor loadings within each variable (rows)

Proportion of Variance For Each Variable: Communalities

Page 17: Unit 10  slides

Proportion of Variance For Each Variable: Communalities


squaring each factor loading on original output (same as for eigenvalue calculations)…

Page 18: Unit 10  slides

Proportion of Variance For Each Variable: Communalities

Variables              Factor 1    Factor 2    Factor 3    h²
Assignment effort      (-.157)²    (.629)²     (.388)²     Σ = .571
Class attendance       (.258)²     (.723)²     (.124)²     Σ = .605
Office hours visits    (.063)²     (-.663)²    (.206)²     Σ = .486
Math love              (-.122)²    (-.091)²    (.864)²     Σ = .770
Previous experience    (.372)²     (.370)²     (.669)²     Σ = .723
Partying               (-.742)²    (.038)²     (-.272)²    Σ = .626
Procrastination        (-.753)²    (.296)²     (-.187)²    Σ = .690
Studying               (.720)²     (.017)²     (-.273)²    Σ = .593
Time management        (.596)²     (.245)²     (-.130)²    Σ = .432
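The communality column can be reproduced the same way as the eigenvalues, except that the squared loadings are summed across each row instead of down each column. A minimal Python sketch, again using the rounded loadings from the Example Output slide:

```python
# Rotated loadings transcribed from the "Example Output" slide
# (columns: Factor 1, Factor 2, Factor 3).
loadings = {
    "Assignment effort":   [-0.157,  0.629,  0.388],
    "Class attendance":    [ 0.258,  0.723,  0.124],
    "Office hours visits": [ 0.063, -0.663,  0.206],
    "Math love":           [-0.122, -0.091,  0.864],
    "Previous experience": [ 0.372,  0.370,  0.669],
    "Partying":            [-0.742,  0.038, -0.272],
    "Procrastination":     [-0.753,  0.296, -0.187],
    "Studying":            [ 0.720,  0.017, -0.273],
    "Time management":     [ 0.596,  0.245, -0.130],
}

# Communality (h squared) of a variable = sum of its squared loadings across all factors.
communalities = {
    name: sum(value ** 2 for value in row) for name, row in loadings.items()
}

for name, h2 in communalities.items():
    print(f"{name:20s} h2 = {h2:.3f}")
```

Because the same squared loadings feed both calculations, the communalities add up to the same total as the three eigenvalues.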

Page 19: Unit 10  slides

•  interpretation of communalities (example using first variable):

(a) 57.1% of the variance in assignment effort is accounted for by the three overarching factors combined (Exam Prep, Coursework, Experience)

…or…

(b) the three factors explain 57.1% of the variance in assignment effort

Proportion of Variance For Each Variable: Communalities

Page 20: Unit 10  slides

Proportion of Variance For Each Variable: Communalities

When we have low communalities:
o  factors don’t account for much variance in that variable
…or…
o  the variable doesn’t have much in common with the other variables in the analysis

Possible causes of low communalities:

1)  the variable is actually very different from the other variables
2)  the measurement of that variable was unreliable
3)  an insufficient number of factors was extracted

Page 21: Unit 10  slides

o  extract only those factors that have eigenvalues greater than 1

o  why greater than 1?

   ▪  when eigenvalue > 1: a factor has loadings from 2 or more variables

   ▪  if a factor has loadings from only one variable (so, that variable is its own factor), its eigenvalue will be about 1

   ▪  if the eigenvalue of a factor is greater than one, it describes more variance than one variable could alone

Deciding on Number of Factors Method #1: Eigenvalue > 1
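A small NumPy sketch may help show why 1 is the cut-off. It is not the lab data: both correlation matrices below are made up for illustration, and `kaiser_count` is a hypothetical helper name. When the variables are completely uncorrelated, every component has an eigenvalue of 1 (each variable is its own factor); when variables cluster, some eigenvalues rise above 1 and the rest fall below it, while the total always equals the number of variables.

```python
import numpy as np


def kaiser_count(corr):
    """Return eigenvalues (largest first) and how many exceed 1."""
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return eigenvalues, int(np.sum(eigenvalues > 1.0))


# Hypothetical case 1: four completely uncorrelated variables.
uncorrelated = np.eye(4)
ev_uncorrelated, _ = kaiser_count(uncorrelated)

# Hypothetical case 2: two pairs of correlated variables (r = .6 and r = .5).
clustered = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
ev_clustered, n_factors = kaiser_count(clustered)

print("uncorrelated eigenvalues:", np.round(ev_uncorrelated, 3))
print("clustered eigenvalues:   ", np.round(ev_clustered, 3))
print("factors with eigenvalue > 1:", n_factors)
print("sum of eigenvalues:", round(float(ev_clustered.sum()), 3))
```

In the clustered case two eigenvalues exceed 1, one for each pair of correlated variables, so the rule would keep two factors.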

Page 22: Unit 10  slides

1)  output a scree plot via statistical software (SPSS)
2)  look to see where the plotted eigenvalues elbow (level off)
3)  the number of dots above the elbow reveals the number of factors to extract

Deciding on Number of Factors Method #2: Scree Plot

this graph suggests extracting three factors from our data

Page 23: Unit 10  slides

•  mathematical technique that provides a simpler description of the relationships among variables

Varimax Method of Rotation

o  orthogonal type of rotation (keeps the axes at 90°)

o  forces the factor loadings to get closer to 0 or ±1 (remember factor loadings are the correlations of the variables with the factors)

o  essentially, with this rotation, each variable will strongly load on only one factor (and not so much on the other factors)

o  this way, you don’t have any “medium sized” factor loadings, and it will be easier to interpret which variables belong to which factors

o  resultant factors will also be independent (uncorrelated) because an orthogonal rotation keeps the factor axes at 90°

Rotation
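For anyone curious about what a Varimax rotation does numerically, here is a rough NumPy sketch of a commonly used SVD-based iteration. It is not SPSS's own routine (SPSS applies Kaiser normalization by default, which this sketch leaves out), and the "unrotated" input is an assumption: it is simply the slide's rotated loadings turned by an arbitrary 30° so that there is something to rotate back.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix.

    Returns the rotated loadings and the orthogonal rotation matrix.
    Sketch only: raw varimax, no Kaiser normalization.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Matrix proportional to the gradient of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # closest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                              # no meaningful improvement
        criterion = new_criterion
    return loadings @ rotation, rotation


# Rotated loadings from the "Example Output" slide.
slide_loadings = np.array([
    [-0.157,  0.629,  0.388],
    [ 0.258,  0.723,  0.124],
    [ 0.063, -0.663,  0.206],
    [-0.122, -0.091,  0.864],
    [ 0.372,  0.370,  0.669],
    [-0.742,  0.038, -0.272],
    [-0.753,  0.296, -0.187],
    [ 0.720,  0.017, -0.273],
    [ 0.596,  0.245, -0.130],
])

# Hypothetical "unrotated" solution: mix Factors 1 and 2 by an arbitrary 30 degrees.
angle = np.deg2rad(30)
mix = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
unrotated = slide_loadings @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, each variable's communality (row sum of squared loadings) is the same before and after rotation; only the way the variance is split across factors changes. The sign and order of the rotated columns are arbitrary.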

Page 24: Unit 10  slides

Factor Analysis: Example

Page 25: Unit 10  slides

Nine variables potentially contributing to final exam grade in Psych 3800…

each row represents a given participant’s score on each of the nine measures

SPSS Example: The Data

Page 26: Unit 10  slides

Analyze → Dimension Reduction → Factor…

SPSS Example: Factor Analysis

move all of the variables that you would like to subject to factor analysis into the “Variables” section

Page 27: Unit 10  slides

SPSS Example: Factor Analysis Extraction Menu

stick with default method of extraction

let SPSS decide on the optimal number of factors using the “eigenvalue > 1” method

request scree plot

Page 28: Unit 10  slides

SPSS Example: Factor Analysis Rotation Menu

request Varimax (orthogonal) rotation

Page 29: Unit 10  slides

SPSS Example: Factor Analysis Scores Menu

request that the factors be saved in your current data file for further analysis

Note: these outputted values are helpful for examining correlations between extracted factors

Page 30: Unit 10  slides

communalities for each variable (based on the extracted factors) are provided by SPSS

SPSS Output: Factor Analysis Communalities

Interpretation (example): the extracted factors explain 60.5% of the variance in class attendance

Page 31: Unit 10  slides

SPSS Output: Factor Analysis Total Variance Explained

eigenvalues for three extracted factors (unrotated)

proportion of variance accounted for by each of the three extracted factors (unrotated)

so, the first factor in the unrotated solution has an eigenvalue of 2.370 and accounts for 26.337% of the variance in the data

Page 32: Unit 10  slides

SPSS Output: Factor Analysis Total Variance Explained

eigenvalues for three extracted factors (rotated)

proportion of variance accounted for by each of the three extracted factors (rotated)

so, the first factor in the rotated solution has an eigenvalue of 2.238 and accounts for 24.865% of the variance in the data

Page 33: Unit 10  slides

SPSS Output: Factor Analysis Scree Plot

•  scree plot seems to suggest that a three-factor solution would be ideal
•  in this case, the scree-plot method and the eigenvalue > 1 method are in agreement (this will not always be the case)

Page 34: Unit 10  slides

three-factor solution noted (but difficult to interpret without subsequent rotation)

SPSS Output: Factor Analysis Component Matrix

This table provides the factor loadings for all of the variables on the factors extracted using the eigenvalue > 1 method, prior to rotation.

Page 35: Unit 10  slides

SPSS Output: Factor Analysis Rotated Component Matrix

This table provides the factor loadings for all of the variables on the factors extracted using the eigenvalue > 1 method, after rotation.

three-factor solution is much easier to interpret now that the factors have been rotated (each variable loads strongly on only one factor)

Page 36: Unit 10  slides

SPSS Output: Factor Analysis Rotated Component Matrix

The first factor has its highest loadings from the highlighted variables…

these variables seem to suggest an exam preparation factor (what you do with your time in the weeks leading up to the exam)

Factor 1 is made up of low partying and procrastination (negative loadings) and high time management and studying (positive loadings).

Page 37: Unit 10  slides

SPSS Output: Factor Analysis Rotated Component Matrix

The second factor has its highest loadings from the highlighted variables…

these variables seem to suggest a coursework factor (effort put into the various components of the course)

Factor 2 is made up of low office hours visits (negative loadings) and high class attendance and effort put into assignments (positive loadings).

Page 38: Unit 10  slides

SPSS Output: Factor Analysis Rotated Component Matrix

The third factor has its highest loadings from the highlighted variables…

these variables seem to suggest a math experience factor (generalized familiarity with and appreciation for math)

Factor 3 is made up of high appreciation for math and greater experience with stats (positive loadings)

Page 39: Unit 10  slides

regression method was used to output scores on each factor for every participant (can be used in subsequent analyses of the three factors)

SPSS Output: New Variables
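As a rough idea of what the regression method computes, the NumPy sketch below builds factor scores as (standardized data) × (inverse of the correlation matrix) × (loadings). Everything about the data is an assumption: it is randomly generated to have 135 rows and 9 columns like the lab example, not the real file, and the loadings are unrotated principal-component loadings rather than SPSS's Varimax-rotated ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students, n_variables, n_factors = 135, 9, 3

# Hypothetical data: 3 hidden factors plus noise (NOT the lab data set).
latent = rng.normal(size=(n_students, n_factors))
weights = rng.normal(size=(n_factors, n_variables))
data = latent @ weights + rng.normal(size=(n_students, n_variables))

# Standardize each variable and compute the correlation matrix.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
corr = np.corrcoef(data, rowvar=False)

# Principal-component loadings for the 3 largest eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(corr)       # ascending order
order = np.argsort(eigenvalues)[::-1][:n_factors]      # indices of the 3 largest
loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Regression method: score coefficients = inverse(corr) @ loadings.
coefficients = np.linalg.solve(corr, loadings)
scores = z @ coefficients                              # one row of 3 scores per student

# Bivariate correlations among the saved factor scores.
score_corr = np.corrcoef(scores, rowvar=False)
print(np.round(score_corr, 3))
```

With principal components, scores built this way are uncorrelated with one another, so the correlation matrix of the scores is essentially an identity matrix; an orthogonal rotation such as Varimax keeps that property.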

Page 40: Unit 10  slides

Final Assignment (yay!)

Page 41: Unit 10  slides

•  not written as an APA-style results section
•  answer all questions fully, in numbered format
•  maximum of 2.5 pages, double-spaced
•  APA formatting required
•  answer all questions in sentence-form (no point-form this week)

Assignment 10: Overview

•  submit all output as well

Page 42: Unit 10  slides

Question Tips and Hints

•  for question #2 and question #5: argue both sides of each issue and provide statistics or evidence from your output to support each side presented

•  you will run three separate analyses in order to be able to answer the assignment questions:

(a) factor analysis (principal components, Varimax rotation)
(b) bivariate correlation of all variables
(c) bivariate correlation of factor scores

•  all remaining questions are self-explanatory (answer all parts of each question)

Page 43: Unit 10  slides

•  submit the factor analysis assignment to me by Friday, April 11, no later than 9:00AM

o  can send it via e-mail ([email protected])
o  can submit to me in person in my office
o  can slide it under my door if I am not in my office, but please send me an e-mail letting me know it’s there

Assignment Submission

Page 44: Unit 10  slides

•  no further labs

•  office hours next week: Tuesday, April 8, 9:30AM-11:30AM
o  after that, set up appointment if you would like to meet
o  e-mail me your questions
o  note: I do not get to see the exam, so exam-related questions should be directed to Dr. McRae

•  I’ll be in touch with an update of how to pick up your remaining assignments

•  don’t forget that practice labs are available on Sakai/OWL for you to check out if you would like extra practice

o  all units should be posted by next week
o  answers are available for all practice labs

What’s Next?

Page 45: Unit 10  slides

Good luck on the exam! Thank you!