Factor Analysis - Open University

Post on 06-Sep-2021

Factor Analysis

This tutorial will show you how to carry out a factor analysis in SPSS. Factor analysis allows you to look at the relationship between a large number of variables (for example, questions on a questionnaire), and see whether they can be grouped and summarised using a smaller number of factors (or latent variables). Latent variables are not measured directly, but are hidden underneath your data, influencing the scores on variables that you do have. The key concept here is that groups of variables may be related to one another, because they are all associated with the same underlying factor. Factor analysis enables you to group variables together to identify and interpret what those factors might be, which in turn helps you to better understand the variance in your data.

There are different types of factor analysis, and different methods for carrying it out. This tutorial will focus on exploratory factor analysis using principal components analysis (PCA). Exploratory factor analysis is used when you do not have a pre-defined idea of the structure or number of factors there might be in a set of data. As such it tends to be used to explore newly developed questionnaires. PCA aims to reduce a set of variables into a smaller, more meaningful set of factors by looking for clusters of variables that appear to be related to one another (and therefore may be tapping into the same underlying factor). PCA is primarily concerned with identifying variables that share variance with one another.

Worked Example

In this tutorial, we will look at how to use factor analysis to analyse a newly developed questionnaire designed to measure teachers’ beliefs about different approaches to learning, and their attitudes towards signing and inclusivity in the classroom. In this example, 600 participants completed a questionnaire examining views on teaching styles by agreeing or disagreeing with a series of statements about how children learn.

The questionnaire had 15 items (variables) and answers were given on a scale of 1 to 5, where:
1 = Strongly agree
2 = Agree
3 = Neither agree nor disagree
4 = Disagree
5 = Strongly disagree

This is what the data looks like in SPSS (you can download it yourself from the data file ‘Week 7 FA Data.sav’):

Remember, in SPSS all data from one participant needs to go in one row. On a questionnaire, each question represents a different variable, so we need one column for each question. As questionnaires consist of multiple questions, you will have a dataset with a large number of columns. The aim of factor analysis is to reduce large sets of variables to a smaller number of latent factors. To start the analysis, CLICK on Analyze, then Dimension Reduction and Factor.

This opens the Factor Analysis dialog box. Here we need to tell SPSS which variables we want to include in the analysis. As we want to run the factor analysis on the whole questionnaire, we need to select all of the variables, as shown here. Then move them across to the Variables box by CLICKING on the top blue arrow.

Now, CLICK on the Descriptives… option to tell SPSS what output you want it to produce.

As factor analysis looks for relationships between variables, we first need to establish that these relationships do, in fact, exist. To do this, we need to produce a Correlation Matrix, so SELECT the Coefficients option. To ensure accurate correlations (and factor analysis), it’s important that we have a fairly large sample size. And for factor analysis to be meaningful, we have to have a suitable number of correlations between variables. The Kaiser-Meyer-Olkin (KMO) test of sampling adequacy can tell us whether our sample size is sufficient, and Bartlett’s test of sphericity indicates whether we have enough correlations, so also SELECT the KMO and Bartlett’s test of sphericity option.
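To give a feel for what these two statistics are built from, here is a minimal sketch in Python with NumPy that computes them from a correlation matrix. It is not SPSS’s own code: the function names and the simulated data are assumptions made purely for illustration, and the final step of turning the Bartlett statistic into a p-value (by comparing it with a chi-square distribution) is left out.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Bartlett's chi-square statistic and degrees of freedom.

    R is a p x p correlation matrix and n is the number of participants.
    """
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)              # log of the determinant of R
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi2, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)               # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)         # off-diagonal cells only
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Simulated stand-in data (NOT the tutorial's data file):
# 600 participants, 15 items, each item driven by one of 4 hidden factors.
rng = np.random.default_rng(0)
n, p, k = 600, 15, 4
hidden = rng.normal(size=(n, k))
weights = np.zeros((k, p))
for j in range(p):
    weights[j % k, j] = 0.8
data = hidden @ weights + 0.6 * rng.normal(size=(n, p))

R = np.corrcoef(data, rowvar=False)
chi2, df = bartlett_sphericity(R, n)
kmo_value = kmo(R)
print(f"KMO = {kmo_value:.2f}, Bartlett chi-square({df}) = {chi2:.2f}")
```

With 15 items, the degrees of freedom work out as 15 × 14 / 2 = 105, whatever the data.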

Once you have selected these options, CLICK Continue to return to the main Factor Analysis Dialog box. From here, we want to tell SPSS how we want it to extract our factors. To do this, CLICK on the Extraction option.

There are a number of different methods that can be used to extract factors from our data.

In this case, we are using Principal Components Analysis, so we want to make sure this option is selected (which it is).

Factor analysis will always extract as many factors as there are variables. In this case we have 15 questions on the questionnaire, so 15 factors will be extracted.

BUT – a lot of these factors will be meaningless. As such, we want to extract the smallest number of factors that we can, that best explains the patterns in our data.

There are a number of different methods you can use to decide how many factors you want to keep, and which make the most sense. One method commonly used is the Scree Plot, so we need to SELECT this option first (as shown above).

As this factor analysis is exploratory, we don’t know how many factors we want the analysis to extract. As such, we keep the default option that extracts factors with eigenvalues over 1. However, if you have an idea of how many factors you are expecting (for example, because you have already run your analysis and tried to interpret the results), you can ask SPSS to produce that specific number of factors by selecting the ‘Fixed number of factors’ option.
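The eigenvalues-over-1 rule can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the data are simulated (600 participants, 15 items, 4 hidden factors) rather than taken from the tutorial’s data file, and the variable names are made up for the example.

```python
import numpy as np

# Simulated stand-in data (NOT the tutorial's data file).
rng = np.random.default_rng(1)
n, p, k = 600, 15, 4
hidden = rng.normal(size=(n, k))
weights = np.zeros((k, p))
for j in range(p):
    weights[j % k, j] = 0.8          # each item is driven by one hidden factor
data = hidden @ weights + 0.6 * rng.normal(size=(n, p))

# PCA on the correlation matrix: one eigenvalue per variable.
R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first

# Keep only the components whose eigenvalue is greater than 1.
n_retained = int(np.sum(eigenvalues > 1))

print("Eigenvalues:", np.round(eigenvalues, 2))
print("Components with eigenvalue > 1:", n_retained)
```

Because each standardised variable contributes a variance of 1, the eigenvalues always add up to the number of variables (15 here), so an eigenvalue over 1 means the component accounts for more variance than a single question does.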

For future reference, you may need to increase the number in the ‘Maximum Iterations for Convergence’ box if you have tried to run your analysis and the output tells you that a solution could not be found in 25 iterations.

Once you have chosen your Extraction options, CLICK Continue to return to the main Factor Analysis Dialog box. From here, we need to tell SPSS what type of rotation we want it to use when finding a factor solution. To do this, CLICK the Rotation button.

Rotation maximises the loading of each of your variables onto one factor, while minimising its loading on the others. This should help you when it comes to interpreting what the factors represent.

In most instances, like this one, you will use Varimax, which is appropriate when you think your factors are likely to be independent of one another. However, if research suggests your factors are likely to be related (i.e. highly correlated), then Direct Oblimin should be your choice here. Again, if you have tried to run your analysis and the output tells you that a solution could not be found in 25 iterations, you may need to increase the number in the final box.
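For readers curious about what a Varimax rotation does to a loading matrix, the following Python/NumPy sketch uses a widely used SVD-based iteration. It is an illustration only, not SPSS’s implementation: it skips the Kaiser normalisation that SPSS applies, and the function names and the small made-up loading matrix are assumptions for the example.

```python
import numpy as np

def varimax(loadings, max_iter=25, tol=1e-6):
    """Orthogonal varimax rotation of a (variables x factors) loading matrix.

    Returns the rotated loadings and the rotation matrix. No Kaiser
    normalisation is applied in this sketch.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the loadings
        target = rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = np.sum(s)
        if criterion != 0.0 and new_criterion < criterion * (1 + tol):
            break                     # no meaningful improvement: converged
        criterion = new_criterion
    return loadings @ rotation, rotation

def simplicity(loadings):
    """Varimax criterion: variance of the squared loadings, summed over factors."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))

# Made-up example: a clean two-factor pattern, blurred by a 30-degree turn.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), -np.sin(angle)],
                [np.sin(angle),  np.cos(angle)]])
unrotated = simple @ mix

rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

The rotation only redistributes variance between factors: each variable’s total squared loading across the factors is the same before and after.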

Once you have chosen your Rotation options, CLICK Continue to return to the main Factor Analysis Dialog box. If you plan to run some form of analysis on how your participants scored on the different factors you extracted… you can ask SPSS to save these scores. To do this, CLICK on the Scores… button.

To create and save participants’ scores for each factor that you produce, SELECT the Save as variables option. CLICK Continue.
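As a rough idea of what saved factor scores are, the sketch below computes regression-method scores (the default method in the SPSS Scores dialog) in Python with NumPy. It makes several assumptions for brevity: the data are simulated rather than taken from the tutorial’s file, the loadings are left unrotated, and all variable names are invented for the example.

```python
import numpy as np

# Simulated stand-in data (NOT the tutorial's data file).
rng = np.random.default_rng(2)
n, p, k = 600, 15, 4
hidden = rng.normal(size=(n, k))
pattern = np.zeros((k, p))
for j in range(p):
    pattern[j % k, j] = 0.8
data = hidden @ pattern + 0.6 * rng.normal(size=(n, p))

# Standardise every question (z-scores) and get the correlation matrix.
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
R = np.corrcoef(data, rowvar=False)

# Unrotated PCA loadings for the k largest components.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1][:k]
loadings = eigenvectors[:, order] * np.sqrt(eigenvalues[order])

# Regression method: score coefficients are R^-1 times the loadings.
coefficients = np.linalg.solve(R, loadings)
scores = z @ coefficients            # one row per participant, one column per factor

print(scores.shape)
```

Each column of `scores` plays the role of one of the new variables SPSS appends to the data set: standardised, with a mean of 0 and a standard deviation of 1.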

Once back at the Factor Analysis dialog box, CLICK on the final Options button.

To help us interpret the SPSS output, we want to ask SPSS to list the variables that load onto each factor in order – so question items that are most strongly related to the factor appear first. This will help us interpret what these factors are likely to represent.

To do this, SELECT the Sorted by size option. To further help with interpretation, we want to filter out any variables that are only weakly related to each factor. To do this, SELECT the Suppress small coefficients option, and change the Absolute value below option to .40, so that any loadings weaker than this cut-off will not be displayed. CLICK Continue to return to the Factor Analysis dialog box, and CLICK OK to run the factor analysis.
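The effect of these two display options can be mimicked with a short Python sketch. The function name, the question labels and the loading values below are all made up for illustration; the sketch simply groups items by the factor they load on most strongly, orders them by loading size, and blanks out anything below the cut-off.

```python
def format_loadings(names, loadings, cutoff=0.40):
    """Return display lines sorted by size, with small loadings left blank."""
    rows = []
    for name, row in zip(names, loadings):
        # Index of the factor this item loads on most strongly
        strongest = max(range(len(row)), key=lambda j: abs(row[j]))
        rows.append((strongest, -abs(row[strongest]), name, row))
    # Group by strongest factor, then largest loading first
    rows.sort(key=lambda r: (r[0], r[1]))

    lines = []
    for _, _, name, row in rows:
        cells = [f"{v:6.2f}" if abs(v) >= cutoff else " " * 6 for v in row]
        lines.append(f"{name:<4}" + " ".join(cells))
    return lines

# Made-up loadings for four questions on two factors
names = ["Q1", "Q2", "Q3", "Q4"]
loadings = [[0.30, 0.70],
            [0.80, 0.10],
            [0.55, 0.45],
            [-0.20, -0.65]]

lines = format_loadings(names, loadings)
for line in lines:
    print(line)
```

Note that the cut-off is applied to the absolute value, so a strong negative loading such as -0.65 is still displayed.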

SPSS produces the results of the Factor Analysis in the Output window.

This tutorial will now go through the relevant output box-by-box.

Correlation Matrix

The first box displays the correlation matrix for your data. As SPSS produced a huge correlation table, it has been cropped here for the tutorial. As Factor Analysis looks for relationships between the variables, there need to be at least some moderate-to-high correlations in your data (i.e. correlations above the value of r=0.3). There are plenty of moderate correlations here (see above) suggesting the analysis is appropriate… but in factor analysis we also want to avoid multicollinearity (i.e. any overly high correlations of r>0.9). If this is a problem, you may want to consider removing one of the questions with the high correlation. As this isn’t a problem in the data set, we can continue with the output.
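With a large correlation table, the two checks described above (some correlations above 0.3, none above 0.9) are tedious to do by eye. A minimal Python/NumPy sketch of the same screening is given below; the function name and the small made-up correlation matrix are assumptions for the example, not values from the tutorial’s data.

```python
import numpy as np

def screen_correlations(R, low=0.3, high=0.9):
    """Count correlations stronger than `low` and list pairs stronger than `high`."""
    p = R.shape[0]
    upper = np.triu_indices(p, k=1)          # each pair of variables once
    strengths = np.abs(R[upper])
    n_moderate = int(np.sum(strengths > low))
    too_high = [(int(i), int(j)) for i, j in zip(*upper) if abs(R[i, j]) > high]
    return n_moderate, too_high

# Made-up correlation matrix for four questions
R = np.array([[1.00, 0.45, 0.10, 0.95],
              [0.45, 1.00, 0.35, 0.40],
              [0.10, 0.35, 1.00, 0.05],
              [0.95, 0.40, 0.05, 1.00]])

n_moderate, too_high = screen_correlations(R)
print("Correlations above 0.3:", n_moderate)
print("Pairs above 0.9 (possible multicollinearity):", too_high)
```

In this made-up matrix the first and fourth questions correlate at 0.95, so one of them would be a candidate for removal.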

KMO and Bartlett’s Test

The next box in our output displays two statistics we need to look at to confirm that Factor Analysis is appropriate for our data set.

First, the Kaiser-Meyer-Olkin test of sampling adequacy assesses whether or not our sample size is sufficient for factor analysis. A value of less than 0.5 indicates the sample is too small, but ideally we are aiming for 0.7 or above. In this case the value is KMO = .87, which means our sample size is sufficient.

The second statistic is Bartlett’s test of sphericity, which tells us whether we have an adequate number of correlations between our variables for factor analysis. Here we are looking for a significance value of less than our alpha level (i.e. p < 0.05), just like ANOVA. In this case the value is p < .001, which means that we have enough correlations for factor analysis.

Communalities

We don’t need the next table to interpret our output. Skip to the next output box.

Total Variance Explained

There are three key components to this table, which are highlighted here.

Initial Eigenvalues: The first three columns list all of the factors that can be found within the data set. As factor analysis always extracts as many factors as there are variables, in this case there are 15 factors in total.

The % of Variance column tells you how much of the variance in the dataset can be explained by each factor. The first few factors account for relatively large proportions of the variance compared to the latter factors. We are really only interested in extracting factors that account for a meaningful amount of variance.

Extraction Sums of Squared Loadings: The middle set of columns is almost identical to the first, except it only displays the factors that were actually extracted.

As we asked SPSS to use a criterion of eigenvalues over 1 for extraction, this section only displays the factors that meet this criterion. The eigenvalue for each factor (before rotation) can be seen in the Total column. In this example, SPSS has extracted four factors as a result of the factor analysis.

Rotation Sums of Squared Loadings: The final set of columns gives the eigenvalues of the extracted factors after rotation has taken place.

Rotation maximises the loading of each of your variables onto one factor, while minimising its loading on the others. This optimises the factor loadings which also brings the eigenvalues more into line with one another.

When reporting how much variance each factor accounts for, you want to use this set of columns.
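The link between the Total and % of Variance columns is simple arithmetic, sketched below. The percentages used (60.66 per cent overall and 22.53 per cent for Factor 1 after rotation) are the ones quoted in the write-up section of this tutorial; the totals derived from them are back-calculated here rather than read from the SPSS table, so treat them as approximate.

```latex
% Each of the p = 15 standardised variables contributes a variance of 1,
% so the total variance is 15 and, for factor j with total \lambda_j:
\[
  \%\ \text{of variance}_j \;=\; \frac{\lambda_j}{p} \times 100
\]
% An eigenvalue of exactly 1 therefore corresponds to
\[
  \frac{1}{15} \times 100 \approx 6.67\%
\]
% Four factors explaining 60.66% together imply totals summing to about
\[
  0.6066 \times 15 \approx 9.10
\]
% and Factor 1 explaining 22.53% after rotation implies a rotated total of about
\[
  0.2253 \times 15 \approx 3.38
\]
```

This is also why the eigenvalues-over-1 rule makes intuitive sense: a factor with an eigenvalue below 1 explains less variance than a single question does.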

Scree Plot

So far SPSS has extracted four factors. But how many factors you actually end up extracting is up to you - not SPSS… so you need to consider other options. This graph plots all 15 eigenvalues for your factors. This can help visualise which factors to keep. These plots often show a point in the curve (or 'elbow') where the eigenvalues drop off and level out. Eigenvalues above this point may be important enough to retain, whereas the others may not.

Scree plot curves can often be difficult to interpret. For example, in this case the graph appears to tail off after 2 factors… but there is also another drop after 4. So using this method of extraction, you may be able to justify either 2 or 4 factors here. To determine exactly how many factors to retain, you may want to run your analysis a few times exploring the different factor options and see which one makes the most sense. Unlike many statistics which are black and white, Factor Analysis is more grey… it is an exploratory tool and should only be used as a guide to you, the researcher.

Component Matrix

This table tells you how each variable loads onto each of the four factors before rotation. We are not really interested in the unrotated factor solution, so scroll down to the (very similar looking) next table.

Rotated Component Matrix

This is the most important table in your output. It tells you how each variable loads onto each of the four factors after rotation, and to what extent. This allows you to interpret what each of your extracted factors might represent.

Factor Analysis only tells you which variables group together mathematically - it’s up to you, the researcher, to interpret what this means. To establish what your factors might be, you need to look at all of the variables that load onto them and try to establish a common theme. For example, looking at the 7 variables that load onto Factor 1, they all seem to be about aspects of social interaction in the learning process. As such, we might want to name this factor ‘Social Learning’.

The variables that load most strongly onto Factor 2 seem to refer to opinions surrounding the inclusion of children with special needs in the classroom and in teaching practices. In this case, we might want to name this factor ‘Inclusivity’.

For Factor 3, items seem to refer to the importance of student-directed vs teacher-directed learning. Negative loadings simply have the opposite relationship with the factor to positive loadings, although the factor name should reflect the positive relationships. So in this case, we could call this factor ‘Self-directed learning’.

Finally, Factor 4 seems to comprise items referring to the importance of clear and well explained solutions in learning. As such, we could call this factor ‘Clarity’.

In some cases, variables load onto more than one factor, in which case you have to decide which factor is most appropriate in terms of interpretability. In this case, the 6th and 13th variables refer directly to the student's or teacher's role in learning. As such, it seems that both of these variables belong with Factor 3 (rather than Factors 1 or 4).
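Cross-loading items like these can be picked out automatically before you make the judgement call. The Python sketch below flags any item whose absolute loading reaches the .40 cut-off on more than one factor; the function name, question labels and loading values are made up for illustration and are not the tutorial’s results.

```python
def cross_loading_items(names, loadings, cutoff=0.40):
    """Return {item: [factor numbers]} for items loading on two or more factors."""
    flagged = {}
    for name, row in zip(names, loadings):
        # Factor numbers (starting at 1) where the loading reaches the cut-off
        factors = [j + 1 for j, value in enumerate(row) if abs(value) >= cutoff]
        if len(factors) > 1:
            flagged[name] = factors
    return flagged

# Made-up rotated loadings for four questions on three factors
names = ["Q1", "Q2", "Q3", "Q4"]
loadings = [[0.72, 0.10, 0.05],
            [0.45, 0.08, 0.52],
            [0.12, 0.81, 0.20],
            [0.41, -0.47, 0.15]]

flagged = cross_loading_items(names, loadings)
print(flagged)
```

The code only identifies the candidates; deciding which factor each one belongs to remains a question of interpretability.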

Component Transformation Matrix

This final table is not needed for interpreting the Factor Analysis.

How do we write up our results?

When writing up the results of your factor analysis you need to include all of the relevant information covered in this tutorial:

1. State how many questions you analysed and the type of analysis used:

Fifteen questions relating to teaching style and attitude were factor analysed using principal components analysis with varimax rotation.

2. Confirm that factor analysis was appropriate by reporting the KMO measure of sampling adequacy and Bartlett’s test of sphericity:

The Kaiser-Meyer-Olkin measure of sampling adequacy was .87, above the commonly recommended value of .6, and Bartlett’s test of sphericity was significant (χ²(105) = 2983.77, p < .001).

3. Say how many factors you have found, how you extracted them, and how much variance is explained overall by these factors:

Using both the scree plot and eigenvalues > 1 to determine the underlying components, the analysis yielded four factors explaining a total of 60.66 per cent of the variance in the data.

4. Explain what each of your factors represents, with examples of the questions that load on to them, and how much variance each explains. For example, for Factor 1 you could say:

Factor 1 was labelled ‘social learning’ because of the high loadings by the following items: children learn best through collaborative activities; helping children to talk to one another in class productively is a good way of teaching; meaningful learning takes place when individuals are engaged in social activities. This first factor explained 22.53 per cent of the variance after rotation.

You need to include similar descriptions for each of the other three factors (Inclusivity, Self-directed learning and Clarity), along with your rotated component matrix table. You may also want to include a copy of your scree plot in the report.

This brings us to the end of this tutorial. Why not download the data file from this tutorial and see if you can run the analysis yourself?