Cluster Analysis
Hierarchical agglomerative cluster analysis
Use of a created cluster variable in secondary analysis
Cluster Analysis: Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University
KEY CONCEPTS
Cluster Analysis
Research questions addressed by cluster analysis
Cluster analysis assumptions
Alternative names for cluster analysis
Caveats in using cluster analysis
Similarity/dissimilarity matrix, also called a distance matrix
• Squared Euclidean distance
• Euclidean distance
• Cosine of vector variables
• City block (Manhattan) distance
• Chebychev distance metric
• Distances in the absolute power metric
• Pearson product-moment correlation coefficient
• Minkowski metric
• Mahalanobis D²
• Jaccard's coefficient(s)
• Gower's coefficient
• Simple matching coefficient
Cluster-seeking vs. cluster-imposing methods
Clustering algorithms
• Hierarchical Methods
  - Agglomerative Methods
    Single linkage (nearest neighbor)
    Complete linkage (furthest neighbor)
    Average linkage
    Ward's error sum of squares
    Centroid method
    Median clustering
  - Divisive Methods
    K-means clustering
    Trace methods
    A-Splinter-Average Distance method
    Automatic Interaction Detection (AID)
• Non-Hierarchical Methods
  - Iterative Methods
    Sequential threshold method
    Parallel threshold method
    Optimizing methods
KEY CONCEPTS (CONT.)
Factor Analysis
Q-Analysis
Density Methods
Multivariate probability approaches (NORMIX, NORMAP)
Clumping Methods
Graphic Methods
Glyphs & Metroglyphs
Fourier Series
Chernoff Faces
Agglomeration schedule
Fusion coefficient
Alternative ways to determine the optimal number of clusters
Criteria: clusters as internally homogeneous and significantly different from each other
Dendrogram
Scaled distance
Cluster scores
Profiling clusters
Using a cluster variable as an IV or DV in secondary analysis
Sokal, Robert & Sneath, Peter, Principles of Numerical Taxonomy (1963)
Steps in cluster analysis
Variable selection, construction of the database, testing assumptions
Selecting a measure of similarity/distance
Selecting a clustering algorithm
Determining the number of clusters
Profiling the clusters
Validation
Cluster Analysis
Interdependency Technique
Designed to group a sample of subjects
Into significantly different groups
Based upon a number of variables
The groups are constructed to be as different as statistically possible
And as internally homogeneous as statistically possible
Assumptions
The sample needs to be representative of the population
Multicollinearity among the variables should be minimal
Absence of outliers & good N to k ratio
Cluster Analysis by Other Names
Similar techniques have been developed independently in various fields (e.g., biology, archeology), giving rise to different names for the same statistical technique
Cluster Analysis
Numerical Taxonomy
Q-Analysis
Typology Analysis
Classification Analysis
There are a number of different clustering techniques depending upon …
The procedure used to measure the similarity or distance among subjects
And the clustering algorithm used.
Caveats in Using Cluster Analysis
There is no one best way to perform a cluster analysis
There are many methods and most lack rigorous statistical reasoning or proofs
Cluster analysis is used in different disciplines, which favor different techniques for:
Measuring the similarity or distance among subjects relative to the variables
Choosing the clustering algorithm
Different clustering techniques can produce different cluster solutions
Cluster analysis is supposed to be “cluster-seeking”, but in fact it is “cluster-imposing”
Applications of Cluster Analysis
Cluster analysis seeks to reduce a sample of cases to a few statistically different groups, i.e. clusters, based upon differences/similarities across a set of multiple variables
A useful tool for constructing typologies among cases
Example
Is each case filed with the court unique, or can cases be sorted into distinctly different types based upon the amount of evidence, quality of the defense, complexity of the charges, etc.?
Example
Is a murder a murder, or can cases be sorted into distinctively different types on the basis of victim/offender characteristics, circumstances, motives, etc.?
The Logic of Cluster Analysis
Step 1 Cluster analysis begins with an N x k database (N cases measured on k variables)
Step 2 Using one of several methods, an N x N matrix is created that indicates the similarity (or dissimilarity) of every case to every other case, based on the k variables
Matrix of Dissimilarities

Subjects        1        2        3      …        N
1            0.000    1.782    2.538    …    47.236
2            1.782    0.000    0.821    …    39.902
3            2.538    0.821    0.000    …    41.652
…              …        …        …      …        …
N           47.236   39.902   41.652    …     0.000
The Logic of Cluster Analysis (cont.)
Step 3 Using one of several clustering algorithms, the subjects are sorted into significantly different groups where …
The subjects within each group are as homogeneous as possible, and …
The groups are as different from one another as possible
Measures of Similarity or Difference
Cluster analysis begins by creating a matrix indicating the similarity between (or the distance between) each pair of subjects relative to the k variables in the database.
There are a number of ways that this can be done.
Technique

Squared Euclidean Distance *
Euclidean Distance *
Cosine of Vector Variables *
City Block (Manhattan) Distance *
Chebychev Distance Metric *
Distances in the Absolute Power Metric
Pearson Correlation Coefficient *
Mahalanobis D² *
Minkowski Metric *
Jaccard's Coefficient
Gower's Coefficient
Simple Matching Coefficient
* Available in SPSS
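As a rough illustration (not part of the original slides), several of these metrics are implemented in SciPy's `pdist`; the three-subject data below are made up:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three subjects measured on four variables (illustrative values only).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 2.0, 3.0, 5.0],
              [9.0, 8.0, 7.0, 6.0]])

# pdist returns the condensed pairwise-distance vector; squareform
# expands it into the full N x N dissimilarity matrix.
d_sqeuclid = squareform(pdist(X, metric='sqeuclidean'))
d_euclid   = squareform(pdist(X, metric='euclidean'))
d_city     = squareform(pdist(X, metric='cityblock'))   # Manhattan distance
d_cheb     = squareform(pdist(X, metric='chebyshev'))

print(d_sqeuclid[0, 1])  # (1-2)^2 + 0 + 0 + (4-5)^2 = 2.0
```

Whichever metric is chosen, the result is the same kind of N x N matrix used in the rest of the analysis.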
An Example of Squared Euclidean Distances
Variable   Subject 1   Subject 2   (Si - Sj)   (Si - Sj)²
X1            18          19          -1            1
X2            15          17          -2            4
X3             9          10          -1            1
X4            12          10          +2            4
X5             0           1          -1            1
X6             1           1           0            0
X7             9           8          +1            1
Totals        NA          NA          NA           12

Squared Euclidean Distance = Σ(Si - Sj)² = 12
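The computation above can be sketched in a few lines of Python, using the two subjects' values from the table:

```python
# Values for the two subjects on the seven variables (from the table above).
s_i = [18, 15, 9, 12, 0, 1, 9]
s_j = [19, 17, 10, 10, 1, 1, 8]

def squared_euclidean(a, b):
    """Sum of squared differences across the k variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

print(squared_euclidean(s_i, s_j))  # 12, matching the table's total
```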
A Variety of Clustering Algorithms
There is no proven best way to cluster subjects into homogeneous groups
Different techniques have been developed in different fields based upon different logics (e.g. biology, archeology, etc.)
Given the same database, similar clustering results can be achieved using different clustering algorithms, but not always.
Clustering algorithms are generally classified into two broad types …
Hierarchical methods
Non-hierarchical methods
Hierarchical Clustering Algorithms
Agglomerative Methods
  Single Linkage (Nearest Neighbor) *
  Complete Linkage (Furthest Neighbor) *
  Average Linkage *
  Ward's Error Sum of Squares *
  Centroid Method *
  Median Clustering

Divisive Methods
  K-Means Clustering *
  Trace Methods
  A-Splinter-Average Distance Method
  Automatic Interaction Detection (AID)
* Available in SPSS
Non-hierarchical Clustering Algorithms
Iterative Methods
Sequential Threshold Method
Parallel Threshold Method
Optimizing Methods
Factor Analysis
Q-Factor Analysis
Density Methods
Multivariate Probability Approaches (NORMIX, NORMAP)
Clumping Methods
Graphic Methods
Glyphs & Metroglyphs
Fourier Series
Chernoff Faces
An Example of a Clustering Algorithm: Ward's Error Sum of Squares
Imagine that data on seven variables (Xk) was gathered on 70 subjects (n)
Imagine further that a dissimilarity matrix was constructed indicating the differences among all pairs of subjects using squared Euclidean distances
Step 1 Ward's algorithm begins with each of 70 subjects in their own cluster
Step 2 Next it finds the two subjects that are most similar and creates a cluster with two subjects
Now there are 69 clusters, one with two subjects, and 68 with one subject each
Step 3 Now it finds the next two most similar subjects and creates a two-subject cluster
Now there are 68 clusters, two with two subjects each, and 66 with one subject each
An Example of a Clustering Algorithm: Ward's Error Sum of Squares (cont.)
As Ward's algorithm progresses, it begins to merge single subjects into pre-existing clusters,
And then to combine pre-existing clusters with one another
This process is continued until all 70 subjects are finally combined into one cluster
Ward's algorithm forms clusters by selecting, at each stage, the merge (of a subject or of another cluster) that minimizes the increase in the within-cluster sum of squares (i.e. the error sum of squares)
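A minimal sketch of this procedure, using SciPy's implementation of Ward's method on simulated stand-in data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(0)
X = rng.normal(size=(70, 7))   # simulated 70-subject, 7-variable database

# method='ward': at each stage, merge the pair of clusters whose fusion
# produces the smallest increase in the within-cluster (error) sum of squares.
Z = linkage(X, method='ward')

# One row per merge: 69 stages take 70 singleton clusters down to a single
# cluster; column 2 holds the fusion coefficient, which never decreases.
print(Z.shape)  # (69, 4)
```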
A Seven Variable Example of Cluster Analysis
The database: 70 subjects and 7 variables
The variables
Sentence in years: sentence
Number of prior convictions: pr_conv
Degree of drug dependency: dr_score
Age: age
Age at first arrest: age_firs
Educational equivalency: educ_eqv
Level of work skill: skl_indx
Steps in the Cluster Analysis
Step 1 Transform the seven variables to standard scores, i.e. Z-scores
Step 2 Create a dissimilarity matrix using squared Euclidean distances
Squared Euclidean Distances

Subjects        1        2        3      …       70
1            0.000    1.782    2.538    …    47.236
2            1.782    0.000    0.821    …    39.902
3            2.538    0.821    0.000    …    41.652
…              …        …        …      …        …
70          47.236   39.902   41.652    …     0.000
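Steps 1 and 2 can be sketched with NumPy/SciPy; the data here are a simulated stand-in for the real 70 x 7 database:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(7)
X = rng.normal(loc=10.0, scale=4.0, size=(70, 7))  # simulated raw data

# Step 1: convert each variable to z-scores so that variables measured
# on large scales do not dominate the distance calculation.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: the 70 x 70 matrix of squared Euclidean distances.
D = squareform(pdist(Z, metric='sqeuclidean'))

print(D.shape)  # (70, 70); symmetric, with zeros on the diagonal
```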
Steps in the Cluster Analysis (cont.)
Step 3 Use Ward's algorithm to cluster the 70 subjects, beginning with 70 clusters of one subject each and terminating with one cluster containing all 70 subjects
Agglomeration Schedule

Stage   Cluster 1   Cluster 2   Coefficient   Cl. 1 First Appears   Cl. 2 First Appears   Next Stage
1          62          63           .255               0                     0                40
2          31          33           .610               0                     0                37
3           2           3          1.021               0                     0                43
4           7           8          1.502               0                     0                31
5          29          30          1.984               0                     0                45
6          14          15          2.495               0                     0                31
7          52          67          3.031               0                     0                34
8          18          19          3.588               0                     0                49
9          46          47          4.191               0                     0                35
10         27          28          4.803               0                     0                44
11         36          40          5.437               0                     0                33
12          9          13          6.095               0                     0                49
13         48          49          6.760               0                     0                51
14         32          38          7.435               0                     0                42
15         20          21          8.128               0                     0                39
16         22          64          8.844               0                     0                39
17         35          39          9.580               0                     0                52
18          5          12         10.324               0                     0                36
19         23          24         11.093               0                     0                29
20         57          59         11.878               0                     0                32
21         37          43         12.702               0                     0                42
22          6          10         13.551               0                     0                55
23          1           4         14.439               0                     0                28
24         11          45         15.358               0                     0                46
25         41          44         16.284               0                     0                33
26         55          56         17.220               0                     0                41
27         51          66         18.237               0                     0                48
28          1          50         19.329              23                     0                47
29         17          23         20.483               0                    19                38
30         54          69         21.732               0                     0                41
31          7          14         23.076               4                     6                46
Stage   Cluster 1   Cluster 2   Coefficient   Cl. 1 First Appears   Cl. 2 First Appears   Next Stage
32         57          58         24.425              20                     0                53
33         36          41         25.784              11                    25                40
34         52          53         27.173               7                     0                51
35         42          46         28.626               0                     9                58
36          5          16         30.251              18                     0                54
37         31          34         32.018               2                     0                62
38         17          68         33.905              29                     0                59
39         20          22         35.806              15                    16                57
40         36          62         37.855              33                     1                56
41         54          55         39.918              30                    26                50
42         32          37         42.118              14                    21                52
43          2          65         44.428               3                     0                47
44         25          27         46.758               0                    10                45
45         25          29         49.344              44                     5                59
46          7          11         52.395              31                    24                54
47          1           2         55.709              28                    43                63
48         26          51         59.223               0                    27                61
49          9          18         62.772              12                     8                57
50         54          70         66.383              41                     0                65
51         48          52         70.076              13                    34                60
52         32          35         73.798              42                    17                58
53         57          60         77.659              32                     0                65
54          5           7         81.736              36                    46                55
55          5           6         86.189              54                    22                64
56         36          61         90.955              40                     0                66
57          9          20         97.853              49                    39                60
58         32          42        105.430              52                    35                62
59         17          25        114.736              38                    45                67
60          9          48        125.105              57                    51                61
61          9          26        136.517              60                    48                63
62         31          32        150.461              37                    58                68
63          1           9        167.695              47                    61                64
64          1           5        194.756              63                    55                66
65         54          57        222.045              50                    53                67
66          1          36        258.210              64                    56                68
67         17          54        298.955              59                    65                69
68          1          31        361.556              66                    62                69
69          1          17        483.000              68                    67                 0
Interpretation of the Agglomeration Schedule
Stage 1 Cases 62 and 63 are combined into a cluster. Now there is one cluster with two cases and 68 clusters with one case each, 69 total clusters, or 70 - 1 = 69
Coefficient The squared Euclidean distance over which these two cases were joined = 0.255, called a fusion coefficient
Next Stage The next stage at which one of these cases is joined to a cluster is Stage 40 when case 62 is joined to case 36
Stage 33 Cases 36 and 41 are joined together over a distance = 25.784. At this stage 37 clusters have been formed (70 - 33 = 37)
Stage Cluster First Appears
Cluster 1 Notice that case 36 was previously joined with case 40 at Stage 11
Cluster 2 Again, notice that case 41 was previously joined with case 44 at Stage 25
Interpretation of the Agglomeration Schedule (cont.)
Next Stage The next stage at which one of these cases is joined to a cluster is Stage 40 when case 36 is joined with case 62
Stage 69 The cluster labeled by case 1 is joined with the cluster labeled by case 17 at a distance of 483.0; clearly these two clusters are very dissimilar.
At Stage 69 all 70 cases have been included in a single cluster. Obviously this one cluster is a heterogeneous cluster, containing many very dissimilar cases.
How Do You Determine the Optimal Number of Clusters in the Final Solution?
In this example, Ward's algorithm yields clusters ranging from 70 clusters with one case each, to one cluster containing all 70 cases.
Somewhere in between these two extremes is an optimal number of clusters which best satisfies the following conditions …
The clusters are as internally homogeneous as possible (i.e. minimum within-cluster sum of squares)
And the various clusters are as different as possible
Determining the optimal number of clusters
Theory about the number of underlying groups
Ease of profiling the groups
Magnitude of change in the fusion coefficient
Dendrogram with rescaled distance measure
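The fusion-coefficient criterion can be illustrated on simulated data containing three well-separated groups (a sketch, not the slides' data):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(1)
# Three tight, well-separated groups of 20 subjects on 7 variables.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 7))
               for c in (0.0, 5.0, 10.0)])

fusion = linkage(X, method='ward')[:, 2]   # fusion coefficient per stage

# A sudden jump in the fusion coefficient marks the stage where two
# genuinely different clusters were forced together; stop just before it.
first_big = int(np.argmax(fusion > 10 * np.median(fusion)))
n_clusters = len(X) - first_big
print(n_clusters)  # 3
```

The `10 * median` threshold is an arbitrary illustration of "magnitude of change"; in practice the jumps are inspected by eye or via the dendrogram.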
What is a Dendrogram?
Dendrogram using Ward Method
Rescaled Distance Cluster Combine

[SPSS dendrogram of the 70 cases (Ward method), rescaled distance 0 to 25: individual cases fuse at short distances on the left, and the resulting branches join into a few large clusters only at the far right of the scale.]
What is a Dendrogram? (cont.)
The Scaled Distance
The fusion coefficient transformed to a scale ranging from 0 to 25
The Dendrogram
The dendrogram shows which cases were joined together into clusters, and at what distance; at later stages, it shows which clusters were joined together into larger clusters, and at what distance.
Interpretation
The point at which the "foothills" become the "mountain peaks" is probably the optimal number of clusters
Optimal Number of Clusters
A five-cluster solution appears about optimal
Computing a Five-Cluster Solution
Having hypothesized that a five-cluster solution may be optimal …
The next step is to compute a five-cluster solution and …
Save the cluster scores
Cluster scores
In this case, a cluster score is a number between 1 and 5 assigned to each case indicating the cluster to which a particular case has been assigned
5-Cluster Solution
This is accomplished by repeating the cluster analysis and specifying that five clusters are to be extracted and the cluster scores saved.
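With SciPy, the equivalent of "repeat the analysis and save the scores" is a single cut of the hierarchy (simulated stand-in data again):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
X = rng.normal(size=(70, 7))            # stand-in for the real database

Z = linkage(X, method='ward')

# Cut the hierarchy where exactly five clusters remain; each case gets
# a cluster score from 1 to 5, analogous to SPSS's saved cluster scores.
scores = fcluster(Z, t=5, criterion='maxclust')
print(sorted(set(int(s) for s in scores)))  # [1, 2, 3, 4, 5]
```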
Saved Cluster Scores
Cases 1-16: cluster 1
Case 17: cluster 2
Cases 18-22: cluster 1
Cases 23-25: cluster 2
Case 26: cluster 1
Case 27: cluster 2
… (cases 28-45 not shown)
Cases 46-47: cluster 3
Cases 48-53: cluster 1
Cases 54-60: cluster 5
Cases 61-63: cluster 4
Cases 64-67: cluster 1
Case 68: cluster 2
Cases 69-70: cluster 5
Profiling the Five Clusters
One way to profile the characteristics of the five clusters is to compute the means of the seven variables for each of the five clusters
Ward Method: Cluster Means

Variable    Cluster 1   Cluster 2
SENTENCE       4.6         7.3
PR_CONV        1.5         4.8
DR_SCORE       7.5         5.7
AGE           21.6        24.7
AGE_FIRS      16.2        14.4
EDUC_EQV       7.3         3.4
SKL_INDX       6.0         2.8
Profiling the Five Clusters (cont.)

Ward Method: Cluster Means

Variable    Cluster 3   Cluster 4   Cluster 5
SENTENCE       2.4         3.1        16.3
PR_CONV         .9          .9         2.1
DR_SCORE       3.3         3.0         8.1
AGE           21.3        20.6        30.2
AGE_FIRS      19.3        19.0        14.7
EDUC_EQV       3.3        10.7         5.3
SKL_INDX       2.5         8.1         3.8
Ranking the Variable Means of the Five Clusters
Variable Clusters
1 2 3 4 5
Age M H L LL HH
Age_Firs M L HH H LL
Dr_Score H M L LL HH
Educ_Eqv H L LL HH M
Pr_Conv M HH L LL H
Sentence M H LL L HH
Skl_Indx H L LL HH M
LL = lowest, L = low, M = median, H = high, HH = highest
Profile Descriptions of the Five Clusters
Cluster 1
Better educated drug users who are highly skilled workers, about median age
Cluster 2
Older offenders, unskilled, poorly educated with some history of drug use, career criminals serving long sentences
Cluster 3
Young 1st offenders, unskilled, poorly educated with little drug history, serving very short sentences
Cluster 4
Very young, highly educated, skilled 1st offenders serving short sentences, little history of drug use
Cluster 5
Severely drug dependent old offenders with long criminal careers serving very long sentences
Secondary Applications of the Results of a Cluster Analysis
Some statistical techniques use a priori categorical independent or dependent variables such as analysis of variance or discriminant analysis.
Cluster analysis allows us to create an empirically derived categorical variable wherein the groups or clusters are determined to be homogeneous and significantly different from each other.
Other statistical tests can then be conducted using the cluster variable as a categorical IV or DV.
Example
Do the five clusters of offenders differ significantly in the seriousness of the crime of which they were convicted? This is a one-way ANOVA problem.
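A hedged sketch of that one-way ANOVA in Python; the seriousness scores below are simulated to roughly echo the example's group means and sizes, not the actual data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical crime-seriousness scores for the five clusters,
# with the example's group sizes (33, 9, 12, 7, 9).
means_sizes = [(3.5, 33), (6.0, 9), (2.3, 12), (2.1, 7), (6.3, 9)]
groups = [rng.normal(loc=m, scale=1.4, size=n) for m, n in means_sizes]

# One-way ANOVA: do mean seriousness scores differ across the clusters?
F, p = stats.f_oneway(*groups)
print(p < 0.05)  # True: the cluster variable predicts seriousness
```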
Secondary Applications of the Results of a Cluster Analysis (cont.)
Univariate Analysis of Variance

Between-Subjects Factors

Ward Method     N
    1          33
    2           9
    3          12
    4           7
    5           9

Tests of Between-Subjects Effects
Dependent Variable: SER_INDX

Source            Type III Sum of Squares   df   Mean Square       F      Sig.
Corrected Model          152.593(a)          4       38.148     19.471   .000
Intercept                853.296             1      853.296    435.527   .000
CLU5_1                   152.593             4       38.148     19.471   .000
Error                    127.350            65        1.959
Total                   1306.000            70
Corrected Total          279.943            69

a. R Squared = .545 (Adjusted R Squared = .517)
Post Hoc Tests: Ward Method
Interpretation
There are significant mean differences in the crime seriousness of the offences committed by the five clusters of offenders.
Tukey's HSD test is used to determine which group mean differences are significant.
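SciPy also provides Tukey's HSD (`scipy.stats.tukey_hsd`, SciPy 1.8+); a sketch on the same kind of simulated groups:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Simulated seriousness scores echoing the example's group means and sizes.
means_sizes = [(3.5, 33), (6.0, 9), (2.3, 12), (2.1, 7), (6.3, 9)]
groups = [rng.normal(loc=m, scale=1.4, size=n) for m, n in means_sizes]

# Tukey's HSD: every pairwise comparison of group means with
# family-wise error control, as in the SPSS table that follows.
res = stats.tukey_hsd(*groups)
print(res.pvalue.shape)  # (5, 5): one p-value per (i, j) pair
```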
Secondary Applications of the Results of a Cluster Analysis (cont.)
Multiple Comparisons
Dependent Variable: SER_INDX
Tukey HSD

(I) Ward   (J) Ward   Mean Difference                        95% Confidence Interval
Method     Method     (I-J)        Std. Error   Sig.     Lower Bound   Upper Bound
1          2          -2.5152*      .5264       .000      -3.9921       -1.0382
           3           1.2348       .4718       .079       -.0891        2.5588
           4           1.3420       .5825       .157       -.2923        2.9763
           5          -2.8485*      .5264       .000      -4.3254       -1.3716
2          1           2.5152*      .5264       .000       1.0382        3.9921
           3           3.7500*      .6172       .000       2.0182        5.4818
           4           3.8571*      .7054       .000       1.8779        5.8364
           5           -.3333       .6598       .987      -2.1847        1.5181
3          1          -1.2348       .4718       .079      -2.5588         .0891
           2          -3.7500*      .6172       .000      -5.4818       -2.0182
           4            .1071       .6657      1.000      -1.7607        1.9750
           5          -4.0833*      .6172       .000      -5.8152       -2.3515
4          1          -1.3420       .5825       .157      -2.9763         .2923
           2          -3.8571*      .7054       .000      -5.8364       -1.8779
           3           -.1071       .6657      1.000      -1.9750        1.7607
           5          -4.1905*      .7054       .000      -6.1697       -2.2112
5          1           2.8485*      .5264       .000       1.3716        4.3254
           2            .3333       .6598       .987      -1.5181        2.1847
           3           4.0833*      .6172       .000       2.3515        5.8152
           4           4.1905*      .7054       .000       2.2112        6.1697

Based on observed means.
*. The mean difference is significant at the .05 level.
Secondary Applications of the Results of a Cluster Analysis (cont.)
SER_INDX
Tukey HSD (a, b, c)

Ward Method     N    Subset 1   Subset 2
    4           7     2.1429
    3          12     2.2500
    1          33     3.4848
    2           9                6.0000
    5           9                6.3333
Sig.                   .196       .982

Means for groups in homogeneous subsets are displayed. Based on Type III Sum of Squares. The error term is Mean Square(Error) = 1.959.
a. Uses Harmonic Mean Sample Size = 10.445.
b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.
c. Alpha = .05.
Using the Categorical Cluster Variable as a Dependent Variable
Example
To what extent does the type of defense counsel, pretrial jail time, and time to case disposition predict differences among the five groups of offenders?
This is a discriminant analysis problem with the cluster variable as the DV. (If the cluster variable were used as the IV, this would be a MANOVA problem)
Discriminant analysis results
Three discriminant functions were extracted: with k = 3 IVs and g = 5 groups, the number of functions is k or g - 1, whichever is smaller.
Only the 1st discriminant function is significant.
Z1 = -0.313 - 0.866(counsel) + 0.021(jail_tm) - 0.002(tm_disp)
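A hedged sketch of the same kind of analysis using scikit-learn's `LinearDiscriminantAnalysis`; the predictors and cluster labels here are simulated placeholders, not the example's data:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(6)
X = rng.normal(size=(70, 3))      # counsel type, jail time, disposition time
y = rng.integers(1, 6, size=70)   # the saved cluster variable as the DV

# With k = 3 predictors and g = 5 groups, min(k, g - 1) = 3
# discriminant functions can be extracted, as in the example.
lda = LinearDiscriminantAnalysis().fit(X, y)
functions = lda.transform(X)
print(functions.shape)  # (70, 3): three function scores per case
```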
Using the Cluster Variable as a Dependent Variable (cont.)
Discriminant

Group Statistics (valid N, listwise; identical for COUNSEL, JAIL_TM, and TM_DISP)

Ward Method   Unweighted   Weighted
    1             33        33.000
    2              9         9.000
    3             12        12.000
    4              7         7.000
    5              9         9.000
  Total           70        70.000
Analysis 1: Summary of Canonical Discriminant Functions

Eigenvalues

Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1            .492(a)        89.5            89.5              .574
2            .042(a)         7.6            97.1              .200
3            .016(a)         2.9           100.0              .125

a. First 3 canonical discriminant functions were used in the analysis.

Wilks' Lambda

Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
1 through 3               .633          29.686     12   .003
2 through 3               .945           3.678      6   .720
3                         .984           1.019      2   .601
Using the Cluster Variable as a Dependent Variable (cont.)
Standardized Canonical Discriminant Function Coefficients

             Function 1   Function 2   Function 3
COUNSEL         .549         .863         .523
JAIL_TM        -.627         .807         .607
TM_DISP         .102         .384        -.962
Structure Matrix

             Function 1   Function 2   Function 3
JAIL_TM        -.867*        .488         .103
COUNSEL         .848*        .455         .271
TM_DISP        -.086         .555        -.827*

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function.
Canonical Discriminant Function Coefficients

             Function 1   Function 2   Function 3
COUNSEL        1.235        1.943        1.176
JAIL_TM        -.016         .020         .015
TM_DISP         .004         .015        -.039
(Constant)     -.304       -3.221        2.205

Unstandardized coefficients.
Functions at Group Centroids

Ward Method   Function 1   Function 2   Function 3
    1            .213        -.115        -.0976
    2           -.803         .140        -.0289
    3            .673         .366         .0482
    4            .618        -.266         .291
    5          -1.357        -.0015        .0960

Unstandardized canonical discriminant functions evaluated at group means.