Math 5364/66 Notes
Principal Components and Factor Analysis in SAS
Jesse Crawford
Department of Mathematics
Tarleton State University
Setting for Principal Components
Random vector $X = (X_1, \ldots, X_p)'$, taking values in $\mathbb{R}^p$
Typical Coordinate System
Random vector $X = (X_1, \ldots, X_p)'$, taking values in $\mathbb{R}^p$
Principal Components
Random vector $X = (X_1, \ldots, X_p)'$, taking values in $\mathbb{R}^p$
Relation to Eigenvectors
• Let $\Sigma = \operatorname{cov}(X)$
• Suppose $\lambda_1 \ge \cdots \ge \lambda_p \ge 0$ are the eigenvalues of $\Sigma$
• Let $a_1, \ldots, a_p$ be corresponding orthonormal eigenvectors
• Then $a_1, \ldots, a_p$ are the principal components
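The eigenvector relation can be checked numerically. The following is a minimal sketch in Python with NumPy, not the R or SAS code from the slides. It uses the 2×2 covariance matrix from the simulated example in these notes; the variable names (`sigma`, `lam`, `a`) are illustrative choices.

```python
import numpy as np

# Covariance matrix taken from the simulated example in these notes.
sigma = np.array([[25.0, 10.0],
                  [10.0, 4.16]])

# eigh is intended for symmetric matrices and returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(sigma)

# Reorder so that lambda_1 >= ... >= lambda_p.
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
a = eigvecs[:, order]  # column a[:, i] is the eigenvector for lam[i]

print("eigenvalues:", lam)
print("orthonormal eigenvectors (columns):")
print(a)
```

Each column of `a` is determined only up to sign, so output from other software may differ by a factor of -1 in a column.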
Implementation in R
Simulating the Data in SAS
Simulating the Data in SAS
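$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 2 & 0.4 \end{pmatrix} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}$$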
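$$\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 2 & 0.4 \end{pmatrix} \begin{pmatrix} Z_1 \\ Z_2 \end{pmatrix}$$

$$\begin{aligned}
\operatorname{cov}(X) &= \begin{pmatrix} 5 & 0 \\ 2 & 0.4 \end{pmatrix} \operatorname{cov}(Z) \begin{pmatrix} 5 & 2 \\ 0 & 0.4 \end{pmatrix} \\
&= \begin{pmatrix} 5 & 0 \\ 2 & 0.4 \end{pmatrix} I_2 \begin{pmatrix} 5 & 2 \\ 0 & 0.4 \end{pmatrix} \\
&= \begin{pmatrix} 25 & 10 \\ 10 & 4.16 \end{pmatrix}
\end{aligned}$$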
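The covariance calculation above can be reproduced by simulation. This is a hedged sketch in Python with NumPy, not the SAS code shown on the slides. It assumes $Z_1$ and $Z_2$ are independent standard normal variables, which is one way to obtain $\operatorname{cov}(Z) = I_2$; the sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

A = np.array([[5.0, 0.0],
              [2.0, 0.4]])

n = 100_000
# Assumption: Z1 and Z2 are iid N(0, 1), so cov(Z) is the identity matrix.
Z = rng.standard_normal((n, 2))
X = Z @ A.T  # each row is one draw of X = A Z

theoretical_cov = A @ A.T               # A cov(Z) A' with cov(Z) = I
sample_cov = np.cov(X, rowvar=False)    # rows are observations

print("theoretical covariance:")
print(theoretical_cov)
print("sample covariance:")
print(sample_cov)
```

The sample covariance should be close to, but not exactly equal to, the theoretical matrix.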
Covariance Matrix in SAS
$$\Sigma = \operatorname{cov}(X) = \begin{pmatrix} 25 & 10 \\ 10 & 4.16 \end{pmatrix}$$
Principal Components in SAS
Inputting a Covariance Matrix Manually
PCA Using Original Data
Example: Math and Reading Exams
Example: Adelges (Winged Aphids)
• 19 variables
• 4 principal components needed to explain 90% of the total variation
• PCA can be used to reduce dimensionality
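Choosing the number of components by proportion of variance explained can be sketched as follows. This is an illustrative Python function, not from the notes, and the eigenvalues are hypothetical (they are not the Adelges eigenvalues, which these notes do not list). The function name `components_needed` and the 90% default threshold are assumptions for the example.

```python
import numpy as np

def components_needed(eigenvalues, threshold=0.90):
    """Return the smallest number of components whose eigenvalues account for
    at least `threshold` of the total variation, plus the cumulative proportions."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(lam) / lam.sum()
    # Index of the first cumulative proportion that reaches the threshold.
    idx = int(np.searchsorted(cumulative, threshold))
    return min(idx + 1, lam.size), cumulative

# Hypothetical eigenvalues of a covariance matrix (illustration only).
hypothetical_eigenvalues = [5.0, 2.0, 1.5, 0.8, 0.4, 0.2, 0.1]
k, cumulative = components_needed(hypothetical_eigenvalues)

print("cumulative proportion of variance:", np.round(cumulative, 3))
print("components needed for 90%:", k)
```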
PCA Summary
• $p$-dimensional random vector $X$
• Covariance matrix $\Sigma = \operatorname{cov}(X)$
• Principal components are simply an orthonormal eigenbasis of $\Sigma$
• Dimensionality reduction is achieved by dropping components with small eigenvalues
Setting for Factor Analysis
• Random vector $X = (X_1, \ldots, X_p)'$
• Example from Spearman (1904). Exam scores for 33 students.
$X = (\text{Classics}, \text{French}, \text{English}, \text{Math}, \text{Pitch}, \text{Music})'$
• Idea: Explain the variation in $X$ with a random vector $f = (f_1, \ldots, f_k)'$ via a regression equation

$$\begin{aligned}
X_1 &= \mu_1 + l_{11} f_1 + \cdots + l_{1k} f_k + \varepsilon_1 \\
X_2 &= \mu_2 + l_{21} f_1 + \cdots + l_{2k} f_k + \varepsilon_2 \\
&\;\;\vdots \\
X_p &= \mu_p + l_{p1} f_1 + \cdots + l_{pk} f_k + \varepsilon_p
\end{aligned}$$
Terms in the equation $X_i = \mu_i + l_{i1} f_1 + \cdots + l_{ik} f_k + \varepsilon_i$:
• $X_i$: observed data (random, observable)
• $\mu_i$: intercept term (constant)
• $l_{ij}$: factor loadings (constant)
• $f_j$: common factors (random, unobservable)
• $\varepsilon_i$: specific factors (random, unobservable)
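In matrix form, the equations $X_i = \mu_i + l_{i1} f_1 + \cdots + l_{ik} f_k + \varepsilon_i$, $i = 1, \ldots, p$, become

$$X = \mu + Lf + \varepsilon$$

• $X$ is a $p$-dimensional random vector
• $\mu \in \mathbb{R}^p$
• $L \in \mathbb{R}^{p \times k}$
• $f$ is a $k$-dimensional random vector
• $\varepsilon$ is a $p$-dimensional random vector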
Assumptions on the model $X = \mu + Lf + \varepsilon$:
• $E(f) = 0_k$
• $E(\varepsilon) = 0_p$
• $\operatorname{cov}(f, \varepsilon) = 0_{k \times p}$
• $\operatorname{cov}(f) = I_{k \times k}$
• $\operatorname{cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \ldots, \psi_p)$, with each $\psi_i \ge 0$
• Under these assumptions, $\operatorname{cov}(X) = LL' + \Psi$, where $\Psi = \operatorname{cov}(\varepsilon)$
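The covariance structure $\operatorname{cov}(X) = LL' + \Psi$ follows from the model $X = \mu + Lf + \varepsilon$ and the assumptions $\operatorname{cov}(f) = I$, $\operatorname{cov}(f, \varepsilon) = 0$, and $\operatorname{cov}(\varepsilon) = \Psi$. A short derivation, written here as a worked step rather than quoted from the slides:

```latex
\begin{aligned}
\operatorname{cov}(X)
  &= \operatorname{cov}(Lf + \varepsilon) \\
  &= L \operatorname{cov}(f) L' + L \operatorname{cov}(f, \varepsilon)
     + \operatorname{cov}(\varepsilon, f) L' + \operatorname{cov}(\varepsilon) \\
  &= L I L' + 0 + 0 + \Psi \\
  &= LL' + \Psi
\end{aligned}
```

The constant vector $\mu$ drops out because adding a constant does not change a covariance matrix.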
• $\operatorname{Var}(X_i) = l_{i1}^2 + \cdots + l_{ik}^2 + \psi_i = h_i^2 + \psi_i$
• $h_i^2 = l_{i1}^2 + \cdots + l_{ik}^2$ is the communality, or common variance
• $\psi_i$ is the uniqueness, or specific variance
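The split of each variance into communality plus uniqueness can be illustrated numerically. The sketch below is in Python with NumPy and is not from the notes. The loading matrix is hypothetical, the uniquenesses are chosen so that every variable has variance 1, and normal factors are an extra assumption used only to simulate data.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed

# Hypothetical loadings for p = 4 variables and k = 2 factors.
L = np.array([[0.8, 0.1],
              [0.7, 0.3],
              [0.2, 0.9],
              [0.1, 0.6]])

communality = (L ** 2).sum(axis=1)   # h_i^2 = l_i1^2 + ... + l_ik^2
psi = 1.0 - communality              # uniquenesses chosen so Var(X_i) = 1

# Simulate X = L f + eps with mu = 0, cov(f) = I, cov(eps) = diag(psi).
n = 200_000
f = rng.standard_normal((n, 2))
eps = rng.standard_normal((n, 4)) * np.sqrt(psi)
X = f @ L.T + eps

sample_var = X.var(axis=0, ddof=1)

print("communalities h_i^2:", communality)
print("uniquenesses psi_i: ", psi)
print("h_i^2 + psi_i:      ", communality + psi)
print("sample variances:   ", np.round(sample_var, 3))
```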
[Slide annotations: $\hat{L}$, $\hat{h}_i^2$]
• Model: $X = \mu + Lf + \varepsilon$
• $\operatorname{Var}(X_i) = h_i^2 + \psi_i$
• If $\Sigma = \operatorname{corr}(X)$, then $h_i^2 + \psi_i = 1$
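• $X_i = \mu_i + l_{i1} f_1 + \cdots + l_{ik} f_k + \varepsilon_i$
• $\operatorname{cov}(X_i, f_j) = l_{ij}$
• If $\Sigma = \operatorname{corr}(X)$, then $\operatorname{corr}(X_i, f_j) = l_{ij}$
[Slide annotation: correlations between the $X_i$'s and $f$]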
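The identity $\operatorname{cov}(X_i, f_j) = l_{ij}$ can be derived from the factor model assumptions $\operatorname{cov}(f) = I$ and $\operatorname{cov}(f, \varepsilon) = 0$. The summation index $m$ below is introduced only for this worked step:

```latex
\begin{aligned}
\operatorname{cov}(X_i, f_j)
  &= \operatorname{cov}\Bigl(\mu_i + \sum_{m=1}^{k} l_{im} f_m + \varepsilon_i,\; f_j\Bigr) \\
  &= \sum_{m=1}^{k} l_{im} \operatorname{cov}(f_m, f_j) + \operatorname{cov}(\varepsilon_i, f_j) \\
  &= l_{ij} \cdot 1 + 0 \\
  &= l_{ij}
\end{aligned}
```

When each $X_i$ has variance 1, and since $\operatorname{Var}(f_j) = 1$, the covariance equals the correlation, which gives $\operatorname{corr}(X_i, f_j) = l_{ij}$.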
Principal Component Method for Factor Analysis
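• Model: $X = \mu + Lf + \varepsilon$, with $\Sigma = LL' + \Psi$
• $\Sigma = \Gamma \Lambda \Gamma'$, where $\Gamma$ is orthogonal and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_p)$
• $\gamma_i$ = $i$th column of $\Gamma$ = $i$th principal component of $\Sigma$
• $\lambda_i$ = $i$th eigenvalue
• Define $\hat{L} = (\sqrt{\lambda_1}\,\gamma_1, \ldots, \sqrt{\lambda_k}\,\gamma_k)$
• Define $\hat{\Psi} = \operatorname{diag}(\Sigma - \hat{L}\hat{L}')$, that is, $\hat{\psi}_{ii} = \sigma_{ii} - (\hat{L}\hat{L}')_{ii}$, where $\sigma_{ii}$ is the $i$th diagonal entry of $\Sigma$
• Then $\Sigma \approx \hat{L}\hat{L}' + \hat{\Psi}$
• Residual matrix:

$$\mathrm{Res} = \Sigma - (\hat{L}\hat{L}' + \hat{\Psi}) = \begin{pmatrix} 0 & \mathrm{res}_{12} & \cdots & \mathrm{res}_{1p} \\ \mathrm{res}_{21} & 0 & \cdots & \mathrm{res}_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{res}_{p1} & \mathrm{res}_{p2} & \cdots & 0 \end{pmatrix}$$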
[Slide annotations: $\hat{L}$, $\hat{h}_i^2$]
• Diagonal entries: $\hat{\psi}_{ii}$'s
• Off-diagonal entries: $\mathrm{res}_{ij}$'s
Rule of thumb: If RMS < 0.05, then the model is acceptable
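The principal component method can be sketched in a few lines. The following Python function is an illustration, not the SAS procedure used in the notes. The function name `pc_factor_method` and the example correlation matrix are hypothetical, and RMS is assumed here to mean the root mean square of the off-diagonal residuals, since the notes do not define it.

```python
import numpy as np

def pc_factor_method(sigma, k):
    """Principal component method with k factors.

    Returns the estimated loadings, estimated specific variances,
    the residual matrix, and the RMS of its off-diagonal entries.
    """
    sigma = np.asarray(sigma, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(sigma)       # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    lam, gamma = eigvals[order], eigvecs[:, order]

    # Column i of L_hat is sqrt(lambda_i) times the i-th eigenvector.
    L_hat = gamma[:, :k] * np.sqrt(lam[:k])
    psi_hat = np.diag(sigma - L_hat @ L_hat.T)     # 1-D array of diagonal entries
    res = sigma - (L_hat @ L_hat.T + np.diag(psi_hat))

    p = sigma.shape[0]
    off_diagonal = res[~np.eye(p, dtype=bool)]
    rms = np.sqrt(np.mean(off_diagonal ** 2))      # assumed definition of RMS
    return L_hat, psi_hat, res, rms

# Hypothetical 4 x 4 correlation matrix with a one-factor structure.
l = np.array([0.9, 0.8, 0.7, 0.6])
R = np.outer(l, l)
np.fill_diagonal(R, 1.0)

L_hat, psi_hat, res, rms = pc_factor_method(R, k=1)
print("estimated loadings:", L_hat.ravel())
print("estimated specific variances:", psi_hat)
print("RMS of off-diagonal residuals:", rms)
```

The sign of each column of `L_hat` is arbitrary, because eigenvectors are determined only up to sign.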
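Estimating Factor Scores
• Model: $X = \mu + Lf + \varepsilon$, so $X - \mu = Lf + \varepsilon$ with $\operatorname{cov}(\varepsilon) = \Psi$
• $\hat{f} = (\hat{L}' \hat{\Psi}^{-1} \hat{L})^{-1} \hat{L}' \hat{\Psi}^{-1} (X - \hat{\mu})$ (Generalized/weighted least squares method)
• $X \approx \hat{\mu} + \hat{L}\hat{f}$
• Values of $\hat{f}$ are called factor scores.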
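The weighted least squares formula for factor scores, $\hat{f} = (\hat{L}'\hat{\Psi}^{-1}\hat{L})^{-1}\hat{L}'\hat{\Psi}^{-1}(X - \hat{\mu})$, can be sketched in Python as follows. This is an illustration, not the SAS computation from the notes. The function name `wls_factor_scores` and all numerical estimates are hypothetical, and the sketch assumes every specific variance is strictly positive so that $\hat{\Psi}^{-1}$ exists.

```python
import numpy as np

def wls_factor_scores(X, mu_hat, L_hat, psi_hat):
    """Weighted least squares factor scores, one row of scores per row of X.

    psi_hat is the 1-D array of (strictly positive) specific variances.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    weighted_L = L_hat / psi_hat[:, None]      # Psi^{-1} L, shape (p, k)
    M = L_hat.T @ weighted_L                   # L' Psi^{-1} L, shape (k, k)
    rhs = weighted_L.T @ (X - mu_hat).T        # L' Psi^{-1} (x - mu), shape (k, n)
    return np.linalg.solve(M, rhs).T           # shape (n, k)

# Hypothetical estimates for p = 4 variables and k = 2 factors.
mu_hat = np.array([1.0, 2.0, 3.0, 4.0])
L_hat = np.array([[0.8, 0.1],
                  [0.7, 0.3],
                  [0.2, 0.9],
                  [0.1, 0.6]])
psi_hat = np.array([0.35, 0.42, 0.15, 0.63])

# Noise-free check: if x = mu + L f exactly, the formula recovers f.
f_true = np.array([[1.0, -0.5],
                   [0.0, 2.0]])
X = mu_hat + f_true @ L_hat.T
scores = wls_factor_scores(X, mu_hat, L_hat, psi_hat)
print(scores)
```

Using `np.linalg.solve` avoids forming the inverse of $\hat{L}'\hat{\Psi}^{-1}\hat{L}$ explicitly, which is a common numerical preference rather than something the notes prescribe.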
Rotation of Factors
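• Model: $X = \mu + Lf + \varepsilon$, where $\mu \in \mathbb{R}^p$, $L \in \mathbb{R}^{p \times k}$, $f$ is a $k$-dimensional random vector, and $X$ and $\varepsilon$ are $p$-dimensional random vectors
• Conditions: $E(f) = 0_k$, $E(\varepsilon) = 0_p$, $\operatorname{cov}(f, \varepsilon) = 0_{k \times p}$, $\operatorname{cov}(f) = I_{k \times k}$, and $\operatorname{cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \ldots, \psi_p)$ with each $\psi_i \ge 0$
• Let $Q$ be a $k \times k$ orthogonal matrix.
• Then $\tilde{L} = LQ$ and $\tilde{f} = Q'f$ satisfy the above conditions, since $\tilde{L}\tilde{f} = LQQ'f = Lf$ and $\operatorname{cov}(\tilde{f}) = Q'Q = I_{k \times k}$.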
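The effect of rotating the factors by an orthogonal matrix can be verified numerically. The sketch below is in Python with NumPy and is not from the notes. The loading matrix, the rotation angle, and the names `Q` and `L_rot` are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical loadings for p = 4 variables and k = 2 factors.
L = np.array([[0.8, 0.1],
              [0.7, 0.3],
              [0.2, 0.9],
              [0.1, 0.6]])

# A 2 x 2 rotation matrix is orthogonal; the angle is arbitrary.
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_rot = L @ Q  # rotated loadings

# The rotated factors Q' f have covariance Q' I Q.
cov_f_rot = Q.T @ np.eye(2) @ Q

print("Q is orthogonal:", np.allclose(Q.T @ Q, np.eye(2)))
print("LL' unchanged by rotation:", np.allclose(L_rot @ L_rot.T, L @ L.T))
print("cov of rotated factors is I:", np.allclose(cov_f_rot, np.eye(2)))
print("communalities unchanged:",
      np.allclose((L_rot ** 2).sum(axis=1), (L ** 2).sum(axis=1)))
```

Because $LL'$ is unchanged, the loadings are identified only up to an orthogonal rotation, which is why software can rotate them toward a more interpretable pattern.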