lecture 17 factor analysis

32
Lecture 17 Factor Analysis

Upload: mahon

Post on 22-Feb-2016

29 views

Category:

Documents


0 download

DESCRIPTION

Lecture 17 Factor Analysis. Syllabus. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Lecture 17  Factor Analysis

Lecture 17

Factor Analysis

Page 2: Lecture 17  Factor Analysis

SyllabusLecture 01 Describing Inverse ProblemsLecture 02 Probability and Measurement Error, Part 1Lecture 03 Probability and Measurement Error, Part 2 Lecture 04 The L2 Norm and Simple Least SquaresLecture 05 A Priori Information and Weighted Least SquaredLecture 06 Resolution and Generalized InversesLecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and VarianceLecture 08 The Principle of Maximum LikelihoodLecture 09 Inexact TheoriesLecture 10 Nonuniqueness and Localized AveragesLecture 11 Vector Spaces and Singular Value DecompositionLecture 12 Equality and Inequality ConstraintsLecture 13 L1 , L∞ Norm Problems and Linear ProgrammingLecture 14 Nonlinear Problems: Grid and Monte Carlo Searches Lecture 15 Nonlinear Problems: Newton’s Method Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals Lecture 17 Factor AnalysisLecture 18 Varimax Factors, Empircal Orthogonal FunctionsLecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s ProblemLecture 20 Linear Operators and Their AdjointsLecture 21 Fréchet DerivativesLecture 22 Exemplary Inverse Problems, incl. Filter DesignLecture 23 Exemplary Inverse Problems, incl. Earthquake LocationLecture 24 Exemplary Inverse Problems, incl. Vibrational Problems

Page 3: Lecture 17  Factor Analysis

Purpose of the Lecture

Introduce Factor Analysis

Work through an example

Page 4: Lecture 17  Factor Analysis

Part 1

Factor Analysis

Page 5: Lecture 17  Factor Analysis

source A

ocean

sediment

source B

s4s2 s3s1

Page 6: Lecture 17  Factor Analysis

sample matrix S

S arranged row-wisebut we’ll use a column vector s(i) for individual samples)

Page 7: Lecture 17  Factor Analysis

theory

samples are a linear mixture of sources

S = C F

Page 8: Lecture 17  Factor Analysis

theory

samples are a linear mixture of sources

S = C Fsamples contain

“elements”

Page 9: Lecture 17  Factor Analysis

theory

samples are a linear mixture of sources

S = C Fsources called “factors”

factors contain “elements”

Page 10: Lecture 17  Factor Analysis

factor matrix F

F arranged row-wisebut we’ll use a column vector f(i) for individual factors

Page 11: Lecture 17  Factor Analysis

theory

samples are a linear mixture of sources

S = C Fcoefficients

called “loadings”

Page 12: Lecture 17  Factor Analysis

loading matrix C

Page 13: Lecture 17  Factor Analysis

inverse problem

given Sfind C and F

so that S=CF

Page 14: Lecture 17  Factor Analysis

very non-unique

given T with inverse T-1

if S=CFthen S=[C T-1][TF] =C’F’

Page 15: Lecture 17  Factor Analysis

very non-unique

so a priori information needed to select a solution

Page 16: Lecture 17  Factor Analysis

simplicity

what is the minimum number of factors needed

call that number p

Page 17: Lecture 17  Factor Analysis

does S span the full space of M elements?

or just a p –dimensional subspace?

Page 18: Lecture 17  Factor Analysis

0 0.2 0.4 0.6 0.8 10

0.5

1

0

0.2

0.4

0.6

0.8

1

E2

E1

E3 E3

E2

s1s3

A

Bs1

s4

E1

Page 19: Lecture 17  Factor Analysis

we know how to answer this question

p is the number of non-zero singular values

Page 20: Lecture 17  Factor Analysis

-0.50 0.5

11.5

-1

-0.5

0

0.5

1

-1

-0.5

0

0.5

1

E1

E2

E3

E3

E2E1

s2 s3

A

Bs1

s4

v2

v1

v3

Page 21: Lecture 17  Factor Analysis

SVD identifies a subspace

but the SVD factors

f(i) = v(i) i=1, pnot unique

usually not the “best”

Page 22: Lecture 17  Factor Analysis

factor f(1)v with the largest singular value

usually near the mean sample

sample mean <s>minimize

eigenvector <v>minimize

Page 23: Lecture 17  Factor Analysis

factor f(1)v with the largest singular value

usually near the mean sample

sample mean <s>minimize

eigenvector <v>minimize

about the same if samples are clustered

Page 24: Lecture 17  Factor Analysis

0 0.2 0.4 0.6 0.8 10

0.5

10

0.2

0.4

0.6

0.8

1

E1

E2E

30 0.2 0.4 0.6 0.8 1

0

0.5

10

0.2

0.4

0.6

0.8

1

E1

E2

E3 s3

f(1) s1s4

f(1)

f(2) f(2)s4s3s2 s2s1E

3E2 E1

E3E2 E1

(A) (B)

Page 25: Lecture 17  Factor Analysis

[U, LAMBDA, V] = svd(S,0);lambda = diag(LAMBDA);F = V';C = U*LAMBDA;

in MatLab

Page 26: Lecture 17  Factor Analysis

[U, LAMBDA, V] = svd(S,0);lambda = diag(LAMBDA);F = V';C = U*LAMBDA;

“economy” calculationLAMBDA is M⨉Min MatLab

Page 27: Lecture 17  Factor Analysis

since samples have measurement noise

probably no exactly singular valuesjust very small ones

so pick pfor whichS≈CF

is an adequate approximation

Page 28: Lecture 17  Factor Analysis

Atlantic Rock Dataset

51.97 1.25 14.28 11.57 7.02 11.67 2.12 0.0750.21 1.46 16.41 10.39 7.46 11.27 2.94 0.0750.08 1.93 15.6 11.62 7.66 10.69 2.92 0.3451.04 1.35 16.4 9.69 7.29 10.82 2.65 0.1352.29 0.74 15.06 8.97 8.14 13.19 1.81 0.0449.18 1.69 13.95 12.11 7.26 12.33 2 0.1550.82 1.59 14.21 12.85 6.61 11.25 2.16 0.1649.85 1.54 14.07 12.24 6.95 11.31 2.17 0.1550.87 1.52 14.38 12.38 6.69 11.28 2.11 0.17(several thousand more rows)

SiO2 TiO2 Al2O3 FeOt MgO CaO Na2O K2O

Page 29: Lecture 17  Factor Analysis

Al203

Ti02Al203

Si02

K20

Fe0

Mg0

Al203

A) B)

C) D)

Page 30: Lecture 17  Factor Analysis

1 2 3 4 5 6 7 80

1000

2000

3000

4000

5000singular values, s(i)

index, i

s(i)

sing

ular

val

ues,

λ i

index, i

Page 31: Lecture 17  Factor Analysis

f2 f3 f4 f5

f2p f3p f4p f5pf(5)f(2) f(3) f(4)

SiO2

TiO2

Al2O3

FeOtotal

MgO

CaO

Na2O

K2O

Page 32: Lecture 17  Factor Analysis

C2C3

C4