lecture 18 varimax factors and empirical orthogonal functions

Post on 23-Jan-2016


Lecture 18

Varimax Factors and

Empirical Orthogonal Functions

Syllabus

Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton’s Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems

Purpose of the Lecture

Choose Factors Satisfying A Priori Information of Spikiness

(varimax factors)

Use Factor Analysis to Detect Patterns in Data

(EOF’s)

Part 1: Creating Spiky Factors

can we find “better” factors than those returned by svd()?

mathematically

$S = CF = C'F'$ with $F' = MF$ and $C' = CM^{-1}$, where $M$ is any $P \times P$ matrix with an inverse

must rely on prior information to choose M
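The non-uniqueness can be checked numerically. The sketch below is only an illustration: the sizes, the random C and F, and the particular invertible matrix `Mmat` are all made up for the purpose and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_elements, P = 6, 5, 3          # illustrative sizes only

C = rng.normal(size=(n_samples, P))         # loadings, one row per sample
F = rng.normal(size=(P, n_elements))        # factors, one row per factor
S = C @ F                                   # sample matrix

# an arbitrary invertible P x P matrix (its determinant is 3)
Mmat = np.array([[2.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0],
                 [1.0, 0.0, 1.0]])

F_new = Mmat @ F                            # F' = M F
C_new = C @ np.linalg.inv(Mmat)             # C' = C M^{-1}

same_data = np.allclose(S, C_new @ F_new)   # C' F' reproduces S
print(same_data)
```

The data alone cannot distinguish (C, F) from (C', F'), which is why prior information is needed.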

one possible type of prior information

factors should contain mainly just a few elements

example of rocks and minerals

rocks contain minerals

minerals contain elements

Mineral      Composition
Quartz       SiO2
Rutile       TiO2
Anorthite    CaAl2Si2O8
Forsterite   Mg2SiO4

the minerals play the role of factors

most of these minerals are “simple” in the sense that each contains just a few elements

spiky factors

factors containing mostly just a few elements

How to quantify spikiness?

variance as a measure of spikiness

modification for factor analysis

depends on the square, so positive and negative values are treated the same
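One way to write these two measures, assuming the usual varimax definition in which the spikiness of a factor is the variance of its squared elements. The symbols $\sigma_d^2$ and $\sigma_f^2$, and the lengths $N$ and $M$, are notation chosen for this sketch.

```latex
% variance of a data vector d of length N
\sigma_d^2 = \frac{1}{N}\sum_{i=1}^{N}\left(d_i-\bar d\right)^2
           = \frac{1}{N^2}\left[\, N\sum_{i=1}^{N} d_i^2 - \Big(\sum_{i=1}^{N} d_i\Big)^2 \right]

% modification for a factor f of length M: replace d_i by f_i^2
\sigma_f^2 = \frac{1}{M^2}\left[\, M\sum_{i=1}^{M} f_i^4 - \Big(\sum_{i=1}^{M} f_i^2\Big)^2 \right]
```

Because only $f_i^2$ and $f_i^4$ appear, changing the sign of any element leaves $\sigma_f^2$ unchanged.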

$f^{(1)} = [1, 0, 1, 0, 1, 0]^T$

is much spikier than

$f^{(2)} = [1, 1, 1, 1, 1, 1]^T$

$f^{(2)} = [1, 1, 1, 1, 1, 1]^T$

is just as spiky as

$f^{(3)} = [1, -1, 1, -1, -1, 1]^T$
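These comparisons can be reproduced with a few lines of NumPy. The helper name `spikiness` is hypothetical, and it assumes that spikiness is measured as the variance of the squared elements of a factor.

```python
import numpy as np

def spikiness(f):
    """Variance of the squared elements of f (assumed spikiness measure)."""
    f = np.asarray(f, dtype=float)
    return np.var(f ** 2)

f1 = [1, 0, 1, 0, 1, 0]
f2 = [1, 1, 1, 1, 1, 1]
f3 = [1, -1, 1, -1, -1, 1]

print(spikiness(f1), spikiness(f2), spikiness(f3))   # 0.25 0.0 0.0
```

f1 scores higher than f2, while f2 and f3 score the same because their squared elements are identical.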

“varimax” procedure

find spiky factors without changing P

start with P svd() factors

rotate pairs of them in their plane by angle θ to maximize the overall spikiness

[figure: factors fA and fB rotated in their plane to give f’A and f’B]

determine θ by maximizing

after tedious trig the solution can be shown to be

and the new factors are

in this example A=3 and B=5

now one repeats for every pair of factors

and then iterates the whole process several times

until the whole set of factors is as spiky as possible
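The pairwise procedure can be sketched as follows. This is not the lecture's implementation: instead of the closed-form angle, the sketch picks θ by a simple grid search, and it assumes spikiness is the variance of the squared elements. The names `spikiness`, `rotate_pair` and `varimax_grid`, and the small test factors, are hypothetical.

```python
import numpy as np

def spikiness(f):
    # variance of the squared elements (assumed spikiness measure)
    return np.var(np.asarray(f, dtype=float) ** 2)

def rotate_pair(fA, fB, theta):
    # rotate two factors in their common plane by angle theta
    c, s = np.cos(theta), np.sin(theta)
    return c * fA + s * fB, -s * fA + c * fB

def varimax_grid(F, n_sweeps=10, n_angles=721):
    F = np.array(F, dtype=float)              # copy; rows are factors
    P = F.shape[0]
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_angles)
    for _ in range(n_sweeps):                 # iterate the whole process
        for A in range(P):                    # every pair of factors
            for B in range(A + 1, P):
                scores = [sum(spikiness(f) for f in rotate_pair(F[A], F[B], th))
                          for th in thetas]
                best = thetas[int(np.argmax(scores))]
                F[A], F[B] = rotate_pair(F[A], F[B], best)
    return F

# made-up test: three one-element factors, mixed by an orthogonal matrix Q
F_spiky = np.eye(3, 6)
Q, _ = np.linalg.qr(np.array([[2.0, 1.0, 0.0],
                              [0.0, 1.0, 1.0],
                              [1.0, 0.0, 1.0]]))
F_old = Q @ F_spiky
F_new = varimax_grid(F_old)

before = sum(spikiness(f) for f in F_old)
after = sum(spikiness(f) for f in F_new)
print(before, after)
```

Each rotation touches only two factors and keeps them orthonormal, so P is unchanged and maximizing the pair's spikiness never lowers the total.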

example: Atlantic Rock dataset

[figure: old factors f2, f3, f4, f5 beside new factors f’2, f’3, f’4, f’5; the rows are the elements SiO2, TiO2, Al2O3, FeO(total), MgO, CaO, Na2O, K2O]

old factors: not so spiky

new factors: spiky

Part 2: Empirical Orthogonal Functions

row number in the sample matrix could be meaningful

example: samples collected at a succession of times

time

column number in the sample matrix could be meaningful

example: concentration of the same chemical element at a sequence of positions

distance

S = CF becomes

$s(t_i, x_j) = \sum_{k=1}^{P} c_k(t_i)\, f^{(k)}(x_j)$

$c_k(t_i)$ carries the time dependence and $f^{(k)}(x_j)$ carries the distance dependence

each loading: a temporal pattern of variability of the corresponding factor

each factor: a spatial pattern of variability of the element

there are P patterns and they are sorted into order of importance

factors now called EOF’s (empirical orthogonal functions)
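A minimal sketch of how EOFs and their loadings might be computed with an SVD. The synthetic space-time data, the grid sizes and the tolerance used to choose P are assumptions made for illustration.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 40)                 # 40 times (rows of S)
x = np.linspace(0.0, 1.0, 25)                 # 25 positions (columns of S)

# synthetic data: two spatial bumps, each with its own time behavior
S = (3.0 * np.outer(np.sin(2 * np.pi * t), np.exp(-((x - 0.3) / 0.1) ** 2))
     + 1.0 * np.outer(np.cos(6 * np.pi * t), np.exp(-((x - 0.7) / 0.1) ** 2)))

U, lam, Vt = np.linalg.svd(S, full_matrices=False)

P = int(np.sum(lam > 1e-10 * lam[0]))         # keep non-negligible singular values
C = U[:, :P] * lam[:P]                        # loadings: column k is a time series
F = Vt[:P, :]                                 # EOFs: row k is a spatial pattern

print(P, np.allclose(S, C @ F))
```

NumPy returns the singular values in decreasing order, so the rows of F are already sorted by importance.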

example 1

hypothetical mountain profiles

what are the most important spatial patterns that characterize mountain profiles?

this problem has space but not time

$s(x_j, i) = \sum_{k=1}^{P} c_{ik}\, f^{(k)}(x_j)$

factors are spatial patterns that add together to make mountain profiles

[figure: 15 hypothetical mountain profiles, s1 through s15, each plotted against index i]

elements: elevations ordered by distance along profile

[figure: EOF 1, EOF 2 and EOF 3, that is $f_i^{(1)}$, $f_i^{(2)}$ and $f_i^{(3)}$ plotted against index i, with λ1 = 38, λ2 = 13, λ3 = 7]

[figure: scatter plot of factor loading $C_{i2}$ and factor loading $C_{i3}$]

example 2

spatial-temporal patterns

(synthetic data)

the data

[figure: 25 images, numbered 1 to 25 in a 5-by-5 array; each image is the spatial pattern in (x, y) at a single time, starting at t=1, and time increases with image number]

[figure: a 2D image unfolded into a vector s]

need to unfold each 2D image into a vector
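The unfolding step might look like this in NumPy. The image size and the random stand-in data are hypothetical. Row-major `reshape` is one possible unfolding; any ordering works as long as the same one is used to fold the EOFs back into images.

```python
import numpy as np

n_times, ny, nx = 25, 20, 20                   # hypothetical sizes
rng = np.random.default_rng(1)
images = rng.normal(size=(n_times, ny, nx))    # stand-in for the 25 images

# unfold: each 2D image becomes one row of the sample matrix S
S = images.reshape(n_times, ny * nx)

U, lam, Vt = np.linalg.svd(S, full_matrices=False)

# fold each EOF (a row of Vt) back into a 2D image for plotting
eof_images = Vt.reshape(-1, ny, nx)

restored = S.reshape(n_times, ny, nx)          # folding undoes the unfolding
print(S.shape, eof_images.shape)
```

With this ordering, pixel (row r, column c) of an image lands in column r*nx + c of S.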

[figure: singular values λi plotted against index i; p=3]

[figure: (A) EOF 1, EOF 2 and EOF 3; (B) loading 1, loading 2 and loading 3, each plotted against time t]

example 3

spatial-temporal patterns

(actual data)

sea surface temperature in the Pacific Ocean

[figure: map of sea surface temperature (black = warm) in the equatorial Pacific Ocean, latitude 29S to 29N, longitude 124E to 290E]

CAC Sea Surface Temperature

the image is 30 by 84 pixels in size, or 2520 pixels total

to use svd(), the image must be unwrapped into a vector of length 2520

[figure: the sample matrix S, with 399 times along one side and 2520 positions in the equatorial Pacific Ocean along the other]

“element” means temperature

[figure: singular values plotted against index i]

no clear cutoff for P, but the first 12 singular values are considerably larger than the rest

using SVD to approximate data

$S = C_M F_M$: with M EOF’s, the data is fit exactly

$S = C_P F_P$: with P chosen to exclude only zero singular values, the data is fit exactly

$S \approx C_{P'} F_{P'}$: with P’ < P, small non-zero singular values are excluded too, and the data is fit only approximately
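The three cases can be illustrated on a small synthetic matrix. Everything here is made up for the sketch: the sizes, the chosen singular values, the tolerance that defines "zero", and the helper name `approx`.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 30, 12                                    # samples, elements

# build S with known singular values: three large, three small, the rest zero
s_true = np.array([10.0, 8.0, 6.0, 0.3, 0.2, 0.1])
Uo, _ = np.linalg.qr(rng.normal(size=(N, 6)))    # orthonormal columns
Vo, _ = np.linalg.qr(rng.normal(size=(M, 6)))
S = (Uo * s_true) @ Vo.T

U, lam, Vt = np.linalg.svd(S, full_matrices=False)

def approx(k):
    # C_k F_k built from the first k singular values
    return (U[:, :k] * lam[:k]) @ Vt[:k, :]

P = int(np.sum(lam > 1e-8 * lam[0]))             # exclude only (numerically) zero values
P_prime = 3                                      # also exclude the small non-zero values

err_M = np.linalg.norm(S - approx(M))            # all M EOFs: exact fit
err_P = np.linalg.norm(S - approx(P))            # P EOFs: still exact
err_Pp = np.linalg.norm(S - approx(P_prime))     # P' < P: approximate fit
print(P, err_M, err_P, err_Pp)
```

The misfit for P’ equals the root sum of squares of the discarded singular values, so it stays small when only small singular values are dropped.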

[figure: A) original; B) based on first 5 EOF’s]
