some 2-d data (x)

25
Some 2-D Data (X) Singular Value Decomposition (SVD) X = UV T Eigenanalysis XX T = C; CE = E 1) Eigenvector s 2) Eigenvalues 3) Principle Components 3 “Products” of Principle Component Analysis

Upload: zion

Post on 18-Jan-2016

25 views

Category:

Documents


0 download

DESCRIPTION

3 “Products” of Principle Component Analysis. Singular Value Decomposition (SVD) X = U S V T. 1) Eigenvectors. Some 2-D Data (X). 2) Eigenvalues. 3) Principle Components. Eigenanalysis XX T = C; CE = L E. Examples for Today. 1) Eigenvectors – Variations explained in the horizontal. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Some 2-D Data (X)

Some 2-D Data

(X)

Singular Value Decomposition (SVD)

X = UVT

Eigenanalysis XXT = C; CE = E

1) Eigenvectors2) Eigenvalues3) Principle Components

3 “Products” of Principle Component Analysis

Page 2: Some 2-D Data (X)

Photo

(X)

1) Eigenvectors – Variations explained in the horizontal

2) Eigenvalues - % of Variance explained3) Principle Components Variations explained in the vertical

Examples for Today

Fake and Real Space-Time Data

(X)

1) Eigenvectors – Variations explained in space (MAPS)

2) Eigenvalues - % of Variance explained (spectrum)3) Principle Components Variations explained in the time (TIMESERIES)

Page 3: Some 2-D Data (X)

Examples for Today

Fake and Real Space-Time Data

(X)

1) Eigenvectors – Variations explained in space (MAPS)

2) Eigenvalues - % of Variance explained (spectrum)

3) Principle Components Variations explained in the time (TIMESERIES)

Page 4: Some 2-D Data (X)

Eigenvectors, Eigenvalues, PC’s

• Eigenvectors explain variance in one dimension; Principle components explain variance in the other dimension.

• Each eigenvector has a corresponding principle component. The PAIR define a mode that explains variance.

• Each eigenvector/PC pair has an associated eigenvalue which relates to how much of the total variance is explained by that mode.

Page 5: Some 2-D Data (X)

EOF’s and PC’s for geophysical data

• For geophys. data, we often set it up so that the eigenvectors give us spatial structures (EOFs – empirical orthogonal functions) and the PC’s give us an associated time series (principle components) .

• The EOF’s and PC’s are constructed to efficiently explain the maximum amount of variance in the data set.

• In general, the majority of the variance in a data set can be explained with just a few EOFs.

Page 6: Some 2-D Data (X)

EOF’s and PC’s for geophysical data

• By construction, the EOFs are orthogonal to each other, as are the PCs.

• Provide an ‘objective method’ for finding structure in a data set, but interpretation requires physical facts or intuition.

Page 7: Some 2-D Data (X)

Variance of Northern Hemisphere Sea Level Pressure Field in Winter

Page 8: Some 2-D Data (X)

EOF 1: AO/NAM (23% expl).

EOF 2: PNA (13% expl.)

EOF 3: non-distinct(10% expl.)

PCA for the REAL Sea-Level Pressure Field

Page 9: Some 2-D Data (X)

EOF 1 (AO/NAM) EOF 2 (PNA) EOF 3 (?)

PC 1 (AO/NAM)

PC 2 (PNA)

PC 3 (?)

Page 10: Some 2-D Data (X)

QuickTime™ and aVideo decompressor

are needed to see this picture.

PCA: An example using contrived data

Page 11: Some 2-D Data (X)

QuickTime™ and aVideo decompressor

are needed to see this picture.

EOF 1

PC 1

EOF 2

PC 2

Page 12: Some 2-D Data (X)

QuickTime™ and aVideo decompressor

are needed to see this picture.

EOF 1 - 60% variance expl.

PC 1

EOF 2 - 40% variance expl.

PC 2

Page 13: Some 2-D Data (X)

EOF 1 - 60% variance expl.

PC 1

EOF 2 - 40% variance expl.

PC 2

Eigenvalue Spectrum

Page 14: Some 2-D Data (X)

QuickTime™ and aVideo decompressor

are needed to see this picture.

PCA: Example 2 based on contrived data

Page 15: Some 2-D Data (X)

QuickTime™ and aVideo decompressor

are needed to see this picture.

EOF 1 - 65% variance expl.

PC 1

EOF 2 - 35% variance expl.

PC 2

Page 16: Some 2-D Data (X)

Significance• Each EOF / PC pair

comes with an associated eigenvalue

• The normalized eigenvalues (each eigenvalue divided by the sum of all of the eigenvalues) tells you the percent of variance explained by that EOF / PC pair.

• Eigenvalues need to be well separated from each other to be considered distinct modes.

First 25 Eigenvalues for DJF SLP

Page 17: Some 2-D Data (X)

Significance: The North Test

• North et al (1982) provide estimate of error in estimating eigenvalues

• Requires estimating DOF of the data set.

• If eigenvalues overlap, those EOFs cannot be considered distinct. Any linear combination of overlapping EOFs is an equally viable structure.

First 25 Eigenvalues for DJF SLP

These two just barely overlap. Need physical intuition to help judge.

Example of overlapping eigenvalues

Page 18: Some 2-D Data (X)

Validity of PCA modes: Questions to ask

• Is the variance explained more than expected for null hypothesis (red noise, white noise, etc.)?

• Do we have an a priori reason for expecting this structure? Does it fit with a physical theory?

• Are the EOF’s sensitive to choice of spatial domain?

• Are the EOF’s sensitive to choice of sample? If data set is subdivided (in time), do you still get the same EOF’s?

Page 19: Some 2-D Data (X)

Regression Maps vs. Correlation Maps

PNA - Correlation map (r values of each point with index)

PNA - Regression map (meters/std deviation of index)

Page 20: Some 2-D Data (X)
Page 21: Some 2-D Data (X)

Practical Considerations

• EOFs are easy to calculate, difficult to interpret. There are no hard and fast rules, physical intuition is a must.

• Due to the constraint of orthogonality, EOFs tend to create wave-like structures, even in data sets of pure noise. So pretty… so suggestive… so meaningless. Beware of this.

Page 22: Some 2-D Data (X)

Practical Considerations

• EOF’s are created using linear methods, so they only capture linear relationships.

• By nature, EOF’s give are fixed spatial patterns which only vary in strength and in sign. E.g., the ‘positive’ phase of an EOF looks exactly like the negative phase, just with its sign changed. Many phenomena in the climate system don’t exhibit this kind of symmetry, so EOF’s can’t resolve them properly.

Page 23: Some 2-D Data (X)
Page 24: Some 2-D Data (X)
Page 25: Some 2-D Data (X)