some 2-d data (x)

Some 2-D Data

(X)

Singular Value Decomposition (SVD)

X = UVT

Eigenanalysis XXT = C; CE = E

1) Eigenvectors2) Eigenvalues3) Principle Components

3 “Products” of Principle Component Analysis

Photo

(X)

1) Eigenvectors – Variations explained in the horizontal

2) Eigenvalues - % of Variance explained3) Principle Components Variations explained in the vertical

Examples for Today

Fake and Real Space-Time Data

(X)

1) Eigenvectors – Variations explained in space (MAPS)

2) Eigenvalues - % of Variance explained (spectrum)3) Principle Components Variations explained in the time (TIMESERIES)

Examples for Today

Fake and Real Space-Time Data

(X)

1) Eigenvectors – Variations explained in space (MAPS)

2) Eigenvalues - % of Variance explained (spectrum)

3) Principle Components Variations explained in the time (TIMESERIES)

Eigenvectors, Eigenvalues, PC’s

• Eigenvectors explain variance in one dimension; Principle components explain variance in the other dimension.

• Each eigenvector has a corresponding principle component. The PAIR define a mode that explains variance.

• Each eigenvector/PC pair has an associated eigenvalue which relates to how much of the total variance is explained by that mode.

EOF’s and PC’s for geophysical data

• For geophys. data, we often set it up so that the eigenvectors give us spatial structures (EOFs – empirical orthogonal functions) and the PC’s give us an associated time series (principle components) .

• The EOF’s and PC’s are constructed to efficiently explain the maximum amount of variance in the data set.

• In general, the majority of the variance in a data set can be explained with just a few EOFs.

EOF’s and PC’s for geophysical data

• By construction, the EOFs are orthogonal to each other, as are the PCs.

• Provide an ‘objective method’ for finding structure in a data set, but interpretation requires physical facts or intuition.

Variance of Northern Hemisphere Sea Level Pressure Field in Winter

EOF 1: AO/NAM (23% expl).

EOF 2: PNA (13% expl.)

EOF 3: non-distinct(10% expl.)

PCA for the REAL Sea-Level Pressure Field

EOF 1 (AO/NAM) EOF 2 (PNA) EOF 3 (?)

PC 1 (AO/NAM)

PC 2 (PNA)

PC 3 (?)

QuickTime™ and aVideo decompressor

are needed to see this picture.

PCA: An example using contrived data



EOF 1

PC 1

EOF 2

PC 2



EOF 1 - 60% variance expl.

PC 1


PC 2


PC 1


PC 2

Eigenvalue Spectrum



PCA: Example 2 based on contrived data




PC 1


PC 2

Significance• Each EOF / PC pair

comes with an associated eigenvalue

• The normalized eigenvalues (each eigenvalue divided by the sum of all of the eigenvalues) tells you the percent of variance explained by that EOF / PC pair.

• Eigenvalues need to be well separated from each other to be considered distinct modes.

First 25 Eigenvalues for DJF SLP

Significance: The North Test

• North et al (1982) provide estimate of error in estimating eigenvalues

• Requires estimating DOF of the data set.

• If eigenvalues overlap, those EOFs cannot be considered distinct. Any linear combination of overlapping EOFs is an equally viable structure.

First 25 Eigenvalues for DJF SLP

These two just barely overlap. Need physical intuition to help judge.

Example of overlapping eigenvalues

Validity of PCA modes: Questions to ask

• Is the variance explained more than expected for null hypothesis (red noise, white noise, etc.)?

• Do we have an a priori reason for expecting this structure? Does it fit with a physical theory?

• Are the EOF’s sensitive to choice of spatial domain?

• Are the EOF’s sensitive to choice of sample? If data set is subdivided (in time), do you still get the same EOF’s?

Regression Maps vs. Correlation Maps

PNA - Correlation map (r values of each point with index)

PNA - Regression map (meters/std deviation of index)

Practical Considerations

• EOFs are easy to calculate, difficult to interpret. There are no hard and fast rules, physical intuition is a must.

• Due to the constraint of orthogonality, EOFs tend to create wave-like structures, even in data sets of pure noise. So pretty… so suggestive… so meaningless. Beware of this.

Practical Considerations

• EOF’s are created using linear methods, so they only capture linear relationships.

• By nature, EOF’s give are fixed spatial patterns which only vary in strength and in sign. E.g., the ‘positive’ phase of an EOF looks exactly like the negative phase, just with its sign changed. Many phenomena in the climate system don’t exhibit this kind of symmetry, so EOF’s can’t resolve them properly.

some 2-d data (x)

Documents

percent of variance

total variance

contrived data eof

data set

aonam eof

pna eof

aonam pc

pna pc