

Dimension Reduction and Classification Using PCA, Factor Analysis and Discriminant Functions - A Short Overview

Dipayan Maiti

Laboratory for Interdisciplinary Statistical Analysis

Department of Statistics

Virginia Tech

http://www.stat.vt.edu/consult/

October 28, 2008


Outline

Principal Components
    The Concept
    Example
    Applications

Factor Analysis
    The Concept
    Example
    Applications - Discussions
    Difference between Principal Components and Factor Analysis

Discriminant Functions
    The Concept
    Example
    Applications


The Problem

What should you do when you have too many predictors in a model?

For example, you have expression-level data for 1000 genes!

Or you have hundreds of customer attributes and you are interested in building a predictive model based on them!

Or you have second-by-second stock market data over a trading day for stocks!

Or you have survey data where multiple questions might capture the same kind of information (highly correlated)


The Cars Example

A researcher wants to build a model to find out which variables are most significant in predicting the demand for cars but believes that a lot of variables have high correlation and the study can be effectively done on a small number of variables without losing much information.


The Problem

Given a data set with $N$ observations of the form $X = (x_1, \dots, x_p)$ for a very large $p$.

Figure: Data with 11 possible predictors


The Problem

How do we reduce the number of columns in $X$ but still not throw away too much information?


The Problem

JMP → Analyze → Multivariate Methods → Principal Components → Multivariate (Tab) → Scatterplot Matrix


The Problem

Notice the highly correlated variables!

We will attempt to explain most of the variability in the data using only a small number of principal components (parsimony), if possible.


The Geometric Interpretation

We intend to come up with rotations and projections in $p$ dimensions that capture most of the variability.

Figure: Plot of Principal Components in three dimensions


The Geometric Interpretation - Eigens

We can write the principal components as linear combinations of the predictors:

$Y_1 = a_1^\top X$

$\vdots$

$Y_p = a_p^\top X$

such that the $Y$s are uncorrelated and the variance of each $Y$ is as large as possible.

We find the eigenvalues $\lambda$ of the covariance (or correlation) matrix of the data and rank them by size.

The $a$'s are the corresponding eigenvectors, and each eigenvalue equals the variance of its principal component.

Since total population variance $= \lambda_1 + \cdots + \lambda_p$,

proportion of variance explained by the $k$th principal component $= \dfrac{\lambda_k}{\lambda_1 + \cdots + \lambda_p}$
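The eigenvalue calculation can be sketched in a few lines of Python. This is only an illustration, assuming NumPy is available; the function name `pca_eigen` and the synthetic data are made up for the example and are not the cars data. It works from the sample correlation matrix, so every predictor is put on the same scale first.

```python
import numpy as np

def pca_eigen(X):
    """PCA via eigendecomposition of the sample correlation matrix.

    X is an (N, p) array: N observations, p predictors.
    Returns eigenvalues (largest first), eigenvectors (column k is a_k),
    and the proportion of variance explained by each component.
    """
    # Standardize so each predictor has mean 0 and variance 1; the covariance
    # of the standardized data is the correlation matrix of the original data.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.cov(Z, rowvar=False)

    # eigh is intended for symmetric matrices and returns eigenvalues in
    # ascending order, so reorder them from largest to smallest.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, explained


# Synthetic data (an assumption for illustration): three predictors,
# two of which are nearly copies of each other.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
X = np.column_stack([
    base,
    base + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])

eigvals, eigvecs, explained = pca_eigen(X)

# Principal component scores: the Y_k for every observation.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ eigvecs

print("eigenvalues:", np.round(eigvals, 3))
print("proportion explained:", np.round(explained, 3))
```

Because two of the three synthetic predictors are nearly identical, the first component should account for most of the variance, and the covariance matrix of `scores` should be diagonal with the eigenvalues on the diagonal.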


The Geometric Interpretation - Eigens Summary

Principal components are determined by our predictors

There is a principal component for every eigenvalue

The size of an eigenvalue measures how much variation the corresponding principal component explains


The Geometric Interpretation - Eigens Summary

By choosing the first few principal components (and hence eigenvalues) we might be able to explain a lot of the variation among the predictors (not all!)

Hence we throw away some information, but hopefully not much


The Cars Example

We have data about 387 cars with the following variables:

Suggested Retail Price

Invoice Price

Engine Size (liters)

Number of Cylinders (= -1 if rotary engine)

Horsepower

City Miles Per Gallon

Highway Miles Per Gallon

Weight (pounds)

Wheel Base (inches)

Length (inches)

Width (inches)


The Cars Example Again

A researcher wants to build a model to find out which variables are most significant in predicting the demand for cars but believes that a lot of variables have high correlation and the study can be effectively done on a small number of variables without losing much information.

But how do we choose a smaller number of predictors?

Principal Components Analysis!


The Cars Example

Use JMP → Analyze → Multivariate Methods → Principal Components


The Cars Example

Let us first look at the correlations between the variables.

Figure: Correlations


The Cars Example

What about the principal components? Can we interpret them?

Figure: Principal Components


The Cars Example

How many principal components do we need? How much of the variation is explained?
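One common way to answer both questions is to look at the cumulative proportion of variance explained and keep the smallest number of components that passes a chosen threshold. The sketch below is illustrative only: the function name `components_needed`, the 90% threshold and the eigenvalues are assumptions, not values from the cars data.

```python
from itertools import accumulate

def components_needed(eigenvalues, threshold=0.90):
    """Smallest k whose first k components explain at least `threshold`
    of the total variance. The threshold is a judgment call, not a rule."""
    total = sum(eigenvalues)
    ranked = sorted(eigenvalues, reverse=True)
    cumulative = [c / total for c in accumulate(ranked)]
    for k, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return k, cumulative
    return len(ranked), cumulative


# Hypothetical eigenvalues for 11 standardized predictors (they sum to 11);
# these are NOT the eigenvalues of the cars data.
eigenvalues = [7.1, 1.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.1, 0.05, 0.03, 0.02]

k, cumulative = components_needed(eigenvalues, threshold=0.90)
print("components kept:", k)
print("cumulative share:", [round(c, 3) for c in cumulative[:k]])
```

A scree plot of the ranked eigenvalues serves the same purpose visually; the threshold version is just easier to show in a few lines.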


Key Points

Principal components are functions of the predictors

The first few principal components can often capture almost all of the variability in the data


Applications - Discussion

To reduce the number of predictors

As a first step for a predictive model where we would like to remove correlated variables

General dimension reduction - expecting a low-dimensional structure where higher dimensions are basically noise


The Problem

Sometimes the inherent structure of the data motivates the researcher to group the data based on some unseen underlying factors.

This inherent structure can be identified through the correlation matrix of $X$.


The Subject Scores Problem

Consider examination scores in 6 subjects for 220 male students. The 6 subjects are Latin, English, History, Arithmetic, Algebra and Geometry. Consider the correlation matrix for the scores (lower triangle shown):

            Latin  English  History  Arithmetic  Algebra  Geometry
Latin       1.000
English      .439    1.000
History      .410     .351    1.000
Arithmetic   .288     .354     .164       1.000
Algebra      .329     .320     .190        .595    1.000
Geometry     .248     .329     .181        .470     .464     1.000


The Problem

The researcher believes that the subject scores will be correlated amongst themselves in groups.

A possible hypothesis might be that there are probably two underlying factors for the students' scores - a factor that captures the liberal arts scores and another that captures the science scores.

But how do we verify such a hypothesis? Factor Analysis!
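Before fitting anything, a quick informal check of the grouping hypothesis is to compare average correlations within and between the two suspected groups. The sketch below uses the correlation matrix from the subject scores example; the split into "liberal arts" and "science" subjects is the researcher's hypothesis, and the helper name `corr` is made up for the illustration.

```python
from itertools import combinations
from statistics import mean

subjects = ["Latin", "English", "History", "Arithmetic", "Algebra", "Geometry"]

# Lower triangle of the correlation matrix, row by row.
lower = [
    [1.000],
    [0.439, 1.000],
    [0.410, 0.351, 1.000],
    [0.288, 0.354, 0.164, 1.000],
    [0.329, 0.320, 0.190, 0.595, 1.000],
    [0.248, 0.329, 0.181, 0.470, 0.464, 1.000],
]

def corr(i, j):
    """Look up the correlation between subjects i and j in either order."""
    i, j = max(i, j), min(i, j)
    return lower[i][j]

# Assumed grouping, taken from the researcher's hypothesis.
liberal_arts = [0, 1, 2]   # Latin, English, History
science = [3, 4, 5]        # Arithmetic, Algebra, Geometry

within_arts = mean(corr(i, j) for i, j in combinations(liberal_arts, 2))
within_science = mean(corr(i, j) for i, j in combinations(science, 2))
between = mean(corr(i, j) for i in liberal_arts for j in science)

print(f"within liberal arts: {within_arts:.3f}")
print(f"within science:      {within_science:.3f}")
print(f"between groups:      {between:.3f}")
```

Higher averages within groups than between them are consistent with a two-factor structure, but this is only a descriptive check and not a substitute for the factor analysis itself.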


Factor Loadings

For our problem the researcher thinks that there are two underlying factors.

The underlying factors correspond to two different loadings on the 6 subjects.

$\text{Latin} = L_{11} F_1 + L_{12} F_2 + \epsilon_1$

$\text{English} = L_{21} F_1 + L_{22} F_2 + \epsilon_2$

$\vdots$

$\text{Geometry} = L_{61} F_1 + L_{62} F_2 + \epsilon_6$

where $\epsilon_i$ is the subject-specific error term. The loadings $L_{ij}$ will hopefully help us interpret the factors.


The Approach

Data has underlying factors

→ researcher determines number of factors

→ factor loadings to be obtained through the covariance matrix

→ researcher interprets factors based on loadings


Factor Loadings for the Subject Scores Example

Variable        F1      F2   Communalities
Latin         .553    .429            .490
English       .568    .288            .406
History       .392    .450            .356
Arithmetic    .740   -.273            .623
Algebra       .724   -.211            .569
Geometry      .595   -.132            .372

The factor loadings do not give us any immediately identifiable groups or factor interpretation. Or do they?

Communalities give a measure of how much of the variance of each variable is explained by the factor structure.
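For a model with uncorrelated factors and standardized variables, the communality of a variable is the sum of its squared loadings, and the correlation the model implies between two variables is the sum of the products of their loadings. The short check below applies both formulas to the loadings in the table; the dictionary layout and the helper name `implied_corr` are assumptions made for the illustration.

```python
# Unrotated loadings (F1, F2) and communalities as printed in the table.
loadings = {
    "Latin":      (0.553,  0.429),
    "English":    (0.568,  0.288),
    "History":    (0.392,  0.450),
    "Arithmetic": (0.740, -0.273),
    "Algebra":    (0.724, -0.211),
    "Geometry":   (0.595, -0.132),
}
printed = {
    "Latin": 0.490, "English": 0.406, "History": 0.356,
    "Arithmetic": 0.623, "Algebra": 0.569, "Geometry": 0.372,
}

# Communality of a variable = sum of its squared loadings.
communalities = {name: l1 ** 2 + l2 ** 2 for name, (l1, l2) in loadings.items()}

def implied_corr(a, b):
    """Correlation between two variables implied by the two-factor model."""
    return sum(x * y for x, y in zip(loadings[a], loadings[b]))

for name, h2 in communalities.items():
    print(f"{name:<11} computed {h2:.3f}   printed {printed[name]:.3f}")

print(f"Latin-English implied by the model: "
      f"{implied_corr('Latin', 'English'):.3f} (observed 0.439)")
```

The computed communalities agree with the printed column to within rounding of the three-decimal loadings, and the implied Latin-English correlation is close to the observed value.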


The Factor Rotation

The factors are not immediately identifiable

What do we do now?

Factor structure in terms of variance explained remains unchanged if we rotate the factors

Let's rotate and see if the factor loadings become interpretable
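The claim that rotation leaves the variance explained unchanged can be checked numerically. The sketch below rotates the unrotated loadings from the subject scores example by a fixed angle. The 20-degree angle is an assumption chosen by eye for this illustration; software would normally choose the angle by optimizing a criterion such as varimax.

```python
import math

# Unrotated loadings (F1, F2) from the subject scores example.
loadings = {
    "Latin":      (0.553,  0.429),
    "English":    (0.568,  0.288),
    "History":    (0.392,  0.450),
    "Arithmetic": (0.740, -0.273),
    "Algebra":    (0.724, -0.211),
    "Geometry":   (0.595, -0.132),
}

def rotate(loadings, degrees):
    """Apply the same rigid rotation to every (F1, F2) loading pair."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return {name: (l1 * c - l2 * s, l1 * s + l2 * c)
            for name, (l1, l2) in loadings.items()}

# Hypothetical angle, picked by eye for this example.
rotated = rotate(loadings, 20.0)

for name in loadings:
    before = sum(l ** 2 for l in loadings[name])
    after = sum(l ** 2 for l in rotated[name])
    r1, r2 = rotated[name]
    print(f"{name:<11} F1={r1:6.3f}  F2={r2:6.3f}  "
          f"communality {before:.3f} -> {after:.3f}")
```

Each communality is the squared length of a loading pair, so a rigid rotation cannot change it. After this rotation the three mathematics subjects load almost entirely on the first factor, while Latin and History load more heavily on the second.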


Rotated Factor Loadings for the Subject Scores Example

Variable        F1      F2   Communalities
Latin         .369    .594            .490
English       .433    .467            .406
History       .211    .558            .356
Arithmetic    .789    .001            .623
Algebra       .752    .054            .569
Geometry      .604    .083            .372

Rotation makes the two factors immediately identifiable


Rotated Factor Loadings Plot

Figure: Plot of factor loadings with two factors for the scores example


Factor Analysis Approach - Summary

Decide on number of factors

Obtain factor loadings for the variables

Interpret factors

If the interpretation is not obvious, rotate the factors and check the loadings again


Applications

Psychometrics, psychology, human factors - identify "factors" that explain a variety of results on different tests

Marketing - identify the salient attributes consumers use to evaluate products in a given category

Physical sciences, geochemistry, ecology, and hydrochemistry


Differences

Principal components capture most of the variability in the data using fewer dimensions than the space in which the data live

Hence the principal components lie in the same space as the data

Factor analysis conceptually searches for underlying but unobserved factors that explain the correlations in the data

Hence factors lie in a different space than the data


The Problem

Given data for two groups, $X_1 = \{x_{11}, \dots, x_{1p}\}$ and $X_2 = \{x_{21}, \dots, x_{2p}\}$.

The $x$s can be thought of as explanatory variables.

We would like to come up with a classification rule based on the data.

When we see new data, we can use the classification rule to assign it to one of the two groups.


The Classification Criteria

The rule should be based on some criterion.

Our criterion → minimize the Expected Misclassification Cost

Expected Misclassification Cost depends on

Information about prior classification probability

Cost of misclassifying

Note: the cost of misclassifying can be asymmetric.

In the absence of prior beliefs about classification probability we will assume a 50:50 chance.
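One common way to write this criterion is shown below. The notation is an assumption for illustration and does not appear on the slides: $p_1$ and $p_2$ are the prior probabilities of the two groups, $c(2 \mid 1)$ is the cost of assigning a group 1 item to group 2, $P(2 \mid 1)$ is the probability of making that mistake, and $c(1 \mid 2)$ and $P(1 \mid 2)$ are defined symmetrically.

```latex
\mathrm{ECM} = c(2 \mid 1)\, P(2 \mid 1)\, p_1 + c(1 \mid 2)\, P(1 \mid 2)\, p_2
```

With equal costs and a 50:50 prior, minimizing this quantity is the same as minimizing the overall probability of misclassification.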


The Classification Criteria

We intend to come up with a hyperplane in $p$ dimensions that separates the two groups while minimizing the cost.

Figure: Plot of discriminant function in two dimensions with correctly classified and misclassified data


Normal populations with equal variances

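When we minimize the expected misclassification cost, the classification rule for an arbitrary data point $X$ is:

if $\dfrac{f_1(X)}{f_2(X)} >$ (cost ratio) $\times$ (prior probability ratio) then group 1

else group 2

Here the cost ratio is the cost of wrongly assigning a group 2 point to group 1 divided by the cost of the opposite error, and the prior probability ratio is the prior for group 2 divided by the prior for group 1.

Suppose that our two groups have normal densities $f_1(x)$ and $f_2(x)$ with means $\mu_1$ and $\mu_2$ and equal covariance $\Sigma$.

In this case the classification rule reduces to checking on which side of a linear discriminant function the arbitrary data point $X$ lies.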
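Taking logs of the density ratio for two normal populations with a common covariance gives a linear rule: assign $x$ to group 1 when $w^\top x \ge \tfrac{1}{2} w^\top (\mu_1 + \mu_2) + \ln(\text{cost ratio} \times \text{prior ratio})$, with $w = \Sigma^{-1}(\mu_1 - \mu_2)$. The sketch below estimates the means and pooled covariance from data and applies that rule. It assumes NumPy is available; the function names and the synthetic data are made up for the illustration and are not the hemophilia data or JMP's implementation.

```python
import numpy as np

def fit_linear_rule(X1, X2, prior1=0.5, cost_ratio=1.0):
    """Two-group linear discriminant rule under a common covariance.

    cost_ratio is the cost of wrongly assigning a group 2 item to group 1
    divided by the cost of the opposite mistake.
    Returns (w, cutoff); assign x to group 1 when w @ x >= cutoff.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    n1, n2 = len(X1), len(X2)
    # Pooled estimate of the common covariance matrix.
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    w = np.linalg.solve(S, m1 - m2)
    cutoff = 0.5 * w @ (m1 + m2) + np.log(cost_ratio * (1 - prior1) / prior1)
    return w, cutoff

def classify(x, w, cutoff):
    """Return 1 or 2 depending on which side of the hyperplane x falls."""
    return 1 if w @ x >= cutoff else 2


# Synthetic two-variable data (an assumption for illustration only).
rng = np.random.default_rng(1)
cov = [[1.0, 0.5], [0.5, 1.0]]
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=100)
X2 = rng.multivariate_normal([3.0, 3.0], cov, size=100)

w, cutoff = fit_linear_rule(X1, X2)
predicted = [classify(x, w, cutoff) for x in np.vstack([X1, X2])]
actual = [1] * 100 + [2] * 100
accuracy = np.mean([p == a for p, a in zip(predicted, actual)])

print("w =", np.round(w, 3))
print("cutoff =", round(float(cutoff), 3))
print("training accuracy =", accuracy)
```

Raising `prior1` or lowering `cost_ratio` lowers the cutoff, so more points are assigned to group 1. That is how asymmetric costs and unequal priors enter the rule.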


The Hemophilia Example

The goal is to construct a procedure for detecting potential hemophilia A carriers based on measurements of two variables, $\log_{10}(\text{AHF activity})$ and $\log_{10}(\text{AHF-like antigen})$.

One group was from a population that did not carry the hemophilia gene and the other group was from a known population of hemophilia carriers.


The Hemophilia Example - The Discriminant Function

Figure: The discriminant function in two dimensions for Hemophilia data


The Hemophilia Example - The Posterior Probabilities

Figure: The posterior probabilities that the data belongs to a specific group


The Hemophilia Example - JMP Output

JMP gives the following output:

JMP rotates the data onto the canonical axes

All calculations in JMP correspond to the rotated axes

It gives the classification matrix and the percentage of data misclassified

It gives the predicted group and the posterior probability that a data point belongs to a specific group

It gives the estimates of the population means, $\mu_1$ and $\mu_2$, and the covariance $\Sigma$ for the normal populations
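Under the model of two normal populations with a common covariance, the posterior probability has a simple closed form: the log odds of group 1 equal the linear discriminant score minus its midpoint value, plus the log of the prior odds. The sketch below shows that calculation. The function name, arguments and scores are assumptions for the illustration; they are not JMP's implementation or the hemophilia results.

```python
import math

def posterior_group1(score, midpoint, prior1=0.5):
    """P(group 1 | x) for two normal groups with a common covariance.

    score    : w @ x, the linear discriminant score of the new point,
               with w = inverse(Sigma) @ (mu1 - mu2)
    midpoint : 0.5 * w @ (mu1 + mu2), the score halfway between the means
    prior1   : prior probability of group 1
    """
    log_odds = (score - midpoint) + math.log(prior1 / (1 - prior1))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical scores, not taken from the hemophilia data.
midpoint = 1.0
for score in (-1.0, 1.0, 3.0):
    p = posterior_group1(score, midpoint)
    print(f"score={score:+.1f}  P(group 1 | x)={p:.3f}")
```

A point whose score sits exactly at the midpoint gets a posterior equal to the prior, and the posterior moves toward 0 or 1 as the score moves away from the midpoint.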


Predictive Classification Applications

Educational research - To investigate which variables discriminate between high school graduates who decide (1) to go to college, (2) to attend a trade or professional school, or (3) to seek no further training or education.

Medical research - Record different variables relating to patients' backgrounds in order to learn which variables best predict whether a patient is likely to recover completely (group 1), partially (group 2), or not at all (group 3).

Biology - Record different characteristics of similar types (groups) of flowers, and then perform a discriminant function analysis to determine the set of characteristics that allows for the best discrimination between the types.


References

Richard Johnson, Dean Wichern - Applied Multivariate Statistical Analysis, 5e
