Multidimensional Scaling

Post on 19-Dec-2015


Page 1: Multidimensional Scaling. Agenda Multidimensional Scaling Goodness of fit measures Nosofsky, 1986

Multidimensional Scaling


Agenda

• Multidimensional Scaling

• Goodness of fit measures

• Nosofsky, 1986


Proximities

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0         9.94     4.32      7.29     6.81        9.94        7.81
Belchertown                      0    14.06     14.94     8.25       13.96       17.66
Hadley                                    0     11.02    10.93       14.49         9.5
Leverett                                            0    12.57        7.45        5.18
Pelham                                                       0        5.71       16.16
Shutesbury                                                               0       11.04
Sunderland                                                                           0

pAmherst,Hadley = 4.32


Configuration (in 2-D)

[Figure: 2-D configuration of the towns; each point i has coordinates xi.]


Configuration (in 1-D)


Formal MDS Definition

• f: pij → dij(X)

• MDS is a mapping from proximities to corresponding distances in MDS space.

• After a transformation f, the proximities are equal to distances in X.

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0         9.94     4.32      7.29     6.81        9.94        7.81
Belchertown                      0    14.06     14.94     8.25       13.96       17.66
Hadley                                    0     11.02    10.93       14.49         9.5
Leverett                                            0    12.57        7.45        5.18
Pelham                                                       0        5.71       16.16
Shutesbury                                                               0       11.04
Sunderland                                                                           0


Distances, dij

dAmherst, Hadley(X)


Distances, dij

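$$
\begin{aligned}
d_{\text{Amherst},\text{Hadley}}(X) &= \sqrt{\left(x_{\text{Amherst},1} - x_{\text{Hadley},1}\right)^2 + \left(x_{\text{Amherst},2} - x_{\text{Hadley},2}\right)^2} \\
&= \sqrt{\left(-0.5775 - (-2.3076)\right)^2 + \left(-1.0928 - (-7.1844)\right)^2} \\
&\approx 6.332
\end{aligned}
$$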
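The distance calculation above can be reproduced in a few lines. This is a minimal sketch; the function name `euclidean_distance` is illustrative, and the only data used are the two coordinate pairs from the slide.

```python
import math

def euclidean_distance(x_i, x_j):
    """Euclidean distance between two points given as equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x_i, x_j)))

# Coordinates of the two towns, taken from the worked example on the slide.
x_amherst = (-0.5775, -1.0928)
x_hadley = (-2.3076, -7.1844)

d_amherst_hadley = euclidean_distance(x_amherst, x_hadley)
print(f"{d_amherst_hadley:.4f}")  # 6.3325
```

The result agrees with the Amherst/Hadley entry (6.3325) in the distance table shown later in the deck.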


Distances, dij

dAmherst, Hadley(X)=4.32


Proximities and Distances

Proximities

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0         9.94     4.32      7.29     6.81        9.94        7.81
Belchertown                      0    14.06     14.94     8.25       13.96       17.66
Hadley                                    0     11.02    10.93       14.49         9.5
Leverett                                            0    12.57        7.45        5.18
Pelham                                                       0        5.71       16.16
Shutesbury                                                               0       11.04
Sunderland                                                                           0

Distances

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0      10.0577   6.3325    7.4738   7.9313      7.8319      7.8328
Belchertown                      0  12.0455   16.8332   6.7959     12.7215     17.6600
Hadley                                    0   12.0350  13.1492     14.1632      8.1892
Leverett                                            0  12.2097      7.3591      6.6429
Pelham                                                       0      6.3360     15.4250
Shutesbury                                                               0     12.7366
Sunderland                                                                           0


The Role of f

• f relates the proximities to the distances.

• f(pij)=dij(X)


The Role of f

• f can be linear, exponential, etc.

• In psychological data, f is usually assumed to be any monotonic function.
– That is, if pij < pkl then dij(X) ≤ dkl(X).
– Most psychological data is on an ordinal scale, e.g., rating scales.


Looking at Ordinal Relations

Proximities

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0         9.94     4.32      7.29     6.81        9.94        7.81
Belchertown                      0    14.06     14.94     8.25       13.96       17.66
Hadley                                    0     11.02    10.93       14.49         9.5
Leverett                                            0    12.57        7.45        5.18
Pelham                                                       0        5.71       16.16
Shutesbury                                                               0       11.04
Sunderland                                                                           0

Distances

              Amherst  Belchertown   Hadley  Leverett   Pelham  Shutesbury  Sunderland
Amherst             0      10.0577   6.3325    7.4738   7.9313      7.8319      7.8328
Belchertown                      0  12.0455   16.8332   6.7959     12.7215     17.6600
Hadley                                    0   12.0350  13.1492     14.1632      8.1892
Leverett                                            0  12.2097      7.3591      6.6429
Pelham                                                       0      6.3360     15.4250
Shutesbury                                                               0     12.7366
Sunderland                                                                           0
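One way to look at the ordinal relations is to count the pairs whose order differs between the proximities and the distances. The sketch below does this for the Amherst row only; the function name `count_order_violations` and the choice to ignore tied proximities are assumptions made for illustration, not part of the slides.

```python
from itertools import combinations

def count_order_violations(proximities, distances):
    """Count pairs (a, b) with p[a] < p[b] but d[a] > d[b] (or the reverse).

    Tied proximities are skipped, because a monotonic f places no strict
    constraint on them.
    """
    violations = 0
    for a, b in combinations(range(len(proximities)), 2):
        dp = proximities[a] - proximities[b]
        dd = distances[a] - distances[b]
        if dp * dd < 0:  # opposite signs: the order is reversed
            violations += 1
    return violations

# Amherst row of each table, in the order:
# Belchertown, Hadley, Leverett, Pelham, Shutesbury, Sunderland
p_amherst = [9.94, 4.32, 7.29, 6.81, 9.94, 7.81]
d_amherst = [10.0577, 6.3325, 7.4738, 7.9313, 7.8319, 7.8328]

n_violations = count_order_violations(p_amherst, d_amherst)
print(n_violations)
```

A count of zero would mean the distances in this row respect the rank order of the proximities exactly; any positive count shows the monotonic mapping is only approximately satisfied.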


Stress

• It is not always possible to perfectly satisfy this mapping.

• Stress is a measure of how closely the model came.

• Stress is essentially the scaled sum of squared error between f(pij) and dij(X)
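As a rough illustration of "scaled sum of squared error", the sketch below uses Kruskal's Stress-1 normalization (dividing by the sum of squared distances) and takes f to be the identity. Both choices are assumptions; the slides do not say which scaling or which f was used. The data are the upper triangles of the proximity and distance tables from the earlier slides.

```python
import math

def stress_1(transformed_proximities, distances):
    """Stress-1: sqrt( sum (f(p) - d)^2 / sum d^2 ) over all pairs."""
    num = sum((fp - d) ** 2 for fp, d in zip(transformed_proximities, distances))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

# Upper triangles of the two tables, read row by row.
proximities = [
    9.94, 4.32, 7.29, 6.81, 9.94, 7.81,
    14.06, 14.94, 8.25, 13.96, 17.66,
    11.02, 10.93, 14.49, 9.5,
    12.57, 7.45, 5.18,
    5.71, 16.16,
    11.04,
]
distances = [
    10.0577, 6.3325, 7.4738, 7.9313, 7.8319, 7.8328,
    12.0455, 16.8332, 6.7959, 12.7215, 17.6600,
    12.0350, 13.1492, 14.1632, 8.1892,
    12.2097, 7.3591, 6.6429,
    6.3360, 15.4250,
    12.7366,
]

# Assumption: f is the identity, so f(p_ij) = p_ij.
stress = stress_1(proximities, distances)
print(f"Stress-1 with identity f: {stress:.3f}")
```

With a monotonic (ordinal) f, the transformed proximities would first be obtained by a monotone regression of the distances on the proximities, which generally gives a lower stress than the identity does.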


Stress

[Figure: stress (y-axis) plotted against the number of dimensions (x-axis), with the “correct” dimensionality marked.]


Distance Invariant Transformations

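• Scaling (all of X doubled in size, or flipped); doubling multiplies every distance by the same factor, so the rank order of the distances is preserved

• Rotation (X rotated 20 degrees left)

• Translation (X moved 2 to the right)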
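The effect of each transformation on the pairwise distances can be checked numerically. This is a small sketch with hypothetical helper names; the first two points are the Amherst and Hadley coordinates from the slides, and the third point is made up so that there is more than one pair to compare.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """All pairwise Euclidean distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]

def rotate(points, degrees):
    """Rotate 2-D points counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def translate(points, dx, dy):
    return [(x + dx, y + dy) for x, y in points]

def scale(points, k):
    return [(k * x, k * y) for x, y in points]

def same(u, v):
    return all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(u, v))

# Amherst, Hadley (from the slides) and one made-up third point.
X = [(-0.5775, -1.0928), (-2.3076, -7.1844), (3.0, 1.5)]

base = pairwise_distances(X)
rotated = pairwise_distances(rotate(X, 20))
moved = pairwise_distances(translate(X, 2, 0))
flipped = pairwise_distances(scale(X, -1))
doubled = pairwise_distances(scale(X, 2))

print(same(base, rotated), same(base, moved), same(base, flipped))  # distances unchanged
print(same([2 * d for d in base], doubled))  # every distance doubled, order unchanged
```

Rotation, translation and flipping leave every distance unchanged, while doubling X doubles every distance, which is harmless when f is only required to be monotonic.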


Configuration (in 2-D)


Rotated Configuration (in 2-D)


Uses of MDS

• Visually look for structure in data.

• Discover the dimensions that underlie data.

• Psychological model that explains similarity judgments in terms of distance in MDS space.


Simple Goodness of Fit Measures

• Sum-of-squared error (SSE)

• Chi-Square

• Proportion of variance accounted for (PVAF)

• R²

• Maximum likelihood (ML)


Sum of Squared Error

Data   Prediction   (Data − Prediction)²
   7         5.03                   3.88
   8         6.97                   1.06
   1         2.12                   1.25
   8         8.91                   0.83
   6         6.97                   0.94

SSE                                 7.97
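The SSE in the table can be recomputed directly from the data and prediction columns. A minimal sketch; the function name is illustrative.

```python
def sum_squared_error(data, prediction):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(data, prediction))

# Columns from the slide's table.
data = [7, 8, 1, 8, 6]
prediction = [5.03, 6.97, 2.12, 8.91, 6.97]

sse = sum_squared_error(data, prediction)
print(f"{sse:.2f}")  # 7.97
```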


Chi-Square

Data   Prediction   (Data − Prediction)²   (Data − Prediction)² / Prediction
   7            5                      4                                0.80
   8            7                      1                                0.14
   1            2                      1                                0.50
   8            9                      1                                0.11
   6            7                      1                                0.14

Chi-Square                                                              1.70
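The same calculation for the chi-square column, as a short sketch (the function name is illustrative). Each squared error is divided by its prediction before summing.

```python
def chi_square(data, prediction):
    """Sum over observations of (data - prediction)^2 / prediction."""
    return sum((y - p) ** 2 / p for y, p in zip(data, prediction))

# Columns from the slide's table.
data = [7, 8, 1, 8, 6]
prediction = [5, 7, 2, 9, 7]

chi2 = chi_square(data, prediction)
print(f"{chi2:.2f}")  # 1.70
```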


Proportion of Variance Accounted for

        Mean prediction            Model prediction
Data    Mean   Error   Error²      Prediction   Error   Error²
   7       6       1        1            5.03    1.97     3.88
   8       6       2        4            6.97    1.03     1.06
   1       6      −5       25            2.12   −1.12     1.25
   8       6       2        4            8.91   −0.91     0.83
   6       6       0        0            6.97   −0.97     0.94

                 SST       34                     SSE     7.97

(SST − SSE)/SST = (34 − 7.97)/34 = .77
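PVAF compares the model's squared error with the squared error of simply predicting the mean. A minimal sketch using the table's columns; the function name `pvaf` is illustrative.

```python
def pvaf(data, prediction):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sum((y - mean) ** 2 for y in data)                    # error of the mean-only model
    sse = sum((y - p) ** 2 for y, p in zip(data, prediction))   # error of the fitted model
    return (sst - sse) / sst

# Columns from the slide's table.
data = [7, 8, 1, 8, 6]
model_prediction = [5.03, 6.97, 2.12, 8.91, 6.97]

result = pvaf(data, model_prediction)
print(f"{result:.2f}")  # 0.77
```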


R²

• R² is PVAF, but…

        Mean prediction            Model prediction
Data    Mean   Error   Error²      Prediction   Error   Error²
   7       6       1        1             5.9     1.1     1.21
   8       6       2        4            10.1    −2.1     4.41
   1       6      −5       25               4      −3        9
   8       6       2        4             5.9     2.1     4.41
   6       6       0        0               1       5       25

                 SST       34                     SSE    44.03

(SST − SSE)/SST = (34 − 44.03)/34 = −0.295
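One reading of this slide is that PVAF can go negative when the model predicts worse than the mean, whereas a squared correlation cannot. The sketch below computes both for the table's data; treating "R²" as the squared Pearson correlation is an assumption made for the comparison, and the function names are illustrative.

```python
def pvaf(data, prediction):
    """Proportion of variance accounted for: (SST - SSE) / SST."""
    mean = sum(data) / len(data)
    sst = sum((y - mean) ** 2 for y in data)
    sse = sum((y - p) ** 2 for y, p in zip(data, prediction))
    return (sst - sse) / sst

def squared_correlation(x, y):
    """Square of the Pearson correlation between x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Columns from the slide's table.
data = [7, 8, 1, 8, 6]
model_prediction = [5.9, 10.1, 4, 5.9, 1]

pvaf_value = pvaf(data, model_prediction)
r_squared = squared_correlation(data, model_prediction)
print(f"PVAF = {pvaf_value:.3f}")  # PVAF = -0.295
print(f"squared correlation = {r_squared:.3f}")
```

The squared correlation stays between 0 and 1 because it allows the predictions to be rescaled and shifted before they are compared with the data; PVAF takes the predictions as they are.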


Maximum Likelihood

• Assume we are sampling from a population with probability f(Y; θ).

• Y is an observation and θ are the model parameters.

[Figure: normal density with θ = [0], with the observation Y marked.]

N(−1.7; [μ = 0]) = 0.094


Maximum Likelihood

• With independent observations, Y1…Yn, the joint probability of the sample observations is:

$$g(Y_1, \ldots, Y_n) = \prod_{i=1}^{n} f(Y_i; \theta)$$

[Figure: density curve with θ = [0], with the observations Y1, Y2, Y3 marked.]

0.094 × 0.2661 × 0.3605 = 0.0090
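The product above can be reproduced with a normal density of mean 0 and standard deviation 1. Only the first observation (−1.7) is stated on the slides; the values −0.9 and −0.45 used below are assumptions, chosen because they give the densities 0.2661 and 0.3605 shown in the product.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Y1 is from the slides; Y2 and Y3 are assumed (see the note above).
observations = [-1.7, -0.9, -0.45]

densities = [normal_pdf(y, mu=0.0) for y in observations]
joint = math.prod(densities)  # independence: joint probability is the product

print([round(d, 4) for d in densities])
print(f"{joint:.4f}")  # 0.0090
```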


Maximum Likelihood

• Expressed as a function of the parameters, we have the likelihood function:

$$L(\theta) = \prod_{i=1}^{n} f(Y_i; \theta)$$

• The goal is to maximize L with respect to the parameters, θ.


Maximum Likelihood

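[Figure: density curve with θ = [0], with the observations Y1, Y2, Y3 marked.]

0.094 × 0.2661 × 0.3605 = 0.0090

[Figure: density curve with θ = [−1.0167], with the observations Y1, Y2, Y3 marked.]

0.3159 × 0.3962 × 0.3398 = 0.0425

(Assuming σ = 1)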
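The comparison of the two parameter values can be sketched as a small search over candidate means. As before, only −1.7 is given on the slides; −0.9 and −0.45 are assumed observations that reproduce the slide's densities, and the grid search is just one simple way to locate the maximum.

```python
import math

def normal_pdf(y, mu, sigma=1.0):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def likelihood(mu, observations, sigma=1.0):
    """L(mu): product of the densities of independent observations."""
    return math.prod(normal_pdf(y, mu, sigma) for y in observations)

# Y1 is from the slides; Y2 and Y3 are assumed values.
observations = [-1.7, -0.9, -0.45]
sample_mean = sum(observations) / len(observations)

l_at_zero = likelihood(0.0, observations)
l_at_mean = likelihood(sample_mean, observations)

# Coarse grid search over candidate means from -3.00 to 3.00 in steps of 0.01.
grid = [i / 100 for i in range(-300, 301)]
best_mu = max(grid, key=lambda mu: likelihood(mu, observations))

print(f"L(0) = {l_at_zero:.4f}")                       # L(0) = 0.0090
print(f"L({sample_mean:.4f}) = {l_at_mean:.4f}")       # L(-1.0167) = 0.0425
print(f"grid maximum near mu = {best_mu:.2f}")
```

For a normal model with known σ, the likelihood is maximized at the sample mean, which is why θ = −1.0167 gives the larger product.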


Maximum Likelihood

• Preferred to other methods
– Has very nice mathematical properties.
– Easier to interpret.
– We’ll see specifics in a few weeks.

• Often harder (or impossible?) to calculate than other methods.

• Often presented as log likelihood, ln(ML).
– Easier to compute (sums, not products).
– Better numerical resolution.

• Sometimes equivalent to other methods.
– E.g., gives the same estimate as minimizing SSE when estimating the mean of a distribution.
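The last two points can be illustrated for a normal model with σ = 1, where the log likelihood equals a constant minus half the SSE around the mean. The sketch below checks that identity and shows a raw product underflowing where the sum of logs does not. The observations are the assumed values used to match the earlier slides (only −1.7 is stated there), and repeating them 500 times is purely to make the sample large.

```python
import math

def normal_logpdf(y, mu, sigma=1.0):
    """Log density of a normal distribution."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2 * math.pi))

observations = [-1.7, -0.9, -0.45]  # Y1 from the slides; Y2, Y3 assumed
n = len(observations)
mu = sum(observations) / n

# Log likelihood as a sum of log densities.
log_likelihood = sum(normal_logpdf(y, mu) for y in observations)

# With sigma = 1: ln L(mu) = -(n/2) ln(2*pi) - SSE(mu)/2,
# so maximizing ln L over mu is the same as minimizing SSE.
sse = sum((y - mu) ** 2 for y in observations)
from_sse = -0.5 * n * math.log(2 * math.pi) - 0.5 * sse
print(math.isclose(log_likelihood, from_sse))

# Numerical resolution: a product of many small densities underflows to 0.0,
# while the sum of their logs stays finite.
big_sample = observations * 500
product = math.prod(math.exp(normal_logpdf(y, mu)) for y in big_sample)
log_sum = sum(normal_logpdf(y, mu) for y in big_sample)
print(product, log_sum)
```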