
Page 1: The European Commission’s science and knowledge service · 2017-12-11 · Principal Component Analysis . Hedvig Norlén . ... Factor analysis (FA) •Data envelopment analysis (DEA)

The European Commission’s science and knowledge service

Joint Research Centre


Step 5: Weighting methods (I)

Principal Component Analysis

Hedvig Norlén

COIN 2017 - 15th JRC Annual Training on Composite Indicators & Scoreboards 06-08/11/2017, Ispra (IT)


3 JRC-COIN © | Step 5: Weighting methods (I) Principal Component Analysis

Step 5: Weighting methods (I)

Importance

• Choosing the weights for the components is not a trivial task. The weights should agree with the underlying theoretical framework

• Diverse weighting methods exist, and there is no established methodology for selecting the best one

• Weights usually have an important impact on the composite indicator value and on the resulting ranking

• The weighting method should be made clear and transparent


Commonly used weighting methods

I. Equal weights (EW)

II. Weights based on statistical methods

• Principal component analysis (PCA) and Factor analysis (FA)

• Data envelopment analysis (DEA)

III. Weights based on public/expert opinion

• Budget allocation process (BAP)

• Analytic hierarchy process (AHP)

• Conjoint analysis (CA)


I. Equal weights (EW)

• Many composite indicators use an equal weighting (EW) scheme. Why? There is no a priori knowledge and no clear reference about the importance of the elements in the composite indicator.

• EW does not mean there are no weights: the weights are all equal

Equal weighting does not guarantee equal importance and equal contribution of the indicators to the composite indicator!
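The point above can be illustrated with a small simulation. This is only a sketch on hypothetical random data (not from the slides): three indicators share a common factor, a fourth is independent, and all four receive the weight 0.25. The squared correlation between each indicator and the composite is used here as a rough measure of contribution.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: indicators 1-3 share a common factor, indicator 4 is independent.
common = rng.normal(size=n)
x = np.column_stack([
    common + 0.5 * rng.normal(size=n),
    common + 0.5 * rng.normal(size=n),
    common + 0.5 * rng.normal(size=n),
    rng.normal(size=n),
])

# Standardize each indicator, then aggregate with equal weights (arithmetic average).
z = (x - x.mean(axis=0)) / x.std(axis=0)
weights = np.full(4, 0.25)
composite = z @ weights

# Squared Pearson correlation between each indicator and the composite.
r2 = np.array([np.corrcoef(z[:, j], composite)[0, 1] ** 2 for j in range(4)])
print(np.round(r2, 2))
```

Although every weight is 0.25, the independent indicator ends up with a much lower squared correlation with the composite than the three correlated ones.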


I. Example: Innovation Output Indicator (IOI)

Vertesy, The Innovation Output Indicator 2017: Methodology Report, forthcoming

Equal weights (EW = 0,25; 4 components in IOI): EW results in an unbalanced contribution of components to the IOI

Correlation between IOI and components

Weight  Component  Pearson correlation coefficient  R^2   R^2 (one decimal)
0,25    PCT        0,79                             0,62  0,6
0,25    KIABI      0,76                             0,58  0,6
0,25    COMP       0,74                             0,55  0,6
0,25    DYN        0,55                             0,30  0,3

Adjusted weights: adjusted weights result in a more balanced contribution of components to the IOI

Correlation between IOI and components

Weight  Component  Pearson correlation coefficient  R^2   R^2 (one decimal)
0,22    PCT        0,72                             0,52  0,5
0,22    KIABI      0,70                             0,49  0,5
0,22    COMP       0,71                             0,51  0,5
0,34    DYN        0,67                             0,45  0,4


I. Equal weights (EW)

• Highly correlated indicators: caution!

$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
$$

n = number of objects (countries, individuals, …)
p = number of indicators
w_i = weight for indicator i, i = 1, …, p

E.g. the correlation between indicators 1 and 2 is very high. What should we do?
- Disregard one of the two indicators
- Reduce the weights of both indicators
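A minimal sketch of how such pairs could be detected in the n × p data matrix X. The function name `highly_correlated_pairs`, the threshold of 0.9 and the two example weight vectors are illustrative assumptions, not values given in the slides.

```python
import numpy as np

def highly_correlated_pairs(x, threshold=0.9):
    """Return (i, j, r) for indicator pairs whose |Pearson r| exceeds threshold.

    x is an n-by-p array (objects in rows, indicators in columns).
    The threshold is an assumed, user-chosen cut-off.
    """
    corr = np.corrcoef(x, rowvar=False)
    p = corr.shape[0]
    return [(i, j, float(corr[i, j]))
            for i in range(p) for j in range(i + 1, p)
            if abs(corr[i, j]) > threshold]

# Hypothetical data: indicator 2 is almost a copy of indicator 1.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x = np.column_stack([x1, x1 + 0.1 * rng.normal(size=n), rng.normal(size=n)])

pairs = highly_correlated_pairs(x)
print(pairs)

# The two options from the slide, written as illustrative weight vectors:
w_drop_one = np.array([0.5, 0.0, 0.5])       # disregard indicator 2
w_reduce_both = np.array([0.25, 0.25, 0.5])  # the correlated pair shares one weight
```

Both weight vectors sum to one; which option is appropriate depends on the theoretical framework of the index.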


I. Example EW: The Global Talent Competitiveness Index 2017 (GTCI 2017)

Measures the ability of countries to compete for talent


GTCI 2017 model

• 2 sub-indices

• 65 variables (ind)

• 14 sub-pillars (3-7 ind)

• 6 pillars

• Overall index GTCI: Equal Weights (EW) & Arithmetic Averaging (AA)

[Figure: GTCI 2017 model structure with the weight of each element in the overall index. Sub-indices: 66,7% and 33,3%; pillars: 16,7% each; sub-pillars: 5,6% or 8,3%.]


II. Weights based on PCA

• Empirical and objective option for weight selection

• Identify a small number of “averages” (PCs) that explain most of the variance observed.

• Each principal component PCi is a new variable computed as a linear combination of the original (standardized) variables

• Weights are taken from the coefficients of the first principal component (one-dimensional case)

$$PC1 = w_1 x_1 + w_2 x_2 + \dots + w_p x_p$$

[Figure: observed indicators (Item 1 X1, Item 2 X2, …, Item p Xp) are reduced by PCA into a component, with weights w_1, w_2, …, w_p.]
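One way to obtain the PC1 coefficients is an eigendecomposition of the correlation matrix of the indicators. The sketch below uses NumPy on hypothetical random data; the helper name `pc1_weights` is an assumption for illustration, and statistical packages may scale or sign the coefficients differently.

```python
import numpy as np

def pc1_weights(x):
    """Coefficients of the first principal component of the standardized data,
    plus the share of total variance that PC1 explains."""
    corr = np.corrcoef(x, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    w = eigvecs[:, -1]                       # eigenvector of the largest eigenvalue
    if w.sum() < 0:                          # the sign of an eigenvector is arbitrary
        w = -w
    return w, eigvals[-1] / eigvals.sum()

# Hypothetical data: four indicators driven by one common factor.
rng = np.random.default_rng(2)
common = rng.normal(size=300)
x = np.column_stack([common + 0.6 * rng.normal(size=300) for _ in range(4)])

w, share = pc1_weights(x)
z = (x - x.mean(axis=0)) / x.std(axis=0)
pc1 = z @ w  # PC1 = w_1 x_1 + ... + w_p x_p on standardized indicators

print(np.round(w, 2), round(float(share), 2))
```

The variance of `pc1` equals the largest eigenvalue, which is why the eigenvalue table is read as "variance explained" per component.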


II. Weights based on PCA

• Pro: good mathematical properties, determining the set of weights which explains the largest variation in the original indicators

• (Potential) con: small weights are assigned to indicators which have little variation, irrespective of their possible contextual importance (“Elitist index”)

• The first PC is often used as the “best” composite index

• The first PC may account for only a limited part of the variance in the data, so a considerable amount of information can be lost


II. Example PCA weights - GTCI 2017

Weighting and aggregation used at each step of the hierarchy (Indicators, Sub-pillars, Pillars, GTCI):

Scheme                             Indicators to sub-pillars  Sub-pillars to pillars  Pillars to GTCI
Default GTCI                       EW & AA                    EW & AA                 EW & AA
Ex 1) PCA weights at pillar level  EW & AA                    EW & AA                 PC1
Ex 2) PCA weights at 3 levels      PC1                        PC1                     PC1


Ex 1) PCA weights - GTCI 2017

• PC1 weights at pillar level

Total variance explained

Component  Eigenvalue  Variance  Cumulative variance
1          4,95        0,82      0,82
2          0,45        0,07      0,90
3          0,25        0,04      0,94
4          0,13        0,02      0,96
5          0,12        0,02      0,98
6          0,10        0,02      1,00

Pearson correlation coefficients between pillar and principal component

Pillar       PC1   PC2    PC3    PC4    PC5    PC6
1,Enable     0,94  0,18   0,03   -0,08  -0,10  -0,25
2,Attract    0,83  0,53   0,05   0,10   0,08   0,10
3,Grow       0,92  -0,04  -0,30  -0,22  0,05   0,09
4,Retain     0,94  -0,14  0,12   0,03   -0,25  0,13
5,VT_skills  0,90  -0,25  0,32   -0,06  0,18   0,00
6,GK_skills  0,91  -0,23  -0,21  0,25   0,06   -0,06
SS           4,95  0,45   0,25   0,13   0,12   0,10

Stopping criterion: Eigenvalue > 0.90 (or cumulative variance over 0.70)

Weights (not normalized):

$$PC1 = 0.94\,X_{Enab} + 0.83\,X_{Attr} + 0.92\,X_{Grow} + 0.94\,X_{Rtain} + 0.90\,X_{VTSki} + 0.91\,X_{GKSki}$$

Scaled to unity sum:

$$PC1_{norm} = 0.17\,X_{Enab} + 0.15\,X_{Attr} + 0.17\,X_{Grow} + 0.17\,X_{Rtain} + 0.16\,X_{VTSki} + 0.17\,X_{GKSki}$$
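The scaling to unity sum, and the link between the PC1 column and its eigenvalue (the SS row), can be checked with a few lines of arithmetic. This sketch uses only the two-decimal PC1 correlations printed on the slide, so small differences from the slide values are to be expected because of rounding.

```python
# PC1 correlations (loadings) per pillar, as printed on the slide.
loadings = {
    "Enable": 0.94, "Attract": 0.83, "Grow": 0.92,
    "Retain": 0.94, "VT_skills": 0.90, "GK_skills": 0.91,
}

# Scale the loadings so that the weights sum to one.
total = sum(loadings.values())
weights = {name: value / total for name, value in loadings.items()}

# Sum of squared loadings (SS row); should be close to the PC1 eigenvalue.
ss = sum(value ** 2 for value in loadings.values())

# Share of total variance explained by PC1 (six standardized pillars).
variance_share = ss / len(loadings)

for name, value in weights.items():
    print(f"{name}: {value:.3f}")
print(f"SS = {ss:.2f}, variance share = {variance_share:.2f}")
```

Every normalized weight comes out close to 1/6, which is why PC1norm behaves almost like an equally weighted average of the six pillars.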


Ex 1 & 2) PCA weights - GTCI 2017

• Ex 1) PC1norm and the default GTCI give very similar rankings: 98 out of 118 countries keep the same rank, and the others shift by just one rank position

$$PC1_{norm} = 0.17\,X_{Enab} + 0.15\,X_{Attr} + 0.17\,X_{Grow} + 0.17\,X_{Rtain} + 0.16\,X_{VTSki} + 0.17\,X_{GKSki}$$

$$GTCI = \frac{1}{6}\sum_{j=1}^{6} X_j \qquad (1/6 \approx 0.17)$$

Select the simplest method!


• Ex 2) PC1norm at 3 levels and the default GTCI give more divergent rankings: only 28 out of 118 countries keep the same rank, and differences reach up to 9 positions
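Comparisons of this kind (how many countries keep their rank, and the largest shift) can be computed as sketched below. The pillar scores are hypothetical random numbers, not GTCI data, so the output will not reproduce the figures on the slide; the helper names `ranks` and `compare_rankings` are assumptions for illustration.

```python
import numpy as np

def ranks(scores):
    """Rank 1 = highest score (ties broken by original order)."""
    order = np.argsort(-scores, kind="stable")
    r = np.empty(len(scores), dtype=int)
    r[order] = np.arange(1, len(scores) + 1)
    return r

def compare_rankings(scores_a, scores_b):
    """Return (number of objects with identical rank, largest rank shift)."""
    shift = np.abs(ranks(scores_a) - ranks(scores_b))
    return int((shift == 0).sum()), int(shift.max())

# Hypothetical pillar scores for 118 countries and 6 pillars.
rng = np.random.default_rng(3)
pillars = rng.uniform(20, 80, size=(118, 6))

equal_w = np.full(6, 1 / 6)
pca_w = np.array([0.17, 0.15, 0.17, 0.17, 0.16, 0.17])
pca_w = pca_w / pca_w.sum()  # rescale the rounded weights to sum to one

same, max_shift = compare_rankings(pillars @ equal_w, pillars @ pca_w)
print(f"{same} of {len(pillars)} keep the same rank; largest shift: {max_shift}")
```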


Multiple principal components

• What if there are multiple principal components?

• 2 PCs are included with the current stopping criterion

• Fairly low correlations between PC2 and the 6 pillars

Total variance explained

Component  Eigenvalue  Variance  Cumulative variance
1          4,18        0,6973    0,6973
2          0,91        0,1512    0,8486
3          0,42        0,0700    0,9186
4          0,24        0,0400    0,9586
5          0,19        0,0317    0,9902
6          0,06        0,0098    1,0000

Pearson correlation coefficients between pillar and PC

Pillar       PC1   PC2
1.Enable     0,85  0,44
2.Attract    0,84  0,46
3.Grow       0,82  0,27
4.Retain     0,84  0,41
5.VT_skills  0,83  0,26
6.GK_skills  0,83  0,44
SS           4,18  0,91

Stopping criterion: Eigenvalue > 0,90 (or cumulative variance over 0,70)


Multiple principal components

Total variance explained

Component  Eigenvalue  Variance  Cumulative variance
1          4,18        0,6973    0,6973
2          0,91        0,1512    0,8486
3          0,42        0,0700    0,9186
4          0,24        0,0400    0,9586
5          0,19        0,0317    0,9902
6          0,06        0,0098    1,0000

Pearson correlation coefficients between pillar and PC

Pillar       PC1
1.Enable     0,85
2.Attract    0,84
3.Grow       0,82
4.Retain     0,84
5.VT_skills  0,83
6.GK_skills  0,83
SS           4,18

Stopping criterion: Eigenvalue > 1

Recommendation:

• Change stopping criterion

• Reduce the threshold to get a single PC
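The effect of the stopping criterion can be checked directly on the eigenvalues listed in the table. This is a small sketch with hypothetical helper names (`n_components`, `cumulative_variance`); the cumulative shares differ from the slide in the third decimal because the eigenvalues are rounded to two decimals.

```python
# Eigenvalues from the "Total variance explained" table.
eigenvalues = [4.18, 0.91, 0.42, 0.24, 0.19, 0.06]

def n_components(eigenvalues, min_eigenvalue):
    """Number of components whose eigenvalue exceeds the threshold."""
    return sum(ev > min_eigenvalue for ev in eigenvalues)

def cumulative_variance(eigenvalues):
    """Cumulative share of total variance, component by component."""
    total = sum(eigenvalues)
    shares, running = [], 0.0
    for ev in eigenvalues:
        running += ev
        shares.append(running / total)
    return shares

print(n_components(eigenvalues, 0.90))  # criterion "eigenvalue > 0.90"
print(n_components(eigenvalues, 1.0))   # criterion "eigenvalue > 1"
print([round(s, 3) for s in cumulative_variance(eigenvalues)])
```

With the threshold 0.90 the second component (eigenvalue 0.91) is retained; with the threshold 1 only the first component remains.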


Concluding words

• Different weighting systems imply different results/rankings

• No explicit weighting method is the norm

• Important to check the choice of weighting method (uncertainty and sensitivity analysis)


References

Books

• Johnson, “Applied Multivariate Statistical Analysis (6th Revised edition)”, Pearson Education Limited (2013)

• Maggino, “Complexity in Society: From Indicators Construction to their Synthesis”, Springer (2017)

• Tinsley and Brown, “Handbook of Applied Multivariate Statistics and Mathematical Modeling”, Academic Press (2000)

Papers

• Paruolo, Saisana and Saltelli, “Ratings and rankings: voodoo or science?”, Journal of the Royal Statistical Society: Series A (Statistics in Society) 176.3: 609-634, 2013

• Becker, Saisana, Paruolo, Vandecasteele, “Weights and importance in composite indicators: Closing the gap”, Ecological Indicators 80: 12-22, 2017


Any questions? You may contact us at @username & [email protected]

Welcome to email us at: [email protected]

THANK YOU

COIN in the EU Science Hub: https://ec.europa.eu/jrc/en/coin

COIN tools are available at: https://composite-indicators.jrc.ec.europa.eu/

The European Commission’s Competence Centre on Composite Indicators and Scoreboards