risk in markets
TRANSCRIPT
Talk Overview: Information and complex datasets; Redundancy & Dependency; Information Filtering; A network approach; A case study: risk distribution in the financial equity market; Other applications
Information
401 firms on the US equity market between 1996-2009 (data from Reuters)
We are witnessing interesting times, when a large amount of detailed information is readily available to all. Using, understanding and filtering such information has become one of the major tasks, and a crucial bottleneck, for scientific and industrial endeavors. For instance, in financial markets every transaction, every bid and every ask is registered and used for trading, forecasting, assessing risk and pricing.
$r_i(t) = \log p_i(t+\tau) - \log p_i(t)$
Redundancy
Typically, there is a large degree of redundancy in the information that we acquire and store. There are two reasons for this: 1) it has become easy to get information simultaneously from several similar sources; 2) redundancy can be useful to filter out noise.
Example: prices of a group of stocks in the same industrial sector behave very similarly. The dynamics of one stock contains information about the dynamics of the others.
Dependency
To measure this redundancy we can look at the degree of similarity between the temporal evolution of the variables.
Pearson's cross-correlation coefficient
$\rho_{i,j} = \frac{\langle (x_i - \langle x_i \rangle)(x_j - \langle x_j \rangle) \rangle}{\sigma_i \sigma_j}$
Example values: 0.93 for prices, 0.65 for log-returns
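The log-return and Pearson definitions above can be sketched in a few lines of plain Python. This is only an illustration: the two price series are hypothetical toy numbers, not the Reuters data, and the function names `log_returns` and `pearson` are introduced here for the example.

```python
import math

def log_returns(prices, tau=1):
    """r(t) = log p(t + tau) - log p(t) for a list of positive prices."""
    return [math.log(prices[t + tau]) - math.log(prices[t])
            for t in range(len(prices) - tau)]

def pearson(x, y):
    """Pearson's cross-correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)

# Hypothetical toy prices for two stocks (assumed values, for illustration only).
prices_a = [100.0, 101.0, 103.0, 102.0, 105.0, 107.0]
prices_b = [50.0, 50.2, 51.5, 51.4, 52.0, 53.5]

rho_prices = pearson(prices_a, prices_b)
rho_returns = pearson(log_returns(prices_a), log_returns(prices_b))
print(rho_prices, rho_returns)
```

Computing the coefficient on prices and on log-returns separately mirrors the two example values quoted on the slide; trending prices typically give the larger of the two.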
Mutual Information
$I(X;Y) = \sum_{x \in X,\, y \in Y} p(x,y) \ln \frac{p(x,y)}{p_X(x)\, p_Y(y)}$
Example values: 2.32 for prices, 0.39 for log-returns
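As a rough sketch of the mutual information formula, the sum can be evaluated directly from empirical frequencies of paired discrete observations. The discretisation is an assumption of this example: continuous prices or returns would first have to be binned, and the slide does not say how that was done.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in nats, estimated from paired discrete observations."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    marg_x = Counter(xs)           # counts of x
    marg_y = Counter(ys)           # counts of y
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = marg_x[x] / n
        p_y = marg_y[y] / n
        total += p_xy * math.log(p_xy / (p_x * p_y))
    return total

# Hypothetical binned observations (assumed, for illustration only).
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # fully dependent
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # independent
```

Unlike the Pearson coefficient, this quantity is not bounded by 1, which is why the example value for prices can exceed 1.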
R Morales P Butler
Pearson's correlation coefficients for the log-returns of 401 firms on the US equity market 1996-2009
$N(N-1)/2 = 80200$ coefficients
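Largest eigenvalue: $\lambda = 107$
$Q = T/N$
$\lambda_{\min} = 1 + \frac{1}{Q} - 2\sqrt{\frac{1}{Q}} = 0.4$
$\lambda_{\max} = 1 + \frac{1}{Q} + 2\sqrt{\frac{1}{Q}} = 1.8$
$p(\lambda) = \frac{Q}{2\pi} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda}$
18 eigenvalues are larger than $\lambda_{\max}$
3 eigenvalues are larger than 10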
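The eigenvalue bounds and density on this slide have the form of the Marchenko-Pastur law for a random correlation matrix built from N series of length T. A minimal sketch follows; the values of T and N in the usage line are hypothetical, since the slide gives only the resulting bounds.

```python
import math

def marchenko_pastur_bounds(T, N):
    """Return (lambda_min, lambda_max) for Q = T / N."""
    Q = T / N
    lam_min = 1 + 1 / Q - 2 * math.sqrt(1 / Q)
    lam_max = 1 + 1 / Q + 2 * math.sqrt(1 / Q)
    return lam_min, lam_max

def marchenko_pastur_density(lam, T, N):
    """p(lambda) inside [lambda_min, lambda_max], zero outside."""
    Q = T / N
    lam_min, lam_max = marchenko_pastur_bounds(T, N)
    if not (lam_min < lam < lam_max):
        return 0.0
    return Q / (2 * math.pi) * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam

# Hypothetical sizes (assumed): 4 observations per variable.
print(marchenko_pastur_bounds(T=4000, N=1000))
```

Eigenvalues of the empirical correlation matrix that fall above `lambda_max` are the ones that cannot be attributed to pure noise under this null model.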
Information Filtering
We want to extract meaningful information by making use of redundancies/dependencies. In general, the system must be studied at two levels: 1) local clustering; 2) global hierarchy.
Data clustering
Data hierarchy
Hierarchical clustering via linkage
Thresholding
These two levels can be studied simultaneously by mapping the system into a network where the variables are represented by vertices and the dependencies/redundancies between variables are associated with edges.
$N(N-1)/2 = 80200$ different elements with a lot of redundant information; only order $N$ are significant.
The information of the correlation matrix is associated with the complete graph $K_N$.
A Network approach
Complete graph K401
N=5
In order to filter information, without thresholding, we must prune edges retaining only the most relevant structure while keeping the network connected.
Complete graph K401
Minimum Spanning Tree MST for 401 US stocks
The largest pruning should produce a connected graph which contains only the minimum number of most significant edges. This is the Minimum Spanning Tree.
O. Borůvka, "O jistém problému minimálním (About a certain minimal problem)", Práce mor. přírodověd. spol. v Brně III 3 (1926) 37–58; O. Borůvka, "Příspěvek k řešení otázky ekonomické stavby elektrovodních sítí (Contribution to the solution of a problem of economical construction of electrical networks)", Elektronický Obzor 15 (1926) 153–154; J. C. Gower, Biometrika 53 (1966) 325-338; R. N. Mantegna, Eur. Phys. J. B (1999) 193-197.
The Minimum Spanning Tree retains the most significant links and maximally reduces redundancy (between any two vertices there is one and only one path that crosses edges only once)
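One way to build such a tree is Kruskal's algorithm applied directly to the similarity matrix: take the pairs in decreasing order of similarity and keep an edge only if it joins two still-disconnected components. The sketch below assumes this greedy construction; the talk does not say which algorithm was used, and the 4x4 matrix is a hypothetical toy example.

```python
def spanning_tree_from_similarity(similarity):
    """Kruskal's algorithm on a symmetric similarity matrix (list of lists).

    Returns N - 1 edges (i, j, similarity), strongest links first.
    """
    n = len(similarity)
    parent = list(range(n))  # union-find forest over the vertices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs = sorted(
        ((similarity[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    tree = []
    for s, i, j in pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:        # edge joins two components: keep it
            parent[root_i] = root_j
            tree.append((i, j, s))
            if len(tree) == n - 1:  # a spanning tree has exactly N - 1 edges
                break
    return tree

# Hypothetical correlation matrix for four variables (assumed values).
rho = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.8, 0.3],
    [0.2, 0.8, 1.0, 0.7],
    [0.1, 0.3, 0.7, 1.0],
]
print(spanning_tree_from_similarity(rho))
```

For N variables the result always has N - 1 edges, which is the "order N" amount of information the earlier slide refers to.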
How can we build such a network?
By embedding on a surface
We want to build a network which retains the most significant links and retains some redundancy
The embedding of $K_N$ is possible on an orientable surface $S_g$ of genus
$g \ge g^* = \left\lceil \frac{(N-3)(N-4)}{12} \right\rceil$
$K_N \to S_g$
G. Ringel, Map Color Theorem, Springer-Verlag, Berlin (1974) chap. 4; P. J. Giblin, Graphs, Surfaces and Homology, Chapman and Hall, 2nd edition (1981); G. Ringel and J. W. T. Youngs, Proc. Nat. Acad. Sci. USA 60 (1968) 438-445.
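To get a feeling for the size of the genus bound, it can be evaluated for a few values of N. This is a small sketch using the rounded-up form of the bound for N ≥ 3; the function name is introduced here for illustration.

```python
def minimum_genus_complete_graph(n):
    """Smallest genus g* of an orientable surface on which K_n embeds (n >= 3)."""
    if n < 3:
        raise ValueError("the formula is used here for n >= 3 only")
    # Integer ceiling of (n - 3)(n - 4) / 12 without floating point.
    return -(-((n - 3) * (n - 4)) // 12)

for n in (4, 5, 7, 8, 401):
    print(n, minimum_genus_complete_graph(n))
```

K_4 is planar (g* = 0), K_5 already needs a torus (g* = 1), and for the 401 stocks of the case study the complete graph would need a surface with several thousand handles, which is why filtering onto a low-genus surface removes so many edges.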
• locally planar • natural hierarchy • controlled interwovenness • elementary moves
Any network can be embedded on a surface!!
WHY SURFACES ?
any ΓN is a sub-graph of KN and can be embedded on Sg
The surface constrains the network complexity (the degree of interwovenness)
Embedded Maximally Filtered Graphs
g=0 g=1 g=2
Kuratowski's theorem: a finite graph is planar if and only if it does not contain a subgraph that is an expansion (subdivision) of $K_5$ or $K_{3,3}$.
Planar graphs g = 0
PMFG
What do we embed on the surface?
Planar Maximally Filtered Graphs
1) Sort similarities from the largest to the smallest.
2) Connect the two nodes on the top line of the list.
3) Is the resulting graph planar? If yes, keep the edge; if no, discard the edge.
4) Delete the top line from the list.
5) Have we reached the maximum number of edges? If no, go back to step 2; if yes, stop.
http://www.mathworks.com/matlabcentral/fileexchange/27360
16 Eurodollars interest rates 34 US treasury bonds M. Tumminello, T. Aste, T. Di Matteo and R. N. Mantegna, A tool for filtering information in complex systems, PNAS 102 (2005) 10421
$\max \sum_{i,j} a_{i,j} I_{i,j}$
Embedded Maximally Filtered Graphs
Building EMFG: Start from an arbitrary triangulation of a surface. Make it evolve towards an optimal graph through Alexander moves.
J. W. Alexander, “The combinatorial theory of complexes” Ann. Math. 31 (1930) 292.
[Figure: the elementary moves T1 and T2]
Moves T1 and T2 are topological invariants; therefore we can apply them freely without changing the surface embedding.
$E(G) = \sum_{i,j} a_{i,j} I_{i,j}$
$E(G') = \sum_{i,j} a'_{i,j} I_{i,j}$
$p(G' | G) = \frac{1}{1 + e^{\beta (E(G') - E(G))}}$
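The transition probability above can be sketched as a single accept/reject step. This is a hedged illustration only: the proposal of a new graph G' (a T1 or T2 move) is not shown, the energies are passed in as plain numbers, and the function names and the overflow guard are choices made for this example.

```python
import math
import random

def acceptance_probability(energy_old, energy_new, beta):
    """p(G'|G) = 1 / (1 + exp(beta * (E(G') - E(G))))."""
    x = beta * (energy_new - energy_old)
    if x > 700.0:  # exp would overflow; the probability is effectively zero
        return 0.0
    return 1.0 / (1.0 + math.exp(x))

def accept_move(energy_old, energy_new, beta, rng=random):
    """Draw one accept/reject decision for a proposed move."""
    return rng.random() < acceptance_probability(energy_old, energy_new, beta)

# Hypothetical energies (assumed values, for illustration only).
print(acceptance_probability(10.0, 9.0, beta=2.0))  # downhill move, p > 1/2
print(acceptance_probability(10.0, 11.0, beta=2.0))  # uphill move, p < 1/2
```

With this rule the probabilities of a move and of its reverse sum to one, and larger values of beta make the evolution more strongly biased towards lower E.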
W.M. Song, T. Di Matteo and T. Aste, “Building Complex Networks with Platonic Solids”, Phys. Rev. E 85 (2012) 046115.
T. Aste, R. Gramatica and T. Di Matteo, “Exploring complex networks via topological embedding on surfaces”, arXiv:1107.3456v1 (2011).
T. Aste, R. Gramatica and T. Di Matteo, “Random and frozen states in complex triangulations”, Philosophical Magazine (2011).
T. Aste and D. Sherrington, “Glass transition in self organizing cellular patterns”, J. Phys. A 32 (1999) 7049-56.
PMFG for 401 US stocks
[Figure: example maximal planar graph on vertices v2 to v8 with the 3-cliques k1 = {v2, v3, v4}, k2 = {v2, v4, v5} and k3 = {v3, v4, v6}]
We extract clusters and hierarchies from the PMFG in 5 main steps built around the properties of 3-cliques in maximal planar graphs.
Some cliques are contained inside others, providing a natural hierarchy.
3-cliques
Clusters and hierarchies
Won-Min Song, T. Di Matteo, Tomaso Aste, Nested hierarchies in planar graphs, Discrete Applied Mathematics 159 (2011) 2135-2146.
3-cliques on Maximal Planar Graphs have a unique property: they contain other cliques inside and/or they are contained inside other cliques.
Eurodollars
16 interest rates with maturity dates between 3 months and 4 years. T. Di Matteo and T. Aste, "How does the Eurodollar Interest Rate behave?", Journal of Theoretical and Applied Finance, 5 (2002) 122-127. (arXiv:cond-mat/0101009, 2001).
≥ 2 years
< 2 years
Example of PMFG for Eurodollar rates
Won-Min Song, T. Di Matteo, Tomaso Aste, Nested hierarchies in planar graphs, Discrete Applied Mathematics 159 (2011) 2135-2146. W.M. Song, T. Di Matteo and T. Aste, “Hierarchical information clustering by means of topologically embedded graphs”, PLoS ONE, 7 (2012) e31929
[Figure: the example graph on vertices v1 to v9, separated by the 3-cliques k1, k2, k3 into bubbles b1 to b4; the converging bubbles are bα = b1 and bβ = b4]
Cross-correlation from daily returns on the 401 US stocks
The clique structure provides automatically a classification into communities organized into a nested hierarchy
We capture both local clustering and global hierarchical organization without introducing any characteristic scale
[Figure: three-cluster partitions of the Iris data (points labelled s, v, g) obtained by (a) DBHT, (b) Qcut, (c) kNN-Ncut]
Validation of Clustering on Fisher's Iris data
Dataset: 50 iris plants from each of three different types of iris: (1) Iris Setosa; (2) Iris Versicolour; (3) Iris Virginica.
Measures: (i) sepal length; (ii) sepal width; (iii) petal length; (iv) petal width.
Fisher RA (1936) The use of multiple measurements in taxonomic problems. Annals Eugen., 7:179–188.
W.M. Song, T. Di Matteo and T. Aste, “Hierarchical information clustering by means of topologically embedded graphs”, PLoS ONE, 7 (2012) e31929
[Figure: adjusted Rand index versus <dR> for DBHT, k-means++, SOM, kNN-Spectral and Qcut. Panels: (a) Gaussian data with Gaussian noise; (b) Gaussian data with lognormal noise; (c) Gaussian data with power-law noise; (d) Log-normal data with Gaussian noise; (e) Log-normal data with lognormal noise; (f) Log-normal data with power-law noise]
Figure 4. Adjusted Rand index for various data sets simulated via Gaussian and Log-normal distribution with $\rho^*_{in} = 0.9$, $\rho^*_{out} = 0$ and $N_{ran} = 25$. This case refers to a cluster structure with eight clusters of size 5 elements, and one cluster of size 64 elements. For each value of c, 30 data sets were generated in order to get stable statistics for <dR> and adjusted Rand score. Figures (a) and (f) are the same as Fig.1 in the paper and are reported here for completeness and for an easier comparison.
for the DBHT, and the average linkage and the complete linkage methods are respectively reported in Figs.??(b,c,d). The comparison between the adjusted Rand indexes is reported in Fig.??.
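The adjusted Rand index used in these comparisons can be computed from the contingency counts of two partitions. The following is a minimal sketch of the standard definition in plain Python, not the code used for the figures; the label lists are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions given as label lists."""
    n = len(labels_true)
    pair_counts = Counter(zip(labels_true, labels_pred))
    sizes_true = Counter(labels_true)
    sizes_pred = Counter(labels_pred)
    # Number of element pairs placed together in both, in the first,
    # and in the second partition respectively.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in sizes_true.values())
    together_pred = sum(comb(c, 2) for c in sizes_pred.values())
    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (together_both - expected) / (maximum - expected)

# Hypothetical partitions (assumed, for illustration only).
truth = [0, 0, 0, 1, 1, 1]
found = ["a", "a", "b", "b", "c", "c"]
print(adjusted_rand_index(truth, found))
```

The index equals 1 when the two partitions coincide up to relabelling and is close to 0 (possibly negative) for partitions that agree no better than chance.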
Validation
connections and we consider them as the centers of clusters of bubbles. Any bubble $b_i$ connected by a directed path in $\vec{H}_b$ to a converging bubble $b_\alpha$ belongs to cluster $\alpha$. By construction, the bubbles in $\alpha$ form a subtree $\vec{h}_\alpha$ which has only one converging bubble $b_\alpha$ and all edges are directed towards $b_\alpha$. This is a non-discrete clustering of bubbles because there can be multiple directed paths between $b_i$ and two or more converging bubbles $b_\alpha, b_\beta, \ldots$. In this case we say that $b_i$ belongs simultaneously to clusters $\alpha, \beta, \ldots$ and the subtrees $\vec{h}_\alpha, \vec{h}_\beta, \ldots$ are partially overlapping. In Fig.2(ii) the two subtrees converging towards $b_\alpha = b_1$ and $b_\beta = b_4$ are highlighted; it is clear that in this example bubbles $b_2$ and $b_3$ are shared by the two subtrees. A non-discrete clustering can now be assigned to the graph vertex set $V(G)$. Indeed, each vertex $v \in V(G)$ can be associated with the cluster memberships of the bubbles that contain the vertex $v$. This produces a non-discrete clustering partition because bubbles can belong to more than one cluster and also because vertices can belong to more than one bubble.
Subdivision into discrete clusters. Our goal here is to obtain a discrete clustering for $V(G)$ and for this purpose we proceed in two steps (refer to Fig.2 for a schematic overview).
• First: for each converging bubble $b_\alpha$ we consider its set of vertices. Some vertices belong to only one converging bubble and, in this case, they are directly assigned to it (e.g. in Fig.2 vertices $v_1$ and $v_2$ are uniquely assigned to $b_\alpha = b_1$ and vertices $v_6$, $v_8$ are uniquely assigned to $b_\beta = b_4$). Other vertices instead belong to more than one converging bubble (e.g. vertices $v_3$ and $v_4$ in Fig.2) and in this case we look at the 'strength' of attachment between each vertex and each of the bubbles by measuring the quantity
$\chi(v, b_\alpha) = \frac{\sum_{u \in V(b_\alpha)} A_G(v,u)}{3(|V(b_\alpha)| - 2)}$, [2]
and assigning each vertex to the bubble with the largest strength. (The notation $|V(b_\alpha)|$ in Eq.2 indicates the number of vertices in the vertex set of $b_\alpha$ and $3(|V(b_\alpha)| - 2)$ is the number of edges in a bubble.) After this assignment, each converging bubble $\alpha$ has a unique set of vertices $V^0(\alpha)$. Let us note that there can be converging bubbles with an empty set of vertices and, in this case, there will be no clusters associated with them.
• Second: we consider all the other remaining vertices in the graph that have not been assigned to any converging bubble (e.g. vertices $v_5$ and $v_7$ in Fig.2). Each of these non-associated vertices $v$ is assigned to the converging bubble $\alpha$ that has the minimum mean shortest path distance
$\bar{L}(v, \alpha) = \mathrm{mean}\{ l(v,u) \mid u \in V^0(\alpha) \wedge v \in \vec{h}_\alpha \}$ [3]
with respect to all other converging bubbles. Here $l(v,u)$ is the shortest path distance on $G$ from $v$ to $u$ (the smallest sum of distances $d_{r,s}$ over any path between $v$ and $u$).
By uniquely associating each vertex to a converging bubble, we have obtained the discrete partition of the vertex set $V(G)$ into a number of sub-sets $V(\alpha), V(\beta), \ldots$, each respectively associated with the converging bubbles $b_\alpha, b_\beta, \ldots$.
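The first step above (Eq.2) can be sketched as follows. The data layout is an assumption of this example: the weighted adjacency $A_G$ is stored as a dictionary of dictionaries, missing entries count as zero, and the vertex names and weights are hypothetical rather than taken from Fig.2.

```python
def attachment_strength(v, bubble_vertices, adjacency):
    """chi(v, b) = sum_{u in V(b)} A_G(v, u) / (3 * (|V(b)| - 2))."""
    total = sum(adjacency.get(v, {}).get(u, 0.0) for u in bubble_vertices)
    return total / (3 * (len(bubble_vertices) - 2))

def assign_to_strongest_bubble(v, bubbles, adjacency):
    """Name of the converging bubble with the largest strength for vertex v."""
    return max(bubbles, key=lambda name: attachment_strength(v, bubbles[name], adjacency))

# Hypothetical edge weights of vertex v3 (assumed values).
adjacency = {"v3": {"v1": 0.9, "v2": 0.8, "v4": 0.5, "v6": 0.2}}
# Hypothetical vertex sets of two converging bubbles that share v3 and v4.
bubbles = {
    "b1": ["v1", "v2", "v3", "v4"],
    "b4": ["v3", "v4", "v6", "v8"],
}
print(assign_to_strongest_bubble("v3", bubbles, adjacency))
```

The denominator normalises by the number of edges in the bubble, so that a large bubble does not win simply because it offers more potential neighbours.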
[Figure: (a,b) synthetic cross-correlation matrices; (c,d) adjusted Rand index versus $d_R$ for DBHT, k-means on X, k-means on D_cor, k-medoids on D_cor and k-medoids on D_euclid]
Fig. 3. Top (a,b): examples of synthetic matrices of cross-correlations from data series generated using a multivariate Gaussian generator to which power law noise is added. There are five artificial correlated clusters with sizes 4, 8, 16, 32 and 64 respectively. (a) an example with $\alpha = 1.5$ and $c = 0.267$ resulting in $\langle \rho_{in} \rangle \approx 0.20$ and approximately zero inter-cluster correlations. (b) an example with $\alpha = 1.22$ and $c = 0.10$ resulting in $\langle \rho_{in} \rangle \approx 0.20$ and approximately zero inter-cluster correlations. Bottom (c,d): adjusted Rand indexes [23] for the comparison between the partition retrieved by the DBHT clustering method and the 'true' partition. The horizontal axis reports the gap between average intra- and inter-cluster correlations $d_R = \langle \rho_{in} \rangle - \langle \rho_{out} \rangle$ that becomes smaller when: (c) the noise $c$ increases or (d) the exponent $\alpha$ decreases.
Linkage and hierarchy
Once a unique partition of the vertex set into discrete clusters has been obtained, we can investigate how each of these clusters is internally structured and how different clusters gather together into larger aggregate structures. This can be achieved by building a specifically tailored linkage procedure that combines the information from the discrete clusters with the hierarchy from the graph $G$ and allows us to analyze the gathering of nodes at different levels both inside the cluster and between the clusters. This can be formalized by using three distinct levels of linkage.
• Intra-bubble hierarchy. The lowest level at which nodes are gathered in a topological sub-structure of the PMFG is the bubble. Let us therefore start by building within each cluster a linkage procedure at bubble level. For this purpose we must first assign each vertex $v \in V(\alpha)$ to a bubble $b_i$ in the subtree $\vec{h}_\alpha$. Vertices in the converging bubbles have already been assigned to the sets $V^0(\alpha)$. For all remaining vertices, the ones belonging to only one bubble are assigned to that bubble (e.g. vertex $v_7$ in Fig.2), whereas vertices that belong to more than one bubble (e.g. vertex $v_5$ in Fig.2) are assigned to the bubble that maximizes the strength $\chi(v, b_i)$ (Eq.2). In this way, for every cluster $\alpha$ and for each bubble $b_i$ in $\vec{h}_\alpha$ we have a unique vertex set $V^\alpha(b_i)$ on which we can now perform a complete linkage procedure [22] by using the shortest path distances $l(u,v)$ as distance matrix. (See supplementary information for the discussion of the possible use of alternative linkages and distance measures.)
• Intra-cluster hierarchy. The next level in the hierarchy concerns the way in which bubbles are linked together within each cluster $\alpha$. At this level we perform a complete linkage procedure between the bubbles in $\vec{h}_\alpha$ by using the distance matrix
$d_I(b_i, b_j) = \max\{ l(u,v) \mid u \in V^\alpha(b_i) \wedge v \in V^\alpha(b_j) \}$. [4]
• Inter-cluster hierarchy. The highest hierarchical level concerns the way in which clusters gather together. At this
FIG. 1: Pearson's cross-correlation matrices for synthetic data generated via multivariate Gaussian generator with added noise. Partitions with clusters with sizes: 4, 8, 16, 32, 64. First row: Gaussian noise with amplitudes c = 0.77, 2.33, 3.88 respectively (from left to right) and $\rho^*_{intra} = 0.9$. Second row: Gaussian noise with c = 1 and three values of the intra-cluster correlation in $R^*$, $\rho_{intra}$ = 0.9, 0.6 and 0.3 respectively (from left to right). Third row: Power law noise with exponent $\alpha = 1.5$ and amplitudes c = 0.09, 0.27 and 0.41 respectively (from left to right). Fourth row: Power law noise with amplitude c = 0.1 and exponents $\alpha$ = 2.1, 1.4 and 1.2 respectively (from left to right).
B. Comparison between the artificial ‘true’ cluster structure and the one retrieved
from the data
We used the adjusted Rand index [2] to compare the ‘true’ clusters artificially inserted in
the correlation structure and the clusters retrieved from the analysis of the data in presence
FIG. 4: (a) The synthetic data structure $R^*$. (b) Dendrogram associated with the DBHT hierarchical structure. (c) Dendrogram associated with the average linkage. (d) Dendrogram associated with the complete linkage.
of the 16 'true' clusters and achieving also an adjusted Rand index of 0.94. The complete and average linkages, reported in Figs.4(c,d), give a less clear hierarchical structure. This is indeed consistently quantified by the adjusted Rand indexes reported in Fig.4 in the paper, which are consistently lower and misplaced at the wrong number of elements with respect to the ones associated with the DBHT hierarchy.
We tested other partitions and different levels of noise, verifying that the DBHT method consistently delivers good performance in comparison with the other established methods. An example with power law noise with the same parameters as for the analysis in Fig.4 of the main paper (and Fig.4 in the present supplementary information) but with clusters of scaling sizes of 4, 8, 16, 32 and 64 elements respectively is reported in Fig.5(a). The dendrograms for the DBHT, the average linkage and the complete linkage methods are respectively reported in Figs.5(b,c,d). The comparison between the adjusted Rand indexes is reported in Fig.6. The results for another example with clusters of uniform sizes
Dataset: 50 iris plants from each of three different types of iris: (1) Iris Setosa; (2) Iris Versicolour; (3) Iris Virginica. Fisher RA (1936). Annals Eugen., 7:179–188.
Measures: (i) sepal length; (ii) sepal width; (iii) petal length; (iv) petal width.
Artificial time series generated with MVG with a reference correlation structure and added noise.
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
Cluster: energy production & distribution
ALLEGHENY ENERGY INC; AMERICAN ELECTRIC POWER CO INC; CONSTELLATION ENERGY GROUP INC; BROWN-FORMAN CORP; PROGRESS ENERGY INC; CMS ENERGY CORP; MOLSON COORS BREWING CO; DTE ENERGY INC; DOMINION RESOURCES INC/VA; DUKE ENERGY CORP; ENTERGY CORP; FPL GROUP INC; CENTERPOINT ENERGY; NICOR INC; NISOURCE INC; NORTHEAST UTILITIES; XCEL ENERGY INC; FIRSTENERGY CORP; ONEOK INC; PG&E CORP; PPL CORP; EXELON CORP; PINNACLE WEST CAPITAL CORP; PUBLIC SERVICE ENTERPRISE GROUP INC; SCANA GROUP; EDISON INTL; SOUTHERN CO; TECO ENERGY; AMEREN CORP; WISCONSIN ENERGY
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
HESS CORP ANADARKO PETROLEUM CORP APACHE CORP ASHLAND INC CHEVRON CORP EL PASO CORP EQT CORPORATION EXXON MOBIL CORP MASSEY ENERGY CORP FOSTER WHEELER AG HARTMARX CORP JACOBS ENGINEERING GROUP INC MURPHY OIL CORP NOBLE ENERGY INC OCCIDENTAL PETROLEUM CORP QUESTAR CORP SUNOCO INC MARATHON OIL CORP WILLIAMS COS INC AES EOG RESOURCES INC TESORO CORP
ALCOA INC BARRICK GOLD CORP ARCHER-DANIELS-MIDLAND CO NEWMONT MINING CORP NUCOR CORP UNITED STATES STEEL CORP WORTHINGTON INDUSTRIES INC FREEPORT-MCMORAN COPPER & GOLD INC
BAKER HUGHES BJ SERVICES HALLIBURTON CO HELMERICH & PAYNE INC MCDERMOTT INTERNATIONAL INC NABORS INDUSTRIES LTD ROWAN COS INC SCHLUMBERGER LTD SMITH INTERNATIONAL INC ENSCO INTERNATIONAL INC
Clusters: mining; energy exploration & services; oil industry
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
ADVANCED MICRO DEVICES INC ALTERA CORP ANALOG DEVICES APPLIED MATERIALS INC LINEAR TECHNOLOGY CORP LSI CORP MICRON TECHNOLOGY INC MOTOROLA INC NATIONAL SEMICONDUCTOR CORP TERADYNE TEXAS INSTRUMENTS INC XILINX INC KLA-TENCOR CORP
Cluster: semiconductors electronics
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
ADC TELECOMMUNICATIONS INC ADOBE SYSTEMS INC APPLE INC AUTODESK INC BMC SOFTWARE INC CISCO SYSTEMS INC CA INC COMPUTER SCIENCES CORP CORNING INC DELL INC HARRIS CORP HEWLETT-PACKARD CO INTERNATIONAL BUSINESS MACHINES CORP INTEL CORP MICROSOFT CORP MOLEX INC NORTEL NETWORKS CORP NOVELL INC ORACLE CORP PARAMETRIC TECHNOLOGY CORP SUN MICROSYSTEMS INC SYMANTEC CORP EMC CORP/MASSACHUSETTS ELECTRONIC ARTS INC QUALCOMM INC AMPHENOL CORP COMPUWARE CORP TELLABS INC 3COM CORP
Cluster: computers telecommunications
401 stocks on US market 1996-2009 (Reuters) PMFG 16 clusters
FIRST HORIZON NATIONAL CORP WACHOVIA CORP AFLAC INC THE ALLSTATE CORPORATION BANK OF NEW YORK MELLON CORP/THE CHUBB CORP CIGNA CORP CINCINNATI FINANCIAL CORP MILACRON INC COMERICA INC FIFTH THIRD BANCORP FLEETWOOD ENTERPRISES INC HUMANA INC HUNTINGTON BANCSHARES INC/OH KEYCORP LINCOLN NATIONAL CORP LOEWS CORP MARSHALL & ILSLEY CORP NATIONAL CITY CORP BANK OF AMERICA CORP TENET HEALTHCARE CORP NEWELL RUBBERMAID NORTHERN TRUST CORP WELLS FARGO & CO ALTRIA GROUP INC PNC FINANCIAL SERVICES GROUP INC PROGRESSIVE CORP/THE STATE STREET CORP TRAVELERS COS INC/THE SLM CORPORATION SUNTRUST BANKS, INC TORCHMARK CORP TYSON FOODS INC UNITEDHEALTH GROUP INC UNUM GROUP UST INC MBIA INC WASHINGTON MUTUAL INC ACE LTD XL CAPITAL LTD AMBAC FINANCIAL GROUP INC MGIC INVESTMENT CP BB&T CORP US BANCORP INC SYNOVUS FINANCIAL HARTFORD FINANCIAL SERVICES GROUP INC
Cluster: finance & banking
Clustering & Hierarchy
[Figure: hierarchical clustering of the 401 stocks, axis ticks 0 to 16. Cluster labels: finance & banking; banks; constructions; computers telecommunications; semiconductors electronics; pharmaceutical health; beauty (label appears twice); retailers; food; car & transportation; retailers & consumer products; railroads; energy production & distribution; mining; oil industry; energy exploration & services]
Kruskal-Wallis test p-values:
Energy: 0.000000
Financial: 0.000000
Technology: 0.000000
Conglomerates: 0.550930
Consumer Cyclical: 0.000001
[Figure: log-returns analysed over a moving window (window size) with exponential smoothing]
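A minimal sketch of how a correlation over one window with exponential smoothing might be computed. The slide only names the ingredients (window size, log-returns, exponential smoothing); the weighting scheme below, with weights growing exponentially towards the most recent observation and a characteristic time `theta`, is an assumption, and all function names are hypothetical.

```python
import math

def log_returns(prices):
    """Log-returns r(t) = log p(t+1) - log p(t) of a price series."""
    return [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]

def exp_weights(window, theta):
    """Normalised weights proportional to exp((t - window + 1) / theta).

    The last observation in the window gets the largest weight
    (assumed smoothing scheme, not taken from the slides).
    """
    raw = [math.exp((t - window + 1) / theta) for t in range(window)]
    total = sum(raw)
    return [w / total for w in raw]

def weighted_corr(x, y, weights):
    """Weighted Pearson correlation of two equally long series."""
    mx = sum(w * a for w, a in zip(weights, x))
    my = sum(w * b for w, b in zip(weights, y))
    cov = sum(w * (a - mx) * (b - my) for w, a, b in zip(weights, x, y))
    vx = sum(w * (a - mx) ** 2 for w, a in zip(weights, x))
    vy = sum(w * (b - my) ** 2 for w, b in zip(weights, y))
    return cov / math.sqrt(vx * vy)

# Toy usage: two short return series inside one window of 5 observations.
w = exp_weights(window=5, theta=2.0)
rho = weighted_corr([0.1, -0.2, 0.05, 0.3, -0.1],
                    [0.2, -0.1, 0.00, 0.4, -0.3], w)
```

With a large `theta` the weights become nearly uniform and the result approaches the ordinary Pearson coefficient over the window.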
T. Aste, W. Shaw and T. Di Matteo, "Correlation structure and dynamics in volatile markets", New Journal of Physics 12 (2010) 085009, 1-21.
T. Di Matteo, F. Pozzi and T. Aste, "The use of dynamical networks to detect the hierarchical organization of financial market sectors", Eur. Phys. J. B 73 (2010) 3-11.
Distribution and dynamics of risk
401 firms on the US equity market from 01/01/96 to 01/01/2009; data analysed over a moving window of 250 days, advanced in steps of 20 days
[Figure: time series, 1998-2008; vertical axis from 0.1 to 0.5]
Correlation increases during periods of market instability. This is a manifestation of the herd effect. Diversification becomes more challenging during times of crisis.
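The moving-window measurement behind this observation can be sketched as follows: slide a window of 250 days along the log-return series in steps of 20 days and record the average pairwise correlation in each window. This is an illustrative sketch with unweighted Pearson correlations; the function names are hypothetical and the toy data at the bottom are not market data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Plain Pearson correlation of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rolling_mean_correlation(returns, window=250, step=20):
    """Average off-diagonal correlation in each moving window.

    `returns` is a list of return series, one per stock, all of equal length.
    """
    length = len(returns[0])
    means = []
    for start in range(0, length - window + 1, step):
        chunk = [r[start:start + window] for r in returns]
        pairs = list(combinations(chunk, 2))
        means.append(sum(pearson(a, b) for a, b in pairs) / len(pairs))
    return means

# Toy usage with three synthetic series and a short window.
x = [math.sin(i) for i in range(10)]
y = [2 * v + 1 for v in x]   # perfectly correlated with x
z = [-v for v in x]          # perfectly anti-correlated with x
series = rolling_mean_correlation([x, y, z], window=5, step=2)
```

A rise of this average during turbulent periods is what the slide describes as the herd effect.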
[Figure: mean distances, 1998-2008; vertical axis from 0 to 6. Legend: All, Energy, Financial, Technology, Conglomerate, Consumer Cyclical, DBHT clusters]
[Figure: average distance between vertices, 1998-2008]
[Figure: four panels as a function of time t, 1996-2010: (a) relative betweenness centrality, (b) relative degree, (c) relative eccentricity, (d) relative closeness]
Loss of centrality of financial sector
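As an illustration of a "relative" centrality measure, the sketch below divides the mean degree of the nodes in one sector by the mean degree over the whole graph. This normalisation is an assumption made for the example (the slides do not define it), and the function name and node names are hypothetical.

```python
from collections import defaultdict

def relative_degree(edges, sector_nodes):
    """Mean degree of `sector_nodes` divided by the mean degree of all nodes.

    `edges` is an iterable of undirected (u, v) pairs from a filtered graph.
    A value above 1 means the sector is more connected than average.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    overall = sum(degree.values()) / len(degree)
    sector = sum(degree[n] for n in sector_nodes) / len(sector_nodes)
    return sector / overall

# Toy usage: a star graph whose hub plays the role of a central sector.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
hub_score = relative_degree(star, ["hub"])
leaf_score = relative_degree(star, ["a", "b"])
```

Tracking such a ratio window by window would show a sector drifting from the centre of the network towards its periphery as the ratio falls.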
Number of clusters
[Figure: number of clusters, 1998-2008; vertical axis from 0 to 18. Legend: All WP, All, Energy, Financial, Technology, Conglomerate, Consumer Cyclical]
Distribution of risk: correlations
A good diversification must search for poorly correlated assets.
Inside each of the DBHT clusters we observe larger correlations than average. On the contrary, by choosing stocks across clusters we measure smaller correlations than average.
[Figure: mean correlations, 1998-2008; vertical axis: average correlation, from 0.1 to 0.6. Legend: selected stocks inside industrial sectors; inside DBHT clusters (dyn, fix); in the whole system (401 stocks); across DBHT clusters (16 stocks)]
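The comparison of correlations inside and across clusters can be sketched as follows, given a correlation matrix and a cluster label per stock. The function name and the toy numbers are hypothetical; the slide's "across" curve uses 16 stocks chosen from different clusters, whereas this sketch simply averages over all pairs that straddle two clusters.

```python
from itertools import combinations

def mean_correlation_by_cluster(corr, labels):
    """Return (mean within-cluster, mean across-cluster) pairwise correlation.

    `corr` is a symmetric matrix as nested lists; `labels[i]` is the
    cluster of stock i.
    """
    inside, across = [], []
    for i, j in combinations(range(len(labels)), 2):
        (inside if labels[i] == labels[j] else across).append(corr[i][j])
    return sum(inside) / len(inside), sum(across) / len(across)

# Toy usage: four stocks in two clusters.
corr = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]
within, between = mean_correlation_by_cluster(corr, ["A", "A", "B", "B"])
```

In this toy matrix the within-cluster average exceeds the across-cluster one, which is the pattern the slide reports for the DBHT clusters.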
Distribution of risk: returns variability
As a proxy for risk we can first compute the average standard deviation of the log-returns.
Inside each of the DBHT clusters we observe larger than average variability. On the contrary, risk can be hedged by building a portfolio that chooses stocks across clusters.
[Figure: mean standard deviation, 1998-2008; vertical axis: average standard deviation of log-return, from 0.005 to 0.025. Legend: selected stocks inside industrial sectors; inside DBHT clusters (dyn, fix); in the whole system (401 stocks); across DBHT clusters (16 stocks)]
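The risk proxy named here, the average standard deviation of the log-returns over a group of stocks, might be computed as in this sketch. The function names are hypothetical and the sample standard deviation (n - 1 denominator) is an assumed convention.

```python
import math
import statistics

def log_returns(prices):
    """Log-returns of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def average_volatility(price_series):
    """Mean over stocks of the sample standard deviation of log-returns."""
    return statistics.mean(
        statistics.stdev(log_returns(p)) for p in price_series
    )

# Toy usage: one steadily growing price and one oscillating price.
steady = [1.0, math.e, math.e ** 2, math.e ** 3]
choppy = [1.0, 2.0, 1.0, 2.0, 1.0]
risk = average_volatility([steady, choppy])
```

Applying the same function to the stocks of one cluster and to a basket drawn across clusters gives the two quantities the slide compares.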
Distribution of risk: gains and losses
Different stock gatherings give comparable average returns.
[Figure: gains, 1998-2008; vertical axis: average cumulative return, from 0 to 2. Legend: inside DBHT clusters (dyn, fix); in the whole system (401 stocks); selected stocks inside industrial sectors; across DBHT clusters (16 stocks)]
Distribution of risk
Risk does not distribute uniformly: it is larger inside clusters and smaller across clusters. The variability inside clusters is similar to that inside industrial sectors.
Efficient diversification
In terms of investment strategies, an efficient diversification demands stocks characterized by both low correlations and high expected returns. These stocks are located at the periphery of the PMFG.
5.5. PERFORMANCE OF OPTIMAL PORTFOLIOS CONSTRUCTED OVER FILTERED GRAPHS121
Figure 5.5: Performance of Optimal Portfolios obtained with uniform weights (u) assigned to subsets of stocks: MKT, RAND, BEST and PMFG peripheral nodes ((D + X)−). The number of stocks used for RAND, BEST and PMFG is 5 for the upper left panel, 10 for the upper right panel, 20 for the lower left panel, 30 for the lower right panel. PMFG is the second curve in the upper left panel and it is the first in the other three panels; RAND is the last in the upper panels and the second last in the lower panels (swapping positions with BEST which is second last in the upper panels and last in the lower panels); the position of MKT doesn’t change as it is obtained by using all of the 300 daily stocks. By increasing the number of companies, the performance of PMFGs slightly improves.
[Axis label: average cumulative return / std cumulative return (for different portfolios generated over 7071 times)]
(from F. Pozzi draft PhD thesis, ANU 2012, unpublished)
F Pozzi
We investigate efficient portfolio diversification by selecting stocks from the periphery of the PMFG built from past data and verifying the average performance in the future.
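The evaluation step can be sketched as follows: given a selection of stocks made on past data, compute the cumulative return of a uniformly weighted portfolio on future data, then summarise many such portfolios by the ratio of the average to the standard deviation of their cumulative returns, as the figure axes appear to report. All names and numbers below are hypothetical toy values; rebalancing to uniform weights every period and using simple (not log) returns are assumptions made for this sketch.

```python
from statistics import mean, pstdev


def cumulative_return(future_returns, selection):
    """Cumulative return of a uniform-weight portfolio over the future window.

    future_returns maps ticker -> list of simple per-period returns.
    Assumes the portfolio is rebalanced to uniform weights every period.
    """
    n_periods = len(future_returns[selection[0]])
    growth = 1.0
    for t in range(n_periods):
        period_return = mean(future_returns[s][t] for s in selection)
        growth *= 1.0 + period_return
    return growth - 1.0


def performance(cumulative_returns):
    """Average cumulative return divided by its standard deviation."""
    return mean(cumulative_returns) / pstdev(cumulative_returns)


# Invented out-of-sample returns for three hypothetical stocks.
future = {"A": [0.10, 0.00], "B": [0.00, 0.10], "C": [-0.10, 0.00]}
selections = [["A", "B"], ["A", "C"], ["B", "C"]]
cums = [cumulative_return(future, sel) for sel in selections]
print(cums, performance(cums))
```

In a real study the selections would come from graphs built on successive past windows, so that each cumulative return is measured strictly after the data used to choose the stocks.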
122 CHAPTER 5. OPTIMAL PORTFOLIO STRATEGIES
Figure 5.6: Performance of Optimal Portfolios obtained by solving the Markowitz Problem without allowing for short selling (ns) for subsets of stocks: MKT, RAND, BEST and PMFG peripheral nodes ((D + X)−). The number of stocks used for RAND, BEST and PMFG is 5 for the upper left panel, 10 for the upper right panel, 20 for the lower left panel, 30 for the lower right panel. RAND is always below the other three curves. The position of MKT doesn’t change as it is obtained by using all of the 300 daily stocks. PMFG is the second curve in the upper panel and the first in the lower. BEST is always slightly under PMFG. By increasing the number of companies, the performance of PMFGs greatly improves.
Figure 5.7: Performance of Optimal Portfolios obtained by solving the Markowitz Problem with short selling (s) for subsets of stocks: MKT, RAND, BEST and PMFG peripheral nodes ((D + X)−). The number of stocks used for RAND, BEST and PMFG is 5 for the upper left panel, 10 for the upper right panel, 20 for the lower left panel, 30 for the lower right panel. PMFG is always above the other three curves and RAND always below. The position of MKT doesn’t change as it is obtained by using all of the 300 daily stocks. BEST is always slightly under PMFG. By increasing the number of companies, the performance of PMFGs greatly improves.
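For readers unfamiliar with the two variants named in these captions, one common form of the Markowitz problem is the minimum-variance version sketched below. This is an assumption for illustration: the thesis may use a different formulation, for instance one with a target-return constraint. Here w is the vector of portfolio weights, \Sigma the covariance matrix of the selected stocks' returns, and \mathbf{1} a vector of ones.

```latex
\text{(s) short selling allowed:}\quad
\min_{w}\; w^{\top}\Sigma\, w
\quad\text{subject to}\quad \mathbf{1}^{\top} w = 1,
\qquad\Longrightarrow\qquad
w^{*} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}}

\text{(ns) no short selling:}\quad
\min_{w}\; w^{\top}\Sigma\, w
\quad\text{subject to}\quad \mathbf{1}^{\top} w = 1,\;\; w_i \ge 0 \ \text{for all } i
```

The (s) case has the closed-form solution shown, provided \Sigma is invertible. The (ns) case generally has none and is solved numerically as a quadratic program.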
Figure 5.9: Performance of Optimal Portfolios obtained by solving the Markowitz Problem without allowing for short selling (ns) for subsets of stocks: MKT, RAND, BEST and PMFG central nodes ((D + X)+), analogously with Figure 5.6. PMFG is always the lowest curve.
Markowitz portfolios from most peripheral vertices
Markowitz portfolios from most central vertices
Figure 5.10: Performance of Optimal Portfolios obtained by solving the Markowitz Problem with short selling (s) for subsets of stocks: MKT, RAND, BEST and PMFG central nodes ((D + X)+), analogously with Figure 5.7. PMFG is always the lowest curve.
(from F. Pozzi draft PhD thesis, ANU 2012, unpublished)
[Figure: stocks labelled by ticker symbol]
Risk lies at the core. We have observed that selections of stocks from the center of the PMFG give consistently bad performances. Conversely, stocks gathered from the periphery consistently deliver better than average performances.
Which are the genes that are co-expressed?
Gene-expression. A DNA microarray is a collection of short sections of DNA, regularly arranged on a support, that are used to hybridize a target sample. The profiling method fluorescently labels an RNA sample during reverse transcription, when RNA is converted into complementary DNA, and can in this way detect the expression of a given gene.
[Figure: four panels; horizontal axis from 500 to 4000, vertical axis from −4 to 4]
Alizadeh et al. Nature 403 (2000) 503–511.
de Souto et al. BMC Bioinformatics 9 (2008) 497.
Which gene is expressed in which sample?
Which is the relation with physiological properties?
Won Min Song
[Figure: samples labelled by the sample-type codes listed in the legend below]
D: Diffuse Large B−cell Lymphoma
D9: DLCL−0009
D41: DLCL−0041
DC: DLBCL cell line
F: Follicular Lymphoma
C: Chronic Lymphocytic leukemia
A: Activated Blood B
G: Germinal Centre B
R: Resting Blood B
TC: Transformed Cell line
T: Activated/Resting Blood T
LN: Lymph Node
To: Tonsil
Clusters shown: cluster 1 to cluster 11
W.M. Song, T. Di Matteo and T. Aste, “Hierarchical information clustering by means of topologically embedded graphs”, PLoS ONE, 7 (2012) e31929
Gene cluster ‘44’ (significant for sample-cluster ‘1’). Key gene: CDK1 (over-expressed), commonly over-expressed in DLBCL cancer types.
Gene cluster ‘4’ (significant for sample-cluster ‘4’). Key gene: SYK (over-expressed), a promising target gene for antitumor therapy; inhibition of SYK increases the chance of survival.
Gene cluster ‘1’ (significant for sample-cluster ‘5’). Key gene: TGF-B1 (under-expressed), regulates proliferation.
Gene cluster ‘4’ (significant for sample-cluster ‘7’). Key genes: CDKN1B/p27Kip1 and CDKN2D/p19, tumor suppressor genes (under-expressed).
Gene cluster ‘125’ (significant for sample-cluster ‘9’). Key gene: IL-6 (over-expressed), specific to some DLBCL and related to STAT3 activation.
Gene cluster ‘102’ (significant for sample-cluster ‘11’). Contains IRF1 (under-expressed), a mediator of cell fate and tumor suppressor.
Gene-Ontology analysis
Coexistence of local specialized activity and cross-scale global organization in human brain
Henrik J. Jensen
Conclusions
Information and complex datasets: stock prices
Redundancy
Dependency
Information Filtering: clustering
Information Filtering: hierarchy
A network approach
Embedded Graphs
EMFG & PMFG
Retrieving Clustering and Hierarchies: DBHT method
A key-study: Financial Equity Market
Knowledge network, gene expression, brain activity