financial networks vi - correlation networks
Sixth lecture of a PhD-level course on "Financial Networks" at the Center for Financial Studies at Goethe University, Frankfurt.
Dr. Kimmo Soramäki
Founder and CEO
FNA, www.fna.fi
Center for Financial Studies at the Goethe University
PhD Mini-course
Frankfurt, 25 January 2013
Financial Networks
VI. Correlation Networks
Agenda
V. Inferring Links
• Prices and returns
• Controlling for common factors
• Correlation and dependence
• Significant correlations
• Multiple Comparisons
VI. Correlation Networks
• Distance and Hierarchical Clustering
• Minimum Spanning Tree & PMFG
• Other filtering
• Layout algorithms
Hierarchical structure in financial markets
• Mantegna (1999): "Obtain the taxonomy of a portfolio of stocks traded in a financial market by using the information of time series of stock prices only"
• Correlations cannot be used as a metric because they do not fulfil the metric axioms:
– non-negativity: d(x, y) ≥ 0
– coincidence: d(x, y) = 0 if and only if x = y
– symmetry: d(x, y) = d(y, x)
– subadditivity: d(x, z) ≤ d(x, y) + d(y, z)
• Transforming correlations into Gower's (1966) distance, d(x, y) = √(2 (1 − ρ(x, y))), gives a quantity that does fulfil them; e.g. a correlation of −1 maps to 2, 0 to 1.41 and 1 to 0
• The resulting distance matrix can be used to look for a hierarchical structure of the assets
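A minimal Python sketch of the correlation-to-distance step, assuming the distance takes the form d = sqrt(2 (1 − ρ)), which reproduces the values quoted above (−1 maps to 2, 0 to 1.41, 1 to 0). The function names and the 3-asset correlation matrix are illustrative assumptions, not part of the FNA platform.

```python
import math


def correlation_to_distance(rho):
    """Map a correlation in [-1, 1] to the distance sqrt(2 * (1 - rho))."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return math.sqrt(2.0 * (1.0 - rho))


def distance_matrix(corr):
    """Apply the transformation element-wise to a square correlation matrix."""
    return [[correlation_to_distance(r) for r in row] for row in corr]


# Hypothetical 3-asset correlation matrix, for illustration only.
corr = [
    [1.0, 0.5, -1.0],
    [0.5, 1.0, 0.0],
    [-1.0, 0.0, 1.0],
]
dist = distance_matrix(corr)
```

Perfectly correlated assets end up at distance 0 and perfectly anti-correlated assets at the maximum distance of 2, so "close" in the distance matrix means "moves together".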
Minimum Spanning Tree
A spanning tree of a graph is a subgraph that:
1. is a tree and
2. connects all the nodes together
The length of a tree is the sum of the lengths of its links. A minimum spanning tree (MST) is a spanning tree with the shortest length.
MST reflects the hierarchical structure of the correlation matrix
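The definition above can be sketched with Kruskal's algorithm: sort the links by length and add each one unless it would close a loop. This is an illustrative Python sketch, not the implementation behind the FNA command used later in the lecture; the four assets and their distances are made up.

```python
def minimum_spanning_tree(nodes, dist):
    """Kruskal's algorithm: return the MST links as (length, u, v) tuples.

    nodes: list of labels; dist: dict mapping (u, v) pairs to link lengths.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the component representative (path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for (u, v), length in sorted(dist.items(), key=lambda item: item[1]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the link joins two components, so no loop
            parent[root_u] = root_v
            tree.append((length, u, v))
    return tree


# Hypothetical distances between four assets, for illustration only.
nodes = ["A", "B", "C", "D"]
dist = {
    ("A", "B"): 0.4, ("A", "C"): 1.2, ("A", "D"): 1.3,
    ("B", "C"): 0.9, ("B", "D"): 1.1, ("C", "D"): 0.5,
}
mst = minimum_spanning_tree(nodes, dist)
tree_length = sum(length for length, _, _ in mst)
```

With n nodes the tree always has n − 1 links; here it keeps A–B, C–D and B–C and drops the three longest links.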
MST and Hierarchical Structure
Source: R.N. Mantegna (1999). Hierarchical structure in financial markets, Eur. Phys. J. B 11, 193-197
Single Linkage Clustering
• A method for hierarchical clustering
• Clusters based on similarity or distance
• SLINK algorithm
R. Sibson (1973). SLINK: an optimally efficient algorithm for the single-link cluster method. The Computer Journal (British Computer Society) 16 (1): 30–34.
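To make the single-link rule concrete, here is a naive agglomerative sketch in Python. It is deliberately not SLINK itself (which reaches the same hierarchy far more efficiently); it only shows the rule that the distance between two clusters is the distance between their closest members. Asset labels and distances are hypothetical.

```python
def single_linkage(labels, dist):
    """Naive agglomerative single-linkage clustering.

    dist(a, b) returns the distance between two labels. Returns the merges
    as (distance, cluster_a, cluster_b) in the order they are performed.
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: cluster distance is that of the closest pair.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical distances between four assets, for illustration only.
table = {
    ("A", "B"): 0.4, ("A", "C"): 1.2, ("A", "D"): 1.3,
    ("B", "C"): 0.9, ("B", "D"): 1.1, ("C", "D"): 0.5,
}


def lookup(a, b):
    """Symmetric lookup into the distance table."""
    return table.get((a, b), table.get((b, a)))


merges = single_linkage(["A", "B", "C", "D"], lookup)
merge_heights = [d for d, _, _ in merges]
```

The merge heights are exactly the link lengths of the minimum spanning tree of the same distances, which is the sense in which the MST reflects the hierarchical structure.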
Example
# build network from correlations
buildbycorrelationd -file daxreturns-2011-recon.csv -missing Alert -preserve false
# calculate distance
corrdistance -p correlation -method gower
# calculate single linkage clustering
slink -p corrdistance
# create heatmaps
heatmap -sortv vertex_id -p correlation -symmetric true -cellsizedefault 13 -transition 0 -cellhover correlation -palette darkblue-lightgray-darkred -colordomain (-1)-1 -saveas daxheat-slink-Y
Unordered, Principal Component Removed
Ordered by Cluster, Principal Component Removed
Radial Tree Layout
• Calculates coordinates for radial layout as presented in Bachmaier, Brandes and Schlieper (2005)
• The layout allows definition of each arc length
• Specific parameters of command radialtreeviz:
– Arc length property (-p) : Arc property defining arc length. Optional.
– Root vertex (-rootvertex) : Id of root vertex. The root vertex is placed in the middle of the screen. Due to the repositioning of the tree, nodes may be placed outside the canvas in other than the first network. Optional.
– Optimal rotation (-rotation) : Rotates layout to minimize sum of vertex distances between subsequent networks. Optional. By default 'false'.
– Scaling (-scale) : Scale of visualization: value/pixel.
Christian Bachmaier, Ulrik Brandes, and Barbara Schlieper (2005). Drawing Phylogenetic Trees. Department of Computer & Information Science, University of Konstanz, Germany
Putting it all together
# build network from correlations
buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
# calculate distance
corrdistance -p correlation -method gower
# calculate minimum spanning tree
minst -p corrdistance
# drop arcs not in MST
dropa -e minst=false
# calculate 1 - absolute correlation
calcap -e 1-abs(correlation) -saveas vizdistance
# create radial tree visualization
radialtreeviz -p vizdistance -vlabel vertex_id -vsize stdev -transition 3000 -ahover correlation -saveas daxviz-MST
Asset Trees
Links between nodes reflect 'backbone' correlations
- short link = high correlation
- long link = low correlation
Size of node reflects volatility (standard deviation) of returns
Circle Tree Visualization
• Calculates coordinates for circle tree layout as presented in Bachmaier, Brandes and Schlieper (2005)
• As before but instead of radialtreeviz:
circletreeviz -vlabel vertex_id -vsize stdev -transition 3000 -ahover correlation -saveas daxviz-MST-circle
Planar Maximally Filtered Graph
• A complex graph with loops and cliques of up to 4 elements. It can be drawn on a planar surface without link crossings.
• MST is contained in PMFG
M. Tumminello, T. Aste, T. Di Matteo and R. N. Mantegna (2005). A Tool for Filtering Information in Complex Systems. PNAS vol. 102 no. 30 pp. 10421–10426
Node size scales with degree
PMFG Command
# build network from correlations
buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
# calculate distance
corrdistance -p correlation -method gower
# calculate PMFG
pmfg -p corrdistance
# drop arcs not in PMFG
dropa -e pmfg=false
# calculate absolute correlation
calcap -e abs(correlation) -saveas vizdistance
# calculate degree
degree
# create network visualization
frviz -vlabel vertex_id -vsize stdev -atransparency vizdistance -ahover correlation -transition 3000 -arrows false -saveas daxviz-PMFG
Partial Correlation
• Measures the degree of association between two random variables, with the effect of a set of controlling variables removed
• What is the direct relationship between Adidas and Allianz, controlling for BASF, BAYER, ... ?
• We build regression models for Adidas and Allianz on the other stocks and look at the correlation of their model residuals (i.e. what is left unexplained by the other factors) -> Partial correlation
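The residual idea can be sketched in Python for the simplest case of a single control variable z (the lecture controls for all other stocks at once, which would need a multiple regression). The return series below are made up for illustration, and the function names are assumptions rather than FNA commands.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation of two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def residuals(ys, zs):
    """Residuals of an OLS regression of ys on zs (with intercept)."""
    mz, my = mean(zs), mean(ys)
    beta = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
            / sum((z - mz) ** 2 for z in zs))
    alpha = my - beta * mz
    return [y - (alpha + beta * z) for y, z in zip(ys, zs)]


def partial_correlation(xs, ys, zs):
    """Correlation of x and y after removing what z explains in each."""
    return pearson(residuals(xs, zs), residuals(ys, zs))


# Made-up daily returns: z plays the role of the common factor.
z = [0.010, -0.020, 0.015, 0.030, -0.010, 0.005]
x = [0.012, -0.018, 0.020, 0.025, -0.015, 0.010]
y = [0.008, -0.025, 0.010, 0.035, -0.005, 0.000]

raw = pearson(x, y)
partial = partial_correlation(x, y, z)
```

For one control variable the result coincides with the textbook formula (r_xy − r_xz r_yz) / sqrt((1 − r_xz²)(1 − r_yz²)), which gives a quick check on the residual-based computation.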
Example
# build network from correlations
buildbypartialcorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -preserve false
# show as heatmap
heatmap -sortv vertex_id -p partial_correlation -symmetric true -cellsizedefault 13 -transition 0 -cellhover partial_correlation -palette darkblue-lightgray-darkred -colordomain (-1)-1 -saveas daxheat-partial-Y
Correlations
Partial Correlations
NETS
• Network Estimation for Time-Series
• Forthcoming paper by Barigozzi and Brownlees
• Estimates an unknown network structure from multivariate data
• Captures both contemporaneous and serial dependence (partial correlations and lead/lag effects)
Correlation filtering
Balance between too much and too little information
One of many methods to create networks from correlation/distance matrices
– PMFGs, Partial Correlation Networks, Influence Networks, Granger Causality, NETS, etc.
New graph-, information-theory-, economics- and statistics-based models are being actively developed
PMFG
Influence Network
Sammon’s Projection
Proposed by John W. Sammon in IEEE Transactions on Computers 18: 401–409 (1969)
A nonlinear projection method to map a high dimensional space onto a space of lower dimensionality.
Example: [Figure: projection of the Iris data set; classes Iris Setosa, Iris Versicolor, Iris Virginica]
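How well a low-dimensional layout preserves the original distances is measured by Sammon's stress, E = (1 / Σ d*) Σ (d* − d)² / d*, summed over all pairs, where d* is the original distance and d the distance in the layout. The Python sketch below only evaluates this error for a given layout; it does not perform the iterative minimisation, and the three-asset example is hypothetical.

```python
import math


def sammon_stress(target, coords):
    """Sammon's stress of a layout.

    target: square matrix of original distances (all off-diagonal > 0).
    coords: list of coordinate tuples in the low-dimensional layout.
    """
    n = len(coords)
    total = 0.0  # sum of original distances over all pairs
    error = 0.0  # sum of squared errors, each scaled by the original distance
    for i in range(n):
        for j in range(i + 1, n):
            d_star = target[i][j]
            d = math.dist(coords[i], coords[j])
            total += d_star
            error += (d_star - d) ** 2 / d_star
    return error / total


# Three hypothetical assets that are all at distance 1 from each other.
target = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]
line = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

good = sammon_stress(target, triangle)  # equilateral triangle fits exactly
bad = sammon_stress(target, line)       # a straight line distorts one pair
```

Dividing each squared error by the original distance gives small distances more weight, so the layout tries hardest to keep close neighbours close.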
Example
# build network from correlations
buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
# calculate distance
corrdistance -p correlation -method gower
# calculate Sammon layout
sammonlayouta -p corrdistance -saveerror true
# sum up error
sumaforv -p error -saveas error
# create Sammon visualization
sammonaviz -p corrdistance -vlabel vertex_id -vsize error -transition 3000 -ahover error -saveas daxviz-Sammon-Y
Node size reflects error in layout
Tutorials
• Tutorial 1 – Loading Networks into FNA
• Tutorial 2 – Managing Data in FNA
• Tutorial 3 – Network Summary Measures
• Tutorial 4 – Centrality Measures
• Tutorial 5 – Connectedness and Components
• Tutorial 6 – Network Visualization
• Tutorial 7 – Correlation Networks
• Tutorial 8 – Payment System Simulations
• Tutorial 9 – Analyzing Cross-Border Banking Exposures
Blog, Library and Demos at www.fna.fi
Dr. Kimmo Soramäki [email protected]: soramaki