An Algorithm for Measuring Optimal Connections in Large Valued Networks
Song Yang (Sociology) and Henry Hexmoor (Computer Science)
University of Arkansas
Preparation of this presentation benefited from cogent comments by Jim Hollander.
Binary Distance
• In binary graphs, path distance is normally used to indicate the optimal connections between a pair of nodes. This solution assumes that intermediaries are costly.
• If more intermediaries are necessary to connect a pair of actors, they may extract higher commissions for their services, distort the information content exchanged, and increase the time required to complete a transaction.
Example for the dyad AB

Path         Path length   Optimal connection (path length)
A-B          1             1
A-E-B        2             N/A
A-E-D-C-B    4             N/A
[Figure: binary graph on nodes A, B, C, D, and E; all six lines have value 1.]
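Binary distance for the example can be computed with a breadth-first search. The sketch below is illustrative only: the list of lines is an assumption inferred from the three AB paths in the table (the figure itself does not survive here), and the name `binary_distance` is hypothetical.

```python
from collections import deque

# Assumed lines, inferred from the three AB paths listed in the example.
EDGES = [("A", "B"), ("A", "E"), ("E", "B"), ("E", "D"), ("D", "C"), ("C", "B")]

def binary_distance(edges, source, target):
    """Return the number of lines on a shortest path, or None if unreachable."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:  # first visit gives the shortest distance
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None

print(binary_distance(EDGES, "A", "B"))  # 1
print(binary_distance(EDGES, "A", "D"))  # 2
```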
VALUED GRAPHS
• A valued graph is a graph whose lines carry numerical values indicating the intensity of the relationship in each dyad.
• Examples include volumes of communication, levels of friendship and trust, and dollar amounts of economic transactions.
Optimal Connections in Valued Graphs
• Previous researchers propose a solution to measure optimal connections in valued graphs. Peay (1980) states that path value, defined as the smallest value attached to any line in a path, indicates the optimal path between a pair of nodes.
Problems
• Peay's path value solution leaves two problems open:
• How should the path value (optimal connection) be determined when multiple paths, and therefore multiple path values, exist between two nodes?
• How should the transaction costs of exchanges involving many go-betweens be accounted for?
Example for the dyad AB

Path         Path value (optimal connection)
A-B          1
A-E-B        3
A-E-D-C-B    2
[Figure: valued graph on nodes A, B, C, D, and E with line values 1, 2, 3, 3, 4, and 6.]
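Peay's path value can be sketched as the minimum line value along a path. The assignment of the six values to specific lines below is an assumption, since the figure is not recoverable from this text; it was chosen only because it reproduces the path values in the table. The names `VALUES` and `path_value` are hypothetical.

```python
# Assumed line values (one assignment consistent with the table above).
VALUES = {
    frozenset("AB"): 1, frozenset("AE"): 3, frozenset("EB"): 3,
    frozenset("ED"): 4, frozenset("DC"): 2, frozenset("CB"): 6,
}

def path_value(path, values):
    """Smallest line value along a path given as a sequence of node labels."""
    return min(values[frozenset((u, v))] for u, v in zip(path, path[1:]))

for path in ("AB", "AEB", "AEDCB"):
    print(path, path_value(path, VALUES))
```

The three AB paths yield three different path values, which is the ambiguity raised in the first problem above.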
Our Solution
• We argue that including binary distance is crucial for measuring path strength in a valued graph, because it accounts for the costs (in time, energy, or decay of information) that indirectly connected dyads incur when they reach one another through varying numbers of intermediaries.
APV
• The Average Path Value (APV) between nodes $n_i$ and $n_j$ is the ratio of path value to distance:

$$APV_{ij} = \frac{M_{ij}}{D_{ij}}$$

where $M_{ij}$ is the path value and $D_{ij}$ is the binary distance of the path.
• A pair of nodes may be connected by multiple paths and may therefore have multiple APVs. We suggest that the highest APV indicates the optimal connection between the pair.
• The optimal connection thus permits the highest volume of transactions, messages, contracts, treaties, or friendships after controlling for the binary distance between the two nodes.
Path of AB   Binary distance   Path value   APV
AB           1*                1            1.00
AEB          2                 3*           1.50*
AEDCB        4                 2            0.50

* Best value under each criterion.
[Figure: valued graph on nodes A, B, C, D, and E with line values 1, 2, 3, 3, 4, and 6.]
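The APV column can be reproduced with a few lines of code. This is a minimal sketch, assuming one assignment of the six line values that is consistent with the table (the figure itself is not recoverable); the names `VALUES` and `apv` are hypothetical.

```python
# Assumed line values (one assignment consistent with the table above).
VALUES = {
    frozenset("AB"): 1, frozenset("AE"): 3, frozenset("EB"): 3,
    frozenset("ED"): 4, frozenset("DC"): 2, frozenset("CB"): 6,
}

def apv(path, values):
    """Average path value: smallest line value divided by the number of lines."""
    lines = list(zip(path, path[1:]))
    return min(values[frozenset(line)] for line in lines) / len(lines)

scores = {path: apv(path, VALUES) for path in ("AB", "AEB", "AEDCB")}
best = max(scores, key=scores.get)
print(scores)  # {'AB': 1.0, 'AEB': 1.5, 'AEDCB': 0.5}
print(best)    # AEB
```

Path AEB wins even though AB is shorter, because its larger path value more than offsets the extra intermediary.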
Applications of APV
• Full Network Data
• Strategic Alliance Network among a set of firms under focus
The Algorithm
• Step 1 identifies the connected components of the graph with the Union-Find algorithm. A connected component is a set of nodes in which each node can reach every other node in the set.
• Step 2 calls a subroutine, MAPVC, to compute the optimal connections within each connected component.
• Step 3 ensures that all connected components are processed and organizes the results into a matrix for further analysis.
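Step 1 can be sketched with a standard union-find structure. The code below is an illustration rather than the authors' implementation; the function name `connected_components` and the seven-node sample graph are hypothetical.

```python
def connected_components(nodes, edges):
    """Group nodes into connected components using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_v] = root_u  # merge the two components

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

# Hypothetical graph with three components.
nodes = ["A", "B", "C", "D", "E", "F", "G"]
edges = [("A", "B"), ("A", "E"), ("E", "D"), ("F", "G")]
print(connected_components(nodes, edges))
# [['A', 'B', 'D', 'E'], ['C'], ['F', 'G']]
```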
MAPVC
• MAPVC considers each node v in turn and incrementally constructs paths from that node to all other nodes. It calls a subroutine, Maximum APV (MAPV), to process each node.
MAPV
• Start with a node v(i).
• First, a node v(j) is picked that has the maximum APV (path value / number of lines) with v(i).
• The path linking v(i) and v(j) becomes the path for subsequent extension.
• Suppose a node v(k) is picked to extend the v(i)–v(j) path.
• If the value of the v(j)–v(k) line is smaller than the path value of the v(i)–v(j) path, it replaces that path value when computing the APV of the v(i)–v(k) path.
• At every extension, the algorithm picks the path with the largest APV that has never been extended before.
• The process continues until every path in the connected component has either been extended or is terminal. A path is terminal when no other node is reachable or when a circular path occurs (the path connects back to the beginning node).
• In the end, the algorithm compares the APVs obtained at each stage of path extension and picks the largest to indicate the optimal connection between node v(i) and node v(k).
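The path-extension idea can be sketched with a priority queue that always extends the not-yet-extended path with the largest APV. This is a brute-force illustration under stated assumptions, not the authors' implementation: it enumerates every simple path from the source, so its cost grows exponentially with graph size. The line values are an assumed assignment consistent with the earlier AB example, and the name `mapv` is borrowed from the slides only as a label.

```python
import heapq

def mapv(adj, source):
    """Largest APV from `source` to each reachable node (exhaustive sketch)."""
    best = {}
    # Heap entries: (-APV, path as a tuple of nodes, path value).
    heap = [(-float(w), (source, nbr), w) for nbr, w in adj[source].items()]
    heapq.heapify(heap)
    while heap:
        neg_apv, path, value = heapq.heappop(heap)  # largest APV first
        end = path[-1]
        best[end] = max(best.get(end, 0.0), -neg_apv)
        for nbr, w in adj[end].items():
            if nbr in path:  # would revisit a node: treat the path as terminal
                continue
            new_value = min(value, w)  # a smaller line value replaces the path value
            # After the extension the path has len(path) lines.
            heapq.heappush(heap, (-new_value / len(path), path + (nbr,), new_value))
    return best

# Assumed line values (consistent with the AB example, not taken from the figure).
VALUES = {("A", "B"): 1, ("A", "E"): 3, ("E", "B"): 3,
          ("E", "D"): 4, ("D", "C"): 2, ("C", "B"): 6}
adj = {}
for (u, v), w in VALUES.items():
    adj.setdefault(u, {})[v] = w
    adj.setdefault(v, {})[u] = w

print(mapv(adj, "A"))
```

Under these assumed values, the optimal connections from A are 3 to E, 1.5 to B, 1.5 to D, and 1 to C.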
MAPV-MAPVC-Union Find
• MAPV processes a single node in a connected component
• MAPVC calls MAPV to process all the nodes in a connected component
• Union Find calls MAPVC to process all the connected components in a graph
• Example
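[Figure: example valued graph on nodes A, B, C, D, and E with line values 1, 2, 3, 3, 4, and 6; the optimal APVs shown in the figure are 1.5, 3, 1.5, and 1.]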
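The three-level call structure can be sketched end to end as follows. Everything here is illustrative: the function names, the recursive path extension, and the assignment of line values to specific lines are assumptions (the assignment is one that reproduces the APVs 3, 1.5, 1.5, and 1 for node A). Pairs in different components are left at 0.

```python
def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def components(nodes, weights):
    """Connected components via union-find."""
    parent = {n: n for n in nodes}
    for u, v in weights:
        parent[find(parent, v)] = find(parent, u)
    groups = {}
    for n in nodes:
        groups.setdefault(find(parent, n), []).append(n)
    return list(groups.values())

def mapv(adj, source):
    """Largest APV from one node to every node it can reach."""
    best = {}
    def extend(node, visited, value, length):
        for nbr, w in adj[node].items():
            if nbr in visited:  # circular path: stop extending
                continue
            new_value, new_length = min(value, w), length + 1
            best[nbr] = max(best.get(nbr, 0.0), new_value / new_length)
            extend(nbr, visited | {nbr}, new_value, new_length)
    extend(source, {source}, float("inf"), 0)
    return best

def mapvc(adj, component):
    """Run mapv for every node of one connected component."""
    return {v: mapv(adj, v) for v in component}

def optimal_connections(nodes, weights):
    """Process all components and collect the results in a matrix."""
    adj = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        adj[u][v] = w
        adj[v][u] = w
    index = {n: i for i, n in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for comp in components(nodes, weights):
        for src, row in mapvc(adj, comp).items():
            for dst, score in row.items():
                matrix[index[src]][index[dst]] = score
    return matrix

# Assumed line values for the example graph.
WEIGHTS = {("A", "B"): 1, ("A", "E"): 3, ("E", "B"): 3,
           ("E", "D"): 4, ("D", "C"): 2, ("C", "B"): 6}
NODES = ["A", "B", "C", "D", "E"]
for name, row in zip(NODES, optimal_connections(NODES, WEIGHTS)):
    print(name, row)
```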
Application and Limitation
• The data must be full-network data rather than ego-centered network data
• The algorithm does not account for the signs of links; all relations are assumed to be positive
• The algorithm does not account for direction and applies only to non-directional graphs. In other words, the input and output matrices are symmetric
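These limitations suggest a simple input check before running the algorithm. The sketch below is one possible validation, with a hypothetical function name `check_input`; it only tests that the matrix is square, symmetric, and free of negative values.

```python
def check_input(matrix):
    """Return a list of problems; an empty list means the matrix looks usable."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return ["matrix is not square"]
    problems = []
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                problems.append(f"asymmetric entry at ({i}, {j})")
            if matrix[i][j] < 0 or matrix[j][i] < 0:
                problems.append(f"negative value at ({i}, {j})")
    return problems

print(check_input([[0, 1], [1, 0]]))  # []
print(check_input([[0, 1], [2, 0]]))  # ['asymmetric entry at (0, 1)']
```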
Data
• The data matrix records strategic alliances among 38 companies in the information technology (IT) sector in 1998
• This dataset comes from a large database focusing on 145 IT companies from 1989 to 2002, collected by David Knoke and his associates.
3COM ALCATEL AOL ATT BA BAAN BCE BS BULL CAI CISCO COMPAQ DELL DISNEY EMC ERICSSON FT FUJITSU HITACHI HP IBM INTEL LUCENT MICROSOFT MOTOROLA NEC NETSCAPE ORACLE PHILIPS SBC SIEMENS SONY SUN TCI TOSHIBA TW UNISYS USW
3COM 0 1 0 1 1 0 2 1 1 1 2 5 4 0 0 2 0 2 2 5 5 2 4 2 1 0 0 1 0 1 2 0 0 0 4 0 0 1
ALCATEL 1 0 0 0 1 0 1 1 0 0 2 1 0 0 0 1 1 2 0 0 0 1 2 2 0 2 0 0 0 1 1 0 0 0 1 0 0 1
AOL 0 0 0 3 1 0 0 0 0 0 1 3 1 1 0 0 0 0 0 1 3 0 0 2 0 1 4 2 0 0 0 1 2 0 0 1 0 0
ATT 1 0 3 0 1 0 2 1 0 0 3 3 4 2 0 0 0 0 0 5 4 0 2 5 0 0 2 2 1 1 0 1 1 1 0 1 0 1
BA 1 1 1 1 0 0 2 2 0 0 2 3 1 1 0 1 0 1 0 1 1 2 2 4 1 0 1 1 1 5 1 0 1 0 0 1 0 4
BAAN 0 0 0 0 0 0 0 0 1 0 0 2 1 0 1 0 0 0 0 1 3 1 0 3 0 0 0 1 0 0 0 0 1 0 0 0 1 0
BCE 2 1 0 2 2 0 0 1 0 0 3 3 0 0 0 1 1 1 0 1 0 1 3 6 1 0 0 0 0 2 1 1 0 0 0 0 0 1
BS 1 1 0 1 2 0 1 0 0 0 1 2 0 0 0 1 0 1 0 0 0 2 1 4 0 0 0 0 0 2 1 0 0 0 0 0 0 2
BULL 1 0 0 0 0 1 0 0 0 0 0 2 0 0 2 1 1 0 0 0 1 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0
CAI 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 0 0 1 0 1 0 1 0 0 0
CISCO 2 2 1 3 2 0 3 1 0 0 0 2 3 1 0 1 1 1 1 5 1 2 2 4 0 0 1 2 1 1 1 0 1 0 0 1 0 4
COMPAQ 5 1 3 3 3 2 3 2 2 1 2 0 5 2 0 2 0 3 2 4 8 7 3 15 1 0 2 3 1 3 1 0 1 0 3 2 2 2
DELL 4 0 1 4 1 1 0 0 0 1 3 5 0 1 0 1 0 2 2 3 5 1 0 3 1 0 1 2 0 2 0 0 1 0 3 1 3 2
DISNEY 0 0 1 2 1 0 0 0 0 0 1 2 1 0 0 0 0 0 0 1 1 1 0 2 0 0 1 1 0 0 0 1 1 0 0 1 0 0
EMC 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 1 0
ERICSSON 2 1 0 0 1 0 1 1 1 0 1 2 1 0 0 0 1 1 1 0 1 2 2 1 3 1 0 1 0 1 2 0 1 0 1 0 0 1
FT 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
FUJITSU 2 2 0 0 1 0 1 1 0 0 1 3 2 0 0 1 0 0 2 1 4 2 1 4 0 1 1 0 1 1 1 1 0 0 2 0 1 1
HITACHI 2 0 0 0 0 0 0 0 0 0 1 2 2 0 1 1 0 2 0 1 2 1 0 2 0 1 0 1 2 0 1 2 0 0 4 1 0 0
HP 5 0 1 5 1 1 1 0 0 0 5 4 3 1 1 0 0 1 1 0 5 0 3 10 1 0 1 4 0 0 1 1 3 0 1 1 2 1
IBM 5 0 3 4 1 3 0 0 1 0 1 8 5 1 0 1 1 4 2 5 0 4 2 7 1 0 2 3 0 0 0 2 4 0 3 1 2 0
INTEL 2 1 0 0 2 1 1 2 0 0 2 7 1 1 0 2 0 2 1 0 4 0 1 9 1 0 1 0 2 2 1 2 0 0 2 0 1 2
LUCENT 4 2 0 2 2 0 3 1 0 0 2 3 0 0 1 2 0 1 0 3 2 1 0 3 1 2 0 0 1 3 1 0 2 0 0 0 1 2
MICROSOFT 2 2 2 5 4 3 6 4 1 0 4 15 3 2 0 1 0 4 2 10 7 9 3 0 0 1 2 3 1 2 2 4 2 2 0 3 1 4
MOTOROLA 1 0 0 0 1 0 1 0 1 0 0 1 1 0 0 3 0 0 0 1 1 1 1 0 0 0 0 1 0 0 2 0 1 0 1 0 0 0
NEC 0 2 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 1 1 0 0 0 2 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0
NETSCAPE 0 0 4 2 1 0 0 0 0 2 1 2 1 1 0 0 0 1 0 1 2 1 0 2 0 0 0 4 0 0 0 0 2 0 0 1 0 0
ORACLE 1 0 2 2 1 1 0 0 1 2 2 3 2 1 1 1 0 0 1 4 3 0 0 3 1 0 4 0 0 0 1 0 2 0 0 1 1 1
PHILIPS 0 0 0 1 1 0 0 0 0 0 1 1 0 0 0 0 0 1 2 0 0 2 1 1 0 0 0 0 0 1 0 2 0 0 1 1 1 1
SBC 1 1 0 1 5 0 2 2 0 0 1 3 2 0 0 1 0 1 0 0 0 2 3 2 0 0 0 0 1 0 1 0 0 0 0 0 0 4
SIEMENS 2 1 0 0 1 0 1 1 0 1 1 1 0 0 1 2 0 1 1 1 0 1 1 2 2 1 0 1 0 1 0 0 0 0 1 0 0 1
SONY 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 2 1 2 2 0 4 0 0 0 0 2 0 0 0 1 0 2 1 0 0
SUN 0 0 2 1 1 1 0 0 0 1 1 1 1 1 0 1 0 0 0 3 4 0 2 2 1 1 2 2 0 0 0 1 0 2 0 1 0 1
TCI 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 2 0 0 2 0 0
TOSHIBA 4 1 0 0 0 0 0 0 0 1 0 3 3 0 0 1 0 2 4 1 3 2 0 0 1 0 0 0 1 0 1 2 0 0 0 1 0 0
TW 0 0 1 1 1 0 0 0 0 0 1 2 1 1 0 0 0 0 1 1 1 0 0 3 0 0 1 1 1 0 0 1 1 2 1 0 0 0
UNISYS 0 0 0 0 0 1 0 0 0 0 0 2 3 0 1 0 0 1 0 2 2 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0
USW 1 1 0 1 4 0 1 2 0 0 4 2 2 0 0 1 0 1 0 1 0 2 2 4 0 0 0 1 1 4 1 0 1 0 0 0 0 0