Post on 13-Feb-2020
Introduction Problem Approach Properties Measures Summary
Measuring Segregation in Social Networks
Michał Bojanowski, Rense Corten
ICS/Sociology, Utrecht University
July 2, 2010, Sunbelt XXX, Riva del Garda
Outline
1 Introduction: Homophily and segregation
2 Problem
3 Approach: Approach, Notation
4 Properties: Ties, Nodes, Network
5 Measures
6 Summary
Homophily and segregation
Homophily: Contact between similar people occurs at a higher rate than among dissimilar people (McPherson, Smith-Lovin, & Cook, 2001).
Segregation: Nonrandom allocation of people who belong to different groups into social positions and the associated social and physical distances between groups (Bruch & Mare, 2009).
Homophily: Friendship selection in school classes
Moody (2001)
Residential segregation in Seattle
Blacks Asians Whites
Source: Seattle Civil Rights and Labor History Project
Segregation in network terms
Neighborhood structure can be conceptualized as a network in which links correspond to neighborhood proximities.
[Figure: lattice network of 30 nodes labeled 0 to 29, arranged in 5 rows of 6]
Assumption
In static terms, homophily and segregation correspond to the same network phenomenon.
We will stick with the segregation label.
Measurement problem
To be able to compare the levels of segregation of different networks (different school classes, different cities, etc.) we need a measure.
Problems with measures
There is an abundance of measures in the literature, but they:
Stem from different research streams
Follow different logics
Hardly ever refer to each other
Lead to different conclusions given the same problems (data)
So, the problems are:
Which one to select in a given setting?
On what grounds should such a selection be made?
Approach
Possible approaches
Empirical: Assemble a large set of empirical datasets. Calculate the measures for all of them. Look at how they correlate, perhaps through PCA or the like.
Theo-pirical: Take a set of probabilistic models of networks (Erdős-Rényi random graph, preferential attachment, small-world, etc.). Generate a collection of networks. Proceed as in the item above.
Theoretical: Come up with a set of properties that the measures might (or might not) possess. Evaluate the differences between the measures in terms of satisfying (or not) certain properties.
Notation
Actors
Actors: $\mathcal{N} = \{1, 2, \ldots, i, \ldots, N\}$
Groups of actors: Actors are assigned to $K$ exhaustive and mutually exclusive groups, $\mathcal{G} = \{G_1, \ldots, G_k, \ldots, G_K\}$.
Group membership is denoted with a "type vector":
$$t = [t_1, \ldots, t_i, \ldots, t_N] \quad \text{where } t_i \in \{1, \ldots, K\}$$
$t_i$ = group of actor $i$
Let $\mathcal{T}$ be the set of all possible type vectors for $\mathcal{N}$.
Network
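Network: Actors form an undirected network, which is a square binary matrix $X = [x_{ij}]_{N \times N}$. Let $\mathcal{X}$ be the set of all possible networks over actors in $\mathcal{N}$.
Mixing matrix: A three-dimensional array $M = [m_{ghy}]_{K \times K \times 2}$ defined as
$$m_{gh1} = \sum_{i \in G_g} \sum_{j \in G_h} x_{ij} \qquad m_{gh0} = \sum_{i \in G_g} \sum_{j \in G_h} (1 - x_{ij})$$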
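The two layers of the mixing matrix can be computed directly from the adjacency matrix and the type vector. The following is a minimal sketch, assuming NumPy and 0-based group labels (the slides use 1..K); the function name `mixing_matrix` is a hypothetical choice for illustration. It follows the double sums literally, so each undirected within-group tie is counted twice and the self-pairs i = j are counted as non-ties.

```python
import numpy as np

def mixing_matrix(x, t, k):
    """Return M with shape (k, k, 2): M[g, h, 1] counts ties, M[g, h, 0] non-ties.

    x : (N, N) binary adjacency matrix
    t : length-N type vector with values 0..k-1 (assumption: 0-based labels)
    """
    x = np.asarray(x)
    t = np.asarray(t)
    # Indicator matrix: z[i, g] = 1 if actor i belongs to group g.
    z = np.eye(k, dtype=int)[t]
    m = np.zeros((k, k, 2), dtype=int)
    m[:, :, 1] = z.T @ x @ z          # sum of x_ij over i in G_g, j in G_h
    m[:, :, 0] = z.T @ (1 - x) @ z    # sum of (1 - x_ij) over the same pairs
    return m

# Toy example: a path 0-1-2-3 with groups {0, 1} and {2, 3}.
x = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
t = [0, 0, 1, 1]
m = mixing_matrix(x, t, k=2)
```

The layer `m[:, :, 1]` is the contact layer that several of the measures use as their only input.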
Segregation index
Segregation measure A generic segregation index S(·):
$$S : \mathcal{X} \times \mathcal{T} \mapsto \mathbb{R}$$
For a given network and type vector, assign a real number.
Ties
Adding between-group ties
Property (Monotonicity in between-group ties: MBG)
Let there be two networks $X$ and $Y$ defined on the same set of nodes, a type vector $t$, and two nodes $i$ and $j$ such that $t_i \neq t_j$, $x_{ij} = 0$, and $y_{ij} = 1$. For all other pairs of nodes $\{p, q\} \neq \{i, j\}$, $x_{pq} = y_{pq}$, i.e. the networks $X$ and $Y$ are otherwise identical.
Network segregation index $S$ is monotonic in between-group ties iff
$$S(X, t) \geq S(Y, t)$$
In words: adding a between-group tie cannot increase segregation.
Adding within-group ties
Property (Monotonicity in within-group ties: MWG)
Let there be two networks $X$ and $Y$ defined on the same set of nodes, a type vector $t$, and two nodes $i$ and $j$ such that $t_i = t_j$, $x_{ij} = 0$, and $y_{ij} = 1$. For all other pairs of nodes $\{p, q\} \neq \{i, j\}$, $x_{pq} = y_{pq}$, i.e. the networks $X$ and $Y$ are otherwise identical.
Network segregation index $S$ is monotonic in within-group ties iff
$$S(X, t) \leq S(Y, t)$$
In words: adding a within-group tie to the network cannot decrease segregation.
Rewiring between-group tie to within-group
Property (Monotonicity in rewiring: MR)
Let there be two networks $X$ and $Y$, a type vector $t$, and three nodes $i$, $j$ and $k$ such that
1 $x_{ij} = 1$ and $t_i \neq t_j$
2 $y_{ij} = 0$, $y_{ik} = 1$, and $t_i = t_k$
That is, a between-group tie $ij$ in $X$ is rewired to a within-group tie $ik$ in $Y$.
Network segregation index $S$ is monotonic in rewiring iff
$$S(X, t) \leq S(Y, t)$$
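The three tie properties can be checked numerically on a small example. The sketch below assumes NumPy and uses a deliberately simple toy index, the share of ties that fall within groups; this index and the helper names `within_share` and `with_tie` are hypothetical and are not among the measures reviewed in these slides. A single example can only fail to refute a property, it does not prove it.

```python
import numpy as np

def within_share(x, t):
    """Toy index (an assumption for illustration): share of ties within groups."""
    x = np.asarray(x)
    t = np.asarray(t)
    same = t[:, None] == t[None, :]
    total = x.sum()
    return x[same].sum() / total if total > 0 else 0.0

def with_tie(x, i, j, value):
    """Copy of the undirected network x with the tie i-j set to value."""
    y = np.array(x, copy=True)
    y[i, j] = y[j, i] = value
    return y

t = np.array([0, 0, 0, 1, 1])
x = np.zeros((5, 5), dtype=int)
x = with_tie(x, 0, 1, 1)   # one within-group tie
x = with_tie(x, 2, 3, 1)   # one between-group tie

s_x = within_share(x, t)
# MBG: adding the between-group tie 1-4 should not increase the index.
mbg_ok = s_x >= within_share(with_tie(x, 1, 4, 1), t)
# MWG: adding the within-group tie 3-4 should not decrease the index.
mwg_ok = s_x <= within_share(with_tie(x, 3, 4, 1), t)
# MR: rewiring the between-group tie 2-3 into the within-group tie 2-0.
y = with_tie(with_tie(x, 2, 3, 0), 2, 0, 1)
mr_ok = s_x <= within_share(y, t)
```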
Nodes
Adding isolates
Property (Effect of adding isolates: ISO)
Define two networks $X = [x_{ij}]_{N \times N}$ and $Y = [y_{pq}]_{(N+1) \times (N+1)}$ and associated type vectors $u$ and $w$, which are identical for the $N$ actors and differ by an $(N+1)$-th node, which is an isolate:
1 $\forall p, q \in 1..N: \; y_{pq} = x_{pq}$
2 $\sum_{p=1}^{N+1} y_{p,N+1} = \sum_{q=1}^{N+1} y_{N+1,q} = 0$
3 $\forall k \in 1..N: \; w_k = u_k$
$$S(X, u) \;\; ? \;\; S(Y, w)$$
In words: how does the segregation level change if isolates are added to the network?
Network
Duplicating the network
Property (Symmetry: S)
Define two identical networks $X$ and $Y$ and some type vector $t$.
Network segregation index $S$ satisfies symmetry iff
$$S(X, t) = S(Y, t) = S(Z, z)$$
where the network $Z$ is constructed by considering $X$ and $Y$ together as a single network, namely $Z = [z_{pq}]_{2N \times 2N}$ such that
$$z_{pq} = \begin{cases} x_{pq} & \forall p, q \in \{1, \ldots, N\} \\ y_{pq} & \forall p, q \in \{N + 1, \ldots, 2N\} \\ 0 & \text{otherwise} \end{cases}$$
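The construction of Z is a block-diagonal stacking of two copies of the network. Here is a minimal sketch, assuming NumPy and assuming that the type vector z is t repeated twice (the slide does not spell this out). The two toy indices are hypothetical and serve only to show that the property discriminates: a share of within-group ties passes, a raw count of within-group ties does not.

```python
import numpy as np

def duplicate(x, t):
    """Build Z (two disjoint copies of X) and its type vector z."""
    x = np.asarray(x)
    zeros = np.zeros_like(x)
    z_net = np.block([[x, zeros], [zeros, x]])
    z_types = np.concatenate([t, t])   # assumption: types are simply repeated
    return z_net, z_types

def satisfies_symmetry(index, x, t, tol=1e-12):
    """Check S(X, t) == S(Z, z) for one network; index is any function S(x, t)."""
    z_net, z_types = duplicate(x, t)
    return abs(index(x, t) - index(z_net, z_types)) <= tol

def same_group(t):
    t = np.asarray(t)
    return t[:, None] == t[None, :]

# Two toy indices, both hypothetical and used only for illustration.
def within_count(x, t):
    return float(np.asarray(x)[same_group(t)].sum())

def within_share(x, t):
    return within_count(x, t) / np.asarray(x).sum()

x = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
t = np.array([0, 0, 1])
z_net, z_types = duplicate(x, t)
```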
Measures
Freeman’s segregation index (Freeman, 1978)
Spectral Segregation Index (Echenique & Fryer, 2007)
Assortativity coefficient (Newman, 2003)
Gupta-Anderson-May's Q (Gupta et al., 1989)
Coleman's Homophily Index (Coleman, 1958)
Segregation Matrix Index (Fershtman, 1997)
Exponential Random Graph Models (Snijders et al., 2006)
Conditional Log-linear models for mixing matrix (Koehly, Goodreau & Morris, 2004)
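| Measure | Level | Network type | Scale |
|---|---|---|---|
| Freeman | network | U | $[0; 1]$ |
| SSI | node | U | $[0; \infty]$ |
| Assortativity | network | D/U | $\left[-\frac{\sum_g p_{g+} p_{+g}}{1 - \sum_g p_{g+} p_{+g}}; 1\right]$ |
| Gupta-Anderson-May | network | D/U | $\left[-\frac{1}{K-1}; 1\right]$ |
| Coleman | group | D | $[-1; 1]$ |
| Segregation Matrix Index | group | D/U | $[-1; 1]$ |
| Uniform homophily (CLL) | network | D/U | $[-\infty; \infty]$ |
| Differential homophily (CLL) | group | D/U | $[-\infty; \infty]$ |
| Uniform homophily (ERGM) | network | D/U | $[-\infty; \infty]$ |
| Differential homophily (ERGM) | group | D/U | $[-\infty; \infty]$ |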
Freeman (1978)
Given two groups,
$$S_{Freeman} = 1 - \frac{p}{\pi}$$
where $p$ is the observed proportion of between-group ties and $\pi$ is the expected proportion given that ties are created randomly. It varies between 0 (random network) and 1 (full segregation of groups).
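A minimal sketch of this index for an undirected network, assuming NumPy. Two points are assumptions not stated on the slide: the expected proportion is taken as the share of between-group dyads among all dyads, $\pi = 2 n_1 n_2 / (N(N-1))$, and negative values are truncated at 0 to match the stated range. The function name `freeman` is hypothetical.

```python
import numpy as np

def freeman(x, t):
    """Freeman-style segregation index for an undirected network with two groups."""
    x = np.asarray(x)
    t = np.asarray(t)
    n = len(t)
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)   # count each dyad once
    between = (t[:, None] != t[None, :]) & upper
    ties = x[upper].sum()                               # must be > 0
    p = x[between].sum() / ties        # observed share of between-group ties
    n1 = int(np.sum(t == t[0]))
    n2 = n - n1
    pi = 2 * n1 * n2 / (n * (n - 1))   # assumed expected share under random ties
    return max(0.0, 1 - p / pi)        # assumed truncation at 0

# Path 0-1-2-3 with groups {0, 1} and {2, 3}: one of three ties is between groups.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
t = np.array([0, 0, 1, 1])
s_path = freeman(path, t)

# Two disconnected pairs: no between-group ties at all.
split = np.array([[0, 1, 0, 0],
                  [1, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 1, 0]])
s_split = freeman(split, t)
```

For the path, p = 1/3 and π = 2/3, so the index is 0.5; for the two disconnected pairs it is 1.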
Assortativity Coefficient, Newman (2003)
Based on the contact layer of the mixing matrix, $p_{gh} = m_{gh1} / m_{++1}$.
$$S_{Newman} = \frac{\sum_{g=1}^{K} p_{gg} - \sum_{g=1}^{K} p_{g+} p_{+g}}{1 - \sum_{g=1}^{K} p_{g+} p_{+g}}$$
Maximum of 1 for perfect segregation; 0 for a random network. Negative values for "disassortative" networks. The minimum depends on the density.
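The coefficient can be computed from the contact layer alone. A minimal sketch, assuming NumPy and 0-based group labels; the function name `assortativity` is a hypothetical choice and this is not the implementation of any particular package.

```python
import numpy as np

def assortativity(x, t, k):
    """Assortativity coefficient from the contact layer of the mixing matrix."""
    x = np.asarray(x, dtype=float)
    z = np.eye(k)[np.asarray(t)]        # z[i, g] = 1 if actor i is in group g
    m1 = z.T @ x @ z                    # contact layer m_gh1
    p = m1 / m1.sum()                   # p_gh = m_gh1 / m_++1
    # sum_g p_g+ * p_+g: within-group share expected from the marginals
    expected = np.sum(p.sum(axis=1) * p.sum(axis=0))
    return (np.trace(p) - expected) / (1 - expected)

t = np.array([0, 0, 1, 1])

# Path 0-1-2-3: two within-group ties, one between-group tie.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
r_path = assortativity(path, t, k=2)

# Complete bipartite network between the two groups: only between-group ties.
bipartite = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0],
                      [1, 1, 0, 0]])
r_bipartite = assortativity(bipartite, t, k=2)
```

For the path, the within-group share is 2/3 against an expected 1/2, giving 1/3; the bipartite network reaches the minimum of -1 for two groups with equal marginals.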
Gupta, Anderson & May 1989
Also based on the contact layer of the mixing matrix.
$$S_{GAM} = \frac{\sum_{g=1}^{K} \lambda_g - 1}{K - 1}$$
where $\lambda_g$ are the eigenvalues of $p_{gh}$. It varies between $-1/(K - 1)$ and 1.
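A minimal sketch of this index, assuming NumPy. One assumption is worth flagging: the contact layer is row-normalized here so that each row sums to 1, because the stated range from -1/(K-1) to 1 holds for a row-stochastic matrix, whose eigenvalues sum to a trace between 0 and K. The function name `gam_q` is hypothetical, and every group is assumed to have at least one tie.

```python
import numpy as np

def gam_q(x, t, k):
    """Gupta-Anderson-May style Q from the eigenvalues of the mixing proportions."""
    x = np.asarray(x, dtype=float)
    z = np.eye(k)[np.asarray(t)]
    m1 = z.T @ x @ z                              # contact layer m_gh1
    rows = m1 / m1.sum(axis=1, keepdims=True)     # assumed row normalisation
    eigenvalues = np.linalg.eigvals(rows)
    # The eigenvalues sum to the trace; complex parts cancel, so keep the real part.
    return (eigenvalues.sum().real - 1) / (k - 1)

t = np.array([0, 0, 1, 1])
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
bipartite = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0],
                      [1, 1, 0, 0]])
q_path = gam_q(path, t, k=2)
q_bipartite = gam_q(bipartite, t, k=2)
```

For the path the row-normalized matrix has eigenvalues 1 and 1/3, so Q = 1/3; the bipartite network gives the lower bound -1/(K-1) = -1.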
Coleman, 1958
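Expected number of ties within group $g$:
$$m^*_{gg} = \sum_{i \in G_g} \eta_i \frac{n_g - 1}{N - 1}$$
$$S^g_{Coleman} = \frac{m_{gg} - m^*_{gg}}{\sum_{i \in G_g} \eta_i - m^*_{gg}} \quad \text{where } m_{gg} \geq m^*_{gg} \qquad (1)$$
$$S^g_{Coleman} = \frac{m_{gg} - m^*_{gg}}{m^*_{gg}} \quad \text{where } m_{gg} < m^*_{gg} \qquad (2)$$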
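The two-branch definition translates into a short function. This sketch assumes NumPy, 0-based group labels, and that $\eta_i$ is the number of ties sent by actor i (its out-degree), which the slide does not define explicitly; $n_g$ is taken to be the size of group g. The function name `coleman` is hypothetical.

```python
import numpy as np

def coleman(x, t, g):
    """Coleman-style homophily index for group g (x may be directed)."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t)
    n = len(t)
    members = np.flatnonzero(t == g)
    n_g = len(members)
    eta = x[members].sum(axis=1)                  # assumed: ties sent by each member
    m_gg = x[np.ix_(members, members)].sum()      # observed within-group ties
    m_star = eta.sum() * (n_g - 1) / (n - 1)      # expected within-group ties
    if m_gg >= m_star:
        return (m_gg - m_star) / (eta.sum() - m_star)   # branch (1)
    return (m_gg - m_star) / m_star                     # branch (2)

t = np.array([0, 0, 1, 1])
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
bipartite = np.array([[0, 0, 1, 1],
                      [0, 0, 1, 1],
                      [1, 1, 0, 0],
                      [1, 1, 0, 0]])
c_path = coleman(path, t, g=0)
c_bipartite = coleman(bipartite, t, g=0)
```

For group 0 in the path, members send 3 ties, 2 of them within the group, against an expected 1, so the index is (2 - 1)/(3 - 1) = 0.5. In the bipartite network no tie stays within the group and the index is -1.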
Segregation matrix index, Fershtman 1997
$$S_{SMI} = \frac{d_{11} - d_{12}}{d_{11} + d_{12}} \qquad (3)$$
where $d_{11}$ is the density of within-group ties and $d_{12}$ is the density of between-group ties.
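A minimal sketch for an undirected network, assuming NumPy and 0-based group labels. The densities are taken as observed ties divided by possible dyads, within the group and between the group and everyone else; this reading of "density" and the function name `smi` are assumptions.

```python
import numpy as np

def smi(x, t, g):
    """Segregation matrix index for group g in an undirected network."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t)
    members = np.flatnonzero(t == g)
    others = np.flatnonzero(t != g)
    n_g, n_other = len(members), len(others)
    # Within-group block is symmetric, so each tie appears twice.
    within = x[np.ix_(members, members)].sum() / 2
    between = x[np.ix_(members, others)].sum()
    d11 = within / (n_g * (n_g - 1) / 2)    # density of within-group ties
    d12 = between / (n_g * n_other)         # density of between-group ties
    return (d11 - d12) / (d11 + d12)

t = np.array([0, 0, 1, 1])
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
smi_path = smi(path, t, g=0)
```

For group 0 in the path, d11 = 1 and d12 = 1/4, so the index is 0.75 / 1.25 = 0.6.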
Conditional Log-Linear Models (Koehly et al, 2004)
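$$\log m_{gh1} = \mu + \lambda^A_g + \lambda^B_h + \lambda^{UHOM}_{gh}, \qquad \lambda^{UHOM}_{gh} = \begin{cases} \lambda^{UHOM} & g = h \\ 0 & g \neq h \end{cases}$$
$$\log m_{gh1} = \mu + \lambda^A_g + \lambda^B_h + \lambda^{DHOM}_{gh}, \qquad \lambda^{DHOM}_{gh} = \begin{cases} \lambda^{DHOM}_g & g = h \\ 0 & g \neq h \end{cases}$$
Parameters $\lambda^{UHOM}$ and $\lambda^{DHOM}_g$ serve as measures of homophily/segregation.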
ERGM
Exponential Random Graph models
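$$\log\left(\frac{m_{gh1}}{m_{gh0}}\right) = \alpha + \beta^A_g + \beta^B_h + \beta^{UHOM}_{gh}, \qquad \beta^{UHOM}_{gh} = \begin{cases} \beta^{UHOM} & g = h \\ 0 & g \neq h \end{cases}$$
$$\log\left(\frac{m_{gh1}}{m_{gh0}}\right) = \mu + \beta^A_g + \beta^B_h + \beta^{DHOM}_{gh}, \qquad \beta^{DHOM}_{gh} = \begin{cases} \beta^{DHOM}_g & g = h \\ 0 & g \neq h \end{cases}$$
Parameters $\beta^{UHOM}$ and $\beta^{DHOM}_g$ serve as measures of homophily/segregation.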
Spectral Segregation Index, Echenique & Fryer (2007)
Segregation level of individual $i$ in group $g$ in component $B$:
$$s^g_i(B) = \frac{1}{S^g_{C_i}} \sum_j r_{ij} s^g_j(B) \qquad (4)$$
where $r_{ij}$ are entries in a row-normalized adjacency matrix.
Segregation of individual $i$:
$$S^i_{SSI} = \frac{l_i}{\bar{l}} \lambda \qquad (5)$$
where $\lambda$ is the largest eigenvalue of $B$, and $l$ is the corresponding eigenvector.
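Equation (5) can be sketched for the simplest case in which the members of group g form a single connected component. The sketch assumes NumPy, 0-based group labels, rows normalized by each actor's degree in the full network, and that the denominator in (5) is the mean of the eigenvector entries over the component. These are assumptions about details the slide leaves implicit, and the function name `ssi_single_component` is hypothetical.

```python
import numpy as np

def ssi_single_component(x, t, g):
    """Return (lambda, node-level scores) for group g, assumed to be one component."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t)
    degree = x.sum(axis=1)
    # Row-normalize by total degree; isolates keep an all-zero row.
    r = x / np.where(degree > 0, degree, 1.0)[:, None]
    members = np.flatnonzero(t == g)
    b = r[np.ix_(members, members)]        # same-group block of the normalized matrix
    eigenvalues, eigenvectors = np.linalg.eig(b)
    top = np.argmax(eigenvalues.real)
    lam = eigenvalues[top].real            # largest eigenvalue
    l = eigenvectors[:, top].real          # corresponding eigenvector
    # Dividing by the mean also removes the arbitrary sign of the eigenvector.
    return lam, l / l.mean() * lam

t = np.array([0, 0, 1, 1])
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
lam, scores = ssi_single_component(path, t, g=0)
```

In the path example, node 0 has only a same-group neighbor while node 1 also has a neighbor in the other group, so node 0 receives the higher score; the scores average to the largest eigenvalue.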
SSI (2)
[Figure: Node segregation in White's kinship data. Network of 30 nodes labeled 0 to 29. Labels in the plot: Mother, Sister, Brother's Wife, Sister's Daughter, Brother's Daughter, Father Brother, Sister's Husband, Brother's Son, Sister's Son. Legend: Men, Women.]
Summary
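| Measure | MBG (↘) | MWG (↗) | MR (↗) | ISO | S (→) |
|---|---|---|---|---|---|
| Freeman | ↕ | ↕ | ↗ | ↕ | ↘ |
| SSI | ↘ | ↗ | ↗ | ↘ | → |
| Assortativity | ↘ | ↗ | ↗ | → | → |
| Gupta-Anderson-May | ↘ | ↗ | ↗ | → | → |
| Coleman | ↘ | ↗ | ↗ | ↕ | ↘ |
| Segregation Matrix Index | ↘ | ↗ | ↗ | ↕ | → |
| Uniform homophily (CLL) | ↘ | ↗ | ↗ | → | → |
| Differential homophily (CLL) | ↘ | ↗ | ↗ | → | → |
| Uniform homophily (ERGM) | ↘ | ↗ | ↗ | ↕ | → |
| Differential homophily (ERGM) | ↘ | ↗ | ↗ | ↕ | → |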
Summary
Measures on different levels: individuals, groups, global network
Different zero points: random graph, proportionate mixing, full integration
MBG and MWG are not very informative: all measures satisfy them.
Symmetry: all but two measures satisfy it; Coleman and Freeman decrease.
Summary: adding isolates
Measures based on the contact layer of the mixing matrix are insensitive to isolates.
SSI is the only one that always decreases.
The effect on the others depends on relative group sizes.
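The first point can be seen directly on the mixing matrix: an isolate adds no ties, so the contact layer stays the same, while the layer of absent ties grows. A minimal sketch, assuming NumPy and 0-based group labels; the helper names `layers` and `add_isolate` are hypothetical.

```python
import numpy as np

def layers(x, t, k):
    """Return (contact layer m_gh1, non-contact layer m_gh0) of the mixing matrix."""
    x = np.asarray(x)
    z = np.eye(k, dtype=int)[np.asarray(t)]
    return z.T @ x @ z, z.T @ (1 - x) @ z

def add_isolate(x, t, group):
    """Append one node without ties, belonging to the given group."""
    x = np.asarray(x)
    n = x.shape[0]
    y = np.zeros((n + 1, n + 1), dtype=x.dtype)
    y[:n, :n] = x
    return y, np.append(t, group)

x = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
t = np.array([0, 0, 1, 1])
y, w = add_isolate(x, t, group=1)

ties_x, nonties_x = layers(x, t, 2)
ties_y, nonties_y = layers(y, w, 2)
contact_unchanged = np.array_equal(ties_x, ties_y)
noncontact_changed = not np.array_equal(nonties_x, nonties_y)
```

Any index that is a function of the contact layer only therefore returns the same value before and after the isolate is added, whereas an index that also uses absent ties or group sizes can move.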
Summary
Measures based on the contact layer of the mixing matrix summarize the probability of a node attribute combination given that the tie exists (CLL, assortativity, GAM): explaining attributes given the network.
Measures that also take disconnected dyads into account (ERGM, Freeman, SSI): explaining tie formation given the attributes.
Further questions
Stricter formal analysis (axiomatizations). SSI is the only measure derived axiomatically.
Link to behavioral models: how the segregation comes about. For example:
Network formation game further justifying Bonacich centrality (Ballester et al., 2006)
Coleman's index in Currarini et al. (2010)
Thanks!
http://www.bojanorama.pl