Social bridges in urban purchase behavior
Xiaowen Dong, MIT Media Lab
Cambridge, MA, June 2017
with Yoshihiko Suhara, Burçin Bozkaya, Vivek K. Singh, Bruno Lepri and Alex ‘Sandy’ Pentland
Introduction
New data sources about human behavior are emerging
Computational social science (CSS): A paradigm shift in social science
Current population management:
- demographics
- individual records
- static information
The new way:
- behavioral traits
- collective behavior
- dynamics
Practical impact
How does communication affect human decision-making?
• Classical purchase behavior models treat individual purchases separately (Huff, 1964)
$p_i \propto \mathrm{pop}_i / d_i$
• Study of purchase behavior influence is largely based on socio-demographics (Zeithaml, 1985)
• Word-of-mouth and physical exposure are powerful sources of behavioral propagation (Arndt, 1967; Bikhchandani, 1998; Algesheimer, 2005), but their effectiveness in the modern city environment remains unknown
• Hypothesis
- Physical exposure in the work environment promotes idea exchange
- Individuals living in different communities but sharing similar work locations act as social bridges between communities
• Test at city scale
Data set
• A large-scale credit card transaction data set in two cities in an OECD country during 3 months
Methods
• Urban communities
• Number of social bridges between communities
[Figure: two example community pairs, each with # bridges = 4]
$\mathrm{bdg}(I, J) = |\{i, j\}|$ s.t. $i \in I,\ j \in J,\ D(L_i, L_j) \le d$
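The bridge count defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it takes L_i to be the work location of individual i, given as planar coordinates in km, and D to be Euclidean distance. The function name `count_bridges` and the sample coordinates are hypothetical.

```python
from itertools import product
from math import hypot


def count_bridges(work_I, work_J, d):
    """Count pairs (i, j), i in I and j in J, whose work locations lie within d of each other."""
    return sum(
        1
        for (xi, yi), (xj, yj) in product(work_I, work_J)
        if hypot(xi - xj, yi - yj) <= d
    )


# Hypothetical work locations (planar km coordinates) of residents of communities I and J.
work_I = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
work_J = [(0.1, 0.0), (1.0, 1.2), (9.0, 9.0)]

# Two pairs work within 0.3 km of each other, so this counts 2 bridges.
n_bridges = count_bridges(work_I, work_J, d=0.3)
```

The pairwise loop is quadratic in community size; for city-scale data a spatial index would likely be needed, which this sketch leaves out.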
• Three behavioral indexes
- choice: number of co-visited stores
- temporal: similarity between temporal distributions of purchases
- spending: sum of differences in median spending amount of different categories
• Remark
- exclude transactions during working hours
- exclude transactions at stores in home/work neighborhoods
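The three indexes can be illustrated with a small Python sketch. The slides do not give exact formulas, so the choices here are assumptions: the temporal index is taken as cosine similarity between purchase-time histograms, and the spending index as a sum of absolute differences of per-category medians. The function names echo the axis labels used in the talk's plots (covisit, tsim, mdiff), and the toy data are hypothetical.

```python
from math import sqrt
from statistics import median


def covisit(stores_I, stores_J):
    """Choice index: number of distinct stores visited from both communities."""
    return len(set(stores_I) & set(stores_J))


def tsim(hist_I, hist_J):
    """Temporal index: cosine similarity between purchase-time histograms (assumed measure)."""
    dot = sum(a * b for a, b in zip(hist_I, hist_J))
    norm = sqrt(sum(a * a for a in hist_I)) * sqrt(sum(b * b for b in hist_J))
    return dot / norm if norm else 0.0


def mdiff(spend_I, spend_J):
    """Spending index: sum over shared categories of |median_I - median_J| (absolute value assumed)."""
    shared = set(spend_I) & set(spend_J)
    return sum(abs(median(spend_I[c]) - median(spend_J[c])) for c in shared)


# Hypothetical toy data for two communities.
choice = covisit(["s1", "s2", "s3"], ["s2", "s3", "s4"])
temporal = tsim([5, 3, 0, 2], [5, 3, 0, 2])
spending = mdiff(
    {"food": [10, 20, 30], "fuel": [50]},
    {"food": [15, 25, 35], "fuel": [40]},
)
```

The exclusions in the remark (working hours, home/work neighborhoods) would be applied as a filter on the transactions before these functions are called.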
Social bridge and behavioral indexes
[Figure: behavioral indexes versus log2(bdg(I,J) + 1) + 1, for City A and City B. Panels: Choice (co-visits), covisit(I,J); Temporal, tsim(I,J); Spending, mdiff(I,J)]
Social bridge and purchase similarity (co-visits)
• Multiple OLS regression analysis
- dependent variable (DV): # co-visits (between community pair)
- independent variables (IV): # social bridges
- confounding variables: population, distance, demographics, income
• Remark
- entries are not independent in DV and IV
- Quadratic Assignment Procedure (QAP) to test statistical significance
‣ random shuffling of communities in DV
‣ re-application of OLS
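The QAP steps listed above (shuffle the communities in the DV, re-apply OLS) can be sketched as follows. This is a simplified illustration, not the analysis from the talk: it uses a single predictor instead of the multiple regression with confounders, and the matrices, the number of permutations, and all function names are hypothetical.

```python
import random


def pair_values(matrix):
    """Flatten the upper triangle: one entry per unordered community pair."""
    n = len(matrix)
    return [matrix[a][b] for a in range(n) for b in range(a + 1, n)]


def ols_slope(x, y):
    """Slope of a one-predictor least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def qap_test(dv, iv, n_perm=200, seed=0):
    """Return the observed slope and a two-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(dv)
    x = pair_values(iv)
    observed = ols_slope(x, pair_values(dv))
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n))
        rng.shuffle(perm)
        # Relabel communities: rows and columns of the DV matrix are permuted together.
        shuffled = [[dv[perm[a]][perm[b]] for b in range(n)] for a in range(n)]
        if abs(ols_slope(x, pair_values(shuffled))) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical symmetric matrices for 5 communities: bridges (IV) and co-visits (DV).
n = 5
iv = [[(a + 1) * (b + 1) if a != b else 0 for b in range(n)] for a in range(n)]
dv = [[2 * iv[a][b] + 1 if a != b else 0 for b in range(n)] for a in range(n)]
slope, p_value = qap_test(dv, iv)
```

Permuting rows and columns together keeps the dependence structure within each community intact, which is why the procedure suits pairwise data whose entries are not independent.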
• Regression coefficients
Social bridge is a stronger indicator of similar purchase behavior
• Histogram of distance between co-visited store and co-working location
[Figure: histograms of frequency count versus distance (km). City A (62% > 2 km), City B (74% > 2 km)]
Co-visitation is not simply due to proximity between co-visited store and co-working location
Co-visits by two types of customers
• Bridge customers vs. Non-bridge customers
- Bridge customers
- Non-bridge customers
• Histogram of ratio of bridge customers
[Figure: histograms of frequency count versus ratio of bridge customers, for City A and City B]
The ratio of bridge customers is relatively small
• Percentage of co-visits by bridge customers
[Figure: panels for City A and City B]
A large portion of co-visits are by non-bridge customers
• Regression coefficients
Social bridge is an indicator of similar purchase behavior even for non-bridge customers
Co-visits in three merchant categories
• Regression coefficients
Effect of social bridge is stronger for restaurants but weaker for supermarkets
Gender difference in social bridge
• Regression coefficients
Female-female bridges show a stronger effect
Comparison with a null model
• Purchase choices are influenced by merchant popularity and location (Huff, 1964)
$p_i \propto \mathrm{pop}_i / d_i$
$p_{is} = \frac{u_{is}}{\sum_{s \in S} u_{is}} = \frac{A_s^{\alpha_1} / D_{is}^{\alpha_2}}{\sum_{s \in S} (A_s^{\alpha_1} / D_{is}^{\alpha_2})}$
- $p_{is}$: probability customer i visits store s
- $A_s$: popularity of store s
- $D_{is}$: distance between customer i and store s
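As a worked instance of the formula above, the following sketch computes the visit probabilities of one customer over a set of stores. The exponent values are not given in the slides; the defaults of 1.0 are placeholders, under which the model reduces to probability proportional to popularity over distance. The function name and the numbers are hypothetical.

```python
def huff_probabilities(popularity, distance, alpha1=1.0, alpha2=1.0):
    """Visit probabilities p_is of one customer i over all stores s in S."""
    # u_is = A_s^alpha1 / D_is^alpha2
    utilities = [a ** alpha1 / d ** alpha2 for a, d in zip(popularity, distance)]
    total = sum(utilities)
    return [u / total for u in utilities]


# Hypothetical customer facing three stores: popularity A_s and distance D_is in km.
probs = huff_probabilities(popularity=[30.0, 10.0, 20.0], distance=[1.0, 1.0, 2.0])
```

Here the utilities are 30, 10 and 10, so the first store receives probability 0.6 and the other two 0.2 each: the third store is twice as popular as the second but also twice as far away.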
• Simulate individual purchases and co-visitation between communities
• Compare the regression coefficient with the empirical one
[Figure: panels for City A and City B]
Effect of social bridge is not simply due to merchant popularity and location
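The simulation step could look roughly like the sketch below: each customer draws stores at random with probabilities that depend only on store popularity and distance, and co-visits are then counted between two communities. Everything specific here is an assumption made for illustration, including the exponents fixed at 1, the number of purchases per customer, and the toy distances.

```python
import random


def huff_probabilities(popularity, distance, alpha1=1.0, alpha2=1.0):
    """Visit probabilities proportional to popularity^alpha1 / distance^alpha2."""
    utilities = [a ** alpha1 / d ** alpha2 for a, d in zip(popularity, distance)]
    total = sum(utilities)
    return [u / total for u in utilities]


def simulate_community_stores(customer_distances, popularity, n_purchases, rng):
    """Set of store indices visited by any simulated customer of one community."""
    stores = list(range(len(popularity)))
    visited = set()
    for distance in customer_distances:
        weights = huff_probabilities(popularity, distance)
        visited.update(rng.choices(stores, weights=weights, k=n_purchases))
    return visited


rng = random.Random(0)
popularity = [30.0, 10.0, 20.0, 5.0]
# Hypothetical distances (km) from each customer to each of the four stores.
community_I = [[1.0, 2.0, 4.0, 3.0], [1.5, 2.5, 3.0, 2.0]]
community_J = [[4.0, 1.0, 2.0, 3.0], [3.0, 1.0, 1.5, 2.5]]

stores_I = simulate_community_stores(community_I, popularity, n_purchases=20, rng=rng)
stores_J = simulate_community_stores(community_J, popularity, n_purchases=20, rng=rng)
simulated_covisits = len(stores_I & stores_J)
```

Because the simulated customers never interact, any relationship between bridges and simulated co-visits reflects popularity and geography alone, which is what makes it a baseline for the empirical coefficient.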
Influence of distance threshold
• Regression coefficient as a function of distance d
[Figure: regression coefficient versus distance threshold (km), from 0.10 to 1.83, for City A and City B. Curves: co-visits by all, by bridge customers, and by non-bridge customers, each also shown for a shuffled network]
Peak region of blue curve (co-visits by non-bridge customers) suggests geographical constraint for social bridge effect
Application: Prediction of co-visits
• Three-class classification: small, medium, large amount of co-visitation
• For each IV (feature), train on 20% of communities and test on the remaining 80%, using LIBSVM (Chang, 2011)
Social bridge is more efficient in predicting co-visitation than traditional factors
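The data preparation for this experiment can be sketched as below. The slides do not say how the three classes are delimited, so cutting at terciles is an assumption, as are the function names and the toy values. The classifier itself (LIBSVM in the talk) is deliberately left out.

```python
import random


def three_class_labels(covisits):
    """Label each co-visit count as small, medium or large using tercile cut points (assumed)."""
    ordered = sorted(covisits)
    n = len(ordered)
    low_cut, high_cut = ordered[n // 3], ordered[(2 * n) // 3]
    return [
        "small" if v < low_cut else "medium" if v < high_cut else "large"
        for v in covisits
    ]


def train_test_split(communities, train_fraction=0.2, seed=0):
    """Shuffle and split: 20% for training, the remaining 80% for testing."""
    rng = random.Random(seed)
    shuffled = list(communities)
    rng.shuffle(shuffled)
    k = max(1, round(train_fraction * len(shuffled)))
    return shuffled[:k], shuffled[k:]


# Hypothetical co-visit counts and community identifiers.
labels = three_class_labels([1, 2, 3, 4, 5, 6, 7, 8, 9])
train, test = train_test_split(range(10))
```

Training on the smaller share makes the task harder than the usual 80/20 split, so a feature that still predicts well carries a strong signal.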
Discussion
• Social bridge captures a form of social learning due to physical exposure: similar to “the familiar stranger” (Milgram, 1977)
• Bridge customers are conceptually similar to “structural hole spanners” (Lou, 2013)
• Easy to compute: as long as location information (social media, etc.) is available
• No causal relation, but tested against demographics and null model based on popularity and distance (Huff, 1964)
• Strong correlation can lead to applications such as behavior prediction and stratification, campaign targeting, and resource allocation