Measurement of Social Capital: Recall Errors and Bias Estimations

Kuo-hsien Su, National Taiwan University

Nan Lin, Academia Sinica and Duke University

[Figure: histogram of differences in the number of positions accessed (wave II - wave I); x-axis from -20 to 20, y-axis in percent]

Change in number of positions accessed from wave I to wave II (N=2,707 respondents)

No change: 12%

Decrease: 52.9%

Increase: 35%

Differences between the sets of accessed positions during two interviews may reflect…


Motivations

Measurement instability poses a serious challenge to the study of network changes.

Need a clear measurement or better understanding of the possible sources of error.

The two-wave panel survey provided an opportunity (1) to model factors associated with changes in accessed positions and (2) to detect whether the respondent forgot a subsequently or previously named contact.

Prior research

Forgetting is a pervasive phenomenon in the elicitation of network contacts.

Research on forgetfulness has been disproportionately based on the name generator instrument.

Little research exists on the reliability of the position generator.

Tasks


Data

Social Capital Project: the Taiwan Survey, conducted in late 2004 and 2006

Consists of 1,695 men and 1,585 women aged 20-65.

Problem of Non-response

Wave I (2004): N = 3,280

Wave II (2006): N = 2,710

Re-interview = 82.6%

Non-response = 17.4%
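The two rates follow directly from the wave sizes. A minimal arithmetic check, with variable names chosen here for illustration:

```python
# Wave sizes reported on the slide
n_wave1 = 3280  # respondents interviewed in 2004
n_wave2 = 2710  # respondents re-interviewed in 2006

n_lost = n_wave1 - n_wave2             # respondents not re-interviewed
reinterview_rate = n_wave2 / n_wave1   # share followed up
nonresponse_rate = n_lost / n_wave1    # share lost to follow-up

print(f"Lost to follow-up: {n_lost}")             # Lost to follow-up: 570
print(f"Re-interview: {reinterview_rate:.1%}")    # Re-interview: 82.6%
print(f"Non-response: {nonresponse_rate:.1%}")    # Non-response: 17.4%
```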

Table 1. Characteristics of the follow-up and non-response sub-samples

                                   Full sample   Follow-up   Non-response
                                   (N=3280)      (N=2710)    (N=570)
Gender (%)
  Male                             51.7%         51.5%       52.5%
  Female                           48.3%         48.5%       47.5%
Age (mean)                         41.3          41.7        39.5
Years of schooling (mean)          11.7          11.7        11.8
Marital status (%)
  Single                           23.9%         22.8%       29.1%
  Married/cohab                    70.2%         71.4%       64.4%
  Widow/divorced                   6.0%          5.8%        6.5%
Network resource indices (mean)
  Extensity                        8.5           8.5         8.2
  Upper reachability               62.4          62.8        60.4
  Range of prestige                36.7          37.0        35.1


Three types of research designs (Brewer, 2000).


Limitations of our data

Our survey was not designed to examine forgetting specifically.

No recognition data or objective records to compare with.

A two-year interval is too long: test-retest designs usually use a very short time interval.

Revised method C: Comparison of accessed positions elicited in two separate interviews

[Timeline: Wave I (2004), 2005, Wave II (2006). Question asked at wave II: "How many years have you known this person?"]

Forgetting = (Contact mentioned in wave II but not mentioned in wave I) AND (duration >= 3 years)

Assumption: durations reported in wave II are more or less accurate.

Did the respondent forget a subsequently named contact?

Coding scheme for tie changes

                      Wave II (2006): NO                        Wave II (2006): YES
Wave I (2004): NO     (1) Consistent “NO”                       (2) New contacts (less than 3 years)
                                                                (3) Forgetting at wave I (more than 3 years)
Wave I (2004): YES    (4) Lost contact / Forgetting at wave II  (5) Consistent “YES”
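The five cells of the coding scheme can be sketched as a small classifier. This is a minimal illustration, not the authors' code: the function name, argument names, and category labels are hypothetical, and the cutoff follows the "duration >= 3 years" rule stated for method C.

```python
from typing import Optional


def code_tie(named_wave1: bool, named_wave2: bool,
             years_known: Optional[float] = None,
             threshold: float = 3.0) -> str:
    """Classify one respondent-position dyad into the five cells of the scheme.

    years_known is the duration reported at wave II; it is only needed
    when the contact was named at wave II but not at wave I.
    """
    if named_wave1 and named_wave2:
        return "consistent_yes"              # cell (5)
    if named_wave1 and not named_wave2:
        return "lost_or_forgotten_wave2"     # cell (4)
    if not named_wave1 and not named_wave2:
        return "consistent_no"               # cell (1)
    # Remaining case: named at wave II only.
    if years_known is None:
        raise ValueError("years_known is required for wave-II-only contacts")
    if years_known >= threshold:
        return "forgotten_wave1"             # cell (3): tie predates wave I
    return "new_contact"                     # cell (2): tie formed after wave I


print(code_tie(False, True, years_known=13))  # forgotten_wave1
print(code_tie(False, True, years_known=1))   # new_contact
```

Note that cell (4) cannot be split into "lost" and "forgotten at wave II" with this rule alone, which is why the scheme keeps them together.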

[Figure: density plot of the length of relationship of forgotten ties (N=4,332 dyads, 7.3%); x-axis: length of relationship in years, 0 to 60; y-axis: density, 0 to .1]

The average duration of forgotten ties is 13 years.

How much does the respondent forget?

Wave I   Wave II   Known more than 3 years?   Category                              N        Percent
YES      YES                                  Consistent "YES"                      14,330   49.9%
NO       YES       NO                         New contact                           1,240    4.3%
NO       YES       YES                        Forgotten at wave I                   4,332    15.1%
YES      NO                                   Contact lost / Forgotten at wave II   8,794    30.7%
                                              Total                                 28,696   100%

Forgetting at wave I accounts for approximately 15% of dyads.

Unique = 51.1%
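The percentage column can be recomputed from the dyad counts. A short sketch follows; the dictionary keys are illustrative labels, not the survey's variable names. Computed this way, the last category comes out at roughly 30.6%, so the 30.7% on the slide may have been adjusted so that the column sums to 100%.

```python
# Dyad counts as reported in the table
counts = {
    "consistent_yes": 14330,
    "new_contact": 1240,
    "forgotten_wave1": 4332,
    "lost_or_forgotten_wave2": 8794,
}

total = sum(counts.values())  # all dyads named in at least one wave
shares = {category: n / total for category, n in counts.items()}

for category, share in shares.items():
    print(f"{category:26s} {counts[category]:>6,d} {share:6.1%}")
print(f"{'total':26s} {total:>6,d}")
```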

[Figure: distribution of respondents by number of ties forgotten (N=2,707 respondents); x-axis: number of forgotten ties in wave I, 0 to 20; y-axis: percent, 0 to 40]

35.6% of the respondents did not forget any ties

64.4% of the respondents failed to mention at least one contact, with an average of 1.6 forgotten ties per respondent.

These numbers suggest that forgetting a contact was not a rare occurrence.
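These respondent-level summaries can be derived from a per-respondent count of forgotten ties. The sketch below uses hypothetical toy data and a hypothetical helper name; the final line cross-checks the reported average under the assumption that it is taken over all 2,707 respondents.

```python
from collections import Counter


def summarize_forgetting(forgotten_per_respondent):
    """Return (share with no forgotten tie, share with at least one, mean count)."""
    n = len(forgotten_per_respondent)
    distribution = Counter(forgotten_per_respondent)
    share_none = distribution[0] / n
    share_any = 1 - share_none
    mean_forgotten = sum(forgotten_per_respondent) / n
    return share_none, share_any, mean_forgotten


# Hypothetical toy data: number of forgotten ties for eight respondents
toy = [0, 0, 0, 1, 1, 2, 3, 5]
print(summarize_forgetting(toy))  # (0.375, 0.625, 1.5)

# Cross-check with the reported totals: 4,332 forgotten dyads, 2,707 respondents
print(round(4332 / 2707, 1))  # 1.6
```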


Analytical Strategies

What factors are associated with forgetting?

Unit of analysis: person-contact dyads

Model: multilevel logit

Does “forgetting” affect estimates of network resources?

Unit of analysis: person

Model predicting “forgetting”

Analysis for the effect of forgetting on estimates of accessibility


Sample

A multi-level logit approach

The models estimate the odds of “forgetting” versus “not forgetting”; the reference population consisted of all contacts mentioned in the first interview (2004).
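The slides do not write out the model. One common way to express a two-level random-intercept logit of this kind is given below; all notation is assumed here for illustration, and the authors' exact specification may differ.

```latex
\[
\log\frac{p_{ij}}{1-p_{ij}} = \beta_{0j} + \sum_{k} \beta_{k}\, x_{kij},
\qquad
\beta_{0j} = \gamma_{00} + \sum_{m} \gamma_{0m}\, z_{mj} + u_{0j},
\qquad
u_{0j} \sim N(0, \tau^{2})
\]
```

Here $p_{ij}$ is the probability that contact $i$ of respondent $j$ is coded as forgotten, $x_{kij}$ are tie-level (level-1) covariates, $z_{mj}$ are respondent-level (level-2) covariates, and $u_{0j}$ is a respondent-specific random intercept.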

Data structure: positions (level 1) nested within individuals (level 2)

The final sample consists of 2,682 respondents and 28,343 person-contact dyads.

The multi-level approach requires us to transform the individual-based data into person-contact observations.
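The transformation amounts to a wide-to-long reshape: one record per respondent becomes one row per respondent-contact dyad, with respondent attributes repeated on each row. A minimal standard-library sketch follows; the field names and records are hypothetical, not the survey's variables.

```python
def to_person_contact_rows(respondents):
    """Expand one record per respondent into one row per respondent-contact dyad."""
    rows = []
    for person in respondents:
        # Level-2 attributes are repeated on every dyad row of the same respondent.
        level2 = {key: value for key, value in person.items() if key != "contacts"}
        for contact in person["contacts"]:
            rows.append({**level2, **contact})
    return rows


# Hypothetical records for two respondents
respondents = [
    {"resp_id": 1, "age": 35, "contacts": [
        {"position": "nurse", "closeness": 4, "forgot": 0},
        {"position": "lawyer", "closeness": 2, "forgot": 1},
    ]},
    {"resp_id": 2, "age": 52, "contacts": [
        {"position": "teacher", "closeness": 3, "forgot": 0},
    ]},
]

dyads = to_person_contact_rows(respondents)
print(len(dyads))  # 3
print(dyads[1])
```

Keeping the respondent identifier on every row is what lets a multilevel model group dyads by respondent.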


Variables

Level 2 (respondent level):
  Age
  Years of schooling
  Marital status (reference: married)
  Employment status (reference: employee)
  Occupational prestige score
  Size of daily contacts

Variables

Level 1 (tie level):
  Type of relationship, grouped into six categories: kin, neighbor, school tie, work-related tie, friend, indirect tie
  Length of relationship (in years)
  Closeness
  Gender homophily
  Status difference
    Status distance = absolute difference between respondent’s prestige score and contact’s prestige score
    Status disparity = respondent’s prestige score - contact’s prestige score
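The two status-difference measures differ only in whether the sign is kept. A small sketch of both definitions, with hypothetical function names and prestige values:

```python
def status_disparity(respondent_prestige: float, contact_prestige: float) -> float:
    """Signed difference; negative when the contact has the higher prestige score."""
    return respondent_prestige - contact_prestige


def status_distance(respondent_prestige: float, contact_prestige: float) -> float:
    """Unsigned difference; ignores who ranks higher."""
    return abs(status_disparity(respondent_prestige, contact_prestige))


# Hypothetical respondent (prestige 40) and higher-status contact (prestige 62)
print(status_disparity(40, 62))  # -22
print(status_distance(40, 62))   # 22
```

Distance captures how far apart the two parties are; disparity additionally records whether the contact sits above or below the respondent.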

Descriptive statistics (individual level)

Level-2                       Total      Male       Female
                              (N=2676)   (N=1383)   (N=1293)
Age (in years)                41.62      41.46      41.79
                              (11.66)    (11.62)    (11.70)
Years of education            11.77      12.30      11.19
                              (4.23)     (3.75)     (4.63)
Marital status
  single                      0.23       0.25       0.21
  divorced/widowed            0.06       0.03       0.09
  married                     0.71       0.72       0.71
Employment status
  employee                    0.72       0.68       0.76
  self-employed/employer      0.18       0.23       0.12
  part-timer                  0.03       0.03       0.03
  family worker               0.08       0.05       0.10
Occupation prestige score     39.88      41.26      38.39
                              (12.91)    (13.13)    (12.50)
Size of daily contacts        3.42       3.52       3.31
                              (1.36)     (1.31)     (1.41)

Descriptive statistics (dyad level)

Level-1                   Total        Forgetting   Not forgetting
                          (N=27,103)   (N=4,315)    (N=22,788)
Type of relationship
  kin                     0.21         0.24         0.21
  neighbor                0.07         0.09         0.07
  school tie              0.07         0.07         0.08
  work-related ties       0.35         0.42         0.33
  friends                 0.24         0.12         0.26
  indirect tie            0.05         0.07         0.05
Same sex                  0.61         0.60         0.61
Length of relationship    12.89        12.95        12.88
                          (11.92)      (11.98)      (11.91)
Closeness                 3.46         3.34         3.49
                          (0.99)       (0.99)       (0.99)
Status distance           15.79        16.74        15.61
                          (11.66)      (12.21)      (11.54)
Status disparity          -3.56        -5.89        -3.12
                          (19.30)      (19.87)      (19.16)

Multi-level model predicting “forgetting” (Level-2 model)

                                  MODEL (1)
Level-2 Model
Intercept                         -1.197***
Female (male)                     -.146***
Age (in years)                    .000
Years of schooling                -.054***
Marital status (married)
  Single                          .105+
  Divorced/widowed                -.168+
Employment status (employee)
  Self-employed/employer          -.080
  Part-timer                      -.076
  Family worker                   -.048
Occupation prestige scores        -.007***
Size of daily contacts            -.125***
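Logit coefficients are easier to read as odds ratios, obtained by exponentiating each coefficient. The sketch below applies this to the significant level-2 coefficients of Model (1); the helper name and dictionary labels are illustrative, and in a multilevel model these odds ratios are conditional on the respondent's random effect.

```python
import math


def odds_ratio(logit_coef: float) -> float:
    """Convert a logit coefficient into an odds ratio."""
    return math.exp(logit_coef)


# Coefficients copied from the level-2 part of Model (1)
level2_coefs = {
    "Female (ref: male)": -0.146,
    "Years of schooling": -0.054,
    "Occupation prestige score": -0.007,
    "Size of daily contacts": -0.125,
}

for name, b in level2_coefs.items():
    ratio = odds_ratio(b)
    # (ratio - 1) is the proportional change in the odds of forgetting per unit
    print(f"{name:28s} b={b:+.3f}  OR={ratio:.3f}  change in odds={ratio - 1:+.1%}")
```

For example, the female coefficient of -.146 corresponds to an odds ratio of about 0.86, that is, roughly 14% lower odds of forgetting than for men, other covariates held constant.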

Multi-level model predicting “forgetting” (Level-1 model)

                                            MODEL (1)   MODEL (2)
Level-1 Model
Type of relationship (work-related ties)
  Kin                                       .100*       .100*
  Neighbor                                  .015        .011
  School ties                               -.196***    -.197***
  Friends                                   -.807***    -.804***
  Indirect ties                             .096        .104+
Same sex                                    -.039+      -.106***
(same-sex)×female                                       .122**
Length of relationship                      -.008***    -.007***
Closeness                                   -.173***    -.174***
Status distance                             .007***     .011***
(status distance)×female                                -.007***

Multi-level model predicting “forgetting” (Level-1 model)

                                            MODEL (3)   MODEL (4)
Level-1 Model
Type of relationship (work-related ties)
  Kin                                       .108*       .107*
  Neighbor                                  .022        .020
  School ties                               -.197***    -.200***
  Friends                                   -.814***    -.812***
  Indirect ties                             .098+       .104+
Same sex                                    -.040+      -.106***
(same-sex)×female                                       .119*
Length of relationship                      -.008***    -.008***
Closeness                                   -.179***    -.181***
Status disparity                            -.002**     -.004***
(status disparity)×female                               .004**
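The product terms with the female dummy in Models (2) and (4) imply gender-specific slopes: the main effect applies to men (the reference category), and the main effect plus the interaction applies to women. A short sketch of that arithmetic, assuming female is coded 1 and male 0; the helper name is hypothetical.

```python
def effect_by_gender(main: float, interaction: float) -> dict:
    """Slope of a tie-level variable for male (reference) and female respondents."""
    return {"male": main, "female": main + interaction}


# Coefficients copied from Models (2) and (4)
same_sex_m2 = effect_by_gender(-0.106, 0.122)
status_distance_m2 = effect_by_gender(0.011, -0.007)
status_disparity_m4 = effect_by_gender(-0.004, 0.004)

for label, effect in [("same sex, Model (2)", same_sex_m2),
                      ("status distance, Model (2)", status_distance_m2),
                      ("status disparity, Model (4)", status_disparity_m4)]:
    print(f"{label:28s} male={effect['male']:+.3f}  female={effect['female']:+.3f}")
```

Read this way, the point estimates suggest that the same-sex and status-difference effects are concentrated among male respondents, while the implied slopes for women are close to zero. Whether the female slopes differ significantly from zero cannot be determined from the table alone.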

Findings

Recall error may not be random.

Forgetting is more likely among weak ties.

How does recall error affect the estimation of network-driven indices?

Table 4. Discrepancy between “true” (corrected) and “observed” (raw) network resources indices

                            Corrected score   Raw score   Difference   t-test
Extensity            Mean   9.9               8.5         1.38         39.2
                     SD     5.5               5.5
Range                Mean   40.6              36.7        3.92         25.8
                     SD     16.8              18.6
Upper reachability   Mean   65.2              62.4        2.83         19.3
                     SD     15.2              17.6

Because forgetting is more likely among weak ties, the position generator underestimates embedded network resources.
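To see why omitted contacts bias the indices downward, consider the usual position-generator definitions: extensity as the number of positions accessed, upper reachability as the highest prestige score accessed, and range as the highest minus the lowest score. These definitions, and the idea that the corrected score adds back positions coded as forgotten, are assumptions of this sketch, since the slides do not spell out the computation. All prestige values are hypothetical.

```python
def network_indices(prestige_scores):
    """Position-generator indices from the prestige scores of accessed positions."""
    if not prestige_scores:
        return {"extensity": 0, "upper_reachability": None, "range": None}
    highest = max(prestige_scores)
    lowest = min(prestige_scores)
    return {
        "extensity": len(prestige_scores),   # number of positions accessed
        "upper_reachability": highest,       # best position reached
        "range": highest - lowest,           # spread of positions reached
    }


# Hypothetical respondent
raw = [40, 55, 62]            # positions named at wave I
forgotten = [28, 78]          # positions coded as forgotten at wave I
corrected = raw + forgotten   # assumed correction: add forgotten positions back

print(network_indices(raw))        # {'extensity': 3, 'upper_reachability': 62, 'range': 22}
print(network_indices(corrected))  # {'extensity': 5, 'upper_reachability': 78, 'range': 50}
```

Under these definitions, adding positions can never lower any of the three indices, so a raw score is at most equal to its corrected counterpart.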

Table 5. Correlations between “true” (corrected) and observed (raw) network resources indices at wave I (N=3,272)

                          Corrected indices                     Raw indices at wave I
                          Extensity   Range   Reachability      Extensity   Range
Corrected indices
  Extensity               --
  Range                   .817
  Reachability            .692        .886
Raw indices at wave I
  Extensity               .934        .745    .632
  Range                   .792        .884    .776              .832
  Reachability            .674        .804    .880              .694        .865

Conclusions

Forgetting a contact was not a rare occurrence.

Recall error is largely nonrandom.

Status difference appears to govern the recall process.

The position generator systematically underestimates network-driven resource indices.
