EVALUATION OF THE NSRCG SCHOOL SAMPLE Donsig Jang and Xiaojing Lin Third International Conference on Establishment Surveys Montreal, Canada, June 21, 2007

Page 1

EVALUATION OF THE NSRCG SCHOOL SAMPLE

Donsig Jang and Xiaojing Lin

Third International Conference on Establishment Surveys

Montreal, Canada, June 21, 2007

Page 2

Outline

Sampling options on repeated establishment surveys
Reasons to keep the same sample in establishment surveys
Issues in keeping the same sample
Example: NSRCG school sample
Summary
Recommendation for 2008 NSRCG School Sample

Page 3

Sampling options on repeated establishment surveys

Keep the same sample over time with supplemental samples for births
  – Efficient change estimates BUT
  – Response burden
  – Inefficient “cross-sectional” estimates
An independent sample in each survey round
Sample coordination to maximize overlaps between samples
  – Rotation samples (Sigman and Monsour 1995)
  – Permanent random number technique (Ohlsson 1995, 2001)
  – Keyfitz procedure (Keyfitz 1951)
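
The permanent random number idea can be illustrated with a minimal sketch. It assumes Poisson sampling, in which each unit keeps one uniform random number for life and is selected in a given round when that number falls below its inclusion probability for that round. The school identifiers, the probabilities, and the helper names `assign_prns` and `prn_poisson_sample` are hypothetical, and this is not claimed to be the specific variant in the cited references.

```python
import random

def assign_prns(unit_ids, seed=0):
    """Give every unit one permanent uniform(0, 1) random number."""
    rng = random.Random(seed)
    return {u: rng.random() for u in unit_ids}

def prn_poisson_sample(prns, incl_prob):
    """Select a unit when its permanent number is below its inclusion probability."""
    return {u for u, p in incl_prob.items() if prns[u] < p}

# Hypothetical frame of 1,000 schools with made-up inclusion probabilities.
schools = [f"school_{k}" for k in range(1000)]
prns = assign_prns(schools, seed=2007)
round1 = {u: 0.30 for u in schools}
# In round 2 the probabilities drift slightly; the random numbers stay fixed.
round2 = {u: (0.33 if k % 2 else 0.27) for k, u in enumerate(schools)}

s1 = prn_poisson_sample(prns, round1)
s2 = prn_poisson_sample(prns, round2)
overlap = len(s1 & s2) / len(s1)  # share of round-1 schools retained in round 2
```

Because the random numbers are reused, a school whose probability did not fall is always retained, which is what produces the high overlap. Poisson sampling gives a random sample size, so a production design would likely use a fixed-size variant.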

Page 4

Reasons to keep the same sample in establishment surveys

Difficulty in identifying point of contact
Costly efforts in gaining participation
Often requires nontrivial process to gather information
  – previous survey participation would help

Page 5

Issues in keeping the same sample

Can they be a representative sample of the current cross-sectional population?
  – Depending on how dynamic the population is over time
      – coverage issues: births vs. deaths
      – sample efficiency: distributional changes
Alternatives
  – Independent sample from the most up-to-date sample frame
  – Coordination of samples
      – E.g., Keyfitz procedure to maximize the sample overlap between the current and the previous ones

Page 6

National Survey of Recent College Graduates (NSRCG)

Repeated every two or three years
Collects education, demographic, and employment information from recent college graduates (bachelor’s and master’s) majoring in science, engineering, and health fields
Two-stage sample design
  – 1st stage: select schools and obtain the list of graduates from selected schools
  – 2nd stage: select graduates from the list provided by schools
NSF-sponsored survey

Page 7

NSRCG: List collection from schools

Identify point of contact (usually institutional coordinator)
Gather the list of graduates with key sampling and locating information including:
  – degree award dates
  – degree level
  – field of major
  – race/ethnicity
  – gender
  – date of birth
  – SSN
  – student ID
  – mailing addresses including parent’s addresses
  – phone numbers (land line, cell)
  – emails, etc.

Page 8

NSRCG: List collection from schools (continued)

Need a good understanding of the information requested and file format
Time-consuming and costly efforts
  – different schools have different issues
A crucial part for the quality of the survey
  – strive to get almost perfect cooperation rate (99%)
  – Out of 300 schools,
      – only four final refusals in 2003
      – only five refusals in 2006

Page 9

NSRCG: School sample selection

For 1995, 1997, 1999, 2001 surveys
  – 275 schools initially selected in 1995 and kept with 5 supplemental samples added over three survey rounds (to account for frame coverage)
A new sample of 300 schools selected in 2003:
  – To reflect rapid changes of S&E populations in 1990’s
  – Health field added to the survey as eligible field of study

Page 10

NSRCG: School sample selection (continued)

Probability proportional to size (PPS) with composite size measure
Composite size measures calculated to achieve equal weights within each of the NSRCG analytic domains, constructed by a combination of:
  – degree year, degree level, field of major, race/ethnicity, and gender
Population dynamics
  – new schools (birth), closed (death), no S&E graduates (temporarily ineligible), etc.
Coverage issue
  – distributions of schools changed (in terms of composite size measures)
      – potential factor affecting the sample efficiency

Page 11

2003 NSRCG school sample

In both 2001 and 2003 NSRCG: 170 (57%)
Only in 2003 NSRCG: 130 (43%)
Total: 300

Excessive efforts (time and resources) to achieve a 99% response rate (4 schools refused)

Page 12

Distribution of list submission dates in 2003 NSRCG

[Figure: density of list submission dates, x-axis in days (0 to 240), for two groups of schools: both in 2001 and 2003, and only in 2003]

Page 13

School sample after 2003 NSRCG – 2006 NSRCG

Frame evaluation

Graduate counts are shown by academic year (AY).

Frame status                         Frame schools  Sample schools   AY2001   AY2002   AY2003   AY2004   AY2005
In 2003 frame but not in 2006 frame             48               0    3,077    1,092        0        0       23
In both 2003 and 2006 frames                 1,762             300  624,297  639,411  671,868  702,021  722,727
Not in 2003 frame but in 2006 frame            190               0        0      570    4,369    7,396    6,819

2003 Frame based on AY2001 IPEDS counts

2006 Frame based on AY2003 and AY2004 IPEDS counts
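
As a quick check on how much of the graduate population the frame changes affect, the counts in the frame evaluation table can be turned into shares. This is only a sketch of the arithmetic, using the AY2005 and AY2001 graduate counts from the table; the variable names are illustrative.

```python
# Graduate counts taken from the frame evaluation table.
in_both_ay2005 = 722_727   # schools in both the 2003 and 2006 frames
births_ay2005 = 6_819      # schools not in the 2003 frame but in the 2006 frame
in_both_ay2001 = 624_297
deaths_ay2001 = 3_077      # schools in the 2003 frame but not in the 2006 frame

# Share of AY2005 graduates attending schools new to the 2006 frame.
birth_share = births_ay2005 / (in_both_ay2005 + births_ay2005)
# Share of AY2001 graduates attending schools later dropped from the frame.
death_share = deaths_ay2001 / (in_both_ay2001 + deaths_ay2001)

print(f"births: {birth_share:.2%}, deaths: {death_share:.2%}")
```

Both shares come out below one percent of the graduates in the respective year.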

Page 14

Graduate counts dropped from and added to the population

                                        Eligible in 2003 but      Newly eligible for
                                        ineligible in 2006        2006 NSRCG
Domain                                  Count    Percentage       Count    Percentage
Degree Level
  Bachelor                              3,476    0.35             5,252    0.49
  Master                                5,380    1.97             1,775    0.60
Race/Ethnicity
  Non-Hispanic White                    5,461    0.66             4,109    0.48
  Asian, Pacific Islander, Nonresident  2,643    1.04             1,610    0.54
  Hispanic, Black, American Indian        752    0.40             1,308    0.63
Gender
  Male                                  3,949    0.71             3,967    0.65
  Female                                4,907    0.69             3,060    0.41

Page 15

Graduate counts dropped from and added to the population

                                        Eligible in 2003 but      Newly eligible for
                                        ineligible in 2006        2006 NSRCG
Field of Major                          Count    Percentage       Count    Percentage
Chemistry                                  15    0.06               109    0.46
Physics/Astronomy                           0    0.00                 0    0.00
Other Physical Sciences                    30    0.23                44    0.34
Mathematics/Statistics                     36    0.10               120    0.30
Computer Sciences                         687    0.56             2,329    1.53
Environmental, Geological
  and Agricultural Sciences               585    1.81                69    0.27
Aerospace Engineering                       0    0.00                33    0.55
Chemical Engineering                        0    0.00                 0    0.00
Civil Engineering                           1    0.00                86    0.34
Electrical Engineering                    242    0.43               723    1.06
Industrial Engineering                      0    0.00                 0    0.00
Mechanical Engineering                      0    0.00               112    0.31
Other Engineering                         266    0.87               171    0.51
Biological Sciences                     1,165    0.80               273    0.18
Psychology                                182    0.09             1,079    0.52
Economics                                  39    0.08               129    0.21
Sociology/Anthropology                     57    0.07                76    0.08
Other Social Sciences                     321    0.53               398    0.59
Political Science                         145    0.17               251    0.24
Health-Related – Nursing                  163    0.16               741    0.71
Health-Related – all else               4,922    3.62               284    0.23

Page 16

2006 NSRCG School Sample

No significant change of the population
  – Kept the same school sample without any supplemental sample

Page 17

Distribution of list submission dates in 2006 NSRCG

[Figure: density of list submission dates, x-axis in days (20 to 230), for two groups of schools: both in 2001 and 2006, and only in 2006]

Page 18

2008 NSRCG?

Evaluate the current sampling strategy (keeping the same sample) by doing
  – frame evaluation
  – comparisons with other sampling schemes
      – Independent PPS
      – Keyfitz procedure

Page 19

2008 NSRCG

Frame evaluation

Graduate counts are shown by academic year (AY).

Frame status                         Frame schools  Sample schools   AY2001   AY2002   AY2006
In 2003 frame but not in 2008 frame             78               2    4,643    2,584        0
In both 2003 and 2008 frames                 1,732             298  622,731  637,919  744,070
Not in 2003 frame but in 2008 frame            294               0        0      494   11,755

2003 Frame based on AY2001 IPEDS counts

2008 Frame based on AY2006 IPEDS counts

Page 20

Graduate counts dropped from and added to the population

Page 21

Sample Evaluation

Three sample selection methods considered
  – Keep the 2003 school sample with a supplemental sample of size 4
  – Independent PPS with composite size measures based on updated frame information
  – Keyfitz procedure

Page 22

PPS sample selection procedure

Define size measure:

$$S_i = \sum_{d} \frac{m_d}{M_d} M_{id}$$

where
$m_d$ is the sample size of domain $d$,
$M_d$ is the population size of domain $d$,
$M_{id}$ is the population size of domain $d$ in school $i$, and
domain $d$ is constructed from a combination of: graduate year, degree level, field of major, race/ethnicity, and gender
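
The size measure on this slide, S_i = sum over domains d of (m_d / M_d) * M_id, can be computed directly from domain-by-school counts. The sketch below is illustrative only: the function name `composite_size_measures`, the toy frame of three schools, and the two domains are assumptions, not NSRCG data.

```python
def composite_size_measures(domain_sample_sizes, school_domain_counts):
    """Return S_i = sum_d (m_d / M_d) * M_id for every school i.

    domain_sample_sizes: {domain: m_d}
    school_domain_counts: {school: {domain: M_id}}
    """
    # M_d: population size of each domain, summed over all schools.
    domain_totals = {d: 0 for d in domain_sample_sizes}
    for counts in school_domain_counts.values():
        for d, m_id in counts.items():
            domain_totals[d] += m_id
    # m_d / M_d: overall sampling rate planned for domain d.
    rates = {d: domain_sample_sizes[d] / domain_totals[d]
             for d in domain_sample_sizes}
    return {
        school: sum(rates[d] * m_id for d, m_id in counts.items())
        for school, counts in school_domain_counts.items()
    }

# Hypothetical toy frame: three schools, two domains.
counts = {
    "A": {"bachelor": 400, "master": 100},
    "B": {"bachelor": 500, "master": 0},
    "C": {"bachelor": 100, "master": 100},
}
m = {"bachelor": 50, "master": 40}
sizes = composite_size_measures(m, counts)
```

A useful property of this definition is that the size measures sum to the total planned graduate sample size (here 50 + 40 = 90), so S_i can be read as the expected number of sampled graduates that school i would contribute.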

Page 23

PPS sample selection procedure

School $i$ selected with probability ($\pi_i$) proportional to size $S_i$
Achieve equal weight within each domain $d$
Distributional changes of the NSRCG graduate populations would cause weights to vary within domains
Independent PPS with up-to-date frame data is desirable if weight variation is severe
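
One common way to judge how severe weight variation is within a domain is Kish's approximate design effect due to unequal weighting, n * sum(w^2) / (sum(w))^2, which equals 1 when all weights are equal. This is an assumption for illustration; the slides do not say which measure was used. The weights below are made up.

```python
def kish_unequal_weighting_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)

# Hypothetical domain with 50 sampled graduates.
# Equal weights, as the composite size measure design intends.
equal = kish_unequal_weighting_effect([20.0] * 50)
# Weights after the population has shifted in half of the schools.
drifted = kish_unequal_weighting_effect([20.0] * 25 + [30.0] * 25)
```

In this toy case the drifted weights inflate the variance by about 4 percent relative to equal weights, which gives a sense of scale for deciding whether a redraw is worth the cost.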

Page 24

Keyfitz procedure

Maximize the overlap between two samples
The first sample (2003 NSRCG) was selected with PPS
The second sample inclusion probability is dependent upon:
  – updated size measures
  – the first sample inclusion probability
  – the actual sample realization in the first sample
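
The dependence on the first sample can be seen in the classic form of the Keyfitz procedure, which draws one unit per stratum. The sketch below assumes that simple case; selecting 300 schools would need a multi-unit extension that the slides do not specify. The function names and the three-unit stratum are hypothetical.

```python
import random

def keyfitz_transition(old_unit, p, q):
    """Conditional distribution of the new selection given the old one.

    p, q: dicts of old and new selection probabilities (each sums to 1)
    for a stratum from which exactly one unit is drawn.
    """
    if q[old_unit] >= p[old_unit]:
        return {old_unit: 1.0}          # probability did not fall: keep the unit
    keep = q[old_unit] / p[old_unit]    # probability fell: keep with ratio q/p
    gains = {u: q[u] - p[u] for u in p if q[u] > p[u]}
    total_gain = sum(gains.values())
    dist = {old_unit: keep}
    for u, g in gains.items():          # otherwise move to a unit that gained
        dist[u] = (1.0 - keep) * g / total_gain
    return dist

def keyfitz_reselect(old_unit, p, q, rng=random):
    """Draw the new unit given the unit selected in the first sample."""
    dist = keyfitz_transition(old_unit, p, q)
    units = list(dist)
    return rng.choices(units, weights=[dist[u] for u in units], k=1)[0]

# Hypothetical stratum: old and updated selection probabilities.
p = {"A": 0.5, "B": 0.3, "C": 0.2}
q = {"A": 0.4, "B": 0.3, "C": 0.3}

# Unconditional probability of each unit in the second sample.
marginal = {u: 0.0 for u in p}
for old, p_old in p.items():
    for new, prob in keyfitz_transition(old, p, q).items():
        marginal[new] += p_old * prob

# Probability that the second sample keeps the first sample's unit.
retained = sum(p[u] * keyfitz_transition(u, p, q).get(u, 0.0) for u in p)
```

The marginal probabilities reproduce the updated probabilities `q`, while the retention probability equals the sum of min(p, q) over units, which is the largest overlap achievable in this one-unit case.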

Page 25

Simulation of sampling procedures

Generate 1,000 “independent” school samples for each of the following options
  – Keep the same school sample with a supplemental sample of size 4 from the newly eligible schools (“births”)
  – Independent PPS sampling using measures of size (MOS) calculated from the 2008 NSRCG frame
  – Keyfitz procedure
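
A minimal sketch of the independent PPS option in such a simulation is given below. It assumes systematic PPS selection on a randomly ordered frame, which the slides do not confirm, and it uses a made-up frame of 200 schools with a sample of 30 rather than NSRCG data. The function name `pps_systematic` is hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pps_systematic(sizes, n, rng):
    """Draw n units by systematic PPS; assumes every n * S_i / sum(S) is below 1."""
    units = list(sizes)
    rng.shuffle(units)                  # random order before the systematic pass
    total = sum(sizes.values())
    # Cumulated inclusion probabilities; the last value is n up to rounding.
    cum = list(accumulate(n * sizes[u] / total for u in units))
    start = rng.random()
    picks = []
    for k in range(n):
        # Unit whose cumulative interval contains the point start + k.
        idx = min(bisect_right(cum, start + k), len(units) - 1)
        picks.append(units[idx])
    return set(picks)

rng = random.Random(2008)
# Made-up size measures for the earlier frame and an updated frame.
old_sizes = {k: 1 + (k % 10) for k in range(200)}
new_sizes = {k: s * rng.uniform(0.8, 1.2) for k, s in old_sizes.items()}

sample_2003 = pps_systematic(old_sizes, 30, rng)
replicates = [pps_systematic(new_sizes, 30, rng) for _ in range(1000)]

# Average number of schools shared with the earlier sample across replicates.
mean_overlap = sum(len(s & sample_2003) for s in replicates) / len(replicates)
```

Replicates for the other two options would be generated the same way, swapping in the retained sample plus supplement or a Keyfitz-style reselection, so that overlap and weight variation can be compared across options.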

Page 26

Page 27

Page 28

Summary

Keeping the same sample is a cost-effective option
Concern about statistical inefficiency due to the dynamic nature of the population
Frame coverage corrected by supplemental sample
Evaluate the NSRCG school sample
  – Empirical frame evaluation
  – Samples simulated based on two methods
Distribution changes (in terms of composite size measure) would make the final sample inefficient:
  – Weight variation within planned domains
  – Over- or underestimation of graduates in some domains

Page 29

Recommendation

Keep the same school sample with a supplemental sample of size 4 for the 2008 NSRCG