2008 NCHS Data Users’ Conference Omni Shoreham Hotel Washington, DC Wednesday, August 13, 2008


2008 NCHS Data Users’ Conference

Omni Shoreham Hotel

Washington, DC

Wednesday, August 13, 2008


Session 43:

The Research Data Center:

Data Enclave of the NCHS

Session Coordinator and Presenter: Deborah Rose, Ph.D.


Speakers and topics (updated list)

Introduction to the RDC:

• Overview of the Research Data Center – What it is and what it does - Deborah Rose, Ph.D., M.P.H.

• Types of data available - Stephanie Robinson, B.A.


Speakers and topics (updated list, continued)

Examples of RDC research collaborations:

• Emergency medicine research and the RDC - Julius Cuong Pham, M.D., Ph.D.

• Assessing health and health care in the District of Columbia - Carole Roan Gresenz, PhD

• Combining contextual variables with data from NHANES III - Chloe Bird, Ph.D.


What is the RDC?

• Location: A suite of offices in Hyattsville, Maryland

• Staff: Project managers who are experienced analysts

• Security: Keypad-access office suite, stand-alone computers

• Data: Public and confidential health and other information, combined and customized for your project

• Access: Onsite, remote, Census RDC


Start of the RDC

• Modeled after the Census Bureau Research Data Centers

• Opened in 1998

• Policies were developed to assure access and confidentiality


Two contradictory mandates

• Wide dissemination of the data - The Public Health Service Act of 1956 requires the collection and wide dissemination of data.

• Maintenance of confidentiality - the NCHS 308(d) Confidentiality Statute requires that the information collected may not be released if the establishment or person supplying the information is identifiable


Resolving the contradiction

• Summary tables of aggregate data are published (on paper or on the web)

• Public use datasets are released with person or institution level records

• Records do not include individual identifiers

• Variables that might allow record identification are suppressed

• Values based on small samples are suppressed
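The last rule, suppressing values based on small samples, can be illustrated with a minimal Python sketch. The threshold of 30 and the table contents are assumptions chosen for illustration; the slides do not state the actual NCHS cutoff.

```python
# Minimal sketch of small-cell suppression in a summary table.
# MIN_CELL_SIZE is a hypothetical threshold, not an official NCHS value.
MIN_CELL_SIZE = 30


def suppress_small_cells(table, min_size=MIN_CELL_SIZE):
    """Return a copy of `table` with estimates from small samples masked.

    `table` maps a cell label to a (sample_size, estimate) pair.
    Cells whose sample size is below `min_size` get an estimate of None.
    """
    return {
        label: (n, estimate if n >= min_size else None)
        for label, (n, estimate) in table.items()
    }


# Made-up example data: (sample size, estimated prevalence in percent).
example = {
    "County A": (412, 11.8),
    "County B": (12, 25.0),
    "County C": (30, 9.5),
}

for label, (n, estimate) in suppress_small_cells(example).items():
    shown = "suppressed" if estimate is None else f"{estimate:.1f}%"
    print(f"{label}: n={n}, estimate={shown}")
```

Only the estimate is masked here; whether the sample size itself may be shown is a separate disclosure decision that this sketch does not address.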


When do you need the RDC?

Data needs and availability

• You have a research project or policy objective best served by analyzing representative, federally collected health data

• Public use data does not meet all your needs

• NCHS has, or can get, the data of interest


When do you need the RDC? (continued)

Analytic skills and computer access

• You or your staff have the skills to analyze individual-level data using a standard statistical package

• You can come to the NCHS RDC or a Census RDC, or you have a secure email account for accessing our remote system.


When do you need confidential variables?

• Confidential information is directly related to your main research question.

• You need to link two or more datasets using small-area geographic identifiers (such as state, county, or census tract) that are not publicly available.

• You need to make a subset of the population using selection criteria from a confidential variable, such as exact age, date of interview, or small race/ethnic group.
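The linking and subsetting cases above can be sketched in a few lines of Python. All records, variable names (`county_fips`, `exact_age`, `median_income`), and values are made up for illustration; in practice RDC staff perform the merge with the confidential identifiers, and the researcher does not handle them directly.

```python
# Sketch: attach county-level contextual variables to person-level records,
# then subset on a confidential variable (exact age). All data are made up.

# Person-level records; "county_fips" and "exact_age" stand in for
# confidential variables that a public-use file would not carry.
persons = [
    {"person_id": 1, "county_fips": "24033", "exact_age": 67, "has_asthma": True},
    {"person_id": 2, "county_fips": "11001", "exact_age": 34, "has_asthma": False},
    {"person_id": 3, "county_fips": "24033", "exact_age": 71, "has_asthma": False},
]

# User-supplied contextual file keyed by the same geographic identifier.
county_context = {
    "24033": {"median_income": 70000},
    "11001": {"median_income": 58000},
}


def link_by_county(people, context):
    """Left-join contextual variables onto each person record by county FIPS."""
    linked = []
    for person in people:
        extra = context.get(person["county_fips"], {})
        linked.append({**person, **extra})
    return linked


def subset_by_age(people, min_age):
    """Keep records whose exact age is at least `min_age`."""
    return [p for p in people if p["exact_age"] >= min_age]


linked = link_by_county(persons, county_context)
older_adults = subset_by_age(linked, min_age=65)
print([p["person_id"] for p in older_adults])
```

A left join keeps every person record even when no contextual row matches, which avoids silently dropping respondents from the analysis file.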


Types of Data

• NCHS Supplied: Confidential variables from the vital statistics system, any of the NCHS data collection systems, or files linked between systems

• User supplied: Public use or other data collected by other agencies, or compiled by the user

• See next presentation for more detail


Major Steps

• See the website for the latest requirements

• Develop and submit a proposal

• We review it and accept, reject or ask for revisions

• You sign the confidentiality agreements

• You send us the public use files

• We merge the public and confidential data

• We send you an invoice for the setup and usage costs


Major Steps (continued)

• You run your analyses

• We review the output for disclosure

• You publish

• Please send us a citation and copy of your published or reported work!


Components of the proposal

• Contact information

• Key study questions/Public health benefits

• Year, data system and dataset(s)

• Lists of public use and confidential variables

• Why publicly available data are insufficient

• Analysis/statistical methods/software

• Sample output table shells


NCHS User Fees

File construction and setup

• Mortality files = $250 per day

• All other files = $500 per day


NCHS User Fees (continued)

Access and Analysis

On site

• $200 per day (2-10 days)

Remote

• NSFG-CDF = $500/year

• All other files = $500/month

• Each added survey cycle = $250/month
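As a worked example of the fee schedule, the following Python sketch estimates the cost of one on-site and one remote project. The rates come from the slides; how they combine into a single invoice, the reading of "2-10 days" as a hard limit, and the example durations are assumptions of this sketch. The NSFG-CDF yearly rate is left out for brevity.

```python
# Sketch of a cost estimate built from the fee schedule on these slides.
# How the fees combine into one invoice is an assumption of this example.

SETUP_RATE_PER_DAY = {"mortality": 250, "other": 500}  # file construction and setup
ONSITE_RATE_PER_DAY = 200          # on-site access, quoted for 2-10 days
REMOTE_RATE_PER_MONTH = 500        # remote access, files other than NSFG-CDF
ADDED_CYCLE_RATE_PER_MONTH = 250   # each added survey cycle, remote access


def setup_cost(file_type, days):
    """Setup cost for `days` of file construction of the given file type."""
    return SETUP_RATE_PER_DAY[file_type] * days


def onsite_cost(days):
    """On-site access cost; treats the quoted 2-10 day range as a limit."""
    if not 2 <= days <= 10:
        raise ValueError("the slide quotes the on-site rate for 2-10 days only")
    return ONSITE_RATE_PER_DAY * days


def remote_cost(months, added_cycles=0):
    """Remote access cost, with an extra monthly charge per added survey cycle."""
    monthly = REMOTE_RATE_PER_MONTH + ADDED_CYCLE_RATE_PER_MONTH * added_cycles
    return monthly * months


# Hypothetical projects: 2 setup days each, then 5 days on site
# or 3 months of remote access with one added survey cycle.
onsite_total = setup_cost("other", days=2) + onsite_cost(days=5)
remote_total = setup_cost("other", days=2) + remote_cost(months=3, added_cycles=1)
print(f"On-site project: ${onsite_total}")
print(f"Remote project:  ${remote_total}")
```

Under these assumptions the on-site project costs $1,000 for setup plus $1,000 for access, and the remote project costs $1,000 for setup plus $2,250 for access.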


ANDRE: ANalytical Data Research by Email

• Completely automated system

• Operates round the clock without any human intervention

• Registered subscribers only

– Proposals already reviewed and approved

– Confidentiality agreements have been signed

• Unlimited access during the subscription period


How ANDRE Works

• A registered subscriber sends an email to ANDRE with a SAS or SUDAAN program in an attachment

• ANDRE’s lead server authenticates the user through password challenge and email

• Researchers never see data but run their programs against a data set prepared to their specifications by RDC staff
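The first step above, emailing a program as an attachment, might look like the following Python sketch, which builds the message without sending it. The addresses, subject line, file name, and the SAS library and dataset names are all hypothetical; the slides do not give ANDRE's actual address or required message format, and the password challenge is not modeled here.

```python
from email.message import EmailMessage

# Hypothetical addresses: the slides do not give ANDRE's actual address.
ANDRE_ADDRESS = "andre@example.org"
SUBSCRIBER_ADDRESS = "researcher@example.org"

# A tiny SAS program; the library and dataset names are placeholders.
sas_program = """\
proc freq data=rdc.project_file;
  tables agegrp * outcome;
run;
"""


def build_submission(sender, recipient, program_text, filename="analysis.sas"):
    """Build (but do not send) an email carrying a program as an attachment."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Program submission"
    message.set_content("Program attached.")
    # Passing a str attaches it as text/plain with an attachment disposition.
    message.add_attachment(program_text, filename=filename)
    return message


submission = build_submission(SUBSCRIBER_ADDRESS, ANDRE_ADDRESS, sas_program)
attachment = next(submission.iter_attachments())
print(attachment.get_filename())
```

The program requests only an aggregate table, which fits the model described above: the researcher receives output after disclosure review rather than the records themselves.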


NCHS RDC Usage Statistics

Average no. of projects 1998-2003 = 1.5/month

Average no. of projects 2004-2006 = 2.5/month

Average no. of proposals 2007 = 10/month

Current no. of active projects, 2007 = 146

Average no. of daily remote users 2007 = 18

Average no. of proposals 2008 = 7/month

Current no. of active projects, 2008 > 200

Average no. of daily remote users 2008 = 30


Visit the NCHS RDC website at:

http://www.cdc.gov/nchs/r&d/rdc.htm


For more information, visit the NCHS RDC website at:

www.cdc.gov/nchs/r&d/rdc.htm