Public Use Microdata Samples Using PDQ Explore Software, Grace York, University of Michigan Library, May 2004


Page 1: Public Use Microdata Samples

Using PDQ Explore Software

Grace York
University of Michigan Library

May 2004

Page 2: 2000 Census Data Tabulations

• Summary Files 1-4, Equal Employment Opportunity, School District Data, and Work Flow data are TABULATED data

• American Factfinder EXTRACTS the tabulated data

Page 3: Public Use Microdata Samples

• Copies of the original questionnaires with identifying information edited out

• Create your own cross tabulations of census data

Page 4: Typical PUMS Questions

• Single years of age by sex for teachers in Michigan (e.g. when will they retire?)

• Race of those with Arab ancestry (no, they are not all white)

• Demographic characteristics of immigrants from Senegal (age, sex, education, occupation, income, citizenship for a social survey)

• Age, race and sex of automotive industry employees (campaign for organ donations)

Page 5: PUMS Software Programs

• FTP data from Census Bureau (and manipulate with SAS or SPSS)

http://www.census.gov/Press-Release/www/2003/PUMS5.html

• Census Bureau CD-ROMS (Beyond 20/20 software)

http://www.census.gov/mp/www/Tempcat/PUMS.html

• SDA Software for Michigan (UMich Only)

http://nds.umdl.umich.edu/n/nds/

• PDQ Explore

http://www.pdq.com

Page 6: PDQ Explore Software

• Easy interface to
– Public Use Microdata Samples, 1% and 5%, 1980-2000
– IPUMS, edited PUMS, 1850-1880, 1900-1920, 1940-1990
– Current Population Survey, 1991+
– Mortality Schedules

• Permits users to tabulate their own variables

Page 7: Access to PDQ

• Librarians may request free IDs, passwords, and software from PDQ

• Send e-mail to [email protected] stating that:
– You are a librarian who talked to Grace York
– You are requesting an ID and password for using PDQ Explore
– You want to download software for the PDQ Toolbox, Expert Edition

http://www.pdq.com

Page 8: Software

• Download the software per instructions to your hard drive

• To begin searching, open the icon on your desktop

Page 9: Before Beginning…

Choose File

Two PUMS files: the 1% sample and the 5% sample

• 1% has data for the nation, states, MSAs and super-PUMAs (areas with at least 400,000 people)

• 5% has data for the nation, states, MSAs and PUMAs (areas with at least 100,000 people)

Page 10: Before Beginning…

Define the data you want in terms of a spreadsheet. The dimension with more categories should be defined as rows rather than columns.

Example: I want single years of age by sex for all Vietnam-era veterans in the United States

Universe = Vietnam-era veterans in the U.S.
Column = sex (not very wide)
Row = single years of age (could be long)
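The universe/row/column idea can be sketched outside PDQ Explore. The Python below is only an illustration, not how PDQ works internally: the records are invented, the field names simply mirror the codebook variables AGE, SEX and VPS5, and the counts are unweighted (real PUMS estimates would also need the sample weights).

```python
from collections import Counter

# Hypothetical person records (invented values, not real PUMS data).
records = [
    {"age": 52, "sex": 1, "vps5": 1},
    {"age": 52, "sex": 2, "vps5": 1},
    {"age": 53, "sex": 1, "vps5": 1},
    {"age": 53, "sex": 1, "vps5": 0},  # outside the universe
    {"age": 52, "sex": 1, "vps5": 1},
]

def crosstab(rows, universe, row_key, col_key):
    """Count the records inside the universe by (row value, column value)."""
    counts = Counter()
    for rec in rows:
        if universe(rec):
            counts[(rec[row_key], rec[col_key])] += 1
    return counts

# Universe = vps5 == 1, Row = age, Column = sex
table = crosstab(records, lambda r: r["vps5"] == 1, "age", "sex")
for (age, sex), n in sorted(table.items()):
    print(age, sex, n)
```

Each printed line is one cell of the spreadsheet: the row value, the column value, and the count of matching records.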

Page 11: Before Beginning…

Consult Chapter 7 of the PUMS codebook to check the available variables, and its appendices for place, language, ancestry, and occupation codes

http://www.census.gov/prod/cen2000/doc/pums.pdf

Chapter 7 is also available on the University of Michigan web site at:

http://www.lib.umich.edu/govdocs/census2/pums2000/pums7.pdf

Page 12: Before Beginning…

Housing Record
– All geographic codes (state, MSA, PUMA)
– All housing records
– Some population records

Population Record
– All population variables
– OK to combine with geographic codes in housing
– Ask for help with other population/housing combinations at: [email protected]

Page 13: Before Beginning…

Variable Codes for the Question in the Technical Documentation Data Dictionary

AGE Single Years of Age

SEX Male or Female

VPS5 Veteran’s Period of Service 5: On active duty during the Vietnam Era (Aug. 1964 to Apr. 1975)

http://www.lib.umich.edu/govdocs/census2/pums2000/pums7.pdf

Page 14: Logging On

Enter the subscriber name and password that you were given by the PDQ staff

Page 15: Logging On

Press OK to close the message of the day

Page 16: Defining Workspace

• To conduct a new search, create a new workspace

• Press Finish, or press Return twice

Page 17: Defining Workspace

Name your file on your hard drive and save.

Page 18: Defining Workspace

At the next screen, use the top menu to choose Workspace; then Add a Data Set

Page 19: Defining Workspace

Browse the data sets, highlight the ipums, pums, cps, or mortality file, and click Open

Page 20: Defining Variables

• Once you choose a data set, its codebook will open up

• Click on the plus button to get a list of variables, their alphabetic symbols, and any numeric values

Page 21: Defining Variables

• Determine the variable codes you want (e.g. Vietnam-era veteran = yes is VPS5=1)

• Use the top menu to choose Query/Setup New Expert Query

(Access the codebook later through a tab on the desktop toolbar)

Page 22: Expert Query Form

1. Make sure you have the correct data set
2. Determine if you want a tabulation (counts or numbers)
3. Name your file

Page 23: Expert Query Form

Enter the code for the UNIVERSE (what you’re counting) in the Universe box (e.g. vps5=1 selects Vietnam-era veterans for the entire U.S.)

Page 24: Expert Query Form

• Enter the code for the variables in the ROW box (age = single years of age; age/5 would be five year age groups)

• Enter the code for the variables in the COLUMN box (e.g. sex)

• Press RESULTS to run the query
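To see why age/5 yields five-year groups, here is a small Python sketch. It assumes the division is integer division (so ages 20 through 24 all map to the same group); that is an assumption about PDQ's behavior, and the helper name five_year_group is hypothetical.

```python
def five_year_group(age):
    """Return the (low, high) bounds of the five-year group containing age.

    Assumes age/5 behaves like integer division, so 20..24 -> group 4.
    """
    low = (age // 5) * 5
    return low, low + 4

ages = [17, 20, 24, 25, 64]
groups = [five_year_group(a) for a in ages]
for age, (low, high) in zip(ages, groups):
    print(f"age {age} falls in group {low}-{high}")
```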

Page 25: Search Results

Search results appear in spreadsheet format

Page 26: Saving Results

• Click on File/Export Query Results

• You can save as CSV, tab delimited, and several other formats. CSV (WYSIWYG) is recommended for use with Excel

• Use the SETUP button to return to the query, or the icon at the bottom to review the codebook

Page 27: Geographic Codes

• Geographic codes are found in the Housing documentation

• Limit files to Michigan with the code state=26

• Click on Query/New Expert Query to continue

Page 28: Narrowing the Universe

Narrow the universe by using & newcode (e.g. vps5=1 & state=26)

Page 29: Logical Operators in PDQ

http://www.lib.umich.edu/govdocs/census2/pdqop.pdf

& is one of numerous operators used in PDQ

Operator      Name                    Example/Comment
X:a..b        range                   age:15..44
unary +       plus                    sex=+1 (never needed)
unary -       minus                   income4<=-1000
*             multiply                73*income1/100
/             divide                  rhhinc/persons
%             modulo                  subsample%10
+             add                     income1+income2
-             subtract                rhhinc-rearning
<             less than               age<65
>             greater than            age>64
<=            less than or equal      age<=65
>=            greater than or equal   age>=65
= or ==       equal                   age=23
!= or <>      not equal               income!=0
& or &&       and                     race=2 & looking=1
^             exclusive or            bit-wise--use with caution
| or ||       or                      age<18 | age>=65
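For readers more used to a general-purpose language, a few of these operators can be mirrored in Python. This is a rough analogy only: the person record is invented, in_range is a hypothetical helper, and treating a..b as including both endpoints is an assumption.

```python
def in_range(value, low, high):
    """Mimic X:a..b, assuming both endpoints are included."""
    return low <= value <= high

# One invented person record keyed by lower-case variable codes.
person = {"age": 30, "race": 2, "state": 26, "vps5": 1, "subsample": 40}

# vps5=1 & state=26  (and)
is_michigan_vietnam_vet = person["vps5"] == 1 and person["state"] == 26

# age<18 | age>=65  (or)
is_under_18_or_65_plus = person["age"] < 18 or person["age"] >= 65

# age:15..44  (range)
is_15_to_44 = in_range(person["age"], 15, 44)

# subsample%10  (modulo)
subsample_digit = person["subsample"] % 10

print(is_michigan_vietnam_vet, is_under_18_or_65_plus, is_15_to_44, subsample_digit)
```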

Page 30: Altering the Spreadsheet Tabulations

Once you have a spreadsheet, click on Options to create totals or percentages for tables or columns
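What column totals and column percentages mean can be shown with a tiny worked example. The counts below are invented and the code is only a sketch of the arithmetic, not of the Options dialog itself.

```python
# Invented counts: rows are single years of age, columns are sex.
counts = {
    52: {"male": 30, "female": 10},
    53: {"male": 45, "female": 15},
}

# Column totals: sum each column over all rows.
col_totals = {}
for row in counts.values():
    for col, n in row.items():
        col_totals[col] = col_totals.get(col, 0) + n

# Column percentages: each cell as a share of its column total.
col_pct = {
    age: {col: 100 * n / col_totals[col] for col, n in row.items()}
    for age, row in counts.items()
}

print(col_totals)
print(col_pct)
```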

Page 31: Adding More Parameters

Expand the table detail by repeating the row and column data for another parameter (e.g. race) as shown in Dimension 3

Page 32: Altering Spreadsheet Appearance

• The default shows separate tables for each of the values in the third dimension (e.g. separate spreadsheets for white and black)

• Change the Axis3 tab to FOREACH to put everything on the same spreadsheet

Page 33: Calculating Means or Averages

• Calculate averages by changing the query type to summary statistics (e.g. mean or average) at the top

• Fill in the new Describe Expression box at the bottom with a variable code (e.g. age, income)

Page 34: Complex Table

Mean income of white male Vietnam-era veterans in Michigan by age, whether or not they have earnings

You can respecify the query to include only veterans with earnings

Page 35: Altering Mean Income

Add & incws > 0 to the universe to count only Vietnam-era veterans who are earning more than $0
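The effect of adding incws > 0 is easy to check by hand. The sketch below uses four invented records and an unweighted mean, so it only illustrates why excluding zero earners raises the average.

```python
from statistics import mean

# Invented wage and salary incomes for four veterans; two have no earnings.
veterans = [
    {"incws": 0},
    {"incws": 40000},
    {"incws": 60000},
    {"incws": 0},
]

# Mean over the whole universe, zero earners included.
mean_all = mean(v["incws"] for v in veterans)

# Mean after narrowing the universe with incws > 0.
mean_earners = mean(v["incws"] for v in veterans if v["incws"] > 0)

print(mean_all, mean_earners)
```

Here the mean doubles once the two zero-income records are dropped, because the same total is divided among half as many people.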

Page 36: Complex Table

Mean income is higher when the data are limited to wage-earning veterans

Page 37: Small Area Geography

• Data from the PUMS 5% file is available for states, metropolitan areas, and Public Use Microdata Areas (PUMAs) of 100,000

• You can identify a PUMA or group of PUMAs using
– Maps in American Factfinder (http://factfinder.census.gov/)
– PDF maps on the Census Bureau web site (http://www.census.gov/geo/www/maps/puma5pct.htm)
– Mable/Geocorr Search Engine (http://mcdc2.missouri.edu/websas/geocorr2k.html)

Page 38: Small Area Geography

This map shows Detroit as PUMAs 3701-3708

Page 39: PUMA Codes for Michigan

Ann Arbor 3200
Detroit 3701-3708
Flint 2200
Grand Rapids 1300
Lansing 1800

PUMA to Place
http://www.lib.umich.edu/govdocs/census2/pumapl00.txt

Place to PUMA
http://www.lib.umich.edu/govdocs/census2/plpuma00.txt

Page 40: Codebook and PUMAs

The Explore Codebook shows PUMA5 as the term for 5% PUMA boundaries

Page 41: Small Area Geography and Ranges

When creating data sets for PUMAs, be sure to include the correct state in the universe (e.g. state=26), because PUMA codes are unique only within a state

Page 42: Small Area Geography and Ranges

Puma5: 3701..3708 will list the data for each individual area

Page 43: Small Area Geography and Ranges

Search result for each individual PUMA

Page 44: Small Area Geography for Ranges

To get the total for the area, list it in the universe as puma5 >3700 & puma5 <3709 & state=26
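The difference between listing each PUMA and totaling the whole area can be sketched as follows. The records are invented (including the out-of-state record that happens to share a PUMA code) and the counts are unweighted; the point is only to show the two shapes of result and why the state filter matters.

```python
from collections import Counter

# Invented records; the last one is in another state but reuses a code
# from the 3701-3708 range, which is why state=26 belongs in the universe.
records = [
    {"state": 26, "puma5": 3701},
    {"state": 26, "puma5": 3708},
    {"state": 26, "puma5": 3701},
    {"state": 26, "puma5": 3200},
    {"state": 39, "puma5": 3705},
]

# Like puma5:3701..3708 with state=26: one count per individual PUMA.
per_puma = Counter(
    r["puma5"] for r in records
    if r["state"] == 26 and 3701 <= r["puma5"] <= 3708
)

# Like puma5 >3700 & puma5 <3709 & state=26 in the universe: one total.
detroit_total = sum(
    1 for r in records
    if r["puma5"] > 3700 and r["puma5"] < 3709 and r["state"] == 26
)

print(dict(per_puma), detroit_total)
```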

Page 45: Small Area Geography for Ranges

To get a listing of single years of age between 65 and 85, list column as age: 65..85

Page 46: Calculating Totals

• To calculate the most spoken languages for 65-85 year olds as a group, click on Options/Total Options/Row

Page 47: Complex Result

Spanish and Polish are the two most popular languages spoken by seniors 65-85 in Detroit

Page 48: Access to PDQ

• Librarians may request free IDs, passwords, and software from PDQ

• Send e-mail to [email protected] stating that:
– You are a librarian who talked to Grace York
– You are requesting an ID and password for using PDQ Explore
– You want to download software for the PDQ Toolbox, Expert Edition

http://www.pdq.com

Page 49: Contacts for Research Assistance

Initial Queries

Grace York, Documents Center, 203 Hatcher, [email protected] or 936-2378

JoAnn Dionne, Numeric and Spatial Data Services, 825 Hatcher, [email protected], 763-9408

Complex Data Sets

Lisa Neidert, Population Studies Center, 426 Thompson, [email protected], 763-2163

PDQ Staff, 310 Depot Street, Suite C, Ann Arbor 48104, [email protected]