DLI Boot Camp 2011 Finding Statistics: Tools and Techniques

DLI Boot Camp 2011 Finding Statistics: Tools and Techniques Jean Blackburn Vancouver Island University Library [email protected] SDA

Upload: denise

Post on 05-Jan-2016

Page 1: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

DLI Boot Camp 2011

Finding Statistics: Tools and Techniques

Jean Blackburn

Vancouver Island University Library

[email protected]

SDA

Page 2: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

SDA@CHASS: Microdata Analysis & Subsetting

• The University of Toronto CHASS Data Centre runs a web-based statistical package called SDA (Survey Documentation & Analysis), developed at UC Berkeley.

• The SDA@CHASS service links SDA to microdata files from Statistics Canada surveys and more.

• IP-authenticated access to SDA@CHASS is available to DLI institutions for an annual fee.

Page 3: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Key Features

• Variable-level searching

• Extraction / subsetting capabilities

• Access to codebooks

• Web-based statistical analysis capabilities

• Ability to recode and compute new variables

Page 4: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

http://sda.chass.utoronto.ca

Page 5: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

There are both French and English language versions of SDA@CHASS, but the French language data catalogue is limited as of the time of this presentation…

Page 6: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The SDA@CHASS English language data catalogue contains not only DLI microdata sets, but also open-access data and those restricted to University of Toronto users.

Page 7: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Variable-level searching across SDA data sets

Page 8: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

You can choose a single data set or a cluster of data sets to search, and whether to search survey-level OR variable-level metadata (but not both).

Click the + to expand the data file clusters and select specific data sets to search

Page 9: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

You can also restrict the search to specific fields…

Page 10: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The View button shows you the codebook definition for the variable; “Study” titles link to the SDA interface...

Page 11: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

…the SDA interface (we’ll come back to it in more detail later).

Page 12: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

As well as searching, you can browse the SDA@CHASS data catalogues…

Page 13: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The French language data catalogue has 3 survey titles available; there are several data sets available for the Recensement de la population de Canada.

Page 14: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Click on a data set link to access the SDA interface.

Page 15: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

You can access the codebook for this data set from the menu bar…

Page 16: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The codebook opens in a separate browser window or tab…

Page 17: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Use the Variable Selection tool to browse variables in the data set by category.

Page 18: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Expand variable categories by clicking the +

Click on variables to select them

Page 19: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

You can view the codebook definitions for selected variables by clicking the View button beside the “Selected” field.

Page 20: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Codebook definition for “immstat” variable

Codebooks include numeric values and labels for responses, record layouts, and unweighted frequencies.

Page 21: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

You can also search variables by keyword within a data set

Page 22: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Follow the “Search Techniques Help” link for tips on wildcards and field searching. N.B. The SDA help files are English language only.

Click the variable button to select the variable for manipulation in SDA. For example, we can use SDA to get a weighted frequency distribution for the “immstat” variable…

Page 23: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

To run a weighted frequency for the “Statut d’immigrant” variable (“immstat”)…

1. Copy the selected variable “immstat” to the Row field in the Frequencies/Crosstab analysis tool, using the “Row” button…

2. Adjust table and display options as desired. SDA defaults to weighted cases. I have selected a pie chart type.

3. Click the “Run the Table” button.

Page 24: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The weighted frequency distribution table and chart will open in a new browser window or tab…
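For anyone curious what a weighted frequency actually computes, here is a minimal Python sketch. It is not SDA's own code. The records, the “weight” field name and the weight values are all made up for illustration; only the variable name “immstat” comes from the data set shown in these slides.

```python
from collections import defaultdict

# Hypothetical microdata: each respondent carries a category code and a
# survey weight (roughly, how many people in the population they represent).
records = [
    {"immstat": 1, "weight": 120.0},
    {"immstat": 2, "weight": 80.0},
    {"immstat": 1, "weight": 200.0},
    {"immstat": 3, "weight": 100.0},
]


def weighted_frequency(rows, var, weight="weight"):
    """Return {value: (weighted count, percent of the weighted total)}."""
    totals = defaultdict(float)
    for row in rows:
        # Add the respondent's weight rather than counting them as 1.
        totals[row[var]] += row[weight]
    grand_total = sum(totals.values())
    return {
        value: (count, 100.0 * count / grand_total)
        for value, count in sorted(totals.items())
    }


freq = weighted_frequency(records, "immstat")
for value, (count, pct) in freq.items():
    print(f"immstat={value}: weighted N={count:.0f} ({pct:.1f}%)")
```

An unweighted frequency would simply count rows; the weighted version sums the weights, which is why the two can give different percentages.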

Page 25: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Charts can be saved as image files to insert into Word documents, etc.

Page 26: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

What if we want to see immigration status…

• for a particular province?

• for people of a particular age?

• for men and women separately?

SDA allows you to apply filter and control variables to your analysis.

Page 27: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

To run a weighted frequency for the “Statut d’immigrant” variable (“immstat”), for British Columbia only, we’ll use the Province variable (“pr”) as a filter…

Select the “pr” variable and click the View button to see the codebook definition…

(Note that the “immstat” variable remains in the Frequencies/Crosstab program “Row” field.)

Page 28: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The codebook definition for the Province variable (“pr”) tells us that the value for British Columbia is 59…

Page 29: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Click the Filter button to move the “pr” variable to the Selection Filter(s) field in the SDA Frequencies/Crosstab Program

Enter the value for British Columbia, 59, within the Selection Filter parentheses, and run the table…
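Conceptually, a selection filter just drops every case that does not match before the table is computed. A rough Python sketch of that idea follows. It is an illustration, not SDA's implementation. The value 59 for British Columbia is from the codebook above; the records, the weights and the second province code (35) are invented.

```python
# Hypothetical records: immigrant status, province code and survey weight.
records = [
    {"immstat": 1, "pr": 59, "weight": 150.0},
    {"immstat": 2, "pr": 59, "weight": 50.0},
    {"immstat": 1, "pr": 35, "weight": 100.0},
    {"immstat": 2, "pr": 35, "weight": 300.0},
]


def weighted_percentages(rows, var):
    """Return {value: percent of the total weight} for one variable."""
    totals = {}
    for row in rows:
        totals[row[var]] = totals.get(row[var], 0.0) + row["weight"]
    grand_total = sum(totals.values())
    return {value: 100.0 * total / grand_total for value, total in totals.items()}


# The filter: keep only cases where pr equals 59 (British Columbia).
bc_only = [row for row in records if row["pr"] == 59]

all_pct = weighted_percentages(records, "immstat")
bc_pct = weighted_percentages(bc_only, "immstat")
print("All cases:", all_pct)
print("pr = 59 only:", bc_pct)
```

Because the filter changes which cases are in the table, the percentages are recalculated against the filtered total, not the national one.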

Page 30: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Filtering for British Columbia results in a different distribution of immigrant status.

Let’s try filtering further – for children under 15 years…

Page 31: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Select the Age Group variable (“agegrp”) and click the View button to see the codebook definition…

In the codebook, values 1 to 5 represent children under 15 years…

Page 32: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Because we want to filter for both Province and Age Group, it’s important to select “Append” rather than “Replace” before clicking the Filter button to move the variable to the Selection Filter(s) field.

For the agegrp filter, we need to enter a range of values representing children under 15 years (1-5).
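To make the logic of combined filters concrete, here is a small Python sketch. It assumes the Selection Filter(s) field ends up holding text along the lines of pr(59) agegrp(1-5), and it handles only that simplified form (a single value or a low-high range per variable); SDA's real filter syntax is richer, so treat the parser as hypothetical. The sample cases are invented.

```python
import re


def parse_filters(text):
    """Parse 'name(value)' or 'name(low-high)' terms into {name: (low, high)}."""
    filters = {}
    for name, low, high in re.findall(r"(\w+)\((\d+)(?:-(\d+))?\)", text):
        # A single value is treated as a range whose ends are equal.
        filters[name] = (int(low), int(high) if high else int(low))
    return filters


def matches(row, filters):
    """A case is kept only if it satisfies every filter (logical AND)."""
    return all(low <= row[name] <= high for name, (low, high) in filters.items())


filters = parse_filters("pr(59) agegrp(1-5)")

rows = [
    {"pr": 59, "agegrp": 3},  # BC, under 15: kept
    {"pr": 59, "agegrp": 9},  # BC, older age group: dropped
    {"pr": 35, "agegrp": 2},  # another province: dropped
]
kept = [row for row in rows if matches(row, filters)]
print(filters)
print(kept)
```

This is why “Append” matters: replacing the filter would leave only the age condition, while appending keeps both conditions in force at once.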

Page 33: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Here’s the distribution for immigrant status, filtered for province and age.

Now let’s remove the age group filter, and instead compare the immigrant status distribution in British Columbia for men and women…

Page 34: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

To see distributions for all values of a particular variable (e.g. sex), select the variable and use the Ctrl button to move it to the Control field in the SDA Frequencies/Crosstab program, and run the table…

Page 35: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Controlling for sex, we get three different frequency distributions: one for women, one for men and one for all valid cases.

There is not much variation between these distributions!
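A control variable can be thought of as running the same table once per value of that variable, plus once for everyone. The Python sketch below illustrates the idea under assumptions: the records, the weights and the sex codes (1 and 2) are invented, and this is not how SDA is implemented internally.

```python
# Hypothetical records: immigrant status, sex code and survey weight.
records = [
    {"immstat": 1, "sex": 1, "weight": 90.0},
    {"immstat": 2, "sex": 1, "weight": 30.0},
    {"immstat": 1, "sex": 2, "weight": 60.0},
    {"immstat": 2, "sex": 2, "weight": 20.0},
]


def weighted_percentages(rows, var):
    """Return {value: percent of the total weight} for one variable."""
    totals = {}
    for row in rows:
        totals[row[var]] = totals.get(row[var], 0.0) + row["weight"]
    grand_total = sum(totals.values())
    return {value: 100.0 * total / grand_total for value, total in totals.items()}


def controlled_tables(rows, row_var, control_var):
    """One distribution per value of the control variable, plus one for all cases."""
    tables = {}
    for value in sorted({row[control_var] for row in rows}):
        subset = [row for row in rows if row[control_var] == value]
        tables[value] = weighted_percentages(subset, row_var)
    tables["all"] = weighted_percentages(rows, row_var)
    return tables


tables = controlled_tables(records, "immstat", "sex")
for label, distribution in tables.items():
    print(label, distribution)
```

With two sex codes we get three tables, and in this toy data the three distributions are identical, mirroring the "not much variation" result above.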

Page 36: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Crosstabs in SDA

To look for a relationship between two variables, you can run a crosstabulation in SDA: copy the dependent variable to the Row field and the independent variable to the Column field, then run the table.

Let’s use the 2006 Aboriginal Peoples Survey: Adults to see whether living in urban or rural environments affects ability to speak an Aboriginal language…

Page 37: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Ability to speak an Aboriginal language (“bg01”) is the dependent variable and gets copied to the Row field. Geography (“geo”) is the independent variable and gets copied to the Column field. I’ve chosen the Stacked Bar Chart chart type.

Page 38: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

The crosstabulation table and chart show that Aboriginal peoples living in Census Metropolitan Areas are less likely to speak an Aboriginal language than those living in rural areas.

The stacked bar chart effectively portrays the dramatic difference between the Arctic and the other Geography values.
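Reading a crosstab like this one means comparing percentages within each column (each value of the independent variable). Here is a minimal Python sketch of a weighted crosstab with column percentages. The category labels and weights are invented purely to show the arithmetic; they are not results from the Aboriginal Peoples Survey, and column percentaging is an assumption about how the table is set up.

```python
from collections import defaultdict

# Hypothetical records: row variable bg01, column variable geo, survey weight.
records = [
    {"bg01": "yes", "geo": "urban", "weight": 20.0},
    {"bg01": "no", "geo": "urban", "weight": 80.0},
    {"bg01": "yes", "geo": "rural", "weight": 30.0},
    {"bg01": "no", "geo": "rural", "weight": 20.0},
]


def crosstab_column_percent(rows, row_var, col_var):
    """Return {(row value, column value): percent of that column's weight}."""
    cells = defaultdict(float)
    column_totals = defaultdict(float)
    for row in rows:
        cells[(row[row_var], row[col_var])] += row["weight"]
        column_totals[row[col_var]] += row["weight"]
    return {
        key: 100.0 * weight / column_totals[key[1]]
        for key, weight in cells.items()
    }


table = crosstab_column_percent(records, "bg01", "geo")
for (row_value, col_value), pct in sorted(table.items()):
    print(f"geo={col_value:5s} bg01={row_value:3s} {pct:.1f}%")
```

Each column adds up to 100%, so differences between columns in the same row are what suggest a relationship between the two variables.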

Page 39: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Other analysis methods are available in SDA, under the Analysis menu

Page 40: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

In SDA, you can download custom data sets with the Download Customized Subset command

Page 41: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Specify the data and syntax (e.g. SPSS) file formats you want…

Add filtering criteria for cases to include, if desired…

Select All, Some or None of the variable categories to be included in your custom data set. If you select Some for any category, you’ll be able to select specific variables on the next screen, after clicking the Continue button…

Page 42: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Hold down the Shift key to select contiguous variables, or the Ctrl key to select non-contiguous variables, from the lists provided, and click Continue…

Check over your selections and click the “Create the Files” button…

Page 43: DLI Boot Camp 2011 Finding Statistics:  Tools and Techniques

Right-click the links to the files to download them to your local computer or network!
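In effect, a customized subset is the original file with some cases filtered out and only some variables kept. The Python sketch below mimics that idea on a handful of invented records and writes the result as CSV text. It is only an analogy for what the download produces: the function name, the records and the choice of CSV are assumptions, not SDA's actual output formats.

```python
import csv
import io

# Hypothetical full data set with four variables.
records = [
    {"immstat": 1, "pr": 59, "agegrp": 3, "sex": 1},
    {"immstat": 2, "pr": 35, "agegrp": 7, "sex": 2},
    {"immstat": 1, "pr": 59, "agegrp": 10, "sex": 2},
]


def make_subset(rows, variables, keep_case):
    """Keep only the cases that pass the filter and only the chosen variables."""
    return [{var: row[var] for var in variables} for row in rows if keep_case(row)]


# Keep British Columbia cases (pr = 59) and just two of the variables.
chosen = ["immstat", "sex"]
subset = make_subset(records, chosen, lambda row: row["pr"] == 59)

# Write the subset to an in-memory CSV; a real script would open a file instead.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=chosen, lineterminator="\n")
writer.writeheader()
writer.writerows(subset)
csv_text = buffer.getvalue()
print(csv_text)
```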