17a. Accessing Data: Manipulating Variables in SPSS®
Prerequisites
• Recommended modules to complete before viewing this module
  • 1. Introduction to the NLTS2 Training Modules
  • 2. NLTS2 Study Overview
  • 3. NLTS2 Study Design and Sampling
  • NLTS2 Data Sources, either
    • 4. Parent and Youth Surveys or
    • 5. School Surveys, Student Assessments, and Transcripts
  • NLTS2 Documentation
    • 10. Overview
    • 11. Data Dictionaries
    • 12. Quick References
Prerequisites
• Recommended modules to complete before viewing this module (cont’d)
  • 13. Analysis Example: Descriptive/Comparative Using Longitudinal Data
  • Accessing Data
    • 14a. Files in SPSS
    • 15a. Frequencies in SPSS
Overview
• Purpose
• Modifying existing variables
• Creating new variables
• Summary
• Closing
• Important information
NLTS2 restricted-use data
• NLTS2 data are restricted.
• Data used in these presentations are from a randomly selected subset of the restricted-use NLTS2 data.
• Results in these presentations cannot be replicated with the NLTS2 data licensed by NCES.
Purpose
• Learn to
  • Modify an existing variable
  • Create a new variable
  • Join/combine data from different sources
Modifying existing variables
• How to modify a variable.
• It is necessary to create a new variable in SPSS to
  • Collapse categories
  • Break a continuous variable into categories
  • Recode a variable.
• Note about created variables in the NLTS2 database
  • Our analyses were done in SAS, and this recoding step is usually not necessary in SAS because of the external formats feature.
  • Collapsed or recategorized variables do not necessarily exist in SAS or SPSS files even if these items appear in published tables.
  • There are many created variables in the NLTS2 database, but most of them are not simply collapsed versions of an existing variable.
These results cannot be replicated with the full dataset; all output in these modules was generated with a random subset of the full data.
Modifying existing variables
• Syntax to recode into collapsed categories
    RECODE np1B2a
      (MISSING=SYSMIS) (Lowest thru 1=1) (2 thru 5=2) (6 thru 10=3) (11 thru Highest=4)
      INTO np1B2a_Cat.
Modifying existing variables
• Syntax to assign a variable label to the new variable
    * assign variable label to new categorical variable.
    VARIABLE LABELS np1B2a_Cat
      '(np1B2a_cat) Age of youth when diagnosed categorized'.
    EXECUTE.
• Syntax to assign value labels
    * assign value labels to new categorical variable.
    VALUE LABELS np1B2a_Cat
      1 "(1) 1 or younger"
      2 "(2) 2 to 5 years of age"
      3 "(3) 6 to 10 years of age"
      4 "(4) 11 or older".
Modifying existing variables
• Menu
  • Transform: Recode into Different Variables
  • Select the variable to be recoded from the list and click the right-facing arrow.
  • Give the new variable a name in the box under “Output Variable.”
  • Assign a label to the new variable in the “Label” box under “Output Variable.”
  • Click “Change.”
  • Click on the box marked “Old and New Values,” and a new box pops up.
  • In the new box, under “Old Values” click the radio button “System or User-missing,” click “System Missing” under “New Values,” and click “Add” next to “Old --> New.”
Modifying existing variables
• Menu (cont’d)
  • For each old to new value(s)
    • Under “Old Values,” click a radio button by an actual value or range of values box.
    • Designate what the old values are, either actual or range of values, in the appropriate box.
    • Assign a new code under “New Values” and click “Add.”
  • When finished with values, click “Continue” to return to the first box.
  • In the original box, click “OK” or “Paste” to generate code.
Modifying existing variables
• Look at results.
• New variable should appear at bottom of “Variable View.”
• Specify formats so values are meaningful.
  • In Variable View, click on the cell in the “Values” column to bring up a new box.
  • Enter a value in the “Value” box, a label for that value in the “Label” box, and click “Add.”
  • Do this for every value.
• Look at frequency distribution.
  • Useful to look at a crosstab of the original by the new variable.
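The checks above can also be run from syntax. This is a minimal sketch, assuming the recode of np1B2a into np1B2a_Cat has already been run in the active file; the choice of subcommands is illustrative, not prescribed by the module.

```spss
* Frequency distribution of the new categorical variable.
FREQUENCIES VARIABLES=np1B2a_Cat.

* Crosstab of the original by the new variable to confirm that.
* each original value landed in the intended category.
CROSSTABS /TABLES=np1B2a BY np1B2a_Cat
  /CELLS=COUNT.
```

Each row of the crosstab (an original value) should have counts in exactly one column (the collapsed category).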
Modifying existing variables: Example
• Modifying a variable
  • Open Wave 3 parent/youth interview file.
  • Collapse np3NbrProbs into new variable.
    • 0-1
    • 2
    • 3
    • 4-6
  • Remember to
    • Label variable
    • Add value formats
    • Account for missing values
    • Paste your code.
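One possible solution to this exercise is sketched below. The new variable name np3NbrProbs_Cat and the label wording are hypothetical (they do not come from the NLTS2 documentation), and the sketch assumes the Wave 3 parent/youth interview file is already the active file.

```spss
* Hypothetical new variable name: np3NbrProbs_Cat.
* Missing values are passed through as system-missing.
RECODE np3NbrProbs
  (MISSING=SYSMIS) (0 thru 1=1) (2=2) (3=3) (4 thru 6=4)
  INTO np3NbrProbs_Cat.

* Label wording below is illustrative only.
VARIABLE LABELS np3NbrProbs_Cat
  '(np3NbrProbs_Cat) np3NbrProbs collapsed into four categories'.
VALUE LABELS np3NbrProbs_Cat
  1 "(1) 0 to 1"
  2 "(2) 2"
  3 "(3) 3"
  4 "(4) 4 to 6".
EXECUTE.

* Check the result against the original variable.
FREQUENCIES VARIABLES=np3NbrProbs_Cat.
CROSSTABS /TABLES=np3NbrProbs BY np3NbrProbs_Cat.
```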
Creating new variables
• How to create a new variable.
• The values in the new variable can be the results of calculations, assignments, or logic.
• A new variable can be created from an existing variable or from multiple variables, including variables from other sources and/or waves.
  • Variables from other sources/waves must be added to the active data file before the new variable is created.
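Adding a variable from another wave to the active file might look like the sketch below. The file names and the key variable ID are assumptions made for illustration; substitute the actual file names and case identifier from your licensed data. np1D7h is the Wave 1 variable used in the suspension/expulsion example in this module.

```spss
* Hypothetical file names and key variable (ID).
* Step 1: keep only the key and the needed variable from Wave 1.
GET FILE='wave1_parent.sav' /KEEP=ID np1D7h.
SORT CASES BY ID.
SAVE OUTFILE='wave1_subset.sav'.

* Step 2: open the file that will be the active file and sort it.
GET FILE='wave4_parent.sav'.
SORT CASES BY ID.

* Step 3: add the Wave 1 variable to each matching case.
MATCH FILES /FILE=* /TABLE='wave1_subset.sav' /BY ID.
EXECUTE.
```

With /TABLE, every case in the active file is kept, and cases with no match in the Wave 1 subset receive a system-missing value for np1D7h. Both files must be sorted by the key variable before matching.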
Creating new variables
• Be aware of any coding differences between the variables when combining values.
• Decide what to do with missing values.
• Example: Create a variable using parent interview data from Waves 1, 2, and 3.
  • Has a student been suspended and/or expelled in any wave?
Creating new variables
• Syntax
    IF (np1D7h=0 and np2D5d=0 and np3D5d=0 and np4D5d=0) np4D5d_ever = 0.
    IF (np1D7h=1 or np2D5d=1 or np3D5d=1 or np4D5d=1) np4D5d_ever = 1.
    IF (np1D7h=1 and np2D5d=1 and np3D5d=1 and np4D5d=1) np4D5d_ever = 2.
    IF (MISSING(np1D7h) or MISSING(np2D5d) or MISSING(np3D5d) or MISSING(np4D5d)) np4D5d_ever = -999.
    EXECUTE.
• This code will result in a variable that
  • Requires a value for every wave (otherwise it is set to -999)
  • Is 0 if never suspended/expelled
  • Is 1 if suspended/expelled in any wave
  • Is 2 if suspended/expelled in every wave (the syntax tests all four variables, np1D7h through np4D5d).
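The syntax above assigns -999 but does not tell SPSS that -999 means missing, so procedures would treat it as a valid value. A minimal follow-up sketch, assuming -999 is meant as a missing code; the label wording is illustrative and not taken from the NLTS2 documentation.

```spss
* Assumption: -999 is intended as a user-missing code.
MISSING VALUES np4D5d_ever (-999).

* Label wording below is illustrative only.
VARIABLE LABELS np4D5d_ever
  '(np4D5d_ever) Ever suspended or expelled'.
VALUE LABELS np4D5d_ever
  0 "(0) Never"
  1 "(1) In at least one wave"
  2 "(2) In every wave"
  -999 "(-999) Missing in at least one wave".

* Confirm that -999 is reported under missing values.
FREQUENCIES VARIABLES=np4D5d_ever.
```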
Creating new variables
• Menu
  • Transform: Compute
  • Enter a variable name under “Target Variable.”
  • Click “Type & Label” and assign a label.
  • If applicable, find and select the source variable(s) and click the right-facing arrow to move the variable name into the “Numeric Expression” box.
  • Enter functions/operations from the keypad boxes or select from the list of functions.
  • For logical conditions, click “If…” and build the condition in the pop-up box.
  • Click “OK” or “Paste.”
  • For multiple conditions (i.e., if-then-else), repeat all steps.
    • Specify conditions in the order in which they should override one another.
    • Each subsequent condition, if true, will override the value assigned by a previous condition.
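The effect of condition order can be seen in a small sketch. The variable names flag_new, x1, and x2 are hypothetical and stand in for any target and source variables.

```spss
* Hypothetical variables: flag_new, x1, x2.
* Start every case at 0, then let later conditions override.
COMPUTE flag_new = 0.
IF (x1 = 1) flag_new = 1.
IF (x2 = 1) flag_new = 2.
EXECUTE.
```

A case with both x1 = 1 and x2 = 1 ends up with flag_new = 2, because the last true condition wins. Swapping the two IF statements would leave that case at 1 instead. Note that the initial COMPUTE assigns 0 to every case, including cases where x1 or x2 is missing, so missing values need their own condition if they should be treated differently.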
Creating new variables: Example
• Creating a new variable
  • Open the Wave 4 parent/youth interview file.
  • Bring in np1F7 from Wave 1, np2P8_J4 from Wave 2, and np3P8_J4 from Wave 3 interview files.
  • Create a new variable np4P8_J4_ever (ever done volunteer or community service).
  • Initialize value to “0” if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is “0.”
  • Reassign to “1” if any value in np1F7, np2P8_J4, np3P8_J4, or np4P8_J4 is “1.”
Creating new variables: Example
• Creating a new variable (cont’d)
  • Assign variable label and value labels.
  • Run a frequency of np4P8_J4_ever.
  • Run a crosstabulation of np4P8_J4_ever by np4P8_J4.
Creating new variables: Example
• Code for example
    IF (np1F7=0 or np2P8_J4=0 or np3P8_J4=0 or np4P8_J4=0) np4P8_J4_ever = 0.
    IF (np1F7=1 or np2P8_J4=1 or np3P8_J4=1 or np4P8_J4=1) np4P8_J4_ever = 1.
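The remaining steps of the example (labels, frequency, crosstabulation) could be finished along these lines. This is a sketch only: the label wording and the reading of 0 as "No" and 1 as "Yes" are assumptions, and it presumes the two IF statements above have been run with all four source variables in the active file.

```spss
* Label wording and the No/Yes meaning of 0/1 are assumptions.
VARIABLE LABELS np4P8_J4_ever
  '(np4P8_J4_ever) Ever done volunteer or community service'.
VALUE LABELS np4P8_J4_ever
  0 "(0) No"
  1 "(1) Yes".
EXECUTE.

* Frequency of the new variable.
FREQUENCIES VARIABLES=np4P8_J4_ever.

* Crosstabulation of the new variable by the Wave 4 item.
CROSSTABS /TABLES=np4P8_J4_ever BY np4P8_J4.
```

Cases with no valid 0 or 1 in any of the four source variables satisfy neither IF condition and stay system-missing in np4P8_J4_ever.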
Summary
• Be aware of differences in coding between similar variables when building composite variables.
• Missing values must be considered.
  • Know how missing values are being coded, particularly when using more than one variable to create another.
  • Joined data are more likely to have missing values.
• Weights
  • Generally, the analysis weight should be the weight from the smallest sample when combining data.
  • When filling in values for a variable in an active file with values from another, it is OK to use the weight in the active file.
• Strongly recommended: Paste your code when creating variables.
Summary
Know the values, mind the missing, and watch your weights!
Closing
• Topics discussed in this module
  • Modifying existing variables
  • Creating new variables
  • Summary
• Next module
  • 18a. Complex Samples Procedures in SPSS
Important information
• NLTS2 website contains reports, data tables, and other project-related information: http://nlts2.org/
• Information about obtaining the NLTS2 database and documentation can be found on the NCES website: http://nces.ed.gov/statprog/rudman/
• General information about restricted data licenses can be found on the NCES website: http://nces.ed.gov/statprog/instruct.asp
• E-mail address: [email protected]