SPSS 202: Data Management by SPSS (Workshop)
Dr. Daisy Dai
Department of Medical Research
1
Log in to SPSS
• CMH offers a server version of SPSS 18. Any employee can log in to SPSS with an employee account.
• Go to https://remoteaccess.cmh.edu
• Enter your CMH user name and password.
• Click the SPSS 18 icon.
2
SPSS Data Entry
• SPSS data can be entered manually.
  – The format is ready for analysis.
• Data from SAS, Excel, text files, etc. can be easily imported into SPSS.
• SPSS data files are saved as “SPSS data document (.sav)”.
• SPSS output files are saved as “SPSS viewer document (.spv)”.
3
SPSS Data Entry
• SPSS has a few unique features in data entry.
  – Categorical variables need to be coded. For instance, code male as 1 and female as 0, or vice versa.
  – When you have two treatments, test and control, please use 1 for test and 0 for control.
  – Categorical variables that are not coded in data files from other sources will not be imported or analyzed properly in SPSS.
  – Continuous variables don’t need coding.
  – Missing values need to be defined on the “Variable View” page.
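The coding convention above can be sketched outside SPSS as well. The following is a minimal Python illustration, not SPSS syntax; the records and the field names (`id`, `sex`, `arm`) are hypothetical and are not taken from the workshop data.

```python
# Hypothetical raw records with text categories (illustrative values only).
raw_rows = [
    {"id": 1, "sex": "male", "arm": "test"},
    {"id": 2, "sex": "Female", "arm": "control"},
    {"id": 3, "sex": "female", "arm": "test"},
]

# Code books following the slide's convention: male = 1, female = 0,
# test = 1, control = 0.
SEX_CODES = {"male": 1, "female": 0}
ARM_CODES = {"test": 1, "control": 0}


def code_row(row):
    """Return a copy of the row with text categories replaced by numeric codes."""
    coded = dict(row)
    coded["sex"] = SEX_CODES[row["sex"].strip().lower()]
    coded["arm"] = ARM_CODES[row["arm"].strip().lower()]
    return coded


coded_rows = [code_row(r) for r in raw_rows]
```

Keeping the code books in one place makes it easy to document which number stands for which category before the file is imported.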
4
Example: CDC Survey Data
• An allergy survey of children more than 1 year old was conducted in 2005 and 2006.
• Two data sets, allergy questionnaire and demographic information, are saved in SAS export format.
5
CDC Survey Data
6
Tasks
• Import these two SAS data files (demo_d.xpt, agq_d.xpt) into SPSS and save them as SPSS data files.
• Sort each data set by study ID.
• Merge allergy variables and demographic variables.
• Save the new data set as an SPSS data file.
7
Task 1: Import Data
• We need to import two data sets into SPSS.
  – Allergy questionnaire: agq_d.xpt (xpt is the SAS XPORT transport file format)
  – Demographic information: demo_d.xpt
• Please note that SPSS runs on a server, so data must be saved on a shared drive such as the U drive or W drive. You will not be able to find the files in SPSS if you save them on your local disk.
8
Task 1: Import demo_d.xpt
• Click “File”, “Open”, “Data”.
• Select the folder where demo_d.xpt is saved.
• Choose “SAS (…)” for Files of Type.
• Select demo_d.xpt.
• Click “Open”.
9
Task 1: Import Data
• Select the folder.
• Choose the agq_d file.
• Select the xpt format.
• Click “Open”.
• Note: SPSS is compatible with other commonly used statistical and data management software packages. Excel, SAS, and Access files can all be converted to SPSS.
10
Since this is not an SPSS data file, no file name is shown in the upper left corner (it reads “Untitled”).
11
Save demo_d as SPSS data.
• Click “File”, “Save As”.
• File name: demo_d
• Save as type: SPSS 7.0 (*.sav).
• Click “Save”.
12
Missing Data
• In the data set, missing values appear as “.”, which SPSS automatically treats as missing.
• If missing data are entered as blanks, click “Missing” on the “Variable View” page, choose “Discrete missing values”, and enter a space.
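To make the idea of declared missing values concrete, here is a small Python sketch (not SPSS syntax). It assumes raw text cells in which “.” or a blank marks a missing value; the sample values are hypothetical.

```python
# Tokens that should be read as "missing" rather than as data.
# "." mirrors the SAS/SPSS system-missing marker; "" covers blank cells.
MISSING_TOKENS = {".", ""}


def parse_value(text):
    """Convert one raw text cell to float, or None if it is a missing token."""
    cleaned = text.strip()
    if cleaned in MISSING_TOKENS:
        return None
    return float(cleaned)


raw_ages = ["12", ".", " ", "7.5"]  # hypothetical cells
ages = [parse_value(a) for a in raw_ages]

# Statistics should use observed values only, as SPSS does for missing cells.
observed = [a for a in ages if a is not None]
mean_age = sum(observed) / len(observed)
```

If blanks were not declared as missing, they would either fail to parse or be counted as data, which is the problem the “Discrete missing values” setting avoids.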
13
Task 1: Import agq_d.xpt and save it as agq_d.sav (Exercise)
14
Task 2: Sort Data
• Variable to sort by: SEQN, that is, Respondent sequence number.
15
Task 2: Sort agq_d.sav Data
• Select the agq_d.sav data.
• Go to “Data” and select “Sort Cases”.
• On the Sort Cases page, select the variable, Respondent sequence number.
• Click the right arrow.
• Choose Ascending or Descending.
• Click OK.
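What Sort Cases does to the rows can be sketched in a few lines of Python. This is an illustration of the logic only; the SEQN values and the `answer` field are hypothetical.

```python
# Hypothetical cases; only the SEQN key comes from the workshop data.
rows = [
    {"SEQN": 31130, "answer": 2},
    {"SEQN": 31127, "answer": 1},
    {"SEQN": 31129, "answer": 1},
]

# Ascending order: smallest respondent sequence number first.
ascending = sorted(rows, key=lambda r: r["SEQN"])

# Descending order: largest respondent sequence number first.
descending = sorted(rows, key=lambda r: r["SEQN"], reverse=True)

ascending_ids = [r["SEQN"] for r in ascending]
descending_ids = [r["SEQN"] for r in descending]
```

Each row moves as a unit, so the values within a case stay together while the order of the cases changes.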
16
Practice
• Now let’s repeat this process by doing the following:
  – Open the demographic data, demo_d.xpt.
  – Sort the data by the variable Respondent Sequence Number.
17
Sort Variables
• “Sort Variables” is different from “Sort Cases”.
• “Sort Variables” rearranges the columns of data.
• “Sort Cases” rearranges the rows of data.
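The difference between the two functions can be shown on a tiny hypothetical table in Python. Sorting the columns alphabetically by name is an assumption made for this sketch; it is only one possible ordering of variables.

```python
# Hypothetical two-case table; variable names other than SEQN are illustrative.
rows = [
    {"SEQN": 3, "gender": 1, "age": 10},
    {"SEQN": 1, "gender": 0, "age": 8},
]

# "Sort Cases": the rows are reordered, the column order is untouched.
by_case = sorted(rows, key=lambda r: r["SEQN"])

# "Sort Variables": the columns are reordered (here alphabetically by name),
# the row order is untouched.
column_order = sorted(rows[0].keys(), key=str.lower)
by_variable = [{name: r[name] for name in column_order} for r in rows]
```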
18
Task 3: Merge Two Data Sets
• Two data sets need to be linked by key variables.
• In our case, the key variable is SEQN (Respondent Sequence Number).
• Make sure the key variable has the same name and variable type in both data sets.
• Both data sets need to be sorted by the key variable.
19
Task 3: Merge Two Data Sets
• In the demo_d.sav data set, go to Data -> Merge Files -> Add Variables.
20
Task 3: Merge Two Data Sets
• Choose “An open data set”.
• Click “agq_d.sav” and “Continue”.
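Linking two data sets by a key variable can be sketched in Python as follows. This is a simplified illustration, not what SPSS runs internally: the records and the `hay_fever` variable are hypothetical, and the sketch assumes the two tables share no variable names other than the key.

```python
# Hypothetical demographic and allergy records linked by SEQN.
demo = [
    {"SEQN": 1, "gender": 1, "age": 10},
    {"SEQN": 2, "gender": 0, "age": 8},
    {"SEQN": 3, "gender": 0, "age": 12},
]
allergy = [
    {"SEQN": 3, "hay_fever": 1},
    {"SEQN": 1, "hay_fever": 0},
]


def add_variables(left, right, key):
    """Combine the variables of two tables, matching rows on the key.

    Cases found in only one table are kept; their other variables are None.
    Assumes the key is the only variable name the two tables share.
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    left_vars = [name for name in left[0] if name != key]
    right_vars = [name for name in right[0] if name != key]

    merged = []
    for k in sorted(set(left_by_key) | set(right_by_key)):
        row = {key: k}
        for name in left_vars:
            row[name] = left_by_key.get(k, {}).get(name)
        for name in right_vars:
            row[name] = right_by_key.get(k, {}).get(name)
        merged.append(row)
    return merged


merged = add_variables(demo, allergy, "SEQN")
```

The dictionary lookup here finds matching keys regardless of row order, which is why this sketch does not need the pre-sorting step that the workshop asks for in SPSS.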
21
Merge two files
22
Split Files
• Very often, we want to perform separate analyses by group (groups are also called strata).
• We can do so in SPSS by splitting the file into groups.
• Task: In demo_d.sav, split the file by gender, then get the mean age for each gender group.
23
Split File
• Under demo_d.sav, click “Data”, “Split File”.
• Choose “Organize output by groups”
• Select “Gender”.
• Choose “Sort the file by grouping variables”.
• Click “OK”.
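The split-then-summarize task can be sketched in Python to show what the grouped output represents. The records are hypothetical, and the gender codes (1 = male, 0 = female) follow the coding example used earlier in the workshop rather than the actual survey codes.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases; gender is coded 1 = male, 0 = female for illustration.
rows = [
    {"gender": 1, "age": 10},
    {"gender": 0, "age": 8},
    {"gender": 1, "age": 14},
    {"gender": 0, "age": 11},
]

# Sort by the grouping variable, then collect the ages of each group.
ages_by_gender = defaultdict(list)
for row in sorted(rows, key=lambda r: r["gender"]):
    ages_by_gender[row["gender"]].append(row["age"])

# One mean per group, analogous to output organized by groups.
mean_age_by_gender = {g: mean(ages) for g, ages in ages_by_gender.items()}
```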
24
25
Split Files
• Cancel the split file function when the task is done.
• Go to “Data”, “Split File”.
• Check “Analyze all cases, do not create groups”. This is the default.
26
Compute Variables
• Can be used to create new variables.
• Task 1: Create a new variable, age (month), from the existing variable age (year).
• Task 2: Log-transform the age variable.
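The two computations can be written out in Python to make the arithmetic explicit. This is a sketch with hypothetical values and variable names (`age_year`, `age_month`, `log_age`); it assumes the natural logarithm, since the slides do not say which base is intended.

```python
import math

# Hypothetical cases; None stands for a missing age.
rows = [
    {"age_year": 2},
    {"age_year": 10},
    {"age_year": None},
]

for row in rows:
    age = row["age_year"]
    # Task 1: age in months is age in years times 12.
    row["age_month"] = age * 12 if age is not None else None
    # Task 2: natural log of age; undefined for missing or non-positive values.
    row["log_age"] = math.log(age) if age is not None and age > 0 else None
```

Missing inputs yield missing results, so a case with no recorded age gets no value for either new variable.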
27
Compute Variables
Under demo_d.xpt, go to “Transform”, “Compute Variable”.
28
Log transform variables
29