data analysis using spss


Upload: muhammad-ibrahim

Post on 15-Jul-2015


TRANSCRIPT

Page 1: Data analysis using spss
Page 2: Data analysis using spss

DATA ANALYSIS USING SPSS

Muhammad Ibrahim

Associate Professor of Statistics

Govt. MAO College Lahore

0300-4668681


Page 3: Data analysis using spss

LEARNING OBJECTIVES

1.  Understand basic concepts of biostatistics and computer software SPSS.

2.  Select appropriate statistical tests for particular types of data.

3.  Recognize and interpret the output from statistical analyses.

4.  Report statistical output in a concise and appropriate manner.

Page 4: Data analysis using spss

BASIC TERMINOLOGY

Statistics, Biostatistics, Variable, Measurement

Scale, Data, Medical Data, type of data, Data

Analysis

Page 5: Data analysis using spss

VARIABLE, SCALE, DATA

A variable is a characteristic that varies, and a scale is a device on which observations are taken. Data are a set of observations or measurements of a specific variable, taken from an experiment, a survey, or an external source using an appropriate measurement scale.

Page 6: Data analysis using spss

Statistics and Bio-statistics

Statistics is generally understood as the subject dealing with numbers and data. More broadly, it involves activities such as collecting data from a survey or experiment, summarizing and managing data, presenting results in a convincing format, and analyzing data to draw valid inferences from the findings. Bio-statistics is the science that helps us manage medical data by applying statistical methods, techniques, and tools; it is a collection of statistical procedures particularly well suited to the analysis of healthcare-related data.

Page 7: Data analysis using spss

What is medical data?

Data related to patient care, or numerical information on patients' clinical characteristics, mortality rate, survival rate, disease distribution, prevalence of disease, efficacy of treatment, and other such information, are called medical data.

Page 8: Data analysis using spss

NATURE OF DATA

Data are the values you get from observing (measuring, counting, assessing, etc.) in an experiment or survey. Data are either categorical or metric. Categorical data are further divided into nominal and ordinal data, whereas metric (quantitative) data are divided into discrete and continuous data.

Page 9: Data analysis using spss
Page 10: Data analysis using spss

Nominal data

The data are divided into classes or categories with no meaningful order. Examples: blood type, sex, cause of disease, urban/rural, alive/dead, infected/not infected, hair color, smoking status.

Ordinal data

The data are also divided into classes or categories, but these can be put in a meaningful order. Examples: satisfaction level (very satisfied, satisfied, neutral, unsatisfied, very unsatisfied); pain (mild, moderate, severe); socioeconomic status (poor, middle, rich); grade of breast cancer; condition (better, same, worse).

Discrete data

Data obtained from a counting process, for example the number of patients in different wards, the number of nurses, or the number of hospitals in different cities.

Continuous or quantitative data

Data obtained from a measuring process, for example height, weight, temperature, uric acid, blood glucose, and serum level.

Page 11: Data analysis using spss

Primary Scales of Measurement

Scale | Basic characteristics | Common examples | Marketing examples | Permissible statistics (descriptive) | Permissible statistics (inferential)
Nominal | Numbers identify and classify objects | Social Security numbers, numbering of football players | Brand numbers, store types | Percentages, mode | Chi-square, binomial test
Ordinal | Numbers indicate the relative positions of objects but not the magnitude of differences between them | Quality rankings, rankings of teams in a tournament | Preference rankings, market position, social class | Percentile, median | Rank-order correlation, Friedman ANOVA
Interval | Differences between objects can be compared | Temperature (Fahrenheit) | Attitudes, opinions, index | Range, mean, standard deviation | Product-moment correlation
Ratio | Zero point is fixed, ratios of scale values can be compared | Length, weight | Age, sales, income, costs | Geometric mean, harmonic mean | Coefficient of variation

Page 12: Data analysis using spss

Nominal Scale

The numbers serve only as labels or tags for identifying and classifying

objects.

When used for identification, there is a strict one-to-one correspondence

between the numbers and the objects.

The numbers do not reflect the amount of the characteristic possessed by the

objects.

The only permissible operation on the numbers in a nominal scale is counting.

Examples: Social Security numbers, hockey players' numbers. In marketing research: respondents, brands, attributes, stores, and other objects.

Page 13: Data analysis using spss

ORDINAL SCALE

A ranking scale in which numbers are assigned to objects to indicate the relative extent to which the objects possess some characteristic.

We can determine whether an object has more or less of a characteristic than some other object, but not how much more or less.

Any series of numbers can be assigned that preserves the ordered relationships between the objects, so the scale shows the relative position of objects, not the magnitude of the difference between them.

In addition to the counting operation allowable for nominal scale data, ordinal scales permit the use of statistics based on percentiles, quartiles, and the median.

Ordinal scales possess description and order, but not distance or origin.

Page 14: Data analysis using spss

INTERVAL SCALE

Numerically equal distances on the scale represent equal values in the characteristic being measured.

It permits comparison of the differences between objects: the difference between 1 and 2 is the same as the difference between 2 and 3.

The location of the zero point is not fixed. Both the zero point and the units of measurement are arbitrary.

Examples: the everyday temperature scale; attitudinal data obtained on rating scales.

Interval scales do not possess the origin characteristic (a true zero).

Page 15: Data analysis using spss

RATIO SCALE

The highest scale: it allows us to identify objects, rank-order them, and compare intervals or differences.

It is also meaningful to compute ratios of scale values.

It possesses all the properties of the nominal, ordinal, and interval scales, and it has an absolute zero point.

Examples: height, weight, age, money. Sales, costs, market share, and number of customers are variables measured on a ratio scale.

All statistical techniques can be applied to ratio data.

Page 16: Data analysis using spss

Data Analysis

Once accurate and reliable data have been collected from the source using an appropriate method, the next step is to extract the pertinent and useful information buried in the data for further manipulation and interpretation. The process of performing calculations and evaluations in order to extract relevant information from data is called data analysis.

Page 17: Data analysis using spss

Data Analysis (cont.)

Data analysis may take several steps to reach a conclusion. Simple data can be organized very easily, while complex data require proper processing. The word "processing" means recasting and handling the data to make them ready for analysis.

Page 18: Data analysis using spss

Steps in data analysis

•Questionnaire checking/Data preparation

•Coding

•Cleaning data

•Applying the most appropriate tools for analysis

Page 19: Data analysis using spss

QUESTIONNAIRE CHECKING

A questionnaire returned from the field may be

unacceptable for several reasons.

Parts of the questionnaire may be incomplete.

The pattern of responses may indicate that the respondent did not

understand or follow the instructions.

The responses show little variance.

One or more pages are missing.

The questionnaire is received after the pre-established cutoff date.

The questionnaire is answered by someone who does not qualify for

participation.

Page 20: Data analysis using spss

DATA PREPARATION

Preparation of data file

It is important to convert raw data into a form usable for analysis (coding where needed), that is, to transfer the information from the questionnaire to a computer database.

The analysis and its results depend on the quality of the data.

Errors are possible in handling instruments and raw data, in transcribing, in data entry, and in assigning codes, values, and value labels.

Data need to be cleaned so that they fulfill the conditions of the analysis.

Page 21: Data analysis using spss

CODING

Coding means assigning a code, usually a

number, to each possible response to each

question.
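As an illustration outside SPSS, coding can be sketched in a few lines of Python. The codebook below is hypothetical; it reuses the 0 = female, 1 = male convention that appears later in these slides.

```python
# Hypothetical codebook: each possible response to a question gets a numeric code.
SEX_CODES = {"Female": 0, "Male": 1}


def encode(responses, codebook):
    """Replace each text response with its code (None if it is not in the codebook)."""
    return [codebook.get(response) for response in responses]


coded = encode(["Female", "Male", "Female"], SEX_CODES)
print(coded)
```

A response that is not in the codebook comes back as None, which makes unexpected answers easy to spot before analysis.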

Page 22: Data analysis using spss

Data cleaning

One of the first steps in analyzing data is to "clean" it of any obvious data entry errors:

•Outliers? (really high or low numbers)

Example: Age = 110 (really 10 or 11?)

•Value entered that doesn't exist for the variable?

Example: 2 entered where 1 = male, 0 = female

•Missing values?

Did the person not give an answer? Was the answer accidentally not entered into the database?

Page 23: Data analysis using spss

Data cleaning (cont.)

•It may be possible to set defined limits when entering data. This prevents entering a 2 when only 1, 0, or missing are acceptable values.

•Univariate data analysis is a useful way to check the quality of the data.
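The three cleaning checks (outliers, codes that do not exist, missing values) can be sketched in Python as follows. This is only an illustration: the field names, the 0 to 100 age limits, and the sample records are assumptions made for the example, not part of the slides.

```python
def find_problems(records, valid_sex=(0, 1), age_range=(0, 100)):
    """Return (row, field, problem) tuples for suspicious entries."""
    problems = []
    low, high = age_range
    for row, record in enumerate(records):
        # Missing values: the answer was not given or not entered.
        for field in ("age", "sex"):
            if record.get(field) is None:
                problems.append((row, field, "missing"))
        # Outliers: values outside the assumed plausible limits.
        age = record.get("age")
        if age is not None and not (low <= age <= high):
            problems.append((row, "age", "out of range"))
        # Codes that do not exist for the variable.
        sex = record.get("sex")
        if sex is not None and sex not in valid_sex:
            problems.append((row, "sex", "invalid code"))
    return problems


records = [
    {"age": 19, "sex": 0},
    {"age": 110, "sex": 1},   # outlier: really 10 or 11?
    {"age": 18, "sex": 2},    # 2 entered where only 0 or 1 exist
    {"age": None, "sex": 1},  # missing value
]
problems = find_problems(records)
for problem in problems:
    print(problem)
```

The function only flags entries; deciding whether 110 should be 10, 11, or missing still needs a look at the original questionnaire.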

Page 24: Data analysis using spss
Page 25: Data analysis using spss

SPSS

SPSS is a statistical package for data analysis. It is very popular in the social and medical sciences because it is easy to use.

Page 26: Data analysis using spss

Launching SPSS

Before starting this session, you should know how to run a program in the Windows operating system. Click the Start button at the lower left of your screen and, from the programs listed, select SPSS 16.0 to launch the program.

Page 27: Data analysis using spss

When SPSS starts, this window will open. Click the Cancel button if you want to enter data in a new file, or click OK to open an existing file. A window known as the Data Editor will then open in Variable View.

Page 28: Data analysis using spss

SPSS WINDOWS

There are a number of different types of windows in SPSS. The window in which you are currently working is called

the active window. Some of the frequently used windows are:

Data Editor Window: It displays the contents of the data file. This is the window that opens

automatically when you start an SPSS session. In this window, you can create new data files or modify existing ones.

When you open more than one data file, each data file has a separate Data Editor Window. The Data Editor Window provides two views of the data:

Data View: It displays the data values. Each variable is a column. Each row is a case.

Variable View: It displays a table consisting of variable names and their attributes. You can modify the properties of

each variable or add new variables or delete existing variables in the Variable View Window.

[Figures: Data View window; Variable View window]

Page 29: Data analysis using spss

Viewer Window: It displays statistical results, tables, and charts. This window opens automatically the first time you

run a procedure that generates output

Page 30: Data analysis using spss

MORE ABOUT WINDOWS

Page 31: Data analysis using spss
Page 32: Data analysis using spss

PULL-DOWN MENUS

Many tasks in SPSS are performed by selecting appropriate "pull-down" menus. Each window in SPSS has its own

menu bar with appropriate menu selections and toolbars. The Analyze and Graphs menus are available in all

windows. Here are some Data Editor Window menus and their uses:

File Menu: From the File menu you can open existing SPSS files, open a database file such as an Excel file, or read in a text file. You can also save any changes to the current file.

Edit Menu: From the Edit menu you can cut, copy, paste, insert variables, insert cases, or use Find in the Data Editor window.

Data Menu: The Data menu allows you to define variable properties, sort cases, merge files, split files, select cases, and use a variable to weight cases.

Transform Menu: The Transform menu is where you will find the options to do computations on variables, to create new variables from existing ones, or to recode old variables.

Analyze Menu: The Analyze menu is where all statistical analysis takes place, from descriptive statistics to regression analysis to nonparametric tests.

Page 33: Data analysis using spss

Graphs Menu: The graph menu is where you can create high resolution plots and graphs to be edited in

the chart editor window or you can create interactive graphs.

Utilities Menu: The utilities menu is used to display information on the contents of SPSS data files or to

run scripts.

Add-Ons Menu: From the add-ons menu you can run other packages like conjoint, classification trees, or

Neural Networks. Also there are programmability extensions that allow you to integrate programs like R

and Python into SPSS. But you should keep in mind that if you want to run any of the add-ons listed here

you will have to purchase them separately.

Window: From the window menu you can change the active window. The window with a check mark is the

active one. In this case it is the data editor window.

Help: The help menu allows you to get help on topics in SPSS or to ask the statistics coach some basic

questions.

TOOLBARS

Each window in SPSS has its own toolbars that provide access to common tasks. Some windows have more than one. When you put the mouse pointer on a tool, a brief description of what the tool does appears. You can show, move, or hide a toolbar.

Page 34: Data analysis using spss

STATUS BARS

The status bar is at the bottom of each SPSS window and provides the following information:

Command Status: gives information about a procedure that is running.

Filter Status: Filter On shows when a subset of cases in the data is used for analysis.

Weight Status: Weight On indicates that a weight variable is being used in the analysis.

Split File Status: Split File On indicates that the file has been split into separate groups for analysis.

DIALOG BOXES

Many menu selections will open dialog boxes. In these dialog boxes, you select variables and options for analysis. The main

dialog box in any statistical procedure has the following parts:

Source variable list: A list of the variables in the working data file that are of a type allowed by the procedure.

Target variable lists: One or more lists of variables needed for the analysis.

Command push buttons: Buttons that run the procedure or open a subdialog box for additional specifications. Some of the push buttons are:

OK : Click this button to run the procedure.

Paste: Click this button to generate command syntax from your selections. The command syntax is pasted into a syntax window,

where it can be modified for future analysis. This creates the code regularly known as SPSS programs.

Reset: Deselects any selections, and resets all specifications in the dialog box and any subdialog boxes to the default status.

Cancel: Cancels any change in the dialog box settings since the last time it was opened. This will close the dialog box.

Help: Provides help about the current dialog box.

Page 35: Data analysis using spss
Page 36: Data analysis using spss

Name

The name of each SPSS variable in a given file must be unique. It must start with a letter and may contain letters, numbers, and the underscore (_). Names may have up to 64 characters (versions before SPSS 12 allowed only 8). Certain keywords are reserved and may not be used as variable names, e.g., "AND", "OR", "TO", "WITH". To change an existing name, click in the cell containing the name, highlight the part you want to change, and type in the replacement. To create a new variable name, click in the first empty row under the Name column and type a new (unique) variable name.

Notice that we can use "cat_dog" but not "cat-dog" and not "cat dog". The hyphen is interpreted by SPSS as subtraction (cat minus dog), and the space confuses SPSS as to how many variables are being named.
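The naming rules can be expressed as a small check. The Python sketch below is not part of SPSS; the function name, the default length limit, and the contents of the reserved-word set are illustrative assumptions, and the pattern covers only the rules stated on this slide (letters, numbers, underscore).

```python
import re

# Assumed set of reserved keywords, for illustration only.
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT", "NE", "NOT", "OR", "TO", "WITH"}


def is_valid_name(name, existing=(), max_len=64):
    """Check a proposed variable name against the rules described on this slide."""
    if len(name) > max_len:
        return False
    # Must start with a letter; only letters, digits, and underscore after that.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        return False
    if name.upper() in RESERVED:
        return False
    # Names must be unique within the file (compared without regard to case).
    return name.lower() not in {other.lower() for other in existing}


for candidate in ["cat_dog", "cat-dog", "cat dog", "1cat"]:
    print(candidate, is_valid_name(candidate))
```

Passing `max_len=8` reproduces the stricter limit of older versions.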

Page 37: Data analysis using spss

Type

The two basic types of variables that you will use are numeric and string. Numeric variables may only have numbers assigned. String variables may contain letters or numbers, but even if a string variable happens to contain only numbers, numeric operations on that variable (e.g., finding the mean, variance, or standard deviation) will not be allowed. To change a variable type, click the grey box with the three dots (...) in that cell.

Page 38: Data analysis using spss

Decimals

The Decimals property of a variable is the number of decimal places that SPSS will display. If more decimals have been entered (or computed by SPSS), the additional information will be retained internally but not displayed on screen. For whole numbers, you would reduce the number of decimals to zero. You can change the number of decimal places by clicking in the Decimals cell for the desired variable and typing a new number, or you can use the arrows at the edge of the cell.

Label

The label of a variable is a string of text that identifies in more detail what the variable represents. Unlike the name, the label may contain spaces and punctuation, and it may be up to 255 characters long. For instance, if there is a variable for each question on a questionnaire, you would type the question as the variable label. To change or edit a variable label, simply click anywhere within the cell.

Page 39: Data analysis using spss

Values

Although the variable label goes a long way toward explaining what the variable represents, for categorical data (discrete data of both nominal and ordinal levels of measurement) we often need to know which numbers represent which categories. To indicate how these numbers are assigned, one can add labels to specific values by clicking on the ... box in the Values cell.

Clicking here opens up the Value Labels dialog box.

Page 40: Data analysis using spss

To assign the value 1.0 to cats and 2.0 to dogs, type 1.0 in the Value box and cats in the Value Label box, then click the Add button; repeat for 2.0 and dogs. The following box will appear.

Page 41: Data analysis using spss

Clicking on this box will bring up the Variable Type menu:

If you select a numeric variable, you can then click in the Width box or the Decimal Places box to change the default values: 8 characters reserved for displaying numbers, with 2 decimal places. For whole numbers, you can drop the decimals down to 0.

If you select a string variable, you can tell SPSS how much "room" to

leave in memory for each value, indicating the number of characters

to be allowed for data entry in this string variable.

Page 42: Data analysis using spss

When you are satisfied with the definitions of each value, click on the OK button

Page 43: Data analysis using spss

The real beauty of value labels can be seen in the Data View by clicking on the "toe

tag" icon in the tool bar , which switches between the numeric values

and their labels

Page 44: Data analysis using spss

A view of different variables with their descriptions

Page 45: Data analysis using spss

Missing

When you click the button in the Missing cell, SPSS will display this dialog box.

We sometimes want to signal to SPSS that data should be treated as missing even though some numerical code has been recorded in place of the value. (When a value is actually absent, SPSS displays a single period; this is called SYSTEM MISSING data.) In this example, after clicking on the ... button in the Missing cell, I declared "9", "99", and "999" all to be treated by SPSS as missing (i.e., these values will be ignored).
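To see why declaring such codes matters, here is a minimal Python sketch (not SPSS syntax) of a mean that ignores the declared codes. The sample answers are made up, and None stands in for a system-missing value.

```python
# Codes declared as missing on this slide.
USER_MISSING = {9, 99, 999}


def mean_excluding_missing(values, missing_codes=USER_MISSING):
    """Average only the valid values; None plays the role of system-missing."""
    valid = [v for v in values if v is not None and v not in missing_codes]
    return sum(valid) / len(valid) if valid else None


answers = [3, 4, 99, 5, None, 9, 4]  # hypothetical responses on a 1-5 scale
clean_mean = mean_excluding_missing(answers)
naive_mean = sum(v for v in answers if v is not None) / 6  # treats 99 and 9 as real data
print(clean_mean, naive_mean)
```

Without the declaration, the codes 99 and 9 would be averaged in as if they were real answers and would badly distort the mean.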

Page 46: Data analysis using spss

Columns

The Columns property tells SPSS how wide the column should be for each variable. Don't confuse this one with Width, which indicates how many digits of the number will be displayed. The column size indicates how much space is allocated rather than the degree to which it is filled.

Align

The alignment property indicates whether the information in the Data View should be left-justified, right-justified, or centered.

Page 47: Data analysis using spss

Measure

The Measure property indicates the level of measurement. Since SPSS does not differentiate between interval and ratio levels of measurement, both of these quantitative variable types are lumped together as "scale". Nominal and ordinal levels of measurement, however, are differentiated.

Page 48: Data analysis using spss

ENTERING A DATA SET INTO SPSS

Page 49: Data analysis using spss

Example

Suppose we have a data set with several variables that we need to enter into SPSS. The variables and the data are given below; the file is named “bp”.

Page 50: Data analysis using spss

Data Set:

Professor Christopher conducted a study on subjects. The variables are described below, followed by the data.

Variable | Description
Sjcode | Subject code
Sex | Subject sex (0 = female, 1 = male)
Age | Subject age
Height | Height, in inches
Weight | Weight, in pounds
Race | Subject race (1 = Amer, 2 = Asian, 3 = Black, 4 = Hispanic, 5 = White, 9 = none of above)
Med | Taking prescription medication (0 = No, 1 = Yes)
Smoke | Does subject smoke? (0 = Nonsmoker, 1 = Smoker)
SBPCP | Systolic blood pressure with cold pressor
DBPCP | Diastolic blood pressure with cold pressor
HRCP | Heart rate with cold pressor
SBPMA | Systolic blood pressure while doing mental arithmetic
DBPMA | Diastolic blood pressure while doing mental arithmetic
HRMA | Heart rate while doing mental arithmetic
SBPREST | Systolic blood pressure at rest
DBPREST | Diastolic blood pressure at rest
PH | Parental hypertension (0 = No, 1 = Yes)
MEDPH | Parent(s) on EH meds (0 = No, 1 = Yes)

Page 51: Data analysis using spss

SJcode sex age height weight race meds smoking sbpcp dbpcp hrcp sbpma dbpma hrma sbrest dbrest Ph Medph

3 Female 19 65 155 White No Med Non smoker 126 65 88 135.667 81.333 76.667 116.25 60.75 PH+ Parent EH Yes

4 Female 18 63 132 White No Med Non smoker 125 80 96 130.667 82.667 92.667 115.75 76.375 PH+ Parent EH Yes

5 Female 19 66 138 White No Med Non smoker 149 90 91 135.333 90.333 64.333 120.5 65.375 PH+ Parent EH Yes

9 Female 18 66 130 White No Med Non smoker 113 89 88 128.333 82.333 85.667 113.625 72.125 PH- Parents EH No

10 Female 18 66 175 White No Med Non smoker 112 70 82 121.667 75.333 85 110 68.75 PH- Parents EH No

11 Female 18 62 113 White No Med Non smoker 125 70 73 133.333 82.333 74.333 119.75 73.5 PH- Parents EH No

13 Male 20 73 159 White No Med Smoker 162 62 58 145.667 68 74 130.75 57.125 PH+ Parent EH Yes

15 Male 18 70 155 White No Med Non smoker 123 73 53 137.333 78.667 53.667 126.375 65.625 PH+ Parent EH Yes

16 Male 19 69.5 185 White No Med Non smoker 139 66 48 148.667 81.667 78.667 127.625 67.375 PH+ Parent EH Yes

19 Male 18 70 164 White No Med Non smoker 133 65 85 134.333 58.667 66.667 121.75 56.5 PH- Parents EH No

20 Male 19 71 170 White No Med Non smoker 152 75 71 150.333 73 82.333 129.875 60 PH- Parents EH No

21 Male 18 76 179 Hispanic No Med Non smoker 128 70 63 121 71.333 71 121 68.5 PH- Parents EH No

23 Female 19 68.5 160 White No Med Non smoker 119 51 68 117 62.333 73.333 107.875 51.375 PH+ Parent EH Yes

24 Female 20 66 132 White No Med Non smoker 120 67 80 128.333 72.667 81 108 63.75 PH+ Parent EH Yes

25 Female 19 67.5 150 Black No Med Non smoker 129 95 70 121.333 71 77 110.25 62.875 PH- Parents EH No

26 Female 20 62 105 White Yes Med Non smoker 124 90 93 124 92.333 87 104.375 76.375 PH+ Parent EH Yes

29 Female 19 62 120 White No Med Non smoker 130 75 103 132.667 76 88.667 117.625 67.875 PH- Parents EH No

30 Female 18 67.5 143 White No Med Non smoker 130 95 93 120.667 83.667 98.333 111 77.375 PH- Parents EH No

32 Female 18 63.5 130 White No Med Non smoker 109 73 71 104 61 65.667 105.125 53.875 PH- Parents EH No

35 Male 20 66 127 White No Med Non smoker 129 68 107 124.333 63.667 93.333 117.75 62.75 PH- Parents EH No

Page 52: Data analysis using spss

Entering data into data editor

In this lesson our only goal is to learn how to enter, save, and edit data (the data sheet given above). The first step in entering the data into the Data Editor is to define all the variables. Creating a variable requires us to name it, specify its type and level of measurement (nominal, ordinal, or scale), and assign labels to the variable and to its data values if needed.

•Click the Variable View tab at the bottom of the Data Editor; a different grid appears.

•Move the cursor into the first empty cell in row 1 (under Name), type sjcode, then press Enter.

•When the cursor moves to the Type column, a small grey button marked with three dots will appear. Click it to open the Variable Type dialog box. Numeric is the default variable type, so click OK.

Page 53: Data analysis using spss

Note that the Measure column (far right) is set to Scale, because the variable type is numeric. In SPSS, each variable can carry a descriptive label to help identify its meaning. To add a label, follow this procedure:

•Move the cursor into the Label column and type Subject Code.

This completes the definition of the first variable.

•Now let's create a variable to represent sex. Move to the first column of row 2 and name the variable sex.

•Sex is a categorical (qualitative) variable, and we are going to represent it numerically for data analysis purposes, because most SPSS procedures work on numeric codes. Since Numeric is the default in the Type column, we skip it and go to Width, setting the width as required. In the Decimals column, reduce the value from 2 to 0.

•Label this variable as Subject sex.

•Now we can assign text labels to our coded values (as discussed previously). In the Values column, click the grey box with three dots. A box will open as below.

Page 54: Data analysis using spss

Type “0” in the Value box and type Female in the Value Label box.

Page 55: Data analysis using spss

Then click Add.

Now type 1 in Value and Male in Label, click Add, and then click OK. In a similar way we add all the variables; the Variable View window will then look like this:

Page 56: Data analysis using spss
Page 57: Data analysis using spss

Now switch to Data View by clicking the appropriate tab at the lower left of the screen.

Move the cursor to the first cell below sjcode, type 3, and then press Enter. In the next cell type 4. When you have completed the subject codes, move to the top cell under sex, type “0” for female and “1” for male, and go on. When you are all done, the Data Editor should look like this:

On clicking the third button from the left (named Value Labels) you will see the screen as below.

Page 58: Data analysis using spss

Saving the data file

It is wise to save all your work in a disk file. To save a file, click on the File menu, choose Save As..., type BP next to File name, then click Save.

Page 59: Data analysis using spss
Page 60: Data analysis using spss

Editing the data file/value

To edit a value, open the data file, click the Edit menu, and select the case or variable that needs editing.

Quitting SPSS

When you have completed your work, it is important to exit the program properly. Go to the File menu, then click Exit. Generally you will see a message asking if you wish to save changes. Since we saved everything earlier, click No.

Page 61: Data analysis using spss

File management

Here we discuss topics such as transforming, selecting, and splitting data, computing new variables, re-coding data, merging files, sorting, transposing, and weighting cases.

Page 62: Data analysis using spss

Sorting data

This tool allows you to rearrange the data.

Open the file, choose Data | Sort Cases, select the variable, then click OK.
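What sorting cases does can be shown outside SPSS with a short Python sketch. The three cases below are taken from the bp data (subjects 13, 3, and 9); representing them as dictionaries is only a convenience for the example.

```python
# Three cases from the bp data set: subject code, sex (0 = female, 1 = male), age.
cases = [
    {"sjcode": 13, "sex": 1, "age": 20},
    {"sjcode": 3, "sex": 0, "age": 19},
    {"sjcode": 9, "sex": 0, "age": 18},
]

# Sort by one variable, ascending.
by_age = sorted(cases, key=lambda case: case["age"])

# Sort by sex ascending, then by age descending within each sex.
by_sex_then_age = sorted(cases, key=lambda case: (case["sex"], -case["age"]))

print([case["sjcode"] for case in by_age])
print([case["sjcode"] for case in by_sex_then_age])
```

Sorting rearranges whole cases, so every value stays attached to its own subject.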

Page 63: Data analysis using spss
Page 64: Data analysis using spss
Page 65: Data analysis using spss

Replacing missing values

Missing values in a variable can be replaced by different methods. If the variable is categorical, the researcher replaces the value on the basis of his/her own experience. If the variable is continuous, SPSS can help through the Replace Missing Values command. Open the file and look for missing values using the Sort command.

Page 66: Data analysis using spss

Replacing missing values (cont.)

Then go to Transform | Replace Missing Values and choose the replacement method from the options.
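One of the methods offered by the Replace Missing Values command is the series mean. A minimal Python sketch of that idea follows; the weights are illustrative, None marks a missing value, and the function name is made up for the example.

```python
def replace_with_series_mean(values):
    """Fill each missing entry (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to base a replacement on
    series_mean = sum(observed) / len(observed)
    return [series_mean if v is None else v for v in values]


weights = [155, 132, None, 130, 175]  # pounds, with one missing value
filled = replace_with_series_mean(weights)
print(filled)
```

Mean replacement keeps the overall mean unchanged but shrinks the variability, so it should be used sparingly.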

Page 67: Data analysis using spss
Page 68: Data analysis using spss

Creating Variables

Sometimes a new variable is needed, computed from an existing variable or set of variables. The procedure is: Transform | Compute Variable... Enter the name of the target variable and write the desired operation, such as a square or a log, in the numeric expression box.

Page 69: Data analysis using spss
Page 70: Data analysis using spss

Activity

Open the file “student”, convert weight into kg, then find the BMI of the students. Use 1 kg = 2.20462 lb and 1 m = 39.3701 inches, and compute BMI = weight (kg) / height (m)^2.

Compare this BMI with

BMI = 703 x weight (lb) / height (in)^2
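The arithmetic of this activity can be checked with a short Python sketch. It assumes the standard BMI definition with height squared in both versions; the function names are made up, and the sample values (155 lb, 65 in) are those of the first subject in the bp data, since the "student" file is not shown.

```python
LB_PER_KG = 2.20462    # 1 kg = 2.20462 lb
INCH_PER_M = 39.3701   # 1 m = 39.3701 inches


def bmi_metric(weight_lb, height_in):
    """Convert to kg and metres, then BMI = weight (kg) / height (m)^2."""
    weight_kg = weight_lb / LB_PER_KG
    height_m = height_in / INCH_PER_M
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb, height_in):
    """Shortcut formula: BMI = 703 x weight (lb) / height (in)^2."""
    return 703 * weight_lb / height_in ** 2


metric = bmi_metric(155, 65)
imperial = bmi_imperial(155, 65)
print(round(metric, 2), round(imperial, 2))
```

The two results agree to about two decimal places because 703 is a rounded form of 39.3701^2 / 2.20462.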

Page 71: Data analysis using spss

Re-coding

The re-code tool is used when the researcher wants to change codes (for example, to recode 1 to 5) or to turn numerical data into groups. Open the data file. From the menus choose: Transform | Recode | Into Different Variables...

The Recode into Different Variables dialog box appears.

Page 72: Data analysis using spss
Page 73: Data analysis using spss

Select the variable you want to recode. For this example select AAA, and click the right arrow button (►) to move the variable into the Input Variable > Output Variable box. The following appears in this box:

AAA --> ?

In the Output Variable group, enter an output variable name (e.g. AA1) in the Name box; optionally, give the new variable a label such as Stillbirth Rate Category. Then click Change.

Up to now, the dialog box looks as under:

Page 74: Data analysis using spss

Click the Old and New Values... button. The following dialog box appears, in which you specify how to recode the values.

In the Old Value group, select the 5th choice, then put 24 in the Lowest through box. In the Value box under the New Value group, input 1.

Page 75: Data analysis using spss

Click the Add button. Similarly, for a closed class interval such as 25-29, select the 4th choice in the Old Value group, put 25 in the first box and 29 in the through box, input 2 in the Value box under New Value, then click Add. Repeat this process, selecting the 4th choice each time, until you have input 5 as the New Value for the last closed interval. For the highest open class, select the 6th choice in the Old Value group, then put 45 in the through highest box. In the Value box under the New Value group input 6, then click Add. The final shape looks as under.

Click Continue and then OK. The XYZ-SPSS Data Editor, now containing two variables, AAA and AA1, looks as under, one in Variable View and the other in Data View.
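The recoding rules entered in the dialog box amount to a lookup table of ranges. A Python sketch of the same logic follows. Only the first class (up to 24), the class 25-29, and the last class (45 and above) are stated on the slides; the intermediate classes 30-34, 35-39, and 40-44 are assumed here to complete the example.

```python
# (low, high, new_value); None means "lowest" or "highest".
# The middle three ranges are assumptions made for illustration.
RANGES = [
    (None, 24, 1),
    (25, 29, 2),
    (30, 34, 3),
    (35, 39, 4),
    (40, 44, 5),
    (45, None, 6),
]


def recode(value, ranges=RANGES):
    """Return the new code for value, or None if no range matches."""
    if value is None:
        return None
    for low, high, new_value in ranges:
        above_low = low is None or value >= low
        below_high = high is None or value <= high
        if above_low and below_high:
            return new_value
    return None


aaa = [20, 27, 33, 45, 60]
aa1 = [recode(v) for v in aaa]
print(aa1)
```

A value that falls between two ranges, such as 24.5, matches no rule and stays uncoded, so the class limits should be chosen to suit the precision of the data.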

Page 76: Data analysis using spss

Specify Value Labels

Make the Data Editor the active window.

If the data view is displayed, double-click the variable name at the top of the column in

the data view or click the Variable View tab. Click the button in the values cell for the

variable that you want to define. For each value, enter the value and a label (the one

as seen below). Click Add to enter the value label, at last click OK.

Page 77: Data analysis using spss

Activity

For the above activity, group BMI as follows:

Underweight < 18.5

Normal 18.5 – 22.9

Overweight > 22.9

Also produce output for the groups.

Page 78: Data analysis using spss

Select cases

This tool is used to analyze data for a sub-group or a specific group, for example the mean for respondents whose weight is above 85 kg.

Open the file, choose Data on the menu bar, then Select Cases, click If, and write your selection condition. For example, to select males in the BP file, write sex = 1.
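Selecting cases is simply filtering on a condition. The Python sketch below illustrates the idea; the field name sex follows the bp variable list (0 = female, 1 = male), and the small case list is made up for the example.

```python
cases = [
    {"sjcode": 3, "sex": 0, "weight_kg": 70.3},
    {"sjcode": 13, "sex": 1, "weight_kg": 72.1},
    {"sjcode": 16, "sex": 1, "weight_kg": 83.9},
    {"sjcode": 40, "sex": 1, "weight_kg": 91.0},  # hypothetical extra case
]

# Condition "sex = 1": keep only the male cases.
males = [case for case in cases if case["sex"] == 1]

# Condition "weight above 85 kg", then the mean weight of that sub-group.
heavy = [case for case in cases if case["weight_kg"] > 85]
mean_heavy = sum(case["weight_kg"] for case in heavy) / len(heavy)

print(len(males), mean_heavy)
```

The unselected cases are left out of the analysis, not deleted from the list, which mirrors the default filtering behaviour.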

Page 79: Data analysis using spss
Page 80: Data analysis using spss
Page 81: Data analysis using spss
Page 82: Data analysis using spss
Page 83: Data analysis using spss

Activity

Select the male cases in the “bp” file, and also the females whose age is more than 50 years.

Page 84: Data analysis using spss

Two files may be merged either by variables or by cases. Suppose we have 1000 respondents, each with six variables, and two data entry operators are completing the data entry. They can divide this task in two ways: (1) divide the cases between them, or (2) divide the variables between them.

Merging file
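The two ways of merging can be sketched in Python as follows. This is only an illustration of the idea: the variable names (id, age, sbp) and the values are hypothetical, and matching on an id key is an assumption about how the two operators' files would be lined up.

```python
# Merge by cases: each operator entered all variables for different respondents.
part_a = [{"id": 1, "age": 34, "sbp": 120}, {"id": 2, "age": 51, "sbp": 135}]
part_b = [{"id": 3, "age": 47, "sbp": 128}]
merged_by_cases = part_a + part_b

# Merge by variables: each operator entered different variables
# for the same respondents.
vars_a = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
vars_b = [{"id": 1, "sbp": 120}, {"id": 2, "sbp": 135}]


def add_variables(left, right, key="id"):
    """Attach the variables in `right` to the matching case in `left`."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


merged_by_variables = add_variables(vars_a, vars_b)
print(merged_by_cases)
print(merged_by_variables)
```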

Page 85: Data analysis using spss

A file can be split into groups, for example two or three categories: from the menu select Data, then Split File, and then perform the required operation.

Split file
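Splitting a file means that every later analysis is repeated separately within each category. A minimal Python sketch of that idea, with made-up gender and weight values:

```python
from collections import defaultdict
from statistics import mean

# Illustrative (category, value) pairs; not taken from any course file.
rows = [("male", 72.0), ("female", 58.0), ("male", 80.0), ("female", 62.0)]


def split_file(rows):
    """Group the values by category, as Split File groups the cases."""
    groups = defaultdict(list)
    for category, value in rows:
        groups[category].append(value)
    return dict(groups)


# The same operation (here the mean) is then performed within each group.
group_means = {g: mean(v) for g, v in split_file(rows).items()}
print(group_means)
```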

Page 86: Data analysis using spss

Data analysis

Page 87: Data analysis using spss

BASIC STRATEGY

The following strategy is adopted to analyze the data

• Description, counting, proportion

• Prediction, relationship, association

• Comparison, estimation (95% confidence interval)

Page 88: Data analysis using spss

DATA ANALYSIS MAY BE DESCRIPTIVE OR INFERENTIAL

DESCRIPTIVE ANALYSIS CONTAINS MEAN, MEDIAN, MODE, SD, REGRESSION AND CORRELATION; ON THE OTHER HAND, CONFIDENCE INTERVAL, TESTING OF HYPOTHESIS, P-VALUE AND ANOVA RELATE TO INFERENTIAL ANALYSIS

Page 89: Data analysis using spss

UNI-VARIATE DESCRIPTIVE ANALYSIS

Graphical Method

For nominal & ordinal data we use Bar or pie chart

For continuous data we use histogram

Numerical method

For nominal & ordinal data we use Frequency/proportions

For continuous data we use Mean, Standard deviation
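The numerical methods above can be sketched in Python. The blood group and weight values are invented for illustration only.

```python
from collections import Counter
from statistics import mean, stdev


def frequency_table(values):
    """Return {category: (frequency, proportion)} for nominal or ordinal data."""
    counts = Counter(values)
    n = len(values)
    return {cat: (count, count / n) for cat, count in counts.items()}


blood_group = ["A", "B", "A", "O", "A", "B", "O", "A"]   # nominal (illustrative)
weights = [60.0, 65.0, 70.0, 75.0, 80.0]                 # continuous (illustrative)

freq = frequency_table(blood_group)
avg = mean(weights)
sd = stdev(weights)   # sample standard deviation (divisor n - 1)
print(freq, avg, round(sd, 3))
```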

Page 90: Data analysis using spss

Summary Guide

                 | Scale               | Nominal                                  | Ordinal
Displaying data  | Histogram, Box-plot | Bar chart, Pie chart                     | Bar chart, Pie chart
Summarizing data | Mean, Median, SD    | Frequency table, Percentages, Proportion | Frequency table, Percentages, Proportion

Page 91: Data analysis using spss

GRAPHS FOR

CATEGORICAL DATA

Page 92: Data analysis using spss

MAKING BAR/PIE CHART

Open the file, then from the pull-down menu click Graphs → Legacy Dialogs, then click Bar or Pie, select the variable, then click OK

Page 93: Data analysis using spss
Page 94: Data analysis using spss
Page 95: Data analysis using spss
Page 96: Data analysis using spss

DATA SUMMARY

Open the file, then from the pull-down menu click Analyze → Descriptive Statistics → Frequencies, select the variable and click OK; the output window will appear

Page 97: Data analysis using spss
Page 98: Data analysis using spss
Page 99: Data analysis using spss

GRAPH FOR CONTINUOUS

DATA

Page 100: Data analysis using spss

MAKING HISTOGRAM

Open the file, then from the pull-down menu click Graphs → Legacy Dialogs, then click Histogram, select the variable and click OK

Page 101: Data analysis using spss
Page 102: Data analysis using spss
Page 103: Data analysis using spss
Page 104: Data analysis using spss

DATA SUMMARY

Open the file, then from the pull-down menu click Analyze → Descriptive Statistics → Descriptives, select the variable and click OK; the output window will appear

Page 105: Data analysis using spss
Page 106: Data analysis using spss
Page 107: Data analysis using spss

FOR ALL DESCRIPTIVE STATISTICS

AND 95% CONFIDENCE INTERVAL

Open the file, then from the pull-down menu click Analyze → Descriptive Statistics → Explore, select the variable and click OK; the output window will appear
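To see what a 95% confidence interval for a mean is made of, here is a minimal Python sketch. It uses the normal (z) multiplier of about 1.96, which is an assumption made to stay within the standard library; Explore reports an interval based on the t distribution, which is wider for small samples. The data are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Large-sample confidence interval for the mean: mean +/- z * SE."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / sqrt(n)                  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)     # about 1.96 for 95%
    return m - z * se, m + z * se


weights = [60.0, 65.0, 70.0, 75.0, 80.0]   # illustrative values
lower, upper = mean_ci(weights)
print(round(lower, 2), round(upper, 2))
```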

Page 108: Data analysis using spss
Page 109: Data analysis using spss

Summary Guide for appropriate analysis for two variables

Type of variables       | Graphical display | Relationship
Categorical-categorical | Multiple bar      | Contingency table
Categorical-Scale       | Box-plot          | Descriptive statistics for each group
Scale-scale             | Scatter plot      | Correlation

Page 110: Data analysis using spss

GRAPH FOR CATEGORICAL DATA

Page 111: Data analysis using spss

MULTIPLE BAR CHART

Open the file, then from the pull-down menu click Graphs → Legacy Dialogs, then click Bar; move one variable to the category axis and one to the cluster box, then click OK

Page 112: Data analysis using spss
Page 113: Data analysis using spss
Page 114: Data analysis using spss
Page 115: Data analysis using spss

CONTINGENCY TABLE

Open the file, then from the pull-down menu click Analyze → Descriptive Statistics → Crosstabs and select the variables, one for Row and one for Column. For cell proportions click Cells and tick Total; for chi-square click Statistics. Click OK; the output window will appear
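What the cross-tab and the chi-square statistic contain can be sketched in Python. The gender and smoking values below are invented for illustration; the function computes only the Pearson chi-square statistic, not its p-value.

```python
from collections import Counter


def crosstab(row_var, col_var):
    """Count the cases in each (row, column) cell of the contingency table."""
    return Counter(zip(row_var, col_var))


def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    row_totals, col_totals = Counter(), Counter()
    for (r, c), count in table.items():
        row_totals[r] += count
        col_totals[c] += count
    n = sum(table.values())
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (table.get((r, c), 0) - expected) ** 2 / expected
    return stat


# Illustrative data: 50 males (30 smokers) and 50 females (10 smokers).
gender = ["M"] * 50 + ["F"] * 50
smoker = ["yes"] * 30 + ["no"] * 20 + ["yes"] * 10 + ["no"] * 40

table = crosstab(gender, smoker)
total_proportions = {cell: count / len(gender) for cell, count in table.items()}
chi2 = chi_square(table)
print(dict(table), round(chi2, 3))
```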

Page 116: Data analysis using spss
Page 117: Data analysis using spss
Page 118: Data analysis using spss
Page 119: Data analysis using spss
Page 120: Data analysis using spss

GRAPH FOR CONTINUOUS

DATA

Page 121: Data analysis using spss

SCATTER PLOT

Open the file; on the pull-down menu click Graphs → Legacy Dialogs → Scatter/Dot, enter the variables for the x-axis and y-axis, then click OK

Page 122: Data analysis using spss
Page 123: Data analysis using spss
Page 124: Data analysis using spss
Page 125: Data analysis using spss

CORRELATION COEFFICIENT

Open the file, then from the pull-down menu click Analyze → Correlate → Bivariate, select the variables and click OK; the output window will appear
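The coefficient reported in the output is Pearson's r. A minimal Python sketch of its calculation, using invented height and weight values:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y over the product of their spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


height = [150, 160, 170, 180]   # illustrative values
weight = [50, 58, 69, 80]
r = pearson_r(height, weight)
print(round(r, 3))
```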

Page 126: Data analysis using spss
Page 127: Data analysis using spss
Page 128: Data analysis using spss

SUMMARY ONE CATEGORICAL

ONE CONTINUOUS VARIABLE

When we have one categorical and one continuous variable, we use the Explore command for descriptive analysis and a box-plot for the graph. Suppose we have the gender and weight of respondents

Page 129: Data analysis using spss

DESCRIPTIVE STATISTICS

Open the file, go to Analyze, then select Descriptive Statistics → Explore; a window will open. Move the continuous variable to the Dependent List and the categorical variable to the Factor List, then click OK

Page 130: Data analysis using spss
Page 131: Data analysis using spss
Page 132: Data analysis using spss
Page 133: Data analysis using spss

BOX PLOT

Open the file, click Graphs, then Legacy Dialogs, then Boxplot; click Simple, then Define. Put the continuous variable in the Variable box and the categorical variable (sex, SES) in the Category Axis box, and click OK

Page 134: Data analysis using spss
Page 135: Data analysis using spss
Page 136: Data analysis using spss
Page 137: Data analysis using spss

REGRESSION ANALYSIS

Regression analysis is the prediction of one variable on the basis of another variable or a set of variables (be sure all variables are continuous), for example the prediction of BP when the age of a person is 55 years. The mathematical equation is

$Y(\text{BP}) = a + b\,X(\text{Age})$

where a and b are the coefficients of the equation

Page 138: Data analysis using spss

CONT…..

Open file → Analyze → Regression → Linear, then put the dependent variable and the independent variable in their respective boxes → OK
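The coefficients that the Linear Regression procedure reports for one predictor are the least-squares intercept and slope. A minimal Python sketch of that calculation on made-up numbers (not the course data):

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


x = [1, 2, 3]   # illustrative predictor values
y = [3, 5, 7]   # illustrative responses lying exactly on y = 1 + 2x
a, b = fit_line(x, y)
print(a, b)
```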

Page 139: Data analysis using spss
Page 140: Data analysis using spss
Page 141: Data analysis using spss
Page 142: Data analysis using spss

REGRESSION LINE

This is the regression line using the results of the previous slide.

$Y(\text{BP}) = 129.61 + 0.075\,(\text{Age})$
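The fitted line can be used for the prediction mentioned earlier (BP at age 55). The sketch below assumes the line reads BP = 129.61 + 0.075 × Age; the operators are lost in the transcript, so the plus sign is an assumption.

```python
def predict_bp(age, a=129.61, b=0.075):
    """Predicted BP from the fitted line a + b * age (sign of b assumed positive)."""
    return a + b * age


predicted = predict_bp(55)
print(round(predicted, 3))
```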

Page 143: Data analysis using spss

MEASURE OF RISK

When we have an exposure and an outcome (a 2x2 table), the Odds Ratio (OR) is obtained from the Crosstabs command: open Crosstabs, click Statistics, then tick Risk and click Continue
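The risk measures for a 2x2 table are simple ratios of the cell counts. A minimal Python sketch, with the cell labels a, b, c, d and the counts chosen for illustration only:

```python
def odds_ratio(a, b, c, d):
    """a: exposed with outcome, b: exposed without outcome,
    c: unexposed with outcome, d: unexposed without outcome."""
    return (a * d) / (b * c)


def relative_risk(a, b, c, d):
    """Risk of the outcome in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Illustrative counts: 30 of 50 exposed and 10 of 50 unexposed have the outcome.
or_value = odds_ratio(30, 20, 10, 40)
rr_value = relative_risk(30, 20, 10, 40)
print(or_value, round(rr_value, 3))
```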

Page 144: Data analysis using spss
Page 145: Data analysis using spss
Page 146: Data analysis using spss

Open file “states”: for the variable “bac”, what percentage of states use the 0.8 standard?

Open file “Aids”: determine the shape of the distribution of Aids cases reported in 1994.

Open file “students”: make side-by-side histograms of height to compare males and females. Make a cross-tab (contingency table) of gender and eye color, and also compare blue eye color between males and females. Make a scatter plot of height and weight and interpret the graph. Compute descriptive statistics for the variable amount paid for a haircut.

Activity

Page 147: Data analysis using spss

Open file “college” and focus on two variables, in-state tuition and out-state tuition; show which varies more (calculate the coefficient of variation). Construct a box-plot of math score for public and private schools and comment on the plot. On average, in which subject (mathsat, verbsat) is the score larger?

Cont……

Page 148: Data analysis using spss

Open file “GSS94” and answer the questions:

Do females tend to watch more or less TV per day than males? (calculate descriptive statistics)

For the question whether respondents are afraid to walk alone in their neighborhood, compare the mean age of those who said “yes” and those who said “no”.

Make a contingency table for sex and race.

Make a cross-tab of the variables marital status and marnomar and find the probability that a person is married.

Cont….

Page 149: Data analysis using spss

Open file “bodyfat”, calculate the correlation between neck and chest circumference, and also fit a regression line of chest circumference on neck circumference.

Investigate the normality of the variables “Fatperc”, “age”, “weight” and “neck” using an appropriate test and graph.

Cont…..

Page 150: Data analysis using spss

Open file “sleep”: using appropriate descriptive and graphical techniques, how would you establish the relationship between the amount of sleep a species requires and the mean weight of the species? Also interpret the results. Make a frequency distribution of the variable amount of sleep using appropriate intervals. Construct 95% confidence intervals for total sleep and life span.

Open file “colleges”: construct a 95% confidence interval for mean room and board charges. What does it mean?

Cont….

Page 151: Data analysis using spss

TESTING OF HYPOTHESIS

Here we will discuss

• One-sample t-test

• Two-sample t-test (independent groups, dependent groups)

• One-way ANOVA (F-test)

Page 152: Data analysis using spss

ONE SAMPLE T-TEST

Open data file “bodyfat” and test the hypothesis that the population mean body fat is 23 against the alternative that it is not equal to 23.

Analyze → Compare Means → One-Sample T Test, select the variable body fat and enter 23 as the test value; the results are as follows

Page 153: Data analysis using spss
Page 154: Data analysis using spss
Page 155: Data analysis using spss
Page 156: Data analysis using spss

INTERPRETATION OF RESULTS

Here the sample mean is 19.15, the t-statistic is -7.30 and the p-value is reported as 0.000 (that is, p < 0.001), which suggests rejecting the null hypothesis; it is concluded that the population mean body fat is not 23
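The t-statistic in such output is the difference between the sample mean and the test value, divided by the standard error. A minimal Python sketch follows; the sample values are invented and do not reproduce the bodyfat result quoted above.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, test_value):
    """Return the t-statistic and degrees of freedom for H0: mean = test_value."""
    n = len(values)
    se = stdev(values) / sqrt(n)          # standard error of the mean
    return (mean(values) - test_value) / se, n - 1


sample = [18, 20, 22, 24, 26]             # illustrative values only
t_stat, df = one_sample_t(sample, 23)
print(round(t_stat, 3), df)
```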

Page 157: Data analysis using spss

TWO (INDEPENDENT) SAMPLE T-TEST

Sometimes we focus on comparing the means of a variable of interest in two different samples, for example whether the height of boys is different from the height of girls. Open file “students” and compare the height of boys and girls

Page 158: Data analysis using spss

Open file → Analyze → Compare Means → Independent-Samples T Test; a window will open. Select height as the test variable and gender as the grouping variable. Click Define Groups and enter the values coded for male and female, then click OK
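The calculation behind the "equal variances assumed" row of the output can be sketched in Python. The heights are invented for illustration, and the sketch does not cover the unequal-variances row that SPSS also prints.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Pooled-variance t-statistic and degrees of freedom for two independent groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1)
                  + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, n1 + n2 - 2


boys = [170, 172, 174, 176, 178]    # illustrative heights in cm
girls = [160, 162, 164, 166, 168]
t_stat, df = independent_t(boys, girls)
print(round(t_stat, 3), df)
```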

Page 159: Data analysis using spss
Page 160: Data analysis using spss
Page 161: Data analysis using spss

T value P-value

Page 162: Data analysis using spss

PAIRED T-TEST (DEPENDENT SAMPLES)

Sometimes observations are taken before and after some treatment on the same respondents. For example, BP is measured before and after a medicine. This type of sample is called a paired sample. Open file “swimmer2”; we wish to see whether there is any difference in the students' freestyle times at two points in time

Page 163: Data analysis using spss

Open file → Analyze → Compare Means → Paired-Samples T Test; a window will open. Select the two 100 meter freestyle variables and click OK
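A paired t-test is a one-sample t-test on the within-person differences. A minimal Python sketch, using invented before and after times rather than the swimmer2 data:

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """t-statistic and degrees of freedom for the mean of the paired differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


time1 = [60, 62, 64, 66]    # illustrative times at the first point
time2 = [58, 61, 61, 64]    # same swimmers at the second point
t_stat, df = paired_t(time1, time2)
print(round(t_stat, 3), df)
```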

Page 164: Data analysis using spss
Page 165: Data analysis using spss
Page 166: Data analysis using spss
Page 167: Data analysis using spss

ONE-WAY ANOVA

For more than two independent groups we use one-way ANOVA. Suppose we are interested to know whether an off-campus job affects students' GPA. Open file student and test GPA with work category as the grouping variable. The null hypothesis is that mean GPA is the same for all work categories. If the null hypothesis is rejected, then we apply a post hoc test (LSD)

Page 168: Data analysis using spss

PROCEDURE

Open file → Analyze → Compare Means → One-Way ANOVA. The Dependent List variable is GPA and the Factor variable is workcat. Click Options and under Statistics select Descriptive; then click Post Hoc, and in the window that opens select LSD. Click OK
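The F-statistic in the ANOVA table is the ratio of the between-groups mean square to the within-groups mean square. A minimal Python sketch with invented scores for three groups (not the GPA data, and without the LSD post hoc step):

```python
def one_way_anova(groups):
    """Return F, between-groups df and within-groups df for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]   # illustrative groups
f_stat, df1, df2 = one_way_anova(scores)
print(f_stat, df1, df2)
```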

Page 169: Data analysis using spss
Page 170: Data analysis using spss

Posthoc

test

Page 171: Data analysis using spss
Page 172: Data analysis using spss
Page 173: Data analysis using spss

Open file “GSS94” and test the null hypothesis that adults in the United States watch an average of three hours of TV daily. Test the hypothesis that males spend 3 hours watching TV (use the Select Cases command).

Is there a statistically significant difference in the amount of time men and women spend watching TV? Is there a statistically significant difference in the amount of time married and divorced respondents spend watching TV?

Activity

Page 174: Data analysis using spss

Open file “students” and test the hypotheses: do commuters and residents earn significantly different mean grades? Do car owners have significantly fewer accidents on average than non-owners? Interpret your results using the 95% confidence interval and p-value.

Open file “BP” and test the hypothesis: do subjects with a parental history of hypertension have significantly higher resting systolic and diastolic BP than subjects with no parental history?

Open file “GSS94”: does the amount of television viewing vary by respondent’s race? (ANOVA)

Cont….

Page 175: Data analysis using spss

Open file “BP”: is systolic BP (sbpma) related to a person’s sex, parental hypertension (ph) or some combination of these factors?

Open file “group”: is the subject’s perception of co-workers related to gender, group size or a combination of these two factors?

Open file “bodyfat”: consider a man whose chest measurement is 95 cm, abdomen is 85 cm, and whose weight is 158 pounds; use the regression equation to estimate this man’s body fat percentage (use multiple regression). Also write the regression equation and interpret the results.

Page 176: Data analysis using spss

Develop the multiple regression line to estimate body fat percentage on the basis of the following variables: age, weight, abdomen circumference, chest circumference, thigh circumference and wrist circumference, using a matrix plot, correlation matrix and p-values.

Open file “salem” and test whether the variables proparri and accuser are independent (use the chi-square test).

Open file “students” and test whether smokers tend to drink more beer than nonsmokers (select a parametric or non-parametric test: t-test or Mann-Whitney U test).

Page 177: Data analysis using spss

ADVANCED DATA ANALYSIS

The following are advanced tools:

• Logistic regression, survival analysis (KM curve)

• Factor analysis, reliability

• Repeated measures ANOVA

• Time series analysis (forecasting)

Page 178: Data analysis using spss

Medical Data Analysis

Univariate
    Categorical Data
        Descriptive Analysis
            Graphs: Bar, Pie Charts
            Frequency (f), Percentage (%), Proportion
        Inferential Analysis
            Chi-square (χ²) test
            Z-test
    Continuous Data
        Descriptive Analysis
            Histogram
            Mean ± S.D
        Inferential Analysis
            Z-test (n>30)
            t-test (n<30)

Multivariate
    Categorical Data
        Descriptive Analysis
            Multiple Bar Charts
            Contingency Table
        Inferential Analysis
            Association: χ², OR, RR
            Prediction: Logistic Regression
    Continuous Data
        Descriptive Analysis
            Scatter Plot, Box Plot
            Relationship: Regression, Correlation
        Inferential Analysis
            t-test
            ANOVA, Multiple Regression