Data Display and Summary Biostatistics By Dr Zahid Khan

Upload: drzahid-khan

Post on 07-May-2015

Page 1: Data Display and Summary

Data Display and Summary

Biostatistics

By Dr Zahid Khan

Page 2: Data Display and Summary

Data

• Data is a collection of facts, such as values or measurements.

OR

• Data is information that has been translated into a form that is more convenient to move or process.

OR

• Data are any facts, numbers, or text that can be processed by a computer.

Page 3: Data Display and Summary

Statistics

Statistics is the study of the collection, organization, summarization, analysis, and interpretation of data.

Page 4: Data Display and Summary

Vital statistics

Vital statistics is the collection, summarization, organization, analysis, presentation, and interpretation of data related to the vital events of life, such as births, deaths, marriages, divorces, health, and disease.

Page 5: Data Display and Summary

Biostatistics

Biostatistics is the application of statistical techniques to scientific research in health-related fields, including medicine, biology, and public health.

Page 6: Data Display and Summary

Descriptive Statistics

The term descriptive statistics refers to statistics that describe and summarize the data actually collected, without generalizing beyond them. A good example is the Census, in which all members of a population are counted, so the results describe that population directly.

Page 7: Data Display and Summary

Inferential or Analytical Statistics

Inferential statistics are used to draw conclusions and make predictions about a population based on the analysis of data from a sample.

Page 8: Data Display and Summary

Primary & Secondary Data

• Raw or primary data: data as collected, still containing a lot of unnecessary, irrelevant, and unwanted information.

• Treated or secondary data: data from which this unnecessary, irrelevant, and unwanted information has been removed.

• Cooked data: data that were not genuinely collected and are false or fictitious.

Page 9: Data Display and Summary

Ungrouped & Grouped Data

• Ungrouped data: data presented or observed individually. For example, the number of children observed in 6 families:

2, 4, 6, 4, 6, 4

• Grouped data: identical values grouped together by frequency. For example, the data above on children in 6 families can be grouped as:

No. of children    Families
2                  1
4                  3
6                  2

or, alternatively, grouped into classes:

No. of children    Frequency
2 - 4              4
5 - 7              2
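The grouping above can be reproduced with a short Python sketch. It is only an illustration: the variable names are hypothetical, and the class limits (2 - 4 and 5 - 7) are taken from the example rather than derived by any rule.

```python
from collections import Counter

# Number of children in the 6 families from the example
children = [2, 4, 6, 4, 6, 4]

# Group identical values by frequency
frequency = Counter(children)

# Group into classes; the limits below are copied from the example
classes = [(2, 4), (5, 7)]
class_frequency = {
    f"{low} - {high}": sum(low <= c <= high for c in children)
    for low, high in classes
}

for value in sorted(frequency):
    print(value, frequency[value])
for label, count in class_frequency.items():
    print(label, count)
```

Both tables count the same 6 families, so the frequencies in each must add up to 6.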

Page 10: Data Display and Summary

Variable

A variable is a characteristic or value that can vary from one individual or observation to another. For example: age, height, weight, blood pressure, etc.

Page 11: Data Display and Summary

Types of Variable

Independent variable: the variable that is manipulated or changed, or the exposure of interest. For example, smoking.

Dependent variable: the outcome observed in response to the independent variable. For example, carcinoma (cancer) of the lung.

Confounding variable: a variable associated with both the exposure and the disease. For example, age is a risk factor for many diseases and may also be related to the exposure.

Page 12: Data Display and Summary

Categories of DATA

Page 13: Data Display and Summary

Quantitative or Numerical data

These data describe information that can be counted or measured and expressed as numbers, for example:

2, 4, 6, 8.5, 10.5

Page 14: Data Display and Summary

Quantitative or Numerical data (cont.)

These data are of two types:

1. Discrete data: takes only whole-number values, with no fractions. For example:

Number of children in a family = 4

Number of patients in hospital = 320

2. Continuous data: measured on a continuous scale, so it can take any of an infinite number of values within a range, including fractions. For example:

Height of a person = 5 feet 6 inches (5′ 6″)

Temperature = 92.3 °F

Page 15: Data Display and Summary

Qualitative or Categorical data

These are non-numerical data, such as Male/Female or Short/Tall.

They are of two types:

1. Nominal data: a series of unordered categories (each observation falls into only one category at a time). For example:

Sex = Male/Female

Blood group = O/A/B/AB

2. Ordinal or ranked data: distinct categories with a natural order or rank. For example:

Height can be recorded as Short / Medium / Tall

Degree of pain can be None / Mild / Moderate / Severe

Page 16: Data Display and Summary

Stem and Leaf Plots

• A simple way to order and display a data set.

• Abbreviate the observed data to two significant digits.

Data: 0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2

Stem | Leaf
0    | 6 1 4
1    | 1 3 5
2    | 6 2 0
3    | 2
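As a rough sketch of how such a plot can be built, the Python below splits each value into a stem (the whole-number part) and a leaf (the first decimal digit). The function name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative values with one decimal place, as in the data shown.

```python
from collections import defaultdict

data = [0.6, 2.6, 0.1, 1.1, 0.4, 1.3, 1.5, 2.2, 2.0, 3.2]

def stem_and_leaf(values):
    """Map each whole-number stem to its list of first-decimal leaves."""
    plot = defaultdict(list)
    for v in values:
        tenths = round(v * 10)           # 2.6 -> 26; round() absorbs float error
        stem, leaf = divmod(tenths, 10)  # 26 -> stem 2, leaf 6
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Leaves are kept in the order the values appear in the data. Sorting each leaf list would give the ordered form of the plot.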

Page 17: Data Display and Summary

Measures of Central Tendency & Variation (Dispersion)

Page 18: Data Display and Summary

Measures of Central Tendency

Measures of central tendency are quantitative indices that describe the center of a distribution of data. These are the three Ms:

• Mean

• Median

• Mode

Page 19: Data Display and Summary

Mean

The mean, or arithmetic mean, is also called the AVERAGE and is calculated only for numerical data. For example:

• What is the average age of the children, in years?

Children    1  2  3  4  5  6  7
Age         6  4  4  3  2  4  5

Formula: $\bar{X} = \frac{\sum X}{n}$

Mean = (6 + 4 + 4 + 3 + 2 + 4 + 5) / 7 = 28 / 7 = 4 years
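The same calculation can be checked in Python. This is a minimal sketch using the seven ages that sum to 28; the variable names are illustrative only.

```python
from statistics import mean

# Ages (in years) of the 7 children; they sum to 28
ages = [6, 4, 4, 3, 2, 4, 5]

# mean() divides the sum of the values by the number of values
average_age = mean(ages)
print(average_age)
```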

Page 20: Data Display and Summary

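Median

• It is the central value of the ordered data. For example, what is the central value in 2, 3, 4, 4, 4, 5, 6?

• If we divide the data into two equal groups, with 2, 3, 4 below and 4, 5, 6 above, then 4 is the central value.

• The position of the central value is given by:

Position of median = $\frac{n + 1}{2}$ (here n is the total number of values)

Position of median = (7 + 1) / 2 = 8 / 2 = 4, so the median is the 4th value in the ordered data, which is 4.

• When n is even, the median is the average of the two middle values.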
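One point worth making explicit is that (n + 1) / 2 gives the position of the median in the ordered data, not the median itself. The sketch below, with illustrative variable names, shows both steps for the data 2, 3, 4, 4, 4, 5, 6 and compares the result with the standard library.

```python
from statistics import median

values = [6, 4, 4, 3, 2, 4, 5]

# The data must be ordered before the central value is picked
ordered = sorted(values)

# 1-based position of the central value; valid as written for odd n
position = (len(ordered) + 1) // 2
central_value = ordered[position - 1]

print(position, central_value)
print(median(values))  # library result for comparison
```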

Page 21: Data Display and Summary

Mode

• The mode is the most frequently occurring (most repeated) value in a set of observations. Examples:

• No mode

Raw data: 10.3 4.9 8.9 11.7 6.3 7.7

• One mode (4)

Raw data: 2 3 4 4 4 5 6

• More than one mode (28 and 43)

Raw data: 21 28 28 41 43 43
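The three cases can be illustrated with a small helper. The function name `modes` is hypothetical, and treating a data set in which no value repeats as having no mode is a convention taken from the examples here, not a universal rule.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); empty list if nothing repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: no mode, by this convention
    return [v for v, c in counts.items() if c == highest]

print(modes([10.3, 4.9, 8.9, 11.7, 6.3, 7.7]))  # no mode
print(modes([2, 3, 4, 4, 4, 5, 6]))             # one mode
print(modes([21, 28, 28, 41, 43, 43]))          # more than one mode
```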

Page 22: Data Display and Summary

Comparison of the Mode, the Median, and the Mean

• In a normal distribution, the mode, the median, and the mean have the same value.

• The mean is the most widely reported index of central tendency for variables measured on an interval or ratio scale.

• The mean takes each and every score into account.

• It is also the most stable index of central tendency and thus yields the most reliable estimate of the central tendency of the population.

Page 23: Data Display and Summary

Histogram/Bar Chart

• Histograms and box plots are used for continuous (scale) variables such as temperature, bone density, etc.

• Bar charts and pie charts are used for categorical (nominal) variables such as gender, name, etc.

Page 24: Data Display and Summary

Measures of Dispersion

Measures of dispersion are quantitative indices that describe the spread of a data set. These are:

• Range

• Mean deviation

• Variance

• Standard deviation

• Coefficient of variation

• Percentile

Page 25: Data Display and Summary

Range

It is the difference between the highest and lowest values in a data series. For example:

The ages (in years) of 10 children are

2, 6, 8, 10, 11, 14, 1, 6, 9, 15

Here the range of age is 15 – 1 = 14 years.
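For completeness, the range of the ten ages can be computed directly. This is a minimal sketch with illustrative variable names.

```python
# Ages (in years) of the 10 children
ages = [2, 6, 8, 10, 11, 14, 1, 6, 9, 15]

# Range = highest value minus lowest value
age_range = max(ages) - min(ages)
print(age_range)
```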

Page 26: Data Display and Summary

Mean Deviation

This is the average absolute deviation of all observations from the mean.

Mean deviation = $\frac{\sum |X - \bar{X}|}{n}$

Here X = value, $\bar{X}$ = mean, and n = total number of values.

Page 27: Data Display and Summary

Mean Deviation Example

A student took 5 exams in a class and had scores of 92, 75, 95, 90, and 98. Find the mean deviation for her test scores.

• First step: find the mean.

$\bar{x} = \frac{\sum x}{n} = \frac{92 + 75 + 95 + 90 + 98}{5} = \frac{450}{5} = 90$

Page 28: Data Display and Summary

Dr. Riaz A. Bhutto, 9/3/2012

• Second step: find the mean deviation.

Value (X)      Mean (X̄)    Deviation (X − X̄)    Absolute deviation |X − X̄| (ignoring ± signs)
92             90           2                     2
75             90           -15                   15
95             90           5                     5
90             90           0                     0
98             90           8                     8
Total = 450                                       ∑ |X − X̄| = 30

n = 5

Mean deviation = $\frac{\sum |X - \bar{X}|}{n} = \frac{30}{5} = 6$

The average deviation from the mean is 6.
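The two steps of this example translate directly into code. The sketch below uses illustrative variable names and mirrors the columns of the table: the deviation of each score from the mean, then its absolute value.

```python
# Exam scores from the example
scores = [92, 75, 95, 90, 98]

# First step: the mean
mean_score = sum(scores) / len(scores)

# Second step: deviations from the mean, then their absolute values
deviations = [x - mean_score for x in scores]
absolute_deviations = [abs(d) for d in deviations]

mean_deviation = sum(absolute_deviations) / len(scores)
print(mean_score, sum(absolute_deviations), mean_deviation)
```

Note that the signed deviations always sum to zero, which is why the absolute values are taken before averaging.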

Page 29: Data Display and Summary

Variance

• It is a measure of variability that takes into account the difference between each observation and the mean.

• The sample variance is the sum of the squared deviations from the mean divided by the number of values in the series minus 1 (n − 1); the population variance divides by n instead.

• Sample variance is denoted s² and population variance σ².

Page 30: Data Display and Summary

Variance (cont.)

• The variance is defined as the average of the squared differences from the mean.

• To calculate the variance, follow these steps:

• Work out the mean (the simple average of the numbers).

• Then, for each number, subtract the mean and square the result (the squared difference).

• Then work out the average of those squared differences (divide by n for a population variance, or by n − 1 for a sample variance).

Page 31: Data Display and Summary

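Example: The household size of 5 families was recorded as follows: 2, 5, 4, 6, 3. Calculate the variance for these data.

Step 1         Step 2       Step 3                Step 4
Value (X)      Mean (X̄)    Deviation (X − X̄)    (X − X̄)²
2              4            -2                    4
5              4            1                     1
4              4            0                     0
6              4            2                     4
3              4            -1                    1

Step 5: ∑ (X − X̄)² = 10

Step 6: Variance = $\frac{\sum (X - \bar{X})^2}{n} = \frac{10}{5} = 2$ persons²

This divides by n, following the definition of the variance as the average of the squared differences. Dividing by n − 1 = 4 instead, as the sample variance formula requires, gives s² = 10/4 = 2.5 persons².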
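The household example can be worked through in Python to show how the choice of divisor changes the result. This is a sketch with illustrative variable names; `pvariance` and `variance` are the standard-library functions that divide by n and by n − 1 respectively.

```python
from statistics import pvariance, variance

# Household sizes of the 5 families
household = [2, 5, 4, 6, 3]

# By hand: mean, squared deviations, and their sum
mean_size = sum(household) / len(household)
squared_deviations = [(x - mean_size) ** 2 for x in household]
sum_of_squares = sum(squared_deviations)

# Divide by n (population form) or by n - 1 (sample form)
population_variance = sum_of_squares / len(household)
sample_variance = sum_of_squares / (len(household) - 1)

print(population_variance, pvariance(household))
print(sample_variance, variance(household))
```

With only 5 values the two divisors give noticeably different answers (2 versus 2.5), so it matters which one is reported.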

Page 32: Data Display and Summary

Standard Deviation

• The standard deviation is a measure of how spread out numbers are.

• Its symbol is σ (the Greek letter sigma) for a population and s for a sample.

• The formula is easy: it is the square root of the variance, i.e. $s = \sqrt{s^2}$.

• SD is the most useful measure of dispersion.

Population (rule of thumb: n > 30): $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$

Sample (rule of thumb: n < 30): $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
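A short sketch of both formulas follows, reusing the household sizes 2, 5, 4, 6, 3 from the variance example as assumed input. `pstdev` divides by n and `stdev` divides by n − 1 before taking the square root.

```python
from statistics import pstdev, stdev

# Household sizes reused from the variance example (assumed input)
household = [2, 5, 4, 6, 3]

population_sd = pstdev(household)  # square root of 10 / 5
sample_sd = stdev(household)       # square root of 10 / 4

print(round(population_sd, 3), round(sample_sd, 3))
```

As n grows, n and n − 1 become close, so the two results converge. This is the reasoning behind using a sample-size threshold as a rule of thumb.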

Page 33: Data Display and Summary

Standard Deviation and Standard Error

• SD is a measure of the variability of the individual observations; the sample SD is an estimate of the population SD.

• SE is a measure of the precision of an estimate of a population parameter. For the sample mean, SE = s / √n.
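To make the distinction concrete, the sketch below computes both quantities for the five exam scores used in the mean deviation example. Treating those scores as a sample is an assumption made for illustration, and the standard error shown is that of the sample mean, s / √n.

```python
from math import sqrt
from statistics import stdev

# Exam scores reused from the mean deviation example, treated as a sample
scores = [92, 75, 95, 90, 98]

sd = stdev(scores)            # spread of the individual scores (n - 1 divisor)
se = sd / sqrt(len(scores))   # precision of the sample mean

print(round(sd, 2), round(se, 2))
```

The SE is always smaller than the SD for n > 1 and shrinks as the sample size grows, whereas the SD does not systematically shrink with more data.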