introduction to biostatistics dr.s.shaffi ahamed asst. professor dept. of family and comm. medicine...
Post on 20-Dec-2015
INTRODUCTION TO BIOSTATISTICS
Dr. S. Shaffi Ahamed, Asst. Professor, Dept. of Family and Comm. Medicine, KKUH
This session covers:
- Origin and development of Biostatistics
- Definition of Statistics and Biostatistics
- Reasons to know about Biostatistics
- Types of data
- Graphical representation of data
- Frequency distribution of data
"Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomenon."
--- Lovitt
Origin and development of statistics in Medical Research
- In 1929, a huge paper on the application of statistics was published in Physiology Journal by Dunn.
- In 1937, 15 articles on statistical methods by Austin Bradford Hill were published in book form.
- In 1948, an RCT of Streptomycin for pulmonary TB was published, in which Bradford Hill had a key influence.
- The use of Statistics in Medicine then grew 8-fold between 1952 and 1982.
Douglas Altman, Ronald Fisher, Karl Pearson, C.R. Rao, Gauss
"BIOSTATISTICS"
(1) Statistics arising out of biological sciences, particularly from the fields of Medicine and public health.
(2) The methods used in dealing with statistics in the fields of medicine, biology and public health for planning, conducting and analyzing data which arise in investigations of these branches.
Reasons to know about biostatistics:
- Medicine is becoming increasingly quantitative.
- The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on statistical methodology.
- Statistics pervades the medical literature.
Example: Evaluation of Penicillin (treatment A) vs Penicillin & Chloramphenicol (treatment B) for treating bacterial pneumonia in children < 2 yrs.
- What is the sample size needed to demonstrate the significance of one group against the other?
- Is treatment A better than treatment B or vice versa?
- If so, how much better?
- What is the normal variation in clinical measurement (mild, moderate & severe)?
- How reliable and valid is the measurement (clinical & radiological)?
- What is the magnitude and effect of laboratory and technical error?
- How does one interpret abnormal values?
CLINICAL MEDICINE
- Documentation of medical history of diseases.
- Planning and conduct of clinical studies.
- Evaluating the merits of different procedures.
- Providing methods for definition of "normal" and "abnormal".
PREVENTIVE MEDICINE
- To provide the magnitude of any health problem in the community.
- To find out the basic factors underlying ill-health.
- To evaluate the health programs which were introduced in the community (success/failure).
- To introduce and promote health legislation.
WHAT DOES STATISTICS COVER?
- Planning
- Design
- Execution (Data collection)
- Data Processing
- Data analysis
- Presentation
- Interpretation
- Publication
HOW CAN A "BIOSTATISTICIAN" HELP?
- Design of study
- Sample size & power calculations
- Selection of sample and controls
- Designing a questionnaire
- Data Management
- Choice of descriptive statistics & graphs
- Application of univariate and multivariate statistical analysis techniques
INVESTIGATION
Data Collection
Data Presentation
- Tabulation
- Diagrams
- Graphs
Descriptive Statistics
- Measures of Location
- Measures of Dispersion
- Measures of Skewness & Kurtosis
Inferential Statistics
- Estimation: Point estimate, Interval estimate
- Hypothesis Testing: Univariate analysis, Multivariate analysis
TYPES OF DATA
- QUALITATIVE DATA
- DISCRETE QUANTITATIVE
- CONTINUOUS QUANTITATIVE
QUALITATIVE
Nominal
Example:
- Sex (M, F)
- Exam result (P, F)
- Blood Group (A, B, O or AB)
- Color of Eyes (blue, green, brown, black)
Ordinal
Example:
- Response to treatment (poor, fair, good)
- Severity of disease (mild, moderate, severe)
- Income status (low, middle, high)
QUANTITATIVE (DISCRETE)
Example:
- The no. of family members
- The no. of heart beats
- The no. of admissions in a day
QUANTITATIVE (CONTINUOUS)
Example: Height, Weight, Age, BP, Serum Cholesterol and BMI
Discrete data -- Gaps between possible values (e.g. Number of Children)
Continuous data -- Theoretically, no gaps between possible values (e.g. Hb)
CONTINUOUS DATA
DISCRETE DATA
Wt. (in kg): underweight, normal & overweight
Ht. (in cm): short, medium & tall
Table 1. Distribution of blunt injured patients according to hospital length of stay

Hospital length of stay    Number    Percent
1 - 3 days                   5891       43.3
4 - 7 days                   3489       25.6
2 weeks                      2449       18.0
3 weeks                       813        6.0
1 month                       417        3.1
More than 1 month             545        4.0
Total                       13604      100.0

Mean = 7.85, SE = 0.10
Scale of measurement
Qualitative variable: a categorical variable
- Nominal (classificatory) scale: gender, marital status, race
- Ordinal (ranking) scale: severity scale, good/better/best
Quantitative variable: a numerical variable (discrete or continuous)
Interval scale: Data are placed in meaningful, ordered intervals. The unit of measurement is arbitrary.
- Temperature: the intervals 37 ºC - 36 ºC and 38 ºC - 37 ºC are equal, but there is no implication of ratio (30 ºC is not twice as hot as 15 ºC)
Ratio scale: Data are presented in a frequency distribution in logical order. A meaningful ratio exists.
- Age, weight, height, pulse rate
- A pulse rate of 120 is twice as fast as 60
- A person with a weight of 80 kg is twice as heavy as one with a weight of 40 kg
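The difference between the interval and ratio scales can be checked numerically. The short Python sketch below is only an illustration: the helper celsius_to_kelvin and the variable names are assumptions introduced here, not part of the original slides. It re-expresses the two temperatures on the kelvin scale, which has a true zero, and shows that the apparent factor of 2 disappears, while a weight ratio stays meaningful.

```python
# Illustrative sketch: ratios are meaningful only on a scale with a true zero.
# celsius_to_kelvin is a hypothetical helper written for this example.

def celsius_to_kelvin(temp_c: float) -> float:
    """Convert degrees Celsius to kelvin (a scale with a true zero)."""
    return temp_c + 273.15

# Interval scale: differences between values are comparable ...
diff_a = 37.0 - 36.0
diff_b = 38.0 - 37.0

# ... but the Celsius ratio 30/15 = 2 does not mean "twice as hot".
naive_ratio = 30.0 / 15.0
kelvin_ratio = celsius_to_kelvin(30.0) / celsius_to_kelvin(15.0)

# Ratio scale: weight has a true zero, so 80 kg really is twice 40 kg.
weight_ratio = 80.0 / 40.0

print(f"Equal intervals: {diff_a} and {diff_b}")
print(f"Celsius ratio: {naive_ratio:.2f}, kelvin ratio: {kelvin_ratio:.3f}")
print(f"Weight ratio: {weight_ratio:.1f}")
```

On the kelvin scale the two temperatures differ by only about 5%, which is why the slide says that 30 ºC is not twice as hot as 15 ºC.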
Scales of Measure
- Nominal: qualitative classification of equal value: gender, race, color, city
- Ordinal: qualitative classification which can be rank ordered: socioeconomic status of families
- Interval: numerical or quantitative data which can be rank ordered and sizes compared: temperature
- Ratio: quantitative interval data along with ratio: time, age
Frequency Distributions
- data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
- simple frequency distributions
- grouped frequency distributions
- midpoint
Tabulate the hemoglobin values of 30 adult male patients listed below

Patient No   Hb (g/dl)    Patient No   Hb (g/dl)    Patient No   Hb (g/dl)
 1           12.0         11           11.2         21           14.9
 2           11.9         12           13.6         22           12.2
 3           11.5         13           10.8         23           12.2
 4           14.2         14           12.3         24           11.4
 5           12.3         15           12.3         25           10.7
 6           13.0         16           15.7         26           12.5
 7           10.5         17           12.6         27           11.8
 8           12.8         18            9.1         28           15.1
 9           13.2         19           12.9         29           13.4
10           11.2         20           14.6         30           13.1
Steps for making a table
Step 1: Find Minimum (9.1) & Maximum (15.7)
Step 2: Calculate difference 15.7 - 9.1 = 6.6
Step 3: Decide the number and width of the classes (7 classes): 9.0 - 9.9, 10.0 - 10.9, ...
Step 4: Prepare dummy table with columns Hb (g/dl), Tally marks, No. of patients
DUMMY TABLE

Hb (g/dl)      Tally marks    No. of patients
9.0 - 9.9
10.0 - 10.9
11.0 - 11.9
12.0 - 12.9
13.0 - 13.9
14.0 - 14.9
15.0 - 15.9
Total

TALLY MARKS TABLE

Hb (g/dl)      Tally marks    No. of patients
9.0 - 9.9      l               1
10.0 - 10.9    lll             3
11.0 - 11.9    lllll l         6
12.0 - 12.9    lllll lllll    10
13.0 - 13.9    lllll           5
14.0 - 14.9    lll             3
15.0 - 15.9    ll              2
Total          -              30
Table. Frequency distribution of 30 adult male patients by Hb

Hb (g/dl)      No. of patients
9.0 - 9.9       1
10.0 - 10.9     3
11.0 - 11.9     6
12.0 - 12.9    10
13.0 - 13.9     5
14.0 - 14.9     3
15.0 - 15.9     2
Total          30
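The four steps can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the Hb values are copied from the patient listing in this session, while the variable names and the trick of using the integer part of each value as its class label are choices made for this example.

```python
# Hb values (g/dl) of the 30 adult male patients, in patient-number order.
hb_values = [
    12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
    11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
    14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1,
]

# Step 1 and Step 2: minimum, maximum and their difference.
lowest, highest = min(hb_values), max(hb_values)
value_range = round(highest - lowest, 1)

# Step 3: seven classes of width 1 g/dl, from 9.0 - 9.9 up to 15.0 - 15.9.
class_starts = list(range(9, 16))

# Step 4: count the patients in each class. int() drops the decimal part,
# so a value such as 12.3 is counted in the class that starts at 12.
counts = {start: 0 for start in class_starts}
for hb in hb_values:
    counts[int(hb)] += 1

print(f"Min {lowest}, Max {highest}, Range {value_range}")
for start in class_starts:
    tally = "|" * counts[start]
    print(f"{start}.0 - {start}.9  {tally:<10}  {counts[start]}")
print(f"Total  {sum(counts.values())}")
```

The counts produced this way (1, 3, 6, 10, 5, 3, 2) agree with the frequency table, and the printed bars play the role of the tally marks.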
Table. Frequency distribution of adult patients by Hb and gender

Hb (g/dl)      Male    Female    Total
<9.0             0        2         2
9.0 - 9.9        1        3         4
10.0 - 10.9      3        5         8
11.0 - 11.9      6        8        14
12.0 - 12.9     10        6        16
13.0 - 13.9      5        4         9
14.0 - 14.9      3        2         5
15.0 - 15.9      2        0         2
Total           30       30        60
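The row and column totals of a two-way table like this one are simple sums, and recomputing them is a quick consistency check. The sketch below uses the male and female counts from the table; the list names are illustrative only.

```python
# Class labels and counts copied from the Hb-by-gender table.
hb_classes = [
    "<9.0", "9.0 - 9.9", "10.0 - 10.9", "11.0 - 11.9",
    "12.0 - 12.9", "13.0 - 13.9", "14.0 - 14.9", "15.0 - 15.9",
]
male = [0, 1, 3, 6, 10, 5, 3, 2]
female = [2, 3, 5, 8, 6, 4, 2, 0]

# Row totals: males plus females within each Hb class.
row_totals = [m + f for m, f in zip(male, female)]

# Column totals and the grand total.
male_total, female_total = sum(male), sum(female)
grand_total = sum(row_totals)

for label, m, f, t in zip(hb_classes, male, female, row_totals):
    print(f"{label:<12} {m:>4} {f:>6} {t:>6}")
print(f"{'Total':<12} {male_total:>4} {female_total:>6} {grand_total:>6}")
```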
Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes
Number - Table number for identification in a report
Title, place - Describes the body of the table, variables, time period (what, how classified, where and when)
Column heading - Variable name, No., percentages (%), etc.
Foot-note(s) - Describe some column/row headings, special cells, source, etc.
Table II. Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976

Death rate (/1000 per annum)    No. of divisions
7.0 - 7.9                         4 (3.3)
8.0 - 8.9                        13 (10.8)
9.0 - 9.9                        20 (16.7)
10.0 - 10.9                      27 (22.5)
11.0 - 11.9                      18 (15.0)
12.0 - 12.9                      11 (9.2)
13.0 - 13.9                      11 (9.2)
14.0 - 14.9                       6 (5.0)
15.0 - 15.9                       2 (1.7)
16.0 - 16.9                       4 (3.3)
17.0 - 18.9                       3 (2.5)
19.0 +                            1 (0.8)
Total                           120 (100.0)

Figures in parentheses indicate percentages
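The percentage column of such a table can be recomputed from the counts, which is a simple way to check it before publication. The sketch below uses the division counts from Table II; the dictionary and variable names are illustrative.

```python
# Number of divisions in each death-rate class, copied from Table II.
division_counts = {
    "7.0 - 7.9": 4, "8.0 - 8.9": 13, "9.0 - 9.9": 20, "10.0 - 10.9": 27,
    "11.0 - 11.9": 18, "12.0 - 12.9": 11, "13.0 - 13.9": 11, "14.0 - 14.9": 6,
    "15.0 - 15.9": 2, "16.0 - 16.9": 4, "17.0 - 18.9": 3, "19.0 +": 1,
}

total = sum(division_counts.values())

# Percentage of all divisions in each class, rounded to one decimal place.
percentages = {
    label: round(100 * count / total, 1)
    for label, count in division_counts.items()
}

for label, count in division_counts.items():
    print(f"{label:<12} {count:>4} ({percentages[label]})")
print(f"{'Total':<12} {total:>4}")
```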
DIAGRAMS/GRAPHS
Discrete data
--- Bar charts (one or two groups)
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot
Example data
68 63 42 27 30 36 28 32
79 27 22 28 24 25 44 65
43 25 74 51 36 42 28 31
28 25 45 12 57 51 12 32
49 38 42 27 31 50 38 21
16 24 64 47 23 22 43 27
49 28 23 19 11 52 46 31
30 43 49 12
Histogram
[Figure 1. Histogram of ages of 60 subjects. X-axis: Age (tick labels 11.5 to 71.5); Y-axis: Frequency (0 to 20)]
Polygon
[Figure: frequency polygon of the same ages. X-axis: Age (tick labels 11.5 to 71.5); Y-axis: Frequency (0 to 20)]
Stem and leaf plot
Stem-and-leaf of Age N = 60
Leaf Unit = 1.0
6 1 122269
19 2 1223344555777788888
(11) 3 00111226688
13 4 2223334567999
5 5 01127
4 6 3458
2 7 49
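A stem-and-leaf display of two-digit values can be built by splitting each value into its tens digit (the stem) and its units digit (the leaf). The sketch below is one possible way to do this in Python for the 60 ages in the example data; the function name stem_and_leaf is an assumption made for this illustration, and it does not reproduce the count column of the display above.

```python
from collections import defaultdict

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12,
]

def stem_and_leaf(values):
    """Group values by tens digit (stem); the units digits are the leaves."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in leaves_by_stem.items()}

plot = stem_and_leaf(ages)
for stem in sorted(plot):
    print(f"{stem} | {plot[stem]}")
```

Because the values are sorted first, the leaves on each row come out in increasing order, as in the display above.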
Box plot
[Figure: box plot of Age. Y-axis: Age (10 to 80)]
Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
  positive skew: mean > median & high-score whisker is longer
  negative skew: mean < median & low-score whisker is longer
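The quantities a box plot reports can be computed directly. The sketch below applies Python's standard statistics module to the 60 ages from the example data. Note the assumption: quartiles can be defined in several slightly different ways, and statistics.quantiles is used here with its default "exclusive" method, so other software may give slightly different lower and upper quartiles.

```python
import statistics

# Ages of the 60 subjects from the example data.
ages = [
    68, 63, 42, 27, 30, 36, 28, 32, 79, 27, 22, 28, 24, 25, 44, 65,
    43, 25, 74, 51, 36, 42, 28, 31, 28, 25, 45, 12, 57, 51, 12, 32,
    49, 38, 42, 27, 31, 50, 38, 21, 16, 24, 64, 47, 23, 22, 43, 27,
    49, 28, 23, 19, 11, 52, 46, 31, 30, 43, 49, 12,
]

minimum, maximum = min(ages), max(ages)

# Quartiles using the default "exclusive" method (an assumption; see above).
lower_quartile, median, upper_quartile = statistics.quantiles(ages, n=4)
mean = statistics.mean(ages)

# Rule from the list above: compare the mean with the median.
if mean > median:
    skew = "positive"
elif mean < median:
    skew = "negative"
else:
    skew = "none"

print(f"Min {minimum}, Q1 {lower_quartile}, Median {median}, "
      f"Q3 {upper_quartile}, Max {maximum}")
print(f"Mean {mean:.2f}, skew: {skew}")
```

For these ages the mean is larger than the median, so the rule above points to a positive skew.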
[Figure: pie chart of the prevalence of different degrees of hypertension in the population. Categories: Mild, Moderate, Severe; slice labels: 10%, 20%, 70%]
Pie Chart
• Circular diagram: total is 100%
• Divided into segments, each representing a category
• Decide adjacent category
• The size of each slice is proportional to the amount for its category
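The proportionality rule can be made concrete by converting each share into the angle of its slice, since the whole circle of 360 degrees stands for 100%. The sketch below uses the three percentages shown on the hypertension pie chart (10%, 20% and 70%); the function name slice_angles is an assumption made for this illustration.

```python
def slice_angles(shares):
    """Return the central angle in degrees of each pie slice."""
    total = sum(shares)
    return [360 * share / total for share in shares]

# Percentages shown on the hypertension pie chart.
shares = [10, 20, 70]
angles = slice_angles(shares)

for share, angle in zip(shares, angles):
    print(f"{share}% of the total -> {angle:.0f} degrees")
print(f"Sum of angles: {sum(angles):.0f} degrees")
```

Dividing by the total inside the function means the same code works whether the inputs are percentages or raw counts.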
Bar Graphs
[Figure: bar graph of the distribution of risk factors among cases with cardiovascular diseases. X-axis: Risk factor (Smo, Alc, Chol, DM, HTN, NoExer, F-H); Y-axis: Number (0 to 25); bar labels: 9, 12, 20, 16, 12, 8, 20]
- The height of each bar indicates frequency
- Frequency on the Y axis and categories of the variable on the X axis
- The bars should be of equal width and should not touch the other bars
HIV cases enrollment in USA by gender
[Figure: bar chart. X-axis: Year (1986 to 1992); Y-axis: Enrollment (hundred), 0 to 12; series: Men, Women]
Bar chart
HIV cases enrollment in USA by gender
[Figure: stacked bar chart. X-axis: Year (1986 to 1992); Y-axis: Enrollment (Thousands), 0 to 18; series: Women, Men]
Stacked bar chart
Graphic Presentation of Data
the histogram (quantitative data)
the bar graph (qualitative data)
the frequency polygon (quantitative data)
General rules for designing graphs
- A graph should have a self-explanatory legend
- A graph should help the reader to understand the data
- Axes labeled, units of measurement indicated
- Scales are important. Start with zero (otherwise use a // break)
- Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)
Any Questions?