slide 1+2 - introduction_lecture

Upload: cwt2010

Post on 04-Apr-2018


  • 7/29/2019 Slide 1+2 - Introduction_Lecture

    1/37

    BIOSTATISTICS

    M391

    By Dr. Atallah Z. Rabi


    Biostatistics

    (a word made from biology and statistics)

    The application of statistics to a wide range of topics in biology.

    http://en.wikipedia.org/wiki/Statistics
    http://en.wikipedia.org/wiki/Biology

    Biostatistics

    It is the science that deals with the development and application of the most appropriate methods for:

    Collection of data.

    Presentation of the collected data.

    Analysis and interpretation of the results.

    Making decisions on the basis of such analysis.


    Why Statistics?

    You should not ignore it. It is too useful.

    You cannot fight it. Everyone else uses it.

    It gives the right answer 95% of the time.

    But you do not know which 95%.

    It is great fun. Trust me.


    Why Statistics? Examples

    Which drugs should be allowed on the market?

    What Public Health programs should be pursued?

    What programs would reduce infant mortality?

    Are cell phones a good idea for drivers?

    Is it a good idea for post-menopausal women to

    take estrogen?


    Role of statisticians

    To guide the design of an experiment or survey prior to data collection

    To analyze data using proper statistical

    procedures and techniques

    To present and interpret the results to researchers

    and other decision makers


    Definitions (cont.)

    Statistics is the science and art of collecting, summarizing, and analyzing data that are subject to random variation (Last, 1995).

    Biostatistics is the application of statistics to biological problems.

    Data refers to a collection of items of information.

    A variable is any quantity that varies. It is any attribute, phenomenon, or event that can have different values.


    What is data?

    Data are numbers. Numbers result from:

    Measurement (body temperature, body weight)

    Counting (number of patients admitted)


    Most data fall into two broad classes.

    Continuous data are used to report a measurement of the individual that can take on any value within an acceptable range. For example: age, systolic BP, [K+], change in weight over 6 months.

    Categorical data are used to report a characteristic of the individual that has a finite, usually small, number of possibilities. The categories should be clear-cut, not overlapping, and cover all the possibilities. For example: sex (male or female), vital status (alive or dead), disease stage (depends on disease), ever smoked (yes or no).

    Make sure you are very clear about the definitions. Does "one cigarette and I didn't inhale" count as smoking?

    When designing a study, allow for missing values and refusals.
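
    A minimal sketch of how the two classes of data might be recorded, assuming Python. The PatientRecord type, its field names, and the example values are hypothetical and are not taken from the lecture. None stands for a missing value or a refusal.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    # Hypothetical record layout, for illustration only.
    age_years: Optional[float]     # continuous: any value within an acceptable range
    systolic_bp: Optional[float]   # continuous
    ever_smoked: Optional[str]     # categorical: "yes" or "no"; None = missing or refused


records = [
    PatientRecord(54.0, 132.0, "yes"),
    PatientRecord(61.5, None, "no"),
    PatientRecord(47.2, 118.0, None),
]

# Tabulate the categorical variable, keeping missing values as their own group.
smoking_counts = Counter(
    r.ever_smoked if r.ever_smoked is not None else "missing" for r in records
)

# Summarize a continuous variable using only the values that were recorded.
bp_values = [r.systolic_bp for r in records if r.systolic_bp is not None]
mean_bp = sum(bp_values) / len(bp_values)
```

    Counting missing values explicitly, rather than dropping them silently, makes it easier to report how many individuals each summary is based on.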


    All biostatistics begins with description. Before you do anything else, you look at the data and summarize the data. Our goal in this hour is to show you how to get a first look at the data and get ready to do more elaborate procedures. A statistic is just a numerical summary of the data, like the largest number in the data set.

    Descriptive statistics should be clear and easily interpreted. They

    should not mislead you about the data they are summarizing.


    Common terms used in statistics

    Population

    Sample

    Variables

    Measurements

    Statistical Inference

    Simple random sample


    Population

    A population is the largest collection of entities in which we have an interest at a particular time (e.g., the weights of all newborn babies in a hospital).

    A population of values is the largest collection of values of a random variable in which we have an interest at a particular time.

    Finite population (consists of a fixed number of values)

    Infinite population (consists of an endless succession of values)


    Sample

    A sample is a part of a population (e.g., the weights of some selected newborn babies).

    There are different types of samples.

    There are different types of sampling techniques.


    Populations, Samples, and Individuals

    Aristotle speculated about the population of all women (compared to the population of men). He had immediately available to him a sample of two women, and he could have counted the number of teeth for two individuals.

    The population is the collection of all people about whom you would like to ask a research question. This might be a fairly clear-cut, easily defined set of people:

    What proportion of people 65 or older in the US today have Alzheimer's disease?

    Or it might be a more hypothetical group:

    How much of a reduction in symptomatic days could a

    person expect if treated with a new antiviral for flu?


    Typically, you can't study everyone in the population.

    You can't afford to have everyone 65 or older in the US seen by a neurologist, even if you could find all the old people!

    You can't test everyone with the flu because the cases haven't even occurred yet!

    So you study a sample, and you try to generalize to the population. The sample size is the number of individuals in the sample (not the number of measurements you make on each person!).

    A good study design will help make your sample

    representative of the population you are concerned about.

    Good statistical analysis will help tell you the best answer to

    your question about the population, and also how far off you

    might be.
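
    As a rough illustration of studying a sample instead of the whole population, the sketch below draws a simple random sample in Python. The function name simple_random_sample, the population of 500 ID numbers, and the sample size of 25 are assumptions made for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n individuals without replacement; every individual has the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical population: ID numbers of 500 newborn babies in a hospital.
population_ids = list(range(1, 501))

sample_ids = simple_random_sample(population_ids, n=25, seed=1)

# The sample size is the number of individuals, not the number of measurements per individual.
sample_size = len(sample_ids)
```

    Random selection is only one part of a good study design; it does not by itself deal with refusals or missing values.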


    Types of data


    Variables

    A variable is a characteristic that takes different values in different persons, places, or things.

    Examples of variables: diastolic blood pressure, heart rate, height of adult males, weight of newborn babies, ages of patients.


    Numerical presentation

    Graphical presentation

    Mathematical presentation


    1- Numerical presentation

    Tabular presentation (simple or complex)

    Name of variable (units of variable)    Frequency    %
    -
    - Categories
    -
    Total


    Age (years)    Frequency    %
    20-


    Complex frequency distribution Table

    Table (III): Distribution of 20 lung cancer patients at the chest department of KAUH and 40 controls in May 2011 according to smoking

                      Lung cancer
    Smoking        Cases           Control         Total
                   No.    %        No.    %        No.    %
    Smoker         15     75       8      20       23     38.33
    Nonsmoker      5      25       32     80       37     61.67
    Total          20     100      40     100      60     100


    Complex frequency distribution Table

    Table (IV): Distribution of 60 patients at the chest department of KAUH in May 2011 according to smoking & lung cancer

                      Lung cancer
    Smoking        Positive        Negative        Total
                   No.    %        No.    %        No.    %
    Smoker         15     65.2     8      34.8     23     100
    Nonsmoker      5      13.5     32     86.5     37     100
    Total          20     33.3     40     66.7     60     100
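
    Tables (III) and (IV) use the same counts but different denominators: Table (III) gives percentages of each column total, Table (IV) percentages of each row total. The Python sketch below reproduces both sets of percentages. The dictionary layout and the function names are assumptions made for the example.

```python
# Counts taken from the smoking and lung cancer tables.
counts = {
    "Smoker":    {"cases": 15, "controls": 8},
    "Nonsmoker": {"cases": 5,  "controls": 32},
}


def column_percentages(table):
    """Percent of each column total (the layout of Table III)."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        name: {col: round(100 * n / col_totals[col], 1) for col, n in row.items()}
        for name, row in table.items()
    }


def row_percentages(table):
    """Percent of each row total (the layout of Table IV)."""
    return {
        name: {col: round(100 * n / sum(row.values()), 1) for col, n in row.items()}
        for name, row in table.items()
    }


col_pct = column_percentages(counts)  # e.g. 75% of cases are smokers
row_pct = row_percentages(counts)     # e.g. 65.2% of smokers are cases
```

    Which denominator is appropriate depends on the question being asked, so a table title should make clear which total the percentages refer to.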


    Line Graph

    [Figure: line graph of MMR/1000 (y-axis, 0 to 60) by year (x-axis, 1960 to 2000)]

    Year    MMR
    1960    50
    1970    45
    1980    26
    1990    15
    2000    12


    Frequency polygon

    Age (years)    Sex                        Mid-point of interval
                   Males        Females
    20 -           3 (12%)      2 (10%)       (20+30) / 2 = 25
    30 -           9 (36%)      6 (30%)       (30+40) / 2 = 35
    40 -           7 (28%)      5 (25%)       (40+50) / 2 = 45
    50 -           4 (16%)      3 (15%)       (50+60) / 2 = 55
    60 - 70        2 (8%)       4 (20%)       (60+70) / 2 = 65
    Total          25 (100%)    20 (100%)
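
    The points of a frequency polygon are the class mid-points paired with the class percentages. A short Python sketch of that calculation follows, using the counts from the table. The variable and function names are assumptions made for the example. Computed from the counts, 7 of 25 males in the 40-50 class is 28%.

```python
# Age classes as (lower limit, upper limit) with the counts from the table.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
males = [3, 9, 7, 4, 2]
females = [2, 6, 5, 3, 4]


def midpoints(intervals):
    """Mid-point of each class: (lower limit + upper limit) / 2."""
    return [(lo + hi) / 2 for lo, hi in intervals]


def percentages(freqs):
    """Each frequency as a whole-number percent of the column total."""
    total = sum(freqs)
    return [round(100 * f / total) for f in freqs]


mid = midpoints(classes)
male_pct = percentages(males)
female_pct = percentages(females)

# Points to plot for each sex: (mid-point, percent).
male_points = list(zip(mid, male_pct))
female_points = list(zip(mid, female_pct))
```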


    Frequency polygon

    Age        Sex                  M-P
               M         F
    20-        (12%)     (10%)      25
    30-        (36%)     (30%)      35
    40-        (28%)     (25%)      45
    50-        (16%)     (15%)      55
    60-70      (8%)      (20%)      65

    [Figure: frequency polygon of % (y-axis, 0 to 40) by age mid-point (x-axis, 25 to 65), with one line for males and one for females]


    Histogram

    [Figure: histogram of % (y-axis, 0 to 35) by age in years (x-axis)]


    Bar chart

    [Figure: bar chart of % (y-axis, 0 to 45) by marital status (x-axis: Single, Married, Divorced, Widowed)]


    1- Measures of central tendency (averages)


    Measures of central tendency

    Midrange

    Midrange = (Smallest observation + Largest observation) / 2

    Mode

    The value which occurs with the greatest frequency, i.e., the most common value.


    Measures of central tendency (cont.)

    Median

    The observation which lies in the middle of the ordered observations.

    Arithmetic mean (mean)

    Mean = Sum of all observations / Number of observations
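
    The four measures of central tendency can be computed with Python's standard statistics module, as in the sketch below. The ages are made-up values chosen for illustration, and midrange is a helper written for this example because the module does not provide one.

```python
import statistics


def midrange(values):
    """(Smallest observation + largest observation) / 2."""
    return (min(values) + max(values)) / 2


# Hypothetical ages (years) of nine patients.
ages = [23, 25, 25, 31, 34, 38, 41, 45, 62]

age_midrange = midrange(ages)          # (23 + 62) / 2
age_mode = statistics.mode(ages)       # value with the greatest frequency
age_median = statistics.median(ages)   # middle of the ordered observations
age_mean = statistics.mean(ages)       # sum of observations / number of observations
```

    In this made-up data set the single large value (62) pulls the midrange and the mean above the median, which is one reason to look at more than one measure.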


    Measures of dispersion

    Range

    Variance

    Standard deviation

    Semi-interquartile range

    Coefficient of variation

    Standard error
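
    A possible way to compute several of these measures in Python is sketched below. The slides list the measures without formulas, so the definitions used here are assumptions: the variance is the sample variance (divisor n - 1), the quartiles use the default method of statistics.quantiles, and the coefficient of variation is the standard deviation expressed as a percentage of the mean. The blood pressure values are made up.

```python
import statistics

# Hypothetical systolic blood pressures (mmHg) of eight patients.
sbp = [110, 115, 120, 120, 125, 130, 135, 145]

data_range = max(sbp) - min(sbp)             # largest minus smallest observation
variance = statistics.variance(sbp)          # sample variance, divisor n - 1
std_dev = statistics.stdev(sbp)              # square root of the sample variance

q1, _, q3 = statistics.quantiles(sbp, n=4)   # quartiles, default "exclusive" method
semi_iqr = (q3 - q1) / 2                     # half the interquartile range

coef_var = 100 * std_dev / statistics.mean(sbp)   # SD as a percent of the mean
```

    Other quartile conventions exist and can give slightly different values for the semi-interquartile range on small data sets.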


    Standard deviation SD


    Standard error of mean SE

    A measure of variability among the means of samples selected from a certain population.

    SE(Mean) = SD / √n

    where SD is the standard deviation and n is the sample size.
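
    The standard error of the mean is usually computed as the standard deviation divided by the square root of the sample size. A small Python sketch follows, assuming the sample standard deviation (divisor n - 1). The function name standard_error and the birth weights are made up for the example.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: SD / sqrt(n), using the sample SD (divisor n - 1)."""
    n = len(values)
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical birth weights (kg) of nine newborn babies.
weights = [2.8, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.8]

se = standard_error(weights)
```

    Because n appears under the square root, quadrupling the sample size roughly halves the standard error when the spread of the data stays the same.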


    Statistical Inference

    Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample that has been drawn from that population.


    Measurement

    Measurement is defined as the assignment of numbers to objects or events according to a set of rules.

    Measurement has different scales:

    Nominal scale (e.g., male/female, well/sick; categories are mutually exclusive and collectively exhaustive)

    Ordinal scale (observations can be ranked, e.g., low, medium, and high economic status)

    Interval scale (the distance between any two measurements is known)

    Ratio scale (e.g., height, weight, and length; there is a true zero point)