lecture 1+2, introduction lesson (slide)

Upload: justden09

Post on 06-Apr-2018


  • 8/3/2019 Lecture 1+2, Introduction Lesson (Slide)

    1/25

    BIOSTATISTICS

    M391

    By Dr. Atallah Z. Rabi


    2/25

    Definitions

    Statistics is a field of study concerned with:

    The collection, organization, summarization, and analysis of data, and

    The drawing of inferences about a body of data when only a part of the data is observed.

    Biostatistics: the tools of statistics used in the biological sciences and medicine.


    3/25

    Definitions (cont.)

    Statistics is the science and art of collecting, summarizing, and analyzing data that are subject to random variation (Last, 1995).

    Biostatistics is the application of statistics to biological problems.

    Data refers to a collection of items of information.

    A variable is any quantity that varies. It is any attribute, phenomenon, or event that can have different values.


    4/25

    Sources of Data

    Data are the raw material of statistics and are used to answer a question. Sources of data are:

    Routinely kept records (hospital medical records)

    Surveys (information about patients' mode of transportation)

    Experiments (best strategy for patient compliance)

    External sources (published reports)


    5/25

    Common terms used in statistics

    Population

    Sample

    Variables

    Measurements

    Statistical Inference

    Simple random sample


    6/25

    Population

    A population is the largest collection of entities in which we have an interest at a particular time (e.g., the weights of all newborn babies in a hospital).

    A population of values is the largest collection of values of a random variable in which we have an interest at a particular time.

    Finite population (consists of a fixed number of values)

    Infinite population (consists of an endless succession of values)


    7/25

    Sample

    A sample is a part of a population (e.g., the weights of some selected newborn babies).

    There are different types of samples

    There are different types of sampling

    techniques


    8/25

    Variables

    A variable is a characteristic that takes

    different values in different persons, places,

    or things.

    Examples of variables: diastolic blood pressure, heart rate, height of adult males, weight of newborn babies, ages of patients.


    9/25

    Types of variables

    Quantitative variables (weight, height, age; they convey information regarding amount)

    Qualitative variables (sick, diabetic; they convey information regarding an attribute)

    Random variable

    Discrete random variable (number of daily admissions; represented by whole numbers)

    Continuous random variable (height, weight, skull circumference)


    10/25

    Measurement

    Measurement is defined as the assignment of numbers to objects or events according to a set of rules.

    Measurement has different scales:

    Nominal scale (male/female, well/sick; categories are mutually exclusive and collectively exhaustive)

    Ordinal scale (observations can be ranked: low, medium, and high economic status)

    Interval scale (the distance between two measurements is known)

    Ratio scale (height, weight, and length; there is a true zero point)


    11/25

    Statistical Inference

    Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample that has been drawn from that population.


    12/25

    Simple random sample

    If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same chance of being selected, the sample is called a simple random sample.
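
    The definition above can be illustrated with a minimal Python sketch. The population of ten newborn ID numbers, the sample size of 3, and the seed are all hypothetical values chosen for illustration; they do not come from the lecture.

```python
import math
import random

# Hypothetical population: ID numbers for N = 10 newborns (illustrative only).
population = list(range(1, 11))
N = len(population)
n = 3  # hypothetical sample size

# Number of distinct samples of size n that could be drawn. Under simple
# random sampling each of them has the same chance, 1 / C(N, n).
num_possible_samples = math.comb(N, n)

# random.sample draws n distinct items without replacement, so every
# subset of size n is equally likely to be returned.
rng = random.Random(2018)  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, n)

print(f"possible samples of size {n}: {num_possible_samples}")
print(f"chance of any one sample: {1 / num_possible_samples:.4f}")
print(f"selected IDs: {sorted(sample)}")
```

    With N = 10 and n = 3 there are 120 possible samples, so each one has a 1 in 120 chance of being the sample drawn.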


    13/25

    What is data?

    Data are numbers. Numbers result from:

    Measurement (body temperature, body weight)

    Counting (number of patients admitted)


    14/25


    15/25

    Why Statistics? Examples

    Which drugs should be allowed on the market?

    What Public Health programs should be pursued?

    What programs would reduce infant mortality?

    Are cell phones a good idea for drivers?

    Is it a good idea for post-menopausal women to

    take estrogen?


    16/25

    Probability and Statistics

    Probability generalizes the concept of replicability.

    Statistics are often used for decisions about specific (non-replicable) situations.

    These decisions are often made in the context of what is likely to happen in that specific situation.

    Probability (likelihood) reflects our belief about replicability.


    17/25

    Populations, Samples, and Individuals

    Aristotle speculated about the population of all women (compared to the population of men). He had immediately available to him a sample of two women, and he could have counted the number of teeth for two individuals.

    The population is the collection of all people about whom you would like to ask a research question. This might be a fairly clear-cut, easily defined set of people:

    What proportion of people 65 or older in the US today have Alzheimer's disease?

    Or it might be a more hypothetical group:

    How much of a reduction in symptomatic days could a person expect if treated with a new antiviral for flu?


    18/25

    Typically, you can't study everyone in the population.

    You can't afford to have everyone 65 or older in the US seen by a neurologist, even if you could find all the old people!

    You can't test everyone with the flu because the cases haven't even occurred yet!

    So you study a sample, and you try to generalize to the population. The sample size is the number of individuals in the sample (not the number of measurements you make on each person!).

    A good study design will help make your sample representative of the population you are concerned about.

    Good statistical analysis will help tell you the best answer to your question about the population, and also how far off you might be.


    19/25

    Looking at data: categorical or continuous?

    Most data fall into two broad classes.

    Continuous data are used to report a measurement of the individual that can take on any value within an acceptable range. For example: age, systolic BP, [K+], change in weight over 6 months.

    Categorical data are used to report a characteristic of the individual that has a finite, usually small, number of possibilities. The categories should be clear-cut, not overlapping, and cover all the possibilities. For example: sex (male or female), vital status (alive or dead), disease stage (depends on disease), ever smoked (yes or no).

    Make sure you are very clear about the definitions. Does "one cigarette and I didn't inhale" count as smoking?

    When designing a study, allow for missing values and refusals.
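
    As a rough sketch of the distinction above, a continuous variable is typically summarized with a number such as a mean, while a categorical variable is summarized by counting each category. The four records below are hypothetical, and using None to mark a missing value or a refusal is one possible convention, not something the lecture prescribes.

```python
from collections import Counter
from statistics import mean

# Hypothetical records; None marks a missing value or a refusal.
records = [
    {"age": 67, "ever_smoked": "yes"},
    {"age": 72, "ever_smoked": "no"},
    {"age": None, "ever_smoked": "no"},
    {"age": 80, "ever_smoked": None},
]

# Continuous variable: summarize the observed (non-missing) values.
ages = [r["age"] for r in records if r["age"] is not None]
mean_age = mean(ages)

# Categorical variable: count each category and keep "missing" as its
# own group so the categories still cover every record.
smoking_counts = Counter(
    r["ever_smoked"] if r["ever_smoked"] is not None else "missing"
    for r in records
)

print(f"mean age of {len(ages)} observed values: {mean_age}")
print(dict(smoking_counts))
```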


    20/25

    All biostatistics begins with description. Before you do anything else, you look at the data and summarize the data. Our goal in this hour is to show you how to get a first look at the data and get ready to do more elaborate procedures. A statistic is just a numerical summary of the data, like the largest number in the data set.

    Descriptive statistics should be clear and easily interpreted. They should not mislead you about the data they are summarizing.


    21/25

    Measures of central tendency

    Measures of central tendency tell you in some sense where you might expect a typical person to be, in the middle of the data.

    The mean is the arithmetic average. For example, if 3 people were in hospital 8, 10, and 30 days respectively, the mean time is 48/3 = 16 days.

    The median is the value at which half the numbers are higher and half are lower. If the number of individuals n is odd, it is the middle value (rank (n+1)/2); if n is even, it is the average of the two middle values.

    The mode is the most common value; it is rarely used.
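
    The three measures can be checked with Python's standard statistics module. The sketch below uses the hospital-stay example from the slide (8, 10, and 30 days); the fourth stay of 12 days and the data set with a repeated value are hypothetical additions, included only to show the even-n median rule and a mode.

```python
from statistics import mean, median, multimode

# Hospital stays in days, taken from the slide's example.
stays = [8, 10, 30]

mean_stay = mean(stays)      # (8 + 10 + 30) / 3 = 48 / 3 = 16
median_stay = median(stays)  # n = 3 is odd: rank (3 + 1) / 2 = 2, value 10

# Hypothetical fourth patient (12 days) to illustrate the even-n rule:
# sorted values are 8, 10, 12, 30, so the median is (10 + 12) / 2 = 11.
stays_even = [8, 10, 30, 12]
median_even = median(stays_even)

# Hypothetical data with a repeated value, since the slide's three
# stays are all different and so have no single most common value.
modes = multimode([8, 10, 10, 30])

print(mean_stay, median_stay, median_even, modes)
```

    In the slide's example the mean (16 days) is well above the median (10 days) because the single 30-day stay pulls the average upward.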


    22/25

    Mean Calculation
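
    For reference, the arithmetic mean of n observations is conventionally written as below. The symbols x_i and x-bar are standard notation assumed here, not taken from the slide; the worked numbers are the hospital-stay example given earlier in the lecture.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad\text{for example}\qquad
\bar{x} = \frac{8 + 10 + 30}{3} = \frac{48}{3} = 16 \text{ days}
```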


    23/25

    Measures of dispersion

    Range

    Variance

    Standard deviation
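
    A minimal Python sketch of these measures of dispersion, reusing the hospital-stay example (8, 10, and 30 days) from the earlier slide. The slide does not say whether the sample or population definition of variance is intended; the sketch assumes the sample definition (divisor n - 1) and shows the population version (divisor n) for comparison.

```python
from statistics import pvariance, stdev, variance

# Hospital stays in days, from the lecture's earlier example (mean = 16).
stays = [8, 10, 30]

# Range: largest value minus smallest value.
data_range = max(stays) - min(stays)  # 30 - 8 = 22

# Squared deviations from the mean: 64 + 36 + 196 = 296.
# Sample variance divides by n - 1 (assumed definition): 296 / 2 = 148.
sample_var = variance(stays)

# Standard deviation is the square root of the variance, in days.
sample_sd = stdev(stays)

# Population variance divides by n instead: 296 / 3.
pop_var = pvariance(stays)

print(f"range: {data_range} days")
print(f"sample variance: {sample_var}")
print(f"sample standard deviation: {sample_sd:.2f} days")
print(f"population variance: {pop_var:.2f}")
```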


    24/25


    25/25

    [Figure: line histogram showing the distribution of HR in women, titled "Women's HR at 1 ltr O2/mi". x-axis: heart rate, 105 to 175 in steps of 5; y-axis: 0 to 14 in steps of 2.]