

Post on 08-Feb-2021


  • Samples and populations

    Estimating with uncertainty

    Review - order of operations

    $s^2 = \frac{n}{n-1}\left(\frac{\sum_{i=1}^{n} Y_i^2}{n} - \bar{Y}^2\right)$

    Review - order of operations

    1. Parentheses

    2. Exponents and roots

    3. Multiply and divide

    4. Add and subtract
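    The four rules can be traced through the variance formula from this review. The sketch below is only an illustration: the function name and the five data values are made up, and each step is written out separately so the order of evaluation is visible.

```python
import statistics


def sample_variance_shortcut(values):
    """s^2 = n/(n-1) * (sum(Y_i^2)/n - Ybar^2), evaluated from the inside out."""
    n = len(values)
    # Exponents first (square each Y_i), then add, then divide by n.
    mean_of_squares = sum(y ** 2 for y in values) / n
    # Parentheses first (compute the mean), then the exponent.
    square_of_mean = (sum(values) / n) ** 2
    # The subtraction inside the large parentheses comes next.
    inner = mean_of_squares - square_of_mean
    # Finally multiply by n / (n - 1).
    return n / (n - 1) * inner


data = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical measurements
s2 = sample_variance_shortcut(data)
# Cross-check against the standard library's sample variance.
reference = statistics.variance(data)
```

    Comparing `s2` with `reference` is a quick way to confirm that the operations were applied in the right order.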


    Review - types of variables

    • Categorical variables

    – For example, country of birth

    – Nominal or ordinal

    • Numerical variables

    – For example, student height

    – Discrete or continuous

    Review - types of variables

    • Categorical variables

    – Nominal - no natural order

    • Example - country of birth

    – Ordinal - can be placed in an order

    • Example - educational experience

    – Some high school, high school diploma, some college, college degree, master's degree, PhD

    Sampling from a population

    • We often sample from a population

    • Consider random samples

    – Each individual has an equal and independent probability of being selected
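    One way to draw such a sample on a computer is sketched below. It assumes the individuals are simply numbered 1 to 400 (a hypothetical labelling) and uses a fixed seed only so the result can be reproduced. `random.sample` draws without replacement, so every individual has the same chance of ending up in the sample.

```python
import random

rng = random.Random(42)  # arbitrary seed, for reproducibility only

# Hypothetical labels for a population of 400 individuals.
population_ids = list(range(1, 401))

# Draw 10 distinct individuals; each ID is equally likely to be chosen.
sample_ids = rng.sample(population_ids, k=10)
```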

  • Body mass of 400 humans

    Random sample of 10 people

  • Population mean: µ = 70.8 kg

    Sample mean: x̄ = 76.7 kg

    Another sample…

  • Population mean: µ = 70.8 kg

    Sample mean: x̄ = 69.2 kg

    What if we do this many times?

    Example: gene length

    n = 20,290, µ = 2622.0, σ = 2037.9

    Sample histogram

    n = 100, Ȳ = 2675.4, s = 1539.2

    More samples:

    Ȳ = 2588.8, s = 1620.5

    Ȳ = 2702.4, s = 1727.1

    Ȳ = 2767.2, s = 2044.7

    Sampling distribution of the mean

    1000 samples

    µ = 2622.0

    Mean of means: 2626.4

    Sampling distribution of the standard deviation

  • 100 samples: population σ = 2036.9, mean sample s = 1962.6

    1000 samples: population σ = 2036.9, mean sample s = 1929.7
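    The repeated-sampling idea can be tried with a short simulation. This is a sketch under stated assumptions: the population below is synthetic (a right-skewed exponential distribution picked only for illustration), not the gene-length data from the slides, and the seed, population size, and helper name are arbitrary.

```python
import random
import statistics

rng = random.Random(1)  # arbitrary seed, for reproducibility only

# Synthetic right-skewed population (an assumption, not the gene-length data).
population = [rng.expovariate(1 / 2600) for _ in range(20000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)


def draw_sample_stats(n):
    """Draw one random sample of size n; return its mean and standard deviation."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample), statistics.stdev(sample)


# 1000 samples of n = 100, as on the slides.
results = [draw_sample_stats(100) for _ in range(1000)]
mean_of_means = statistics.fmean(m for m, _ in results)
mean_of_sds = statistics.fmean(s for _, s in results)
```

    In runs like this, `mean_of_means` should land close to `mu`, while `mean_of_sds` tends to fall somewhat below `sigma`, which matches the pattern in the numbers above.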

    Sampling distribution of the mean, n = 10

    Sampling distribution of the mean, n = 100

    Sampling distribution of the mean, n = 1000
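    The effect of sample size on the sampling distribution can be checked numerically. The sketch below again uses a synthetic, right-skewed population (an assumption for illustration) and measures the spread of the sample means for the three sample sizes. The function name and the choice of 500 repeated samples are arbitrary.

```python
import random
import statistics

rng = random.Random(7)  # arbitrary seed, for reproducibility only

# Synthetic right-skewed population (an assumption for illustration).
population = [rng.expovariate(1 / 2600) for _ in range(20000)]


def sd_of_sample_means(n, n_samples=500):
    """Standard deviation of the means of n_samples random samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(n_samples)]
    return statistics.stdev(means)


# Spread of the sampling distribution for each sample size on the slides.
spread = {n: sd_of_sample_means(n) for n in (10, 100, 1000)}
```

    The values in `spread` should shrink as n grows: larger samples give more precise estimates of the mean.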

  • [Figure: estimates compared as precise vs. imprecise and biased vs. unbiased]

    Larger sample size

    Group activity #2

    • Form groups of size 2-5

    • Get out a blank sheet of paper

    • Write everyone’s full name on the paper

  • How many toes do aliens have?

    Instructions

    • You have measurements from a population of 400 aliens

    • Use your random number table to select a sample of ten measurements

    • Calculate your sample mean and, if you have a calculator or a large brain, your sample standard deviation

    • On your paper, answer the following:

    1. What were your sample mean and standard deviation?

    2. How did you randomly choose your sample?

  • Distribution of the sample mean

    • No matter what the frequency distribution of the population:

    • The sample mean has an approximately bell-shaped (normal) distribution

    • Especially for large n (large samples)

    How precise is any one estimated sample mean?

    The standard error of an estimate is the standard deviation of its sampling distribution. The standard error predicts the sampling error of the estimate.

    Standard error of the mean

    $\sigma_{\bar{Y}} = \frac{\sigma}{\sqrt{n}}$

  • Estimate of the standard error of the mean

    $\mathrm{SE}_{\bar{Y}} = \frac{s}{\sqrt{n}}$
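    As a worked example, the estimate can be computed for the first gene-length sample reported earlier (s = 1539.2, n = 100). The function name is made up for this sketch.

```python
import math


def standard_error(s, n):
    """Estimated standard error of the mean: s divided by the square root of n."""
    return s / math.sqrt(n)


# First gene-length sample from the slides: s = 1539.2, n = 100.
se = standard_error(1539.2, 100)  # about 153.9
```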

    Confidence interval

    • Confidence interval

    – a range of values surrounding the sample estimate that is likely to contain the population parameter

    • 95% confidence interval

    – plausible range for a parameter based on the data

    The 2SE rule-of-thumb

    The interval from $\bar{Y} - 2\,\mathrm{SE}_{\bar{Y}}$ to $\bar{Y} + 2\,\mathrm{SE}_{\bar{Y}}$ provides a rough estimate of the 95% confidence interval for the mean.
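    Applied to the first gene-length sample from earlier (Ȳ = 2675.4, s = 1539.2, n = 100), the rule of thumb can be sketched as follows. The function name is hypothetical, and the result is only the rough interval the rule describes, not an exact 95% confidence interval.

```python
import math


def two_se_interval(mean, s, n):
    """Rough 95% confidence interval for the mean using the 2SE rule of thumb."""
    se = s / math.sqrt(n)
    return mean - 2 * se, mean + 2 * se


# First gene-length sample from the slides.
low, high = two_se_interval(2675.4, 1539.2, 100)

# Population mean reported on the slides.
mu = 2622.0
contains_mu = low < mu < high
```

    For this sample the interval runs from roughly 2367.6 to 2983.2, which includes the population mean µ = 2622.0.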


  • Pseudoreplication

    The error that occurs when samples are not independent, but they are treated as though they are.

    Example: “The Transylvania effect”

    A study of 130,000 calls for police assistance in 1980 found that they were more likely than chance to occur during a full moon.

    Problem: There may have been 130,000 calls in the data set, but there were only 13 full moons in 1980. These data are not independent.
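    The consequence for precision can be sketched with a toy data set. Everything below is hypothetical: three independent units with four repeated measurements each, and made-up values. Treating all twelve measurements as independent gives a standard error that looks much smaller than the one based on the three unit means.

```python
import math
import statistics

# Hypothetical data: 3 independent units, 4 non-independent measurements each.
measurements = {
    "unit_a": [10.1, 10.3, 9.9, 10.2],
    "unit_b": [12.0, 12.2, 11.9, 12.1],
    "unit_c": [8.1, 7.9, 8.0, 8.2],
}


def standard_error(values):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Pseudoreplicated analysis: pretends there are 12 independent data points.
all_values = [v for vals in measurements.values() for v in vals]
naive_se = standard_error(all_values)

# Analysis on the independent units: one mean per unit, so n = 3.
unit_means = [statistics.fmean(vals) for vals in measurements.values()]
unit_se = standard_error(unit_means)
```

    Here `naive_se` comes out well below `unit_se`, so the pseudoreplicated analysis overstates how precise the estimate is. In the police-call example, the independent units would be the 13 full moons, not the 130,000 calls.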