QBM117 Business Statistics Statistical Inference Sampling 1

Post on 19-Dec-2015


Page 1: QBM117 Business Statistics Statistical Inference Sampling 1

QBM117 Business Statistics

Statistical Inference

Sampling

Page 2

Objectives

• To give an overview of the next topic, statistical inference.

• To understand the importance of correct sampling techniques.

• To introduce different sampling techniques.

Page 3

Populations and Samples

• A population is the entire collection of items about which information is desired.

• A sample is a subset of the population that we collect data from.

Page 4

Parameters and Statistics

• A parameter is a number that describes a population.

- A parameter is a fixed number.

• A statistic is a number that describes a sample.

- A statistic is a random variable whose value changes from sample to sample.

Page 5

Statistical Inference

• Population parameters are almost always unknown.

• We take a random sample from the population of interest and calculate the sample statistic.

• We then use the sample statistic as an estimate of the population parameter.

• Statistical Inference involves drawing conclusions about a population based on sample information.

Page 6

Sampling Distributions

• Sample statistics are random variables.

• The probability distribution of a sample statistic is called its sampling distribution.

• We use the sampling distribution to make inferences about the population parameters.
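The idea of a sampling distribution can be approximated by simulation. The following is only a sketch: the population of 1000 incomes is made up for illustration, and the function name `sample_means` is hypothetical. It draws many random samples of size 40 and records the sample mean each time, showing that the statistic varies from sample to sample while the parameter stays fixed.

```python
import random
import statistics


def sample_means(population, n, repeats, seed=0):
    """Draw `repeats` random samples of size n and return the mean of each."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]


# Hypothetical population: 1000 made-up incomes (illustrative values only).
rng = random.Random(1)
population = [rng.uniform(20_000, 80_000) for _ in range(1000)]

mu = statistics.mean(population)                      # parameter: a fixed number
means = sample_means(population, n=40, repeats=2000)  # statistic: varies by sample

print(f"population mean:          {mu:.0f}")
print(f"mean of the sample means: {statistics.mean(means):.0f}")
print(f"spread of sample means:   {statistics.stdev(means):.0f}")
```

The list `means` is an empirical approximation of the sampling distribution of the sample mean for this made-up population.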

Page 7

Estimation and Hypothesis Testing

• There are two types of statistical inference:

- Estimation

- Hypothesis Testing

• Estimation is appropriate when we want to estimate a population parameter.

• Hypothesis testing is appropriate when we want to assess some claim about a population based on the evidence provided by a sample.

Page 8

Sampling

• Sampling is the process of selecting a sample from a population.

• Samples may be selected in a variety of ways.

• The sample should be representative of the population.

• This is best achieved by random sampling.

Page 9

Random Sampling

• A sample is random if every member of the population has an equal chance of being selected in the sample.

• Most statistical techniques assume that random samples are used.

• We will look at three types of random sampling.

Page 10

Simple Random Sampling

• A simple random sample of size n is a sample selected so that every possible sample of size n is equally likely to be chosen. In particular, each member of the population is equally likely to be included.

• The easiest way to generate a simple random sample is to use a random number generator.

Page 11

Example: Generating a Simple Random Sample

A government income-tax auditor is responsible for 1000 tax returns.

The auditor wants to randomly select 40 tax returns to audit.

Each tax return in the population of 1000 is given a number from 1 to 1000.

We then use Excel’s random number generator to select the random sample of 40 tax returns.

Page 12

Excel output (the first 11 of the 50 generated rows):

Uniform (0, 1)    × 1000       Rounded up
0.3820002         382.00018    383
0.1006806         100.68056    101
0.5964843         596.48427    597
0.8991058         899.10581    900
0.8846095         884.60952    885
0.9584643         958.46431    959
0.0144963         14.496292    15
0.4074221         407.4221     408
0.8632466         863.24656    864
0.1385846         138.58455    139
0.2450331         245.03311    246
. . .

• Column 1: 50 numbers uniformly distributed between 0 and 1.

• Column 2: column 1 multiplied by 1000, giving 50 random numbers between 0 and 1000.

• Column 3: column 2 rounded up, giving 50 random integers uniformly distributed between 1 and 1000. Each integer has a probability of 1/1000 of being selected.

The auditor will select returns numbered 383, 101, 597, ...
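The Excel procedure above can be sketched in Python. This is an illustration rather than the lecture's own method: the function names are hypothetical and the seeds are arbitrary. The first function mimics the multiply-and-round-up steps; the second uses the standard library's `random.sample`, which selects without replacement and so cannot pick the same return twice.

```python
import math
import random


def round_up_draws(population_size, count, seed=None):
    """Mimic the spreadsheet steps: uniform number, times N, rounded up."""
    rng = random.Random(seed)
    draws = []
    for _ in range(count):
        u = rng.random()                    # uniform on [0, 1)
        k = math.ceil(u * population_size)  # integer between 0 and N
        draws.append(max(k, 1))             # guard the rare case u == 0.0
    return draws


def simple_random_sample(population_size, n, seed=None):
    """Select n distinct return numbers from 1..N (no repeats)."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), n)


print(math.ceil(0.3820002 * 1000))  # first row of the Excel output: 383

draws = round_up_draws(1000, 50, seed=42)        # may contain repeated numbers
audit = simple_random_sample(1000, 40, seed=42)  # 40 distinct returns to audit
print(sorted(audit))
```

Because the round-up method draws each number independently, the same return number can appear more than once, and any repeats would have to be discarded before auditing.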

Page 13

Stratified Random Sampling

• A stratified random sample is obtained by dividing the population into homogeneous groups and drawing a simple random sample from each group.

• The homogeneous groups are called strata.

• Not only can we acquire information about the whole population, we can also make inferences within each stratum or compare strata.

Page 14

Example: Generating a Stratified Random Sample

Suppose the Internal Revenue Service wants to estimate the median amounts of deductions taxpayers claim in different categories, e.g. property taxes, charitable donations, etc.

These amounts vary greatly over the taxpayer population.

Therefore a simple random sample will not be very efficient.

Page 15

• The taxpayers can be divided into strata based on their adjusted gross incomes, and a separate simple random sample (SRS) can be drawn from each stratum.

• Because deductions generally increase with income, a stratified random sample would require a much smaller total sample size than a simple random sample to provide equally precise estimates.

Page 16

There are several ways to build the stratified random sample.

One of them is to keep each stratum's proportion in the sample equal to its proportion in the population.

A sample of size 1000 is to be drawn.

Stratum   Income             Population proportion   Stratum size in sample
1         under $15,000      25%                     250
2         $15,000-29,999     40%                     400
3         $30,000-50,000     30%                     300
4         over $50,000       5%                      50
Total                                                1000
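The stratum sizes in the table follow from multiplying each population proportion by the total sample size. A minimal sketch of that arithmetic, and of drawing a simple random sample within each stratum, is given below. The function names and the sampling frame of 100,000 taxpayer IDs are hypothetical, introduced only for illustration.

```python
import random

# Population proportions taken from the table above.
PROPORTIONS = {
    "under $15,000": 0.25,
    "$15,000-29,999": 0.40,
    "$30,000-50,000": 0.30,
    "over $50,000": 0.05,
}


def proportional_allocation(proportions, total_sample_size):
    """Sample size per stratum, proportional to its share of the population."""
    # In general, rounding can leave the sizes slightly off the total;
    # with these proportions and a sample of 1000 they come out exact.
    return {s: round(p * total_sample_size) for s, p in proportions.items()}


def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    return {s: rng.sample(members, allocation[s]) for s, members in strata.items()}


allocation = proportional_allocation(PROPORTIONS, 1000)
print(allocation)

# Hypothetical sampling frame: 100,000 taxpayer IDs split in the same proportions.
strata = {
    "under $15,000": range(0, 25_000),
    "$15,000-29,999": range(25_000, 65_000),
    "$30,000-50,000": range(65_000, 95_000),
    "over $50,000": range(95_000, 100_000),
}
sample = stratified_sample(strata, allocation, seed=1)
print({s: len(ids) for s, ids in sample.items()})
```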

Page 17

Cluster Sampling

• Cluster sampling groups the population into small clusters, draws a simple random sample of clusters, and observes everything in the sampled clusters.

• It is useful when it is difficult or costly to develop a complete list of the population members.

• It is also useful whenever the population elements are widely dispersed geographically.
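A minimal sketch of cluster sampling follows, assuming a made-up population of 20 city blocks with 10 households each. The block and household labels and the function name `cluster_sample` are hypothetical. A simple random sample of clusters is drawn, and every member of each chosen cluster is then observed.

```python
import random


def cluster_sample(clusters, num_clusters, seed=None):
    """Pick a simple random sample of clusters; keep every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population: 20 city blocks (clusters) of 10 households each.
clusters = {
    f"block-{b:02d}": [f"block-{b:02d}/house-{h}" for h in range(1, 11)]
    for b in range(1, 21)
}

sampled = cluster_sample(clusters, num_clusters=4, seed=7)
households = [h for members in sampled.values() for h in members]
print(len(sampled), "blocks,", len(households), "households")
```

Only a list of clusters is needed to draw the sample, not a list of every household, which is why the method suits populations that are hard to enumerate or widely dispersed.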

Page 18

Errors Involved in Sampling

• Two types of errors occur when sampling from a population:

- sampling error

- non-sampling error

Page 19

Sampling Error

• Sampling error is the error that arises because the data are collected from part of the population rather than the whole of it.

• Whenever we make inferences about a population based on information from a sample there will naturally be some degree of error.

• The larger the sample, the smaller the sampling error tends to be.
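The effect of sample size on sampling error can be checked by simulation. The sketch below assumes a made-up population of 10,000 incomes; the function name `typical_sampling_error` is hypothetical. For each sample size it averages the absolute gap between the sample mean and the population mean over many repeated samples.

```python
import random
import statistics


def typical_sampling_error(population, n, repeats=500, seed=0):
    """Average absolute gap between the sample mean and the population mean."""
    rng = random.Random(seed)
    mu = statistics.mean(population)
    gaps = [
        abs(statistics.mean(rng.sample(population, n)) - mu)
        for _ in range(repeats)
    ]
    return statistics.mean(gaps)


# Hypothetical population: 10,000 made-up incomes (illustrative values only).
rng = random.Random(1)
population = [rng.uniform(20_000, 80_000) for _ in range(10_000)]

errors = {n: typical_sampling_error(population, n) for n in (10, 100, 1000)}
for n, err in errors.items():
    print(f"n = {n:5d}: typical sampling error = {err:8.0f}")
```

An individual small sample can still land close to the population mean by chance; it is the typical error over many samples that shrinks as n grows.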

Page 20

Figure: μ is the population mean income and x̄ is the sample mean income. The sampling error is the difference between x̄ and μ.

Page 21

Non-Sampling Error

• Non-sampling errors are due to errors in data acquisition, non-response error and selection bias.

• These types of errors are more serious than sampling errors because increasing the sample size will not reduce them.

Page 22

Errors in Data Acquisition

• These types of errors occur during data collection and processing.

– Faulty equipment may lead to incorrect measurements being taken.

– Data may be recorded incorrectly.

– Processing errors may occur.

Page 23

Data Acquisition Error

Figure: a population and a sample drawn from it. If an observation in the sample is wrongly recorded, the sample mean is affected, so the total error is sampling error + data acquisition error.

Page 24

Non-Response Error

• Non-response error is the error introduced when responses are not obtained from some members of the sample.

• The sample observations that are collected may not be representative of the population.

• This leads to biased results.

Page 25

Non-Response Error

Figure: a population and a sample drawn from it. Where part of the sample gives no response, the results may be biased.

Page 26

Selection Bias

• Selection bias occurs when some members of the population cannot possibly be selected for inclusion in the sample.

• For example, surveying voters by randomly selecting telephone numbers is biased, as voters who do not have a telephone cannot possibly be selected in the sample.
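The telephone example can be illustrated with a small simulation. Every number here is an assumption made up for the sketch: a population of 100,000 voters, 80% of whom own a telephone, with 60% support for some proposal among telephone owners and 30% among the rest. Sampling only from telephone owners overestimates support, and a larger sample does not fix it.

```python
import random

rng = random.Random(3)

# Hypothetical population (all rates are illustrative assumptions):
# 80% own a telephone; support is 60% among owners and 30% among non-owners.
population = []
for _ in range(100_000):
    has_phone = rng.random() < 0.80
    supports = rng.random() < (0.60 if has_phone else 0.30)
    population.append((has_phone, supports))


def support_rate(voters):
    """Proportion of the given voters who support the proposal."""
    return sum(supports for _, supports in voters) / len(voters)


true_rate = support_rate(population)
phone_frame = [v for v in population if v[0]]  # only these voters can be selected

for n in (100, 1000, 10_000):
    estimate = support_rate(rng.sample(phone_frame, n))
    print(f"n = {n:6d}: estimate = {estimate:.3f}, true rate = {true_rate:.3f}")
```

As n grows, the estimate settles near the support rate among telephone owners rather than the rate in the whole population, which is the sense in which increasing the sample size does not help with non-sampling error.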

Page 27

Selection Bias

Figure: a population and a sample drawn from it. When parts of the population cannot be selected, the sample cannot represent the whole population.

Page 28

Reading for next lecture

• Chapter 7, Section 7.5

Exercises

• 6.11

• 6.12