chapter 11 sampling foundations - boun.edu.trweb.boun.edu.tr/ulas.akkucuk/ad585/ad585-part1c.pdf•...


Upload: lamdan

Post on 15-Mar-2018


Copyright © by Houghton Mifflin Company, Inc. All rights reserved First Edition

Chapter 11: Sampling Foundations

Chapter Objectives
• Define and distinguish between sampling and census studies
• Discuss when to use a probability versus a nonprobability sampling method, and implement the different methods
• Explain sampling error and sampling distribution
• Construct confidence intervals for population means and proportions
• List the factors to consider in determining sample size, and compute the required sample size to achieve a specific degree of precision at a desired confidence level

National Poll: Sample Size
• Harris Poll
– A weekly study that monitors the reactions of the American public to a variety of economic, political, and social issues
• Sample Size
– Based on a nationally representative telephone survey of 1,000 adults age 18 or over

AC Nielsen SCANTRACK Index
• Offers valuable scanner-based sales and brand share data on a regular basis to manufacturers of a wide variety of consumer products such as food, drugs, and cosmetics
• Sample Size
– Sales and brand share estimates are gathered weekly from a representative sample of more than 4,800 stores representing over 800 retailers in 50 major markets

Sampling vs. Census Studies
• A census study draws inferences from the entire body of units of interest
• A sample study draws inferences from a sample drawn from the population

Advantages of Sampling
• Low cost
• Reduced time

Sampling and Nonsampling Errors
• Sampling error: The difference between a statistic value that is generated through a sampling procedure and the parameter value, which can be determined only through a census study
• Nonsampling error: Any error in a research study other than sampling error (which arises purely because a sample, rather than the entire population, is studied)

Minimizing Sampling Errors
• Increase the sample size
• Use a statistically efficient sampling plan
• Make the sample as representative of the population as possible

Types of Nonsampling Errors
• Nonsampling Error
– Any error other than sampling error
• Sampling Frame Error
– Sampling frame is not representative of the ideal population
• Nonresponse Error
– Final sample is not representative of the planned sample
• Data Error
– Distortions in collected data and mistakes in data coding, analysis, or interpretation

Potential Causes of Sampling Frame Errors
• Incomplete sampling frame overrepresents some population segments and underrepresents others
• Sampling frame contains irrelevant units

Minimizing Sampling Frame Errors
• Start with a complete sampling frame
• Modify the sampling frame to make it representative of the ideal population, for example by using plus-one dialing in telephone surveys

Potential Causes of Nonresponse Errors
• Mail surveys/Internet surveys
– Certain types of sample units are more likely to respond than others
• Telephone and personal interview surveys
– Person not-at-home problem and respondent refusal problem

Minimizing Nonresponse Errors
• Mail surveys: increase response rates through the use of incentives, follow-up mailings, etc.
– Caution: an increase in response rate per se may not reduce nonresponse error
• Telephone and personal interview surveys: make call-backs and spread out the time blocks during which interviews are conducted

Potential Causes of Data Errors
• Respondents’ reluctance or inability to give accurate answers
• Ill-trained interviewers
• Unscrupulous interviewers
• Poorly designed questionnaire
• Mistakes in coding data
• Erroneous analysis
• Incorrect or inappropriate interpretation of results

Exhibit 11.1 Types and Potential Causes of Nonsampling Errors (telephone survey and online survey)
• Total population of interest
• Portion of population that has access to the medium (telephone, online)
• Portion of population that has access and volunteers (does not refuse, opts in)
• Portion of population that has access, volunteers, and completes (responds, does not opt out)
Source: Adapted from Thomas W. Miller, “Can We Trust the Data of Online Research,” Marketing Research (Summer 2001), Vol. 13, No. 2, p. 31.

When Census Studies Are Appropriate
• The feasibility condition
– Whenever a population is relatively small or can be accessed easily
• The necessity condition
– When the population units are extremely varied and each population unit is likely to be very different from all the other units

Probability and Nonprobability Sampling
• Probability sampling is an objective procedure in which the probability of selection is known in advance for each population unit
• Nonprobability sampling is a subjective procedure in which the probability of selection for each population unit is unknown beforehand

Exhibit 11.3 Classification of Sampling Methods
• Probability sampling
– Simple random sampling
– Stratified sampling: proportionate stratified random sampling, disproportionate stratified random sampling
– Cluster sampling: simple cluster sampling, systematic sampling
• Nonprobability sampling
– Convenience sampling
– Judgment sampling
– Quota sampling

Probability Sampling Methods
• Simple Random Sampling
• Stratified Random Sampling
• Cluster Sampling

Gallup Poll: USA
• Identify and describe the population that a given poll is attempting to represent
• Choose or design a method that will enable Gallup to sample the target population randomly
• Random Digit Dialing (RDD): a procedure that creates a list of all possible household phone numbers in America and then selects a subset of numbers from that list for Gallup to call

Simple Random Sampling
• Every possible sample of a certain size within a population has a known and equal probability of being chosen as the study sample
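As a rough illustration of the definition above, the following Python sketch draws a simple random sample without replacement. The population of 100 numbered units, the sample size of 10, the seed, and the function name `simple_random_sample` are assumptions made for illustration; none of them comes from the slides.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only to make the sketch repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: units numbered 1 to 100
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Because `random.sample` selects without replacement, no unit can appear twice in the sample.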

Stratified Random Sampling
• Two Types of Stratified Random Sampling:
– Proportionate Stratified Random Sampling
– Disproportionate Stratified Random Sampling

Proportionate Stratified Random Sampling
• Sample consists of units selected from each population stratum in proportion to the total number of units in the stratum

Kirkwood University: Proportionate Stratified Random Sampling
• Administrators of Kirkwood University wanted to determine the attitudes of their students toward various aspects of the university
• They selected a proportionate stratified random sample of 500 students for conducting the attitude survey

Table 11.2 Proportionate Allocation of Total Sample of Kirkwood University Students

Population Strata    Number of Population Units    Number of Sample Units Allocated
Freshmen             3,000                         150
Sophomores           3,000                         150
Juniors              2,000                         100
Seniors              2,000                         100
Total                10,000                        500
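The allocation in Table 11.2 can be reproduced with a few lines of Python. This is a minimal sketch: the function name `proportionate_allocation` is an assumption, and the simple rounding it uses works here only because every stratum share is a whole number. With other data, the rounded allocations may not add up exactly to the total sample.

```python
def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the sample to each stratum in proportion to its population size."""
    population = sum(strata_sizes.values())
    return {
        stratum: round(total_sample * size / population)
        for stratum, size in strata_sizes.items()
    }

# Stratum sizes from Table 11.2
kirkwood = {"Freshmen": 3000, "Sophomores": 3000, "Juniors": 2000, "Seniors": 2000}
allocation = proportionate_allocation(kirkwood, 500)
print(allocation)
```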

Gallup Poll on Sampling: China
• 12,500 counties, cities, and urban districts were divided into 50 strata based on their geographic location, degree of economic development, and proportion of non-agricultural population
• One primary sampling unit (PSU), consisting of either a county or a city, was selected from each stratum based on probability proportional to population size
• Within each PSU, the populations of all neighborhoods and villages were compiled. From this listing, four neighborhoods or villages were selected proportional to size
• From each of these four neighborhoods or villages, five households were selected at random
• One respondent was selected from each of the selected households, ensuring proper representation in the sample of all age groups by both genders
• The respondent to be interviewed was then selected according to a prescribed systematic procedure
• If the designated respondent was not at home, or could not be reached, a second or, if needed, a third adult family member was selected systematically from among the household members remaining on the list
• If contact with the designated respondent could not be made after a total of three separate visits to the household, an interview with a respondent in a substitute household in the same locality was permitted
• Two substitute households were kept in reserve for each five assigned households in the interviewing area

Gallup: India
• Design of the sample:
– GALLUP INDIA PVT. LTD. interviewed a total of 5,122 Indian adults age 18 years and over (one per household) in late March and early April 1996
– The nationwide survey involved in-person interviews in 144 villages and 84 towns and cities across India
• Urban Sample (design = 1,600 interviews)
– Three hundred eighty districts in India (excluding those in Jammu-Kashmir, the northeastern states, and other difficult-to-access areas such as the Andaman and Nicobar Islands) were classified into 20 strata based on their geographical (zonal) location and urban population
– Across these 20 strata, 40 districts were chosen
– In each selected district, two towns were picked on the basis of probability proportional to size
– From the selected towns, 2 colonies were selected randomly, and 10 households were selected from each colony
– From each household, one respondent was chosen, i.e., either male or female above 18 years of age
• Rural Sample (design = 1,440 interviews)
– After 40 districts were chosen for the urban sample, the remaining 340 districts were divided into 12 strata based on their geographical (zonal) location and rural population
– On average, two districts were selected from each stratum
– From each household, one respondent was chosen on the same demographic criterion, i.e., either male or female above 18 years of age
• Urban Oversample (design = 2,000 interviews, 400 per metro)
– The urban oversample represented five of the country’s major metropolitan areas: Bombay, Delhi, Calcutta, Madras, and Bangalore
– Within each metropolitan area, an average of 13 electoral wards were chosen on a probability proportional to size basis
– Within each electoral ward, four colonies were randomly selected
– In each colony, eight households were randomly selected
– One respondent was interviewed per household
• Results are projectable to within ±3 percent for India as a whole, ±2 percent for urban India in general, and ±7 percent for each of India’s five largest cities
• Urban and rural India were considered as separate domains for purposes of sampling

Disproportionate Stratified Random Sampling
• Sample consists of units selected from each population stratum according to how varied the units are within the stratum

Exhibit 11.4 Disproportionate Stratified Random Sampling Used by A.C. Nielsen Company

Store Type                                 Percent of Stores   Percent of Stores   Average       Take Ratio
                                           in Universe         in NFL Sample       Store Size
Chain (includes convenience chains)        25.2%               47.9%               $2,445,000    1 out of every 39
Large independent (over $500,000)          12.8%               24.9%               $1,700,000    1 out of every 69
Medium independent ($100,000 - $500,000)   32.6%               17.6%               $234,000      1 out of every 248
Small independent (under $100,000)         29.4%               9.6%                $55,000       1 out of every 360

Cluster Sampling
• Clusters of population units are selected at random and then all or some units in the chosen clusters are studied

Systematic Sampling Steps
• An organized procedure for selecting a sample from a list containing all the population units
• Steps:
– 1) Determine the sampling interval, k:
\[ k = \frac{\text{number of units in the population}}{\text{number of units desired in the sample}} \]
– 2) Choose randomly one unit between the first and kth units in the population list
– 3) The randomly chosen unit and every kth unit thereafter are designated as part of the sample
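The three steps above can be sketched in Python as follows. The function name `systematic_sample`, the list of 1,000 numbered units, and the sample size of 50 are hypothetical. The sketch also assumes the interval is rounded down to a whole number when the population size is not an exact multiple of the sample size, which the slides do not address.

```python
import random

def systematic_sample(units, n, seed=None):
    """Select n units from an ordered list using a fixed sampling interval k."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the population size")
    k = len(units) // n              # step 1: sampling interval (rounded down)
    rng = random.Random(seed)
    start = rng.randrange(k)         # step 2: random position among the first k units
    return units[start::k][:n]       # step 3: that unit and every kth unit thereafter

# Hypothetical population list of 1,000 numbered units; k = 1000 / 50 = 20
population = list(range(1, 1001))
sample = systematic_sample(population, 50, seed=7)
print(sample[:5])
```

Only the starting point is random; every later selection is determined by the interval.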

Practical Considerations: Probability Sampling Methods
• Probability sampling techniques are generally used by large commercial marketing research firms that maintain national samples or panels that can be readily accessed for conducting periodic research surveys

Nonprobability Sampling Methods
• Convenience Sampling
• Judgment Sampling
• Quota Sampling

Convenience Sampling
• Researcher's convenience forms the basis for selecting a sample of units
– The administrators of a college have announced a sharp increase in tuition fees for the next year.
– A TV reporter covering this news item is shown standing on campus talking to several students, one at a time, about their reactions to the proposed tuition fee increase.
– TV Reporter says: “While some of the students feel that the 10 percent fee hike is justified, most of them consider it to be unfair.”

Judgment Sampling
• A procedure in which a researcher exerts some effort in selecting a sample that he or she believes is most appropriate for a study
• Example:
– The administrators of a college have announced a sharp increase in tuition fees for the next year
– A judgment sample of student officers may be more representative than a convenience sample of students
– The researcher should be knowledgeable about the ideal population for a study

Quota Sampling
• Involves sampling a quota of units to be selected from each population cell based on the judgment of the researchers and/or decision makers
• Steps:
– 1) Divide the population into segments (referred to as cells) based on certain control characteristics
– 2) Determine the quota of units for each cell (quotas are determined by the researchers and/or decision makers)
– 3) Instruct the interviewers to fill the quotas assigned to the cells

Quota Sampling Plan for the Newspaper Subscriber Survey

Geographic Segment    Gender: Male    Gender: Female
I                     30              30
II                    30              30
III                   30              30
IV                    30              30
V                     30              30
Total sample size = 300

Quota Sampling Plan for a Survey of Attitudes Toward Social Welfare Programs

Highest Education Level
Age        Less than High School    High School Diploma    Some College    College Degree
18-30      100                      100                    100             100
31-45      100                      100                    100             100
46-60      100                      100                    100             100
Over 60    100                      100                    100             100
Total sample size = 1600

Parameter & Statistic
• Parameter
– The actual, or true, population mean value or population proportion for any variable (for example, income or product ownership)
• Statistic
– An estimate of a parameter from sample data

Sampling Error
• Sampling Error = Parameter Value - Statistic Value
• Difference between a statistic value that is generated through a sampling procedure and the parameter value, which can be determined only through a census study

Sampling Distribution
• Representation of the sample statistic values obtained from every conceivable sample of a certain size chosen from a population by using a specified sampling procedure, along with the relative frequency of occurrence of those statistic values

Sampling Distribution
[Figure: sampling distribution curve with labels µX, SX, and C]

Table 11.4 Expenditures for Eating Out for a Hypothetical Population

Family Number    Annual Expenditure for Eating Out ($)
1                50
2                100
3                150
4                200
5                250
6                300
7                350
8                400
9                450
10               500

Table 11.5 Partial List of Possible Samples and Sample Means

Samples of Two Families       Sample Mean Value ($)
1,2                           75
1,6; 2,5; 3,4                 175
1,10; 2,9; 3,8; 4,7; 5,6      275
5,10; 6,9; 7,8                375
9,10                          475

Exhibit 11.5 Sampling Distribution (Bar Chart) for Simple Random Samples of Two Units
[Bar chart: one axis shows sample mean values ($) from 75 to 475 in steps of 25; the other shows relative frequency from 0 to 6/45 in steps of 1/45]
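The sampling distribution behind Tables 11.4 and 11.5 can be enumerated directly, since a population of 10 families yields only 45 possible samples of two. The Python sketch below is illustrative only; the variable names are assumptions, and the data are the ten expenditures in Table 11.4.

```python
from collections import Counter
from itertools import combinations

# Table 11.4: family number -> annual expenditure for eating out ($)
expenditures = {family: 50 * family for family in range(1, 11)}

# Every possible simple random sample of two families (order does not matter)
samples = list(combinations(expenditures, 2))

# Sample mean for each possible sample, tallied by value
means = [(expenditures[a] + expenditures[b]) / 2 for a, b in samples]
counts = Counter(means)

population_mean = sum(expenditures.values()) / len(expenditures)
mean_of_sample_means = sum(means) / len(means)

# Relative frequency of each sample mean value, as plotted in Exhibit 11.5
for value, count in sorted(counts.items()):
    print(f"{value:6.0f}  {count}/{len(samples)}")
print("population mean:", population_mean)
print("mean of all sample means:", mean_of_sample_means)
```

The tallies agree with the partial list in Table 11.5 (for example, five samples have a mean of 275), and the mean of all 45 sample means equals the population mean.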

Exhibit 11.6 Sampling Distribution Shown as a Histogram
[Histogram: horizontal axis shows sample mean values from 100.0 to 500.0; vertical axis shows frequency of occurrence; the population mean value and an overlaid normal probability distribution are marked]

Central Limit Theorem

Distribution    Mean                  Standard Deviation
Population      $\mu$                 $\sigma$
Sample          $\bar{x}$             $s$
Sampling        $\mu_{\bar{x}}$       $s_{\bar{x}}$

Confidence Estimation for Interval Data
• n = number of units in the sample
• $\bar{x}$ = sample mean value
• s = sample standard deviation
• $s_{\bar{x}} = s / \sqrt{n}$ = estimated standard error of the sample mean

Confidence Estimation for Interval Data (Cont’d)
• Given n = 100, $\bar{x}$ = 1,278 units, and s = 399 units
• To construct a 95 percent confidence interval:
\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{399}{\sqrt{100}} = 39.9 \text{ units} \]
• The 95 percent confidence interval is
\[ \bar{x} \pm 1.96\, s_{\bar{x}} = 1{,}278 \pm (1.96)(39.9) = 1{,}278 \pm 78.204 \approx 1{,}278 \pm 78 \]
• Interpretation
– From the sample data, we can be 95 percent confident that the average annual sales of men's suits, across all men's clothing stores in the population, are between 1,200 and 1,356 units
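The men's suits calculation can be checked with a short Python sketch. The function name `mean_confidence_interval` is an assumption, and 1.96 is the z-value for 95 percent confidence used on the slide.

```python
import math

def mean_confidence_interval(sample_mean, s, n, z=1.96):
    """Return (lower, upper) limits of the interval: sample mean +/- z * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    half_width = z * standard_error
    return sample_mean - half_width, sample_mean + half_width

# Values from the slide: n = 100, sample mean = 1,278 units, s = 399 units
lower, upper = mean_confidence_interval(1278, 399, 100)
print(f"{lower:.0f} to {upper:.0f} units")
```

Rounded to whole units, the limits match the 1,200 to 1,356 range stated in the interpretation.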

Finding Confidence Intervals for Population Proportions
• $\pi$ = true population proportion (i.e., the parameter value)
• Confidence interval for the population proportion:
\[ p - 1.96\, s_p \le \pi \le p + 1.96\, s_p \]
• p = proportion obtained from a single sample (i.e., the statistic value)
\[ p = \frac{\text{number of sample units having a certain feature}}{\text{total number of sample units (i.e., } n\text{)}} \]
• $s_p$ = estimate of the standard error of the sample proportion
\[ s_p = \sqrt{\frac{p(1 - p)}{n}} \]

Finding Confidence Intervals for Population Proportions (Cont’d)
Given n = 100 and p = .64, to construct a 95 percent confidence interval for the population proportion:
\[ s_p = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{(.64)(.36)}{100}} = .048 \]
The 95 percent confidence interval is
\[ p \pm 1.96\, s_p = .64 \pm (1.96)(.048) = .64 \pm .09408 \approx .64 \pm .09 \]
• Interpretation
– This confidence interval can also be expressed in percentage terms: 64% ± 9%
– In other words, we can be 95 percent confident that between 55 and 73 percent of all grocery stores in the city carry potted plants
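The same arithmetic for a proportion can be sketched in Python. The function name `proportion_confidence_interval` is an assumption; the inputs are the slide's n = 100 and p = .64.

```python
import math

def proportion_confidence_interval(p, n, z=1.96):
    """Return (lower, upper) limits of the interval: p +/- z * sqrt(p(1 - p) / n)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    half_width = z * standard_error
    return p - half_width, p + half_width

# Values from the slide: 64 of 100 sampled grocery stores carry potted plants
lower, upper = proportion_confidence_interval(0.64, 100)
print(f"{lower:.0%} to {upper:.0%}")
```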

Factors Influencing Sample Size
• Desired precision level
• Desired confidence level
• Degree of variability
• Resources available

Methods for Determining Sample Size
• The desired precision level
• The desired confidence level
• An estimate of the degree of variability in the population, expressed in the form of a standard deviation

Sample Size Estimation
• H: desired precision level
• q: desired confidence level ($z_q$ is the corresponding standard normal value)
• s: standard deviation
• n: required sample size
\[ H = \frac{z_q\, s}{\sqrt{n}} \quad\Rightarrow\quad n = \frac{z_q^2\, s^2}{H^2} \]

Sample Size Estimation (Cont’d)
• A marketing manager of a frozen-foods firm wants to estimate within ±$10 the average amount that families in a certain city spend on frozen foods per year and have 99 percent confidence in the estimate
• He estimates that the standard deviation of annual family expenditures on frozen foods is about $100
• How many families must be chosen for this study?

H = $10, s = $100, and $z_q$ = 2.575 (corresponding to a confidence level of 99 percent)
\[ n = \frac{(2.575)^2 (100)^2}{(10)^2} \approx 663 \text{ families} \]
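The frozen-foods calculation can be reproduced with the sketch below. The function name `sample_size_for_mean` is an assumption. The formula gives 663.0625; the slide reports this as approximately 663, whereas rounding up to the next whole family would give 664.

```python
def sample_size_for_mean(z, s, H):
    """Unrounded sample size needed to estimate a mean within +/- H: n = z^2 s^2 / H^2."""
    return (z * s / H) ** 2

# Values from the slide: 99 percent confidence (z = 2.575), s = $100, H = $10
n_exact = sample_size_for_mean(2.575, 100, 10)
print(round(n_exact, 4))   # unrounded value from the formula
print(round(n_exact))      # nearest whole number, as reported on the slide
```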

Determining Sample Size
• A sporting goods marketer wants to estimate the proportion of tennis players among high school students in the United States
• The marketer wants the estimate to be accurate within ±.02 and wants to have 95 percent confidence in the interval estimate
• A pilot telephone survey of 50 high school students showed that 20 of them played tennis. Estimate the required sample size for the final study from the given data
• What should the sample size be if the desired precision and confidence levels are to be guaranteed?

Determining Sample Size (Cont’d)
H = .02 and $z_q$ = 1.96; p = 20/50 = .4
\[ s^2 = p(1 - p) = (20/50)(1 - 20/50) = (.4)(.6) = .24 \]
\[ n = \frac{z_q^2\, s^2}{H^2} = \frac{(1.96)^2 (.24)}{(.02)^2} \approx 2{,}305 \text{ students} \]
The maximum sample size, which corresponds to p = .5, where p(1 - p) reaches its largest value of .25, is
\[ n_{max} = \frac{.25\, z_q^2}{H^2} = 2{,}401 \text{ students} \]
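Both sample sizes in the tennis example can be checked with the Python sketch below. The function name `sample_size_for_proportion` is an assumption. Note that the variance term is p(1 - p) = .24, not its square.

```python
def sample_size_for_proportion(z, p, H):
    """Unrounded sample size needed to estimate a proportion within +/- H: n = z^2 p(1 - p) / H^2."""
    return z ** 2 * p * (1 - p) / H ** 2

z, H = 1.96, 0.02
n_pilot = sample_size_for_proportion(z, 20 / 50, H)   # uses the pilot estimate p = .4
n_max = sample_size_for_proportion(z, 0.5, H)         # worst case: p(1 - p) = .25
print(round(n_pilot), round(n_max))
```

Using p = .5 guarantees the desired precision whatever the true proportion turns out to be, at the cost of a somewhat larger sample.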

Chapter 12: Quality Control and Initial Analysis of Data

Chapter Objectives
• Define editing and distinguish between a field edit and an office edit
• Define coding and outline the steps it involves
• Compute measures of central tendency and dispersion of the data for each variable in a data set
• State the potential uses of frequency distribution or one-way tables

Data Analysis at Rockbridge Associates: Data Integrity
• Data integrity is the foundation for successful marketing research
• Rockbridge ensures integrity in the collection and processing of the data by a number of quality control checks for
– mail surveys
– telephone surveys
– web surveys
• Rockbridge ensures data integrity in how the results are interpreted and explained to management

Editing
• Editing is the process of examining completed data collection forms and taking whatever corrective action is needed to ensure the data are of high quality
– Preliminary or field edit
– Final or office edit

Field Edit
• A field edit, or preliminary edit, is a quick examination of completed data collection forms, usually on the same day they are filled out
• Objectives
– Ensure that proper procedures are being followed in selecting respondents, interviewing them, and recording their responses
– Fix fieldwork deficiencies before they turn into major problems

Office Edit
• A final, or office, edit verifies response consistency and accuracy
– Makes necessary corrections
– Determines whether some or all parts of a data collection form should be discarded

What Is Wrong With This Response?
• A respondent said he was 18 years old but indicated that he had a Ph.D. when asked for his highest level of education.
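A consistency check of this kind could be automated during an office edit. The Python sketch below is purely illustrative: the record layout, the field names `age` and `education`, the function name `find_inconsistencies`, and the minimum plausible age of 22 for a doctorate are all assumptions, not rules given in the slides.

```python
def find_inconsistencies(response, min_doctorate_age=22):
    """Return a list of consistency problems found in one response record."""
    problems = []
    age = response.get("age")
    education = response.get("education")
    # Assumed rule: a Ph.D. holder younger than the threshold is implausible
    if education == "Ph.D." and age is not None and age < min_doctorate_age:
        problems.append(f"age {age} is implausible for education level {education}")
    return problems

# The response described on the slide, plus a plausible one for contrast
flagged = find_inconsistencies({"age": 18, "education": "Ph.D."})
clean = find_inconsistencies({"age": 45, "education": "Ph.D."})
print(flagged)
print(clean)
```

A flagged record would still need an editor's judgment about whether to correct it, follow up with the respondent, or discard it.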

Editing Can Help Uncover
• Improper field procedures
• Incomplete interviews
• Improperly conducted interviews
• Technical problems with the questionnaire or interview
• Respondent rapport problems
• Consistency problems that can be isolated and reconciled

Improper Field Procedures
• Wrong questionnaire form used
• Interview inadvertently not taken

Page 14: Chapter 11 Sampling Foundations - boun.edu.trweb.boun.edu.tr/ulas.akkucuk/AD585/AD585-Part1c.pdf• Define and distinguish between sampling and census studies ... Sampling methods

14

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-79

Incomplete Interviews

• Questions not asked

• Directions not followed (proper segments of

the questionnaire were not administered)

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-80

Improperly Conducted Interviews

• The wrong respondent interviewed (e.g., son instead of father)

• Questions misinterpreted by interviewer or respondent

• Evidence of bias or influencing of answers.

• Failure to probe for adequate answers or the use of poor probes

• Interviewer's illegible writing and/or style.

• Interviewer recorded information which identified a respondent whose anonymity should have been protected

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-81

Improperly Conducted Interviews (Cont’d)

• Interviewer apparently does not understand what type of responses constitute an answer to the actual question asked

• Interviewer does not understand what the objective of the question is and thus accepts an improper frame of reference for the respondent's answer

• Other evidence of need for training or instructions to be given to interviewer

– failure to write down probes, wrong abbreviations, failure to follow directions

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-82

Technical Problems With the Questionnaire or Interview

• Space was not provided for needed information

• The presence of unanticipated or unusually frequent extreme responses to questions, indicating a possible need for rewording of certain questions

• Inappropriate or unworkable interviewer instructions not detected in the pretest

• The order in which questions were asked introduces confusion, resentment, or bias into the respondent's answers

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-83

Respondent Rapport Problems

• Frequent refusal to answer certain questions.

• Reports of abnormal termination of the interview

(or presence of hostility) due to sensitive questions

• Evidence that respondent and interviewer are

playing the "game" of "What answer do you want

me to give?"

• Evidence that the presence of other people in the

interview situation is causing problems

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-84

Consistency Problems That Can Be Isolated and Reconciled

• Contradictory answers

– reports no savings in one section of the interview but reports interest from bank accounts in another section

• Misclassification

– mortgage debt improperly reported as installment debt

• Impossible answers

– reports paying $600 for a new Edsel in 1970--the car should have been recorded as a "used" car; or weekly income reported on the income-per-month line


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-85

Consistency Problems That Can Be

Isolated and Reconciled (Cont’d)

• Unreasonable (and probably erroneous) responses

– Respondent reports borrowing $2000 for two years to

buy a car but reported monthly payments multiplied by

24 months are less than $2000

– Respondent reports that the house value is $90,000

while income is $2000 per year and the respondent

claims less than a high school education
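Consistency checks of this kind can be automated during the office edit. The following is a minimal sketch, assuming each completed form is stored as a Python dict. The field names (`age`, `education`, `loan_amount`, `monthly_payment`, `loan_months`) and the age cutoff of 21 are hypothetical and are not taken from the slides.

```python
def find_inconsistencies(record):
    """Return a list of flags describing suspicious answers on one form."""
    flags = []
    # Implausible combination: a very young respondent reporting a doctorate.
    # The cutoff of 21 is an illustrative assumption.
    if record.get("education") == "PhD" and record.get("age", 0) < 21:
        flags.append("age/education mismatch")
    # Unreasonable loan: total reported repayments fall short of the amount borrowed.
    total_paid = record.get("monthly_payment", 0) * record.get("loan_months", 0)
    if record.get("loan_amount", 0) > total_paid:
        flags.append("loan repayments less than amount borrowed")
    return flags


# Hypothetical form: an 18-year-old with a Ph.D. who borrowed $2000 for 24 months
# but reports payments of only $70 per month (24 x 70 = 1680).
form = {"age": 18, "education": "PhD",
        "loan_amount": 2000, "monthly_payment": 70, "loan_months": 24}
flags = find_inconsistencies(form)
print(flags)
```

A flagged form is not corrected automatically. The editor still decides whether to reconcile, recontact, or discard it.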

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-86

Preventing Errors

• Careful planning before fieldwork begins

• Automating data entry

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-87

Coding

• Coding broadly refers to the set of all tasks

associated with transforming edited responses into

a form that is ready for analysis

• Steps

– Transforming responses to each question into a set of

meaningful categories

– Assigning numerical codes to the categories

– Creating a data set suitable for computer analysis

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-88

Transforming Responses into

Meaningful Categories

• A structured question is pre-categorized

• Responses to a nonstructured or open-ended question must be grouped into a meaningful and manageable set of categories

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-89

The Best Way to Treat "Don't

Know" Responses

• Infer an actual response –dubious validity

• Classify the "don't know's" as a separate

response category for each question

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-90

Missing-Value Category

• A missing value can stem from

– A respondent's refusal to answer a question

– An interviewer's failure to ask a question or

record an answer or a "don't know" that does

not seem legitimate

• Best way to treat missing value responses

– Sound questionnaire design

– Tight control over fieldwork


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-91

Assigning Numerical Codes

• Assign appropriate numerical codes to

responses that are not already in quantified

form

• Codes should be assigned in a way that facilitates computer manipulation and analysis of the responses

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-92

Coding Multiple Response

• Which of the following countries have you visited during the past 12 months?

________Canada

________England

________France

________Germany

________Japan

________Mexico

• Need six variables, each relating to a specific country and having two possible values --for example, 1= “No” and 2 = “Yes”

• Six columns must be set aside in the data spreadsheet to record responses to this question
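The six-variable coding scheme can be sketched in Python as follows. The coding 1 = "No", 2 = "Yes" comes from the slide. The function name `code_multiple_response` and the sample respondent are illustrative assumptions.

```python
COUNTRIES = ["Canada", "England", "France", "Germany", "Japan", "Mexico"]
NO, YES = 1, 2  # coding from the slide: 1 = "No", 2 = "Yes"


def code_multiple_response(checked):
    """Turn the set of countries a respondent checked into six coded variables."""
    return [YES if country in checked else NO for country in COUNTRIES]


# Hypothetical respondent who checked Canada and Japan.
row = code_multiple_response({"Canada", "Japan"})
print(dict(zip(COUNTRIES, row)))
```

Each respondent contributes one such row, so the question occupies six columns of the data spreadsheet regardless of how many countries are checked.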

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-93

Multiple Response Question – Rank Order Question

• Please rank the following fast-food restaurants by placing a 1 beside the restaurant you think is best overall, a 2 beside the restaurant you think is second best, and so on.

__________Burger King

__________McDonald's

__________Wendy's

__________Whataburger

• This question requires as many variables (and columns) as there are objects to be ranked

• 4 separate variables are needed

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-94

Creating a Data Set

• Organized collection of data records

• Each sample unit within the data set is called a Case or Observation

• Structure of a Data Set

– If the number of observations is n and the total number of variables embedded in the questionnaire is m, the data set is an n × m matrix of numbers

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-95

Table 12.3 Structure of a Data Sheet

              Variables
Observation   1     2     ...   j     ...   m
1             x11   x12   ...   x1j   ...   x1m
2             x21   x22   ...   x2j   ...   x2m
...
i             xi1   xi2   ...   xij   ...   xim
...
n             xn1   xn2   ...   xnj   ...   xnm

Note: x11 is respondent 1’s response to variable 1.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-96

Preliminary Data Analysis:

Basic Descriptive Statistics

• Preliminary data analysis examines the

central tendency and the dispersion of the

data on each variable in the data set


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-97

Measures of Central Tendency and

Dispersion for Different Types of Variables

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-98

Measurement Level of Data

Pertaining to Variable–Nominal

• Measures of Central Tendency

– Mode: Most frequently occurring response

• Measures of Dispersion

– Strictly speaking, the concept of dispersion is

not meaningful for nominal data

– An idea about the distribution of responses can

be obtained by examining their relative

frequencies of occurrence

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-99

Measurement Level of Data

Pertaining to Variable –Ordinal

• Measures of Central Tendency

– Median: 50th percentile response

• Measures of Dispersion

– Range: Defined by the highest and lowest

response values

– Interquartile range: Difference between the

75th and 25th percentile responses

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-100

Measurement Level of Data

Pertaining to Variable– Interval

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-101

Measurement Level of Data

Pertaining to Variable– Ratio

• Measures of Central Tendency

– Mean: Arithmetic average of response values

• Measures of Dispersion

– Standard deviation: As defined in Chapter 9

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-102

Mode

• The value that occurs most frequently


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-103

Table 12.5 How Long Have You Been Using the Services of National? – Computing Mode

Length of Service (USE)     Assigned Value    Count/Frequency
Less than 1 year            1                 36
1 to less than 2 years      2                 16
2 to less than 5 years      3                 26
5 years or more             4                 193
Total                                         271

Mode = 4, the most frequently occurring value

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-104

Table 12.5 How Long Have You Been

Using the Services of National? –

Computing Mode (Cont’d)

In SPSS:

1. Select ANALYZE;

2. Click DESCRIPTIVE STATISTICS,

3. Select FREQUENCIES,

4. Move the variable “USE” to the Variable(s) box,

5. Click STATISTICS box,

6. Select MODE,

7. Click CONTINUE, and

8. Click OK.


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-106

Table 12.5 How Long Have You Been

Using the Services of National? –

Computing Mode (Cont’d)

1= Less than a year

2 = 1 to less than 2 years

3 = 2 to less than 5 years

4 = 5 years or more

most frequently occurring value

= mode = 4
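As a cross-check outside SPSS, the mode can be read directly from the frequency counts in Table 12.5. This is a minimal Python sketch. The counts are from the table, and the variable names are illustrative.

```python
# Coded value -> frequency, as reported in Table 12.5.
frequencies = {1: 36, 2: 16, 3: 26, 4: 193}

total = sum(frequencies.values())
# The mode is the coded value with the largest frequency.
mode = max(frequencies, key=frequencies.get)
# Relative frequencies describe the distribution of a nominal or ordinal variable.
relative = {value: count / total for value, count in frequencies.items()}

print("n =", total)
print("mode =", mode)
for value, share in relative.items():
    print(value, f"{share:.1%}")
```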

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-107

Median

• The observation below which 50 percent of

the observations fall

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-108

Table 12.6 Length of Time Service Used – Responses from 20 Customers

How long have you been using the services of National?

4 3 4 1 4 4 4 4 4 4 3

4 4 3 4 4 4 3 1 1

1= Less than a year; 2 = 1 to less than 2 years; 3 = 2 to less than 5 years;

4 = 5 years or more

Arranging the 20 values in ascending order:

1 1 1 3 3 3 3 4 4 4 4

4 4 4 4 4 4 4 4 4

Because the sample size = 20, there are two middle values: 4 and 4. The

median is, therefore, the average of the two middle values = 4.
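The same calculation can be sketched in Python with the 20 responses from Table 12.6. Only the standard library is used, and the variable names are illustrative.

```python
import statistics

# The 20 coded responses from Table 12.6, in the order listed.
responses = [4, 3, 4, 1, 4, 4, 4, 4, 4, 4, 3,
             4, 4, 3, 4, 4, 4, 3, 1, 1]

ordered = sorted(responses)
n = len(ordered)
# With an even n there are two middle values: the (n/2)th and (n/2 + 1)th.
middle_two = (ordered[n // 2 - 1], ordered[n // 2])
median = statistics.median(responses)

print(ordered)
print("middle values:", middle_two, "median:", median)
```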


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-109

Table 12.7 Computing Median

for Length of Time Service Used

In SPSS:

1. Select ANALYZE;

2. Click DESCRIPTIVE STATISTICS,

3. Select FREQUENCIES,

4. Move the variable “USE” to the Variable(s) box,

5. Click STATISTICS box,

6. Select MEDIAN,

7. Click CONTINUE, and

8. Click OK.


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-112

Mean

n = number of units in the sample

\(x_i\) = data obtained from each sample unit i

\(\bar{x}\) = sample mean value, given by

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-113

Table 12.8 Overall Quality of Services Provided

by National– Computing Mean

On a scale of 1 to 10, how would you rate the overall quality of service

provided by National?

Extremely Poor   1   2   3   4   5   6   7   8   9   10   Extremely Good

In SPSS

1. Select ANALYZE

2. Click DESCRIPTIVE STATISTICS

3. Select FREQUENCIES

4. Move the variable “OQ- Labeled as OVERALL SERVICE

QUALITY” to the Variable(s) box

5. Click STATISTICS box

6. Select MEAN, MEDIAN, AND MODE

7. Click CONTINUE

8. Click OK


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-115

Table 12.8 Overall Quality of Services Provided

by National– Computing Mean (Cont’d)

Since the level of measurement is interval scale, we can compute the mean, median, and mode. Since the distribution is skewed to the left, the small values in the tail pull the mean down more than they pull the median. Therefore the mean is smaller than the median, and the median is smaller than the mode.
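The ordering mean < median < mode for a left-skewed distribution can be illustrated with a small Python sketch. The ten ratings below are hypothetical and are not the National survey data, which the slides do not list.

```python
import statistics

# Hypothetical left-skewed ratings on the 1-to-10 scale:
# most values are high, with a thin tail of low ratings.
ratings = [2, 5, 7, 8, 9, 9, 10, 10, 10, 10]

mean = statistics.mean(ratings)
median = statistics.median(ratings)
mode = statistics.mode(ratings)

print("mean =", mean, "median =", median, "mode =", mode)
```

The low ratings of 2 and 5 drag the mean below the median, while the mode stays at the most common rating.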

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-116

Measures of Dispersion

• Range

• Variance

• Standard Deviation

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-117

Range

• Range is the difference between the largest

and smallest value

• The simplest measure of dispersion

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-118

Variance

• Variance of a set of data is a measure of deviation of the data around the arithmetic mean

\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \]

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-119

Standard Deviation

• Standard deviation is the square root of the variance

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} \]

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-120

Table 12.9 Overall Quality of Services Provided by National: Computing Range, Variance, and Standard Deviation

On a scale of 1 to 10, how would you rate the overall quality of service provided by National?

Extremely Poor   1   2   3   4   5   6   7   8   9   10   Extremely Good

In SPSS

1. Select ANALYZE

2. Click DESCRIPTIVE STATISTICS

3. Select FREQUENCIES

4. Move the variable “OQ- Labeled as OVERALL SERVICE QUALITY”

to the Variable(s) box

5. Click STATISTICS box

6. Select STANDARD DEVIATION, VARIANCE, and RANGE

7. Click CONTINUE

8. Click OK


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-122

Table 12.9 Overall Quality of Services Provided by National: Computing Range, Variance, and Standard Deviation (Cont’d)

Range = highest value - lowest value = 10 - 1 = 9

Variance = 5.43

Standard deviation = square root of variance = 2.33
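The three dispersion measures can be computed from their definitions in a few lines of Python. This is a sketch on hypothetical ratings, since the individual National responses are not reproduced in the slides. The final lines only check that the reported standard deviation of 2.33 is the square root of the reported variance of 5.43.

```python
import math
import statistics

# Hypothetical ratings on the 1-to-10 scale (not the National survey data).
ratings = [2, 5, 7, 8, 9, 9, 10, 10, 10, 10]

n = len(ratings)
mean = sum(ratings) / n
value_range = max(ratings) - min(ratings)
# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in ratings) / (n - 1)
std_dev = math.sqrt(variance)

print("range =", value_range)
print("variance =", round(variance, 2))
print("standard deviation =", round(std_dev, 2))

# Consistency check of the figures reported on the slide.
reported_sd = round(math.sqrt(5.43), 2)
print("sqrt(5.43) =", reported_sd)
```

The hand-computed variance agrees with `statistics.variance`, which also uses the n - 1 divisor.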

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-123

Frequency Distribution: One-

Way Tabulation

• One-way tabulation is a table showing the

distribution of data pertaining to categories

of a single variable

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-124

Table 12.10 Age and Length of

Time Service Used

• In SPSS:

1. Select ANALYZE

2. Click DESCRIPTIVE STATISTICS

3. Select FREQUENCIES

4. Move the variable “AGE” to the Variable(s) box

5. Click CHARTS box

6. Select BAR CHARTS

7. Click on CONTINUE

8. Click OK


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-131

Why Averages May be

Misleading

• Researchers tested a new sauce product and

found

– Mean rating of the taste test was close to the

middle of the scale, which had "very mild" and

"very hot" as its bipolar adjectives

• Researcher’s conclusion

– Consumers want neither a really hot nor a really mild sauce

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 12-132

Why Averages May be

Misleading (Cont’d)

• Deeper examination revealed

– The existence of a large proportion of consumers who wanted the sauce to be mild and an equally large proportion who wanted it to be hot

• Moral of the story:

– A clear understanding of the distribution of responses can help a researcher avoid erroneous inferences
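The sauce example can be reproduced with made-up numbers. The sketch below assumes a 7-point scale (1 = "very mild", 7 = "very hot") and 100 hypothetical taste-test ratings. Neither the scale length nor the counts come from the slides.

```python
from collections import Counter

# Hypothetical ratings: two large camps at the extremes, few in the middle.
ratings = [1] * 20 + [2] * 25 + [4] * 10 + [6] * 25 + [7] * 20

mean = sum(ratings) / len(ratings)
counts = Counter(ratings)
share_mild = sum(counts[v] for v in (1, 2)) / len(ratings)
share_middle = counts[4] / len(ratings)
share_hot = sum(counts[v] for v in (6, 7)) / len(ratings)

print("mean rating =", mean)  # sits at the scale midpoint
print("mild:", share_mild, "middle:", share_middle, "hot:", share_hot)
```

The mean lands exactly on the midpoint even though only one respondent in ten chose it, which is why the one-way tabulation should be inspected before the average is interpreted.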


Copyright © by Houghton Mifflin Company, Inc. All rights reserved First Edition

Chapter 13

Hypothesis

Testing

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-134

Chapter Objectives

• Distinguish between descriptive analysis and

inferential analysis.

• State the null and alternative hypotheses

pertaining to a variety of decision situations

requiring formal hypothesis testing.

• Define Type I and Type II errors and state the

relationship between them.

• Define significance level and power of a

hypothesis test.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-135

Chapter Objectives (Cont’d)

• Lay out the steps involved in conducting a

hypothesis test

• Interpret two-way tabulation and a chi-square

contingency test

• Use the appropriate test pertaining to hypotheses

involving a single mean, a single proportion, two

means (when the two samples are independent

and when they are dependent), and two

proportions

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-136

Hypothesis Testing: Key to Actionable Strategies By Dave Moxley, President…

• We start all research projects with in-depth interviews of the business heads generating hypotheses or hunches about the topic being researched.

• Involve the business leaders in the early hypothesis generation

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-137

Client-Researcher Involvement –

Dave Moxley

• Ensure that the necessary questions are asked to collect the data for testing relevant assumptions

• Increase business buy-in to the process as a full project partner, thereby dramatically increasing the likelihood of subsequent market action

• Improve the image of the research function as an

integrated and valued contributor to the strategic

direction and tactical program implementation of

the business

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-138

Hypotheses Testing-Dave Moxley

• Oversimplified or incorrect assumptions must be

subjected to more formal hypothesis testing


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-139

Interesting Hypotheses – Dave Moxley

• Bankers assumed high-income earners are more profitable than low-income earners

• Clients who carefully balance their checkbooks every month and minimize fees due to overdrafts are unprofitable checking account customers

• Old clients were more likely to diminish CD balances by large amounts compared to younger clients

– This was nonintuitive because conventional wisdom suggested that older clients have a larger portfolio of assets and seek less risky investments

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-140

Data Analysis

• Descriptive

– Computing measures of central tendency and dispersion, as well as constructing one-way tables

• Inferential

– Data analysis aimed at testing specific hypotheses is

usually called inferential analysis

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-141

Null and Alternative Hypotheses

H0 → null hypothesis

Ha → alternative hypothesis

• Hypotheses always pertain to population parameters or characteristics rather than to sample characteristics. It is the population, not the sample, that we want to make an inference about from limited data

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-142

Steps in Conducting a Hypothesis Test

• Step 1. Set up H0 and Ha.

• Step 2. Identify the nature of the sampling

distribution curve and specify the appropriate test

statistic.

• Step 3. Determine whether the hypothesis test is

one-tailed or two-tailed.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-143

Steps in Conducting a Hypothesis Test

• Step 4. Taking into account the specified significance level, determine the critical value (two critical values for a two-tailed test) for the test statistic from the appropriate statistical table.

• Step 5. State the decision rule for rejecting H0.

• Step 6. Compute the value for the test statistic from the sample data.

• Step 7. Using the decision rule specified in step 5, either reject H0 or do not reject H0.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-144

Launching a Product Line Into a New

Market Area

• Karen, product manager for a line of apparel, wants to introduce the product line into a new market area

• A survey of a random sample of 400 households in that market showed a mean income per household of $30,000. Karen strongly believes the product line will be adequately profitable only in markets where the mean household income is greater than $29,000. Should Karen introduce the product line into the new market?


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-145

Karen’s Criterion for Decision

Making

• To reach a final decision, Karen has to make a

general inference (about the population) from the

sample data

• Criterion: mean income across all households in the market area under consideration

• If the mean population household income is

greater than $29,000, Karen should introduce the

product line into the new market

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-146

Karen’s Hypothesis

• Karen’s decision making is equivalent to either

accepting or rejecting the hypothesis:

– The population mean household income in the new

market area is greater than $29,000

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-147

One-Tailed Hypothesis Test

• The term one-tailed signifies that all \(\bar{x}\)- or z-values that would cause Karen to reject H0 are in just one tail of the sampling distribution

\(\mu\) → population mean

H0: \(\mu \le \$29{,}000\)

Ha: \(\mu > \$29{,}000\)

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-148

Type I and Type II Errors

• Type I error occurs if the null hypothesis is

rejected when it is true

• Type II error occurs if the null hypothesis is not rejected when it is false

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-149

Significance Level

• \(\alpha\) → significance level: the upper-bound probability of a Type I error

• \(1 - \alpha\) → confidence level: the complement of the significance level

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-150

Summary of Errors Involved in Hypothesis Testing

Inference Based on     Real State of Affairs
Sample Data            H0 is True                               H0 is False

H0 is True             Correct decision                         Type II error
                       Confidence level = \(1 - \alpha\)        P(Type II error) = \(\beta\)

H0 is False            Type I error                             Correct decision
                       Significance level = \(\alpha\)*         Power = \(1 - \beta\)

*Term represents the maximum probability of committing a Type I error


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-151

Level of Risk

• Two firms are considering introducing a new product that radically differs from their current product line

– Firm ABC

• Well-established customer base, distinct reputation for its

existing product line

– Firm XYZ

• No loyal clientele, no distinct image for its present

products

Which of these two firms should be more cautious in making a decision to introduce the new product?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-152

Scenario - Firms ABC & XYZ

• Firm ABC

– ABC should be more cautious

• Firm XYZ

– XYZ should be less cautious

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-153

Identifying the Critical Sample Mean Value: Sampling Distribution

Sample mean (\(\bar{x}\)) values greater than $29,000, that is, \(\bar{x}\)-values on the right-hand side of the sampling distribution centered on \(\mu\) = $29,000, suggest that H0 may be false. More important, the farther to the right \(\bar{x}\) is, the stronger is the evidence against H0

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-154

Karen’s Decision Rule for Rejecting

the Null Hypothesis

• Reject H0 if the sample mean exceeds the critical value \(\bar{x}_c\)

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-155

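Criterion Value

Every sample mean \(\bar{x}\) has a corresponding equivalent standard normal deviate z:

\[ z = \frac{\bar{x} - \mu}{s_{\bar{x}}} \qquad\Longrightarrow\qquad \bar{x} = \mu + z\, s_{\bar{x}} \]

Substituting \(\bar{x}_c\) for \(\bar{x}\) and \(z_c\) for z:

\[ \bar{x}_c = \mu + z_c\, s_{\bar{x}} \]

where \(z_c\) is the standard normal deviate corresponding to the critical sample mean, \(\bar{x}_c\).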

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-156

Computing the Criterion Value

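The standard deviation for the sample of 400 households is $8,000. The standard error of the mean (\(s_{\bar{x}}\)) is given by

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{\$8{,}000}{\sqrt{400}} = \$400 \]

The critical mean household income \(\bar{x}_c\) is obtained through the following two steps:

1. Determine the critical z-value, \(z_c\). For \(\alpha\) = .05, from Appendix 1, \(z_c\) = 1.645.

2. Substitute the values of \(z_c\), \(s_{\bar{x}}\), and \(\mu\) (under the assumption that H0 is "just" true):

\[ \bar{x}_c = \mu + z_c\, s_{\bar{x}} = \$29{,}000 + 1.645 \times \$400 = \$29{,}658 \]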
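The criterion value can be recomputed without a printed normal table. This is a minimal sketch using Python's standard-library `statistics.NormalDist` in place of Appendix 1. The figures are those of Karen's example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 29_000    # population mean when H0 is "just" true
s = 8_000       # sample standard deviation
n = 400         # sample size
alpha = 0.05    # significance level

standard_error = s / sqrt(n)
# One-tailed test: the critical z leaves alpha in the upper tail.
z_c = NormalDist().inv_cdf(1 - alpha)
x_c = mu0 + z_c * standard_error

print("standard error =", standard_error)
print("z_c =", round(z_c, 3))
print("critical sample mean =", round(x_c))
```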


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-157

Karen’s Decision Rule

• If the sample mean household income is greater

than $29,658, reject the null hypothesis and

introduce the product line into the new market

area.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-158

Test Statistic

The value of the test statistic is simply the z-value corresponding to \(\bar{x}\) = $30,000.

\[ z = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\$30{,}000 - \$29{,}000}{\$400} = 2.5 \]

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-159

Critical Value for Rejecting the Null

Hypothesis

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-160

P - Value – Actual Significance Level

• The probability of obtaining an \(\bar{x}\)-value as high as $30,000 or more when \(\mu\) is only $29,000 = .0062.

• This value is sometimes called the actual

significance level, or the p-value

• The actual significance level of .0062 in this case

means the odds are less than 62 out of 10,000 that

the sample mean income of $30,000 would have

occurred entirely due to chance (when the

population mean income is $29,000 or less).
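The test statistic and its p-value for Karen's example can be checked with a short Python sketch. It uses the standard-library `statistics.NormalDist` rather than a printed table, and the variable names are illustrative.

```python
from statistics import NormalDist

x_bar = 30_000          # observed sample mean
mu0 = 29_000            # population mean under H0
standard_error = 400    # s / sqrt(n) = 8000 / sqrt(400)
z_critical = 1.645      # one-tailed critical value for alpha = .05

z = (x_bar - mu0) / standard_error
# Upper-tail probability of a sample mean at least this large when mu = mu0.
p_value = 1 - NormalDist().cdf(z)
reject_h0 = z > z_critical

print("z =", z)
print("p-value =", round(p_value, 4))
print("reject H0:", reject_h0)
```

Comparing z with the critical value and comparing the p-value with the significance level lead to the same decision.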

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-161

T-test

Conduct a t-test when the sample is small.

Let the sample size, n = 25

\(\bar{x}\) = $30,000, s = $8,000

From the t-table in Appendix 3, \(t_c\) = 1.71 for \(\alpha\) = .05 and d.f. = 24.

Decision rule: “Reject H0 if t > 1.71.”

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-162

T-test (Cont’d)

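The value of t from the sample data:

\[ s_{\bar{x}} = \frac{\$8{,}000}{\sqrt{25}} = \$1{,}600 \]

\[ t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{\$30{,}000 - \$29{,}000}{\$1{,}600} = 0.625 \]

Because the computed value of t is less than 1.71, H0 cannot be rejected.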

Karen should not introduce the product line into the

new market area.
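The small-sample version of Karen's test can be sketched in Python as follows. The critical value 1.71 is taken from the slide's t-table lookup rather than computed, because the Python standard library has no t distribution. The variable names are illustrative.

```python
from math import sqrt

x_bar = 30_000      # sample mean
mu0 = 29_000        # population mean under H0
s = 8_000           # sample standard deviation
n = 25              # small sample
t_critical = 1.71   # from the t-table: alpha = .05, d.f. = n - 1 = 24

standard_error = s / sqrt(n)
t = (x_bar - mu0) / standard_error
reject_h0 = t > t_critical

print("standard error =", standard_error)
print("t =", t)
print("reject H0:", reject_h0)
```

The sample mean and standard deviation are the same as in the large-sample case. Only the smaller n changes, which quadruples the standard error and removes the grounds for rejecting H0.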


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-163

Two-Tailed Hypothesis Test

• A two-tailed test is one in which values of the test statistic leading to rejection of the null hypothesis fall in both tails of the sampling distribution curve

H0: \(\mu = \$29{,}000\)

Ha: \(\mu \ne \$29{,}000\)

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-164

Test of Two Means

• A health service agency has designed a public

service campaign to promote physical fitness and

the importance of regular exercise. Since the

campaign is a major one, the agency wants to

make sure of its potential effectiveness before

running it on a national scale.

– To conduct a controlled test of the campaign’s

effectiveness, the agency needs two similar cities.

– The agency identified two similar cities:

• city 1 will serve as the test city

• city 2 will serve as a control city

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-165

Test of Two Means

• A random survey of 300 adults in city 1 and 200 adults in city 2 was conducted to measure the average time per day a typical adult in each city spent on some form of exercise.

– Results of the survey : average was 30 minutes per day

(with a standard deviation of 22 minutes) in city 1 and

35 minutes per day (with a standard deviation of 25

minutes) in city 2.

• Question:

– From these results, can the agency conclude confidently

that the two cities are well matched for the controlled

test?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-166

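Basic Statistics and Hypotheses

City 1: \(n_1 = 300\), \(\bar{x}_1 = 30\), \(s_1 = 22\)

City 2: \(n_2 = 200\), \(\bar{x}_2 = 35\), \(s_2 = 25\)

The hypotheses are

H0: \(\mu_1 = \mu_2\) or \(\mu_1 - \mu_2 = 0\)

Ha: \(\mu_1 \ne \mu_2\) or \(\mu_1 - \mu_2 \ne 0\)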

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-167

Test Statistic

The test statistic is the z-statistic, given by

\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \]

\(n_1\) and \(n_2\) are greater than 30. The z-statistic can therefore be used as the test statistic.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-168

Decision – Two-Tailed Test

• For Two-Tailed tests

– Identify two critical values of z, one for each tail of the

sampling distribution.

– The probability corresponding to each tail is .025, since α = .05.

– From the Normal Table, the z-value for α/2 = .025 is 1.96.

• Decision rule: “Reject H0 if z ≤ -1.96 or if z ≥ 1.96.”
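The critical values read from the Normal Table can also be reproduced numerically. The following is a minimal sketch, assuming Python 3.8 or later and its standard `statistics.NormalDist` class; the helper name `critical_z` is hypothetical and does not come from the slides.

```python
from statistics import NormalDist


def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Return the positive critical z-value for significance level alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


z_two = critical_z(0.05)                    # .025 in each tail
z_one = critical_z(0.05, two_tailed=False)  # .05 in a single tail
print(round(z_two, 2), round(z_one, 3))
```

For a lower-tailed test the critical value is simply the negative of the one-tailed result.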


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-169

Computing the value of z from the survey results and under the customary assumption that the null hypothesis is true (i.e., μ1 - μ2 = 0):

$$ z = \frac{(30 - 35) - (0)}{\sqrt{(22)^2/300 + (25)^2/200}} = -2.29 $$

Since z < -1.96, we should reject H0.

Computing Z-value – Two-Tailed Test
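The two-city calculation can be checked with a short script. This is a sketch under stated assumptions: the function name `two_sample_z` is hypothetical, and the critical value 1.96 is taken from the decision rule on the slides. Carried at full precision the statistic is about -2.30; the slide's -2.29 appears to come from rounding the standard error to 2.18 before dividing.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """z-statistic for the difference between two independent sample means."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


# City 1 (test city) and city 2 (control city) survey results from the slides.
z = two_sample_z(30, 22, 300, 35, 25, 200)

# Two-tailed decision rule at alpha = .05.
reject_h0 = abs(z) >= 1.96
print(f"z = {z:.2f}, reject H0: {reject_h0}")
```

Either rounding leads to the same decision, because both values fall below -1.96.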

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-170

Hypothesis Test Related to Mean Exercising in

Two Cities

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-171

Test statistic

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s^* \sqrt{1/n_1 + 1/n_2}} $$

with d.f. = n1 + n2 - 2. In this expression, s* is the pooled standard deviation, given by

$$ s^* = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

T- Test for Independent Samples

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-172

n1 = 20, x̄1 = 30, s1 = 22

n2 = 10, x̄2 = 35, s2 = 25

The degrees of freedom for the t-statistic are d.f. = 28

The critical value of t with 28 d.f. for a tail probability of .025 is 2.05.

Decision rule: “Reject H0 if t ≤ -2.05 or if t ≥ 2.05.”

The pooled standard deviation is s* = √529 (approximately) = 23

T- Test for Independent Samples- Two

Cities

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-173

The test statistic is

t = -.56

Since t is neither less than -2.05 nor greater than 2.05,

we cannot reject H0

The sample evidence is not strong enough to conclude

that the two cities differ in terms of levels of

exercising activity of their residents.

T- Test for Independent Samples
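The small-sample version of the two-city test can be reproduced in the same way. A minimal sketch, assuming the pooled-variance formula given on the slides; the function name `pooled_t` is hypothetical and the critical value 2.05 is the one quoted for 28 d.f.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t-statistic for two independent samples (H0: equal means)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    pooled_sd = math.sqrt(pooled_var)
    t = (mean1 - mean2) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, pooled_sd, df


t, pooled_sd, df = pooled_t(30, 22, 20, 35, 25, 10)

# Two-tailed decision rule with the tabled critical value for 28 d.f.
reject_h0 = abs(t) >= 2.05
print(f"s* = {pooled_sd:.1f}, d.f. = {df}, t = {t:.2f}, reject H0: {reject_h0}")
```

The same means and standard deviations that led to rejection with 300 and 200 respondents no longer do so with 20 and 10, which shows how much the conclusion depends on sample size.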

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-174

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females

• Test of Two Means Using the SPSS T-TEST

Program

– On the 10-point scale, males gave a mean rating of

approximately 7.87, while females gave a mean rating

of approximately 7.83.


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-175

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females

• In SPSS,

1. Select ANALYZE from the menu,

2. Click COMPARE MEANS

3. Select INDEPENDENT-SAMPLES T-TEST

4. Move “OQ – Overall Service Quality” to the “TEST VARIABLE(S)” box

5. Move “gender” to the “GROUPING VARIABLE” box

6. DEFINE GROUPS (SEX = 1 for male and 2 for female)

7. Click OK.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-176

OQ – Overall Perceived Service Quality

Gender – Sex = 1 for male, Sex = 2 for female

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-177

Group Statistics

OQ
gender    N     Mean   Std. Deviation   Std. Error Mean
male      137   7.87   2.26             .19
female    126   7.83   2.31             .21

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-178

F-test of whether the variances of the two groups can be assumed equal: p-value = .210, so the null hypothesis of equal variances cannot be rejected at α = 0.05.

P-value > α = 0.05 -- do not reject; the “equal variances assumed” row is the correct one to read.

Use this row when the null hypothesis of equality of variance is rejected

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-179

P-value = .88 is greater than α = 0.05.

Do not reject H0.

The p-value implies that, if the null hypothesis were true, a difference at least as large as .04 (i.e., 7.87 - 7.83) would occur by chance about 88 times in 100.

The null hypothesis cannot be rejected at the customary significance level of .05.

National Insurance Company Study –

Perceived Service Quality Differences

Between Males and Females
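The SPSS result can be approximated by hand from the Group Statistics table. The sketch below is only an illustration: it uses the rounded means and standard deviations shown on the slide, and it assumes that with 261 degrees of freedom the t-distribution is close enough to the normal distribution to use `statistics.NormalDist` for the p-value. SPSS works from the unrounded data, so small differences from its output are expected.

```python
import math
from statistics import NormalDist

# Rounded group statistics as printed on the slide.
n_m, mean_m, sd_m = 137, 7.87, 2.26   # males
n_f, mean_f, sd_f = 126, 7.83, 2.31   # females

df = n_m + n_f - 2
pooled_sd = math.sqrt(((n_m - 1) * sd_m**2 + (n_f - 1) * sd_f**2) / df)
t = (mean_m - mean_f) / (pooled_sd * math.sqrt(1 / n_m + 1 / n_f))

# ASSUMPTION: large-sample normal approximation to the t-distribution.
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
print(f"d.f. = {df}, t = {t:.3f}, approximate two-tailed p = {p_approx:.2f}")
```

The approximate p-value lands close to the .88 reported by SPSS, far above .05, so the conclusion of no significant difference is unchanged.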

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-180

Test of Two Means When Samples

Are Dependent

• The need to check for significant differences

between two mean values when the samples are

not independent


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-181

Test of Two Means When Samples

Are Dependent

• A retail chain ran a special promotion in a

representative sample of 10 of its stores to boost

sales.

• Weekly sales per store before and after the

introduction of the special promotion are shown

• Did the special promotion lead to a significant

increase in sales ?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-182

Sales Per Store Before and After a Promotional Campaign

Sales per Store (in Thousands)

Store Number (i)   Before Promotion (xbi)   After Promotion (xai)   Change in Sales (in Thousands), xdi = xai - xbi
1                  250                      260                     10
2                  235                      240                     5
3                  150                      151                     1
4                  145                      140                     -5
5                  120                      124                     4
6                  98                       100                     2
7                  75                       70                      -5
8                  85                       95                      10
9                  180                      200                     20
10                 212                      220                     8
Total                                                               50

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-183

One-Tailed Hypothesis Test:

H0: μd ≤ 0; Ha: μd > 0.

The sample estimate of μd is x̄d, given by

$$ \bar{x}_d = \frac{\sum_{i=1}^{n} x_{di}}{n} $$

where n is the sample size.

x̄d = 50/10 = 5

Test of Two Means When Samples

Are Dependent

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-184

Test statistic is

$$ t = \frac{\bar{x}_d - \mu_d}{s/\sqrt{n}} = 2.10 $$

Test of Two Means When Samples

Are Dependent

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-185

Standard deviation (s) = 7.53, α = 0.05, tc for 9 d.f. = 1.83 from Appendix 3

Decision rule: “Reject H0 if t ≥ 1.83.”

Since the test statistic t = 2.10 ≥ 1.83, we reject H0 and conclude that the mean change in sales per store was significantly greater than zero.

The special promotion was indeed effective.

Test of Two Means When Samples

Are Dependent
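The dependent-samples test works on the per-store differences, which makes it easy to verify from the sales table. A minimal sketch, assuming Python's standard `statistics` module; the variable names are hypothetical and the critical value 1.83 is the tabled one for 9 d.f. quoted above.

```python
import math
from statistics import mean, stdev

# Weekly sales per store (in thousands) before and after the promotion.
before = [250, 235, 150, 145, 120, 98, 75, 85, 180, 212]
after = [260, 240, 151, 140, 124, 100, 70, 95, 200, 220]

# Paired design: analyse the change within each store.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = mean(diffs)
sd_diff = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)

# H0: mean difference <= 0, so the hypothesized value is 0.
t = (mean_diff - 0) / (sd_diff / math.sqrt(n))

# One-tailed decision rule at alpha = .05 with 9 d.f.
reject_h0 = t >= 1.83
print(f"mean change = {mean_diff}, s = {sd_diff:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
```

Treating the two columns as independent samples would ignore the large store-to-store variation in sales, which is why the test is run on the differences.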

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-186

Hypothesis Test Related to Change in

Weekly Sales Per Store


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-187

Test for a Single Proportion

• Ms. Jones wants to substantially increase the firm's advertising budget. The firm sells a variety of personal computer accessories

• Random sample: 20 of 100 respondents know the brand name

• Claim to be tested: the true awareness rate for the brand name across all personal computer owners is less than .3

• Should Ms. Jones increase the advertising budget on the basis of survey results?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-188

Test for a Single Proportion

• Need to test the population proportion (π is the symbol for population proportion) of personal computer owners who are aware of the brand:

H0: π ≥ .3

Ha: π < .3

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-189

The test statistic:

$$ z = \frac{p - \pi}{\sqrt{\pi(1 - \pi)/n}} $$

where p is the sample proportion.

From the Normal Table, zc = -1.645 for α = .05.

Decision rule here is: “Reject H0 if z ≤ -1.645.”

p = .2, π = .3, and n = 100, z = -2.174

Test for a Single Proportion

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-190

Since -2.174 < -1.645, we reject H0;

The sample awareness rate of .2 is too low to support the hypothesis that the population awareness rate is .3 or more.

The actual significance level (p-value) corresponding to z = -2.174 is approximately .015 (from Appendix 1).

This significance level implies that the odds are lower than 15 in 1,000 that a sample awareness rate as low as .2 would have occurred entirely by chance (that is, when the population awareness rate is .3 or higher).

Test for a Single Proportion
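The single-proportion test and its p-value can be reproduced as follows. This is a sketch, assuming Python 3.8+ for `statistics.NormalDist`; the function name `one_proportion_z` is hypothetical. At full precision the statistic is about -2.18; the slide's -2.174 appears to result from rounding the standard error to .046.

```python
import math
from statistics import NormalDist


def one_proportion_z(p_hat, pi0, n):
    """z-statistic for a sample proportion against a hypothesized value pi0."""
    standard_error = math.sqrt(pi0 * (1 - pi0) / n)
    return (p_hat - pi0) / standard_error


z = one_proportion_z(0.2, 0.3, 100)

# Lower-tailed test: the p-value is the area to the left of z.
p_value = NormalDist().cdf(z)

reject_h0 = z <= -1.645
print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The standard error uses the hypothesized proportion .3 rather than the sample proportion .2, because the test statistic is computed under the assumption that H0 holds at its boundary.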

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-191

Hypothesis Test Related to Proportion

of Personal Computer Owners

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-192

Test of Two Proportions: Choosing

Between Commercial X & Commercial Y

For a New Product

Tom, advertising manager for a frozen-foods company, is in the process of deciding between two TV commercials, X and Y, for a new frozen food to be introduced

– Commercial X

• Runs for 20 seconds

• Random sample: 20% awareness out of 200 respondents

– Commercial Y

• Runs for 30 seconds

• Random sample: 25% awareness out of 200 respondents


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-193

Test of Two Proportions (Cont’d)

• Question:

– Can Tom conclude that commercial Y will be more

effective in the total market for the new product?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-194

Criterion for Decision Making

• To reach a final decision, Tom has to make a

general inference (about the population) from the

sample data

• Criterion-- relative degrees of awareness likely to

be created by the 2 commercials in the population

of all adult consumers

• Tom should conclude that commercial Y is more

effective than commercial X only if the anticipated

population awareness rate for commercial Y is

greater than that for X.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-195

Hypothesis

• Tom’s decision making is equivalent to either accepting or rejecting the hypothesis:

– The potential awareness rate that commercial Y can generate among the population of consumers is greater than that which commercial X can generate

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-196

                       Commercial Y   Commercial X
Sample sizes:          n1 = 200       n2 = 200
Sample proportions:    p1 = .25       p2 = .20

The hypotheses are

H0: π1 ≤ π2 or π1 - π2 ≤ 0

Ha: π1 > π2 or π1 - π2 > 0

Null and Alternative Hypotheses

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-197

$$ z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sigma_{p_1 - p_2}} $$

σ(p1 - p2) is estimated by the sample standard error formula

Sample Standard Error

$$ s_{p_1 - p_2} = \sqrt{PQ\,(1/n_1 + 1/n_2)} $$

$$ P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$

Q = 1 - P

Test of Two Proportions-- Sample

Standard Error

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-198

For α = .05, the critical value of z (from Appendix 1) is 1.645.

Decision rule: “Reject H0 if z ≥ 1.645.”

First compute P and Q, then sp1 - p2 and z:

$$ P = \frac{200(.25) + 200(.2)}{200 + 200} = .225 $$

Q = 1 - .225 = .775

Test of Two Proportions


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-199

$$ s_{p_1 - p_2} = \sqrt{(.225)(.775)(1/200 + 1/200)} = 0.042 $$

$$ z = \frac{(.25 - .20) - (0)}{.042} = 1.19 $$

Since z < 1.645, we cannot reject H0.

The sample evidence is not strong enough to suggest that commercial Y will be more effective than commercial X.

Test of Two Proportions
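The pooled-proportion calculation for the two commercials can be sketched in a few lines. The function name `two_proportion_z` is hypothetical; sample 1 is taken to be the commercial with 25% awareness and sample 2 the one with 20%, matching the p1 and p2 used in the computation above. Without intermediate rounding the statistic is about 1.20 rather than 1.19, which does not change the decision.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """z-statistic for the difference of two proportions under H0: pi1 - pi2 = 0."""
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)   # P in the slides
    q = 1 - pooled                             # Q in the slides
    standard_error = math.sqrt(pooled * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error, pooled, standard_error


z, pooled, standard_error = two_proportion_z(0.25, 200, 0.20, 200)

# Upper-tailed decision rule at alpha = .05.
reject_h0 = z >= 1.645
print(f"P = {pooled:.3f}, s = {standard_error:.3f}, z = {z:.2f}, reject H0: {reject_h0}")
```

A five-point gap in awareness is not significant with 200 respondents per commercial; a larger sample would be needed before concluding that one commercial outperforms the other.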

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-200

Hypothesis Test Related to Awareness

Generated by Two Commercials

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-201

Cross-Tabulations: Chi-square

Contingency Test

• Technique used for determining whether there is a

statistically significant relationship between two

categorical (nominal or ordinal) variables

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-202

Telecommunications Company

• Marketing manager of a telecommunications

company is reviewing the results of a study of

potential users of a new cell phone

– Random sample of 200 respondents

• A cross-tabulation of data on whether target consumers

would buy the phone (Yes or No) and whether the cell

phone had access to the Internet (Yes or No)

• Question:

– Can the marketing manager infer that an association

exists between Internet access and buying the cell

phone?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-203

Two-Way Tabulation of Internet Access and Whether They Would Buy the Cellular Phone

                                      Internet Access
Would Buy the Cellular Phone   Yes          No           Total
Yes                            80 (80%)     20 (20%)     100
No                             20 (20%)     80 (80%)     100
Total                          100 (100%)   100 (100%)   200

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-204

H0: There is no association between Internet access and

buying the cell phone (the two variables are

independent of each other).

Ha: There is some association between Internet access

and buying the cell phone (the two variables are not

independent of each other).

Cross Tabulations - Hypotheses


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-205

Conducting the Test

• The test involves comparing the actual, or observed, cell frequencies in the cross-tabulation with a corresponding set of expected cell frequencies (Eij)

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-206

Expected Values

$$ E_{ij} = \frac{n_i n_j}{n} $$

where ni and nj are the marginal frequencies, that is, the total number of sample units in category i of the row variable and category j of the column variable, respectively

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-207

Computing Expected Values

The expected frequency for the first-row, first-column cell is given by

$$ E_{11} = \frac{100 \times 100}{200} = 50 $$
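The expected-frequency rule applies to every cell, not only the first one. A minimal sketch, with the hypothetical helper `expected_frequencies` taking the observed table as a list of rows:

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Observed counts from the Internet access / would-buy cross-tabulation.
observed = [[80, 20],
            [20, 80]]

expected = expected_frequencies(observed)
print(expected)
```

Because all four marginal totals equal 100, every expected count is 50 in this example; with unequal margins the expected counts would differ from cell to cell.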

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-208

Observed and Expected Cell Frequencies

                                      Internet Access
Would Buy the Cellular Phone   Yes       No        Total
Yes                            80 (50)   20 (50)   100
No                             20 (50)   80 (50)   100
Total                          100       100       200

Note: In each cell ij the number without parentheses is the observed cell frequency (Oij) and the number in parentheses is the expected cell frequency (Eij).

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-209

$$ \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 72.00 $$

where r and c are the number of rows and columns, respectively, in the contingency table. The number of degrees of freedom associated with this chi-square statistic is given by the product (r - 1)(c - 1).

Chi-square Test Statistic

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-210

For d.f. = 1, assuming α = .05, from Appendix 2, the critical chi-square value (χ²c) = 3.84.

Decision rule is: “Reject H0 if χ² ≥ 3.84.”

Computed χ² = 72.00

Since the computed chi-square value is greater than the critical value of 3.84, reject H0.

The apparent relationship between "Internet access" and "would buy the cellular phone" revealed by the sample data is unlikely to have occurred because of chance.

Chi-square Test Statistic in a

Contingency Test
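The chi-square statistic and its degrees of freedom follow directly from the observed table. The sketch below is an illustration with a hypothetical function name, `chi_square_statistic`; the critical value 3.84 is the tabled value for 1 d.f. at α = .05 quoted above.

```python
def chi_square_statistic(observed):
    """Chi-square contingency statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # expected count
            chi2 += (o_ij - e_ij) ** 2 / e_ij

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


chi2, df = chi_square_statistic([[80, 20], [20, 80]])
reject_h0 = chi2 >= 3.84
print(f"chi-square = {chi2:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

Each of the four cells contributes (30)²/50 = 18 to the total, which is how the statistic reaches 72.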


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-211

Interpretation

• The actual significance level associated with a chi-

square value of 72 is less than .001 (from

Appendix 2). Thus, the chances of getting a chi-

square value as high as 72 when there is no

relationship between Internet access and purchase

of cell phones are less than 1 in 1,000.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-212

Cross-Tabulation Using SPSS for

National Insurance Company

• One crucial issue in the customer survey of

National Insurance Company was how a

customer's education was associated with whether

or not she or he would recommend National to a

friend.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-213

Need to Conduct Chi-square Test to

Reach a Conclusion

• The hypotheses are:

– H0: There is no association between educational level and willingness to recommend National to a friend (the two variables are independent of each other).

– Ha: There is some association between educational level and willingness to recommend National to a friend (the two variables are not independent of each other).

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-214

Association Between Education and Customer’s Willingness to Recommend National to a Friend

For two-way tabulation:

1. Select ANALYZE on the SPSS menu,

2. Click on DESCRIPTIVE STATISTICS,

3. Select CROSS-TABS.

4. Move the “highest level of schooling” to the ROW(S) box,

5. Move the “rec” variable to the COLUMN(S) box.

6. Click on CELLS,

7. Select OBSERVED, and ROW PERCENTAGES.

8. Click CONTINUE and

9. Click OK.

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-215

Association Between Education and Customer’s

Willingness to recommend National to a Friend

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-216

COUNT

represents the

actual number of

customers in each

cell. The

percentages are

based on the

corresponding

Association Between Education and Customer’s

Willingness to recommend National to a Friend


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-217

Association Between Education and Customer’s

Willingness to recommend National to a Friend

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-218

For Chi-Square Assessment:

1. Select ANALYZE

2. Click on DESCRIPTIVE STATISTICS

3. Select CROSS-TABS

4. Move the variable “highest level of schooling” to

ROW(s) box

5. Move “rec” to COLUMN(s) box;

6. Click on “STATISTICS”

7. Select CHI-SQUARE, CONTINGENCY

COEFFICIENT, and CRAMER’S V

8. Click on CELLS,

9. Select OBSERVED and EXPECTED FREQUENCIES

10. Click CONTINUE

11. Click OK.

National Insurance Company Study -

Chi-Square Test

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-219

National Insurance Company Study -

Chi-Square Test

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-220

Interpret

the Table

National Insurance Company Study--

Expected Frequency Table

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-221

Computed Chi-

square value

P-value

National Insurance Company Study

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-222

National Insurance Company Study --

P-Value Significance

• The actual significance level (p-value) = 0.019

• The chances of getting a chi-square value as high as 10.007 when there is no relationship between education and recommendation are about 19 in 1,000.

• The apparent relationship between education and recommendation revealed by the sample data is unlikely to have occurred because of chance.

• Jill and Tom can safely reject the null hypothesis.


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-223

Precautions in Interpreting Cross

Tabulation Results

• Two-way tables cannot show conclusive evidence

of a causal relationship

• Watch out for small cell sizes

• The risk of drawing erroneous inferences increases when more than two variables are involved

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 13-224

Two-way Table Based on a Survey of 200 Hospital Patients:

                                   Patients who jog   Patients who do not jog
Patients with heart disease        20                 40
Patients without heart disease     80                 60
Total                              100                100

Is there a causal relationship between patients who jog and patients with heart disease?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved First Edition

Chapter 14

Examining Associations: Correlation and Regression

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-226

Chapter Objectives

• Compute the Spearman correlation coefficient

between ordinal scaled variables and determine

whether or not it is statistically significant

• Compute the Pearson correlation coefficient

between two variables and assess its statistical

significance

• Explain simple regression analysis and state the

distinction between a dependent variable and an

independent variable

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-227

Chapter Objectives (Cont’d)

• Describe common indicators for checking

the usefulness of a regression equation

• Discuss practical applications of regression

analysis

• Interpret the results of a multiple regression

analysis

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-228

Did You Know That Experienced Women

in High-tech Jobs Earn More Than Men?

• General Belief: Men on an average earn more than women in similar occupations

– IEEE-USA: Survey results showed that in the electrotechnology and information-technology fields professional women with 20+ years of experience earned significantly more than men with similar experience

– Regression analysis revealed that gender and experience, along with ethnic background, were significantly related to income levels in the high-tech sector


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-229

Did You Know That Parents’ Education

May Have a Bearing on Children’s GPA’s?

• A study of high schools in Alberta, Canada,

showed a statistically significant, positive

association between parents’ education

levels and children’s grades

• Regression analysis revealed that 11 percent of the variation in students’ grades could be attributed to differences in parents’ education levels

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-230

Did You Know That University Students’

Gender and Age May Be Unrelated To Their

Grades In An Introductory Marketing Course?

• The most important predictors of grades in an introductory

marketing course were

– Overall GPA

– Whether the student transferred to the university from a

community college

– Number of hours the student worked per week

• Regression analysis revealed that predictor variables such as gender, age, and participation in extracurricular activities showed no significant relationship to course grades

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-231

Overview of Techniques for

Examining Associations

• Spearman Correlation Coefficient Technique

• The technique is appropriate when

– The degree of association between two sets of ranks (pertaining to two variables) is to be examined

• Illustrative Research Question(s) This Technique Can Answer:

– Is there a significant relationship between motivation levels of salespeople and the quality of their performance?

• Assume that the data on motivation and quality of performance are in the form of ranks, say, 1 through 20, for 20 salespeople who were evaluated subjectively by their supervisor on each variable

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-232

Overview of Techniques for

Examining Associations (Cont’d)

• Pearson Correlation Coefficient Technique

• This technique is appropriate when

– The degree of association between two metric-scaled

(interval or ratio) variables is to be examined

• Illustrative Research Question(s) This Technique

Can Answer:

– Is there a significant relationship between customers' age (measured in actual years) and their perceptions of our company's image (measured on a scale of 1 to 7)?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-233

Overview of Techniques for

Examining Associations (Cont’d)

• Simple Regression Analysis Technique

• This technique is appropriate when

– A mathematical function or equation linking two metric-scaled (interval or ratio) variables is to be constructed, under the assumption that values of one of the two variables are dependent on the values of the other

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-234

Overview of Techniques for

Examining Associations–Simple

Regression Analysis (Cont’d)

• Illustrative Research Question(s) this Technique Can Answer:

– Are sales (measured in dollars) significantly affected by advertising expenditures (measured in dollars)?

– What proportion of the variation in sales is accounted for by variation in advertising expenditures? How sensitive are sales to changes in advertising expenditures?


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-235

Overview of Techniques for

Examining Associations (Cont’d)

• Multiple Regression Analysis Technique

• This technique is appropriate when

– The same conditions as for simple regression analysis apply, except that more than two variables are involved, with one variable assumed to be dependent on the others

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-236

Overview of Techniques for

Examining Associations (Cont’d)

• Illustrative Research Question(s) this Technique Can Answer:

– Are sales significantly affected by advertising expenditures and price (where all three variables are measured in dollars)?

– What proportion of the variation in sales is accounted for by advertising and price? How sensitive are sales to changes in advertising and price?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-237

Spearman Correlation Coefficient

A Spearman correlation coefficient is a measure of association between two sets of ranks

$$ r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} $$

di = the difference between the ith sample unit's ranks on the two variables

n = the total sample size

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-238

Scenario: Industrial Marketing Firm

• An industrial marketing firm has been hiring all its salespeople from among the graduates of 10 business schools in the vicinity of its headquarters

• The firm developed a subjective ranking of the perceived prestige levels of the 10 schools and the performance levels of the groups of graduates recruited from these schools

• Question:

– What is the degree of association between the prestige levels of the schools and the sales performance levels of their graduates hired by this company?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-239

Table 14.2 Association Between School Prestige and Performance of Graduates

Business School (i)   Ranking of School's Prestige (SPi)   Ranking of Performance of School's Graduates (GPi)   Difference Between Ranks (di = SPi - GPi)   Squared Difference (di²)
1                     10                                   8                                                    2                                           4
2                     7                                    3                                                    4                                           16
3                     9                                    7                                                    2                                           4
4                     1                                    2                                                    -1                                          1
5                     6                                    9                                                    -3                                          9
6                     2                                    4                                                    -2                                          4
7                     3                                    5                                                    -2                                          4
8                     8                                    10                                                   -2                                          4
9                     5                                    6                                                    -1                                          1
10                    4                                    1                                                    3                                           9

Σdi² = 56

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-240

$$ r_s = 1 - \frac{(6)(56)}{10(100 - 1)} = .661 $$

Hypotheses

H0: ρs = 0

Ha: ρs ≠ 0

Spearman Correlation Co-efficient
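The rank-difference formula can be applied directly to the two columns of ranks in Table 14.2. A minimal sketch, with the hypothetical helper `spearman_rs`; it assumes, as in this table, that there are no tied ranks.

```python
def spearman_rs(ranks_a, ranks_b):
    """Spearman rank correlation from two lists of ranks (no ties assumed)."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1)), sum_d2


# Ranks for the 10 business schools in Table 14.2.
school_prestige = [10, 7, 9, 1, 6, 2, 3, 8, 5, 4]
graduate_performance = [8, 3, 7, 2, 9, 4, 5, 10, 6, 1]

rs, sum_d2 = spearman_rs(school_prestige, graduate_performance)
print(f"sum of squared differences = {sum_d2}, rs = {rs:.3f}")
```

Identical rankings would give a sum of squared differences of zero and therefore rs = 1; exactly reversed rankings would give rs = -1.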


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-241

$$ t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}} = 2.49 $$

t-Distribution

• For α = .05, the critical values of t for 8 degrees of freedom (d.f. = n - 2 = 10 - 2 = 8) are tc = +2.31 and -2.31

• Decision Rule:

– “Reject H0 if t ≥ 2.31 or if t ≤ -2.31.”

– Since t > 2.31, we reject H0 and conclude that there is a true association between the prestige of business schools and the job performance of their graduates. In other words, the sample correlation of .661 is unlikely to have occurred because of chance.
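The significance test for the Spearman coefficient is a one-line transformation of rs into a t-value. A sketch under stated assumptions: the function name `spearman_t` is hypothetical, and the critical value 2.31 is the tabled value for 8 d.f. quoted in the decision rule.

```python
import math


def spearman_t(rs, n):
    """t-statistic for testing H0: rho_s = 0, with n - 2 degrees of freedom."""
    return rs * math.sqrt((n - 2) / (1 - rs**2))


t = spearman_t(0.661, 10)

# Two-tailed decision rule at alpha = .05 with 8 d.f.
reject_h0 = abs(t) >= 2.31
print(f"t = {t:.2f}, reject H0: {reject_h0}")
```

With only 10 schools the test is not very sensitive: a correlation below roughly .63 would have failed to reach the 2.31 threshold.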

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-242

Pearson Correlation Coefficient

The Pearson correlation coefficient is the degree of association between variables that are interval- or ratio-scaled.

The Pearson correlation coefficient (rxy) between them is given by

$$ r_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{(n - 1)\, s_x s_y} $$

n = sample size (total number of data points)

X̄ and Ȳ = means

Xi and Yi = values for any sample unit i

sx and sy = standard deviations

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-243

Bright Detergent Data

Market Area   Dollar Sales of Bright (in Thousands)   Advertising Expenditure for Bright ($ in 100)   Number of Competing Detergents
1             5                                       5                                               15
2             10                                      13                                              8
3             6                                       5                                               14
4             20                                      15                                              5
5             15                                      10                                              9
6             9                                       9                                               10
7             11                                      5                                               12
8             18                                      13                                              4
9             22                                      17                                              6
10            7                                       6                                               13
11            24                                      19                                              2
12            14                                      12                                              8
13            16                                      15                                              6
14            17                                      14                                              7
15            23                                      18                                              1
16            8                                       7                                               11
17            12                                      10                                              10
18            13                                      12                                              7
19            21                                      16                                              7
20            9                                       16                                              3

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-244

Scatter Diagram

• Plot in a two-dimensional graph

• Indicates how closely and in what fashion

the variables are associated

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-245

Exhibit 14.1 Scatter Diagram of Sales and Advertising Data

[Scatter diagram: Dollar Sales of Bright (Thousands), 0 to 30, plotted against Advertising Expenditures for Bright ($), 400 to 2000]

What is the relationship between dollar sales and advertising expenditure?

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-246

Exhibit 14.2 Scatter Diagram of Sales and Number of Competing Brands

[Scatter diagram: Dollar Sales of Bright (Thousands), 0 to 30, plotted against Number of Competing Detergents, 0 to 16]

What is the relationship between dollar sales and number of competing detergents?


Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-247

Pearson Correlation

• Correlation between sales and advertising is .927

• Correlation between sales and number of competing brands is -.910
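The Pearson formula can be applied to the Bright detergent data with the standard library alone. The sketch below uses a hypothetical helper, `pearson_r`. One caveat is treated as an explicit assumption rather than a fact from the source: computed on the table exactly as printed, the sales and advertising correlation comes out near .84, not .927. Reading market 20's sales as 19 instead of 9 reproduces .927 and gives about -.909 for competing detergents, so the printed value may be a transcription slip.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-deviations over (n - 1) * sx * sy."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * stdev(xs) * stdev(ys))


# Data for the 20 market areas, as printed in the table.
sales_printed = [5, 10, 6, 20, 15, 9, 11, 18, 22, 7,
                 24, 14, 16, 17, 23, 8, 12, 13, 21, 9]
advertising = [5, 13, 5, 15, 10, 9, 5, 13, 17, 6,
               19, 12, 15, 14, 18, 7, 10, 12, 16, 16]
competitors = [15, 8, 14, 5, 9, 10, 12, 4, 6, 13,
               2, 8, 6, 7, 1, 11, 10, 7, 7, 3]

# ASSUMPTION (not from the source): market 20's sales read as 19, not 9.
sales_assumed = sales_printed[:-1] + [19]

r_adv = pearson_r(sales_assumed, advertising)
r_comp = pearson_r(sales_assumed, competitors)
print(f"sales vs advertising: {r_adv:.3f}")
print(f"sales vs competing detergents: {r_comp:.3f}")
```

Under either reading of the data, the correlation between sales and the number of competing detergents is negative: markets with more competitors show lower sales of Bright.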

Copyright © by Houghton Mifflin Company, Inc. All rights reserved 14-248

Two-Tailed Hypothesis Test For

Correlations

• H0: ρ = 0;

• Ha: ρ ≠ 0,

• For α = .05, 19 degrees of freedom (d.f. = n - 1 = 19), rc = +.433 and rc = -.433

• Decision rule is: “Reject H0 if r ≥ .433 or if r ≤ -.433.”

• Reject H0 in both cases


Exhibit 14.3 Scatter Diagram Showing a Nonlinear Association Between Variables

[Scatter plot. Horizontal axis: X, 0 to 14. Vertical axis: Y, 20 to 70.]

National Insurance Company– Computing

Pearson Correlation Among Service Quality

Constructs

• National Insurance Company was interested in the

correlations between respondents’ overall service-

quality perceptions (on the 10-point scale) and

their average ratings along each of the five

dimensions of Service Quality


National Insurance Company– Computing

Pearson Correlation Among Service Quality

Constructs (Cont’d)

1. Click ANALYZE

2. Select CORRELATE

3. Select BIVARIATE

4. Move “oq, reliable, empathy, tangible,

response, and assure” to VARIABLES box

5. Click OK

National Insurance Company– Computing Pearson Correlation Among Service Quality Constructs Using SPSS

Interpreting Pearson Correlation Coefficients

• Each of the five service-quality measures (reliability, empathy, tangibles, responsiveness, and assurance) is significantly related to the overall quality (OQ) at the .001 level of significance

• Responsiveness has the strongest correlation (.8625)

• Tangibles have the weakest correlation (.5038)

• All the correlations are strong enough to be meaningful


Simple Regression Analysis

• Generates a mathematical relationship

(called the regression equation) between

one variable designated as the dependent

variable (Y) and another designated as the

independent variable (X)


Independent Variable Vs.

Dependent Variable

• Independent variable

– Explanatory or predictor variable

– Often presumed to be a cause of the other

• Dependent variable

– Criterion Variable

– Influenced by the independent variable


Scenario: Curtis Construction

Industry Lobbyist

• Curtis, a construction industry lobbyist, is in an area of the country that has a high unemployment rate and a number of economically depressed construction projects

• His current charge is to convince local government officials to vote in favor of several tax concessions for the construction industry

• He is wondering whether he can generate any concrete evidence to show that increased construction activity (presumably spurred by the proposed tax concessions) would greatly benefit the state


Scenario: Curtis Construction Industry Lobbyist (Cont’d)

• Possible Dependent Variable

– Number of people unemployed or the unemployment rate

– Data on this variable may be gathered from a sample of areas from around the country

• Possible Independent Variable

– Number of construction permits issued or number of ongoing construction projects

– Data on this variable should be gathered from the same sample


Scenario: Carol, Chief Librarian

• Carol, chief librarian in a major university,

is eager to increase the number of students

borrowing books from the library as well as

the number of books borrowed per student

• She needs some persuasive evidence to

show how increased borrowing of books

might benefit students


Scenario: Carol, Chief Librarian (Cont’d)

• Possible Dependent Variable

– Cumulative grade point ratio

– Data on this variable should be gathered for a sample of students who have borrowed books in the past

• Possible Independent Variable

– Number of books borrowed

– Assuming that the library has records of the books borrowed by students, data on this variable can be obtained from those records for the same sample of students


Scenario: Jack, Trade Show Officer

• Jack, an officer in an association in charge

of putting together and promoting industrial

trade shows, is wondering about the impact

of the number of exhibitors in a trade show

on trade show attendance


Scenario: Jack, Trade Show

Officer (Cont’d)

• Possible Dependent Variable

– Number of people visiting a trade show

– Data on this variable can be obtained for a representative sample of trade shows from the association’s past records

• Possible Independent Variable

– Number of exhibitors in a trade show

– Necessary data can be obtained from the past records


Deriving a Regression Equation

• Y = a + bX, where a (the intercept) and b (the slope) are constants

• Y -> Dependent Variable

• X -> Independent Variable


Exhibit 14.4 Several Subjectively Constructed Regression Lines

[Scatter plot with several hand-drawn candidate lines. Horizontal axis: Advertising Expenditures for Bright ($), 400 to 2,000. Vertical axis: Dollar Sales of Bright (Thousands), 0 to 30.]

Regression Using SPSS--Sales

and Advertising Data

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “Dollar Sales for Bright” to DEPENDENT box

5. Move “advertising expenditures for Bright” to INDEPENDENT(S) box

6. Click OK

Exhibit 14.5 SPSS Computer Output for Simple Regression Analysis of Sales and Advertising Data

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .927a   .860       .852                2.28

a. Predictors: (Constant), Advertising Expenditures for Bright ($)

Exhibit 14.5 SPSS Computer Output for Simple Regression Analysis of Sales and Advertising Data (Cont’d)

ANOVAb

Model 1      Sum of Squares   df   Mean Square   F         Sig.
Regression   571.646          1    571.646       110.221   .000a
Residual     93.354           18   5.186
Total        665.000          19

a. Predictors: (Constant), Advertising Expenditures for Bright ($)
b. Dependent Variable: Dollar Sales of Bright (Thousands)

F is greater than the critical value and the p-value is below 0.05, so we can infer that the R2-value of .860 is statistically significant; it is unlikely to have occurred by chance
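The R2 and F values follow directly from the sums of squares in the ANOVA table. A short sketch of that arithmetic, using only the figures printed in the output (variable names are illustrative):

```python
# Figures taken from the ANOVA table for the sales/advertising regression.
ss_regression = 571.646
ss_residual = 93.354
df_regression = 1
df_residual = 18

ss_total = ss_regression + ss_residual          # total variation in sales
r_squared = ss_regression / ss_total            # share of variation explained
ms_regression = ss_regression / df_regression   # regression mean square
ms_residual = ss_residual / df_residual         # error mean square
f_value = ms_regression / ms_residual           # F statistic

print(round(r_squared, 3), round(ms_residual, 3), round(f_value, 2))
```

With one predictor, this F is the square of the slope's t value, so the F test and the t test on the slope lead to the same conclusion.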


Exhibit 14.5 SPSS Computer Output for Simple Regression Analysis of Sales and Advertising Data (Cont’d)

Coefficientsa

Model 1                                          B       Std. Error   Beta   t        Sig.
(Constant)                                       .163    1.457               .112     .912
Advertising Expenditures for Bright ($ in 100)   1.210   .115         .927   10.499   .000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Dollar Sales of Bright ($ in Thousands)

For advertising, the t value (10.499) is greater than 2.10 and the p-value is below 0.05, so the null hypothesis is rejected; that is, the slope coefficient is statistically significant

a = .163

b = 1.210

The regression equation is

Yi = .163 + 1.210 Xi
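The constants a and b come from the least-squares formulas. The sketch below recomputes them from the Bright detergent data in plain Python; `fit_line` is an illustrative name, and sales for market area 20 are entered as 19 (an assumption: that is the value consistent with the SPSS output).

```python
# Bright detergent data, market areas 1-20.
# Assumption: sales for market area 20 are 19 (thousand dollars).
sales = [5, 10, 6, 20, 15, 9, 11, 18, 22, 7, 24, 14, 16, 17, 23, 8, 12, 13, 21, 19]
advertising = [5, 13, 5, 15, 10, 9, 5, 13, 17, 6, 19, 12, 15, 14, 18, 7, 10, 12, 16, 16]


def fit_line(x, y):
    """Least-squares intercept a and slope b for Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx               # slope: change in Y per unit change in X
    a = mean_y - b * mean_x     # intercept: the line passes through the means
    return a, b


a, b = fit_line(advertising, sales)
print(round(a, 3), round(b, 3))
```

Since advertising is measured in hundreds of dollars and sales in thousands, the slope says each extra $100 of advertising goes with roughly $1,210 more in sales within this sample.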


Standard Error

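$$ s_{y/x} = \sqrt{\frac{SSE}{n - k - 1}} $$

where SSE is the error sum of squares, n is the number of observations, and k is the number of independent variables

• The value of the standard error (sy/x) is shown in the computer output as 2.277 (rounded to 2.28 in the Model Summary), which is the square root of the error mean square value of 5.186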
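As a check on this formula, the standard error can be computed from the residuals of the fitted line. This is a sketch in plain Python on the Bright detergent data, with sales for market area 20 entered as 19 (an assumption: the value consistent with the SPSS output); variable names are illustrative.

```python
import math

# Bright detergent data; assumption: sales for market area 20 are 19.
sales = [5, 10, 6, 20, 15, 9, 11, 18, 22, 7, 24, 14, 16, 17, 23, 8, 12, 13, 21, 19]
advertising = [5, 13, 5, 15, 10, 9, 5, 13, 17, 6, 19, 12, 15, 14, 18, 7, 10, 12, 16, 16]

n = len(sales)
k = 1  # one independent variable

# Least-squares fit of sales on advertising.
mean_x = sum(advertising) / n
mean_y = sum(sales) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
     / sum((x - mean_x) ** 2 for x in advertising))
a = mean_y - b * mean_x

# SSE: squared gaps between actual and predicted sales.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(advertising, sales))
std_error = math.sqrt(sse / (n - k - 1))

print(round(sse, 3), round(std_error, 3))
```

The standard error is in the units of the dependent variable, so a typical prediction here misses actual sales by a little over $2,000.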


Practical Applications of

Regression Equations

• The regression coefficient, or slope, can

indicate how sensitive the dependent

variable is to changes in the independent

variable

• The regression equation is a forecasting tool

for predicting the value of the dependent

variable for a given value of the

independent variable
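A forecast from the Bright equation can be sketched as follows. The coefficients are the ones reported in the SPSS output; the function name and the range check are illustrative additions, with the limits 5 and 19 taken as the smallest and largest advertising values in the data.

```python
A, B = 0.163, 1.210    # intercept and slope from the SPSS output
X_MIN, X_MAX = 5, 19   # observed range of advertising ($ in 100)


def predict_sales(advertising_hundreds):
    """Return (predicted sales in $ thousands, True if X is inside the observed range)."""
    predicted = A + B * advertising_hundreds
    inside_range = X_MIN <= advertising_hundreds <= X_MAX
    return predicted, inside_range


print(predict_sales(10))   # $1,000 of advertising, inside the data range
print(predict_sales(30))   # $3,000 of advertising, outside the data range
```

Returning the range flag alongside the forecast is a design choice for this sketch: a prediction at an advertising level never observed in the data rests on the untested assumption that the same line continues to hold.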


Precautions In Using Regression Analysis

• Only capable of capturing linear associations between dependent and independent variables

• A significant R2-value does not necessarily imply a cause-and-effect association between the independent and dependent variables

• A regression equation may not yield a trustworthy prediction of the dependent variable when the value of the independent variable at which the prediction is desired is outside the range of values used in constructing the equation


Precautions In Using Regression

Analysis (Cont’d)

• A regression equation based on relatively

few data points cannot be trusted

• The ranges of data on the dependent and

independent variables can affect the

meaningfulness of a regression equation


Multiple Regression Analysis

• Yi = a + b1X1i + b2X2i + … + bkXki

• Yi is the predicted value of the dependent variable

for some unit i;

• X1i, X2i, …, Xki are values on the independent

variables for unit i;

• b1, b2, . . . , bk are the regression coefficients;

• a is the Y-intercept representing the prediction for

Y when all independent variables are set to zero
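For two independent variables the coefficients can be obtained by solving two normal equations. The sketch below does this in plain Python for the Bright detergent data (advertising and number of competing detergents as predictors of sales); the helper name is illustrative, and sales for market area 20 are entered as 19 (an assumption: the value consistent with the SPSS output).

```python
# Bright detergent data; assumption: sales for market area 20 are 19.
sales = [5, 10, 6, 20, 15, 9, 11, 18, 22, 7, 24, 14, 16, 17, 23, 8, 12, 13, 21, 19]
advertising = [5, 13, 5, 15, 10, 9, 5, 13, 17, 6, 19, 12, 15, 14, 18, 7, 10, 12, 16, 16]
competitors = [15, 8, 14, 5, 9, 10, 12, 4, 6, 13, 2, 8, 6, 7, 1, 11, 10, 7, 7, 3]


def mean(values):
    return sum(values) / len(values)


def centered_cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


s11 = centered_cross(advertising, advertising)
s22 = centered_cross(competitors, competitors)
s12 = centered_cross(advertising, competitors)
s1y = centered_cross(advertising, sales)
s2y = centered_cross(competitors, sales)

# Solve the 2 x 2 normal equations by Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
a = mean(sales) - b1 * mean(advertising) - b2 * mean(competitors)

print(round(a, 3), round(b1, 3), round(b2, 3))
```

The determinant is small relative to the products that form it because the two predictors are strongly correlated, which is one reason the individual coefficients are estimated imprecisely here.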


National Insurance Company–

Multiple Regression Using SPSS

• Jill and Tom were interested in conducting a multiple regression analysis in which overall service quality perception is the dependent variable and the average ratings along the five dimensions are the independent variables

National Insurance Company– Multiple

Regression Using SPSS (Cont’d)

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “OQ” to DEPENDENT

Box

5. Move “reliable, empathy,

tangible, response, and assure”

to INDEPENDENT(S) box

6. Click OK


National Insurance Company– Multiple Regression Using SPSS (Cont’d)

The R-square of .810 indicates a strong relationship between these variables and overall quality.

All variables except empathy are significantly related to overall service quality (as indicated by the t-test of significance in the far right column)

Bright Detergent Case – Multiple

Regression Using SPSS

1. Click ANALYZE

2. Select REGRESSION

3. Click LINEAR

4. Move “Dollar Sales for Bright” to DEPENDENT Box

5. Move “advertising expenditures for Bright and Number of

competing Brands” to INDEPENDENT(S) box

6. Click OK.


Bright Detergent Case – Multiple Regression Using SPSS (Cont’d)

Model Summary

Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .934a   .873       .858                2.23

a. Predictors: (Constant), Number of Competing Detergents, Advertising Expenditures for Bright ($ in 100)

ANOVAb

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   580.373          2    290.187       58.293   .000a
Residual     84.627           17   4.978
Total        665.000          19

a. Predictors: (Constant), Number of Competing Detergents, Advertising Expenditures for Bright ($ in 100)
b. Dependent Variable: Dollar Sales of Bright ($ in Thousands)

Bright Detergent Case – Multiple Regression Using SPSS (Cont’d)

Coefficientsa

Model 1                                          B       Std. Error   Beta    t        Sig.
(Constant)                                       8.854   6.717                1.318    .205
Advertising Expenditures for Bright ($ in 100)   .808    .324         .619    2.496    .023
Number of Competing Detergents                   -.498   .376         -.328   -1.324   .203

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Dollar Sales of Bright ($ in Thousands)

Multicollinearity

• Multicollinearity exists when independent

variables in a multiple regression equation

are highly correlated among themselves


Bright Detergent Case– Multicollinearity

Correlations (N = 20 for every pair)

                                                                       Sales     Advertising   Competitors
Dollar Sales of Bright ($ in Thousands)          Pearson Correlation   1.000     .927**        -.909**
                                                 Sig. (2-tailed)       .         .000          .000
Advertising Expenditures for Bright ($ in 100)   Pearson Correlation   .927**    1.000         -.937**
                                                 Sig. (2-tailed)       .000      .             .000
Number of Competing Detergents                   Pearson Correlation   -.909**   -.937**       1.000
                                                 Sig. (2-tailed)       .000      .000          .

**. Correlation is significant at the 0.01 level (2-tailed).

Very high correlation between the independent variables (-.937): presence of multicollinearity

First Edition

Chapter 15 Overview of Other Multivariate Techniques

Chapter Objectives

• Distinguish between dependence and interdependence techniques

• Interpret interaction effect in a factorial ANOVA

• Identify two key purposes of discriminant analysis

• Discuss factor analysis and interpret a factor-loading matrix


Chapter Objectives (Cont’d)

• Distinguish between cluster analysis and

discriminant analysis

• Describe the potential uses of

multidimensional scaling and point out its

key limitations

• State the purpose of conjoint analysis and

use the results from such an analysis


Dependence and Interdependence

Techniques

• Dependence technique

– One variable is designated as the dependent variable and the rest are treated as independent variables

• Interdependence technique

– There are no dependent and independent variable designations; all variables are treated equally in a search for underlying patterns of relationships


Dependence Technique–

Regression Analysis

• Input Data

– Dependent variable(s) - metric

– Independent variable(s)- metric

• Primary Purpose of the Technique

– Ascertain the relative importance of independent variable(s) in explaining variation in the dependent variable

– Predict dependent variable values for given values of the independent variable(s)

Overview of Multivariate

Techniques

• Analysis of Variance (ANOVA) Technique

• Usual Form of the Input Data

– Dependent variable: metric; independent variable(s): nonmetric

• Primary Purpose of the Technique

– See whether different levels (treatments) of

independent variable(s) have significantly

different impacts on the dependent variable


Overview of Multivariate

Techniques (Cont’d)

• Discriminant Analysis Technique

• Usual Form of the Input Data

– Dependent variable: nonmetric; independent variable(s): metric

• Primary Purpose of the Technique

– To identify independent variables that are critical in distinguishing between subsamples defined by the dependent-variable categories; also to aid in classifying new units into one of the subsample categories

Overview of Multivariate

Techniques (Cont’d)

• Factor Analysis Technique

• Usual Form of the Input Data

– Metric

• Primary Purpose of the Technique

– To reduce data on a large number of variables into a

relatively small set of factors

– To identify key constructs underlying the original set of measured variables

Overview of Multivariate

Techniques (Cont’d)

• Cluster Analysis Technique

• Usual Form of the Input Data

– Metric

• Primary Purpose of the Technique

– To identify natural clusters of objects on the

basis of similarities of the objects on a variety

of characteristics


Overview of Multivariate

Techniques (Cont’d)

• Multidimensional Scaling Technique

• Usual Form of the Input Data

– Nonmetric (similarity ranks based on comparison of

actual objects)

• Primary Purpose of the Technique

– To identify key dimensions underlying respondent

evaluations of products, brands, stores, etc.

– To determine the relative positions of the objects in

multidimensional space


Overview of Multivariate Techniques (Cont’d)

• Conjoint Analysis Technique

• Usual Form of the Input Data

– Nonmetric

• Primary Purpose of the Technique

– To derive utility values that respondents implicitly assign to various levels of key attributes used in evaluating objects

– The utility values themselves aid in ascertaining the relative importance of the attributes as well as the potential attractiveness of descriptive profiles defined by different combinations of attributes

Analysis of Variance

• ANOVA is appropriate in situations where

the independent variable is set at certain

specific levels (called treatments in an

ANOVA context) and metric measurements

of the dependent variable are obtained at

each of those levels


Example

24 stores chosen randomly for the study; 8 stores randomly chosen for each treatment

• Treatment 1: Store brand sold at the regular price

• Treatment 2: Store brand sold at 50¢ off the regular price

• Treatment 3: Store brand sold at 75¢ off the regular price

Monitor sales of the store brand for a week in each store

Table 15.2 Unit Sales Data Under Three Pricing Treatments

Treatment                  Regular Price   50¢ off   75¢ off
Unit sales in each store   37              46        46
                           38              43        49
                           40              43        48
                           40              45        48
                           38              45        47
                           38              43        48
                           40              44        49
                           39              44        49
Number of stores           8               8         8
Mean sales                 38.75           44.13     48.00

After Only Design

EG1 (R)   X1   O1
EG2 (R)   X2   O2
EG3 (R)   X3   O3

EG1, EG2, EG3 -- Experiment Groups 1, 2, and 3

X1 -- Regular Price; X2 -- 50¢ off; X3 -- 75¢ off

O1, O2, O3 -- Observation (monitoring unit sales data in each store)

ANOVA – Grocery Store Hypothesis

• Grocery Store Example

– $H_0: \mu_1 = \mu_2 = \mu_3$

– $H_a$: At least one $\mu$ is different from one or more of the others

• Hypotheses for K Treatment groups or samples

– $H_0: \mu_1 = \mu_2 = \dots = \mu_k$

– $H_a$: At least one $\mu$ is different from one or more of the others

Exhibit 15.1 SPSS Computer Output for ANOVA Analysis

Between-Subjects Factors

Treatment group   Value Label     N
1                 Regular price   8
2                 50 cents off    8
3                 75 cents off    8

Exhibit 15.1 SPSS Computer Output for ANOVA Analysis (Cont’d)

Tests of Between-Subjects Effects

Dependent Variable: SALES

Source            Type III Sum of Squares   df   Mean Square   F           Sig.
Corrected Model   345.250a                  2    172.625       137.445     .000
Intercept         45675.375                 1    45675.375     36367.123   .000
TREAT             345.250                   2    172.625       137.445     .000
Error             26.375                    21   1.256
Total             46047.000                 24
Corrected Total   371.625                   23

a. R Squared = .929 (Adjusted R Squared = .922)

If the three treatments had the same mean sales, there would be less than a .001 probability of obtaining an F-value as high as 137.445
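The TREAT and Error rows can be reproduced from the unit sales in Table 15.2. The following is a plain-Python sketch of the one-way ANOVA arithmetic, with illustrative variable names; it is not SPSS's own routine.

```python
# Unit sales per store under each pricing treatment (Table 15.2).
groups = {
    "regular price": [37, 38, 40, 40, 38, 38, 40, 39],
    "50 cents off": [46, 43, 43, 45, 45, 43, 44, 44],
    "75 cents off": [46, 49, 48, 48, 47, 48, 49, 49],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

# Between-treatment variation: how far each treatment mean is from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())

# Within-treatment variation: how far each store is from its own treatment mean.
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_value = (ss_between / df_between) / (ss_within / df_within)

print(ss_between, ss_within, round(f_value, 3))
```

A large F means the treatment means differ by much more than the store-to-store variation within a treatment would explain.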


Bank Customer Perceptions Study

[Tree diagram: Bank Customers are split by Gender (Male, Female) and, within each gender, by age (< 35 Years, 35-64 Years, > 64 Years). Overall perceptions are measured in each of the six groups.]

Bank Customer Perceptions Study (Cont’d)

Tests of Between-Subjects Effects

Dependent Variable: Overall Quality of the Company’s Services

Source            Type III Sum of Squares   df    Mean Square   F           Sig.
Corrected Model   2156.112a                 5     431.222       438.891     .000
Intercept         20665.912                 1     20665.912     21033.424   .000
Gender            382.436                   1     382.436       389.237     .000
Age               1311.623                  2     655.811       667.474     .000
Gender * Age      260.433                   2     130.216       132.532     .000
Error             459.823                   468   .983
Total             24341.000                 474
Corrected Total   2615.935                  473

a. R Squared = .824 (Adjusted R Squared = .822)

Bank Customer Perceptions Study (Cont’d)

Descriptive Statistics

Dependent Variable: Overall Quality of the company's services

Gender   Age     Mean   Std. Deviation   N
Male     <35     2.54   1.31             79
         35-64   6.72   1.17             88
         >64     8.08   .82              85
         Total   5.87   2.57             252
Female   <35     6.49   1.39             55
         35-64   6.95   .58              79
         >64     9.36   .48              88
         Total   7.79   1.53             222
Total    <35     4.16   2.36             134
         35-64   6.83   .94              167
         >64     8.73   .93              173
         Total   6.77   2.35             474

Male and female customers differed in their overall perceptions

Customers' perceptions differed according to their ages

Bank Customer Perceptions Study (Cont’d)

[Line chart: Estimated Marginal Means of Overall Quality of the company's services. Horizontal axis: Age (<35, 35-64, >64). Vertical axis: Estimated Marginal Means, 2 to 10. Separate lines for Gender: Male and Female.]

Sex and age interacted in influencing perceptions

Factorial ANOVA

• The Factorial ANOVA is used to analyze data from a factorial design experiment
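An interaction shows up when the effect of one factor changes across the levels of the other. As a descriptive sketch only (the formal test is the Gender * Age F value), the cell means reported for the Bank Customer Perceptions Study can be compared as follows; the variable names are illustrative.

```python
# Mean overall-quality ratings by gender and age group, as reported
# in the descriptive statistics of the bank study.
cell_means = {
    "Male": {"<35": 2.54, "35-64": 6.72, ">64": 8.08},
    "Female": {"<35": 6.49, "35-64": 6.95, ">64": 9.36},
}

# Female-minus-male gap within each age group.
gaps = {
    age: round(cell_means["Female"][age] - cell_means["Male"][age], 2)
    for age in cell_means["Male"]
}

# With no interaction the gap would be about the same in every age group,
# and the two lines in a plot of the means would be roughly parallel.
gap_spread = round(max(gaps.values()) - min(gaps.values()), 2)

print(gaps, gap_spread)
```

The gender gap is large for customers under 35 and nearly vanishes for those aged 35 to 64, which is the pattern an interaction effect describes.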


Exhibit 15.2 Illustrations of Main and Interaction Effects: Grocery Store Experiment

[Two line charts of Unit Sales against Price (Regular Price, 50¢ off, 75¢ off), each with one line for Display Present and one for Display Absent. Panel (a): Main and Interaction Effects Present. Panel (b): Only Main Effects Present.]

Discriminant Analysis

• Identifies the distinguishing features of

prespecified subgroups of units that are formed on

the basis of some dependent variable

• Examples of Subgroups

– Heavy, moderate, and light users of a product

– Homeowners and renters

– Viewers and nonviewers of a television program


Discriminant Analysis (Cont’d)

• Dependent Variable

– Categorical: as many categories as there are subgroups

• Heavy, moderate, and light users: 3 categories

• Independent Variable

– Metric-scaled

• Purpose of discriminant analysis is to classify new

units into one of the subgroups given the new

units’ values of the independent variable


Example: Computer Manufacturer

• Independent variables: household income and number of years of formal education

• Groups: PC ownership and not owning a PC

Exhibit 15.3 Scatter Plot of Income and Education Data for Personal Computer Owners and Nonowners

[Scatter plot with separate markers for Owners and Non Owners. Labelled axis: Income ($).]

Using the Discriminant Function

• Y = v1X1 + v2X2

– Discriminant weights v1 and v2 can be interpreted as signifying the relative importance of X1 and X2 in being able to discriminate between the two groups

• Ynew = v1X1,new + v2X2, new

– The program assigns each new unit either to the owner group or to the non-owner group based on the criterion value
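The classification step can be sketched as a weighted sum compared against a cutoff. Everything numeric below is hypothetical: the source does not report the discriminant weights or the criterion value, so `V1`, `V2`, and `CUTOFF` are made-up figures chosen only to illustrate the mechanics.

```python
# Hypothetical discriminant weights (not from the source):
V1 = 0.00005   # weight on household income in dollars
V2 = 0.25      # weight on years of formal education
CUTOFF = 5.5   # hypothetical criterion value separating the two groups


def discriminant_score(income, education_years):
    """Y = v1*X1 + v2*X2 for one household."""
    return V1 * income + V2 * education_years


def classify(income, education_years):
    """Assign a new household to a group by comparing its score with the cutoff."""
    score = discriminant_score(income, education_years)
    return "owner" if score >= CUTOFF else "non-owner"


print(classify(60000, 16))   # higher income and education
print(classify(25000, 10))   # lower income and education
```

In practice the weights and cutoff are estimated by the software from households whose group is already known, and only then applied to new households.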


Evaluating a Discriminant

Function

• Confusion Matrix

– Indicates the degree of correspondence, or lack

thereof, between the actual groupings of the

sample units and the predicted groupings

obtained by classifying the same units through

the discriminant function


Table 15.3 Confusion Matrix

                                        Predicted Groupings
Actual Groupings                        Households with      Households without
                                        Personal Computers   Personal Computers
Households with personal computers      17                   3
Households without personal computers   4                    16
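One common summary of a confusion matrix is the share of units classified correctly. A small sketch using the counts in Table 15.3 (the dictionary layout and names are illustrative):

```python
# Counts from Table 15.3, keyed by (actual group, predicted group).
confusion = {
    ("owner", "owner"): 17,
    ("owner", "non-owner"): 3,
    ("non-owner", "owner"): 4,
    ("non-owner", "non-owner"): 16,
}

total = sum(confusion.values())
correct = sum(count for (actual, predicted), count in confusion.items()
              if actual == predicted)
hit_rate = correct / total

print(correct, total, hit_rate)
```

Because the same households were used both to build the function and to test it, a hit rate computed this way tends to flatter the function somewhat.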


Usefulness of Discriminant

Analysis

• Discriminant analysis is very useful for

– Defining customer segments

– Identifying critical characteristics capable of

distinguishing among them

– Classifying prospective customers into

appropriate segments


Factor Analysis

• A data and variable reduction technique that

attempts to partition a given set of variables

into groups of maximally correlated

variables


Intuitive Explanation

• Consider two statements from the Star

Brand Inc.(SBI) survey

• S1. “I have been satisfied with the Star

products I have purchased”

• S2. “When I have to purchase a home

appliance in the future, it will likely be a

Star product”


Exhibit 15.6 S1 and S2 Highly Correlated:

Factor Analysis Will Be Beneficial

S1 and S2 can be

combined into one

factor.


Exhibit 15.7 Situation Where Factor Analysis

Will Not Be Beneficial: S1 and S2 Poorly

Correlated

S1 and S2 cannot

be combined

into one factor.


Factor Analysis Output and Its

Interpretation

• Primary output of factor analysis is a factor-

loading matrix


Table 15.4 Factor-Loading Matrix Based on Data from Study of Star Customers

                                                                           Factor Loadings   Achieved
Variable                                                                   F1      F2        Communalities
X4: My friends are very impressed with the Star VCR                        0.96    0.06      .926
X6: No other brand of VCR even comes close to matching the Star VCR        0.92    0.17      .875
X1: I did not mind paying the high price for my Star VCR                   0.89    0.15      .815
X3: I hardly ever worry about anything going wrong with my Star VCR        0.18    0.94      .916
X5: The Star VCR has the latest technology built into it                   0.09    0.88      .782
X2: I am pleased with the variety of things that a Star VCR can do         0.16    0.86      .766
Eigenvalues: standardized variance explained by each factor                2.626   2.454
Proportion of the total variance explained by each factor                  0.438   0.409

3 variables (X4, X6, X1) load high on factor 1

3 variables (X3, X5, X2) load high on factor 2
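The last column and the bottom rows of the matrix can be derived from the loadings themselves. The sketch below assumes the factors are uncorrelated, in which case a variable's communality is the sum of its squared loadings and a factor's eigenvalue is the sum of the squared loadings in its column; names are illustrative.

```python
# Factor loadings (F1, F2) from Table 15.4.
loadings = {
    "X4": (0.96, 0.06),
    "X6": (0.92, 0.17),
    "X1": (0.89, 0.15),
    "X3": (0.18, 0.94),
    "X5": (0.09, 0.88),
    "X2": (0.16, 0.86),
}

# Communality: share of a variable's variance captured by the two factors together.
communalities = {var: f1 ** 2 + f2 ** 2 for var, (f1, f2) in loadings.items()}

# Eigenvalue: variance accounted for by one factor across all six variables.
eigenvalues = [sum(pair[j] ** 2 for pair in loadings.values()) for j in range(2)]

# Each standardized variable contributes one unit of variance, so the total is 6.
proportions = [ev / len(loadings) for ev in eigenvalues]

print({v: round(c, 3) for v, c in communalities.items()})
print([round(ev, 3) for ev in eigenvalues], [round(p, 3) for p in proportions])
```

Small differences from the printed table in the third decimal place come from working with loadings that were already rounded to two decimals.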


Reducing Star Data

• X1, X4, and X6 can be combined into one

factor

• X2, X3, and X5 can be combined into a second factor

• 6 variables can be reduced to two factors

Potential Applications of Factor

Analysis

• Used to

– Develop concise but comprehensive, multiple-item scales for measuring various marketing constructs

– Illuminate the nature of distinct dimensions underlying an existing data set

– Convert a large volume of data into a set of factor scores on a limited number of uncorrelated factors


Cluster Analysis

• Segment objects into groups so that

members within each group are similar to

one another in a variety of ways

• Useful for segmenting customers, market

areas, and products


Use of Cluster Analysis

• Firm offering recreational services wanted to enter a new region of the country

• They gathered data on more than 100 characteristics including

– Demographics

– Expenditures on recreation

– Leisure time activities

– Interests of household members

• The firm identified one or several household segments that are likely to be most responsive to its advertising and to its services


How Does Cluster Analysis Work?

• Cluster analysis measures the similarity between objects on the basis of their values on the various characteristics
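
The slide does not say which similarity measure is used. One common choice is the Euclidean distance between objects, where a smaller distance means greater similarity. The sketch below assumes that measure and uses made-up households scored on two characteristics; the data and function names are hypothetical.

```python
import math

# Hypothetical households scored on two characteristics (0-10 scales).
# These values are illustrative only; they do not come from the slides.
households = {
    "A": (9.0, 8.0),
    "B": (8.5, 7.5),
    "C": (2.0, 1.5),
    "D": (1.5, 2.5),
}


def distance(p, q):
    """Euclidean distance: smaller means the two objects are more similar."""
    return math.dist(p, q)


def nearest_neighbor(name):
    """Return the other household that is most similar to `name`."""
    others = (k for k in households if k != name)
    return min(others, key=lambda k: distance(households[name], households[k]))


for name in households:
    print(name, "is most similar to", nearest_neighbor(name))
```

Households that are each other's nearest neighbors (A with B, C with D) are natural candidates for the same cluster. In practice the characteristics are usually standardized first so that no single scale dominates the distance.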


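Exhibit 15.8 Clusters Formed by Using Data on Two Characteristics

[Figure: objects plotted on two characteristics, each running from Low to High. One axis is the extent of participation in outdoor sporting events; the label of the other axis is not recoverable.]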


Multidimensional Scaling

• Uncovers key dimensions underlying customers' evaluations from a series of similarity and/or preference judgments provided by customers about products or brands within a given set


Multidimensional Scaling on SUVs

• A customer is asked to compare pairs of SUVs and rank the pairs from most similar to least similar


Table 15.5 Similarity Rankings of Six 2001 SUVs

          LX 470    Lrover    MBenz     Acura     Infiniti  BMW
LX 470              15        14        12        11        13
Lrover                        1         4         7         2
MBenz                                   5         8         3
Acura                                             10        6
Infiniti                                                    9

Note: Numbers are ranks indicating perceived similarities between pairs of SUVs; the smaller the number, the more similar the pair of SUVs is.
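
A multidimensional scaling routine takes the 15 pairwise ranks as its input. The sketch below only prepares that input and does not perform the scaling itself: it stores the upper triangle of Table 15.5 so that any pair can be looked up in either order, then picks out the most and least similar pairs. The variable and function names are illustrative.

```python
suvs = ["LX 470", "Lrover", "MBenz", "Acura", "Infiniti", "BMW"]

# Upper triangle of Table 15.5, row by row (rank 1 = most similar pair).
upper = [
    [15, 14, 12, 11, 13],  # LX 470 vs Lrover, MBenz, Acura, Infiniti, BMW
    [1, 4, 7, 2],          # Lrover vs MBenz, Acura, Infiniti, BMW
    [5, 8, 3],             # MBenz vs Acura, Infiniti, BMW
    [10, 6],               # Acura vs Infiniti, BMW
    [9],                   # Infiniti vs BMW
]

# Key each rank by an unordered pair so lookups work in either order.
rank = {}
for i, row in enumerate(upper):
    for offset, value in enumerate(row, start=1):
        rank[frozenset((suvs[i], suvs[i + offset]))] = value


def similarity_rank(a, b):
    """Rank of the pair (a, b); smaller means perceived as more similar."""
    return rank[frozenset((a, b))]


most_similar = min(rank, key=rank.get)
least_similar = max(rank, key=rank.get)

print("most similar:", sorted(most_similar))
print("least similar:", sorted(least_similar))
```

Checking that the ranks are exactly 1 through 15 is a quick way to confirm the table was transcribed without gaps or duplicates.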


Exhibit 15.9 Multidimensional Map of 2001 SUVs Based on Similarity Rankings

• What do these dimensions stand for?

– Maybe Value

– Maybe Quality


Conjoint Analysis

• Technique for deriving the utility values that customers presumably attach to different levels of an object's attributes

• Requires respondents to compare hypothetical products or brands

• The hypothetical stimuli are descriptive profiles formed by systematically combining varying levels of certain key attributes


Personal Computer Study

• To assess the role played by attributes in customer evaluations of personal computers

– Price: 3 levels - $839, $1,039, $1,259

– Processor speed: 2 levels - 800 MHz, 1.1 GHz

– Hard drive capacity: 4 levels - 10 GB, 14 GB, 18 GB, 20 GB


Personal Computer Study (Cont’d)

• 3 levels of price × 2 levels of processor speed × 4 levels of hard drive capacity = 24 different descriptive profiles of personal computers

• Data Collection in Conjoint Analysis

– Two-Factors-at-a-Time Approach

– Full-Profile Approach
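
The count of 24 profiles can be checked by enumerating every combination of attribute levels. This is a small Python sketch; the variable names are illustrative, and the levels are those listed for the personal computer study.

```python
from itertools import product

prices = ["$839", "$1,039", "$1,259"]
processor_speeds = ["800 MHz", "1.1 GHz"]
hard_drives = ["10 GB", "14 GB", "18 GB", "20 GB"]

# Full-profile approach: every combination of all three attributes.
profiles = list(product(prices, processor_speeds, hard_drives))
print(len(profiles), "full profiles")

# Two-factors-at-a-time approach: one grid per pair of attributes,
# for example price by processor speed.
price_speed_grid = list(product(prices, processor_speeds))
print(len(price_speed_grid), "price-by-speed combinations")
```

The full-profile design asks respondents to rank all 24 cards at once, while the two-factors-at-a-time design breaks the task into smaller grids such as the six price-by-speed cells.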


Personal Computer Study: Two-Factors-at-a-Time Approach

                    Price
Processing Speed    $839      $1,039    $1,259
800 MHz             __        __        __
1.1 GHz             __        __        __

Note: Customers are asked to rank the six possible combinations of levels according to their preferences (Most Preferred = 1, Least Preferred = 6).


Personal Computer Study: Full-Profile Approach

PERSONAL COMPUTER - DESKTOP: Price $839; Speed 800 MHz; Hard Drive 10 GB

PERSONAL COMPUTER - DESKTOP: Price $839; Speed 800 MHz; Hard Drive 14 GB

PERSONAL COMPUTER - DESKTOP: Price $839; Speed 800 MHz; Hard Drive 18 GB

PERSONAL COMPUTER - DESKTOP: Price $839; Speed 800 MHz; Hard Drive 20 GB

Note: Customers are asked to rank order their preferences for the 24 different profiles representing all possible combinations of the three attributes.


Exhibit 15.10 Utility Values for Three Personal-Computer Attributes

[Figure: three utility plots, one per attribute. Price levels: $839, $1,039, $1,259. Processor speed levels: 800 MHz, 1.1 GHz. Hard drive capacity levels: 10 GB, 14 GB, 18 GB, 20 GB.]


Relative Importance of the 3 Attributes

• Range for price = 0.8 - 0.3 = 0.5

– Price is the most critical

• Range for hard drive capacity = 0.8 - 0.4 = 0.4

– Hard drive capacity is the next most critical

• Range for processor speed = 0.9 - 0.6 = 0.3

– Processor speed is the least critical
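
The ranking above comes from comparing utility ranges: the wider the gap between an attribute's highest and lowest utility, the more that attribute matters. The sketch below reproduces the three ranges from the extremes quoted on the slide. Dividing each range by the sum of ranges to get a share is a common convention added here for illustration; the slide itself only compares the raw ranges.

```python
# (highest utility, lowest utility) for each attribute, as quoted on the slide.
utility_extremes = {
    "price": (0.8, 0.3),
    "hard drive capacity": (0.8, 0.4),
    "processor speed": (0.9, 0.6),
}

# Range of utility per attribute, rounded to suppress floating-point noise.
ranges = {name: round(high - low, 2)
          for name, (high, low) in utility_extremes.items()}

# Attributes ordered from most to least critical.
ranking = sorted(ranges, key=ranges.get, reverse=True)

# Assumed convention (not on the slide): express each range as a share of the total.
total_range = sum(ranges.values())
importance = {name: r / total_range for name, r in ranges.items()}

for name in ranking:
    print(f"{name}: range {ranges[name]}, share {importance[name]:.0%}")
```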


Potential Attractiveness of Different Personal Computer Configurations

• PC Configuration A

– 800 MHz, 14 GB, $1,039

– Total utility for the personal computer = 0.6 + 0.7 + 0.4 = 1.7

• PC Configuration B

– 1.1 GHz, 18 GB, $1,259

– Total utility for the personal computer = 0.9 + 0.8 + 0.3 = 2.0

• PC Configuration B is more attractive (total utility 2.0 versus 1.7)
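
The comparison above is an additive calculation: a configuration's total utility is the sum of the utility values of its attribute levels. The sketch below includes only the levels whose utilities are quoted on this slide. It assumes Configuration A is priced at the study's middle level of $1,039, and the dictionary layout and function name are illustrative.

```python
# Utility values for the attribute levels quoted on the slide.
# Levels whose utilities are not quoted here are deliberately left out.
part_worths = {
    "processor speed": {"800 MHz": 0.6, "1.1 GHz": 0.9},
    "hard drive capacity": {"14 GB": 0.7, "18 GB": 0.8},
    "price": {"$1,039": 0.4, "$1,259": 0.3},  # assumes A uses the $1,039 level
}

configurations = {
    "A": {"processor speed": "800 MHz", "hard drive capacity": "14 GB", "price": "$1,039"},
    "B": {"processor speed": "1.1 GHz", "hard drive capacity": "18 GB", "price": "$1,259"},
}


def total_utility(config):
    """Additive model: sum the utility of each chosen attribute level."""
    return sum(part_worths[attribute][level] for attribute, level in config.items())


totals = {name: round(total_utility(config), 2)
          for name, config in configurations.items()}
best = max(totals, key=totals.get)

print(totals)
print("more attractive:", best)
```

Configuration B wins despite its higher price because its gains on processor speed and hard drive capacity (0.3 + 0.1) outweigh the utility lost on price (0.1).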


Online (Virtual) Conjoint Analysis Experiments at MIT

Virtual Consumer Initiative: mitsloan.mit.edu

Virtual Consumer Initiative: Ski Resort

[Slide images not reproduced]