1_introduction to statistics_jan-2, 2012 [compatibility mode]


Upload: kiran-mohan

Post on 12-Nov-2014


Statistics for Management

Ramesh Anbanandam

[email protected]

Department of Mechanical Engineering, NIT Calicut, Kerala, India - 673 601.

Dedicated to

Professor S. G. Deshmukh

Objectives of this course…

• Appreciate the role of statistics in various decision making situations

• Summarize data with frequency distributions and graphic presentation.

• Interpret descriptive statistics for central tendency, dispersion and location

• Define and interpret probability. Utilize discrete and continuous probability distributions to determine probabilities in various managerial applications.

• Apply the central limit theorem to determine probabilities of sample means and compute and interpret point and interval estimates.

• Conduct hypothesis tests for means

• Utilize linear regression to estimate and predict variables.

• Understand basic concepts of design of experiments

• Understand the importance of non-parametric tests

Lab/tutorial

• The laboratory work requires prior experience with Excel. There will be quizzes/assignments every week. Lab assignments are to be submitted on the same day. Students will also be required to visit and consult useful web resources.

Mode of Evaluation and Grades

• Grades are based on total points earned from Tests 1 and 2, lab/tutorial/assignments, the mini-project, and the end-semester examination.

Component                                Weight
Test 1                                   15%
Test 2                                   15%
End Semester                             40%
Lab/tutorial/assignments (every week)    10%
Mini-Project                             10%
Surprise quizzes                         10%

Reference

• Meyer PL, Introductory Probability and Statistical Applications, Oxford and IBH Publishers

• Miller IR, Freund JE, Johnson R, Probability and Statistics for Engineers, Prentice-Hall (I) Ltd

• Walpole RE and Myers RH, Probability & Statistics for Engineers and Scientists, Macmillan

• Levin, R. I. and Rubin, D. S., Statistics for Management, Pearson Education

• Levine, David; Stephan, David; Krehbiel, Timothy; and Berenson, Mark, Statistics for Managers using Microsoft Excel, Prentice Hall

Statistics..

• Plays an important role in many facets of human endeavour

• Occurs remarkably frequently in our everyday lives

• It is often incorrectly thought of as just a collection of data, graphs and diagrams

Statistics in Business

• Accounting: auditing and cost estimation

• Economics: regional, national, and international economic performance

• Finance: investments and portfolio management

• Management: human resources, compensation, and quality management

• Management Information Systems (ERP): performance of systems which gather, summarize, and disseminate information to various managerial levels

• Marketing: market analysis and consumer research

• International Business: market and demographic analysis

What is Statistics?

• Science of gathering, analyzing, interpreting, and presenting data

• Branch of mathematics

• Facts and figures

• Measurement taken on a sample

Statistics is the scientific method that enables us to make decisions as responsibly as possible.

Statistics…

• The science of using data to answer research questions

– Formulate a research question(s) (hypothesis)

– Collect data

– Analyze and summarize data

– Draw conclusions to answer research questions

• Statistical Inference

– In the presence of variation

Answers Questions from Everyday Life

• Business: Will a new marketing strategy be profitable?

• Industry: Will a product’s life exceed the warranty period?

• Medicine: Will this year’s flu vaccine reduce the chance of flu?

• Education: Will technology improve learning?

• Government: Will a change in interest rates affect inflation?

Statistics: Science of variability?

• Virtually everything varies

• Variation occurs among individuals

• Variation occurs within any one individual as time passes

Can Statistics Be Trusted?

“There are three kinds of lies: lies, damned lies, and statistics.” -- Mark Twain

“It is easy to lie with statistics. But it is easier to lie without them.” -- Frederick Mosteller

“Figures won’t lie but liars will figure.” -- Charles Grosvenor

Population Versus Sample

• Population: the whole

– a collection of persons, objects, or items under study

– the entire group of individuals in a statistical study that we want information about

• Census: gathering data from the entire population

• Sample: a portion of the whole

– a subset of the population

– a part of the population from which we actually collect information, used to draw conclusions about the whole (statistical inference)

Statistics can be split into two broad categories

1. Descriptive statistics

2. Statistical inference

Descriptive Statistics

• Collect data

– ex. Survey

• Present data

– ex. Tables and graphs

• Characterize data

– ex. Sample mean = $\bar{X} = \frac{\sum X_i}{n}$

Descriptive statistics..

• Encompasses the following:

– Graphical or pictorial display

– Condensation of large masses of data into a form such as tables

– Preparation of summary measures to give a concise description of complex information (e.g. an average figure)

– Exhibition of patterns that may be found in sets of information

Inferential Statistics

• Estimation

– ex. Estimate the population mean weight using the sample mean weight

• Hypothesis testing

– ex. Test the claim that the population mean weight is 120 pounds

Drawing conclusions and/or making decisions concerning a population based on sample results.

Inferential Statistics..

• Especially relates to:

– Determining whether characteristics of a situation are unusual or if they have happened by chance

– Estimating values of numerical quantities and determining the reliability of those estimates

– Using past occurrences to attempt to predict the future

Process of Inferential Statistics

[Diagram: select a random sample from the population (parameter $\mu$); calculate the sample statistic $\bar{x}$; use $\bar{x}$ to estimate $\mu$]

Population vs. Sample

• Population: measures used to describe the population are called parameters

• Sample: measures computed from sample data are called statistics

Parameter vs. Statistic

• Parameter: descriptive measure of the population

– Usually represented by Greek letters

• Statistic: descriptive measure of a sample

– Usually represented by Roman letters

Symbols for Population Parameters

$\mu$ denotes population mean

$\sigma^2$ denotes population variance

$\sigma$ denotes population standard deviation

Symbols for Sample Statistics

$\bar{x}$ denotes sample mean

$S^2$ denotes sample variance

$S$ denotes sample standard deviation
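The difference between sample statistics and population parameters can be illustrated with a minimal Python sketch. The weights below are made-up values, not data from the course, and the sketch assumes the common convention that the sample variance divides by n - 1 while the population variance divides by n.

```python
import statistics

# Hypothetical sample of weights in pounds (illustrative values only).
weights = [118, 122, 120, 125, 115]

# Sample statistics (Roman letters): the variance divides by n - 1.
sample_mean = statistics.mean(weights)
sample_variance = statistics.variance(weights)
sample_std = statistics.stdev(weights)

# If these five values were the entire population, the parameters
# (Greek letters) would divide by n instead.
population_variance = statistics.pvariance(weights)
population_std = statistics.pstdev(weights)

print(f"x-bar = {sample_mean}, S^2 = {sample_variance}, S = {sample_std:.3f}")
print(f"sigma^2 = {population_variance}, sigma = {population_std:.3f}")
```

The same five numbers give a larger variance when treated as a sample than when treated as a whole population, because of the smaller divisor.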


Types of Variables

• Categorical (qualitative) variables have values that can only be placed into categories, such as “yes” and “no.”

• Numerical (quantitative) variables have values that represent quantities.

Types of Variables

• Data

– Categorical (defined categories). Examples: Marital Status, Political Party, Eye Color

– Numerical, Discrete (counted items). Examples: Number of Children, Defects per hour

– Numerical, Continuous (measured characteristics). Examples: Weight, Voltage

Levels of Data Measurement

• Nominal — Lowest level of measurement

• Ordinal

• Interval

• Ratio — Highest level of measurement


Levels of Measurement

• A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical Variable           Categories
Personal Computer Ownership    Yes / No
Type of Stocks Owned           Growth / Value / Other
Internet Provider              Microsoft Network / AOL

Levels of Measurement

• An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical Variable              Ordered Categories
Student class designation         Freshman, Sophomore, Junior, Senior
Product satisfaction              Satisfied, Neutral, Unsatisfied
Faculty rank                      Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings    AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student grades                    A, B, C, D, F

Levels of Measurement

• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point.

• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point.

Interval and Ratio Scales

Usage Potential of Various Levels of Data

[Diagram: Nominal, Ordinal, Interval, Ratio, in increasing order of usage potential]

Data Level, Operations, and Statistical Methods

Data Level   Meaningful Operations                                Statistical Methods
Nominal      Classifying and Counting                             Nonparametric
Ordinal      All of the above plus Ranking                        Nonparametric
Interval     All of the above plus Addition, Subtraction          Parametric
Ratio        All of the above plus Multiplication and Division    Parametric

Data preparation rules

• Data presented must be

– factual

– relevant

Before presentation always check:

• the source of the data

• that the data has been accurately transcribed

• that the figures are relevant to the problem

Methods of visual presentation of data

• Table

         1st Qtr   2nd Qtr   3rd Qtr   4th Qtr
East     20.4      27.4      90        20.4
West     30.6      38.6      34.6      31.6
North    45.9      46.9      45        43.9

Methods of visual presentation of data

• Graphs

[Chart: East, West and North values by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 90]

Methods of visual presentation of data

• Pie chart

[Pie chart: slices for 1st Qtr, 2nd Qtr, 3rd Qtr and 4th Qtr]

Methods of visual presentation of data

• Multiple bar chart

[Bar chart: horizontal bars for East, West and North in each quarter (1st Qtr to 4th Qtr); horizontal axis 0 to 100]

Methods of visual presentation of data

• Simple pictogram

[Pictogram: East, North and West values by quarter (1st Qtr to 4th Qtr); vertical axis 0 to 100]

Frequency distributions

• Frequency tables

Observation Table

Class Interval   Frequency   Cumulative Frequency
< 20             13          13
< 40             18          31
< 60             25          56
< 80             15          71
< 100            9           80

Frequency diagrams

[Two charts: Frequency by class (< 20, < 40, < 60, < 80, < 100); vertical axis 0 to 30]

[Chart: Cumulative Frequency by class (< 20, < 40, < 60, < 80, < 100); vertical axis 0 to 90]

Ungrouped Versus Grouped Data

• Ungrouped data

– have not been summarized in any way

– are also called raw data

• Grouped data

– have been organized into a frequency distribution

Example of Ungrouped Data

Ages of a Sample of Managers from XYZ

42  30  53  50  52  30  55  49  61  74
26  58  40  40  28  36  30  33  31  37
32  37  30  32  23  32  58  43  30  29
34  50  47  31  35  26  64  46  40  43
57  30  49  40  25  50  52  32  60  54

Frequency Distribution of Ages

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1
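As a rough illustration of how the grouped table follows from the raw ages, the sketch below tallies the 50 values into classes of width 10. The variable names are illustrative choices, not part of the course material.

```python
from collections import Counter

# Ages of the 50 managers, copied from the ungrouped-data example.
ages = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

CLASS_WIDTH = 10  # class width used in the frequency distribution

# Map each age to the lower endpoint of its class, e.g. 37 -> 30,
# so that "30-under 40" includes 30 but excludes 40.
counts = Counter((age // CLASS_WIDTH) * CLASS_WIDTH for age in ages)

frequencies = [counts[lower] for lower in sorted(counts)]
for lower in sorted(counts):
    print(f"{lower}-under {lower + CLASS_WIDTH}: {counts[lower]}")
```

Integer division keeps each class closed on the left and open on the right, which matches the "under" wording of the class intervals.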

Data Range

42  30  53  50  52  30  55  49  61  74
26  58  40  40  28  36  30  33  31  37
32  37  30  32  23  32  58  43  30  29
34  50  47  31  35  26  64  46  40  43
57  30  49  40  25  50  52  32  60  54

Smallest = 23, Largest = 74

Range = Largest - Smallest
      = 74 - 23
      = 51
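The range calculation can be checked with a few lines of Python. This is only a sketch using the same 50 ages; the variable names are illustrative.

```python
# Ages of the 50 managers from the ungrouped-data example.
ages = [
    42, 30, 53, 50, 52, 30, 55, 49, 61, 74,
    26, 58, 40, 40, 28, 36, 30, 33, 31, 37,
    32, 37, 30, 32, 23, 32, 58, 43, 30, 29,
    34, 50, 47, 31, 35, 26, 64, 46, 40, 43,
    57, 30, 49, 40, 25, 50, 52, 32, 60, 54,
]

smallest = min(ages)
largest = max(ages)
data_range = largest - smallest  # Range = Largest - Smallest

print(f"Range = {largest} - {smallest} = {data_range}")
```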

Number of Classes and Class Width

• The number of classes should be between 5 and 15.

• Fewer than 5 classes cause excessive summarization.

• More than 15 classes leave too much detail.

• Class Width

– Divide the range by the number of classes for an approximate class width

– Round up to a convenient number

$\text{Approximate Class Width} = \frac{51}{6} = 8.5$

$\text{Class Width} = 10$
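The class-width rule can be sketched in Python as follows. What counts as a "convenient number" is a judgment call; this sketch assumes it means the next multiple of 5, which is an assumption made here and not a rule stated in the slides.

```python
import math

data_range = 51   # largest - smallest = 74 - 23
num_classes = 6   # chosen between 5 and 15

# Approximate class width: range divided by the number of classes.
approximate_width = data_range / num_classes

# Assumption: "convenient" means the next multiple of STEP at or
# above the approximate width.
STEP = 5
class_width = math.ceil(approximate_width / STEP) * STEP

print(f"Approximate class width = {approximate_width}")
print(f"Class width = {class_width}")
```

With a different rounding step (for example 2) the same data would give a different, equally defensible class width.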

Class Midpoint

$\text{Class Midpoint} = \frac{\text{beginning class endpoint} + \text{ending class endpoint}}{2} = \frac{30 + 40}{2} = 35$

$\text{Class Midpoint} = \text{class beginning point} + \frac{1}{2}(\text{class width}) = 30 + \frac{1}{2}(10) = 35$
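Both midpoint formulas give the same result, as the short sketch below suggests. The function names are hypothetical and chosen only for illustration.

```python
def midpoint_from_endpoints(begin, end):
    """Average of the beginning and ending class endpoints."""
    return (begin + end) / 2


def midpoint_from_width(begin, width):
    """Class beginning point plus half the class width."""
    return begin + width / 2


# The class "30-under 40" with class width 10.
print(midpoint_from_endpoints(30, 40))
print(midpoint_from_width(30, 10))

# Midpoints for all six age classes, 20-under 30 through 70-under 80.
midpoints = [midpoint_from_width(lower, 10) for lower in range(20, 80, 10)]
print(midpoints)
```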

Relative Frequency

Class Interval   Frequency   Relative Frequency
20-under 30      6           .12   (= 6/50)
30-under 40      18          .36   (= 18/50)
40-under 50      11          .22
50-under 60      11          .22
60-under 70      3           .06
70-under 80      1           .02
Total            50          1.00
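A relative frequency is a class frequency divided by the total number of observations. The following is a minimal sketch of that calculation for the age classes; the variable names are illustrative.

```python
class_labels = [
    "20-under 30", "30-under 40", "40-under 50",
    "50-under 60", "60-under 70", "70-under 80",
]
frequencies = [6, 18, 11, 11, 3, 1]

total = sum(frequencies)

# Relative frequency = class frequency / total frequency.
relative = [freq / total for freq in frequencies]

for label, freq, rel in zip(class_labels, frequencies, relative):
    print(f"{label}: {freq}/{total} = {rel:.2f}")
```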


Cumulative Frequency

Class Interval   Frequency   Cumulative Frequency
20-under 30      6           6
30-under 40      18          24   (= 18 + 6)
40-under 50      11          35   (= 11 + 24)
50-under 60      11          46
60-under 70      3           49
70-under 80      1           50
Total            50
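Cumulative frequencies are running totals of the class frequencies. One way to sketch this in Python is with `itertools.accumulate` from the standard library.

```python
from itertools import accumulate

frequencies = [6, 18, 11, 11, 3, 1]

# Each entry is the sum of all frequencies up to and including
# that class, e.g. 6, then 6 + 18 = 24, then 24 + 11 = 35.
cumulative = list(accumulate(frequencies))

print(cumulative)
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.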

Class Midpoints, Relative Frequencies, and Cumulative Frequencies

Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
20-under 30      6           25         .12                  6
30-under 40      18          35         .36                  24
40-under 50      11          45         .22                  35
50-under 60      11          55         .22                  46
60-under 70      3           65         .06                  49
70-under 80      1           75         .02                  50
Total            50                     1.00

Cumulative Relative Frequencies

Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
20-under 30      6           .12                  6                      .12
30-under 40      18          .36                  24                     .48
40-under 50      11          .22                  35                     .70
50-under 60      11          .22                  46                     .92
60-under 70      3           .06                  49                     .98
70-under 80      1           .02                  50                     1.00
Total            50          1.00
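A cumulative relative frequency is the cumulative frequency divided by the total. The sketch below, with illustrative variable names, derives that column from the class frequencies alone.

```python
from itertools import accumulate

frequencies = [6, 18, 11, 11, 3, 1]
total = sum(frequencies)

# Divide each running total by the overall total.
cumulative_relative = [running / total for running in accumulate(frequencies)]

for value in cumulative_relative:
    print(f"{value:.2f}")
```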

Common Statistical Graphs

• Histogram -- vertical bar chart of frequencies

• Frequency Polygon -- line graph of frequencies

• Ogive -- line graph of cumulative frequencies

• Pie Chart -- proportional representation for categories of a whole

• Stem and Leaf Plot

• Pareto Chart

• Scatter Plot

Histogram

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Histogram: Frequency (0 to 20) against Years (0 to 80)]

Histogram Construction

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Histogram: Frequency (0 to 20) against Years (0 to 80)]
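To give a rough feel for how a histogram is built from a frequency table, the sketch below prints one bar per class using text characters. It draws horizontal bars for simplicity, whereas the histogram on the slide is a vertical bar chart; the layout is an illustrative choice.

```python
class_labels = [
    "20-under 30", "30-under 40", "40-under 50",
    "50-under 60", "60-under 70", "70-under 80",
]
frequencies = [6, 18, 11, 11, 3, 1]

# One row per class: the bar length equals the class frequency.
rows = [
    f"{label} | {'#' * freq} {freq}"
    for label, freq in zip(class_labels, frequencies)
]

print("\n".join(rows))
```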


Frequency Polygon

Class Interval   Frequency
20-under 30      6
30-under 40      18
40-under 50      11
50-under 60      11
60-under 70      3
70-under 80      1

[Frequency polygon: Frequency (0 to 20) against Years (0 to 80)]

Ogive

Class Interval   Cumulative Frequency
20-under 30      6
30-under 40      24
40-under 50      35
50-under 60      46
60-under 70      49
70-under 80      50

[Ogive: Cumulative Frequency (0 to 60) against Years (0 to 80)]

Relative Frequency Ogive

Class Interval   Cumulative Relative Frequency
20-under 30      .12
30-under 40      .48
40-under 50      .70
50-under 60      .92
60-under 70      .98
70-under 80      1.00

[Ogive: Cumulative Relative Frequency (0.00 to 1.00) against Years (0 to 80)]

Complaints by Passengers

COMPLAINT           NUMBER    PROPORTION   DEGREES
Stations, etc.      28,000    .40          144.0
Train Performance   14,700    .21          75.6
Equipment           10,500    .15          54.0
Personnel           9,800     .14          50.4
Schedules, etc.     7,000     .10          36.0
Total               70,000    1.00         360.0
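Each pie slice's angle is its proportion of the total multiplied by 360 degrees. The sketch below recomputes the proportions and degrees from the complaint counts; the dictionary layout is an illustrative choice.

```python
complaints = {
    "Stations, etc.": 28_000,
    "Train Performance": 14_700,
    "Equipment": 10_500,
    "Personnel": 9_800,
    "Schedules, etc.": 7_000,
}

total = sum(complaints.values())

# Proportion = count / total; degrees = proportion * 360.
proportions = {name: count / total for name, count in complaints.items()}
degrees = {name: count * 360 / total for name, count in complaints.items()}

for name in complaints:
    print(f"{name}: {proportions[name]:.2f} -> {degrees[name]:.1f} degrees")
```

The degrees should add up to 360, which makes a convenient check on any hand-computed table.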

Complaints by Passengers

[Pie chart: Stations, etc. 40%; Train Performance 21%; Equipment 15%; Personnel 14%; Schedules, etc. 10%]

Second Quarter Truck Production

Company   2d Quarter Truck Production
A         357,411
B         354,936
C         160,997
D         34,099
E         12,747
Totals    920,190

Second Quarter Truck Production

[Pie chart: A 39%; B 39%; C 17%; D 4%; E 1%]

Pie Chart Calculations for Company A

Company   2d Quarter Truck Production   Proportion   Degrees
A         357,411                       .388         140
B         354,936                       .386         139
C         160,997                       .175         63
D         34,099                        .037         13
E         12,747                        .014         5
Totals    920,190                       1.000        360

$\frac{357{,}411}{920{,}190} = .388$

$.388 \times 360 \approx 140$
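The truck production figures show how rounding enters the calculation: the proportion is rounded to three decimals first, and the degrees are then rounded to a whole number. The sketch below follows that two-step rounding; the variable names are illustrative.

```python
production = {
    "A": 357_411,
    "B": 354_936,
    "C": 160_997,
    "D": 34_099,
    "E": 12_747,
}

total = sum(production.values())

# Step 1: proportion of the total, rounded to three decimals.
proportions = {name: round(units / total, 3) for name, units in production.items()}

# Step 2: rounded proportion times 360, rounded to a whole degree.
degrees = {name: round(p * 360) for name, p in proportions.items()}

for name in production:
    print(f"{name}: {proportions[name]:.3f} x 360 = {degrees[name]}")
```

Rounded degrees do not always sum to exactly 360; for these figures they happen to do so.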

Pareto Chart

[Pareto chart: Frequency (0 to 100) and percentage (0% to 100%) for Poor Wiring, Short in Coil, Defective Plug, and Other]

Scatter Plot

Registered Vehicles (1000's)   Gasoline Sales (1000's of Gallons)
5                              60
15                             120
9                              90
15                             140
7                              60

[Scatter plot: Gasoline Sales (0 to 200) against Registered Vehicles (0 to 20)]

Principles of Excellent Graphs

• The graph should not distort the data.

• The graph should not contain unnecessary adornments (sometimes referred to as chart junk).

• The scale on the vertical axis should begin at zero.

• All axes should be properly labeled.

• The graph should contain a title.

• The simplest possible graph should be used for a given set of data.

Graphical Errors: Chart Junk

Bad Presentation: [Pictorial chart of Minimum Wage: 1960: $1.00; 1970: $1.60; 1980: $3.10; 1990: $3.80]

Good Presentation: [Chart: Minimum Wage ($, 0 to 4) for 1960, 1970, 1980 and 1990]

Graphical Errors: Compressing the Vertical Axis

Good Presentation: [Chart: Quarterly Sales ($) for Q1 to Q4; vertical axis 0 to 50]

Bad Presentation: [Chart: Quarterly Sales ($) for Q1 to Q4; vertical axis 0 to 200]

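Graphical Errors: No Zero Point on the Vertical Axis

Graphing the first six months of sales

Bad Presentation: [Chart: Monthly Sales ($) for J, F, M, A, M, J; vertical axis 36 to 45, with no zero point]

Good Presentations: [Chart: Monthly Sales ($) for J, F, M, A, M, J; vertical axis starting at 0, then 36, 39, 42, 45]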

Thank You

• http://www.stats.gla.ac.uk/steps/glossary/presenting_data.html

• http://www.ilir.uiuc.edu/courses/lir593/