random variable


Upload: yetta-oneill

Post on 01-Jan-2016


Page 1: Random Variable

Random Variable

• Qualitative (categorical): Nominal, Ordinal

• Quantitative (numeric): Ratio, Interval; Continuous, Discrete

Page 2: Random Variable

SUMMARIZING NUMERIC DATA

• Simple Frequency Table

• Grouped Frequency Table

• Histogram

• Frequency Polygon

• Cumulative Frequency Distribution

Page 3: Random Variable

Measures of Central Location

• Arithmetic Mean

• Median

• Mode

Page 4: Random Variable

Mean for grouped data:

Population mean: $\mu = \dfrac{\sum fx}{N}$

Sample mean: $\bar{x} = \dfrac{\sum fx}{n}$

Page 5: Random Variable

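Median for grouped data:

$\text{median} = L + \dfrac{\frac{n}{2} - F}{f_m}\,c$

where L is the lower limit of the median class, F the cumulative frequency of the class immediately preceding it, f_m the frequency of the median class, and c the class width.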

Page 6: Random Variable

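Mode for grouped data:

$\text{mode} = L + \dfrac{f_m - f_1}{2f_m - f_1 - f_2}\,c$

where L is the lower limit of the modal class, f_m its frequency, f_1 and f_2 the frequencies of the classes immediately before and after it, and c the class width.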

Page 7: Random Variable

Measures of Dispersion (Variability)

• Range

• Variance and Standard Deviation

• Coefficient of Variation

• Non-central Locations: Inter-fractile Ranges

Page 8: Random Variable

Standard Deviation

Grouped data: $s = \sqrt{\dfrac{n\sum fx^2 - (\sum fx)^2}{n(n-1)}}$

Ungrouped data: $s = \sqrt{\dfrac{n\sum x^2 - (\sum x)^2}{n(n-1)}}$

Page 9: Random Variable

Coefficient of variation:

$CV = \dfrac{s}{\bar{x}}\,(100\%)$

Page 10: Random Variable

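Empirical Rule:

For a symmetrical, bell-shaped distribution, about 68% of the observations lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three.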

Page 11: Random Variable

The Relative Positions of the Mean, Median, and Mode:

Symmetric Distribution

Zero skewness → Mean = Median = Mode

Page 12: Random Variable

Positively skewed: Mean > Median > Mode

Page 13: Random Variable

Negatively skewed: Mean < Median < Mode

Page 14: Random Variable

Non-Central Location Measures (Fractiles or Quantiles)

• Quartiles

• Sextiles

• Octiles

• Deciles

• Percentiles

Page 15: Random Variable

Calculating Quartiles for Grouped Data

The jth quartile for grouped data is given by:

$Q_j = L + \dfrac{\frac{jn}{4} - F}{f_{Q_j}}\,c$

where

n = sample size

L = lower limit of the jth quartile class

F = less-than cumulative frequency of the immediately preceding class

f_{Q_j} = frequency of the jth quartile class

c = class width

Page 16: Random Variable

Example

A sample of 20 randomly-selected hospitals in the US revealed the following daily charges (in $) for a semiprivate room.

153 159 142 146

141 140 130 148

142 163 134 151

122 167 137 152

143 168 159 141

1.1 Using class intervals of width 10 units, construct a less-than cumulative frequency distribution of the above data. Let 120 units be the lower limit of the smallest class.

1.2 Draw a less-than ogive and use it to estimate the 80th percentile.

1.3 For the grouped data of question 1.1 above, calculate:

1.3.1 The mean, median and mode.

1.3.2 The interquartile range.

1.3.3 The coefficient of variation. Interpret the result obtained.

Page 17: Random Variable

Solution

1.1

Class        Freq, f   <cum freq, F
120 - 130       1           1
130 - 140       3           4
140 - 150       8          12
150 - 160       5          17
160 - 170       3          20
             ∑ = 20

Page 18: Random Variable

1.2

80th percentile = 158

Page 19: Random Variable

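1.3.1

Class        Freq, f   <cum freq, F   midpt, x     fx
120 - 130       1           1           125        125
130 - 140       3           4           135        405
140 - 150       8          12           145       1160
150 - 160       5          17           155        775
160 - 170       3          20           165        495
             ∑ = 20                             ∑ = 2960

Mean: $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{2960}{20} = 148$

Median: $x_{med} = L_{med} + \dfrac{\frac{n}{2} - F}{f_{med}}\,c = 140 + \dfrac{10 - 4}{8}(10) = 147.5$

Mode: $x_{mod} = L_{mod} + \dfrac{f_{mod} - f_1}{2f_{mod} - f_1 - f_2}\,c = 140 + \dfrac{8 - 3}{16 - 3 - 5}(10) \approx 146.3$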

Page 20: Random Variable

1.3.2

Using the less-than cumulative frequencies from 1.1:

$Q_1 = 140 + \dfrac{5 - 4}{8}(10) \approx 141.3$

$Q_3 = 150 + \dfrac{15 - 12}{5}(10) = 156$

$IQR = Q_3 - Q_1 = 156 - 141.3 = 14.7$

Page 21: Random Variable

1.3.3

Class        Midpt, x      fx        fx²
120 - 130      125         125      15625
130 - 140      135         405      54675
140 - 150      145        1160     168200
150 - 160      155         775     120125
160 - 170      165         495      81675
                        ∑ = 2960  ∑ = 440300

$s = \sqrt{\dfrac{\sum fx^2 - (\sum fx)^2/n}{n - 1}} = \sqrt{\dfrac{440300 - 2960^2/20}{19}} \approx 10.8$

CV = standard deviation/mean

→ CV = 10.8/148 ≈ 0.073 ≡ 7.3% → the data are clustered closely around the mean.
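The grouped-data calculations in questions 1.2 and 1.3 can be cross-checked with a short script. This is only a sketch: the helper name `fractile` and the variable names are illustrative, and the mode step assumes the modal class is not the first or last class.

```python
import math

# Grouped data from the example: (lower class limit, frequency); class width c = 10.
classes = [(120, 1), (130, 3), (140, 8), (150, 5), (160, 3)]
c = 10
n = sum(f for _, f in classes)

def fractile(position):
    """Interpolated value below which `position` observations lie (L + c(position - F)/f)."""
    cum = 0  # F: cumulative frequency of the preceding classes
    for lower, f in classes:
        if cum + f >= position:
            return lower + c * (position - cum) / f
        cum += f
    raise ValueError("position exceeds the sample size")

freqs = [f for _, f in classes]
midpoints = [lower + c / 2 for lower, _ in classes]
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))

mean = sum_fx / n
median = fractile(n / 2)
q1, q3 = fractile(n / 4), fractile(3 * n / 4)
iqr = q3 - q1
p80 = fractile(0.8 * n)  # 80th percentile, read off the ogive in the slides

# Mode: interpolate inside the class with the highest frequency
# (assumes that class has a neighbour on both sides).
m = freqs.index(max(freqs))
f_m, f_1, f_2 = freqs[m], freqs[m - 1], freqs[m + 1]
mode = classes[m][0] + c * (f_m - f_1) / (2 * f_m - f_1 - f_2)

s = math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))
cv = 100 * s / mean

print(mean, median, round(mode, 1), round(iqr, 1), p80, round(s, 1), round(cv, 1))
```

Small differences from the slide values can arise because the slides round Q1 before subtracting.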

Page 22: Random Variable

BASIC PROBABILITY CONCEPTS

• Random Experiment

• Sample Space

• Event

• Collectively Exhaustive Events

• Dependent Events

• Independent Events

Page 23: Random Variable

• Marginal Probability

• Joint Probability: P(A∩B) = P(B∩A)

• Conditional Probability: P(A|B) = P(A∩B)/P(B); P(B|A) = P(A∩B)/P(A)

Page 24: Random Variable

Complement Rule:

P(A’) = 1 – P(A) or P(A) = 1 – P(A’)

Page 25: Random Variable

General Multiplication Rule:

P(A and B) = P(A∩B) = P(A)P(B|A) or

P(A and B) = P(A∩B) = P(B)P(A|B)

Special Multiplication Rule (A and B independent):

P(A and B) = P(A)P(B) = P(B)P(A)

Page 26: Random Variable

Special Addition Rule (A and B mutually exclusive):

P(A or B) = P(A) + P(B)

General Addition Rule:

P(A or B) = P(A) + P(B) – P(A and B)

Page 27: Random Variable

Example

A company manufactures a total of 8000 motorcycles a month in three plants A, B and C. Of these, plant A manufactures 4000, and plant B manufactures 3000. At plant A, 85 out of 100 motorcycles are of standard quality or better. At plant B, 65 out of 100 motorcycles are of standard quality or better and at plant C, 60 out of 100 motorcycles are of standard quality or better. The quality controller randomly selects a motorcycle and finds it to be of substandard quality. Calculate the probability that it has come from plant B.

Page 28: Random Variable

Solution

P(B|substd) = No. of substd items from B / Total no. of substd items

No. of substd items from A = 4000 x (100 – 85)/100 = 40 x 15 = 600

No. of substd items from B = 3000 x (100 – 65)/100 = 30 x 35 = 1050

No. of substd items from C = 1000 x (100 – 60)/100 = 10 x 40 = 400

Total number of substd items = 600 + 1050 + 400 = 2050

P(B|substd) = 1050/2050 = 0.512
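The same counting argument can be written as a few lines of Python. This is a minimal sketch; the dictionary names are illustrative, and the percentages are kept as integers so the counts stay exact.

```python
# Monthly output per plant and percentage of standard-or-better quality (from the example).
output = {"A": 4000, "B": 3000, "C": 1000}
standard_pct = {"A": 85, "B": 65, "C": 60}

# Expected number of substandard motorcycles from each plant.
substandard = {p: output[p] * (100 - standard_pct[p]) // 100 for p in output}
total_substandard = sum(substandard.values())

# P(plant | substandard) for every plant.
posterior = {p: substandard[p] / total_substandard for p in output}

print(substandard, total_substandard, round(posterior["B"], 3))
```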

Page 29: Random Variable

PROBABILITY DISTRIBUTIONS

• Properties

• Discrete distributions

• Normal distributions

Page 30: Random Variable

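Binomial Probability Distribution

$P(x) = \dfrac{n!}{x!(n-x)!}\,\pi^x (1-\pi)^{n-x}$

where n is the number of trials and π the probability of success on each trial.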

Page 31: Random Variable

Example

According to a leading newspaper, the largest cellular phone service in the US has about 36 million subscribers out of a total of 180 million cell phone users. If six cell phone users are randomly selected, what is the probability that at least two of them subscribe to this service?

Page 32: Random Variable

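n = 6, $\pi = 36/180 = 0.2$

$P(x) = \dfrac{n!}{x!(n-x)!}\,\pi^x (1-\pi)^{n-x}$

$P(x \ge 2) = 1 - P(0) - P(1)$

$P(0) = \dfrac{6!}{0!(6-0)!}(0.2)^0 (1-0.2)^6 = 0.262$

$P(1) = \dfrac{6!}{1!(6-1)!}(0.2)^1 (1-0.2)^5 = 0.393$

$P(x \ge 2) = 1 - 0.262 - 0.393 = 0.345$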
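A quick numerical check of this binomial calculation, using only the standard library. The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 36 / 180  # six users sampled; 36 of 180 million users subscribe

p0 = binomial_pmf(0, n, p)
p1 = binomial_pmf(1, n, p)
p_at_least_two = 1 - p0 - p1  # complement rule

print(round(p0, 3), round(p1, 3), round(p_at_least_two, 3))
```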

Page 33: Random Variable

Poisson Probability Distribution

$P(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$

Page 34: Random Variable

Example

Customers arrive randomly and independently at a service point at an average rate of 30 per hour.

1. Calculate the probability that exactly 20 customers arrive at the service point during any given hour.

2. Calculate the probability that during any 5 minute period at least 3 customers arrive at the service point.

Page 35: Random Variable

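Solution

$P(x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$

1. λ = 30/hr

$P(20) = \dfrac{30^{20} e^{-30}}{20!} = 0.0134$

2. λ = 30/60 min = 2.5/5 min

$P(x \ge 3) = 1 - P(0) - P(1) - P(2)$

$P(0) = \dfrac{2.5^0 e^{-2.5}}{0!} \approx 0.082$

$P(1) = \dfrac{2.5^1 e^{-2.5}}{1!} \approx 0.205$

$P(2) = \dfrac{2.5^2 e^{-2.5}}{2!} \approx 0.257$

→ P(x ≥ 3) = 1 - 0.082 - 0.205 - 0.257 ≈ 0.456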
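Both parts of this Poisson example can be recomputed directly from the formula. The sketch below uses only the standard library; `poisson_pmf` is an illustrative name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return lam ** x * exp(-lam) / factorial(x)

# Part 1: exactly 20 arrivals in an hour at a mean rate of 30 per hour.
p_exactly_20 = poisson_pmf(20, 30)

# Part 2: rescale the rate to a 5-minute interval, then use the complement rule.
lam_5min = 30 * 5 / 60  # 2.5 arrivals per 5 minutes
p_at_least_3 = 1 - sum(poisson_pmf(x, lam_5min) for x in range(3))

print(round(p_exactly_20, 4), round(p_at_least_3, 3))
```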

Page 36: Random Variable

Normal probability distribution

Standard normal or z-distribution:

$z = \dfrac{x - \mu}{\sigma}$

Page 37: Random Variable

Normal Distribution

[Figure: bell-shaped curve of the standard normal distribution, μ = 0, σ² = 1; horizontal axis x, vertical axis f(x)]

• Normal curve is symmetrical

• Mean, median, and mode are equal

• Theoretically, curve extends to infinity

Page 38: Random Variable

Area between 0 and z

Page 39: Random Variable

z  0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.0 0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359

0.1 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753

0.2 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141

0.3 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517

0.4 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879

0.5 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224

0.6 0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549

0.7 0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852

0.8 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133

0.9 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389

1.0 0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621

1.1 0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830

1.2 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015

1.3 0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177

1.4 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319

1.5 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441

1.6 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545

1.7 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633

1.8 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706

1.9 0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767

2.0 0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817

2.1 0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857

2.2 0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890

2.3 0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916

2.4 0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936

2.5 0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952

2.6 0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964

2.7 0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974

2.8 0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981

2.9 0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986

3.0 0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990

Page 40: Random Variable

Example

Six hundred candidates wrote an entrance test for admission to a management course. The marks obtained by the candidates were found to be normally distributed with a mean of 132 marks and a standard deviation of 18 marks.

1. How many candidates scored between 140 and 160 marks?

2. If the top 60 performers were given confirmed admission, calculate the minimum mark (to the nearest integer) above which a candidate would be guaranteed admission?

Page 41: Random Variable

Solution

$z = \dfrac{x - \mu}{\sigma}$

1.

z1 = (140 - 132)/18 = 0.4444 → P1 ≈ 0.172

z2 = (160 - 132)/18 = 1.5556 → P2 ≈ 0.440

→ P(140 < X < 160) ≈ 0.440 – 0.172 = 0.268 → 0.268 x 600 candidates ≈ 161 candidates

Page 42: Random Variable

2.

Let x_c denote the minimum mark.

60/600 = 0.1 = 10%. P(0 < z < z_c) = 0.50 - 0.10 = 0.40 → z_c = 1.28

$z_c = \dfrac{x_c - \mu}{\sigma} \;\Rightarrow\; 1.28 = \dfrac{x_c - 132}{18} \;\Rightarrow\; x_c = 132 + 1.28(18) \approx 155$
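Instead of reading the z-table, both parts of the entrance-test example can be checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The variable names are illustrative.

```python
from statistics import NormalDist

marks = NormalDist(mu=132, sigma=18)  # distribution of marks from the example
candidates = 600

# Part 1: expected number of candidates scoring between 140 and 160.
p_between = marks.cdf(160) - marks.cdf(140)
n_between = candidates * p_between

# Part 2: the mark exceeded by only the top 60 of 600 candidates (upper 10%).
cutoff = marks.inv_cdf(1 - 60 / candidates)

print(round(p_between, 3), round(n_between), round(cutoff))
```

The exact cumulative probabilities differ slightly from the table look-up in the third decimal place, but the rounded answers agree.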

Page 43: Random Variable

HYPOTHESIS TESTING

• What is a Hypothesis?

• What is Hypothesis Testing?

Page 44: Random Variable

Basic Terms

• Null hypothesis

• Alternative hypothesis

• Level of significance

• Type I error

• Type II error

• Critical value

• Test statistic

• Rejection area

• Acceptance area

• One-tailed test

• Two-tailed test

Page 45: Random Variable

Five-Step Procedure for Hypothesis Testing

Step 1: State the null and alternative hypotheses

Step 2: Determine the critical value associated with the level of significance

Step 3: Identify and calculate the test statistic

Step 4: Formulate and apply the decision rule

Step 5: Draw a conclusion

Page 46: Random Variable

Testing a Single Population Mean

Large sample (n > 30)

Test statistic: $z_{test} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Small sample (n < 30)

Test statistic: $t_{test} = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

Page 47: Random Variable

t table with right tail probabilities

Page 48: Random Variable

df\p 0.4 0.25 0.1 0.05 0.025 0.01 0.005 0.0005

1 0.32492 1 3.077684 6.313752 12.7062 31.82052 63.65674 636.6192

2 0.288675 0.816497 1.885618 2.919986 4.30265 6.96456 9.92484 31.5991

3 0.276671 0.764892 1.637744 2.353363 3.18245 4.5407 5.84091 12.924

4 0.270722 0.740697 1.533206 2.131847 2.77645 3.74695 4.60409 8.6103

5 0.267181 0.726687 1.475884 2.015048 2.57058 3.36493 4.03214 6.8688

6 0.264835 0.717558 1.439756 1.94318 2.44691 3.14267 3.70743 5.9588

7 0.263167 0.711142 1.414924 1.894579 2.36462 2.99795 3.49948 5.4079

8 0.261921 0.706387 1.396815 1.859548 2.306 2.89646 3.35539 5.0413

9 0.260955 0.702722 1.383029 1.833113 2.26216 2.82144 3.24984 4.7809

10 0.260185 0.699812 1.372184 1.812461 2.22814 2.76377 3.16927 4.5869

11 0.259556 0.697445 1.36343 1.795885 2.20099 2.71808 3.10581 4.437

12 0.259033 0.695483 1.356217 1.782288 2.17881 2.681 3.05454 4.3178

13 0.258591 0.693829 1.350171 1.770933 2.16037 2.65031 3.01228 4.2208

14 0.258213 0.692417 1.34503 1.76131 2.14479 2.62449 2.97684 4.1405

15 0.257885 0.691197 1.340606 1.75305 2.13145 2.60248 2.94671 4.0728

16 0.257599 0.690132 1.336757 1.745884 2.11991 2.58349 2.92078 4.015

17 0.257347 0.689195 1.333379 1.739607 2.10982 2.56693 2.89823 3.9651

18 0.257123 0.688364 1.330391 1.734064 2.10092 2.55238 2.87844 3.9216

19 0.256923 0.687621 1.327728 1.729133 2.09302 2.53948 2.86093 3.8834

20 0.256743 0.686954 1.325341 1.724718 2.08596 2.52798 2.84534 3.8495

21 0.25658 0.686352 1.323188 1.720743 2.07961 2.51765 2.83136 3.8193

22 0.256432 0.685805 1.321237 1.717144 2.07387 2.50832 2.81876 3.7921

23 0.256297 0.685306 1.31946 1.713872 2.06866 2.49987 2.80734 3.7676

24 0.256173 0.68485 1.317836 1.710882 2.0639 2.49216 2.79694 3.7454

25 0.25606 0.68443 1.316345 1.708141 2.05954 2.48511 2.78744 3.7251

26 0.255955 0.684043 1.314972 1.705618 2.05553 2.47863 2.77871 3.7066

27 0.255858 0.683685 1.313703 1.703288 2.05183 2.47266 2.77068 3.6896

28 0.255768 0.683353 1.312527 1.701131 2.04841 2.46714 2.76326 3.6739

Page 49: Random Variable

Testing a Single Population Proportion:

Large sample (n > 30)

Test statistic: $z_{test} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$

Small sample (n < 30)

Test statistic: $t_{test} = \dfrac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$

Page 50: Random Variable

Tests Involving Two Sample Means

$z_{test} = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$

Page 51: Random Variable

Example

A union representing workers at a large industrial concern accused management of paying discriminatory wages to the workers in two production facilities, A and B. It claimed that workers in facility A were being paid less than those in facility B. The company investigated the claim by examining the pay of 70 workers from each production facility. The results were as follows.

                 Facility A   Facility B
Mean salary       $455.00      $463.00
Std deviation     $10.00       $13.00

What conclusion did the company reach? Investigate at the 5% level of significance.

Page 52: Random Variable

Solution

H0: $\mu_A = \mu_B$

H1: $\mu_A \ne \mu_B$

→ two-tailed test; nA, nB > 30 → z test. α = 5% → zcrit = 1.96

$z_{test} = \dfrac{\bar{x}_A - \bar{x}_B}{\sqrt{s_A^2/n_A + s_B^2/n_B}} = \dfrac{455 - 463}{\sqrt{100/70 + 169/70}} = -4.081$

Since │ztest│ > │zcrit│, reject H0.

→ Sufficient statistical evidence to suggest a significant difference in the salaries.
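The test statistic and decision for the wage example can be reproduced in a few lines. This sketch follows the two-tailed set-up used in the slides and takes the critical value from `statistics.NormalDist` rather than from a table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sample summaries from the example.
mean_a, sd_a, n_a = 455.0, 10.0, 70
mean_b, sd_b, n_b = 463.0, 13.0, 70
alpha = 0.05

# z statistic for the difference between two large-sample means (H0: no difference).
z_test = (mean_a - mean_b) / sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)

# Two-tailed critical value at the 5% level.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(z_test) > z_crit
print(round(z_test, 3), round(z_crit, 2), reject_h0)
```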

Page 53: Random Variable

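Tests Involving Two Sample Proportions

$z_{test} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

where the pooled proportion is $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$.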

Page 54: Random Variable

Example

Surveys were conducted in two major cities “A” and “B” to ascertain viewer habits regarding a popular television channel. In city “A”, 1000 people were interviewed and 680 said they viewed the channel. In city “B”, 600 people were interviewed and 444 said they viewed the channel. Investigate, at the 5% level of significance, whether there is a significant difference between the viewing habits in the two cities.

Page 55: Random Variable

H0: $\pi_A = \pi_B$

H1: $\pi_A \ne \pi_B$

→ two-tailed test; α = 5% → zcrit = 1.96

$p = \dfrac{n_A p_A + n_B p_B}{n_A + n_B} = \dfrac{680 + 444}{1000 + 600} = 0.7025$

q = 1 – p = 0.2975

$z_{test} = \dfrac{p_A - p_B}{\sqrt{pq(1/n_A + 1/n_B)}} = \dfrac{680/1000 - 444/600}{\sqrt{0.7025 \times 0.2975\,(1/1000 + 1/600)}} = -2.54$

Since │ztest│ > │zcrit│, reject H0 at the 5% level of significance.

→ Sufficient statistical evidence to suggest a significant difference in the viewing habits.
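The pooled-proportion calculation for the two cities can be checked as follows. This is a sketch with illustrative variable names; the critical value 1.96 is the one quoted in the slides.

```python
from math import sqrt

# Survey counts from the example: viewers and sample sizes.
x_a, n_a = 680, 1000
x_b, n_b = 444, 600

p_a, p_b = x_a / n_a, x_b / n_b
p = (x_a + x_b) / (n_a + n_b)  # pooled proportion
q = 1 - p

# z statistic for the difference between two sample proportions (H0: no difference).
z_test = (p_a - p_b) / sqrt(p * q * (1 / n_a + 1 / n_b))

z_crit = 1.96  # two-tailed, 5% level of significance
reject_h0 = abs(z_test) > z_crit
print(round(p, 4), round(z_test, 2), reject_h0)
```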

Page 56: Random Variable

Chi-square Applications

Major Characteristics:

• positively skewed

• non-negative

• family of chi-square distributions

Page 57: Random Variable

H0: There is no difference between the observed and expected frequencies.

H1: There is a difference between the observed and the expected frequencies.

Test statistic:

$\chi^2_{stat} = \sum \dfrac{(f_o - f_e)^2}{f_e}$

The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories

Page 58: Random Variable

Right tail areas for the Chi-square Distribution

Page 59: Random Variable

df\area 0.995 0.99 0.975 0.95 0.90 0.75 0.5 0.25 0.10 0.05 0.025 0.01 0.005

1 0.00004 0.00016 0.00098 0.00393 0.01579 0.10153 0.45494 1.3233 2.70554 3.84146 5.02389 6.6349 7.87944

2 0.01003 0.0201 0.05064 0.10259 0.21072 0.57536 1.38629 2.77259 4.60517 5.99146 7.37776 9.21034 10.5966

3 0.07172 0.11483 0.2158 0.35185 0.58437 1.21253 2.36597 4.10834 6.25139 7.81473 9.3484 11.3449 12.8382

4 0.20699 0.29711 0.48442 0.71072 1.06362 1.92256 3.35669 5.38527 7.77944 9.48773 11.1433 13.2767 14.8603

5 0.41174 0.5543 0.83121 1.14548 1.61031 2.6746 4.35146 6.62568 9.23636 11.0705 12.8325 15.0863 16.7496

6 0.67573 0.87209 1.23734 1.63538 2.20413 3.4546 5.34812 7.8408 10.6446 12.5916 14.4494 16.8119 18.5476

7 0.98926 1.23904 1.68987 2.16735 2.83311 4.25485 6.34581 9.03715 12.017 14.0671 16.0128 18.4753 20.2777

8 1.34441 1.6465 2.17973 2.73264 3.48954 5.07064 7.34412 10.2189 13.3616 15.5073 17.5346 20.0902 21.955

9 1.73493 2.0879 2.70039 3.32511 4.16816 5.89883 8.34283 11.3888 14.6837 16.919 19.0228 21.666 23.5894

10 2.15586 2.55821 3.24697 3.9403 4.86518 6.7372 9.34182 12.5489 15.9872 18.307 20.4832 23.2093 25.1882

11 2.60322 3.05348 3.81575 4.57481 5.57778 7.58414 10.341 13.7007 17.275 19.6751 21.9201 24.725 26.7569

12 3.07382 3.57057 4.40379 5.22603 6.3038 8.43842 11.3403 14.8454 18.5494 21.0261 23.3367 26.217 28.2995

13 3.56503 4.10692 5.00875 5.89186 7.0415 9.29907 12.3398 15.9839 19.8119 22.362 24.7356 27.6883 29.8195

14 4.07467 4.66043 5.62873 6.57063 7.78953 10.1653 13.3393 17.1169 21.0641 23.6848 26.119 29.1412 31.3194

15 4.60092 5.22935 6.26214 7.26094 8.54676 11.0365 14.3389 18.2451 22.3071 24.9958 27.4884 30.5779 32.8013

16 5.14221 5.81221 6.90766 7.96165 9.31224 11.9122 15.3385 19.3689 23.5418 26.2962 28.8454 31.9999 34.2672

17 5.69722 6.40776 7.56419 8.67176 10.0852 12.7919 16.3382 20.4887 24.769 27.5871 30.191 33.4087 35.7185

18 6.2648 7.01491 8.23075 9.39046 10.8649 13.6753 17.3379 21.6049 25.9894 28.8693 31.5264 34.8053 37.1565

19 6.84397 7.63273 8.90652 10.117 11.6509 14.562 18.3377 22.7178 27.2036 30.1435 32.8523 36.1909 38.5823

20 7.43384 8.2604 9.59078 10.8508 12.4426 15.4518 19.3374 23.8277 28.412 31.4104 34.1696 37.5662 39.9969

Page 60: Random Variable

Example

A certain drug is claimed to be effective in curing the common cold. In a clinical trial involving 500 patients having the common cold, 250 were given the drug and the rest were given sugar pills. The patients’ reactions to the treatment are recorded in the table below.

               Helped   Harmed   No Effect   Total
Drug            150       30        70        250
Sugar Pills     130       40        80        250
Total           280       70       150        500

On the basis of the above data, can it be concluded, at the 5% significance level, that there is a significant difference in the effect of the drug and sugar pills?

Page 61: Random Variable

H0: No significant difference in effect of drug and sugar pills.

H1: There is a significant difference in effect of drug and sugar pills.

Expected frequencies: fe = (row total x column total)/grand total

 fo     fe    fo – fe   (fo – fe)²/fe
150    140      10        0.7143
 30     35      -5        0.7143
 70     75      -5        0.3333
130    140     -10        0.7143
 40     35       5        0.7143
 80     75       5        0.3333
                        ∑ = 3.524

α = 0.05, df = (2-1)(3-1) = 2 → $\chi^2_{crit} = 5.991$

$\chi^2_{calc} = 3.524 < \chi^2_{crit} = 5.991$

Hence do not reject H0 at α = 0.05.

→ insufficient statistical evidence to suggest that there is a significant difference between drug and sugar pills.
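The expected frequencies and the chi-square statistic for this contingency table can be recomputed with plain Python. This is a sketch: the variable names are illustrative, and the critical value is copied from the chi-square table above (df = 2, right-tail area 0.05) rather than computed.

```python
# Observed frequencies: rows = Drug, Sugar Pills; columns = Helped, Harmed, No Effect.
observed = [[150, 30, 70],
            [130, 40, 80]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency of each cell under H0: row total x column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_calc = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi2_crit = 5.991  # table value for df = 2 at the 5% level
reject_h0 = chi2_calc > chi2_crit
print(expected, round(chi2_calc, 3), df, reject_h0)
```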

Page 62: Random Variable

LINEAR REGRESSION AND CORRELATION

• Correlation analysis

• Scatterplot

• Correlation coefficient

• Dependent and independent variables

• The coefficient of determination

• Linear regression equation

Page 63: Random Variable

Correlation Coefficient Formula:

$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$

The coefficient of determination = r²

Page 64: Random Variable

The regression equation: Y' = a + bX

Y' = average predicted value of Y for any X.

a = Y-intercept = estimated Y value when X = 0

b = slope of the line.

Page 65: Random Variable

$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$

$a = \dfrac{\sum y - b\sum x}{n}$

Page 66: Random Variable

Example

The following data relate to the training periods and average weekly sales of seven randomly selected salesmen in a large company.

Salesman Training (hours) Ave weekly sales ($’000)

A 20 44

B 5 22

C 10 35

D 13 32

E 12 27

F 8 26

G 15 35

Page 67: Random Variable

1. Calculate the correlation coefficient. Comment on the value obtained.

2. Determine the coefficient of determination and interpret the value obtained.

3. Assuming a linear relation between the variables in the given data, obtain the regression equation connecting the variables.

4. Estimate the weekly sales of a salesman who had 22h of training. Is the result reliable? Explain.

Page 68: Random Variable

Solution

1. Let x denote training period (in hours) and let y denote sales (in $’000).

    x      y      x²      y²      xy
   20     44     400    1936     880
    5     22      25     484     110
   10     35     100    1225     350
   13     32     169    1024     416
   12     27     144     729     324
    8     26      64     676     208
   15     35     225    1225     525
∑  83    221    1127    7299    2813

Page 69: Random Variable

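$r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = \dfrac{7(2813) - (83)(221)}{\sqrt{\left[7(1127) - 83^2\right]\left[7(7299) - 221^2\right]}} \approx 0.9$

→ strong positive linear relationship between x and y

2. r² = 0.81 → 81% of the variation in y is explained by variation in x. The remaining 19% is due to other factors.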

Page 70: Random Variable

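3.

$b = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \dfrac{7(2813) - (83)(221)}{7(1127) - 83^2} \approx 1.35$

$a = \dfrac{\sum y - b\sum x}{n}$ = 221/7 – 1.35 x 83/7 = 15.56 → y = 15.56 + 1.35x

4. When x = 22 hours, y = 15.56 + 1.35 x 22 = 45.3 x $1000 = $45300

The result is not reliable: x = 22 lies outside the range of the observed data, and the regression equation is valid only in the domain 5 ≤ x ≤ 20.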
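The correlation and regression figures can be recomputed from the raw data with the same sum formulas. This is a sketch with illustrative variable names. Because the script carries full precision, the intercept and the prediction differ slightly from the slide values, which round b to 1.35 before computing a.

```python
from math import sqrt

hours = [20, 5, 10, 13, 12, 8, 15]    # x: training period (hours)
sales = [44, 22, 35, 32, 27, 26, 35]  # y: average weekly sales ($'000)

n = len(hours)
sum_x, sum_y = sum(hours), sum(sales)
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in sales)
sum_xy = sum(x * y for x, y in zip(hours, sales))

# Correlation coefficient and coefficient of determination.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r_squared = r ** 2

# Least-squares slope and intercept of y = a + b x.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

# Prediction at x = 22 is an extrapolation beyond the observed range of x.
x_new = 22
prediction = a + b * x_new
is_extrapolation = not (min(hours) <= x_new <= max(hours))

print(round(r, 2), round(r_squared, 2), round(b, 2), round(a, 2), round(prediction, 1))
```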

Page 71: Random Variable

TIME SERIES AND FORECASTING

Page 72: Random Variable

Components

• The Secular Trend (T)

• The Cyclical Variation (C)

• The Seasonal Variation (S)

• The Irregular Variation (I)

Multiplicative Model: Y = T x C x S x I

Page 73: Random Variable

The linear trend equation :

T = a + bt

Page 74: Random Variable

Seasonal Indices

• Moving average

• Centred moving average

• Ratio to centred moving average

• Adjusted seasonal average

• Deseasonalizing a series

Page 75: Random Variable

Example

The following table gives the quarterly healthcare claims (in R millions) against all healthcare claims for the period 2008 to 2010.

Year     Q1     Q2     Q3     Q4
2008    14.0   15.6   21.5   18.3
2009    13.1   14.7   24.8   19.4
2010    14.4   17.3   25.6   15.8

1. Represent the above data in a time series plot.

2. Calculate the quarterly seasonal indices for healthcare claims using the ratio-to-moving-average method. Interpret the results.

3. Derive a trend line using the method of least squares.

4. Estimate the seasonally-adjusted trend value of healthcare claims for the third quarter of 2011.

Page 76: Random Variable

1.

[Figure: time series plot of Quarterly Healthcare Claims (in Rm) for the period 2008 - 2010; horizontal axis: Q1 to Q4 of 2008, 2009 and 2010]

Page 77: Random Variable

2.

Season      Data (Rm)   4MA (Rm)   Centred 4MA (Rm)   Unadj. SI (%)
2008 Q1       14.0         -             -                 -
     Q2       15.6         -             -                 -
     Q3       21.5       17.350        17.238            124.7
     Q4       18.3       17.125        17.013            107.6
2009 Q1       13.1       16.900        17.313             75.7
     Q2       14.7       17.725        17.863             82.3
     Q3       24.8       18.000        18.163            136.5
     Q4       19.4       18.325        18.650            104.0
2010 Q1       14.4       18.975        19.075             75.5
     Q2       17.3       19.175        18.725             92.4
     Q3       25.6       18.275          -                 -
     Q4       15.8         -             -                 -

Page 78: Random Variable

            Q1      Q2      Q3      Q4
2008         -       -     124.7   107.6
2009       75.7    82.3    136.6   104.0
2010       75.6    92.4      -       -
Mean SI    75.7    87.4    130.7   105.8
Adj. SI    75.7    87.5    130.9   106.0

The annual seasonal influences are as follows:

Q1: substantial decrease of 24.3%

Q2: decrease of 12.5%

Q3: substantial increase of 30.9%

Q4: increase of 6.0%
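The ratio-to-moving-average steps can be sketched in Python as follows. The variable names are illustrative, and the script keeps full precision throughout, so individual indices can differ from the slide values in the last decimal place.

```python
# Quarterly claims (Rm), 2008 Q1 to 2010 Q4, from the example.
claims = [14.0, 15.6, 21.5, 18.3, 13.1, 14.7, 24.8, 19.4, 14.4, 17.3, 25.6, 15.8]

# Step 1: 4-quarter moving averages.
ma4 = [sum(claims[i:i + 4]) / 4 for i in range(len(claims) - 3)]

# Step 2: centre them by averaging adjacent pairs; centred[k] lines up with claims[k + 2].
centred = [(ma4[i] + ma4[i + 1]) / 2 for i in range(len(ma4) - 1)]

# Step 3: unadjusted seasonal index = actual / centred moving average x 100,
# collected per quarter (0 = Q1, ..., 3 = Q4).
ratios = {quarter: [] for quarter in range(4)}
for k, cma in enumerate(centred):
    t = k + 2
    ratios[t % 4].append(100 * claims[t] / cma)

# Step 4: average per quarter, then rescale so the four indices sum to 400.
mean_si = [sum(ratios[quarter]) / len(ratios[quarter]) for quarter in range(4)]
adjustment = 400 / sum(mean_si)
adjusted_si = [m * adjustment for m in mean_si]

print([round(v, 1) for v in adjusted_si])
```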

Page 79: Random Variable

3.

  t        T       t²       tT
  1      14.0       1      14.0
  2      15.6       4      31.2
  3      21.5       9      64.5
  4      18.3      16      73.2
  5      13.1      25      65.5
  6      14.7      36      88.2
  7      24.8      49     173.6
  8      19.4      64     155.2
  9      14.4      81     129.6
 10      17.3     100     173.0
 11      25.6     121     281.6
 12      15.8     144     189.6
∑ = 78  ∑ = 214.5  ∑ = 650  ∑ = 1439.2

T(t) = 15.9 + 0.31t

4.

Adj. Estimate for Q3 of 2011:

Y(2011, Q3) = T(15) x 1.309 = (15.9 + 0.31 x 15) x 1.309 = 26.9 ≡ R26.9m
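The least-squares trend line and the seasonally adjusted forecast can be checked with the same sum formulas used for regression. This is a sketch with illustrative names; the Q3 index of 1.309 is taken from the slides rather than recomputed.

```python
# Quarterly claims (Rm) with time index t = 1..12 (2008 Q1 to 2010 Q4).
claims = [14.0, 15.6, 21.5, 18.3, 13.1, 14.7, 24.8, 19.4, 14.4, 17.3, 25.6, 15.8]
t = list(range(1, len(claims) + 1))
n = len(claims)

sum_t, sum_y = sum(t), sum(claims)
sum_t2 = sum(ti * ti for ti in t)
sum_ty = sum(ti * yi for ti, yi in zip(t, claims))

# Least-squares trend T(t) = a + b t.
b = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)
a = (sum_y - b * sum_t) / n

# 2011 Q3 is period t = 15; multiply the trend value by the Q3 seasonal index.
q3_index = 1.309  # adjusted seasonal index for Q3, as quoted in the slides
trend_15 = a + b * 15
forecast_2011_q3 = trend_15 * q3_index

print(round(a, 1), round(b, 2), round(forecast_2011_q3, 1))
```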

Page 80: Random Variable

STATISTICAL DECISION THEORY

Page 81: Random Variable

Components to Decision-Making Situation

• Decision alternatives or acts

• Payoffs

• States of nature

Page 82: Random Variable

Decision Making Without Probabilities

• Maximax Strategy

• Maximin Strategy

• Minimax Regret Strategy

Page 83: Random Variable

Decision Making with Probabilities

• Payoff table

• Expected Payoff or Expected Monetary Value (EMV)

Page 84: Random Variable

Decision Trees

• Decision nodes

• Event nodes

• Tree Structure

• EMV calculations

Page 85: Random Variable

Example

A large corporation arranged to use an ocean liner as a floating hotel for its annual convention. The shipping company had to make a decision whether or not to lease the ship. If leased, the company would get a flat fee and an additional percentage of profits from the convention, which could attract as many as 50000 people. The company’s analysts estimated that if the ship were leased there would be a 50% chance of realizing a profit of $700000, a 30% chance of making a profit of $800000, a 15% chance of making a profit of $900000 and a 5% chance of making a profit of $1m. If the ship were not leased, it could be used for its usual voyage over the convention duration. In this case there would be a 90% probability of making a profit of $750000 and a 10% probability that profits would be $780000.

Page 86: Random Variable

The company has one additional option. If the ship were leased, and it became clear within the first few days of the convention that the profits were going to be in the $700000 range, the company could choose to promote the convention on its own by offering participants discounts on the ocean liner’s cruises. The company’s analysts believe that if this action were chosen there would be a 60% chance that profits would increase to $740000 and a 40% chance that the promotion would fail, lowering profits to $680000.

4.1 Draw a decision tree to depict the above problem.

4.2 What decision should the shipping company take? Show all working.

Page 87: Random Variable

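4.1

[Decision tree, shown as an outline]

Decision: lease the ship?
  Do not lease → event node A
    0.9 → $750000
    0.1 → $780000
  Lease → event node B
    0.5 → decision node C (profits in the $700000 range)
      Do not promote → $700000
      Promote → event node D
        0.6 → $740000
        0.4 → $680000
    0.3 → $800000
    0.15 → $900000
    0.05 → $1000000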

Page 88: Random Variable

4.2

EMV = max[EMV(A), EMV(B)]

EMV(A) = $780000 x 0.1 + $750000 x 0.9 = $753000

EMV(B) = $1000000x0.05 + $900000x0.15 + $800000x0.3 + 0.5xEMV(C) = $425000+0.5xEMV(C)

EMV(C) = max[$700000, EMV(D)]

= max[$700000, $680000x0.4 + $740000x0.6] = $716000 → promote

Hence EMV (B) = $425000 + $716000x0.5 = $783000 → EMV = $783000

Decision: Lease and then promote the convention if profits from lease are in the $700000 range.
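The roll-back of the tree can be reproduced with a small helper that computes an expected monetary value from (probability, payoff) pairs. This is a sketch; the function name `emv` and the variable names are illustrative, with node labels A to D following the solution above.

```python
def emv(outcomes):
    """Expected monetary value of an event node given (probability, payoff) pairs."""
    return sum(prob * payoff for prob, payoff in outcomes)

# Event node A: do not lease.
emv_a = emv([(0.9, 750_000), (0.1, 780_000)])

# Event node D: promote the convention after a $700000-range start.
emv_d = emv([(0.6, 740_000), (0.4, 680_000)])

# Decision node C: keep the $700000 profit or promote, whichever is worth more.
emv_c = max(700_000, emv_d)
promote = emv_d > 700_000

# Event node B: lease; the 0.5 branch leads into decision node C.
emv_b = emv([(0.05, 1_000_000), (0.15, 900_000), (0.30, 800_000), (0.50, emv_c)])

best = "lease" if emv_b > emv_a else "do not lease"
print(round(emv_a), round(emv_d), round(emv_b), best, promote)
```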