chapter 10: correlation and regression chapter 13:...

Chapter 10: Correlation and Regression Chapter 13: Nonparametric Statistics Objectives: Learn how to draw a scatter plot for a set of ordered pairs. Learn how to compute the correlation coefficient. Learn how to compute the equation of the regression line. Learn how to compute the Spearman rank correlation coefficient.


Post on 12-Aug-2021


Page 1: Chapter 10: Correlation and Regression Chapter 13: …fmalam.kau.edu.sa/.../157188_STAT_110_CH10_CH13_2019.pdf · 2020. 9. 5. · Chapter 10: Correlation and Regression Chapter 13:

Chapter 10: Correlation and Regression

Chapter 13: Nonparametric Statistics

Objectives:

❑ Learn how to draw a scatter plot for a set of ordered pairs.

❑ Learn how to compute the correlation coefficient.

❑ Learn how to compute the equation of the regression line.

❑ Learn how to compute the Spearman rank correlation coefficient.


Overview of Chapters 10 and 13

Sec. #    Title                                        Page(s)
10 - 1    Scatter Plots and Correlation                369 – 385
13 - 6    The Spearman Rank Correlation Coefficient    459 – 461
10 - 2    Regression                                   386 – 393


Remember?

The independent variable influences the dependent variable.


At a Glance!

Are two or more variables linearly related?

(Scatter plot and/or correlation coefficient)

If so, what is the strength of the relationship?

(Scatter plot and/or correlation coefficient)

What type of relationship exists?

(Scatter plot, correlation coefficient and/or regression)

What kind of predictions can be made from the relationship?

(Regression)


10 – 1: Scatter Plots and Correlation

A scatter plot is a graph of the ordered pairs of numbers (x, y) consisting of the independent variable x and the dependent variable y.


10 – 1: Scatter Plots and Correlation (cont.)

A scatter plot is a visual way to describe the nature of the relationship between x and y. It may show:

a positive linear relationship,

a negative linear relationship,

a curvilinear relationship,

or no relationship.

Example 10 – 1, page 372; Example 10 – 2, pages 372 – 373; Example 10 – 3, page 373.


Examples of scatter plot patterns


Correlation

Pearson’s linear correlation coefficient, denoted by 𝑟, measures the strength and the direction of a linear relationship between two quantitative variables.


Calculating 𝒓

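The linear correlation coefficient is given by

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

where 𝑛 is the number of data pairs. This coefficient is also known as the Pearson product moment correlation coefficient (PPMC).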


Properties of 𝒓

The correlation coefficient ranges from -1 to +1, inclusive.

If the value of 𝑟 is close to +1, then there is a strong positive linear relationship between the variables.

If the value of 𝑟 is close to -1, then there is a strong negative linear relationship between the variables.

If the value of 𝑟 is close to 0, then there is either a weak or no linear relationship between the variables.


Properties of 𝒓


Example 10 – 4: Car Rental Companies

# of Cars (x)    Revenue (y)
63               7
29               3.9
20.8             2.1
19.1             2.8
13.4             1.4
8.5              1.5

From the table, with n = 6 data pairs, we obtain:

∑x = 153.8, ∑y = 18.7, ∑xy = 682.77, ∑x² = 5859.26, ∑y² = 80.67.


Example 10 – 4 (cont.)

$$ r = \frac{6(682.77) - (153.8)(18.7)}{\sqrt{\left[6(5859.26) - (153.8)^2\right]\left[6(80.67) - (18.7)^2\right]}} = 0.982 $$

Hence, there is a strong positive linear relationship between the number of rented cars and revenue.

Example 10 – 5, page 377 (negative correlation); Example 10 – 6, page 378 (weak positive correlation).
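The calculation in Example 10 – 4 can be checked with a short script. The following is a minimal Python sketch, not part of the original slides; the function name `pearson_r` and the variable names are illustrative choices, and the data are the six (x, y) pairs from the car rental table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the sums formula used in this chapter (illustrative helper)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Data from Example 10 - 4 (car rental companies)
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]

print(round(pearson_r(cars, revenue), 3))  # 0.982
```

The function assumes both lists have the same length and that neither variable is constant, since a constant variable makes the denominator zero.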


13 – 6: The Spearman Rank Correlation Coefficient

If 𝑛 is the sample size and 𝑑 is the difference in ranks for each pair, then the Spearman rank correlation coefficient is calculated as

$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} $$


Example 13 – 7: Bank Branches and Deposits (page 459)

# of branches (X)    Deposits (Y)    Rank (X)    Rank (Y)
209                  23              4           4
353                  31              2           1
19                   7               8           6
201                  12              5           5
344                  26              3           2
132                  5               6           7
401                  24              1           3
126                  4               7           8

Rank 1 is assigned to the largest value in each column.


Example 13 – 7 (cont.)

Rank (X)    Rank (Y)    d     d²
4           4           0     0
2           1           1     1
8           6           2     4
5           5           0     0
3           2           1     1
6           7           -1    1
1           3           -2    4
7           8           -1    1

∑d = 0, ∑d² = 12


Example 13 – 7 (cont.)

$$ r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6 \cdot 12}{8(64 - 1)} = 1 - \frac{72}{504} = 0.857 $$

This value indicates a strong positive correlation.

Spearman’s correlation can also be calculated when the data are ordinal-level qualitative.
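The ranking and the ∑d² computation in Example 13 – 7 can be reproduced as follows. This is a minimal Python sketch under two stated assumptions: there are no tied values (true for this data set), and the deposit for the 126-branch bank is 4, the value consistent with the distinct ranks 7 and 8 in the example. The helper names `ranks_descending` and `spearman_rs` are illustrative, not from the slides.

```python
def ranks_descending(values):
    """Rank 1 = largest value. Assumes all values are distinct (no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman rank correlation via r_s = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(xs)
    rank_x = ranks_descending(xs)
    rank_y = ranks_descending(ys)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Data from Example 13 - 7 (bank branches and deposits)
branches = [209, 353, 19, 201, 344, 132, 401, 126]
deposits = [23, 31, 7, 12, 26, 5, 24, 4]

print(ranks_descending(branches))                  # [4, 2, 8, 5, 3, 6, 1, 7]
print(ranks_descending(deposits))                  # [4, 1, 6, 5, 2, 7, 3, 8]
print(round(spearman_rs(branches, deposits), 3))   # 0.857
```

With tied values, this simple ranking helper would not assign averaged ranks, so it should only be used on data without ties.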


10 – 2: Regression

If the value of the correlation coefficient is significant, the next step is to determine the equation of the regression line, which is the data’s line of best fit.

Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.
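As a sketch in symbols, using the chapter's notation for the regression line, 𝑦′ = 𝑎 + 𝑏 ⋅ 𝑥, the vertical distance from a data point (x, y) to the line is y − y′. The best-fit criterion therefore picks the values of 𝑎 and 𝑏 that minimize

$$ \sum (y - y')^2 = \sum \left(y - (a + b x)\right)^2 $$

where the sum runs over all data pairs.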


Line of best fit


Line of best fit (cont.)


Determination of the Regression Line Equation

The equation of the regression line is:

$$ y' = a + b \cdot x $$

Here, 𝑎 is the intercept (the regression constant), 𝑏 is the slope (the regression coefficient), and 𝑥 is the observed value of the independent variable. Together they are used to calculate 𝑦′, the predicted value of the dependent variable.


Determination of the Regression Line Equation (cont.)

$$ a = \frac{(\sum y)(\sum x^2) - (\sum x)(\sum xy)}{n\sum x^2 - (\sum x)^2} $$

$$ b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} $$


Example 10 – 9 (page 388)

The number of rented cars is the independent variable 𝑥, while the revenue is the dependent variable 𝑦. The regression line is found to be:

$$ y' = 0.396 + 0.106 \cdot x $$

This means that when the number of rented cars increases by 1 unit, the revenue increases by 0.106 on average.
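The coefficients in Example 10 – 9 can be reproduced from the car rental data of Example 10 – 4. The following is a minimal Python sketch; the function name `regression_line` is an illustrative choice, and the intercept is computed with the standard least-squares form, whose numerator is (∑y)(∑x²) − (∑x)(∑xy).

```python
def regression_line(xs, ys):
    """Return (a, b) for the least-squares line y' = a + b*x (illustrative helper)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2  # shared by a and b
    a = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    b = (n * sum_xy - sum_x * sum_y) / denominator
    return a, b


# Data from Example 10 - 4 (car rental companies)
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]

a, b = regression_line(cars, revenue)
print(round(a, 3), round(b, 3))  # 0.396 0.106
```

Both coefficients share the denominator n∑x² − (∑x)², so it is computed once; it is zero only when all x values are identical.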


Example 10 – 10 (page 389)

The number of absences is the independent variable 𝑥, while the final grade is the dependent variable 𝑦. The regression line is found to be:

$$ y' = 102.493 - 3.622 \cdot x $$

This means that when the number of absences increases by 1, the final grade decreases by 3.622 on average.


Example 10 – 11 (page 391)

Predict the income of a car rental agency (y) that has 200,000 automobiles (x).

Note that in Example 10 – 1, the number of rented automobiles is measured in units of ten thousand. Therefore, 200,000 automobiles corresponds to 20 such units, i.e. x = 20. Hence,

$$ y' = 0.396 + 0.106(20) = 2.516 $$
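The unit conversion is the step most easily missed in this prediction, so it may help to make it explicit. The sketch below is illustrative only: the constant and function names are hypothetical, and the coefficients are the rounded values 0.396 and 0.106 from Example 10 – 9.

```python
INTERCEPT = 0.396        # a, rounded value from Example 10 - 9
SLOPE = 0.106            # b, rounded value from Example 10 - 9
CARS_PER_UNIT = 10_000   # x is measured in ten thousands of cars


def predict_revenue(num_cars):
    """Predicted revenue y' for a raw car count (hypothetical helper)."""
    x = num_cars / CARS_PER_UNIT  # convert the raw count to the units of x
    return INTERCEPT + SLOPE * x


print(round(predict_revenue(200_000), 3))  # 2.516
```

Here x = 20 lies inside the range of the observed x values (8.5 to 63), so the prediction is an interpolation within the data rather than an extrapolation beyond it.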


Important Rule!

Q. Is there any relationship between Pearson’s correlation coefficient 𝒓 and the regression coefficient 𝒃?

A. The sign of the correlation coefficient and the sign of the slope of the regression line will always be the same.
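One way to see why, sketched from the chapter's formulas: 𝑟 and 𝑏 have the same numerator. Writing N, D_x and D_y as shorthand introduced here for illustration only,

$$ N = n\sum xy - (\sum x)(\sum y), \qquad D_x = n\sum x^2 - (\sum x)^2, \qquad D_y = n\sum y^2 - (\sum y)^2 $$

the two coefficients are

$$ r = \frac{N}{\sqrt{D_x D_y}}, \qquad b = \frac{N}{D_x} \quad\Longrightarrow\quad b = r\,\sqrt{\frac{D_y}{D_x}} $$

Since D_x and D_y are positive whenever the x values and the y values are not all identical, the factor multiplying 𝑟 is positive, so 𝑏 and 𝑟 must share the sign of N. For the car rental data, 0.982 × √(134.33 / 11501.12) ≈ 0.106, which matches the slope in Example 10 – 9.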


Application Summary

Measure Excel only Excel + MegaStat

Scatter plot ✓

Pearson’s linear correlation coefficient ✓

Spearman’s correlation coefficient ✓

Regression equation ✓