
9.1 Correlation

• Key Concepts:
  – Scatter Plots
  – Correlation
  – Sample Correlation Coefficient, r
  – Hypothesis Testing for the Population Correlation Coefficient, ρ

9.1 Correlation

• What exactly do we mean by correlation?
  – If two variables are correlated, it means a relationship exists between them.
  – Examples of correlated variables:
    • Job Satisfaction and Job Attendance
    • Number of Cows per Square Mile and Crime Rate
    • Height and Weight
    • High School GPA and College GPA
    • Square Footage and Price (of a house)

9.1 Correlation

• Two questions we need to answer:
  1. Does a linear (or straight line) correlation exist between the two variables?
  2. If the variables appear linearly correlated, how strong is the correlation?
  – We can answer (1) using a scatter plot
    • The independent (explanatory) variable is x
    • The dependent (response) variable is y
  – Example: How well does High School GPA, x, “explain” College GPA, y?
  – See section 2.2 for a review of scatter plots

9.1 Correlation

• Once the scatter plot is complete, we should be able to see if a linear relationship exists between the two variables.
  – See p. 470 for what we mean by Negative Linear Correlation, Positive Linear Correlation, No Correlation, and Nonlinear Correlation.

• Next, we need a way to quantify or measure the strength of the linear relationship between the two variables.

9.1 Correlation

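• The Correlation Coefficient measures the strength and the direction of the linear relationship between two variables. The sample correlation coefficient, r, is defined as:

$$ r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} $$

where n is the number of pairs of data.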
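As a rough illustration, the sample correlation coefficient can be computed directly from the five sums in its definition (Σx, Σy, Σxy, Σx², Σy²) together with n. The sketch below uses only the Python standard library; the function name `sample_correlation` and the GPA values are made up for illustration and do not come from the textbook.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation coefficient r for paired data (xs[i], ys[i])."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of values")
    n = len(xs)                                   # number of pairs of data
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    # denominator is 0 when all x values (or all y values) are equal;
    # r is undefined in that case and this sketch raises ZeroDivisionError
    return numerator / denominator

# Hypothetical data: High School GPA (x) and College GPA (y) for 5 students
hs_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
college_gpa = [1.8, 2.6, 2.7, 3.4, 3.6]

r = sample_correlation(hs_gpa, college_gpa)
print(round(r, 3))
```

For this made-up data set r comes out a little above 0.97, which suggests a strong positive linear correlation.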

9.1 Correlation

• Things we need to know about the sample correlation coefficient, r:
  – r will always lie between -1 and 1, inclusive: -1 ≤ r ≤ 1
  – If r = -1, we say there is a perfect negative linear correlation between the two variables.
  – If r = 1, there is a perfect positive linear correlation between the two variables.
  – The strength of the linear relationship between the variables is determined by r’s proximity to 1 or -1. In other words, the closer r is to 1 or -1, the stronger the linear relationship. The closer r is to 0, the weaker the linear relationship.
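These properties can be checked on small data sets. The sketch below, with a hypothetical helper named `pearson_r` and made-up data, computes r for points that lie exactly on a rising line, exactly on a falling line, and on a symmetric parabola. The parabola is included as a reminder that r only measures linear association: y is completely determined by x there, yet r is 0.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r (same sums-based formula as in the slides)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )

x = [-2, -1, 0, 1, 2]

r_up = pearson_r(x, [2 * v + 1 for v in x])      # points on a line with positive slope
r_down = pearson_r(x, [-3 * v + 4 for v in x])   # points on a line with negative slope
r_curve = pearson_r(x, [v * v for v in x])       # points on the parabola y = x^2

# Rounding hides tiny floating-point error in the square roots
print(round(r_up, 6), round(r_down, 6), round(r_curve, 6))
```

Up to floating-point rounding, the three values are 1, -1, and 0.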

• Practice: #22 p. 482 (Age and Vocabulary)

9.1 Correlation

• Once we have the sample linear correlation coefficient, r, we can use it in a t-Test to make an inference about the population linear correlation coefficient, ρ (Greek letter “rho”).
  – Why bother?
    • Remember we found r using a limited set of data. What about the rest of the population? Do we have enough evidence from the sample data to claim that a significant linear correlation exists between our two variables?
  – Example: If we have analyzed the High School GPA and College GPA of 25 students, is there enough evidence to claim that a significant linear correlation exists between the High School GPA and College GPA of all students?

9.1 Correlation

• t-Test for the Population Correlation Coefficient
  – We will use the two-tailed version of this test:
    H0: ρ = 0 (no significant correlation exists)
    Ha: ρ ≠ 0 (a significant correlation exists)
  – The test statistic is r and the standardized test statistic is given by:

$$ t = \frac{r}{\sigma_r} = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$

Note: t follows a t-distribution with n - 2 degrees of freedom.
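The standardized test statistic is r divided by its standard error, sqrt((1 - r²)/(n - 2)), and it is compared against a t-distribution with n - 2 degrees of freedom. The sketch below shows one way to carry out the two-tailed test. The function names are hypothetical, the value r = 0.6 is made up, and the critical value 2.069 is the usual t-table entry for a two-tailed test at α = 0.05 with 23 degrees of freedom. Only n = 25 echoes the GPA example from the slides.

```python
import math

def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0 given sample correlation r and n pairs."""
    if n <= 2:
        raise ValueError("need more than 2 pairs of data")
    if not -1 < r < 1:
        raise ValueError("t is undefined when r is exactly -1 or 1")
    df = n - 2                                # degrees of freedom
    sigma_r = math.sqrt((1 - r ** 2) / df)    # standard error of r
    return r / sigma_r, df

def reject_null(t, critical_value):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(t) > critical_value

# Hypothetical sample: r = 0.6 from n = 25 students
t, df = correlation_t_statistic(0.6, 25)

# Assumed critical value from a t-table: alpha = 0.05, two-tailed, df = 23
t_critical = 2.069

print(round(t, 3), df, reject_null(t, t_critical))
```

Here t is roughly 3.6 with 23 degrees of freedom, which exceeds the critical value, so for this made-up sample we would reject H0 and conclude that a significant linear correlation exists.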

9.1 Correlation

• Practice using the t-Test: #32 p. 484 (Braking Distances: Wet Surface)
