AP Statistics 3.1-3.2: Scatterplots and Correlation
Learning Objectives:
Differentiate between an explanatory and a response variable
Draw and interpret a scatterplot
Add categorical data to a scatterplot
Calculate and interpret the correlation between two variables
Response variable: measures an outcome of a study.
Explanatory variable: attempts to explain the observed outcomes.
Explanatory variables are often called independent variables, and response variables are often called dependent variables.
This simply tells us that the response variable depends on the explanatory variable.
Go to page 29 in your textbook (we are just looking at the columns SAT Math and percent taking)
Where does the explanatory variable go? On the x-axis.
Where does the response variable go? On the y-axis.
Identify each variable (which one is independent? which is dependent?). Also decide whether there is a positive or negative association.
Discuss this with your groups and then share your reasoning with the class.
State average SAT Math score vs. percent of graduates taking the SAT
Independent: percent taking
Dependent: SAT Math
Remember: you sign up for the exam first, so the percent of students taking it is known first. You take the exam afterward, so the state's average SAT Math score is known last.
It should have a negative association. Think of it this way:
- If a very small percentage takes the SAT, those students are most likely taking it to get into college and should score higher.
- If a large percentage takes it, the average is lower. (Think of the ACT at Athens, where they pay for everyone to take it: some kids who know they aren't going to college don't even attempt the test, and that lowers the overall average.)
Draw a scatterplot (we are just going to input the data from AL to KY to save time).
Step 1: Input % taking into L1 and SAT Math into L2.
Step 2: Press 2nd-statplot-1-enter so the cursor is highlighting ON. Set Xlist: L1 and Ylist: L2.
Step 3: Hit zoom 9 (this will fit your window to the data you entered).
Step 4: Sketch a quick scatterplot in your notes (your axes don't have to start at 0).
This is a rough sketch of the scatter plot. Make sure you label it.
Give an example with no explanatory-response distinction.
Now take 2 minutes and come up with an example in your groups.
- Did anyone say hair color vs. weight? (This is incorrect because hair color is categorical, and you need two quantitative variables!)
Examining a scatterplot
We want to look for an overall pattern (linear?).
We can describe the overall pattern of a scatterplot by its direction, form, and strength.
An important kind of deviation is an outlier.
Example: Use the scatterplot to answer a-c.
a) Are there any clusters?
b) Are there any outliers?
c) Is there a clear direction?
Positive association: as x increases, y increases.
Negative association: as x increases, y decreases.
A scatterplot displays the relationship between two quantitative variables.
How can we add a categorical variable to a scatterplot?
It would have to be a third variable; then you can mark each category with a different symbol or color and use a key!
Show an example:
Correlation (r)
The correlation measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.
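1. The means and standard deviations of the two variables are denoted $\bar{x}$ and $s_x$ for the x-values, and $\bar{y}$ and $s_y$ for the y-values.
2. The correlation between x and y is:
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$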
Your calculator does this for you! First, make sure your diagnostics are on:
2nd-catalog (0)-scroll down to DiagnosticOn-enter-enter.
(You only have to set your calculator to this once, unless you change the batteries.)
Femur:   38 56 59 64 74
Humerus: 41 63 70 72 84
Input femur into L1 and humerus into L2.
Graph it. Does it look positive or negative? Is the correlation strong, moderately strong, or weak?
Find the correlation between the two bones in the fossil specimens.
To find the correlation (r):
TI-84: Stat-calc-8 (use 8, not 4), then set Xlist: L1, Ylist: L2, FreqList: (leave this blank), Store RegEQ: Y1 (vars-Yvars-1-1), and select Calculate.
TI-83: Stat-calc-8 (use 8, not 4), then type in L1, L2, Y1 and press enter.
r = 0.9941. There is a very strong positive linear relationship between the femur and the humerus.
(Make sure you write out the sentence that describes the r value, not just r = 0.9941.)
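If you want to check the calculator's answer by hand, here is a minimal Python sketch. It assumes the standard textbook formula for r (convert each x and y to a z-score, multiply the pairs, add them up, and divide by n - 1). The function names are made up for this illustration; the data are the femur and humerus values from the notes.

```python
from math import sqrt

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Correlation r: sum of the z-score products, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    z_products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)

# Fossil data from the notes
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

r = correlation(femur, humerus)
print(round(r, 4))  # 0.9941, matching the calculator
```

The code only produces the number. You still have to write the sentence that describes it.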
When a question asks for the correlation, you have to give the r value and ALSO describe it in a sentence EVERY TIME! (That is what they grade you on for the AP exam.)
The sentence should include the strength (strong, weak, ...), the direction (positive or negative), and the form (linear).
Suppose r = 0.87 for your test grades versus the hours you studied.
Answer: r = 0.87. There is a moderately strong positive linear relationship between hours studied and test grades.
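The pieces of that sentence can be laid out as a small Python sketch. The function describe_correlation is hypothetical, written only for this illustration. The direction comes from the sign of r, the form is always "linear", and the strength word is passed in by you, because the notes do not give numeric cutoffs for strong versus weak.

```python
def describe_correlation(r, strength, x_name, y_name):
    """Build the full-credit sentence: r value, strength, direction, form."""
    if not -1 <= r <= 1:
        raise ValueError("correlation must be between -1 and 1")
    if r == 0:
        raise ValueError("r = 0 has no direction to describe")
    # Direction is read straight off the sign of r.
    direction = "positive" if r > 0 else "negative"
    return (
        f"r = {r}. There is a {strength} {direction} linear "
        f"relationship between {x_name} and {y_name}."
    )

sentence = describe_correlation(
    0.87, "moderately strong", "hours studied", "test grades"
)
print(sentence)
```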
Facts About Correlation
#1- A positive r indicates a positive association; a negative r indicates a negative association.
#2- Correlation always falls between -1 and 1 (the closer to 1 or -1, the stronger it is).
#3- r is standardized, so it does not change when the units of measurement change. (Go back and look at the actual formula for r. It is really just converting the x's and y's to z-scores.)
#4- Correlation measures the strength of only linear relationships between 2 variables.
#5- Correlation is strongly affected by outliers!
#6- Correlation is non-directional (swapping x and y doesn't change it!).
Correlation is not a complete description of two-variable data!
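Several of these facts can be checked numerically. The sketch below reuses the femur and humerus data from the notes. The rescaling factor, the extra outlier point (80, 40), and the curved data set are all made up for illustration. The correlation function uses an algebraically equivalent form of the z-score formula, in which the (n - 1) factors cancel.

```python
from math import sqrt, isclose

def correlation(xs, ys):
    """Correlation r from sums of deviations (equivalent to the z-score form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]
r = correlation(femur, humerus)

# Fact 3: changing units (here, a hypothetical factor of 10) leaves r alone.
r_rescaled = correlation([10 * x for x in femur], humerus)

# Fact 6: swapping x and y leaves r alone.
r_swapped = correlation(humerus, femur)

# Fact 5: one made-up outlier far from the pattern pulls r down sharply.
r_outlier = correlation(femur + [80], humerus + [40])

# Fact 4: a perfect curved (non-linear) pattern can still give r = 0.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [x ** 2 for x in curve_x]
r_curved = correlation(curve_x, curve_y)

print(isclose(r, r_rescaled), isclose(r, r_swapped))  # True True
print(r_outlier < r)                                  # True
print(r_curved)                                       # 0.0
```

The last result is why correlation alone is not a complete description: y is perfectly determined by x in the curved data, yet r reports no linear relationship at all.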