AP Statistics – Chapter 3 Practice Quiz MODEL SOLUTIONS

Multiple Choice:

1. C is correct. This description addresses the form (linear), direction (positive), and strength (moderately weak) of the relationship. There are no observations that do not fit the overall pattern. The other options are not the best choices because a description of the relationship between two quantitative variables should address the form, direction, and strength of the relationship, along with any observations that do not fit the overall pattern.

2. B is correct. The teacher wants to know if textbook reading has an impact on grades, so there is a clear explanatory-response relationship where textbook reading is the explanatory variable and AP Stats grades are the response variable.

3. D is correct. Choice (a) is incorrect because correlation does not imply causation. There may be other variables involved that are associated with both textbook reading and grades, such as increased time for studying or a healthy home environment that is conducive to studying. Both (b) and (c) are correct because, while this correlation does not establish a causal link between these two variables, it is close to +1, implying a strong association. This may suggest that students should consider increasing the amount of time they spend reading their textbooks (but it does NOT mean that doing so WILL for sure increase their grades; other factors are involved).

4. A is correct. The least-squares line is the line that makes the sum of the squared residuals as small as possible. Choice (b) is incorrect because the least-squares line is based on the vertical distances between the observed values and the regression line, not the perpendicular distances (measured at 90 degrees to the regression line). Choice (c) is incorrect because it refers to the distances between the explanatory values instead of the response values.

5. B is correct. If one looks only at the o's corresponding to females, there is a clear downward trend indicating a negative correlation between weight and time. The same is true for the +'s corresponding to males. Choice (a) is incorrect because the correlation r measures association between two quantitative variables. Since "gender" is not quantitative, r is not an appropriate measure of association between gender and weight. Choice (c) is incorrect because if this statement were true, then the cluster of +'s corresponding to males would lie distinctly below the cluster of o's corresponding to females (that is, the +'s would tend to have smaller y-coordinates than the o's), which is clearly not the case here. Choice (d) is incorrect because the r-value should be negative since there is a negative association (as weight increases, time to raise pulse to 140 bpm decreases). Choice (e) is incorrect for two reasons: 1) we don't have a least-squares regression line graphed or given to us, and 2) predicting the time it takes for a 300 lb. male would be considered extrapolation, since 300 pounds is far above the observed interval of 90 pounds to 180 pounds.

Oops, there is no #6.

7. B is correct. The number of sharks and the number of beach deaths are both quantitative variables, and r = 0.33 is between −1 and 1 and is unitless. Choice (a) is incorrect because zip codes are actually a categorical variable, so we cannot measure an association using r. Choice (c) is incorrect because the correlation coefficient should not have units. Choice (d) is incorrect because we cannot conclude whether a linear model is appropriate without looking at the scatterplot and residual plot, even if the r-value appears to be very close to 1 or −1. Choice (e) is incorrect because (d) is incorrect.
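Solution 7 relies on r being unitless and always falling between −1 and 1. The short Python sketch below illustrates that property by computing r before and after a change of units. The weight and time values are hypothetical numbers invented for this illustration, not data from the quiz, and `pearson_r` is a helper name chosen here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (NOT from the quiz): weight in pounds, time in minutes.
weights_lb = [95, 110, 125, 140, 160, 175]
times_min = [9.8, 9.1, 8.0, 7.4, 6.1, 5.5]

r_original = pearson_r(weights_lb, times_min)

# Convert pounds to kilograms and minutes to seconds, then recompute r.
weights_kg = [w * 0.4536 for w in weights_lb]
times_sec = [t * 60 for t in times_min]
r_converted = pearson_r(weights_kg, times_sec)

print(r_original, r_converted)  # the two values agree up to rounding error
```

Because r is built from standardized deviations, rescaling either variable by a positive constant leaves it unchanged, which is why an answer choice that attaches units to r cannot be right.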



8. C is correct. Choice (a) is incorrect because you have misinterpreted r², the proportion of variation in the response variable that can be explained by regression on the explanatory variable, as r. Choice (b) is incorrect because you performed the wrong operation to transform the given information (you should take the square root of 0.81 instead of squaring it). Choice (d) is incorrect because it is negative: the association given in the problem implies a positive correlation, since increased studying is associated with better scores.
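The reasoning in solution 8 can be sketched in a few lines of Python: take the square root of r² to get the magnitude of r, then attach the sign that matches the direction of the association. The function name `r_from_r_squared` and its `positive_association` parameter are assumptions made for this illustration.

```python
import math

def r_from_r_squared(r_squared, positive_association):
    """Recover r from r^2, using the direction of the association for the sign."""
    magnitude = math.sqrt(r_squared)
    return magnitude if positive_association else -magnitude

# r^2 = 0.81 with a positive association (more studying, better scores).
r = r_from_r_squared(0.81, positive_association=True)
print(round(r, 2))
```

Squaring 0.81 instead (the mistake behind choice (b)) would give a smaller number, and reporting 0.81 itself as r is the mistake behind choice (a).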

9. D is correct. Choice (IV) is the correct choice because foot length is correctly identified as the explanatory variable and height is correctly identified as the predicted response variable, with a "hat" on top. Additionally, the slope is correctly identified as 1.878 and the y-intercept as 117.99. Choice (I) is incorrect because y should be ŷ, since the least-squares regression line is a prediction model, and also because the variables x and ŷ should be defined for the equation. Choice (II) is incorrect for the same reasons as (I), and also because the slope and y-intercept are switched. Choice (III) is incorrect because, again, the slope and y-intercept are switched. Choice (V) is incorrect because the explanatory and response variables are switched, as are the slope and y-intercept.

Free Response:

10. The following scatterplot was created using Stapplet.com (two quantitative variables applet).

11. There is a linear, negative, fairly strong association between time and number of people paying attention. There are no outliers or unusual features.

12a) Slope: b = r · (s_y / s_x) = −0.99 · (s_y / s_x) = −9.39125 ≈ −9.39 people per hour

*** Use your L1 and L2, LinReg, and 2nd STAT > MATH > StdDev options to find r, s_x, and s_y.

Y-intercept: a = ȳ − b·x̄ = 51.5714 − (−9.39125)(3) = 79.74517857 ≈ 79.75

The least-squares regression line ŷ = a + bx can be written either way:

• ŷ = 79.75 − 9.39x, where x represents time (hours) and ŷ represents the predicted number of people paying attention (careful not to forget "predicted"!)

• predicted people paying attention = 79.75 − 9.39(time)
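The hand calculation in 12a can be mirrored in a short Python sketch. The two helper functions are names chosen for this illustration. The intercept uses only the values printed in the solution (slope −9.39125, mean time 3 hours, mean count 51.5714), so the last decimals may differ slightly from a calculator that carries more digits.

```python
def slope_from_summary(r, s_x, s_y):
    """Least-squares slope from the correlation and the two standard deviations."""
    return r * s_y / s_x

def intercept_from_means(mean_x, mean_y, slope):
    """Least-squares intercept: the line passes through (mean_x, mean_y)."""
    return mean_y - slope * mean_x

# Values quoted in solution 12a.
slope = -9.39125
intercept = intercept_from_means(mean_x=3, mean_y=51.5714, slope=slope)

print(round(slope, 2), round(intercept, 2))
```

The intercept formula is just the statement that the regression line always passes through the point (x̄, ȳ), solved for a.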


12b) ŷ = 79.75 − 9.39x is the same as what we calculated by hand.

13) ŷ = 79.75 − 9.39(2.5) = 56.2678 ≈ 56 people (evaluated with the calculator's unrounded coefficients). There are about 56 people predicted to be paying attention at 2.5 hours.

14) r = −0.99. There is a strong, negative linear association between time and number of people paying attention.

15) r² = 0.98. 98% of the variation in the number of people paying attention can be accounted for by the least-squares regression line relating time and number of people paying attention.

16) slope = −9.39. For each additional hour that passes, the number of people paying attention is predicted to decrease by 9.39 people.

17) y-intercept = 79.75. At the start of the training (time = 0 hours), the predicted number of people paying attention is 79.75 people.

18) No, you should not make a prediction for the number of people paying attention after a 13-hour conference because this is considered extrapolation. 13 hours is 7 hours above the interval of 0 to 6 hours used to calculate the least-squares regression line ŷ = 79.75 − 9.39x. Our prediction would therefore be unreliable, because it could overestimate or underestimate the actual number of people paying attention.
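One way to make the extrapolation warning in solution 18 concrete is a prediction helper that refuses inputs outside the observed range. This is only a sketch: `predict_attention` and its parameter names are invented for this illustration, and the coefficients are the rounded values 79.75 and −9.39 from the solutions above.

```python
def predict_attention(hours, intercept=79.75, slope=-9.39, x_min=0.0, x_max=6.0):
    """Predict people paying attention; refuse to extrapolate outside [x_min, x_max]."""
    if not x_min <= hours <= x_max:
        raise ValueError(
            f"{hours} hours is outside the observed range "
            f"{x_min} to {x_max} hours; predicting here is extrapolation"
        )
    return intercept + slope * hours

# Inside the observed range: a prediction is returned.
inside = predict_attention(2.5)

# Outside the observed range: the helper raises instead of predicting.
try:
    predict_attention(13)
    extrapolation_blocked = False
except ValueError:
    extrapolation_blocked = True

print(inside, extrapolation_blocked)
```

Plugging 13 hours straight into the rounded line would give roughly −42 people, an impossible value that shows why the line should not be trusted that far outside the data.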

19) The Minitab output gives us the same information as above.
slope = −9.393
y-intercept = 79.75
ŷ = 79.75 − 9.39x, where x represents time (hours) and ŷ represents the predicted number of people

20) S = 3.143. We will typically (or on average) be off by about 3.143 people when we use the least-squares regression line to predict the number of people paying attention from the time passed.

21) ŷ = 79.75 − 9.39(4) = 42.178 people *** I used Y1(4) in my calculator.
Residual = y − ŷ = 48 − 42.178 = 5.82
The actual number of people paying attention after 4 hours is 5.82 people higher than predicted by the least-squares regression line.
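The residual in solution 21 is simply observed minus predicted. A minimal Python sketch, using the observed count of 48 and the predicted value 42.178 quoted above (the function name `residual` is chosen here for illustration):

```python
def residual(observed, predicted):
    """Residual = actual response minus predicted response."""
    return observed - predicted

# Values from solution 21: 48 people observed at 4 hours, 42.178 predicted.
res = residual(observed=48, predicted=42.178)

# A positive residual means the point lies above the regression line.
print(round(res, 2), "above the line" if res > 0 else "on or below the line")
```

Keeping the order as actual minus predicted matters: reversing it flips the sign and therefore the interpretation.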

22) A linear model is appropriate because there is no leftover pattern (only random scatter) in the residual plot.


23) Yes, Michael Jordan walking into the training would be influential because the point (7, 82) lies outside the overall pattern of observations.

Before we added this new point (7, 82), the least-squares regression line was ŷ = 79.75 − 9.39x and had an r-value of −0.99. After adding the point (7, 82), the line changed significantly to ŷ = 68.42 − 3.73x and the r-value weakened to −0.418.

Since the least-squares regression line is not resistant to influential observations, the slope drastically increased from −9.39 to −3.73, and the y-intercept decreased from 79.75 to 68.42.

The correlation coefficient is also not resistant to influential points. The correlation between time and people paying attention weakened because the correlation coefficient r = −0.418 is closer to 0 than r = −0.99. This happened because the point (7, 82) does not follow the linear trend.
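The effect described in solution 23 can be reproduced in spirit with a small Python sketch. The quiz's raw data are not reproduced in these solutions, so the seven (time, count) pairs below are hypothetical values invented to follow a similar strong negative trend. Only the added point (7, 82) comes from the solution, and `fit_line` is a helper name chosen here.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept, r) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical data (NOT the quiz's actual data): hours and people paying attention.
hours = [0, 1, 2, 3, 4, 5, 6]
people = [80, 70, 61, 52, 43, 33, 24]

slope_before, intercept_before, r_before = fit_line(hours, people)

# Add the point from solution 23 that breaks the pattern.
slope_after, intercept_after, r_after = fit_line(hours + [7], people + [82])

print(slope_before, intercept_before, r_before)
print(slope_after, intercept_after, r_after)
```

With these made-up values the same qualitative changes appear: the slope moves toward zero, the intercept drops, and r moves closer to 0 once the unusual point is included.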