
Section 2: Data Collection, Sampling, and Experimental Design

The following maps the videos in this section to the Texas Essential Knowledge and Skills for Mathematics TAC §111.47(c).

2.01 Sample Surveys

• Statistics (1)(A)
• Statistics (2)(B)
• Statistics (2)(C)
• Statistics (2)(E)

2.02 Sources of Bias in Sampling and Surveys

• Statistics (2)(A)
• Statistics (2)(C)

2.03 Sampling Methods – Part 1

• Statistics (1)(A)
• Statistics (2)(A)

2.04 Sampling Methods – Part 2

• Statistics (1)(A)
• Statistics (2)(A)
• Statistics (2)(G)

2.05 Experiments vs Observational Studies

• Statistics (2)(B)

2.06 Three Principles of Experimental Design

• Statistics (1)(B)
• Statistics (2)(C)
• Statistics (2)(E)
• Statistics (2)(F)

2.07 Lurking and Confounding Variables

• Statistics (2)(G)

Copyright 2017 Licensed and Authorized for Use Only by Texas Education Agency


2.08 Confidence Intervals in the Real World

• Statistics (1)(B)
• Statistics (2)(C)
• Statistics (2)(E)
• Statistics (2)(F)
• Statistics (2)(G)

Note: Unless stated otherwise, any sample data is fictitious and used solely for the purpose of instruction.


2.01 Sample Surveys

Sample survey – Designed to gather information about a small group from a population of interest

• A sample of subjects is taken from the ______________ and asked questions.
• The sample ________ matters, not the population size. The _______________ our sample, the more precise our estimates will be.
• To reduce bias in surveys, it is best to have __________ samples.
• ___________________ samples are made up of people or subjects that are easy to obtain.
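The link between sample size and precision can be illustrated with a short Python sketch. It assumes a simple random sample and uses the usual approximation for the standard error of a sample proportion, sqrt(p(1 - p)/n), which these notes do not state. The function name `standard_error` is hypothetical, and the sample sizes and percentages echo the Dr. Malcolm problem in this section.

```python
import math

def standard_error(p_hat, n):
    """Approximate standard error of a sample proportion.

    Assumes a simple random sample; this formula is an assumption
    brought in for illustration, not something stated in the notes.
    """
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Values taken from the Dr. Malcolm problem in this section
small = standard_error(0.40, 15)    # 15 patients, 40% reported side effects
large = standard_error(0.30, 760)   # 760 patients, 30% reported side effects

print(f"n = 15:  standard error is about {small:.3f}")
print(f"n = 760: standard error is about {large:.3f}")
```

A smaller standard error means a more precise estimate. Precision is only part of the story, though: how the subjects were selected also affects how much a study can be trusted.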

The design of a survey can have a major impact on results. A poorly designed and implemented study may be meaningless or misleading.

1. Suppose Dr. Malcolm is researching the side effects of a drug intended to relieve pain. She surveys 15 of her patients and finds that 40% of them experienced fatigue or muscle ache. Two other doctors researching the same drug found that from a random sample of 760 patients across the country, 30% experienced fatigue or muscle ache. Which study is more credible? Justify your answer.


When designing a survey, the following questions should be considered:

• What are the goals?
• What is the targeted population?
• What questions will be asked?
• How will the participants be selected?
• How will the data be analyzed and presented?

Other questions may need to be considered as well, depending on the circumstances.

2. Below are three hypothetical survey questions and statements. For each, identify any flaw(s) and suggest a way to improve it.

i. “How satisfied are you with your pay, health benefits, and workload?”

ii. “With the increase in speeding tickets on I-10 these past few months, do you think the speed limit should be increased since it seems everyone is driving fast anyway?”

iii. “What is your marital status?”


2.02 Sources of Bias in Sampling and Surveys

Suppose that you work for a research agency, and your job includes surveying voters and predicting who is going to win the upcoming election. You call 1,000 voters, ask which candidate they intend to vote for, record and tabulate the answers, and make some predictions. After you collect the data, you discover that almost every survey respondent was a low-income voter and a registered Democrat. Is there a problem with your data?

In survey sampling, _______________ is defined as the systematic favoring of certain outcomes.

A good sample is __________________, meaning that each person or item in the population is equally likely to be selected for the sample.

Bias – The results obtained from a sample do not accurately represent the population.

Sources of Bias

• Undercoverage – A portion of the population is excluded or underrepresented.

Example: The population is high school students, but our sample contains only freshmen.

• Nonresponse bias – People do not respond to the study.

Example: One hundred people are selected from the phone book to be surveyed, but 40 of them hang up before responding.

• Response bias – The survey design influences the responses.

Examples: “Do you really think we should have band and chorus in school when we have so many core subjects to learn?” or “Do you only care about looks?”

Sampling Techniques That Create Bias

• Voluntary response sample – The respondents ______________ choose to submit their responses.

• Convenience sample – The sample is selected based on ease and cost to the interviewer.


1. Suppose you take a survey about school attendance. The woman conducting the survey reminds you of your aunt, so you respond “No” when she asks if you have ever been late to class, even though you have. Which of the following types of bias is/are present in your response: undercoverage, nonresponse bias, or response bias?

2. Two polls show significantly different results. One asks, “Do you think school hours should be decreased?” and the other asks, “Do you think school hours should be decreased considering there is no time for outside play?” Which of the following types of bias is/are present in this scenario: undercoverage, nonresponse bias, or response bias?

3. Suppose that a politician wants to know how the residents of his district will react to a bill that raises the full retirement age to 70. He runs the following ad during Sunday Night Football: “Let us know what you think! Would you be in favor of raising the retirement age to 70, or would you rather keep the retirement age at 67, when many seniors are still productive and healthy enough to stay working? Give us a call at 1-800-555-1111, and give us your opinion!” Identify the sources of bias present in this ad, and justify your answer.


2.03 Sampling Methods – Part 1

Suppose you are examining a study and you find that the results are biased because of factors such as undercoverage and convenience sampling. You are very interested in the topic and would like to replicate the study to produce results that are unbiased and represent the population.

In general, __________________________ use the laws of probability in the selection process and minimize bias.

Random Sampling Techniques

• Simple random sample (SRS) – Each subject in the population has an equal chance of being selected. Random number generators are useful in this process.

• Stratified random sampling – The population is divided into homogeneous groups, and SRSs are taken from each group.

Example: Dividing a high school into freshmen, sophomores, juniors, and seniors, and taking a simple random sample from each group

• Systematic random sampling – A random number generator is used to determine a value for k, and then every kth observation is sampled.

• Cluster random sampling – The population is divided into heterogeneous groups (clusters), several groups are selected at random, and every subject in the selected groups is sampled.

Example: Dividing a high school based on class period, randomly selecting several classrooms, and sampling everyone in those classrooms

• Multistage random sampling – This approach involves a combination of the random sampling methods listed above.
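The first four techniques can be sketched in Python with the standard `random` module. Everything about the population here is hypothetical: 400 students, 100 per grade, grouped into 20 classrooms of 20. The systematic version is one common variant that fixes k and picks a random starting point, which is an assumption rather than the only way to do it.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 400 students, 100 per grade, 20 classrooms of 20
grades = ["freshman", "sophomore", "junior", "senior"]
students = [
    {"id": i, "grade": grades[i // 100], "classroom": i // 20}
    for i in range(400)
]

def simple_random_sample(population, n):
    """Every subject has the same chance of being chosen."""
    return random.sample(population, n)

def stratified_sample(population, key, n_per_stratum):
    """Split into groups by `key`, then take an SRS from every group."""
    strata = {}
    for unit in population:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(random.sample(members, n_per_stratum))
    return sample

def systematic_sample(population, k):
    """Pick a random start among the first k units, then take every kth unit."""
    start = random.randrange(k)
    return population[start::k]

def cluster_sample(population, key, n_clusters):
    """Randomly pick whole groups and keep everyone in the chosen groups."""
    clusters = sorted({unit[key] for unit in population})
    chosen = set(random.sample(clusters, n_clusters))
    return [unit for unit in population if unit[key] in chosen]

srs = simple_random_sample(students, 40)
stratified = stratified_sample(students, "grade", 10)
systematic = systematic_sample(students, 10)
cluster = cluster_sample(students, "classroom", 2)
```

Each call returns 40 students, but the route differs: the stratified sample is guaranteed 10 per grade, while the cluster sample contains two complete classrooms.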

1. Suppose Ms. Abernathy is conducting research on teachers’ attitudes toward homeschooling. She is particularly interested in describing the attitudes of teachers from rural, small urban, and large urban school districts. Which sampling procedure should Ms. Abernathy use to ensure her sample is representative of these types of school districts?


2. Suppose you are a quality control engineer for a bottling company in Laredo, Texas, that focuses on soft drinks. Your job is to plan and oversee the various steps that are involved in processing and manufacturing each soft drink product to make sure the products maintain the highest standards of quality. Every time you want to test a product, you use random digits to randomly select five cans of soda from each batch of 50 cans. What sampling method are you using? Give an example of the random selection of cans.

3. You are conducting a poll in the mall and want to randomly select five people to interview from the next 100 people that walk past you. If you use the following random digits, what are the first two numbers that will be added to your sample?

412411756270184057528156592499

4. You are conducting a poll in the mall and want to randomly select 10 people to interview from the next 200 people that walk past you. If you use the following random digits, what are the first two numbers that will be added to your sample?

412411756270184057528156592499
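Reading labels off a string of random digits can be sketched as follows. The labeling convention is an assumption: people are numbered from 0 up to one less than the population size, so 100 people get two-digit labels (00 to 99) and 200 people get three-digit labels (000 to 199). Labels outside the range and repeated labels are skipped. The function name `select_from_digits` is hypothetical.

```python
def select_from_digits(digits, population_size, sample_size):
    """Read fixed-width labels from a digit string, left to right.

    Assumed convention: labels run from 0 to population_size - 1,
    so the label width is the number of digits in population_size - 1.
    """
    width = len(str(population_size - 1))
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip labels that are out of range or already selected
        if label < population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

digits = "412411756270184057528156592499"

print(select_from_digits(digits, 100, 5))   # two-digit labels
print(select_from_digits(digits, 200, 10))  # three-digit labels
```

Under this convention the first call returns [41, 24, 11, 75, 62]. The second returns only [184, 57, 156], because the digit string runs out before 10 valid labels are found; more random digits would be needed to finish that sample.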


2.04 Sampling Methods – Part 2

1. Which type of random or nonrandom sampling technique is being used in each scenario below?

i. You are a quality control engineer for an orange juice company. You use the following random digits to randomly select five cartons of orange juice from a batch of 50 cartons of orange juice.

743788300036123412325423881261

ii. Percy Weasley is conducting an investigation. He randomly samples from all students at Hogwarts until he has 25 Slytherins, 20 Gryffindors, 20 Ravenclaws, and 15 Hufflepuffs.

iii. A polling company in New York wants to determine whether registered voters in the state recently voted “Yes” on a new amendment. They randomly select 200 registered voters from every county in the state and ask each person whether he or she voted “Yes” or “No.”


2. The owner of a business park wants to survey the offices in the complex to estimate her tenants’ overall satisfaction. The business park has two buildings, one older and one newer. Each building has three floors of standard office space (20 offices per floor) and one top floor of large offices, which are double the size (10 large offices per floor). Half of the offices in each building face the two main roads, while the other half face a lake and several trees. Currently, 75% of the offices are occupied.

i. How many offices are there total?

ii. Explain how to select a simple random sample of 20 offices.

iii. Explain how to select a stratified random sample of 20 offices.

iv. Explain why selecting two random floors would not be a good way to obtain a cluster sample.
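One way to approach this problem is to build the full list of offices (the sampling frame) and sample from it. The Python sketch below is an illustration under stated assumptions: the top floor is numbered 4, occupancy and the road or lake view are ignored, and the stratified sample uses the building as the stratum, which is only one of several reasonable choices (office size or view would also work).

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Build the sampling frame: one tuple per office
offices = []
for building in ("older", "newer"):
    for floor in (1, 2, 3):
        offices += [(building, floor, "standard", n) for n in range(1, 21)]
    # Assumption: the floor of large offices is floor 4
    offices += [(building, 4, "large", n) for n in range(1, 11)]

total = len(offices)  # 2 * (3 * 20 + 10) = 140

# Simple random sample: every office has the same chance of selection
srs = random.sample(offices, 20)

# Stratified random sample: an SRS of 10 offices from each building
stratified = []
for building in ("older", "newer"):
    stratum = [office for office in offices if office[0] == building]
    stratified += random.sample(stratum, 10)

print(total, len(srs), len(stratified))
```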


2.05 Experiments and Observational Studies

Observational study – Researchers do not try to change anything; they just observe.

• Retrospective study – Researchers select subjects and determine their previous conditions or behaviors.

• Prospective study – Researchers follow subjects to observe future outcomes.

Example: Suppose you are studying the question “Do musicians have better grades?” You can take a high school and look up records (retrospective), or you can take a high school and track the next decade (prospective).

Experiment – Researchers manipulate factor levels to create treatments and compare responses.

• Experimental units – Subjects or participants

• Response – What we want to measure

• Factor – Categorical variable with at least two levels

o Level – The specific values of a factor that the experimenter chooses
o Treatment – A specific combination of levels, one from each factor (with a single factor, the treatments are simply its levels)


1. Determine whether each of the following research projects is an experiment or an observational study. If it is an observational study, identify it as retrospective or prospective.

i. A researcher wants to know which of three sororities has the highest GPA. She randomly samples 10 women from each sorority and records their GPAs.

ii. A researcher wants to compare the effects of three headache medications. She randomly assigns 20 patients suffering from headaches to receive one of four pills, including a placebo, and measures the time until each patient no longer feels the headache.

iii. A researcher wants to compare three brands of tires. He randomly assigns a tire brand to 24 cars, ensures that the cars are driven under similar conditions, and records how long each car’s tires last.

iv. A researcher wants to compare the salaries of four college majors. He randomly samples 20 recent graduates from each major and records their salaries over the next 5 years.


2. Is happiness affected by diet and exercise? A group of people is involved in a study to test this. They are told whether to exercise a low, moderate, or high amount each week and whether to maintain a healthy or unhealthy diet. In this experiment, identify each of the following:

i. Experimental units

ii. Factor(s)

iii. Level(s)

iv. Treatment(s)

v. Response variable
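When an experiment has more than one factor, the treatments can be listed by pairing every level of one factor with every level of the other. The Python sketch below does this for the diet and exercise study using `itertools.product`; the dictionary name `factors` and the level labels are illustrative choices, not part of the original problem statement.

```python
from itertools import product

# Two factors and their levels, as described in the problem
factors = {
    "exercise": ["low", "moderate", "high"],
    "diet": ["healthy", "unhealthy"],
}

# Each treatment is one level from every factor
treatments = list(product(*factors.values()))

for exercise, diet in treatments:
    print(f"exercise = {exercise}, diet = {diet}")

print(len(treatments))  # 3 levels x 2 levels = 6 treatments
```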


2.06 Three Principles of Experimental Design

Three fundamental principles that help to make a good experiment:

• Control – The treatment the control group receives is a placebo.

o If the subjects don’t know which pill they get, the subjects are blind.
o If the researcher doesn’t know either, the study is double-blind.

• Randomization – Subjects are chosen at random.

o A completely randomized design is a design in which the experimental units are randomly assigned to the treatment groups.
o A block design may be used to place subjects into similar groups.
o A matched pairs design is a special case of a block design, used when the experiment has only two treatments.

• Replication – Several experimental units are assigned to each treatment. The number of replications is typically the ratio of the number of experimental units to the number of treatments.
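Randomization and replication can be sketched in Python. The example reuses the tire study from section 2.05 (24 cars, three brands); the brand labels "A", "B", "C", the car names, and both function names are hypothetical. Units are shuffled and then dealt out to treatments in turn, which is one simple way to get equal group sizes, not the only valid randomization scheme.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)]
            for i, unit in enumerate(shuffled)}

def randomized_block(blocks, treatments, seed=None):
    """Randomize separately inside each block of similar units."""
    rng = random.Random(seed)
    assignment = {}
    for block_units in blocks.values():
        shuffled = list(block_units)
        rng.shuffle(shuffled)
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

cars = [f"car{i:02d}" for i in range(1, 25)]   # 24 experimental units
brands = ["A", "B", "C"]                       # 3 treatments

assignment = completely_randomized(cars, brands, seed=42)

# Replications: experimental units divided by treatments
replications = len(cars) // len(brands)
print(replications)  # 24 / 3 = 8
```

For a block design, `randomized_block` expects a dictionary that maps each block name to its list of units, for example cars grouped by vehicle type.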

1. Several methods of control exist: placebo group, blinding, and double-blinding. Write the correct method of control for each of the following situations.

i. The treatment group is given a fake pill because people often get better simply because they think they are taking something that will make them better.

ii. Neither the researcher(s) nor the subjects know who is getting what treatment. This helps to ensure all treatment groups are treated as equally as possible.

iii. The subjects do not know which treatment they are getting. This helps to ensure all treatment groups are treated as equally as possible.


2. Determine whether each of the following scenarios is a completely randomized design, a randomized block design, or a matched pairs design. Justify your answer.

i. To test for differences between three brands of gasoline, 20 compact cars, 20 midsized cars, 20 hybrid cars, and 20 luxury cars will be randomly assigned one of the three gasoline brands. All the cars will be driven under similar conditions. Then their miles per gallon (mpg) will be calculated and compared.

ii. A farmer wishes to study the effect of two different pesticides on his strawberries. He will randomly assign side-by-side plots of field to receive either pesticide or no pesticide (control group). Then he will measure and compare the yield of strawberries within a pre-determined period of time for each plot.

iii. To compare the abilities of male and female firefighters, 10 male and 10 female firefighters, matched according to their weights, will be asked to run up five flights of stairs while carrying the same fire hose. The differences in times will be determined and compared.


2.07 Lurking and Confounding Variables

Earlier we discussed variables that influence the results of an experiment. Two types of variables can affect the results of a study: ______________ variables and _____________ variables.

Lurking variable – A variable that is not incorporated into the design of a research study

Confounding variable – A variable that is incorporated into the design of a research study

These two types of variables have similar effects. Each drives the behavior of two other variables observed in a study, creating an apparent association between those two variables. However, when two variables are confounded, they are intertwined in such a way that figuring out which of them (or perhaps which combination) is affecting another variable is a huge challenge.
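A small simulation can show how a third variable creates an apparent association between two others. The Python sketch below uses entirely synthetic numbers: a made-up `fire_size` value drives both the number of firefighters sent and the damage done, and neither of those two affects the other. The coefficients, noise levels, and the helper `pearson` are all assumptions chosen for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic data: fire size is the variable left out of the "study"
fire_size = [rng.uniform(1, 10) for _ in range(500)]
firefighters = [2 * size + rng.gauss(0, 1) for size in fire_size]
damage = [5 * size + rng.gauss(0, 3) for size in fire_size]

# Firefighters and damage are strongly correlated even though,
# by construction, neither one influences the other.
r = pearson(firefighters, damage)
print(f"correlation between firefighters and damage: {r:.2f}")
```

In this setup the strong positive correlation comes entirely from fire size, which is why an association alone does not show that one variable causes the other.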

1. For the following scenarios, state whether the issue is with lurking variables or confounding variables.

i. A legislator proposes a reduction of funding for fire departments in his state because a study concluded that the more firefighters involved in fighting a fire, the worse the damage the fire causes.

ii. A student conducts research on the difference in body mass index (BMI) between economics students and education students. He was inspired by a previous study that concluded that economics students have higher BMIs than education students. The student’s advisor suggests that he include biological sex in his study. The student finds that there is no difference between economics and education students’ BMIs, challenging previous findings.


2.08 Generalization of Results and Conclusions

What results can be drawn from observational studies, experiments, and surveys?

________________ refers to the extent to which the findings from a study can be generalized, or applied, to a larger population. It requires ______________ selection.

1. Which of the following describes when it is appropriate to generalize study results to a larger population?

A. In experimental studies, when participants are conveniently selected from a large population

B. In experimental studies, when participants are randomly selected from a large population

C. In observational studies, when participants are selected from a pool of interested people

2. A study evaluates the efficacy of a specific treatment in elderly Caucasian men who have coronary heart disease.

i. Discuss when we can generalize or apply the results of this study to a larger population.

ii. Suppose that a health center in your city wants to apply the results of this study to develop outreach programs and services for Hispanic women and African-American men. Discuss the generalizability issues that may arise from this action.

3. Suppose a popular radio station conducted a telephone survey from 11:00 a.m. to 1:00 p.m. to evaluate the popularity of the station among the local population. What is the problem with the generalizability of the results and conclusions of this study?
