Question 4 What are data and what do they mean to a scientist?

Post on 20-Jan-2016

Question
- What are data and what do they mean to a scientist?

Dinner at the Urquhart House
- Brought to you by the Briggs Multiracial Alliance
- Sunday night
- All food provided (probably Chinese)
- Contact Mimi Reddy, reddydee@msu.edu for details

Data, Statistics, and Spreadsheets
- What are data?
- What are statistics?
- What are spreadsheets?
- How can you analyze data with spreadsheets?

Data
- Data are pieces of information
- Data can be numbers, words, descriptions
- Data have UNITS
- The word data is PLURAL, datum is singular

Data about Willoughby
- Age: 5 (years)
- Height: 47 (inches)
- Weight: 66 (pounds)
- Eyes: Blue
- Favorite word: Wrestle
- Favorite letter: W

Types of Data
- Numbers: two types
  - Real numbers: can have a fractional part, e.g. 28.75 lbs
  - Integers: whole numbers, e.g. 18 months
- Letters: called characters in programming
  - W is a character
- Words: called strings in programming
  - "No thanks" is a string; strings can be individual words or phrases

Statistics and Data
- Test Scores:
  - Jeff: 88
  - Mollie: 92
  - Marcie: 88
  - Dave: 47
  - Karim: 99
  - Willoughby: 42
  - Benjamin: 0
- What statistics can you calculate to describe these data?
- Try to think of four things to describe the data
- Stop

Statistics
- Statistics are derived from the data
- Statistics are descriptions of data
- Statistics are meant to simplify the data
- Statistics can be misleading

Typical Statistics
- Sample Size: number of individuals measured = n
- Sum = Σx
- Average or Mean = Σx/n
- Median: value of the 50th percentile; half of the values fall above, half below
- Maximum, Minimum, Range (Max - Min)
- Mode: most common value
- Standard deviation (SD)
- Variance (SD²)

Analyze these data...
- Mean, max, min, range, median, mode
- 1833447493829455
- sample size (n)
- Sum: Σx
- mean = average = Σx/n, denoted x̄
- median = halfway
- mode = most common

Spreadsheets
- Spreadsheets are tables
- Spreadsheets allow calculations and manipulations of data
- Calculations: mean, standard deviation
- Manipulations: sort

              Costa Rica    Nicaragua
Rainforest       625,000    3,712,000
Dry Forest        50,000      300,000
Total            675,000    4,012,000

Make a data table:
- Fly 1, length 13.4 mm, velocity 27 kph, age 21 days
- Fly 2, length 9.4 mm, velocity 0 kph, age 220 days
- Fly 3, length 9.3 mm, velocity 44 kph, age 1 day
- Fly 4, length 13.4 mm, velocity 17 kph, age 32 days
- Fly 5, length 17.4 mm, velocity 33 kph, age 11 days
- How many columns?
- How many rows?
- Do the numbers go down or across?

Data Table

Microsoft Excel
- Typical spreadsheet program
- Lotus 1-2-3 was an early commercial spreadsheet (VisiCalc came first)
- Has similar controls to MS Word
- Now allows graphing (charts), but only in very restricted formats; it is hard to get exactly what you want
- Excel tables and graphs can be copied into MS Word

Friday's Assignment
- We will work with Microsoft Excel to analyze some data
- Groups of two will submit one finished spreadsheet for the assignment

Graphs
- Many different types of graphs
  - Points
  - Lines
  - Bars
  - Pies

Point Graphs
- Called X-Y Scatter in MS Excel
- Plot points based on X and Y values
- Can fit a REGRESSION LINE to the data
  - The line that best fits the data

X-Y Scatter

Bar Graphs
- Categorize data into counts or percents
- Categories can be descriptive categories (Windows 98, Windows 2000, ...)
- Can also be numeric categories
  - Height: 60-63, 63-66, etc., or just 61, 62, 63
  - Count up the number of people in each group
- Histograms are a particular type of bar graph

Bar Graph
[Chart: Starting Salary by year]

Year    Starting Salary
1988    $36,000
1989    $38,000
1990    $39,500
1991    $41,000
1992    $43,000
1993    $45,000
1994    $47,000

Histogram
- X axis is categories
- Y axis is the number or proportion of observations in that category

Histogram Bar Graph
[Chart: Number of Crashes]

Regular Bar Graph vs. Histogram Bar Graph
[Chart: Starting Salary by year, 1988-1994, same data as the Bar Graph slide]

Distributions
- Special type of histogram with a continuous numeric scale at the bottom
- The normal distribution is a key concept in statistics
- A skewed distribution is one that is unbalanced

Sample distribution histograms
- Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt
- Robert D. Duval, PS 400 Lecture, www.polsci.wvu.edu/duval/ps400/Notes/400Notes.ppt

The NORMAL Distribution
- A NORMAL DISTRIBUTION is the theoretical distribution of values given natural variation around a MEAN
- It is a balanced, humped (bell-shaped) distribution

Distributions
- Skew is an imbalance in the distribution
- Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt

Hypothesis Testing
- Statistical tests are how scientists decide if data SUPPORT their hypothesis (NOT PROVE their hypothesis)
- Four major statistical tests: T-test, χ² (chi-square) test, Regression, ANOVA

Hypothesis
- Processor speed has an effect on the performance of the computer.

Null Hypothesis
- H0: Processor speed has NO EFFECT on the performance of a computer.

Statistical Tests and Probability
- Statistical tests give a value
- That value can be related to a probability (P)
- P is the probability of getting data at least as extreme as yours if the NULL hypothesis were correct
- If P < 0.05 (1/20), then you reject the NULL hypothesis

T-Test
- Compares the difference between two means
- Formula: T = (x̄1 - x̄2)/SEM
- SEM is the Standard Error of the Mean: SD/√N for a single mean; for two means, the two standard errors are combined
- T value: the difference between the means in comparison to the amount of spread in your data

T-Values
- If T > 2.5 or 3.0, the difference is usually significant (this depends on your sample sizes)
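
Worked example (not part of the original slides): a minimal Python sketch of the Typical Statistics and the T formula above. The test scores come from the Statistics and Data slide. The `slow` and `fast` benchmark lists and the function names `describe` and `t_statistic` are hypothetical and made up for illustration. The slides do not say how the standard errors of two means are combined, so this sketch assumes SD/√n for each mean and combines them as in Welch's t statistic.

```python
import math
import statistics

# Test scores from the "Statistics and Data" slide.
scores = [88, 92, 88, 47, 99, 42, 0]


def describe(values):
    """Return the typical statistics from the slides for a list of numbers."""
    n = len(values)
    total = sum(values)
    return {
        "n": n,
        "sum": total,
        "mean": total / n,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "max": max(values),
        "min": min(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),            # sample SD (divides by n - 1)
        "variance": statistics.variance(values),   # SD squared
    }


def t_statistic(a, b):
    """Difference between two means divided by the standard error of that difference.

    Assumption: each standard error is SD / sqrt(n), and the two are
    combined as in Welch's t statistic.
    """
    se_a = statistics.stdev(a) / math.sqrt(len(a))
    se_b = statistics.stdev(b) / math.sqrt(len(b))
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se_a**2 + se_b**2)


# Hypothetical benchmark results for computers with slow and fast processors.
slow = [10, 12, 11, 13, 14]
fast = [15, 17, 16, 18, 19]

summary = describe(scores)
t = t_statistic(fast, slow)

for name, value in summary.items():
    print(f"{name}: {value}")
print(f"T = {t:.2f}")
```

Turning T into a P value also needs the degrees of freedom, which depend on the sample sizes; that step is left to a statistics package or a T table.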