Question 4: What are data and what do they mean to a scientist?
Post on 20-Jan-2016
Question
  What are data and what do they mean to a scientist?
Dinner at the Urquhart House
  Brought to you by the Briggs Multiracial Alliance
  Sunday night
  All food provided (probably Chinese)
  Contact Mimi Reddy, firstname.lastname@example.org, for details
Data, Statistics, and Spreadsheets
  What are data?
  What are statistics?
  What are spreadsheets?
  How can you analyze data with spreadsheets?
Data
  Data are pieces of information
  Data can be numbers, words, descriptions
  Data have UNITS
  The word "data" is PLURAL; "datum" is singular
  Data about Willoughby:
    Age: 5 (years)
    Height: 47 (inches)
    Weight: 66 (pounds)
    Eyes: Blue
    Favorite word: Wrestle
    Favorite letter: W
Types of Data
  Numbers: two types
    Real numbers: numbers that can have a decimal part, e.g. 28.75 lbs
    Integers: whole numbers, e.g. 18 months
  Letters: called characters in programming
    W is a character
  Words: called strings in programming
    "No thanks" is a string; strings can be individual words or phrases
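As a rough illustration, the examples on this slide could be stored with Python's built-in types. The variable names are made up for this sketch, and note one difference from the slide: Python has no separate character type, so a single letter is simply a one-character string.

```python
# Hypothetical variable names; the values are the examples from the slide.
weight_lbs = 28.75         # real number (stored as a float)
age_months = 18            # integer (whole number)
favorite_letter = "W"      # a character (in Python, a string of length 1)
reply = "No thanks"        # a string (here, a two-word phrase)

for value in (weight_lbs, age_months, favorite_letter, reply):
    print(repr(value), "is of type", type(value).__name__)
```

The units (lbs, months) live only in the variable names here, so it is up to the person writing the code to keep track of them.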
Statistics and Data
  Test Scores:
    Jeff: 88
    Mollie: 92
    Marcie: 88
    Dave: 47
    Karim: 99
    Willoughby: 42
    Benjamin: 0
  What statistics can you calculate to describe these data?
Try to think of four things to describe the data
  Stop
Statistics
  Statistics are derived from the data
  Statistics are descriptions of data
  Statistics are meant to simplify the data
  Statistics can be misleading
Typical Statistics
  Sample size: number of individuals measured = n
  Sum = S (the total of all values)
  Average or mean = S/n
  Median: the value at the 50th percentile; half of the values fall above, half below
  Maximum, minimum, range (max - min)
  Mode: most common value
  Standard deviation (SD)
  Variance (SD^2)
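One way to try these statistics on the test scores from the "Statistics and Data" slide is with Python's standard `statistics` module. This is only a sketch: the variable names are invented, and `stdev` and `variance` here are the sample versions (dividing by n - 1), which is an assumption about which definition the course intends.

```python
import statistics

# Test scores from the "Statistics and Data" slide.
scores = {"Jeff": 88, "Mollie": 92, "Marcie": 88, "Dave": 47,
          "Karim": 99, "Willoughby": 42, "Benjamin": 0}
values = list(scores.values())

n = len(values)                          # sample size
total = sum(values)                      # sum (S)
mean = total / n                         # average = S/n
median = statistics.median(values)       # middle value once sorted
mode = statistics.mode(values)           # most common value
value_range = max(values) - min(values)  # max - min
sd = statistics.stdev(values)            # sample standard deviation
variance = statistics.variance(values)   # sample variance = SD squared

print(f"n={n} sum={total} mean={mean:.1f} median={median} mode={mode}")
print(f"max={max(values)} min={min(values)} range={value_range}")
print(f"SD={sd:.1f} variance={variance:.1f}")
```

Benjamin's score of 0 pulls the mean (about 65.1) well below the median (88), which is one small example of how a single statistic can mislead.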
Analyze these data...
  Mean, max, min, range, median, mode
  1833447493829455
  sample size (n)
  mean = average = S/n, denoted x̄
  median = halfway
  mode = most common
Spreadsheets
  Spreadsheets are tables
  Spreadsheets allow calculations and manipulations of data
    Calculations: mean, standard deviation
    Manipulations: sort,
Make a data table:
  Fly 1, length 13.4 mm, velocity 27 km/h, age 21 days
  Fly 2, length 9.4 mm, velocity 0 km/h, age 220 days
  Fly 3, length 9.3 mm, velocity 44 km/h, age 1 day
  Fly 4, length 13.4 mm, velocity 17 km/h, age 32 days
  Fly 5, length 17.4 mm, velocity 33 km/h, age 11 days
  How many columns?
  How many rows?
  Do the numbers go down or across?
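One possible layout for the fly data, sketched in Python rather than in a spreadsheet: each fly is a row and each kind of measurement is a column, so the numbers for one measurement run down. The column names are invented for this sketch, and other layouts are possible.

```python
# Column names are hypothetical; units are kept in the names.
columns = ["fly", "length_mm", "velocity_kmh", "age_days"]

# One row per fly, values in the same order as the columns.
rows = [
    (1, 13.4, 27, 21),
    (2, 9.4, 0, 220),
    (3, 9.3, 44, 1),
    (4, 13.4, 17, 32),
    (5, 17.4, 33, 11),
]

print(len(columns), "columns and", len(rows), "rows")

# A typical manipulation: sort the rows by age, youngest first.
age_index = columns.index("age_days")
by_age = sorted(rows, key=lambda row: row[age_index])
for row in by_age:
    print(row)
```

Putting the units in the column headers keeps every cell a plain number, which is what makes calculations and sorting straightforward.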
Microsoft Excel
  Typical spreadsheet program
  Lotus 1-2-3 was an early commercial spreadsheet (VisiCalc came before it)
  Has similar controls to MS Word
  Now allows graphing (charts), but in very restricted formats; it is hard to get exactly what you want
  Excel tables and graphs can be copied into MS Word
Friday's Assignment
  We will work with Microsoft Excel to analyze some data
  Groups of two will submit one finished spreadsheet for the assignment
Graphs
  Many different types of graphs
    Points
    Lines
    Bars
    Pies
Point Graphs
  Called "X-Y Scatter" in MS Excel
  Plot points based on X and Y values
  Can fit a REGRESSION LINE to the data
    The line that best fits the data
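"Best fits" is usually taken to mean the least-squares line, the one that minimizes the squared vertical distances from the points to the line. The sketch below assumes that definition and applies it to the fly lengths and velocities from the data-table exercise; the function name `fit_line` is made up for this example.

```python
# Fly lengths (mm) as X and velocities as Y, from the data-table exercise.
lengths = [13.4, 9.4, 9.3, 13.4, 17.4]
velocities = [27, 0, 44, 17, 33]

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(lengths, velocities)
print(f"velocity = {slope:.2f} * length + {intercept:.2f}")
```

With only five flies the fitted line is very uncertain; the point of the sketch is the calculation, not a conclusion about flies.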
Bar Graphs
  Categorize data into counts or percents
  Categories can be descriptive (Windows 98, Windows 2000, ...)
  Categories can also be numeric: Height: 60-63, 63-66, etc., or just 61, 62, 63
  Count up the number of people in each group
  Histograms are a particular type of bar graph
Histogram
  X axis is categories
  Y axis is the number or proportion of observations in that category
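The counting step behind a histogram can be sketched in a few lines of Python. The heights below are invented purely for illustration, and the rule that a value on a boundary (such as 63) goes into the higher category is an assumption; the slides do not say how ties are handled.

```python
from collections import Counter

# Hypothetical heights in inches, invented for this example.
heights = [61, 62, 64, 65, 65, 66, 67, 68, 68, 70, 71]

first_bin = 60   # lower edge of the first category
bin_width = 3    # categories are 60-63, 63-66, 66-69, ...

def bin_label(height):
    """Return the category label for a height; lower edges are inclusive."""
    low = first_bin + ((height - first_bin) // bin_width) * bin_width
    return f"{low}-{low + bin_width}"

# Count the number of observations in each category.
counts = Counter(bin_label(h) for h in heights)

# Text version of the histogram: categories on one axis, counts on the other.
for label in sorted(counts):
    print(f"{label}: {'#' * counts[label]}")
```

Dividing each count by `len(heights)` would give proportions instead of counts for the Y axis.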
[Figure: Histogram Bar Graph; axis label: Number of Crashes]
Regular Bar Graph vs. Histogram Bar Graph
Distributions
  Special type of histogram with a continuous numeric scale at the bottom
  The normal distribution is a key concept in statistics
  A skewed distribution is one that is unbalanced
Sample distribution histograms
  Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt
  Robert D. Duval, PS 400 Lecture, www.polsci.wvu.edu/duval/ps400/Notes/400Notes.ppt
The NORMAL Distribution
  A NORMAL DISTRIBUTION is the theoretical distribution of values given natural variation around a MEAN
  It is a balanced, humped (bell-shaped) distribution
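The "balanced" shape can be checked numerically with `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later). The mean of 100 and standard deviation of 15 are arbitrary values chosen for this sketch.

```python
from statistics import NormalDist

# Hypothetical mean and standard deviation, chosen only for illustration.
dist = NormalDist(mu=100, sigma=15)

# cdf(x) is the share of values expected to fall at or below x.
below_mean = dist.cdf(100)
within_one_sd = dist.cdf(115) - dist.cdf(85)

print(f"share below the mean: {below_mean:.2f}")
print(f"share within one SD of the mean: {within_one_sd:.2f}")
```

Half of the values fall below the mean and half above, and roughly 68% fall within one standard deviation of it.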
Distributions
  Skew is an imbalance in the distribution
  Danyoungyoo, Katanchalee, and Srichawla, www.s-t.au.ac.th/handout/st2204/week5-Univariate-Des.ppt
Hypothesis Testing
  Statistical tests are how scientists decide whether data SUPPORT their hypothesis (they do NOT PROVE it)
  Four major statistical tests: t-test, χ² (chi-square) test, regression, ANOVA
Hypothesis
  Processor speed has an effect on the performance of the computer.
Null Hypothesis
  H0: Processor speed has NO EFFECT on the performance of a computer.
Statistical Tests and Probability
  Statistical tests give a value
  That value can be related to a probability (P)
  P is the probability of getting data at least as extreme as yours if the NULL hypothesis were true
  If P < 0.05 (1/20), you reject the NULL hypothesis
T-Test
  Compares differences between two means
  Formula: T = (x̄1 - x̄2) / SEM
  SEM is the Standard Error of the Mean [SD/√N, with SD the sample standard deviation]
  T value: the difference between the means in comparison to the amount of spread in your data
T-Values
  If T > 2.5 or 3.0, the difference is usually significant (this depends on your sample sizes)
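To make the T calculation concrete, here is a minimal sketch using one common form of the test (Welch's t), in which the standard error of the difference combines the spread of both groups. This is an assumption about which variant is meant; the slides do not specify one. The benchmark scores are invented and do not come from any real measurement.

```python
import math
import statistics

# Hypothetical benchmark scores for two groups of computers (invented numbers).
slow_cpu = [52, 48, 55, 50, 47, 53]
fast_cpu = [61, 58, 64, 57, 60, 63]

def t_statistic(a, b):
    """Welch's t: difference in means divided by the standard error of that difference."""
    # Each term is one group's squared standard error: sample variance / group size.
    se = math.sqrt(statistics.variance(a) / len(a) +
                   statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

t = t_statistic(fast_cpu, slow_cpu)
print(f"T = {t:.2f}")
```

For these made-up numbers T comes out above the 2.5 to 3.0 rule of thumb, so the difference between the two means is large compared with the spread within each group. Turning T into a P value additionally requires the degrees of freedom, which this sketch leaves out.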