Stats 250 Lab 2, Julie Ghekas, September 15, 2014
Schedule
• Recap from first lab and prelab
• Warm Up
• Lab 2
• Cool Down
• iClicker Questions
Lab Workbook for Fall 2014
• open.umich.edu/
– Search Stats 250
– Select Materials Tab for the course
– Scroll down to access handouts, labs, lecture notes, etc.
• open.umich.edu/education/lsa/statistics250/fall2015/materials#Labs
• Or order from Amazon (~$10)
• Recommend taking notes in updated personal workbook
Prelab Results
Right skewed: Mean > Median
Left skewed: Median > Mean
Symmetric: Mean = Median
Remember, the mean is more sensitive to outliers than the median.
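The mean/median comparison above can be tried on a small data set. This is a minimal Python sketch rather than one of the course's R scripts; the `hours` values and the helper name `skew_hint` are made up for illustration, and the comparison is a rule of thumb, not a formal test.

```python
from statistics import mean, median

def skew_hint(values):
    """Guess the skew direction from the mean/median comparison (rule of thumb)."""
    m, med = mean(values), median(values)
    if m > med:
        return "right skewed"
    if m < med:
        return "left skewed"
    return "symmetric"

# Hypothetical weekly study hours; 60 is an unusually large value.
hours = [8, 10, 12, 14, 15, 18, 60]
without_outlier = [h for h in hours if h != 60]

print(skew_hint(hours))
# How far each measure moves when the large value is dropped:
print("mean shift:  ", mean(hours) - mean(without_outlier))
print("median shift:", median(hours) - median(without_outlier))
```

Dropping the single large value moves the mean by several hours but the median by only one, which is what "more sensitive to outliers" means here.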
Recap for Homework
• Practice homework is graded to the
same standard as regular homework
• Answer the question fully
• Put your name on generated graphs
• Show all work
• Include units
Boxplots
• Graph of 5-number summary
• Outliers denoted with ° and *
• Can be side-by-side
• Does not show shape of distribution
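To make the 5-number summary and the two outlier symbols concrete, here is a rough Python sketch. It assumes the common boxplot convention (not stated on the slide) that ° marks points more than 1.5 IQRs beyond the box and * marks points more than 3 IQRs beyond it. Quartile rules differ between packages, so the quartiles may not match other software exactly. The `salaries` values and function names are hypothetical.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using Python's 'inclusive' quartile rule."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return data[0], q1, med, q3, data[-1]

def classify_outliers(values):
    """Split points into mild (assumed °) and extreme (assumed *) outliers."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    mild, extreme = [], []
    for x in values:
        distance = max(q1 - x, x - q3)  # how far x lies outside the box
        if distance > 3 * iqr:
            extreme.append(x)
        elif distance > 1.5 * iqr:
            mild.append(x)
    return mild, extreme

# Hypothetical salaries in thousands of dollars.
salaries = [20, 22, 24, 25, 26, 28, 30, 45, 90]
print(five_number_summary(salaries))
print(classify_outliers(salaries))
```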
Bar charts
• Displays categorical variables
• Y-axis represents counts, proportions,
or percentages
• Can rearrange bars in any order
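The three choices for the Y-axis are the same information on different scales. A small Python sketch, with a made-up list of job categories and a hypothetical helper `bar_heights`, shows how the bar heights would be computed:

```python
from collections import Counter

def bar_heights(categories):
    """Count each category and express it as a count, proportion, and percentage."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {
        category: {
            "count": k,
            "proportion": k / total,
            "percent": 100 * k / total,
        }
        for category, k in counts.items()
    }

# Hypothetical job categories for eight employees.
jobs = ["clerical", "clerical", "clerical", "custodial",
        "manager", "clerical", "manager", "clerical"]

for category, heights in bar_heights(jobs).items():
    print(category, heights)
```

Because the categories have no inherent order, sorting the dictionary by count or by name changes the look of the chart but not its content.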
Time Series/Sequence Plots
• Examining data over time
• Checks assumption that observations
are from an identically distributed
population
• Be careful; time series can be
displayed in different formats
Time Series
Source: http://www.statcan.gc.ca/edu/power-pouvoir/ch9/bargraph-diagrammeabarres/5214818-eng.htm
Source: http://blogs.bgsu.edu/statgraphicsmepaler/2013/03/22/new-havens-temperature-in-4-different-time-series-plots/
Time Series: Trends
• A trend is a consistent, long-term rise
or fall.
Time Series: Variation
• Generally, variation is used to
describe patterns in the data.
[Figures: Seasonal Variation; Increasing Variation]
Time Series: Stability
• If there are no patterns in the time plot, then
it is considered stable.
• Stability helps us confirm or reject the
identically distributed part of an iid/random
sample. For data to be considered stable,
both the mean and the variance of the
observations need to be constant over time.
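One rough way to see what "constant mean and variance over time" means is to compare the first and second halves of a series. The Python sketch below is only an illustration: the three series are invented, `half_summaries` is a hypothetical helper, and in the lab stability is judged by looking at the time plot, not by this calculation.

```python
from statistics import mean, pvariance

def half_summaries(series):
    """Return (mean, variance) for the first half and for the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))

# Hypothetical series, listed in time order.
stable   = [10, 12, 9, 11, 10, 12, 9, 11]   # same level, same spread
trending = [1, 2, 3, 4, 5, 6, 7, 8]         # mean rises over time
widening = [10, 11, 9, 10, 14, 6, 15, 5]    # spread grows over time

for name, series in [("stable", stable), ("trending", trending), ("widening", widening)]:
    print(name, half_summaries(series))
```

The trending series fails on the mean and the widening series fails on the variance, so neither would be called stable.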
Q-Q Plots
• Checks assumption that observations are
from a normally distributed parent population
• Q stands for quantiles (percentiles): the graph
compares quantiles from the standard normal
distribution with quantiles from our sample
• Want the points to fall close to a straight line
• Better than a histogram for checking normality
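The pairing of quantiles behind a Q-Q plot can be sketched in a few lines of Python. This is not the course's `qqplot()` R script. It assumes the plotting positions (i - 0.5)/n, which is one of several conventions, and it uses the correlation of the plotted points as a rough stand-in for "how straight is the line". The function names and the sample are hypothetical.

```python
from statistics import NormalDist, mean

def qq_points(sample):
    """Pair each sorted observation with a standard normal quantile."""
    data = sorted(sample)
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, data))

def straightness(points):
    """Pearson correlation of the Q-Q points; values near 1 suggest a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical, strongly right-skewed sample: each value doubles the last.
skewed = [2 ** i for i in range(20)]
print(round(straightness(qq_points(skewed)), 2))
```

A sample from a normal population gives points close to a line (correlation near 1), while the skewed sample above bends away from it.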
Q-Q Plot of Data from an Approximately Normal Distribution
Q-Q Plots that do NOT allow us to assume a Normally Distributed Population
R scripts
• Canvas homepage -> R tutorials
• Open timeseries.rdata or qqplot.rdata
– Canvas homepage -> R tutorials -> Time
Series/QQ plots
• Start script with timeseries() or qqplot()
– Without the underscore printed in the lab
workbook
Warm Up
Lab
• With a partner or two, work on the Lab
• You will not get credit if you work alone
• Work with employee data.sav
• If you finish early, complete the Cool
Down, R practice, Example Exam
Question, or Practice HW problem 3
IQR=$13,162.50
IQR=$7,125.00
IQR=$16,200.00
Cool Down
• Everyone turns in own ticket
• Work on Cool Down in groups
iClicker
Survey: Students were asked how many hours they study in
a typical week. A five-number summary of the responses is:
2, 10, 14, 20, 60
Fill in the blank: About 75% of the students spent at least
___ hours studying in a typical week.
A. 10
B. 14
C. 20
D. 45
iClicker
Survey: Students were asked how many hours they study in
a typical week. A five-number summary of the responses is:
2, 10, 14, 20, 60
What percent of students reported studying between 10
and 20 hours in a typical week?
A. 68%
B. 50%
C. 25%
D. 75%
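The reasoning for both questions on the five-number summary 2, 10, 14, 20, 60 rests on the fact that the quartiles split the data into four groups of about 25% each. A short Python sketch of that arithmetic (the names `summary`, `share_at_least`, and `share_between` are made up for illustration):

```python
# Five-number summary from the survey question.
summary = {"min": 2, "Q1": 10, "median": 14, "Q3": 20, "max": 60}

# Approximate share of responses at or below each summary value.
share_at_or_below = {"min": 0.0, "Q1": 0.25, "median": 0.50, "Q3": 0.75, "max": 1.0}

def share_at_least(label):
    """Approximate share of responses at or above the given summary value."""
    return 1 - share_at_or_below[label]

def share_between(low, high):
    """Approximate share of responses between two summary values."""
    return share_at_or_below[high] - share_at_or_below[low]

print(summary["Q1"], share_at_least("Q1"))                        # at least Q1 hours
print(summary["Q1"], summary["Q3"], share_between("Q1", "Q3"))    # between Q1 and Q3
```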
iClicker
Which of the following provides the most
information about the shape of a data set?
A. Boxplot
B. Pie chart
C. Five number summary
D. Histogram
iClicker
Here is a graph showing revenue for a
company. What kind of graph is this?
• A. Bar chart
• B. Histogram
• C. Time plot
• D. Box plot
[Figure: Actual Revenue for Eastman Kodak; y-axis: Revenues ($ billions), 0 to 25]
iClicker
Cost of a gallon of gas
Can we describe this graph as…
A. Right Skewed
B. Left Skewed
C. Neither
iClicker
In a boxplot, what does a dot represent?
A. Quartile
B. Mean
C. Median
D. Mode
E. Outlier
iClicker
Which time plot is the most stable?
A.
B.
C.
iClicker
How do you feel about the material covered today?
A. Completely understood everything
B. Understood main ideas, shaky on details
C. Good for the first half, lost for the second
D. Had trouble with some main ideas
E. Difficulty following most materials
Reminders
• Good job setting up LectureBook
• Practice Homework due Thursday 8 am
• Pre-lab 3 due Monday 8 am
• Office Hours
• Food Allergies?