Math Stat Course: Making Incremental Changes
Mary Parker
University of Texas at Austin
Intro to Mathematical Statistics (M378K at University of Texas)
Prerequisite: Probability course (which is required for all math majors)
Students: Math majors, actuarial students, other science majors
Previous statistics courses? Some took an applied stat course, either a freshman course or one taken after the probability course. Some didn’t.
Math Stat Topics
Sampling distributions of statistics
Estimation of parameters: confidence intervals, method of moments estimation, maximum likelihood estimation, comparison of estimators using mean square error and efficiency, sufficient statistics
Hypothesis tests: p-values, power, likelihood ratio tests
Distributions used include normal, binomial, Poisson, uniform, gamma, beta, t, F, chi-squared, and other standard distributions.
Other topics as time permits.
Some students took this course:
M358K Applied Statistics description
New course in the last five years
Prerequisite: Probability course.
Taken by math majors with concentration in secondary school teaching, statistics, and some others.
If they take both M358K and M378K, they are encouraged to take M358K first.
Introduction of this course has not decreased enrollment in M378K and some students who take this new course and didn’t plan to take more statistics do go on to M378K.
Questions
MAIN: How can a teacher who doesn't have the time/inclination to completely revamp her course make incremental changes that will better prepare students to understand and use contemporary statistics techniques?
Preliminary:
What aspects of the reform of the first course are also appropriate for the math stat course?
What should we preserve in the current math stat course so that it continues to give mathematically sophisticated students a strong foundation in statistics?
What additional tools and techniques of theoretical statistics should be introduced at this level?
Within twenty years, when all students will be using the equivalent of a Mathematica-level program, what can/should we be teaching in theoretical statistics courses?
Incrementally changing Math Stat
Focus on assumptions throughout.
Check assumptions.
Mention alternative techniques if assumptions not met.
Discuss robustness of methods.
Briefly introduce nonparametric statistics and Bayesian inference to illustrate different assumptions / framework.
Have students do explorations.
What explorations?
Main idea: Simulate and explore sampling distributions of various statistics. Use to illustrate theoretical ideas and to check on robustness of procedures.
Preliminary idea 1: Create a complete sampling distribution themselves and check its properties to see that they agree with the theoretical results.
Preliminary idea 2: Think of some interesting estimators to investigate. (See that there are more possible estimators for a parameter than the sample mean.)
Why explorations?
Explorations help make the theory concrete
Robustness of statistical techniques: The concept seems strange to math students and they appreciate tools to explore it on their own.
Simulate and explore a sampling distribution
1. The population is the numbers of potatoes in a 5-lb sack of potatoes from a certain company. Assume the counts are distributed as discrete uniform, from 12 potatoes to 18 potatoes. Choose a reasonable sampling method and construct the sampling distribution of the sample mean for samples of size 2.
2. Find the mean and variance of the population and then find the mean and variance of the sampling distribution.
3. Comment on the results, based on your theoretical understanding from the formulas we proved about the mean and variance of a sample mean.
4. Discuss what would be different for samples of size 9.
5. Investigate the sampling distribution of the sample range.
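As one possible way to check the hand-built answer to steps 1 and 2, the sampling distribution can be enumerated exactly. This is only a sketch: it assumes sampling with replacement and treats ordered pairs as distinct outcomes, which are choices the assignment deliberately leaves to the students. The names `mean_var` and `sampling_dist` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Population: discrete uniform on 12..18 potatoes per sack (from the assignment).
values = range(12, 19)
population = {Fraction(v): Fraction(1, 7) for v in values}

def mean_var(dist):
    """Mean and variance of a distribution given as {value: probability}."""
    mu = sum(v * p for v, p in dist.items())
    var = sum((v - mu) ** 2 * p for v, p in dist.items())
    return mu, var

# Assumption: sampling with replacement, ordered pairs, so (13, 14) and
# (14, 13) are distinct, equally likely outcomes (49 in total).
counts = Counter(Fraction(a + b, 2) for a, b in product(values, repeat=2))
sampling_dist = {xbar: Fraction(c, 49) for xbar, c in counts.items()}

pop_mean, pop_var = mean_var(population)
xbar_mean, xbar_var = mean_var(sampling_dist)

print("population:  mean", pop_mean, "variance", pop_var)
print("sample mean: mean", xbar_mean, "variance", xbar_var)
for xbar in sorted(sampling_dist):
    print(float(xbar), sampling_dist[xbar])
```

Exact fractions are used so that the comparison with the formula for the variance of a sample mean (population variance divided by n) is not blurred by rounding.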
Strategy
Given very early in the semester.
Student groups of 2-3.
Grading and instructions encourage students to think about it over a couple of weeks without spending much time on it at first, BECAUSE
This assignment is not as well-defined as it looks for many students.
Difficulties often encountered
Should (13,14) be a different element of the sample space from (14,13)?
Should I sample with replacement or without replacement? Why?
When computing the standard deviations here, is the denominator n or n-1?
Extensions
Sampling without replacement: what changes? What does that tell us about the language/formulas of our text? (independence of samples)
Where could we find the equivalent formulas to those in our text for sampling without replacement? What’s different?
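A sketch of what changes without replacement, under the assumption that the seven values 12..18 are treated as a finite population of size N = 7 with one unit per value. The standard finite population correction, (N - n)/(N - 1), is the piece the independent-sample formulas omit; the enumeration below checks it for n = 2.

```python
from fractions import Fraction
from itertools import permutations

# Assumption: a finite population with exactly one unit per value, N = 7.
values = list(range(12, 19))
N, n = len(values), 2

pop_mean = Fraction(sum(values), N)
pop_var = sum((v - pop_mean) ** 2 for v in values) / N  # divisor N, not N - 1

# Ordered samples without replacement: 7 * 6 = 42 equally likely pairs.
means = [Fraction(a + b, n) for a, b in permutations(values, n)]
xbar_mean = sum(means) / len(means)
xbar_var = sum((m - xbar_mean) ** 2 for m in means) / len(means)

fpc = Fraction(N - n, N - 1)  # finite population correction
print("variance of sample mean, enumerated:", xbar_var)
print("sigma^2 / n times the correction:   ", pop_var / n * fpc)
print("sigma^2 / n (independent draws):    ", pop_var / n)
```

The draws are no longer independent, so the variance of the sample mean comes out smaller than the population variance divided by n.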
Constructing various estimators
“German Tank Problem”
Assume German tanks had consecutive ID numbers from 001 to ???.
Need to estimate the number of German tanks in the population (the max ID in the population), based on the IDs from the sample of tanks we have captured.
In groups, think of at least three different reasonable estimators. Then draw a sample of size 5 from my “population of German tank IDs” in the envelope. Give your three estimates.
Use a computer to simulate the three sampling distributions
Strategy
Done in class before beginning to talk about estimation.
Usually students will use (1) two times the mean, (2) the maximum, and then, after a bit of time, will come up with something else.
Students will need help simulating the sampling distributions. Again, arrange the timing/grading to encourage them to think about it and discuss it before spending a lot of time doing it.
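A minimal sketch of such a simulation, using only the Python standard library. Several things here are assumptions rather than part of the exercise: the IDs are taken to run from 1 to 600, tanks are captured without replacement, 10,000 samples are drawn, and the third estimator (`adjusted_max`) is just one candidate for the "something else" that groups might propose.

```python
import random
import statistics

random.seed(378)          # arbitrary seed, for reproducibility only

N_TRUE = 600              # assumed true maximum ID
SAMPLE_SIZE = 5           # tanks captured per sample (from the exercise)
REPLICATIONS = 10_000     # number of simulated samples (an assumption)

def twice_mean(ids):
    return 2 * statistics.mean(ids)

def sample_max(ids):
    return max(ids)

def adjusted_max(ids):
    # Hypothetical third estimator: scale the maximum by (n + 1) / n, minus 1.
    n = len(ids)
    return max(ids) * (n + 1) / n - 1

estimators = {
    "twice the mean": twice_mean,
    "maximum": sample_max,
    "adjusted max": adjusted_max,
}

population = range(1, N_TRUE + 1)
results = {name: [] for name in estimators}
for _ in range(REPLICATIONS):
    ids = random.sample(population, SAMPLE_SIZE)   # without replacement
    for name, estimator in estimators.items():
        results[name].append(estimator(ids))

# Summarize each simulated sampling distribution by its mean and its
# mean square error around the (assumed) true value.
summary = {}
for name, estimates in results.items():
    avg = statistics.mean(estimates)
    mse = statistics.mean([(e - N_TRUE) ** 2 for e in estimates])
    summary[name] = (avg, mse)
    print(f"{name:15s} mean = {avg:7.1f}   MSE = {mse:9.1f}")
```

Note the two different counts: `SAMPLE_SIZE` is the size of each sample, while `REPLICATIONS` is the number of points obtained from each sampling distribution.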
Difficulties in simulating sampling distributions
How do you describe the original population to the computer? (Discrete uniform on 1 to 600, maybe)
Is it fairly easy to obtain a random sample from that distribution in your software? (If not, find other software!)
Distinguish between the sample size and the number of points from the sampling distribution.
What should you do with the sampling distribution?
Looking at sampling distributions
What should you look at to summarize a sampling dist’n? (histogram, summary statistics)
Is it close to normally distributed? (Discuss normal scores plots.)
(More advanced) Is it close to a __ dist’n? (Make available information about probability plots in more generality.)
If the statistic is unbiased, what characteristic will the sampling dist’n have? (If yours doesn’t have the mean exactly what it’s supposed to, is that because you made an error? Why or why not?)
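One way to make the last question concrete is to compare the simulated mean with the theoretical one on the scale of its Monte Carlo standard error. The sketch below is an illustration under assumptions: it reuses the discrete uniform 12..18 population with samples of size 9, and the number of replications and the seed are arbitrary.

```python
import random
import statistics

random.seed(2004)         # arbitrary seed, for reproducibility only

SAMPLE_SIZE = 9
REPLICATIONS = 5_000      # number of simulated sample means (an assumption)
TRUE_MEAN = 15            # mean of the discrete uniform on 12..18

# Each simulated point is the mean of one sample of size 9.
xbars = [
    sum(random.randint(12, 18) for _ in range(SAMPLE_SIZE)) / SAMPLE_SIZE
    for _ in range(REPLICATIONS)
]

sim_mean = statistics.mean(xbars)
sim_sd = statistics.stdev(xbars)

# Standard error of the simulated mean: it shrinks as REPLICATIONS grows,
# but it is never exactly zero.
mc_se = sim_sd / REPLICATIONS ** 0.5
z = (sim_mean - TRUE_MEAN) / mc_se

print(f"simulated mean     = {sim_mean:.4f}")
print(f"simulated sd       = {sim_sd:.4f}")
print(f"Monte Carlo s.e.   = {mc_se:.4f}")
print(f"distance from 15 in standard errors = {z:.2f}")
```

A simulated mean that misses 15 by a standard error or two is what an unbiased statistic is expected to produce; a miss of many standard errors suggests either bias or a coding error.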
Focus on Assumptions
Checking assumptions for typical normal-theory techniques
Already discussed normal probability plots
Discuss what types of deviations from assumptions cause problems for a particular technique and why
In two-sample t procedures, help them see exactly why the equal variance assumption is more popular among theorists than among those working in applications.
Robustness
Central Limit Theorem. Explorations of various types of distributions – how large must n be?
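A sketch of one such exploration, assuming a strongly skewed population (exponential with rate 1) and using the skewness of the simulated sample means as a rough measure of how far the sampling distribution is from normal. The sample sizes, number of replications, and seed are arbitrary choices.

```python
import random
import statistics

random.seed(1)            # arbitrary seed, for reproducibility only
REPLICATIONS = 4_000      # simulated sample means per sample size

def skewness(data):
    """Average cubed standardized deviation; near 0 for symmetric data."""
    m = statistics.mean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

def simulated_means(n):
    """Means of REPLICATIONS samples of size n from an exponential(1)."""
    return [
        sum(random.expovariate(1.0) for _ in range(n)) / n
        for _ in range(REPLICATIONS)
    ]

skew = {n: skewness(simulated_means(n)) for n in (2, 10, 30, 100)}
for n, g in skew.items():
    print(f"n = {n:3d}   skewness of sample means = {g:.2f}")
```

Swapping `random.expovariate` for a draw from a different population (uniform, bimodal, heavy-tailed) lets students see that the answer to "how large must n be?" depends on the shape of the population.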
Focus on Assumptions II
Nonparametric techniques
Sign test, signed rank test, and rank-sum test
Compare results with those from t-test for some examples to further illustrate conditions for robustness of t-tests
Bayesian statistics
Very brief introduction, contrasting assumptions of frequentist and Bayesian approaches
Do examples from binomial or normal with conjugate priors and indicate that choosing the prior mean and variance gives quite a lot of flexibility
Mention that using more general, non-conjugate priors leads to the need for more computationally-intensive methods
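For the binomial case, a conjugate-prior example might look like the sketch below. It assumes a Beta prior specified through its mean and variance; the data (7 successes in 10 trials) and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def beta_from_mean_var(mean, var):
    """Beta(a, b) parameters matching a given prior mean and variance.

    Requires 0 < mean < 1 and var < mean * (1 - mean).
    """
    total = mean * (1 - mean) / var - 1      # this is a + b
    return mean * total, (1 - mean) * total

def update(a, b, successes, trials):
    """Posterior Beta parameters after binomial data (conjugate update)."""
    return a + successes, b + trials - successes

# Prior mean 1/2 and variance 1/20 correspond to Beta(2, 2).
a, b = beta_from_mean_var(Fraction(1, 2), Fraction(1, 20))

# Hypothetical data: 7 successes in 10 trials.
post_a, post_b = update(a, b, successes=7, trials=10)
post_mean = post_a / (post_a + post_b)

print("prior:     Beta(%s, %s)" % (a, b))
print("posterior: Beta(%s, %s), mean %s" % (post_a, post_b, post_mean))
```

Rerunning with a smaller prior variance shows the posterior mean staying closer to the prior mean, which is one way to illustrate the flexibility that the choice of prior gives.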
Actual assignments
Construct a sampling distribution
German tank problem
Simulating sampling distributions in MINITAB
Find the actual assignments and supporting material at the website listed on the handout for this session
http://www.ma.utexas.edu/users/parker/jsm04/