
Post on 03-Jan-2016

Chapter 1 Statistical Thinking

• What is statistics?

• Why do we study statistics?

Statistical Thinking

• the science of collecting, organizing, and analyzing data

• the mathematics of the collection, organization and interpretation of numerical data

• The branch of mathematics which is the study of the methods of collecting and analyzing data

• a branch of applied mathematics concerned with the collection and interpretation of quantitative data and the use of probability theory to estimate population parameters

Statistical Thinking

Statistics is a discipline concerned with:

– designing experiments and other data collection,

– summarizing information to aid understanding,

– drawing conclusions from data, and

– estimating the present or predicting the future.

Statistical Thinking

• "I like to think of statistics as the science of learning from data...." Jon Kettenring, ASA President, 1997

• Steps of statistical analysis involve:

– collecting information (Data Collection)

– evaluating the information (Data Analysis)

– drawing conclusions (Statistical Inference)

Statistical Thinking

• What type of information?

– A test group's favorite amount of sweetness in a blend of fruit juices

– The number of men and women hired by a city government

– The velocity of a burning gas on the sun's surface

– Clinical trials to investigate the effectiveness of new treatments

– Field experiments to evaluate irrigation methods

– Measurements of water quality

Statistical Thinking

Problems

• Is a new treatment for heart disease more effective than a standard one?

• Is using a high octane gas beneficial to car performance?

• Does reading an article in statistics improve students’ statistics grade?

Statistical Thinking

• Is a new treatment for heart disease more effective than a standard one?

– Pick, say, 100 heart patients

– Divide them into two groups, 50 in each group

– Group 1------------New treatment

– Group 2------------Standard treatment

Statistical Thinking

Results

• 40 out of 50 of Group 1 patients improved

• 30 out of 50 of Group 2 patients improved

• Conclusion: New treatment is more effective!

Statistical Thinking

• How do you divide the patients?

• Have you controlled for other factors (fitness level, lifestyle, age, etc.)?

• How do you decide who gets which treatment? Are there ethical issues?

Statistical Thinking

Comparing Test Scores

• Select 10 students and give them a journal article in statistics.

• Test their knowledge about the article and record their scores

• Repeat the test after they take STT 231.

Statistical Thinking

Result

• 8 out of the 10 students improved their scores.

• Question: Can we conclude that reading the article has improved students’ knowledge about statistics?

Statistical Thinking

Look at worst-case scenarios:

“Under the assumption that the new treatment is no better than the standard one, what is the chance that 80% of the patients benefit from this treatment?”

“Under the assumption that STT 231 brings no benefit, how likely is it that we see 80% of the students improve their scores?”
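One way to put a number on the first question is sketched below. It assumes a specific model that the slides do not spell out: the 70 patients who improved (40 + 30) would have improved under either treatment, and the 100 patients were split into two groups of 50 purely at random. The function name and its parameters are hypothetical, chosen only for this illustration.

```python
from math import comb

def prob_group1_at_least(k_min, improved_total=70, group_size=50, total=100):
    """Chance that a randomly chosen group of `group_size` patients contains
    at least `k_min` of the `improved_total` patients who improved.

    Assumed model: the treatment has no effect, so which patients improve is
    fixed in advance and only the random split into groups varies.
    """
    not_improved = total - improved_total
    ways_total = comb(total, group_size)
    k_max = min(group_size, improved_total)
    favourable = sum(
        comb(improved_total, k) * comb(not_improved, group_size - k)
        for k in range(k_min, k_max + 1)
    )
    return favourable / ways_total

# Chance of seeing 40 or more improved patients (80%) in Group 1 by luck alone
p = prob_group1_at_least(40)
print(f"P(at least 40 of 50 improve in Group 1 | no treatment effect) = {p:.4f}")
```

A small value here would mean that a 40-versus-30 split is hard to explain by the random division of patients alone, under the stated assumptions.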

Statistical Thinking

Need a model to answer these questions!!

If STT 231 is not beneficial, then students’ scores may go up or down with 50% chance.

This is equivalent to flipping a coin:

• 50% chance you get Head

• 50% chance you get Tail

Statistical Thinking

• Comparing pre- and post-test scores for 10 students is equivalent to

– flipping a coin 10 times and calculating the chance of observing 8H

• Relevant Questions:

– Will the chance of observing H 80% of the time depend on the number of students involved in the experiment?

– Will this chance go up, down or remain the same if you repeat the experiment with 200 students?
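The two questions above can be explored numerically. The sketch below assumes the fair-coin model from the slides (each score goes up with probability 1/2, independently of the others); the function names are hypothetical helpers, not part of the course material.

```python
from math import comb

def prob_exact_heads(n, k):
    """Probability of exactly k heads in n fair coin flips."""
    return comb(n, k) / 2 ** n

def prob_at_least_heads(n, k):
    """Probability of k or more heads in n fair coin flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n

for n in (10, 200):
    k = n * 8 // 10  # 80% of the flips: 8 of 10, 160 of 200
    print(
        f"n={n:3d}: P(exactly {k} H) = {prob_exact_heads(n, k):.3g}, "
        f"P(at least {k} H) = {prob_at_least_heads(n, k):.3g}"
    )
```

Under this model the chance of seeing 80% heads shrinks sharply as the number of flips grows, so the same 80% improvement rate among 200 students would be much harder to attribute to luck than among 10.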

Statistical Thinking

• Suppose the chance of observing 8 improvements in 10 trials is 4.4%. What does this mean?

– If STT 231 is not beneficial, then there is a 4.4% chance that we will observe 8 out of 10 students’ scores improve.

– It is unlikely that 8 students’ scores would improve just by CHANCE.
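The 4.4% figure is consistent with the coin-flip model. As a worked check, assuming X counts heads in 10 independent fair flips (the symbol X is introduced here only for illustration):

```latex
P(X = 8) = \binom{10}{8}\left(\tfrac{1}{2}\right)^{10}
         = \frac{45}{1024} \approx 0.044
```

If "8 or more" improvements is counted instead, the same model gives (45 + 10 + 1)/1024 = 56/1024, or about 5.5%.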

Statistical Thinking

• Suppose the chance of observing 8 improvements in 10 trials is 4.4%.

• We observed 8 students’ scores out of 10 improve.

• What does this mean?

Statistical Thinking

• Course is highly effective

• Course is ineffective and we observed an unlikely event.

• We do not know which one!

Statistical Thinking

• Suppose there is only a “small” chance that an observed event would happen by CHANCE alone.

• Then this is strong evidence that the change we observed did not happen by CHANCE.

• Hence there is strong evidence that some factor is responsible for this change.

Statistical Thinking

• The course is highly effective!!

• Reasoning: What we observed would be very unlikely if the course were ineffective. Hence we conclude that the course is effective.

• An improvement in 80% of the scores is unlikely to be achieved if the course were ineffective.

Statistical Thinking

Some Remarks

For questions that involve uncertainty:

– Carefully formulate the question you want to answer (Modeling)

– Collect Data

– Summarize, analyze and present data

– Draw Conclusions. Conclusions always include uncertainty

– Support your conclusions by quantifying how confident you are about your conclusions.

Chapter 2 A Design Example

The Polio Vaccine Case

• Caused by a virus

• Especially deadly in children

• Big problem during the first half of the 20th Century

• Develop vaccine to fight the disease

• Jonas Salk (~1950)

A Design Example

• Problem with vaccines: – Are they safe? – Are they effective?

• Undertake a large scale trial to answer these questions

A Design Example

• Case 1: A Simple Study

– Distribute the vaccine widely (under the assumption it is safe)

– A decrease in the number of polio cases after the vaccine provides evidence that the vaccine is effective

• Problem?????

A Design Example

Problems

• Lack of control group

– Is the decrease in the number of polio cases due to the vaccine or to other factors?

• How reliable is the assumption “vaccine is safe”?

A Design Example

• Case 2: Adding a Control Group

– Have two groups

• Control group: gets salt solution

• Treatment group: gets the actual vaccine

A Design Example

• Example (Observed Control Study)

– Control Group---all 1st and 3rd grade children

– Treatment group---all 2nd graders

• Assumption:

– Age difference between control and treatment group was felt to be unimportant

A Design Example

• Potential Problems:

– Parents of 2nd graders may not agree to vaccinating their kids

– Parents of sicker kids are most likely to accept the vaccine

– More educated parents tend to accept the vaccine

– Parents of sick 1st and 3rd graders may object that their kids are not getting treatment

A Design Example

• Difficulty in diagnosing polio

– Extreme cases of polio are easy to diagnose

– Less severe cases of polio have symptoms similar to other common illnesses

A Design Example

• Potential Problems

– Physicians are aware of who has received the vaccine and who has not

– A less severe case of polio in a 2nd grader (who has received the vaccine) may be wrongly diagnosed as another illness

– A less severe case in a 1st or 3rd grader will most likely be diagnosed as polio

A Design Example

• Case 3: Randomization, Placebo Control, Double Blindness

– Random assignment of control and treatment groups

• Select a child

• Flip a coin: H = Treatment Group, T = Control Group
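The coin-flip assignment can be sketched in a few lines. This is a minimal illustration rather than the procedure used in the actual trial: the function name, the `seed` parameter, and the child identifiers are all hypothetical.

```python
import random

def assign_groups(children, seed=None):
    """Assign each child to treatment or control by an independent fair coin flip.

    `seed` is only there to make the illustration reproducible.
    """
    rng = random.Random(seed)
    treatment, control = [], []
    for child in children:
        if rng.random() < 0.5:  # "heads": treatment group
            treatment.append(child)
        else:                   # "tails": control group
            control.append(child)
    return treatment, control

# Hypothetical identifiers standing in for enrolled children
children = [f"child_{i}" for i in range(1, 21)]
treatment, control = assign_groups(children, seed=0)
print("Treatment:", treatment)
print("Control:  ", control)
```

Because every child is assigned by chance, neither parents nor physicians influence who ends up in which group. Note that independent coin flips do not guarantee equal group sizes.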

A Design Example

• Placebo Control

– Kids in the control group receive salt solution

• Double Blind

– Neither the child, nor the parents, nor the doctors/nurses who make the diagnosis of polio know whether a kid receives the vaccine or the placebo

A Design Example

Summary

• In designing experiments

– Introduce some sort of control group

– Use randomization to avoid bias in selection and assignment of subjects for the study

– Double blind experiments give protection against biases, both intentional and unintentional

A Design Example

• Perform the experiment on a large number of subjects (in the polio case, on the order of millions of kids)

• Repeat the experiment several times before making definitive conclusions

A Design Example

Basic Principles of Experimental Designs

• Randomization

• Blocking (Treatment/Control Groups)

• Replication
