Chapter 3, Part 1: Design of Experiments

INTRODUCTION TO STATISTICS & PROBABILITY

Chapter 3: Producing Data

(Part 1)

Dr. Nahid Sultana

Chapter 3: Producing Data

Introduction

3.1 Design of Experiments

3.2 Sampling Design

3.3 Toward Statistical Inference

3.4 Ethics

Introduction

Available Data

Anecdotal Data

Sample Surveys and Experiments

Observation vs. Experiment

Obtaining Data

Available data are data that were produced in the past for some other purpose but that may help answer a present question inexpensively.

The library and the Internet are sources of available data.

Beware of drawing conclusions from our own experience, from hearsay, or from anecdotal evidence, that is, data from anecdotes (short, interesting stories about a real incident or person). Such cases may not be representative of any larger group of cases.

Some research questions require data produced specifically to answer them. This leads to designing observational or experimental studies.

Observational vs. Experimental Studies

Observational study: A study in which the researcher observes individuals and records information about variables of interest. No treatment is imposed. The purpose is to describe some group or situation. Sample surveys are observational studies.

Experimental study: A study in which the researcher intentionally imposes treatments on individuals and measures their responses to the treatments. Experiments allow researchers to establish a “cause and effect” relationship. The purpose is to study whether the treatment causes a change in the response.

Observational vs. Experimental Studies (Cont…)

Lurking variable: A variable that affects the relationship between the response variable and the explanatory variable but is not included among the variables studied.

Confounding: Two variables (explanatory variables or lurking variables) are confounded when their effects on a response variable cannot be distinguished.

Example: Does studying CAUSE a good grade on a test, or is the effect of studying confounded with intelligence?

Well-designed experiments take steps to defeat confounding.
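The studying/intelligence example can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, the variable names are hypothetical, and the simulation deliberately gives studying no direct effect on the grade, so any association between the two comes from the lurking variable (intelligence).

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
n = 5000

# Assumption: intelligence drives BOTH studying and grades (simulated data).
intelligence = [rng.gauss(0, 1) for _ in range(n)]
studying = [iq + rng.gauss(0, 1) for iq in intelligence]
grade = [iq + rng.gauss(0, 1) for iq in intelligence]  # no studying term at all

# Association across everyone: looks as if studying "causes" good grades.
r_all = pearson(studying, grade)

# Association among individuals with nearly the same intelligence.
same_iq = [i for i, iq in enumerate(intelligence) if abs(iq) < 0.1]
r_fixed = pearson([studying[i] for i in same_iq], [grade[i] for i in same_iq])

print(f"correlation, all individuals:        {r_all:.2f}")
print(f"correlation, intelligence held fixed: {r_fixed:.2f}")
```

In an observational study we cannot hold the lurking variable fixed like this, because by definition it is not among the variables studied. That is why a designed experiment is needed.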

Population versus sample

Population: The entire group of individuals in which we are interested but can’t usually assess directly.

Example: All humans, all working-age people in California, all crickets.

Sample: The part of the population we actually examine and for which we do have data. How well the sample represents the population depends on the sample design.

A parameter is a number describing a characteristic of the population.

A statistic is a number describing a characteristic of a sample.
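The distinction between a parameter and a statistic can be sketched in a few lines. The population here is simulated (hypothetical heights in cm), and the function name `parameter_and_statistic` is an assumption made for illustration. In practice the population is usually not available, so the parameter stays unknown.

```python
import random
import statistics

def parameter_and_statistic(population, sample_size, seed=0):
    """Return (population mean, sample mean) for one simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    parameter = statistics.mean(population)  # describes the whole population
    statistic = statistics.mean(sample)      # describes only the sample
    return parameter, statistic

# Hypothetical population: simulated heights (cm) of 10,000 individuals.
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(10_000)]

parameter, statistic = parameter_and_statistic(population, sample_size=50)
print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

Changing `seed` draws a different sample and gives a different statistic, while the parameter stays the same.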

3.1 Design of Experiments

Experimental terminology

Comparative Experiments

Bias

Principles of Experimental Design

Statistical Significance

Matched Pairs Design

Block Design

Terminology

The individuals in an experiment are the experimental units. If they are human, we call them subjects.

In an experiment, we do something to the subject and measure the response. The “something” we do is called a treatment, or factor.

If the experiment involves giving two different doses of a drug, we say that we are testing two levels of the factor.

One group of people may be placed on a diet/exercise program for 6 months (treatment), and their blood pressure (response variable) would be compared with that of people who did not diet or exercise.

Terminology (Cont…)

Example:

Comparative Experiments

Experimental Units

In the laboratory environment, simple designs often work well.

Outside the laboratory, badly designed experiments often yield worthless results and we can’t tell whether the response was due to the treatment or to lurking variables.

Many laboratory experiments use a simple design:

Comparative Experiments (Cont…)

We compare the response to a treatment with the response to:

Another treatment

No treatment (a control)

A placebo

Or any combination of the above

A control is a situation where no treatment is administered.

A placebo is a fake/dummy treatment, such as a sugar pill. Many patients respond favorably to any treatment, even a placebo. The response to a dummy treatment is the placebo effect. Perhaps the most famous placebo is the kiss, blow, hug, band aid—whatever your technique—that parents use (quite effectively) for minor injuries in kids.

Randomized Comparative Experiments

The design of a study is biased if it systematically favors certain outcomes.

A double-blind experiment is one in which neither the subjects nor the experimenter(s) know which individuals received which treatment until the experiment is completed.

However, subjects must be informed that they will get one of a number of treatments, and must consent to that condition (it would be unethical otherwise).

Randomized Comparative Experiments (Cont…)

The best way to exclude biases from an experiment is to randomize the design. Both the individuals and treatments are assigned randomly.

Diagram: experimental units pass through random assignment into Group 1 and Group 2.
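Random assignment into two groups can be sketched with Python's standard `random` module. The unit labels and the function name `randomize_two_groups` are hypothetical; the seed is fixed only so that the example is reproducible.

```python
import random

def randomize_two_groups(units, seed=None):
    """Shuffle the units by chance and split them into two groups."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy, so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical experimental units labeled unit01 ... unit20.
units = [f"unit{i:02d}" for i in range(1, 21)]
group1, group2 = randomize_two_groups(units, seed=2024)

print("Group 1:", sorted(group1))
print("Group 2:", sorted(group2))
```

Because chance alone decides who lands in each group, neither the experimenter's judgment nor the subjects' preferences can systematically favor one treatment.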


Principles of Experimental Design

1. Control: Compare two or more treatments. This will control the effects of lurking variables on the response.

2. Randomize: Use neutral chance to assign experimental units to treatments.

3. Replication: Repeat each treatment on many units to reduce chance variation in the results.


An observed effect so large that it would rarely occur by chance is called statistically significant. It tells you that the investigators found good evidence for the effect they were seeking.
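One way to make "would rarely occur by chance" concrete is a permutation test: reshuffle the group labels many times and count how often chance alone produces a difference at least as large as the one observed. This technique is brought in here for illustration and is not taken from the slides; the response values below are hypothetical, and `permutation_p_value` is an assumed name.

```python
import random

def permutation_p_value(treatment, control, n_shuffles=10_000, seed=0):
    """Estimate how often random relabeling gives a mean difference
    at least as extreme as the observed one (two-sided)."""
    rng = random.Random(seed)
    n_t = len(treatment)
    observed = sum(treatment) / n_t - sum(control) / len(control)
    pooled = list(treatment) + list(control)
    n_c = len(pooled) - n_t
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group labels were assigned by chance
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / n_c
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_shuffles

# Hypothetical responses, invented for this sketch only.
treatment = [12.1, 13.4, 11.8, 14.0, 13.1, 12.7, 13.9, 12.5]
control = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4, 10.0, 11.2]

observed, p_value = permutation_p_value(treatment, control)
print(f"observed difference in means: {observed:.2f}")
print(f"share of shuffles at least as extreme: {p_value:.4f}")
```

A very small share means the observed effect would rarely arise from chance assignment alone, which is what "statistically significant" describes.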

How to Randomize

The idea of randomization is to assign subjects to treatments by drawing names from a hat.

To randomize an experiment, we can use a table of random digits (like Table B) or statistical software.

The digits appear in groups of five and in numbered rows. The groups and rows have no meaning.

Using Table B

Suppose we need to select five students from a class of 20 students.

Step 1 (Label): Since the class has 20 students, list and number them as 01, 02, 03, …, 20.

Step 2 (Table): Start anywhere in the table and read two-digit groups. Each of these two-digit groups is a label. The labels 00 and 21 to 99 are not used in this example, so we ignore them. Continue to the next line.

45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 82 22 69 00 56 52 71 13 88 89 93 07 46 02 ……

Reading these digits two at a time, the labels that fall between 01 and 20 are 17, 09, 13, 07, and 02, so the students with those five labels are selected.
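The table-reading procedure above can be sketched in code. The digits are the ones shown in the example; the function name `select_from_digits` is hypothetical, and the rule for skipping a label that has already been chosen is an assumption added so that no student is selected twice.

```python
def select_from_digits(digits, population_size, sample_size):
    """Read fixed-width labels from a string of random digits and keep the
    first `sample_size` distinct labels between 1 and `population_size`."""
    width = len(str(population_size))  # 20 students -> two-digit labels
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Ignore unused labels (00, 21-99) and labels already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

# Digits copied from the example; spacing carries no meaning, so remove it.
table_text = ("45 46 71 17 09 77 55 80 00 95 32 86 32 94 85 "
              "82 22 69 00 56 52 71 13 88 89 93 07 46 02")
table_digits = "".join(table_text.split())

selected = select_from_digits(table_digits, population_size=20, sample_size=5)
print(selected)  # [17, 9, 13, 7, 2]
```

With software instead of a table, a call such as `random.sample(range(1, 21), 5)` from Python's standard library plays the same role.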

Completely randomized designs

In a completely randomized experimental design, individuals are randomly assigned to groups, then the groups are randomly assigned to treatments.

Look at the response!!!

Blocked Designs

In a block or stratified design, subjects are divided into groups, or blocks, prior to the experiment to test hypotheses about differences between the groups. Treatments are then assigned at random within each block.

The blocking, or stratification, here is by gender.
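A block design with gender as the blocking variable might be sketched as follows. The subjects, the treatment names "A" and "B", and the function name `randomize_within_blocks` are all hypothetical. The point is that the shuffle happens separately inside each block.

```python
import random
from collections import defaultdict

def randomize_within_blocks(subjects, block_of, treatments, seed=None):
    """Group subjects into blocks, then assign treatments at random
    separately within each block, in (near-)equal numbers."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)  # chance decides the order within this block
        for i, subject in enumerate(members):
            assignment[subject] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: (name, gender) pairs, six women and six men.
subjects = ([(f"W{i}", "female") for i in range(1, 7)]
            + [(f"M{i}", "male") for i in range(1, 7)])

assignment = randomize_within_blocks(
    subjects, block_of=lambda s: s[1], treatments=["A", "B"], seed=7
)
for (name, gender), treatment in sorted(assignment.items()):
    print(name, gender, treatment)
```

Each gender ends up with the same number of subjects on each treatment, so a difference between the treatments cannot be explained by an imbalance in gender.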

Matched Pairs Designs

A common type of randomized block design for comparing two treatments is a matched pairs design.

Matched pairs:

Choose pairs of subjects that are closely matched—e.g., same sex, height, weight, age, and income.

Within each pair, randomly assign who will receive which treatment.

The idea is that matched subjects are more similar than unmatched subjects, so that comparing responses within a number of pairs is more efficient than comparing the responses of groups of randomly assigned subjects.
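The within-pair randomization step can be sketched like this. The pair members, the treatment labels, and the function name `assign_matched_pairs` are hypothetical; how the pairs were matched is assumed to have been decided beforehand.

```python
import random

def assign_matched_pairs(pairs, treatments=("treatment", "control"), seed=None):
    """For each matched pair, let chance decide which member
    receives which of the two treatments."""
    rng = random.Random(seed)
    assignment = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # like a coin toss within the pair
        assignment.append({first: order[0], second: order[1]})
    return assignment

# Hypothetical pairs of closely matched subjects.
pairs = [("Ana", "Beth"), ("Carl", "Dev"), ("Eli", "Finn"), ("Gus", "Hal")]

pair_assignment = assign_matched_pairs(pairs, seed=3)
for pair in pair_assignment:
    print(pair)
```

The analysis then compares the two responses within each pair, rather than comparing two whole groups.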

What experimental design?

Is there a significant difference in resting pulse rates for men and for women? A random sample of 28 men and 24 women had their pulse rate measured at rest in the lab.

Many dairy cows now receive injections of BST, a hormone intended to spur greater milk production. The milk production of 60 dairy cows was recorded before and after they received a first injection of BST.

In a study of sickle cell anemia, 150 patients were given the drug hydroxyurea, and 150 were given a placebo (dummy pill). The researchers counted the episodes of pain in each subject at the end of the study.
