Problems with Science

Upload: orea

Post on 04-Feb-2016

TRANSCRIPT

Page 1: Problems with Science

Problems with Science

Page 2: Problems with Science

“The first principle is that you must not fool yourself.” – Richard Feynman

Page 3: Problems with Science

THE FINAL

Page 4: Problems with Science

Final Exam

• 17 December 2013 (Tuesday)
• 18:30-20:30
• In the Gymnasium
• 20 Questions
• All short answer
• 5 marks each
• Worth 20% of the course grade

Page 5: Problems with Science

PROBLEMS WITH SCIENCE

Page 6: Problems with Science

Replication

In science, we require that our results be reproducible.

In a scientific article about an experiment, scientists must describe every detail of what they did, so that someone else can do the same experiment and check whether they get the same result.

Page 7: Problems with Science

Replication

This is called replication. It is a basic principle of the scientific method. If a finding cannot be replicated, then we must reject it.

Page 8: Problems with Science

Does Replication Happen?

• Biotech firm Amgen tried to reproduce 53 “landmark” cancer studies, but could reproduce only 6.

• Drug company Bayer tried to reproduce 67 different studies, but could reproduce only 17.

• In the decade 2000-2010, some 80,000 people took part in studies that later could not be replicated.

Page 9: Problems with Science

Why? What’s wrong?

Page 10: Problems with Science

FALSE POSITIVES

Page 11: Problems with Science

Classic Article

In a classic article titled “Why Most Published Research Findings Are False,” John Ioannidis found:

“Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true.”

Page 14: Problems with Science

False Positives

Why do we get 5% false positives?

In science we require p < .05: if the null hypothesis were true, we would obtain results at least this extreme only 1 time in 20, or 5% of the time.

So up to 5% of positive results are accidental correlations. Repeating the experiment is unlikely to reproduce the same accident.
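The 1-in-20 logic can be checked with a quick simulation. This is an illustrative sketch, not from the slides: the group sizes, the random seed, and the rough |t| > 2.0 cutoff (which approximates p < .05 two-sided at these sample sizes) are my own choices.

```python
import random
import statistics

def t_statistic(a, b):
    """Welch's t-statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

random.seed(0)
n, trials = 30, 2000
critical_t = 2.0  # |t| > 2.0 roughly corresponds to p < .05 (two-sided) here
false_positives = 0
for _ in range(trials):
    # Both groups come from the SAME distribution: the null hypothesis is true.
    control = [random.gauss(0, 1) for _ in range(n)]
    treated = [random.gauss(0, 1) for _ in range(n)]
    if abs(t_statistic(control, treated)) > critical_t:
        false_positives += 1

rate = false_positives / trials
print(f"false-positive rate: {rate:.3f}")  # should land near 0.05
```

Even though the "drug" does literally nothing in every trial, about one experiment in twenty declares a significant effect.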

Page 15: Problems with Science

False Negatives

And why can there be more than 5% false negatives?

There’s no cap on false negatives: you can’t punish people for failing to find the truth, because finding it is hard. Low-powered studies miss real effects all the time.

Put the two together: when true effects are rare and studies are underpowered, the accidental positives can outnumber the real ones. That is how most published findings can be false.

Page 16: Problems with Science

Solving the Problem

Remember that p = .05 is the maximum p-value that scientific journals will accept. There’s no reason you can’t demand p = .01 (one-percent false positives) or p = .001 (one-in-1,000 false positives).

How do you do that? Have more people in your experiment: larger samples give you the statistical power to detect real effects even at stricter thresholds.

Page 17: Problems with Science

FINDING YOUR HYPOTHESES IN THE DATA

Page 18: Problems with Science

Finding the Hypothesis in the Data

There’s a difficult-to-understand fallacy in science that goes by different names: “finding hypotheses in the data,” “the problem of multiple comparisons,” “hypothesis fishing,” etc.

Page 19: Problems with Science

Random Correlations Everywhere

Suppose I decide to test a new drug I made. I don’t have any idea what it does or doesn’t do. I’m just going to give the drug to the experimental group and a placebo to the control group, then see what happens.

Page 20: Problems with Science

Questionnaire

• How much have you slept in the past two weeks?
• How much sex did you have?
• Have you had any headaches? How many?
• Did you find yourself getting angry for no reason?
• How easy or hard did you find it to concentrate?
• Do you have more or fewer pimples?
• What is your blood pressure?
• What’s 14723 plus 9843?

Page 21: Problems with Science

Finding the Hypothesis in the Data

By pure random chance, if I ask enough questions, the experimental group’s answers to some question will differ from the control group’s purely by accident.

Look! My drug gives you increased mathematical ability!
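The danger of asking many questions at once can be sketched in a few lines. Everything here is an illustrative assumption, not from the slides: two groups of 20 subjects, a 40-question survey, and an arbitrary "the groups look strikingly different" cutoff of 0.6.

```python
import random

random.seed(1)
n_subjects, n_questions, reps = 20, 40, 200
threshold = 0.6  # arbitrary "the groups look different" cutoff for this toy

def question_means(group):
    return [sum(subj[q] for subj in group) / len(group) for q in range(n_questions)]

experiments_with_a_hit = 0
for _ in range(reps):
    # The drug does nothing: both groups answer every question from the same
    # distribution, so any group difference is pure accident.
    control = [[random.gauss(0, 1) for _ in range(n_questions)] for _ in range(n_subjects)]
    treated = [[random.gauss(0, 1) for _ in range(n_questions)] for _ in range(n_subjects)]
    gaps = [abs(c - t) for c, t in zip(question_means(control), question_means(treated))]
    if any(gap > threshold for gap in gaps):
        experiments_with_a_hit += 1

print(f"{experiments_with_a_hit}/{reps} null experiments found at least one 'effect'")
```

With 40 questions per experiment, the large majority of these do-nothing experiments turn up at least one "striking" group difference somewhere.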

Page 22: Problems with Science

Use New Data!

It’s not unusual to find accidental correlations in data.

So in science we require that hypotheses be tested by new data. After I decide that my drug gives people better math skills, I then need to do a new experiment. This is to avoid the problem of multiple comparisons.

Page 23: Problems with Science

Why New Data Is Important

It would be really unlikely if:
(i) I propose a correlation
(ii) I test it against some new data
(iii) The new data confirm the correlation
(iv) All of that was just an accident

Compare this to the fact that it is really likely to find random correlations in the data.

Page 24: Problems with Science

fMRI

fMRI = functional Magnetic Resonance Imaging. It’s a way of measuring changes in blood flow in the brain, which lets us infer changes in brain activity.

Page 25: Problems with Science

fMRI Neuroscience

A common fMRI study in neuroscience might go something like this: I put a bunch of people in fMRI machines and have them look at various pictures.

Page 26: Problems with Science

Information Processing

When they look at pictures of happy things, like smiling babies, double rainbows, cute puppies, or whatever, I might notice that certain parts of their brains are active (and not active when I’m not showing them these pictures).

I might then conclude that these parts of the brain (the active ones) are responsible for processing information about happy things.

Page 27: Problems with Science

Multiple Comparisons

But this methodology is ripe for the problem of multiple comparisons.

There are lots of areas of the brain and lots of different aspects of any picture. If I look at all the areas of the brain and all the aspects of the pictures, I will find many correlations totally by random chance.

Page 28: Problems with Science

The Proof

Page 29: Problems with Science

Dead Fish

Craig Bennett is a neuroscience graduate student. He wanted to test out his fMRI machine, so he bought a whole dead salmon.

He put the dead salmon in the machine and showed it “a series of photographs depicting human individuals in social situations.”

Page 30: Problems with Science

Experimental Design

The salmon “was asked to determine what emotion the individual in the photo must have been experiencing.”

Then Bennett looked to see whether there were correlations between changes in the blood flow in the salmon’s brain, and the pictures.

Page 31: Problems with Science

Correlations!

Unsurprisingly, there were.

16 out of 8,064 voxels (volumetric pixels) were correlated with picture-viewing.

The important thing is that lots of neuroscientists use these same methods on humans. Without correcting for the number of comparisons, the risk of error is great.
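A back-of-the-envelope calculation shows why uncorrected voxel-wise tests are dangerous. Assuming each voxel is tested independently (a simplification), the expected number of purely accidental "active" voxels is just the threshold times the voxel count:

```python
n_voxels = 8064  # total voxels tested in the salmon study

# Expected number of voxels crossing the threshold by chance alone:
print(0.05 * n_voxels)   # ≈ 403 at an uncorrected p < .05
print(0.001 * n_voxels)  # ≈ 8 even at a stricter p < .001

# The standard fix (Bonferroni correction) divides the threshold by the
# number of comparisons, so each voxel must clear p < 0.05 / 8064:
print(0.05 / n_voxels)
```

Bennett's point was exactly this: with thousands of comparisons and no correction, a handful of "significant" voxels is what chance alone predicts, even in a dead fish.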

Page 32: Problems with Science

Solving the Problem

This problem is easily solved: don’t find your hypotheses in your data.

Well… it’s not that easy. You have to convince neuroscientists to behave!

Page 33: Problems with Science

MASSAGING THE DATA

Page 34: Problems with Science

Cheating in Science

There are lots of ways to cheat in science. If you want your study to show that antidepressants do better than placebos, you can skip double-blinding, or use improper randomization techniques (though this is obvious to real scientists).

Page 35: Problems with Science

You can also:
• Only correct the baseline when it suits you.
• Ignore dropouts.
• Remove outliers when it suits you.
• Choose a statistical test that gets the best results.
• Publish only positive findings.

Page 36: Problems with Science

The Baseline

Often, studies don’t have the power we would ideally desire. Remember that for a 95% confidence interval 6 percentage points wide, we estimated that we’d need about 1,000 subjects in our study.

But if you’re studying a new drug, how do you find 1,000 people who need it in your area who are willing to sign up for your trial?
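The 1,000-subject figure can be recovered from the standard sample-size formula for estimating a proportion. Reading "a 95% confidence interval of 6%" as a margin of error of ±3 percentage points is my interpretation of the slide:

```python
# n = z^2 * p * (1 - p) / e^2, the standard sample-size formula for a proportion
z = 1.96   # critical value for 95% confidence
p = 0.5    # worst case: p * (1 - p) is largest at 0.5
e = 0.03   # margin of error of ±3 points, i.e. a 6-point-wide interval

n = z**2 * p * (1 - p) / e**2
print(round(n))  # 1067, roughly the 1,000 subjects mentioned above
```

Halving the margin of error quadruples the required sample, which is why well-powered trials are so hard to fill.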

Page 37: Problems with Science

The Baseline

Scientists often test much smaller groups, and then aggregate (put together) all the data later. This is called meta-analysis, and we’ll be discussing it later.

When you have a small group of people, for example 20 or 30, there is a high probability that, by random chance alone, either the control group or the experimental group will start out doing better.

Page 38: Problems with Science

The Baseline

This is called “the baseline.”

If you’re testing a pain medication, for example, the control group might, merely as a matter of chance, have a higher degree of average pain than the experimental group.

They have a higher “baseline” degree of pain.

Page 39: Problems with Science

Controlling for the Baseline

You can “control for the baseline” by testing how much people’s pain improved over the course of the trial, instead of just testing how much pain they’re in at the end of the trial.

Page 40: Problems with Science

Not Controlling

The average pain score at the start of the experiment was 65 in the control group and 52 in the experimental group. Nobody improved, so the scores were still 65 and 52 at the end. But if you report only the end scores, it looks like your treatment worked: the experimental group averaged 13 fewer pain points!
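The arithmetic of the example above, as a minimal sketch (note that 65 - 52 is 13):

```python
control_start, control_end = 65, 65   # control group: no change
treated_start, treated_end = 52, 52   # experimental group: no change either

# Endpoint-only comparison: looks like a 13-point advantage for the treatment.
print(control_end - treated_end)      # 13

# Change-from-baseline comparison: both groups improved by exactly 0.
print(control_end - control_start)    # 0
print(treated_end - treated_start)    # 0
```

Same data, two honest-looking summaries, opposite conclusions; that is why which summary you choose must be fixed before you see the results.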

Page 41: Problems with Science

Controlling for the Baseline

It’s best to control for the baseline, but it’s OK if you don’t.

What’s bad is when you control for the baseline when the control group is doing better, but don’t control for it when the experimental group is doing better. That’s cheating.

Page 42: Problems with Science

Ignoring Dropouts

Sometimes a treatment won’t work, or will cause harmful side-effects. The people experiencing the worst of these side-effects might drop out of the trial.

If you collect data only on people who finished the trial, it will seem like your treatment has fewer side-effects than it actually does.

Page 43: Problems with Science

Outliers

Page 44: Problems with Science

Outliers

An outlier is a data point that is far away from all of your other data points: it doesn’t fit the pattern the rest of the data clearly show.

For example, in a trial for a pain medication, you might have some people get a little better, some people get a little worse, and one person who dies. Dying is an outlier, in this situation.

Page 45: Problems with Science

Controlling for Outliers

Outliers are often due to just random chance. Through no fault of your treatment, sometimes people die. It can’t be helped.

It’s accepted practice to control for outliers (which have specific definitions in statistics) by removing them from your data. You can also choose to leave all your data intact.
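One of those specific statistical definitions is Tukey's rule: a point is an outlier if it falls more than 1.5 × IQR (interquartile range) beyond the quartiles. A minimal sketch; the pain-score data here are made up for illustration:

```python
import statistics

def remove_outliers(data):
    """Drop points more than 1.5 * IQR outside the quartiles (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if lo <= x <= hi]

# Change in pain score for ten hypothetical subjects; one extreme value.
pain_change = [-3, -2, -2, -1, 0, 0, 1, 1, 2, 40]
print(remove_outliers(pain_change))  # the 40 is dropped, everything else kept
```

The key point is that the rule is mechanical: it decides what counts as an outlier before you know whether dropping it helps or hurts your hypothesis.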

Page 46: Problems with Science

Controlling for Outliers

Nothing is wrong with removing outliers, except when you do it only when it suits you.

If you keep negative outliers in the control group and positive outliers in the experimental group, but remove positive outliers from the control group and negative outliers from the experimental group, you’re cheating!

Page 48: Problems with Science

Publication Bias

Suppose I conduct a rigorous, scientific test of the claim that reading causes foot cancer. I show, with high statistical significance and a large effect size, that it is true!

That’s big news, and not only will I get published in the best science journals, like Science and Nature, I’ll probably get in the newspapers too.

Page 49: Problems with Science

Publication Bias

Instead, suppose I go out and conduct a rigorous, double-blind placebo-controlled randomized trial for the claim that reading does NOT cause foot cancer.

I use a large, representative sample of the population, and discover, with a high degree of statistical significance, that I’m right.

Page 50: Problems with Science

Publication Bias

Well who cares? Not Science or Nature!

We all knew that reading didn’t cause foot cancer. That’s silly.

Negative results are inherently boring and uninteresting. Positive results are exciting and informative.

Page 51: Problems with Science

Testing ESP

http://www.colbertnation.com/the-colbert-report-videos/372474/january-27-2011/time-traveling-porn---daryl-bem

Page 52: Problems with Science

Testing ESP

Dr. Daryl Bem conducted experiments where the task was for subjects to select which of two curtains had an image behind it.

The curtain with the picture was determined randomly by a computer. So we expect that people will get the answer right about 50% of the time… random guessing.

Page 53: Problems with Science

Porn from the Future

What Bem found was that when the picture was normal and not pornographic, people did guess randomly: they picked the curtain hiding the picture 49.8% of the time.

But if the picture was pornographic, subjects guessed right 53.1% of the time. This was statistically significant.

Page 54: Problems with Science

Replicability

This is not unusual. Positive results will happen frequently through mere chance. We saw this in the salmon example.

This is why science relies on replicability. Other experimenters should be able to repeat your experiment and get the same results. If they can’t, then it looks like your result was lucky.

Page 55: Problems with Science

Publication Bias

So does ESP exist? Three separate teams of scientists, at the University of Edinburgh, the University of Hertfordshire, and Goldsmiths, University of London, decided to test it. They performed the same experiment that Bem did, but got negative results: even the pornographic pictures were guessed at random rates.

Page 56: Problems with Science

“No Replications”

The original journal that published Bem’s results, the Journal of Personality and Social Psychology, refused to publish these new ones.

“We don’t publish replications,” they said.

Two other psychological journals said the same thing. They wouldn’t even look at the paper.

Page 57: Problems with Science

Publication Bias

Just consider this for a moment.

Suppose you want to know whether ESP works. Four studies are run: one is positive and three are negative. But only the positive one is published; no one will publish the negative ones. So you look at all the available evidence, which is just that one positive study, and conclude that ESP works. You’re wrong!
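The ESP scenario generalizes. Here is a toy simulation with numbers of my own choosing, not from the slides: assume ESP does not exist, so every study has only the usual 5% chance of a false positive, and journals publish only the positives.

```python
import random

random.seed(2)
n_studies, alpha = 200, 0.05

# ESP does not exist, so a "positive" study is always a false positive.
positives = sum(random.random() < alpha for _ in range(n_studies))
negatives = n_studies - positives

print(f"studies run: {n_studies} ({positives} positive, {negatives} negative)")
print(f"published literature: {positives} positive, 0 negative")
# A reader of the journals sees only positives and concludes ESP is real.
```

The unpublished negatives vastly outnumber the positives, but the published record is unanimous in the wrong direction.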

Page 58: Problems with Science

FRAUD ON BOTH ENDS

Page 59: Problems with Science

Diederik Stapel

Diederik Stapel was a professor of social psychology at Tilburg University in the Netherlands.

Page 60: Problems with Science

Diederik Stapel

His research was frequently in the news.

He found that people are more racist when there’s trash around.

He found that eating meat made people less social and more selfish.

Page 61: Problems with Science

Diederik Stapel

But it was a lie!

He made up all the experimental data.

Page 62: Problems with Science

Diederik Stapel

“Nobody ever checked my work. They trusted me.… I did everything myself, and next to me was a big jar of cookies. No mother, no lock, not even a lid.… Every day, I would be working and there would be this big jar of cookies, filled with sweets, within reach, right next to me — with nobody even near. All I had to do was take it.” (p. 164)

Page 63: Problems with Science

Publish or Perish

It’s a “publish or perish” world in science. If you don’t have exciting scientific discoveries, you won’t get promoted, or you may lose your job.

On the other side of things, many unscrupulous “scientific journals” will publish your results for money. They claim to “peer review” submissions, but this is obviously not always true…

Page 64: Problems with Science

Pay to Publish Scam

Researcher John Bohannon wanted to test whether these “pay to publish” journals were really just scams to make money off of scientists.

So he made up a fake paper, full of obvious mistakes, and sent it to 304 journals. More than half accepted it!

Page 65: Problems with Science

The One Paper that was 304 Papers

“The paper took this form: Molecule X from lichen species Y inhibits the growth of cancer cell Z. To substitute for those variables, I created a database of molecules, lichens, and cancer cell lines and wrote a computer program to generate hundreds of unique papers. Other than those differences, the scientific content of each paper is identical.” -- Bohannon

Page 66: Problems with Science

Basic Problems

The paper had basic problems with it.

For example, there was no control group. It showed that certain lichen molecules stopped cancer cells from growing, but it didn’t show that they wouldn’t also stop normal cells from growing.

Page 67: Problems with Science

Stats

• Submitted to 304 journals.
• Journals included ones from top publishers: Elsevier, Sage, and Wolters Kluwer.
• Accepted by 157 (52%)
• Rejected by 98 (32%)
• No decision by 49 (16%)

Page 68: Problems with Science

Why?

This doesn’t mean all the research published by these journals is bad– it just means you can’t tell whether it’s good unless you look.

Why would anyone doing real research pay money to have it published? One example: Chinese universities often pay between 5,000 and 10,000 RMB per published paper, and, again, publication is the only way to get ahead.