5. Getting Started with Statistics

Dave Goldsman

H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology

7/12/20

ISYE 6739 — Goldsman 7/12/20 1 / 74

Outline

1 Introduction to Descriptive Statistics

2 Summarizing Data

3 Candidate Distributions

4 Introduction to Estimation

5 Unbiased Estimation

6 Mean Squared Error

7 Maximum Likelihood Estimation

8 Trickier MLE Examples

9 Invariance Property of MLEs

10 Method of Moments Estimation

11 Sampling Distributions

Introduction to Descriptive Statistics

Lesson 5.1 — Introduction to Descriptive Statistics

What’s Coming Up:

Three high-level lessons on what Statistics is (not involving much math).

Several lessons on estimating parameters of probability distributions.

One lesson on certain distributions that will come up in subsequent Statistics modules: normal, time for t, χ², and F.

Statistics forms a rational basis for decision-making using observed or experimental data. We make these decisions in the face of uncertainty.

Statistics helps us answer questions concerning:

The analysis of one population (or system).

The comparison of many populations.

Examples:

Election polling.

Coke vs. Pepsi.

The effect of cigarette smoking on the probability of getting cancer.

The effect of a new drug on the probability of contracting hepatitis.

What’s the most popular TV show during a certain time period?

The effect of various heat-treating methods on steel tensile strength.

Which fertilizers improve crop yield?

King of Siam — etc., etc., etc.

Idea (Election polling example): We can’t poll every single voter. Thus, we take a sample of data from the population of voters, and try to make a reasonable conclusion based on that sample.
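The polling idea can be sketched in a few lines of Python. This is only an illustrative simulation: the population size (100,000), the true support level (52%), the sample size (1,000), and the helper name `poll` are all assumptions made up for this example, not values from the lecture.

```python
import random

def poll(population, sample_size, rng):
    """Estimate the fraction of supporters (1s) from a simple random sample."""
    sample = rng.sample(population, sample_size)  # sampling without replacement
    return sum(sample) / sample_size

rng = random.Random(6739)  # fixed seed so the run is reproducible

# Hypothetical population: 100,000 voters, 52% of whom support candidate A.
population = [1] * 52_000 + [0] * 48_000
true_share = sum(population) / len(population)

# We only look at 1,000 voters, yet the estimate tends to land near the truth.
estimate = poll(population, 1_000, rng)
print(f"true share = {true_share:.3f}, sample estimate = {estimate:.3f}")
```

Rerunning with a different seed gives a slightly different estimate. Quantifying that sample-to-sample variation is what the later estimation lessons are about.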

Statistics tells us how to conduct the sampling (i.e., how many observations to take, how to take them, etc.), and then how to draw conclusions from the sampled data.

Types of Data

Continuous variables: Can take on any real value in a certain interval. For example, the lifetime of a lightbulb or the weight of a newborn child.

Discrete variables: Can only take on specific values. E.g., the number of accidents this week at a factory or the possible rolls of a pair of dice.

Categorical variables: These data are not typically numerical. What’s your favorite TV show during a certain time slot?
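As a rough illustration of why the data type matters, here is a small Python sketch. All of the observations and variable names below are hypothetical, chosen only to mirror the three examples above.

```python
from collections import Counter
from statistics import mean

# Hypothetical observations, one list per data type.
lifetimes_hours = [1012.4, 987.9, 1103.2, 956.0]      # continuous: any real value in an interval
accidents_per_week = [0, 2, 1, 0, 3]                  # discrete: counts only
favorite_show = ["news", "drama", "news", "sports"]   # categorical: labels, not numbers

# Averages make sense for numerical data...
print(mean(lifetimes_hours), mean(accidents_per_week))

# ...but categorical data are summarized by counting how often each label occurs.
show_counts = Counter(favorite_show)
print(show_counts.most_common(1))  # the most frequent label and its count
```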

Plotting Data

A picture is worth 1000 words. Always plot data before doing anything else, if only to identify any obvious issues such as nonstandard distributions, missing data points, outliers, etc.

Histograms provide a quick, succinct look at what you are dealing with. If you take enough observations, the histogram will eventually converge to the true distribution. But sometimes choosing the optimal number of cells is a little tricky, like Goldilocks: too few cells hide the shape of the data, and too many make the picture noisy.
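The effect of the number of cells can be seen with a hand-rolled histogram. This is a minimal sketch, assuming equal-width cells and a made-up sample of 500 standard normal draws. The function name `histogram` and the cell counts 3, 10, and 100 are illustrative choices, not a rule from the lecture.

```python
import random

def histogram(data, num_cells):
    """Count how many observations fall in each of num_cells equal-width cells."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_cells
    counts = [0] * num_cells
    if width == 0:               # all observations identical: everything in one cell
        counts[0] = len(data)
        return counts
    for x in data:
        # The maximum is placed in the last cell rather than past the end.
        i = min(int((x - lo) / width), num_cells - 1)
        counts[i] += 1
    return counts

rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(500)]  # hypothetical sample

for cells in (3, 10, 100):  # too few, roughly reasonable, too many
    counts = histogram(data, cells)
    print(f"{cells:>3} cells, tallest bar holds {max(counts)} observations")
```

With 3 cells almost everything piles into the middle bar, while with 100 cells many bars hold only a handful of points. Plotting libraries automate the drawing, but the trade-off is the same.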

Summarizing Data

Lesson 5.2 — Summarizing Data

In addition to plotting data, how do we summarize data?

It’s nice to have lots of data. But sometimes it’s too much of a good thing! We need to summarize.

Example: Grades on a test (i.e., raw data):

23 62 91 83 82 64 73 94 94 52

67 11 87 99 37 62 40 33 80 83

99 90 18 73 68 75 75 90 36 55


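Stem-and-Leaf Diagram of grades. Easy way to write down all of the data. Saves some space, and looks like a sideways histogram. Each row shows a tens digit (the stem) followed by the units digits (the leaves) of every grade in that decade.

9 | 9944100
8 | 73320
7 | 5533
6 | 87422
5 | 52
4 | 0
3 | 763
2 | 3
1 | 81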
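A stem-and-leaf diagram is easy to build mechanically. The sketch below uses the 30 grades from the example. The function name `stem_and_leaf` and the `stem | leaves` row format are illustrative choices, and the code assumes nonnegative integer data.

```python
from collections import defaultdict

# The 30 test grades from the example.
grades = [23, 62, 91, 83, 82, 64, 73, 94, 94, 52,
          67, 11, 87, 99, 37, 62, 40, 33, 80, 83,
          99, 90, 18, 73, 68, 75, 75, 90, 36, 55]


def stem_and_leaf(values):
    """Return one text row per stem (tens digit), largest stem first."""
    leaves = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)  # e.g. 94 -> stem 9, leaf 4
        leaves[stem].append(leaf)
    rows = []
    for stem in sorted(leaves, reverse=True):
        digits = "".join(str(d) for d in sorted(leaves[stem], reverse=True))
        rows.append(f"{stem} | {digits}")
    return rows


for row in stem_and_leaf(grades):
    print(row)
```

The length of each row equals the number of grades in that decade, which is why the diagram reads like a histogram turned on its side.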


Grouped Data

Range     Freq.   Cumul. Freq.   Proportion of observations so far
0–20        2          2           2/30
21–40       5          7           7/30
41–60       2          9           9/30
61–80      10         19          19/30
81–100     11         30           1
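The grouped table can be reproduced directly from the raw grades. This is a minimal sketch: the helper name `grouped_table` is hypothetical, and the ranges are treated as inclusive at both ends, which matches the integer-valued grades here.

```python
# The 30 test grades from the example.
grades = [23, 62, 91, 83, 82, 64, 73, 94, 94, 52,
          67, 11, 87, 99, 37, 62, 40, 33, 80, 83,
          99, 90, 18, 73, 68, 75, 75, 90, 36, 55]

# Inclusive (low, high) bounds of each range.
ranges = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]


def grouped_table(values, ranges):
    """Return (label, frequency, cumulative frequency, cumulative proportion) per range."""
    n = len(values)
    rows = []
    cumulative = 0
    for low, high in ranges:
        freq = sum(1 for v in values if low <= v <= high)
        cumulative += freq
        rows.append((f"{low}-{high}", freq, cumulative, cumulative / n))
    return rows


for label, freq, cumulative, proportion in grouped_table(grades, ranges):
    print(f"{label:>7}  {freq:3d}  {cumulative:3d}  {proportion:.3f}")
```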


Summary Statistics:

$n = 30$ observations.

If $X_i$ is the $i$th score, then the sample mean is

\[ \bar{X} \equiv \sum_{i=1}^n X_i / n = 66.5. \]

The sample variance is

\[ S^2 \equiv \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 = 630.6. \]

Remark: Before you take any observations, $\bar{X}$ and $S^2$ must be regarded as random variables.
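As a quick check of the two summary statistics, the sketch below evaluates the definitions on the 30 grades and compares the results with Python's standard `statistics` module, whose `variance` function also divides by n - 1. The variable names are illustrative.

```python
import statistics

# The 30 test grades from the example.
grades = [23, 62, 91, 83, 82, 64, 73, 94, 94, 52,
          67, 11, 87, 99, 37, 62, 40, 33, 80, 83,
          99, 90, 18, 73, 68, 75, 75, 90, 36, 55]

n = len(grades)

# Sample mean: the sum of the observations divided by n.
xbar = sum(grades) / n

# Sample variance: squared deviations from the mean, divided by n - 1.
s2 = sum((x - xbar) ** 2 for x in grades) / (n - 1)

print(f"n = {n}, sample mean = {xbar:.1f}, sample variance = {s2:.1f}")

# Cross-check against the standard library.
print(statistics.mean(grades), statistics.variance(grades))
```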


In general, suppose that we sample iid data $X_1, \ldots, X_n$ from the population of interest.

Example: $X_i$ is the lifespan of the $i$th lightbulb we observe.

We’re most interested in measuring the “center” and “spread” of the underlying distribution of the data.

Measures of Central Tendency:

Sample Mean: $\bar{X} = \sum_{i=1}^n X_i / n$.

Sample Median: The “middle” observation when the $X_i$’s are arranged numerically.


Example: 16, 7, 83 gives a median of 16.

Example: 16, 7, 83, 20 gives a “reasonable” median of $(16 + 20)/2 = 18$, the average of the two middle observations.

Remark: The sample median is less susceptible to “outlier” data than the sample mean. One bad number can spoil the sample mean’s entire day.

Example: 7, 7, 7, 672, 7 results in a sample mean of 140 and a sample median of 7.

Sample Mode: “Most common” value. Not always the most useful measure, since it need not be unique or lie near the center of the data.

Example: 16, 7, 20, 83, 7 gives a mode of 7.
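The small examples above can be checked with Python's standard `statistics` module. This is just a sketch; the variable names are illustrative.

```python
import statistics

# Odd number of observations: the median is the middle value after sorting.
median_odd = statistics.median([16, 7, 83])

# Even number of observations: the median averages the two middle values.
median_even = statistics.median([16, 7, 83, 20])

# One outlier moves the mean a long way but leaves the median alone.
outlier_data = [7, 7, 7, 672, 7]
outlier_mean = statistics.mean(outlier_data)
outlier_median = statistics.median(outlier_data)

# The mode is the most common value.
mode_value = statistics.mode([16, 7, 20, 83, 7])

print(median_odd, median_even, outlier_mean, outlier_median, mode_value)
```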


Measures of Variation (dispersion, spread)

Sample Variance:

\[ S^2 \equiv \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 = \frac{1}{n-1} \Big( \sum_{i=1}^n X_i^2 - n\bar{X}^2 \Big), \]

the latter expression being easier to compute.

Sample Standard Deviation: $S = +\sqrt{S^2}$.

Sample Range: $\max_i X_i - \min_i X_i$.
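The second form of the sample variance follows from expanding the square and using $\sum_{i=1}^n X_i = n\bar{X}$. The short derivation below is a sketch of that step, written with the same symbols as the definition:

```latex
\begin{align*}
\sum_{i=1}^n (X_i - \bar{X})^2
  &= \sum_{i=1}^n X_i^2 - 2\bar{X} \sum_{i=1}^n X_i + n\bar{X}^2 \\
  &= \sum_{i=1}^n X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
  &= \sum_{i=1}^n X_i^2 - n\bar{X}^2 .
\end{align*}
```

Dividing both sides by $n-1$ gives the shortcut. It is convenient for hand calculation, but in floating-point arithmetic it subtracts two large, nearly equal numbers when the mean is large relative to the spread, so the deviation form can be the safer choice on a computer.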


Remark: Suppose the data takes $p$ different values $X_1, \ldots, X_p$, with frequencies $f_1, \ldots, f_p$, respectively, so that $n = \sum_{j=1}^p f_j$.

How to calculate $\bar{X}$ and $S^2$ quickly?

\[ \bar{X} = \sum_{j=1}^p f_j X_j / n \quad \text{and} \quad S^2 = \frac{\sum_{j=1}^p f_j X_j^2 - n\bar{X}^2}{n-1}. \]

Example: Suppose we roll a die 10 times.

X_j   1  2  3  4  5  6
f_j   2  1  1  3  0  3

Then $\bar{X} = (2 \cdot 1 + 1 \cdot 2 + \cdots + 3 \cdot 6)/10 = 3.7$, and $S^2 = 3.789$.
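The die example can be verified with a few lines of code. This is a minimal sketch of the frequency-weighted formulas; the variable names are illustrative.

```python
# Distinct values of the die and how often each was observed in 10 rolls.
values = [1, 2, 3, 4, 5, 6]
freqs = [2, 1, 1, 3, 0, 3]

# Total sample size is the sum of the frequencies.
n = sum(freqs)

# Frequency-weighted sample mean.
xbar = sum(f * x for f, x in zip(freqs, values)) / n

# Frequency-weighted shortcut formula for the sample variance.
s2 = (sum(f * x ** 2 for f, x in zip(freqs, values)) - n * xbar ** 2) / (n - 1)

print(f"n = {n}, sample mean = {xbar:.1f}, sample variance = {s2:.3f}")
```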


Remark: If the individual observations can’t be recovered from a frequency distribution because they have been grouped into $c$ intervals, you can approximate each observation by the midpoint of its interval.

Example: Suppose $c = 3$, where we denote the midpoint of the $j$th interval by $m_j$, $j = 1, \ldots, c$, and the total sample size $n = \sum_{j=1}^c f_j = 30$.

X_j interval   m_j   f_j
100–150        125   10
150–200        175   15
200–300        250    5

\[ \bar{X} \approx \frac{\sum_{j=1}^c f_j m_j}{n} = 170.833 \quad \text{and} \quad S^2 \approx \frac{\sum_{j=1}^c f_j m_j^2 - n\bar{X}^2}{n-1} = 1814. \]
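The interval example works the same way once each interval is replaced by its midpoint. The sketch below assumes the midpoint is the simple average of the two endpoints, which agrees with the $m_j$ values in the table. The variable names are illustrative.

```python
# Interval endpoints and the number of observations in each interval.
intervals = [(100, 150), (150, 200), (200, 300)]
freqs = [10, 15, 5]

# Approximate every observation in an interval by the interval midpoint.
midpoints = [(low + high) / 2 for low, high in intervals]

# Total sample size is the sum of the frequencies.
n = sum(freqs)

# Approximate sample mean and sample variance from the grouped data.
xbar = sum(f * m for f, m in zip(freqs, midpoints)) / n
s2 = (sum(f * m ** 2 for f, m in zip(freqs, midpoints)) - n * xbar ** 2) / (n - 1)

print(f"midpoints = {midpoints}")
print(f"n = {n}, approx. mean = {xbar:.3f}, approx. variance = {s2:.0f}")
```

These are approximations only: the true mean and variance depend on where the observations actually fall inside each interval.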


Lesson 5.3 — Candidate Distributions

Time to make an informed guess about the type of probability distribution we’re dealing with. We’ll look at more-formal methodology for fitting distributions later in the course when we do goodness-of-fit tests. But for now, here are some preliminary things we should think about:

Is the data from a discrete, continuous, or mixed distribution?

Is it univariate or multivariate?

How much data is available?

Are experts around to ask about the nature of the data?

What if we do not have much (or any) data? Can we at least guess at a good distribution?

If the data come from a discrete distribution, then we have a number of familiar choices to select from.

Bernoulli(p) (success with probability p)

Binomial(n, p) (number of successes in n Bern(p) trials)

Geometric(p) (number of Bern(p) trials until first success)

Negative Binomial(r, p) (number of Bern(p) trials until the rth success)

Poisson(λ) (counts the number of arrivals over time)

Empirical (the all-purpose “sample” distribution based on the histogram)
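Several of the discrete candidates above are built from Bernoulli trials, and a short simulation makes those relationships concrete. This is only a sketch using Python's standard `random` module. The helper names, the seed, and the parameter values are arbitrary choices for illustration, and the geometric sampler assumes p > 0.

```python
import random


def bernoulli(p, rng):
    """Return 1 (success) with probability p, else 0."""
    return 1 if rng.random() < p else 0


def binomial(n, p, rng):
    """Number of successes in n Bern(p) trials."""
    return sum(bernoulli(p, rng) for _ in range(n))


def geometric(p, rng):
    """Number of Bern(p) trials until the first success (requires p > 0)."""
    trials = 1
    while bernoulli(p, rng) == 0:
        trials += 1
    return trials


def negative_binomial(r, p, rng):
    """Number of Bern(p) trials until the r-th success."""
    return sum(geometric(p, rng) for _ in range(r))


rng = random.Random(6739)  # arbitrary seed, for reproducibility only
print(binomial(10, 0.3, rng), geometric(0.3, rng), negative_binomial(3, 0.3, rng))
```

Writing the negative binomial as a sum of geometrics mirrors its description as "trials until multiple successes": each success ends one geometric waiting period.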

If the data suggest a continuous distribution...

Uniform (not much is known from the data, except perhaps the minimum and maximum possible values)

Triangular (at least we have an idea regarding the minimum, maximum, and “most likely” values)

Exponential(λ) (e.g., interarrival times from a Poisson process)

Normal (a good model for heights, weights, IQs, sample means, etc.)

Beta (good for specifying bounded data)

Gamma, Weibull, Gumbel, lognormal (reliability data)

Empirical (our all-purpose friend)
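The link between the Exponential(λ) entry above and Poisson arrivals can be illustrated by simulation: if interarrival times are iid Exp(λ), the number of arrivals in an interval of length t averages about λt. The sketch below uses `random.expovariate` from Python's standard library. The function name, rate, horizon, seed, and replication count are hypothetical choices for illustration.

```python
import random


def count_arrivals(rate, horizon, rng):
    """Count arrivals in [0, horizon] when interarrival times are Exp(rate)."""
    t = rng.expovariate(rate)  # time of the first arrival
    count = 0
    while t <= horizon:
        count += 1
        t += rng.expovariate(rate)  # add the next interarrival time
    return count


rng = random.Random(2020)  # arbitrary seed
rate, horizon, reps = 3.0, 1.0, 20000
counts = [count_arrivals(rate, horizon, rng) for _ in range(reps)]
avg = sum(counts) / reps
print(avg)  # should land close to rate * horizon = 3

# Other candidates from the list are also available in the standard library,
# e.g. rng.uniform(a, b) and rng.triangular(low, high, mode).
print(rng.uniform(0, 1), rng.triangular(0, 10, 4))
```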

Introduction to Estimation

Lesson 5.4 — Introduction to Estimation

Definition: A statistic is a function of the observations $X_1, \ldots, X_n$ that does not explicitly depend on any unknown parameters.

Examples of statistics: $\bar{X}$ and $S^2$, but not $(\bar{X} - \mu)/\sigma$, which involves the unknown parameters $\mu$ and $\sigma$.

Statistics are random variables. If we take two different samples, we’d expect to get two different values of a statistic.

A statistic is usually used to estimate some unknown parameter from the underlying probability distribution of the $X_i$’s.

Examples of parameters: $\mu$, $\sigma^2$.
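To see that statistics really are random variables, one can compute the sample mean and sample variance on two independent samples from the same distribution. The sketch below assumes, purely for illustration, normal data with µ = 10 and σ = 2. It relies on `statistics.mean` and `statistics.variance` (the latter uses the n − 1 divisor) from Python's standard library.

```python
import random
import statistics


def sample_mean_and_variance(xs):
    """Return the statistics (X-bar, S^2); no unknown parameters are needed."""
    return statistics.mean(xs), statistics.variance(xs)


rng = random.Random(1)  # arbitrary seed
mu, sigma, n = 10.0, 2.0, 20  # hypothetical "true" parameters and sample size

sample_a = [rng.gauss(mu, sigma) for _ in range(n)]
sample_b = [rng.gauss(mu, sigma) for _ in range(n)]

# Same distribution, different samples: the statistics take different values.
print(sample_mean_and_variance(sample_a))
print(sample_mean_and_variance(sample_b))
```

Note that the function never touches `mu` or `sigma`; those appear only when generating the data, which is what makes the sample mean and sample variance statistics.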

Let $X_1, \ldots, X_n$ be iid RV’s and let $T(X) \equiv T(X_1, \ldots, X_n)$ be a statistic based on the $X_i$’s. Suppose we use $T(X)$ to estimate some unknown parameter $\theta$. Then $T(X)$ is called a point estimator for $\theta$.

Examples: $\bar{X}$ is usually a point estimator for the mean $\mu = E[X_i]$, and $S^2$ is often a point estimator for the variance $\sigma^2 = \mathrm{Var}(X_i)$.

It would be nice if $T(X)$ had certain properties:

Its expected value should equal the parameter it’s trying to estimate.

It should have low variance.
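The two desirable properties can be compared by simulation. The sketch below contrasts two point estimators of µ: the sample mean and the single observation $X_1$. Both are centered on µ, but the sample mean has much lower variance. The normal model, parameter values, seed, and function name are assumptions made only for this illustration.

```python
import random
import statistics


def simulate_estimators(mu, sigma, n, reps, rng):
    """Draw `reps` samples of size n; record X-bar and X_1 from each sample."""
    xbars, firsts = [], []
    for _ in range(reps):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbars.append(statistics.mean(xs))
        firsts.append(xs[0])
    return xbars, firsts


rng = random.Random(23)  # arbitrary seed
xbars, firsts = simulate_estimators(mu=10.0, sigma=2.0, n=25, reps=5000, rng=rng)

for name, estimates in (("X-bar", xbars), ("X_1", firsts)):
    # Average of the estimates (should be near mu) and their spread.
    print(name, statistics.mean(estimates), statistics.variance(estimates))
```

With these settings the variance of the sample mean should be roughly σ²/n = 0.16, against roughly σ² = 4 for the single observation.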

Unbiased Estimation

Lesson 5.5 — Unbiased Estimation

Definition: $T(\mathbf{X})$ is unbiased for $\theta$ if $E[T(\mathbf{X})] = \theta$.

Example/Theorem: Suppose $X_1, \ldots, X_n$ are iid anything with mean $\mu$. Then $\bar{X}$ is always unbiased for $\mu$.

\[
E[\bar{X}] = E\left[\frac{1}{n}\sum_{i=1}^n X_i\right] = E[X_i] = \mu.
\]

That’s why $\bar{X}$ is called the sample mean. □

Baby Example: In particular, suppose $X_1, \ldots, X_n$ are iid Exp($\lambda$). Then $\bar{X}$ is unbiased for $\mu = E[X_i] = 1/\lambda$.

But be careful. . . . $1/\bar{X}$ is biased for $\lambda$ in this exponential case, i.e., $E[1/\bar{X}] \neq 1/E[\bar{X}] = \lambda$. □
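The warning about $1/\bar{X}$ can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the lecture: the function name, the choices $\lambda = 2$ and $n = 5$, the replication count, and the seed are all assumptions made for illustration. For reference, a standard gamma-distribution calculation gives $E[1/\bar{X}] = n\lambda/(n-1)$ for $n \ge 2$, which is 2.5 here rather than 2.

```python
import random
import statistics

def simulate_exponential_estimators(lam=2.0, n=5, reps=50_000, seed=6739):
    """Average Xbar and 1/Xbar over many iid Exp(lam) samples of size n."""
    rng = random.Random(seed)
    xbar_values = []
    inv_xbar_values = []
    for _ in range(reps):
        sample = [rng.expovariate(lam) for _ in range(n)]
        xbar = sum(sample) / n
        xbar_values.append(xbar)
        inv_xbar_values.append(1.0 / xbar)
    return statistics.fmean(xbar_values), statistics.fmean(inv_xbar_values)

mean_xbar, mean_inv_xbar = simulate_exponential_estimators()
print(f"average of Xbar:   {mean_xbar:.4f}  (target 1/lambda = 0.5)")
print(f"average of 1/Xbar: {mean_inv_xbar:.4f}  (lambda = 2.0)")
```

The first average should land close to $1/\lambda$, while the second should sit noticeably above $\lambda$, which is the bias the slide warns about.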

Example/Theorem: Suppose $X_1, \ldots, X_n$ are iid anything with mean $\mu$ and variance $\sigma^2$. Then $S^2$ is always unbiased for $\sigma^2$.

\[
E[S^2] = E\left[\frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2\right] = \mathrm{Var}(X_i) = \sigma^2.
\]

This is why $S^2$ is called the sample variance. □

Baby Example: Suppose $X_1, \ldots, X_n$ are iid Exp($\lambda$). Then $S^2$ is unbiased for $\mathrm{Var}(X_i) = 1/\lambda^2$. □
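To see what the $n-1$ divisor buys, here is a hedged simulation sketch for the exponential baby example. It compares the average of $S^2$ with the average of the same sum of squares divided by $n$ instead. The function name, $\lambda = 1$, $n = 4$, the replication count, and the seed are illustrative assumptions, not values from the lecture.

```python
import random

def average_variance_estimates(lam=1.0, n=4, reps=50_000, seed=2020):
    """Average the (n - 1)-divisor and n-divisor variance estimates over many samples."""
    rng = random.Random(seed)
    total_s2 = 0.0
    total_div_n = 0.0
    for _ in range(reps):
        sample = [rng.expovariate(lam) for _ in range(n)]
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
        total_s2 += ss / (n - 1)   # S^2, the sample variance
        total_div_n += ss / n      # same sum of squares with divisor n
    return total_s2 / reps, total_div_n / reps

avg_s2, avg_div_n = average_variance_estimates()
print(f"average with divisor n - 1: {avg_s2:.4f}  (target 1/lambda^2 = 1.0)")
print(f"average with divisor n:     {avg_div_n:.4f}")
```

With these settings the $n$-divisor version has expected value $(n-1)\sigma^2/n = 0.75$, so it should come out visibly below the target, while $S^2$ should average close to 1.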

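Proof (of general result): First, some algebra gives

\[
\begin{aligned}
\sum_{i=1}^n (X_i - \bar{X})^2 &= \sum_{i=1}^n (X_i^2 - 2\bar{X}X_i + \bar{X}^2) \\
&= \sum_{i=1}^n X_i^2 - 2\bar{X}\sum_{i=1}^n X_i + n\bar{X}^2 \\
&= \sum_{i=1}^n X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
&= \sum_{i=1}^n X_i^2 - n\bar{X}^2.
\end{aligned}
\]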
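The algebraic identity above is easy to sanity-check on numbers. This small sketch uses a hypothetical four-point data set (chosen so every quantity is an exact integer) and evaluates both sides.

```python
import math

# Hypothetical data, chosen so the arithmetic is exact.
data = [1.0, 3.0, 5.0, 7.0]
n = len(data)
xbar = sum(data) / n

# Left side: sum of squared deviations from the sample mean.
lhs = sum((x - xbar) ** 2 for x in data)

# Right side: sum of squares minus n times the squared sample mean.
rhs = sum(x ** 2 for x in data) - n * xbar ** 2

print(lhs, rhs)  # prints 20.0 20.0
assert math.isclose(lhs, rhs)
```

In floating-point work the right-hand form can lose precision when the mean is large relative to the spread, so the identity is mainly useful for derivations like the one that follows.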

So. . .

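\[
\begin{aligned}
E[S^2] &= \frac{1}{n-1}\,E\left[\sum_{i=1}^n (X_i - \bar{X})^2\right] \\
&= \frac{1}{n-1}\,E\left[\sum_{i=1}^n X_i^2 - n\bar{X}^2\right] \\
&= \frac{1}{n-1}\left(\sum_{i=1}^n E[X_i^2] - nE[\bar{X}^2]\right) \\
&= \frac{n}{n-1}\left(E[X_1^2] - E[\bar{X}^2]\right) \quad \text{(since the $X_i$'s are iid)} \\
&= \frac{n}{n-1}\left(\mathrm{Var}(X_1) + (E[X_1])^2 - \mathrm{Var}(\bar{X}) - (E[\bar{X}])^2\right) \\
&= \frac{n}{n-1}\left(\sigma^2 - \sigma^2/n\right) \quad \text{(since $E[X_1] = E[\bar{X}]$ and $\mathrm{Var}(\bar{X}) = \sigma^2/n$)} \\
&= \sigma^2.
\end{aligned}
\]

Done. □

Remark: $S$ is not unbiased for the standard deviation $\sigma$.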
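One way to see why the remark holds, offered here as an aside rather than as part of the lecture's proof: variance is never negative, so applying that fact to $S$ itself and using $E[S^2] = \sigma^2$ gives

```latex
\[
0 \le \mathrm{Var}(S) = E[S^2] - (E[S])^2 = \sigma^2 - (E[S])^2
\quad\Longrightarrow\quad
E[S] \le \sigma,
\]
```

with strict inequality whenever $S$ is not constant, i.e., whenever $\mathrm{Var}(S) > 0$. So $S$ tends to underestimate $\sigma$ on average even though $S^2$ is exactly unbiased for $\sigma^2$.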

Big Example: Suppose that $X_1, \ldots, X_n \stackrel{\text{iid}}{\sim} \text{Unif}(0, \theta)$, i.e., the pdf is $f(x) = 1/\theta$, for $0 < x < \theta$. Think of it this way: I give you a bunch of random numbers between 0 and $\theta$, and you have to guess what $\theta$ is.

We’ll look at three unbiased estimators for $\theta$:

$Y_1 = 2\bar{X}$.

$Y_2 = \frac{n+1}{n} \max_{1 \le i \le n} X_i$.

\[
Y_3 = \begin{cases} 12\bar{X} & \text{w.p. } 1/2 \\ -8\bar{X} & \text{w.p. } 1/2. \end{cases}
\]

If they’re all unbiased, which one’s the best?
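A quick simulation gives a feel for the question. The sketch below draws many Unif(0, θ) samples, computes the three estimators on each, and reports the average and sample variance of each estimator. The function name, $\theta = 10$, $n = 10$, the replication count, and the seed are illustrative assumptions; the lecture's own comparison proceeds analytically.

```python
import random
import statistics

def simulate_uniform_estimators(theta=10.0, n=10, reps=50_000, seed=29):
    """Return {name: (average, sample variance)} for Y1, Y2, Y3 over many samples."""
    rng = random.Random(seed)
    draws = {"Y1": [], "Y2": [], "Y3": []}
    for _ in range(reps):
        sample = [rng.uniform(0.0, theta) for _ in range(n)]
        xbar = sum(sample) / n
        draws["Y1"].append(2.0 * xbar)
        draws["Y2"].append((n + 1) / n * max(sample))
        # Y3 flips a fair coin between 12 * Xbar and -8 * Xbar.
        draws["Y3"].append(12.0 * xbar if rng.random() < 0.5 else -8.0 * xbar)
    return {name: (statistics.fmean(values), statistics.variance(values))
            for name, values in draws.items()}

results = simulate_uniform_estimators()
for name, (avg, var) in results.items():
    print(f"{name}: average = {avg:8.3f}, variance = {var:10.3f}")
```

All three averages should hover around θ, consistent with unbiasedness, but the variances differ by orders of magnitude. That ties back to the earlier wish list for an estimator: the right expected value and low variance.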

“Good” Estimator: $Y_1 = 2\bar{X}$.

Proof (that it’s unbiased): $E[Y_1] = 2E[\bar{X}] = 2E[X_i] = 2(\theta/2) = \theta$. □

“Better” Estimator: $Y_2 = \frac{n+1}{n} \max_{1 \le i \le n} X_i$.

Why might this estimator for $\theta$ make sense? (We’ll say why it’s “better” in a little while.)

Proof (that it’s unbiased): $E[Y_2] = \frac{n+1}{n} E[\max_i X_i] = \theta$ iff

\[
E[\max_i X_i] = \frac{n\theta}{n+1}
\]

(which is what we’ll show below).

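First, let’s get the cdf of $M \equiv \max_i X_i$:

\[
\begin{aligned}
P(M \le y) &= P(X_1 \le y \text{ and } X_2 \le y \text{ and } \cdots \text{ and } X_n \le y) \\
&= P(X_1 \le y)\,P(X_2 \le y) \cdots P(X_n \le y) \quad \text{($X_i$'s indep)} \\
&= [P(X_1 \le y)]^n \quad \text{($X_i$'s identically distributed)} \\
&= \left[\int_0^y f_{X_1}(x)\,dx\right]^n \\
&= \left[\int_0^y (1/\theta)\,dx\right]^n \\
&= (y/\theta)^n.
\end{aligned}
\]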
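The cdf just derived can be compared against an empirical frequency. This is a rough check under assumed values ($\theta = 1$, $n = 5$, $y = 0.8$, plus an arbitrary seed and replication count); the function name is hypothetical.

```python
import random

def empirical_cdf_of_max(y, theta=1.0, n=5, reps=50_000, seed=31):
    """Estimate P(max_i X_i <= y) for iid Unif(0, theta) samples of size n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        m = max(rng.uniform(0.0, theta) for _ in range(n))
        if m <= y:
            hits += 1
    return hits / reps

y, theta, n = 0.8, 1.0, 5
estimate = empirical_cdf_of_max(y, theta, n)
exact = (y / theta) ** n  # the derived cdf, (y / theta)^n
print(f"empirical: {estimate:.4f}, formula: {exact:.4f}")
```

The two numbers should agree to within ordinary Monte Carlo noise, here roughly a couple of decimal places.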

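This implies that the pdf of $M$ is

$$
f_M(y) \equiv \frac{d}{dy}(y/\theta)^n = \frac{n y^{n-1}}{\theta^n}, \quad 0 < y < \theta,
$$

and this implies that

$$
E[M] = \int_0^\theta y f_M(y)\,dy = \int_0^\theta \frac{n y^n}{\theta^n}\,dy = \frac{n\theta}{n+1}.
$$

Whew! This finally shows that $Y_2 = \frac{n+1}{n}\max_{1 \le i \le n} X_i$ is an unbiased estimator for θ! □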
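A quick numerical check of this result, written as a sketch: the sample maximum M should average about nθ/(n+1), so it underestimates θ, while the rescaled estimator (n+1)M/n should average about θ. The helper name `simulate_max_estimators` and the parameter values are assumptions chosen for illustration.

```python
import random
from statistics import fmean

def simulate_max_estimators(n, theta, reps=20_000, seed=2):
    """Return the simulated means of M = max X_i and of Y2 = (n+1)/n * M."""
    rng = random.Random(seed)
    maxima = [max(rng.uniform(0.0, theta) for _ in range(n)) for _ in range(reps)]
    y2_values = [(n + 1) / n * m for m in maxima]
    return fmean(maxima), fmean(y2_values)

# Illustrative values (assumptions for this sketch only).
n, theta = 5, 2.0
mean_m, mean_y2 = simulate_max_estimators(n, theta)
print(f"mean of M : {mean_m:.4f}   n*theta/(n+1) = {n * theta / (n + 1):.4f}")
print(f"mean of Y2: {mean_y2:.4f}   theta         = {theta:.4f}")
```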

Lastly, let’s look at...


“Ugly” Estimator:

$$
Y_3 = \begin{cases} 12\bar{X} & \text{w.p. } 1/2 \\ -8\bar{X} & \text{w.p. } 1/2. \end{cases}
$$

Ha! It’s possible to get a negative estimate for θ, which is strange since θ > 0!

Proof (that it’s unbiased):

$$
E[Y_3] = 12E[\bar{X}] \cdot \tfrac{1}{2} - 8E[\bar{X}] \cdot \tfrac{1}{2} = 2E[\bar{X}] = \theta. \quad \Box
$$

Usually, it’s good for an estimator to be unbiased, but the “ugly” estimator $Y_3$ shows that unbiased estimators can sometimes be goofy.
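To see the goofiness concretely, the sketch below simulates the “ugly” estimator: with probability 1/2 it returns 12 times the sample mean, otherwise −8 times the sample mean. The function name `simulate_y3` and the values of n and θ are illustrative assumptions. The average of the draws should sit near θ even though about half of the individual estimates are negative.

```python
import random
from statistics import fmean

def simulate_y3(n, theta, reps=50_000, seed=3):
    """Draw replications of Y3 = 12*Xbar w.p. 1/2, or -8*Xbar w.p. 1/2."""
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        xbar = sum(rng.uniform(0.0, theta) for _ in range(n)) / n
        coef = 12.0 if rng.random() < 0.5 else -8.0   # fair coin flip
        draws.append(coef * xbar)
    return draws

# Illustrative values (assumptions for this sketch only).
n, theta = 5, 2.0
y3 = simulate_y3(n, theta)
mean_y3 = fmean(y3)
frac_negative = sum(d < 0 for d in y3) / len(y3)
print(f"mean of Y3: {mean_y3:.3f}   (theta = {theta})")
print(f"fraction of negative estimates: {frac_negative:.3f}")
```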

Therefore, let’s look at some other properties an estimator can have.


For instance, consider the variance of an estimator.

Big Example (cont’d): Again suppose that

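$$
X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Unif}(0, \theta).
$$

Recall that both $Y_1 = 2\bar{X}$ and $Y_2 = \frac{n+1}{n}M$ are unbiased for θ.

Let’s find $\text{Var}(Y_1)$ and $\text{Var}(Y_2)$. First,

$$
\text{Var}(Y_1) = 4\,\text{Var}(\bar{X}) = \frac{4}{n} \cdot \text{Var}(X_i) = \frac{4}{n} \cdot \frac{\theta^2}{12} = \frac{\theta^2}{3n}.
$$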


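Meanwhile,

$$
\begin{aligned}
\text{Var}(Y_2) &= \left(\frac{n+1}{n}\right)^2 \text{Var}(M) \\
&= \left(\frac{n+1}{n}\right)^2 E[M^2] - \left(\frac{n+1}{n} \cdot E[M]\right)^2 \\
&= \left(\frac{n+1}{n}\right)^2 \int_0^\theta \frac{n y^{n+1}}{\theta^n}\,dy - \theta^2 \\
&= \theta^2 \cdot \frac{(n+1)^2}{n(n+2)} - \theta^2 = \frac{\theta^2}{n(n+2)} < \frac{\theta^2}{3n} \quad (\text{for } n > 1).
\end{aligned}
$$

Thus, both $Y_1$ and $Y_2$ are unbiased, but $Y_2$ has much lower variance than $Y_1$. We can break the “unbiasedness tie” by choosing $Y_2$. □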
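The variance comparison can be checked empirically. The sketch below computes both estimators, twice the sample mean and (n+1)/n times the sample maximum, on the same simulated samples and compares their sample variances with θ²/(3n) and θ²/(n(n+2)). The function name `compare_variances` and the parameter values are assumptions made for illustration.

```python
import random
from statistics import fmean, pvariance

def compare_variances(n, theta, reps=20_000, seed=4):
    """Return simulated variances of Y1 = 2*Xbar and Y2 = (n+1)/n * max X_i."""
    rng = random.Random(seed)
    y1, y2 = [], []
    for _ in range(reps):
        xs = [rng.uniform(0.0, theta) for _ in range(n)]
        y1.append(2.0 * fmean(xs))
        y2.append((n + 1) / n * max(xs))
    return pvariance(y1), pvariance(y2)

# Illustrative values (assumptions for this sketch only).
n, theta = 10, 2.0
var_y1, var_y2 = compare_variances(n, theta)
print(f"Var(Y1): {var_y1:.4f}   theta^2/(3n)     = {theta**2 / (3 * n):.4f}")
print(f"Var(Y2): {var_y2:.4f}   theta^2/(n(n+2)) = {theta**2 / (n * (n + 2)):.4f}")
```

With n = 10 the theoretical ratio of the two variances is (n+2)/3 = 4, so the gap is easy to see even in a modest simulation.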



Mean Squared Error

Lesson 5.6 — Mean Squared Error

We’ll now talk about a statistical performance measure that combines information about the bias and the variance of an estimator.

Definition: The Mean Squared Error (MSE) of an estimator $T(X)$ of θ is

$$
\text{MSE}(T(X)) \equiv E[(T(X) - \theta)^2].
$$

Before giving an easier interpretation of MSE, define the bias of an estimator for the parameter θ,

$$
\text{Bias}(T(X)) \equiv E[T(X)] - \theta.
$$


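Theorem/Proof: Easier interpretation of MSE.

$$
\begin{aligned}
\text{MSE}(T(X)) &= E[(T(X) - \theta)^2] \\
&= E[T^2] - 2\theta E[T] + \theta^2 \\
&= E[T^2] - (E[T])^2 + (E[T])^2 - 2\theta E[T] + \theta^2 \\
&= \text{Var}(T) + (\underbrace{E[T] - \theta}_{\text{Bias}})^2.
\end{aligned}
$$

So $\text{MSE} = \text{Bias}^2 + \text{Var}$, and thus combines the bias and variance of an estimator. □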
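The decomposition can also be checked numerically. As a sketch, take a biased estimator of θ for Unif(0, θ) data, namely the unscaled sample maximum, whose bias is −θ/(n+1). The helper name `mse_decomposition` and the parameter values are assumptions for illustration. Because the sample analogues satisfy the same algebra as the proof, the simulated MSE matches Bias² + Var up to floating-point rounding.

```python
import random
from statistics import fmean, pvariance

def mse_decomposition(estimates, theta):
    """Return (MSE, bias, variance) computed from replications of an estimator."""
    mse = fmean([(t - theta) ** 2 for t in estimates])
    bias = fmean(estimates) - theta
    var = pvariance(estimates)   # population-style variance, divisor = len(estimates)
    return mse, bias, var

# Illustrative setup (assumptions for this sketch only): T = max of n uniforms.
rng = random.Random(5)
n, theta = 5, 2.0
maxima = [max(rng.uniform(0.0, theta) for _ in range(n)) for _ in range(20_000)]

mse, bias, var = mse_decomposition(maxima, theta)
print(f"MSE          : {mse:.5f}")
print(f"Bias^2 + Var : {bias ** 2 + var:.5f}")
print(f"Bias         : {bias:.4f}   (theory: {-theta / (n + 1):.4f})")
```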


The lower the MSE the better. If $T_1(X)$ and $T_2(X)$ are two estimators of θ, we’d usually prefer the one with the lower MSE, even if it happens to have higher bias.

Definition: The relative efficiency of $T_2(X)$ to $T_1(X)$ is $\text{MSE}(T_1(X))/\text{MSE}(T_2(X))$. If this quantity is < 1, then we’d want $T_1(X)$.

Example: Suppose that estimator A has bias = 3 and variance = 10, while estimator B has bias = −2 and variance = 14. Which estimator (A or B) has the lower mean squared error?

Solution: $\text{MSE} = \text{Bias}^2 + \text{Var}$, so

$$
\text{MSE}(A) = 9 + 10 = 19 \quad \text{and} \quad \text{MSE}(B) = 4 + 14 = 18.
$$

Thus, B has lower MSE. □
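The arithmetic in this example is small enough to script. The sketch below uses two hypothetical helpers, `mse` and `relative_efficiency`, which are names introduced here for illustration. It recomputes both MSEs and the relative efficiency of B to A, using the definition MSE(T1)/MSE(T2) with A playing the role of T1.

```python
def mse(bias, variance):
    """Mean squared error from bias and variance: MSE = Bias^2 + Var."""
    return bias ** 2 + variance

def relative_efficiency(mse_t1, mse_t2):
    """Relative efficiency of T2 to T1, defined as MSE(T1) / MSE(T2)."""
    return mse_t1 / mse_t2

mse_a = mse(bias=3, variance=10)    # 9 + 10
mse_b = mse(bias=-2, variance=14)   # 4 + 14
rel_eff_b_to_a = relative_efficiency(mse_a, mse_b)

print(f"MSE(A) = {mse_a}, MSE(B) = {mse_b}")
print(f"relative efficiency of B to A = {rel_eff_b_to_a:.4f}")
```

The ratio is 19/18, which is greater than 1, so by the rule stated with the definition we would not choose A here. That agrees with the conclusion that B has the lower MSE.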

ISYE 6739 — Goldsman 7/12/20 39 / 74

Page 236: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Mean Squared Error

The lower the MSE the better. If T1(X) and T2(X) are two estimators of θ,we’d usually prefer the one with the lower MSE

— even if it happens to havehigher bias.

Definition: The relative efficiency of T2(X) to T1(X) isMSE(T1(X))/MSE(T2(X)). If this quantity is < 1, then we’d want T1(X).

Example: Suppose that estimator A has bias = 3 and variance = 10, whileestimator B has bias = −2 and variance = 14. Which estimator (A or B) hasthe lower mean squared error?

Solution: MSE = Bias2 + Var, so

MSE(A) = 9 + 10 = 19 and MSE(B) = 4 + 14 = 18.

Thus, B has lower MSE. 2

ISYE 6739 — Goldsman 7/12/20 39 / 74

Page 237: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Mean Squared Error

The lower the MSE the better. If T1(X) and T2(X) are two estimators of θ,we’d usually prefer the one with the lower MSE — even if it happens to havehigher bias.

Definition: The relative efficiency of T2(X) to T1(X) isMSE(T1(X))/MSE(T2(X)). If this quantity is < 1, then we’d want T1(X).

Example: Suppose that estimator A has bias = 3 and variance = 10, whileestimator B has bias = −2 and variance = 14. Which estimator (A or B) hasthe lower mean squared error?

Solution: MSE = Bias2 + Var, so

MSE(A) = 9 + 10 = 19 and MSE(B) = 4 + 14 = 18.

Thus, B has lower MSE. 2

ISYE 6739 — Goldsman 7/12/20 39 / 74

Page 238: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Mean Squared Error

The lower the MSE the better. If T1(X) and T2(X) are two estimators of θ,we’d usually prefer the one with the lower MSE — even if it happens to havehigher bias.

Definition: The relative efficiency of T2(X) to T1(X) isMSE(T1(X))/MSE(T2(X)).

If this quantity is < 1, then we’d want T1(X).

Example: Suppose that estimator A has bias = 3 and variance = 10, whileestimator B has bias = −2 and variance = 14. Which estimator (A or B) hasthe lower mean squared error?

Solution: MSE = Bias2 + Var, so

MSE(A) = 9 + 10 = 19 and MSE(B) = 4 + 14 = 18.

Thus, B has lower MSE. 2

ISYE 6739 — Goldsman 7/12/20 39 / 74

Page 239: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Mean Squared Error

The lower the MSE the better. If T1(X) and T2(X) are two estimators of θ,we’d usually prefer the one with the lower MSE — even if it happens to havehigher bias.

Definition: The relative efficiency of T2(X) to T1(X) isMSE(T1(X))/MSE(T2(X)). If this quantity is < 1, then we’d want T1(X).

Example: Suppose that estimator A has bias = 3 and variance = 10, whileestimator B has bias = −2 and variance = 14. Which estimator (A or B) hasthe lower mean squared error?

Solution: MSE = Bias2 + Var, so

MSE(A) = 9 + 10 = 19 and MSE(B) = 4 + 14 = 18.

Thus, B has lower MSE. 2

ISYE 6739 — Goldsman 7/12/20 39 / 74

Page 240: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Mean Squared Error

The lower the MSE the better. If T1(X) and T2(X) are two estimators of θ,we’d usually prefer the one with the lower MSE — even if it happens to havehigher bias.

Definition: The relative efficiency of T2(X) to T1(X) isMSE(T1(X))/MSE(T2(X)). If this quantity is < 1, then we’d want T1(X).

Example: Suppose that estimator A has bias = 3 and variance = 10,

whileestimator B has bias = −2 and variance = 14. Which estimator (A or B) hasthe lower mean squared error?

Solution: MSE = Bias2 + Var, so

MSE(A) = 9 + 10 = 19 and MSE(B) = 4 + 14 = 18.

Thus, B has lower MSE. 2

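Example: $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Unif}(0, \theta)$.

Two estimators: $Y_1 = 2\bar{X}$ and $Y_2 = \frac{n+1}{n} \max_i X_i$.

Showed before that $E[Y_1] = E[Y_2] = \theta$ (so both estimators are unbiased).

Also, $\mathrm{Var}(Y_1) = \frac{\theta^2}{3n}$ and $\mathrm{Var}(Y_2) = \frac{\theta^2}{n(n+2)}$.

Thus, since the bias is zero and MSE reduces to the variance,

$$\mathrm{MSE}(Y_1) = \frac{\theta^2}{3n} \quad\text{and}\quad \mathrm{MSE}(Y_2) = \frac{\theta^2}{n(n+2)},$$

so $Y_2$ is better (by an order of magnitude in $n$, actually). □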
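A quick Monte Carlo experiment can make this comparison concrete. The sketch below assumes illustrative values ($\theta = 1$, $n = 10$, 20000 replications, a fixed seed) that are not part of the original example; the function name `simulate_mse` is likewise hypothetical.

```python
import random


def simulate_mse(theta: float, n: int, reps: int, seed: int = 0):
    """Estimate MSE(Y1) and MSE(Y2) for Unif(0, theta) samples by simulation."""
    rng = random.Random(seed)
    sq_err_1 = 0.0
    sq_err_2 = 0.0
    for _ in range(reps):
        xs = [rng.uniform(0.0, theta) for _ in range(n)]
        y1 = 2.0 * sum(xs) / n            # Y1 = 2 * sample mean
        y2 = (n + 1) / n * max(xs)        # Y2 = ((n+1)/n) * sample maximum
        sq_err_1 += (y1 - theta) ** 2
        sq_err_2 += (y2 - theta) ** 2
    return sq_err_1 / reps, sq_err_2 / reps


theta, n = 1.0, 10                         # assumed values for illustration
est_mse_1, est_mse_2 = simulate_mse(theta, n, reps=20000)

theory_mse_1 = theta ** 2 / (3 * n)        # theta^2 / (3n)
theory_mse_2 = theta ** 2 / (n * (n + 2))  # theta^2 / (n(n+2))

print(est_mse_1, theory_mse_1)
print(est_mse_2, theory_mse_2)
```

The simulated values should land close to the two formulas, with the gap between the estimators widening as $n$ grows.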

Lesson 5.7 — Maximum Likelihood Estimation

Definition: Consider an iid random sample $X_1, \ldots, X_n$, where each $X_i$ has pmf/pdf $f(x)$. Further, suppose that $\theta$ is some unknown parameter of the distribution of $X_i$.

The likelihood function is $L(\theta) \equiv \prod_{i=1}^n f(x_i)$.

The maximum likelihood estimator (MLE) of $\theta$ is the value of $\theta$ that maximizes $L(\theta)$. The MLE is a function of the $X_i$’s and is a RV.

Remark: We can very informally regard the MLE as the “most likely” estimate of $\theta$.

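Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Exp}(\lambda)$. Find the MLE for $\lambda$.

First of all, the likelihood function is

$$L(\lambda) = \prod_{i=1}^n f(x_i) = \prod_{i=1}^n \lambda e^{-\lambda x_i} = \lambda^n \exp\Big(-\lambda \sum_{i=1}^n x_i\Big).$$

Now maximize $L(\lambda)$ with respect to $\lambda$. Could take the derivative and plow through all of the horrible algebra. Too tedious. Need a trick. . . .

Useful Trick: Since the natural log function is one-to-one and increasing, it’s easy to see that the $\lambda$ that maximizes $L(\lambda)$ also maximizes $\ln(L(\lambda))$!

$$\ln(L(\lambda)) = \ln\Big(\lambda^n \exp\Big(-\lambda \sum_{i=1}^n x_i\Big)\Big) = n \ln(\lambda) - \lambda \sum_{i=1}^n x_i.$$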
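The log trick can also be seen numerically. The sketch below uses made-up data and a coarse grid of candidate $\lambda$ values (both are assumptions for illustration only). It checks that the likelihood and the log-likelihood peak at the same grid point, and shows a practical side benefit: for large samples the raw product underflows to zero in floating point, while the log-likelihood stays finite.

```python
import math


def likelihood(lam: float, xs: list) -> float:
    """L(lambda) = product of lambda * exp(-lambda * x_i)."""
    return math.prod(lam * math.exp(-lam * x) for x in xs)


def log_likelihood(lam: float, xs: list) -> float:
    """ln L(lambda) = n ln(lambda) - lambda * sum(x_i)."""
    return len(xs) * math.log(lam) - lam * sum(xs)


xs_small = [0.5, 1.2, 0.3, 2.0]             # hypothetical observations
grid = [k / 100 for k in range(1, 501)]     # candidate lambdas 0.01, ..., 5.00

# Both criteria select the same grid point.
argmax_l = max(grid, key=lambda lam: likelihood(lam, xs_small))
argmax_logl = max(grid, key=lambda lam: log_likelihood(lam, xs_small))

# With 2000 observations the raw product underflows, the log version does not.
xs_big = xs_small * 500
raw = likelihood(1.0, xs_big)
logl = log_likelihood(1.0, xs_big)

print(argmax_l, argmax_logl, raw, logl)
```

A grid search is used here only to keep the example free of calculus; it is not how the MLE is derived in these notes.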

The trick makes our job less horrible.

$$\frac{d}{d\lambda} \ln(L(\lambda)) = \frac{d}{d\lambda}\Big(n \ln(\lambda) - \lambda \sum_{i=1}^n x_i\Big) = \frac{n}{\lambda} - \sum_{i=1}^n x_i \equiv 0.$$

This implies that the MLE is $\hat{\lambda} = 1/\bar{X}$. □

Remarks:

$\hat{\lambda} = 1/\bar{X}$ makes sense, since $E[X] = 1/\lambda$.

At the end, we put a little hat over $\lambda$ to indicate that this is the MLE. It’s like a party hat!

At the end, we make all of the little $x_i$’s into big $X_i$’s to indicate that this is a random variable.

Just to be careful, you “probably” ought to do a second-derivative test.
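The second-derivative test mentioned in the last remark is short in this case. As a sketch, differentiating the first derivative of the log-likelihood once more gives

$$\frac{d^2}{d\lambda^2} \ln(L(\lambda)) = \frac{d}{d\lambda}\Big(\frac{n}{\lambda} - \sum_{i=1}^n x_i\Big) = -\frac{n}{\lambda^2} < 0 \quad \text{for all } \lambda > 0,$$

so the log-likelihood is concave on $\lambda > 0$, and the stationary point $\lambda = n / \sum_{i=1}^n x_i = 1/\bar{x}$ is indeed a maximum.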

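Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Bern}(p)$. Find the MLE for $p$.

Useful trick for this problem: Since

$$X_i = \begin{cases} 1 & \text{w.p. } p \\ 0 & \text{w.p. } 1-p, \end{cases}$$

we can write the pmf as

$$f(x) = p^x (1-p)^{1-x}, \quad x = 0, 1.$$

Thus, the likelihood function is

$$L(p) = \prod_{i=1}^n f(x_i) = \prod_{i=1}^n p^{x_i}(1-p)^{1-x_i} = p^{\sum_{i=1}^n x_i} (1-p)^{n - \sum_{i=1}^n x_i}.$$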

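This implies that

$$\ln(L(p)) = \sum_{i=1}^n x_i \ln(p) + \Big(n - \sum_{i=1}^n x_i\Big)\ln(1-p)$$

$$\Rightarrow \frac{d}{dp}\ln(L(p)) = \frac{\sum_i x_i}{p} - \frac{n - \sum_i x_i}{1-p} \equiv 0$$

$$\Rightarrow (1-p)\Big(\sum_{i=1}^n x_i\Big) = p\Big(n - \sum_{i=1}^n x_i\Big) \Rightarrow \hat{p} = \bar{X}.$$

This makes sense since $E[X] = p$. □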
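As a quick numerical illustration of the Bernoulli result, the sketch below compares the closed-form MLE (the sample mean) with a crude grid search over the log-likelihood. The data and the function names `bernoulli_log_likelihood` and `bernoulli_mle` are hypothetical, chosen only for this example.

```python
import math

def bernoulli_log_likelihood(p, xs):
    """ln L(p) = (sum x_i) ln p + (n - sum x_i) ln(1 - p), for 0 < p < 1."""
    s = sum(xs)
    n = len(xs)
    return s * math.log(p) + (n - s) * math.log(1.0 - p)

def bernoulli_mle(xs):
    """Closed-form MLE from the derivation: the sample mean."""
    return sum(xs) / len(xs)

# Hypothetical sample of 10 Bernoulli trials (7 successes).
xs = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
p_hat = bernoulli_mle(xs)

# Independent check: evaluate ln L(p) on a grid inside (0, 1) and take the best p.
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, xs))

print(p_hat, p_grid)
```

The grid search is only a sanity check; in practice the closed form is all that is needed here.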



Lesson 5.8 — Trickier MLE Examples

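Example: $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Nor}(\mu, \sigma^2)$. Get simultaneous MLEs for $\mu$ and $\sigma^2$.

$$L(\mu, \sigma^2) = \prod_{i=1}^n f(x_i) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{-\frac{1}{2}\frac{(x_i-\mu)^2}{\sigma^2}\Big\} = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big\{-\frac{1}{2}\sum_{i=1}^n \frac{(x_i-\mu)^2}{\sigma^2}\Big\}.$$

$$\Rightarrow \ln(L(\mu, \sigma^2)) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2$$

$$\Rightarrow \frac{\partial}{\partial\mu}\ln(L(\mu, \sigma^2)) = \frac{1}{\sigma^2}\sum_{i=1}^n (x_i-\mu) \equiv 0,$$

and so $\hat{\mu} = \bar{X}$.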
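One step left implicit above is why this stationary point is a maximum in $\mu$. A short check, holding any $\sigma^2 > 0$ fixed, could go as follows:

```latex
\begin{align*}
\sum_{i=1}^n (x_i - \mu) = 0
  &\;\Rightarrow\; \mu = \frac{1}{n}\sum_{i=1}^n x_i = \bar{x}, \\
\frac{\partial^2}{\partial\mu^2}\,\ln(L(\mu,\sigma^2))
  &= \frac{\partial}{\partial\mu}\left[\frac{1}{\sigma^2}\sum_{i=1}^n (x_i-\mu)\right]
   = -\frac{n}{\sigma^2} < 0.
\end{align*}
```

So the log-likelihood is strictly concave in $\mu$, and the sample mean is its unique maximizer whatever the value of $\sigma^2$.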


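Similarly, take the partial with respect to $\sigma^2$ (not $\sigma$),

$$\frac{\partial}{\partial\sigma^2}\ln(L(\mu, \sigma^2)) = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^n (x_i-\mu)^2 \equiv 0,$$

and, after plugging in $\hat{\mu} = \bar{X}$, eventually get

$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2.$$ □

Remark: Notice how close $\hat{\sigma}^2$ is to the (unbiased) sample variance,

$$S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2 = \frac{n}{n-1}\hat{\sigma}^2.$$

$\hat{\sigma}^2$ is a little bit biased, but it has slightly less variance than $S^2$. Anyway, as $n$ gets big, $S^2$ and $\hat{\sigma}^2$ become essentially the same, since $n/(n-1) \to 1$.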
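The relationship between the two variance estimators can be checked numerically. The sketch below computes the normal MLEs on a small made-up sample and compares them with the standard library's `statistics.variance` (which divides by $n-1$). The data and the helper name `normal_mles` are assumptions for illustration. For reference, the size of the bias is $E[\hat{\sigma}^2] = \frac{n-1}{n}\sigma^2$.

```python
import statistics

def normal_mles(xs):
    """Return (mu_hat, sigma2_hat), the MLEs for an iid normal sample."""
    n = len(xs)
    mu_hat = sum(xs) / n
    # The MLE of sigma^2 divides by n, not n - 1.
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, sigma2_hat

# Hypothetical data, chosen only for illustration.
xs = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]
n = len(xs)

mu_hat, sigma2_hat = normal_mles(xs)
s2 = statistics.variance(xs)  # unbiased sample variance S^2 (divides by n - 1)

print(mu_hat, sigma2_hat, s2)
# Check the identity S^2 = n / (n - 1) * sigma2_hat up to rounding error.
print(abs(s2 - n / (n - 1) * sigma2_hat) < 1e-12)
```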


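Example: The pdf of the Gamma distribution w/parameters $r$ and $\lambda$ is

$$f(x) = \frac{\lambda^r}{\Gamma(r)}\, x^{r-1} e^{-\lambda x}, \quad x > 0.$$

Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Gam}(r, \lambda)$. Find the MLEs for $r$ and $\lambda$.

$$L(r, \lambda) = \prod_{i=1}^n f(x_i) = \frac{\lambda^{nr}}{[\Gamma(r)]^n}\Big(\prod_{i=1}^n x_i\Big)^{r-1} e^{-\lambda\sum_i x_i}$$

$$\Rightarrow \ln(L) = rn\,\ln(\lambda) - n\,\ln(\Gamma(r)) + (r-1)\ln\Big(\prod_{i=1}^n x_i\Big) - \lambda\sum_{i=1}^n x_i$$

$$\Rightarrow \frac{\partial}{\partial\lambda}\ln(L) = \frac{rn}{\lambda} - \sum_{i=1}^n x_i \equiv 0,$$

so that $\hat{\lambda} = \hat{r}/\bar{X}$.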
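As a sanity check (an aside, assuming $r$ is known and equal to 1), the Gamma pdf reduces to the exponential pdf because $\Gamma(1) = 1$, and the same derivative recovers the familiar exponential MLE:

```latex
\begin{align*}
r = 1:\quad f(x) &= \lambda e^{-\lambda x}, \quad x > 0, \\
\frac{\partial}{\partial\lambda}\ln(L) &= \frac{n}{\lambda} - \sum_{i=1}^n x_i \equiv 0
\quad\Rightarrow\quad \hat{\lambda} = \frac{1}{\bar{X}}.
\end{align*}
```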


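The Trouble in River City is, we need to find $r$. To do so, we have

$$\begin{aligned}
\frac{\partial}{\partial r}\ln(L) &= \frac{\partial}{\partial r}\Big[rn\,\ln(\lambda) - n\,\ln(\Gamma(r)) + (r-1)\ln\Big(\prod_{i=1}^n x_i\Big) - \lambda\sum_{i=1}^n x_i\Big] \\
&= n\,\ln(\lambda) - \frac{n}{\Gamma(r)}\frac{d}{dr}\Gamma(r) + \ln\Big(\prod_{i=1}^n x_i\Big) \\
&= n\,\ln(\lambda) - n\Psi(r) + \ln\Big(\prod_{i=1}^n x_i\Big) \equiv 0,
\end{aligned}$$

where $\Psi(r) \equiv \Gamma'(r)/\Gamma(r)$ is the digamma function.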
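Since $\Psi(r) = \frac{d}{dr}\ln\Gamma(r)$, one convenient way to evaluate it with only the Python standard library is a central difference of `math.lgamma`. This is a sketch of one possible approximation, not the only option. The name `digamma_approx` and the step size `h` are assumptions made for this example.

```python
import math

def digamma_approx(r, h=1e-5):
    """Approximate Psi(r) = d/dr ln Gamma(r) by a central difference of math.lgamma.

    Assumes r > h so that both evaluation points are positive.
    """
    return (math.lgamma(r + h) - math.lgamma(r - h)) / (2.0 * h)

# Known reference value: Psi(1) = -gamma, where gamma ~ 0.5772156649
# is the Euler-Mascheroni constant.
print(digamma_approx(1.0))

# The recurrence Psi(r + 1) = Psi(r) + 1/r gives a second check.
print(digamma_approx(3.5) - digamma_approx(2.5), 1.0 / 2.5)
```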


At this point, substitute in $\hat{\lambda} = \hat{r}/\bar{X}$, and use a computer method (bisection, Newton's method, etc.) to search for the value of $r$ that solves

$$n\,\ln(r/\bar{X}) - n\Psi(r) + \ln\Big(\prod_{i=1}^n x_i\Big) \equiv 0.$$

The gamma function is readily available in any reasonable software package; but if the digamma function happens to be unavailable in your town, you can take advantage of the approximation

$$\Gamma'(r) \doteq \frac{\Gamma(r+h) - \Gamma(r)}{h} \quad \text{(for any small } h \text{ of your choosing).}$$ □
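The search described above can be sketched with plain bisection, using the forward-difference approximation of $\Gamma'(r)$ in place of a library digamma function. The data, the function names `digamma_fd` and `gamma_mle`, the step `h`, and the starting bracket are all assumptions for illustration. The sketch relies on the fact that $\ln r - \Psi(r)$ is strictly decreasing in $r$, so the equation has a single root when the observations are not all equal.

```python
import math

def digamma_fd(r, h=1e-6):
    """Psi(r) ~ [Gamma(r + h) - Gamma(r)] / (h * Gamma(r)): forward difference for Gamma'(r).

    math.gamma overflows for r above roughly 171, so this is for moderate r only.
    """
    g = math.gamma(r)
    return (math.gamma(r + h) - g) / (h * g)

def gamma_mle(xs, tol=1e-8):
    """Return (r_hat, lambda_hat) for an iid Gamma sample via bisection.

    Assumes the x_i are positive, not all equal, and that the root exceeds 0.01.
    """
    n = len(xs)
    xbar = sum(xs) / n
    sum_log = sum(math.log(x) for x in xs)  # equals ln(prod x_i), computed stably

    def score(r):
        # n ln(r / xbar) - n Psi(r) + ln(prod x_i); decreasing in r
        return n * math.log(r / xbar) - n * digamma_fd(r) + sum_log

    lo, hi = 1e-2, 1.0
    while score(hi) > 0.0:      # expand the bracket until the sign changes
        hi *= 2.0
    while hi - lo > tol:        # standard bisection
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    r_hat = 0.5 * (lo + hi)
    return r_hat, r_hat / xbar  # lambda_hat = r_hat / xbar

# Hypothetical positive data, chosen only for illustration.
xs = [2.1, 0.7, 3.5, 1.2, 4.8, 2.9, 0.9, 1.6]
r_hat, lam_hat = gamma_mle(xs)
print(r_hat, lam_hat)
```

Bisection is used here because it needs only the sign of the score function; Newton's method would converge faster but also needs a derivative of $\Psi$.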


Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Unif}(0, \theta)$. Find the MLE for $\theta$.

The pdf is $f(x) = 1/\theta$, $0 < x < \theta$ (beware of the funny limits). Then

$$L(\theta) = \prod_{i=1}^n f(x_i) = 1/\theta^n \quad \text{if } 0 \le x_i \le \theta,\ \forall i,$$

and $L(\theta) = 0$ otherwise.

In order to have $L(\theta) > 0$, we must have $0 \le x_i \le \theta$, $\forall i$. In other words, we must have $\theta \ge \max_i x_i$.

Subject to this constraint, $L(\theta) = 1/\theta^n$ is maximized at the smallest possible $\theta$ value, namely, $\hat{\theta} = \max_i X_i$.

This makes sense in light of the similar (unbiased) estimator, $Y_2 = \frac{n+1}{n}\max_i X_i$, from a previous lesson. □

Remark: We used very little calculus in this example!
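A small simulation can illustrate how the MLE compares with the unbiased estimator $Y_2$. Since $E[\max_i X_i] = \frac{n}{n+1}\theta$, the MLE tends to undershoot $\theta$ slightly, while $Y_2$ corrects for this. The values of `theta`, `n`, `reps`, and the seed below are arbitrary assumptions, as are the function names.

```python
import random

def unif_mle(xs):
    """MLE of theta for Unif(0, theta): the sample maximum."""
    return max(xs)

def unif_unbiased(xs):
    """Bias-corrected estimator Y2 = (n + 1) / n * max_i X_i."""
    n = len(xs)
    return (n + 1) / n * max(xs)

random.seed(6739)                 # arbitrary seed, for reproducibility only
theta, n, reps = 10.0, 5, 20000   # hypothetical settings

mle_avg = 0.0
y2_avg = 0.0
for _ in range(reps):
    xs = [random.uniform(0.0, theta) for _ in range(n)]
    mle_avg += unif_mle(xs) / reps
    y2_avg += unif_unbiased(xs) / reps

# mle_avg should land near n / (n + 1) * theta, and y2_avg near theta.
print(mle_avg, y2_avg)
```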


Invariance Property of MLEs

Lesson 5.9 — Invariance Property of MLEs

We can get MLEs of functions of parameters almost for free!

Theorem (Invariance Property): If $\hat\theta$ is the MLE of some parameter $\theta$ and $h(\cdot)$ is any reasonable function, then $h(\hat\theta)$ is the MLE of $h(\theta)$.

Remark: We noted before that such a property does not hold for unbiasedness. For instance, although $E[S^2] = \sigma^2$, it is usually the case that $E[\sqrt{S^2}] \ne \sigma$.

Remark: The proof of the Invariance Property is “easy” when $h(\cdot)$ is a one-to-one function. It’s not so easy (but still generally true) when $h(\cdot)$ is nastier.
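To make the unbiasedness remark concrete, here is a small Python sketch under an assumed toy setup that is not from the lesson: $n = 2$ observations from Bern(1/2), so that all four outcomes are equally likely and the expectations can be computed exactly by enumeration. The helper name `sample_variance` is illustrative.

```python
from itertools import product
from math import sqrt


def sample_variance(xs):
    """Unbiased sample variance S^2 (divisor n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


# Toy setup (an assumption for illustration): n = 2 draws from Bern(1/2),
# so the outcomes (0,0), (0,1), (1,0), (1,1) each have probability 1/4.
outcomes = list(product([0, 1], repeat=2))
prob = 1 / len(outcomes)

sigma2 = 0.5 * (1 - 0.5)   # true variance p(1 - p)
sigma = sqrt(sigma2)       # true standard deviation

expected_s2 = sum(prob * sample_variance(x) for x in outcomes)
expected_s = sum(prob * sqrt(sample_variance(x)) for x in outcomes)

print("E[S^2] =", expected_s2, " sigma^2 =", sigma2)   # these agree
print("E[S]   =", expected_s, " sigma   =", sigma)     # these do not
```

In this toy case $E[S^2]$ equals $\sigma^2$ exactly, while $E[\sqrt{S^2}]$ falls below $\sigma$, which is the failure of "invariance" for unbiasedness that the remark describes.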

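Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Nor}(\mu, \sigma^2)$.

We saw that the MLE for $\sigma^2$ is $\widehat{\sigma^2} = \frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2$.

If we consider the function $h(y) = +\sqrt{y}$, then the Invariance Property says that the MLE of $\sigma$ is

$$\hat\sigma = \sqrt{\widehat{\sigma^2}} = \sqrt{\frac{1}{n} \sum_{i=1}^n (X_i - \bar X)^2}. \qquad \Box$$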
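As a rough numerical check of this example, the Python sketch below computes $\widehat{\sigma^2}$ and $\hat\sigma$ for a small made-up data set and compares the normal log-likelihood at $\hat\sigma$ with nearby values of $\sigma$. The data and the function names are assumptions for illustration only.

```python
from math import log, pi, sqrt


def normal_sigma2_mle(xs):
    """MLE of sigma^2: average squared deviation from the sample mean (divisor n)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / n


def normal_sigma_mle(xs):
    """MLE of sigma via invariance: square root of the MLE of sigma^2."""
    return sqrt(normal_sigma2_mle(xs))


def normal_loglik(mu, sigma, xs):
    """Log-likelihood of iid Nor(mu, sigma^2) observations."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -n * log(sigma) - 0.5 * n * log(2 * pi) - ss / (2 * sigma ** 2)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical observations
mu_hat = sum(data) / len(data)
sigma_hat = normal_sigma_mle(data)

print("sigma^2 MLE:", normal_sigma2_mle(data))
print("sigma MLE:  ", sigma_hat)
for s in (0.95 * sigma_hat, sigma_hat, 1.05 * sigma_hat):
    print(s, normal_loglik(mu_hat, s, data))
```

For this data set the sample mean is 5, the MLE of $\sigma^2$ is 4, and the MLE of $\sigma$ is 2; the log-likelihood is lower at the two neighboring values of $\sigma$.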

Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Bern}(p)$.

We saw that the MLE for $p$ is $\hat p = \bar X$. Then Invariance says that the MLE for $\text{Var}(X_i) = p(1-p)$ is $\hat p(1-\hat p) = \bar X(1 - \bar X)$. □
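A short Python sketch of the Bernoulli case, using made-up 0/1 data and an illustrative function name:

```python
def bernoulli_variance_mle(xs):
    """MLE of Var(X_i) = p(1 - p) via invariance: plug in p_hat = sample mean."""
    p_hat = sum(xs) / len(xs)
    return p_hat * (1 - p_hat)


trials = [1, 0, 0, 1, 1, 0, 1, 1, 0, 1]   # hypothetical data: 6 successes in 10 trials
print("p_hat:", sum(trials) / len(trials))
print("MLE of p(1 - p):", bernoulli_variance_mle(trials))
```

Here $\hat p = 0.6$, so the plug-in estimate of the variance is $0.6 \times 0.4 = 0.24$ (up to floating-point rounding).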

Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \text{Exp}(\lambda)$.

We define the survival function as

$$\bar F(x) = P(X > x) = 1 - F(x) = e^{-\lambda x}.$$

In addition, we saw that the MLE for $\lambda$ is $\hat\lambda = 1/\bar X$.

Then Invariance says that the MLE of $\bar F(x)$ is

$$\hat{\bar F}(x) = e^{-\hat\lambda x} = e^{-x/\bar X}.$$

This kind of thing is used all of the time in the actuarial sciences. □
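The plug-in step can be sketched in a few lines of Python. The lifetimes below are invented for illustration, and the function names are not from the lesson.

```python
from math import exp


def exp_rate_mle(xs):
    """MLE of the exponential rate lambda: 1 / (sample mean)."""
    return len(xs) / sum(xs)


def exp_survival_mle(x, xs):
    """MLE of the survival function P(X > x) via invariance: exp(-lambda_hat * x)."""
    return exp(-exp_rate_mle(xs) * x)


lifetimes = [1.0, 3.0, 2.0, 4.0, 0.5, 1.5]   # hypothetical lifetimes, sample mean = 2
print("lambda_hat:", exp_rate_mle(lifetimes))
for x in (0.0, 1.0, 2.0, 5.0):
    print(f"estimated P(X > {x}) =", exp_survival_mle(x, lifetimes))
```

With a sample mean of 2, $\hat\lambda = 0.5$ and the estimated survival probability at $x = 2$ is $e^{-1}$.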

Method of Moments Estimation

Lesson 5.10 — Method of Moments Estimation

Recall that the $k$th moment of a random variable $X$ is

$$\mu_k \equiv E[X^k] = \begin{cases} \sum_x x^k f(x) & \text{if $X$ is discrete} \\ \int_{\mathbb{R}} x^k f(x)\, dx & \text{if $X$ is continuous.} \end{cases}$$

Definition: Suppose $X_1, \ldots, X_n$ are iid random variables. Then the method of moments (MoM) estimator for $\mu_k$ is $m_k \equiv \sum_{i=1}^n X_i^k / n$.

Remark: As $n \to \infty$, the Law of Large Numbers implies that $\sum_{i=1}^n X_i^k / n \to E[X^k]$, i.e., $m_k \to \mu_k$ (so this is a good estimator).
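A minimal Python sketch of the sample moment $m_k$ follows. As an assumed illustration of the Law of Large Numbers remark, it draws Unif(0,1) samples (for which $E[X^2] = 1/3$) with a fixed seed and prints $m_2$ for increasing $n$. The function name and the choice of distribution are not from the lesson.

```python
import random


def sample_moment(xs, k):
    """k-th sample moment m_k = (1/n) * sum of x_i**k."""
    return sum(x ** k for x in xs) / len(xs)


# Illustration (assumed setup): X ~ Unif(0, 1), so the true second moment is 1/3.
rng = random.Random(0)   # fixed seed so the run is repeatable
m2_by_n = {
    n: sample_moment([rng.random() for _ in range(n)], 2)
    for n in (10, 1_000, 100_000)
}

for n, m2 in m2_by_n.items():
    print(f"n = {n:>6}: m_2 = {m2:.4f}  (true value 1/3)")
```

The estimates should settle near $1/3$ as $n$ grows, though the exact printed values depend on the random number generator.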

Remark: You should always love your MoM!

Examples:

The MoM estimator for the true mean $\mu_1 = \mu = E[X_i]$ is the sample mean $m_1 = \bar X = \sum_{i=1}^n X_i / n$.

The MoM estimator for $\mu_2 = E[X_i^2]$ is $m_2 = \sum_{i=1}^n X_i^2 / n$.

The MoM estimator for $\text{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = \mu_2 - \mu_1^2$ is

$$m_2 - m_1^2 = \frac{1}{n} \sum_{i=1}^n X_i^2 - \bar X^2 = \frac{n-1}{n} S^2.$$

(For large $n$, it’s also OK to use $S^2$.)

General Game Plan: Express the parameter of interest in terms of the true moments $\mu_k = E[X^k]$. Then substitute in the sample moments $m_k$.
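The identity $m_2 - m_1^2 = \frac{n-1}{n} S^2$ is easy to confirm on a small data set. The Python sketch below uses made-up numbers and illustrative function names; it follows the game plan by writing the variance in terms of $\mu_1$ and $\mu_2$ and substituting the sample moments.

```python
def mom_variance(xs):
    """MoM estimator of Var(X): m_2 - m_1**2."""
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x * x for x in xs) / n
    return m2 - m1 ** 2


def sample_variance(xs):
    """Usual sample variance S^2 (divisor n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical observations
n = len(data)

print("m_2 - m_1^2    =", mom_variance(data))
print("(n - 1)/n * S^2 =", (n - 1) / n * sample_variance(data))
```

Both quantities equal 4 for this data set (up to floating-point rounding), while $S^2 = 32/7$ is slightly larger, as the factor $\frac{n-1}{n}$ suggests.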

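Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Pois}(\lambda)$.

Since $\lambda = E[X_i]$, a MoM estimator for $\lambda$ is $\bar{X}$.

But also note that $\lambda = \mathrm{Var}(X_i)$, so another MoM estimator for $\lambda$ is $\frac{n-1}{n}S^2$ (or plain old $S^2$). □

Usually use the easier-looking estimator if you have a choice.

Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Nor}(\mu, \sigma^2)$.

MoM estimators for $\mu$ and $\sigma^2$ are $\bar{X}$ and $\frac{n-1}{n}S^2$ (or $S^2$), respectively.

For this example, the estimators $\bar{X}$ and $\frac{n-1}{n}S^2$ are the same as the MLEs. □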
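Returning to the Poisson example, the two candidate MoM estimators for $\lambda$ can be computed side by side. This is a minimal sketch on hypothetical count data (the values in `counts` are invented for illustration); the two estimates generally differ on any finite sample.

```python
import statistics

counts = [3, 1, 4, 2, 0, 5, 2, 3, 1, 4]  # hypothetical Poisson-like counts
n = len(counts)

# Estimator 1: lambda = E[X_i], so use the sample mean.
lam_from_mean = statistics.mean(counts)

# Estimator 2: lambda = Var(X_i), so use (n - 1)/n * S^2.
lam_from_var = (n - 1) / n * statistics.variance(counts)

print(lam_from_mean, lam_from_var)
```

Both are legitimate MoM estimators; the sample mean is the easier-looking one.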

Let’s finish up with a less-trivial example. . . .


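Example: Suppose $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Beta}(a, b)$. The pdf is

$$f(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1}(1-x)^{b-1}, \quad 0 < x < 1.$$

It turns out (after lots of algebra) that

$$E[X] = \frac{a}{a+b} \quad\text{and}\quad \mathrm{Var}(X) = \frac{ab}{(a+b)^2(a+b+1)}.$$

Let’s estimate $a$ and $b$ via MoM.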


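We have

$$E[X] = \frac{a}{a+b} \;\Rightarrow\; a = \frac{b\,E[X]}{1 - E[X]} \doteq \frac{b\bar{X}}{1 - \bar{X}}, \qquad (1)$$

so

$$\mathrm{Var}(X) = \frac{ab}{(a+b)^2(a+b+1)} = \frac{E[X]\,b}{(a+b)(a+b+1)}.$$

Plug into the above $\bar{X}$ for $E[X]$, $S^2$ for $\mathrm{Var}(X)$, and $\frac{b\bar{X}}{1-\bar{X}}$ for $a$. Then after lots of algebra, we can solve for $b$:

$$b \doteq \frac{(1 - \bar{X})^2\,\bar{X}}{S^2} - 1 + \bar{X}.$$

To finish up, you can plug back into Equation (1) to get the MoM estimator for $a$.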
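For readers who want the "lots of algebra" spelled out, here is one possible route. It is a sketch that uses only the substitutions named above ($\bar{X}$ for $E[X]$, $S^2$ for $\mathrm{Var}(X)$, and Equation (1) for $a$).

```latex
\begin{align*}
a + b &\doteq \frac{b\bar{X}}{1-\bar{X}} + b = \frac{b}{1-\bar{X}}, \\[4pt]
S^2 &\doteq \frac{\bar{X}\,b}{\dfrac{b}{1-\bar{X}}\left(\dfrac{b}{1-\bar{X}} + 1\right)}
     = \frac{\bar{X}(1-\bar{X})^2}{b + 1 - \bar{X}}, \\[4pt]
b + 1 - \bar{X} &\doteq \frac{\bar{X}(1-\bar{X})^2}{S^2}
\;\Rightarrow\;
b \doteq \frac{(1-\bar{X})^2\,\bar{X}}{S^2} - 1 + \bar{X}.
\end{align*}
```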


Example: Consider the following data set consisting of $n = 10$ observations that we have obtained from a Beta distribution.

0.86 0.77 0.84 0.38 0.83 0.54 0.77 0.94 0.37 0.40

We immediately have $\bar{X} = 0.67$ and $S^2 = 0.04971$. Then the MoM estimators are

$$b \doteq \frac{(1 - \bar{X})^2\,\bar{X}}{S^2} - 1 + \bar{X} = 1.1377,$$

and then

$$a \doteq \frac{b\bar{X}}{1 - \bar{X}} = 2.310.$$ □
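The Beta calculation can be reproduced in a few lines. The sketch below applies the two MoM formulas to the ten observations listed in the example; the function name `beta_mom` is a hypothetical label chosen for illustration.

```python
import statistics

# The n = 10 observations from the example.
data = [0.86, 0.77, 0.84, 0.38, 0.83, 0.54, 0.77, 0.94, 0.37, 0.40]

def beta_mom(xs):
    """Return MoM estimates (a_hat, b_hat) for Beta(a, b); hypothetical helper."""
    xbar = statistics.mean(xs)       # sample mean
    s2 = statistics.variance(xs)     # sample variance S^2 (divides by n - 1)
    b_hat = (1 - xbar) ** 2 * xbar / s2 - 1 + xbar
    a_hat = b_hat * xbar / (1 - xbar)   # Equation (1)
    return a_hat, b_hat

a_hat, b_hat = beta_mom(data)
print(round(a_hat, 3), round(b_hat, 4))
```

The results should match the slide values up to rounding in the intermediate quantities $\bar{X}$ and $S^2$.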


Sampling Distributions


Introduction and Normal Distribution

Goal: Talk about some distributions we’ll need later to do “confidence intervals” (CIs) and “hypothesis tests”: Normal, $\chi^2$, $t$, and $F$.

Definition: Recall that a statistic is just a function of the observations $X_1, \ldots, X_n$ from a random sample. The function does not depend explicitly on any unknown parameters.

Example: $\bar{X}$ and $S^2$ are statistics, but $(\bar{X} - \mu)/\sigma$ is not.

Since statistics are RV’s, it’s useful to figure out their distributions. The distribution of a statistic is called a sampling distribution.

Example: $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Nor}(\mu, \sigma^2) \Rightarrow \bar{X} \sim \mathrm{Nor}(\mu, \sigma^2/n)$.

The normal is used to get CIs and do hypothesis tests for $\mu$.
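The sampling distribution of $\bar{X}$ can be illustrated by simulation. The sketch below draws many normal samples and checks that the sample means have mean near $\mu$ and variance near $\sigma^2/n$. The parameter values, the seed, and the number of replications are arbitrary assumptions made for the illustration.

```python
import random
import statistics

random.seed(6739)              # arbitrary seed so the run is repeatable
mu, sigma, n = 10.0, 2.0, 25   # hypothetical parameter values
reps = 20000                   # number of simulated samples

xbars = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbars.append(sum(sample) / n)   # one realization of the statistic X-bar

mean_of_xbars = statistics.mean(xbars)
var_of_xbars = statistics.variance(xbars)

print(mean_of_xbars, mu)               # should be close to mu
print(var_of_xbars, sigma ** 2 / n)    # should be close to sigma^2 / n
```

A histogram of `xbars` would look roughly bell-shaped and much narrower than the distribution of the individual observations.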


$\chi^2$ Distribution

Definition/Theorem: If $Z_1, \ldots, Z_k \overset{\text{iid}}{\sim} \mathrm{Nor}(0, 1)$, then $Y \equiv \sum_{i=1}^k Z_i^2$ has the chi-squared distribution with $k$ degrees of freedom (df), and we write $Y \sim \chi^2(k)$.

The term “df” informally corresponds to the number of “independent pieces of information” you have. For example, if you have RV’s $X_1, \ldots, X_n$ such that $\sum_{i=1}^n X_i = c$, a known constant, then you might have $n-1$ df, since knowledge of any $n-1$ of the $X_i$’s gives you the remaining $X_i$.

We also informally “lose” a degree of freedom every time we have to estimate a parameter. For instance, if we have access to $n$ observations, but have to estimate two parameters $\mu$ and $\sigma^2$, then we might only end up with $n-2$ df.

In reality, df corresponds to the number of dimensions of a certain space (not covered in this course)!


The pdf of the chi-squared distribution is

$$f_Y(y) = \frac{1}{2^{k/2}\,\Gamma\!\left(\frac{k}{2}\right)}\, y^{\frac{k}{2}-1} e^{-y/2}, \quad y > 0.$$

Fun Facts: Can show that $E[Y] = k$, and $\mathrm{Var}(Y) = 2k$.
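The two moments can be sanity-checked by building chi-squared draws directly from the definition, as sums of squared standard normals. This is a simulation sketch only; the helper name `simulate_chi_squared` and the choice k = 5 are assumptions made for illustration.

```python
import random
import statistics


def simulate_chi_squared(k, reps, seed=6739):
    """Return `reps` draws of Y = Z_1^2 + ... + Z_k^2 with Z_i iid Nor(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(reps)]


k = 5  # illustrative (assumed) degrees of freedom
draws = simulate_chi_squared(k, reps=40000)

# Theory: E[Y] = k and Var(Y) = 2k.
print("sample mean    :", statistics.fmean(draws), "| theory:", k)
print("sample variance:", statistics.variance(draws), "| theory:", 2 * k)
```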

The exponential distribution is a special case of the chi-squared distribution. In fact, $\chi^2(2) \sim \mathrm{Exp}(1/2)$.

Proof: Just plug $k = 2$ into the pdf. □
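For readers who want the substitution spelled out, here is the one-line calculation (a worked step added for clarity, using $\Gamma(1) = 1$ and writing the exponential pdf as $\lambda e^{-\lambda y}$):

```latex
f_Y(y) \;=\; \frac{1}{2^{2/2}\,\Gamma(1)}\; y^{\frac{2}{2}-1} e^{-y/2}
       \;=\; \tfrac{1}{2}\, e^{-y/2}, \qquad y > 0,
```

which matches $\lambda e^{-\lambda y}$ with rate $\lambda = 1/2$.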

For $k > 2$, the $\chi^2(k)$ pdf is skewed to the right. (You get an occasional “large” observation.)

For large $k$, the $\chi^2(k)$ is approximately normal (by the CLT).


Definition: The $(1-\alpha)$ quantile of a RV $X$ is that value $x_\alpha$ such that $P(X > x_\alpha) = 1 - F(x_\alpha) = \alpha$. Note that $x_\alpha = F^{-1}(1-\alpha)$, where $F^{-1}(\cdot)$ is the inverse cdf of $X$.

Notation: If $Y \sim \chi^2(k)$, then we denote the $(1-\alpha)$ quantile with the special symbol $\chi^2_{\alpha,k}$ (instead of $x_\alpha$). In other words, $P(Y > \chi^2_{\alpha,k}) = \alpha$.

You can look up $\chi^2_{\alpha,k}$, e.g., in a table at the back of the book or via the Excel function CHISQ.INV$(1-\alpha, k)$.

Example: If $Y \sim \chi^2(10)$, then

$$P(Y > \chi^2_{0.05,10}) = 0.05,$$

where we can look up $\chi^2_{0.05,10} = 18.31$. □
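A quantile lookup can also be done numerically. The sketch below is an illustration rather than the course's method: it assumes an even number of degrees of freedom, for which the upper tail has the closed form $P(Y > y) = e^{-y/2}\sum_{j=0}^{k/2-1} (y/2)^j / j!$, and then solves $P(Y > q) = \alpha$ by bisection. The function names `chi2_upper_tail` and `chi2_quantile` are made up for this example.

```python
import math


def chi2_upper_tail(y, k):
    """P(Y > y) for Y ~ chi-squared(k); this sketch handles even k only.

    For even k the tail equals exp(-y/2) * sum_{j=0}^{k/2-1} (y/2)^j / j!.
    """
    if k <= 0 or k % 2 != 0:
        raise ValueError("this sketch only handles positive even k")
    if y <= 0:
        return 1.0
    half = y / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(k // 2))


def chi2_quantile(alpha, k, tol=1e-10):
    """Return the value q with P(Y > q) = alpha, found by bisection."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, k) > alpha:  # widen the bracket until the tail is small enough
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail(mid, k) > alpha:
            lo = mid  # tail still too heavy, so the quantile lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0


# Compare with the table value 18.31 quoted in the example.
print(round(chi2_quantile(0.05, 10), 2))
```

Bisection works here because the upper tail probability is continuous and strictly decreasing in $y$, so the bracket always contains exactly one solution.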


Theorem: $\chi^2$’s add up. If $Y_1, \ldots, Y_n$ are independent with $Y_i \sim \chi^2(d_i)$, for all $i$, then $\sum_{i=1}^n Y_i \sim \chi^2\!\left(\sum_{i=1}^n d_i\right)$.

Proof: Just use mgf’s. Won’t go thru it here. □
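For completeness, a sketch of the mgf argument (added here as an outline, using the standard fact that the mgf of a $\chi^2(d)$ RV is $(1-2t)^{-d/2}$ for $t < 1/2$, together with independence):

```latex
M_{\sum_{i=1}^n Y_i}(t)
  \;=\; \prod_{i=1}^n M_{Y_i}(t)
  \;=\; \prod_{i=1}^n (1-2t)^{-d_i/2}
  \;=\; (1-2t)^{-\frac{1}{2}\sum_{i=1}^n d_i},
  \qquad t < \tfrac{1}{2}.
```

The right-hand side is the mgf of a $\chi^2\!\left(\sum_{i=1}^n d_i\right)$ RV, and an mgf that exists in a neighborhood of zero determines the distribution.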

So where does the $\chi^2$ distribution come up in statistics?

It usually arises when we try to estimate $\sigma^2$.

Example: If $X_1, \ldots, X_n \overset{\text{iid}}{\sim} \mathrm{Nor}(\mu, \sigma^2)$, then, as we’ll show in the next module,

$$S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2 \sim \frac{\sigma^2 \chi^2(n-1)}{n-1}. \quad \Box$$
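Equivalently, $(n-1)S^2/\sigma^2 \sim \chi^2(n-1)$, which should therefore have mean $n-1$ and variance $2(n-1)$. The simulation sketch below checks those two moments; it is illustrative only, and the helper name `scaled_sample_variances` and the parameter values (µ = 10, σ = 3, n = 8) are assumptions.

```python
import random
import statistics


def scaled_sample_variances(mu, sigma, n, reps, seed=6739):
    """Return `reps` draws of (n - 1) * S^2 / sigma^2 from Nor(mu, sigma^2) samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    out = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # (n - 1) * S^2 is just the sum of squared deviations from the sample mean.
        sum_sq_dev = sum((x - xbar) ** 2 for x in sample)
        out.append(sum_sq_dev / sigma**2)
    return out


n = 8  # illustrative (assumed) sample size
w = scaled_sample_variances(mu=10.0, sigma=3.0, n=n, reps=20000)

# Theory: chi-squared(n - 1) has mean n - 1 and variance 2(n - 1).
print("sample mean    :", statistics.fmean(w), "| theory:", n - 1)
print("sample variance:", statistics.variance(w), "| theory:", 2 * (n - 1))
```

Note that the mean matches $n-1$ rather than $n$, in line with the informal "lose one df for estimating µ" heuristic.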


t Distribution

Definition/Theorem: Suppose that $Z \sim \mathrm{Nor}(0, 1)$, $Y \sim \chi^2(k)$, and $Z$ and $Y$ are independent. Then $T \equiv Z/\sqrt{Y/k}$ has the Student t distribution with $k$ degrees of freedom, and we write $T \sim t(k)$.

The pdf is

$$f_T(x) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\sqrt{\pi k}\;\Gamma\!\left(\frac{k}{2}\right)} \left(\frac{x^2}{k} + 1\right)^{-\frac{k+1}{2}}, \quad x \in \mathbb{R}.$$

Fun Facts: The $t(k)$ looks like the Nor(0,1), except the $t$ has fatter tails.

The $k = 1$ case gives the Cauchy distribution, which has really fat tails.

As the degrees of freedom $k$ becomes large, $t(k) \to \mathrm{Nor}(0, 1)$.

Can show that $E[T] = 0$ for $k > 1$, and $\mathrm{Var}(T) = \frac{k}{k-2}$ for $k > 2$.
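Several of these facts can be read off the pdf numerically. The sketch below (an illustration, with function names chosen for this example) evaluates the t pdf and compares it with the standard Cauchy pdf $1/(\pi(1+x^2))$ at $k = 1$ and with the standard normal pdf for a large $k$.

```python
import math


def t_pdf(x, k):
    """Density of the t distribution with k degrees of freedom at x."""
    # Work on the log scale; lgamma avoids overflow in the gamma functions for large k.
    log_const = (
        math.lgamma((k + 1) / 2)
        - math.lgamma(k / 2)
        - 0.5 * math.log(math.pi * k)
    )
    return math.exp(log_const - (k + 1) / 2 * math.log1p(x * x / k))


def cauchy_pdf(x):
    """Density of the standard Cauchy distribution."""
    return 1.0 / (math.pi * (1.0 + x * x))


def std_normal_pdf(x):
    """Density of Nor(0, 1)."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


for x in (0.0, 1.0, 3.0):
    print(
        f"x={x}: t(1)={t_pdf(x, 1):.6f}, Cauchy={cauchy_pdf(x):.6f}, "
        f"t(5)={t_pdf(x, 5):.6f}, t(1000)={t_pdf(x, 1000):.6f}, "
        f"Nor(0,1)={std_normal_pdf(x):.6f}"
    )
```

At $x = 3$ the $t(5)$ density is noticeably larger than the normal density, which is the "fatter tails" remark in numerical form.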

ISYE 6739 — Goldsman 7/12/20 71 / 74

Page 464: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Sampling Distributions

t Distribution

Definition/Theorem: Suppose that Z ∼ Nor(0, 1), Y ∼ χ2(k), and Z andY are independent.

Then T ≡ Z/√Y/k has the Student t distribution

with k degrees of freedom, and we write T ∼ t(k).

The pdf is

fT (x) =Γ(k+1

2

)√πk Γ

(k2

)(x2

k+ 1)− k+1

2, x ∈ R.

Fun Facts: The t(k) looks like the Nor(0,1), except the t has fatter tails.

The k = 1 case gives the Cauchy distribution, which has really fat tails.

As the degrees of freedom k becomes large, t(k)→ Nor(0, 1).

Can show that E[T ] = 0 for k > 1, and Var(T ) = kk−2 for k > 2.

ISYE 6739 — Goldsman 7/12/20 71 / 74

Page 465: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Sampling Distributions

t Distribution

Definition/Theorem: Suppose that Z ∼ Nor(0, 1), Y ∼ χ2(k), and Z andY are independent. Then T ≡ Z/

√Y/k has the Student t distribution

with k degrees of freedom,

and we write T ∼ t(k).

The pdf is

fT (x) =Γ(k+1

2

)√πk Γ

(k2

)(x2

k+ 1)− k+1

2, x ∈ R.

Fun Facts: The t(k) looks like the Nor(0,1), except the t has fatter tails.

The k = 1 case gives the Cauchy distribution, which has really fat tails.

As the degrees of freedom k becomes large, t(k)→ Nor(0, 1).

Can show that E[T ] = 0 for k > 1, and Var(T ) = kk−2 for k > 2.

ISYE 6739 — Goldsman 7/12/20 71 / 74

Page 466: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Sampling Distributions

t Distribution

Definition/Theorem: Suppose that Z ∼ Nor(0, 1), Y ∼ χ2(k), and Z andY are independent. Then T ≡ Z/

√Y/k has the Student t distribution

with k degrees of freedom, and we write T ∼ t(k).

The pdf is

fT (x) =Γ(k+1

2

)√πk Γ

(k2

)(x2

k+ 1)− k+1

2, x ∈ R.

Fun Facts: The t(k) looks like the Nor(0,1), except the t has fatter tails.

The k = 1 case gives the Cauchy distribution, which has really fat tails.

As the degrees of freedom k becomes large, t(k)→ Nor(0, 1).

Can show that E[T ] = 0 for k > 1, and Var(T ) = kk−2 for k > 2.

ISYE 6739 — Goldsman 7/12/20 71 / 74

Page 467: myblue5. Getting Started with Statisticssman/courses/6739/6739-05-StatsBasics-200… · Coke vs. Pepsi. The effect of cigarette smoking on the probability of getting cancer. The effect

Sampling Distributions

t Distribution

Definition/Theorem: Suppose that Z ∼ Nor(0, 1), Y ∼ χ2(k), and Z andY are independent. Then T ≡ Z/

√Y/k has the Student t distribution

with k degrees of freedom, and we write T ∼ t(k).

The pdf is

fT (x) =Γ(k+1

2

)√πk Γ

(k2

)(x2

k+ 1)− k+1

2, x ∈ R.

Fun Facts: The t(k) looks like the Nor(0,1), except the t has fatter tails.

The k = 1 case gives the Cauchy distribution, which has really fat tails.

As the degrees of freedom k becomes large, t(k)→ Nor(0, 1).

Can show that E[T ] = 0 for k > 1, and Var(T ) = kk−2 for k > 2.

Notation: If $T \sim t(k)$, then we denote the $(1-\alpha)$ quantile by $t_{\alpha,k}$. In other words, $P(T > t_{\alpha,k}) = \alpha$.

Example: If $T \sim t(10)$, then $P(T > t_{0.05,10}) = 0.05$, where we find $t_{0.05,10} = 1.812$ in the back of the book or via the Excel function T.INV(1 − α, k). □

Remarks: So what do we use the t distribution for in statistics?

It’s used when we find confidence intervals and conduct hypothesis tests for the mean µ. Stay tuned.

By the way, why did I originally call it the Student t distribution?

“Student” is the pseudonym of the guy (William Gosset) who first derived it. Gosset was a statistician at the Guinness Brewery.
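If neither the tables nor Excel are handy, the quantile in the example can be recovered from the pdf alone. The sketch below is only an illustration under stated assumptions: the helper names `t_upper_tail` and `t_quantile` are hypothetical, the tail probability is approximated with Simpson's rule, and the search interval [0, 100] is an arbitrary choice that suits moderate α and k.

```python
import math

def t_pdf(x, k):
    """Density of t(k) at x."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(math.pi * k))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))

def t_upper_tail(c, k, steps=2000):
    """P(T > c) for c >= 0: by symmetry, 0.5 minus the area over [0, c].

    The area is approximated with Simpson's rule (steps must be even).
    """
    h = c / steps
    total = t_pdf(0.0, k) + t_pdf(c, k)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, k)
    return 0.5 - total * h / 3

def t_quantile(alpha, k, lo=0.0, hi=100.0):
    """Bisection for t_{alpha,k}, where P(T > t_{alpha,k}) = alpha, 0 < alpha < 0.5."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, k) > alpha:
            lo = mid   # tail still too heavy, so the quantile lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(round(t_quantile(0.05, 10), 3))
```

With α = 0.05 and k = 10 this should agree with the tabled value 1.812 to three decimals.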

F Distribution

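Definition/Theorem: Suppose that $X \sim \chi^2(n)$, $Y \sim \chi^2(m)$, and $X$ and $Y$ are independent. Then $F \equiv \frac{X/n}{Y/m} = mX/(nY)$ has the F distribution with $n$ and $m$ df, denoted $F \sim F(n, m)$.

The pdf is

$$f_F(x) = \frac{\Gamma\left(\frac{n+m}{2}\right)}{\Gamma\left(\frac{n}{2}\right)\Gamma\left(\frac{m}{2}\right)} \left(\frac{n}{m}\right)^{\frac{n}{2}} \frac{x^{\frac{n}{2}-1}}{\left(\frac{n}{m}x + 1\right)^{\frac{n+m}{2}}}, \quad x > 0.$$

Fun Facts:

The $F(n, m)$ is usually a bit skewed to the right.

Note that you have to specify two df’s.

Can show that $E[F] = m/(m-2)$ for $m > 2$, and $\mathrm{Var}(F)$ = blech.

The t distribution is a special case; can you figure out which?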
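One possible reading of the closing question (an assumption on our part, since the slide leaves it as an exercise) is that the square of a t(k) random variable has the F(1, k) distribution. If that is right, the two pdfs must satisfy $f_F(x) = f_T(\sqrt{x})/\sqrt{x}$ for $x > 0$ when $n = 1$ and $m = k$. The Python sketch below, with illustrative function names, checks that identity numerically at a few points.

```python
import math

def t_pdf(x, k):
    """Density of t(k) at x."""
    log_c = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
             - 0.5 * math.log(math.pi * k))
    return math.exp(log_c - (k + 1) / 2 * math.log1p(x * x / k))

def f_pdf(x, n, m):
    """Density of F(n, m) at x > 0, coded from the pdf formula on this slide."""
    log_c = (math.lgamma((n + m) / 2) - math.lgamma(n / 2) - math.lgamma(m / 2)
             + (n / 2) * math.log(n / m))
    return math.exp(log_c + (n / 2 - 1) * math.log(x)
                    - (n + m) / 2 * math.log1p(n * x / m))

if __name__ == "__main__":
    k = 7
    for x in (0.3, 1.0, 4.0):
        # Density of T^2 obtained by change of variables from the t(k) density.
        t_squared_density = t_pdf(math.sqrt(x), k) / math.sqrt(x)
        print(x, f_pdf(x, 1, k), t_squared_density)
```

The two printed columns should match up to floating-point rounding.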

Notation: If $F \sim F(n, m)$, then we denote the $(1-\alpha)$ quantile by $F_{\alpha,n,m}$. That is, $P(F > F_{\alpha,n,m}) = \alpha$.

Tables can be found in the back of the book for various α, n, m, or you can use the Excel function F.INV(1 − α, n, m).

Example: If $F \sim F(5, 10)$, then $P(F > F_{0.05,5,10}) = 0.05$, where we find $F_{0.05,5,10} = 3.326$. □

Remarks: It can be shown that $F_{1-\alpha,m,n} = 1/F_{\alpha,n,m}$. Use this fact if you have to find something like $F_{0.95,10,5} = 1/F_{0.05,5,10} = 1/3.326 \approx 0.301$.

So what do we use the F distribution for in statistics?

It’s used when we find confidence intervals and conduct hypothesis tests for the ratio of variances from two different processes. Details later.
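The reciprocal relationship in the remarks can be checked numerically. The Python sketch below is an illustration only: the helper names `f_cdf` and `f_quantile` are hypothetical, the cdf is approximated with a midpoint rule, and the search interval [0, 50] is an assumed bound that comfortably covers the example values.

```python
import math

def f_pdf(x, n, m):
    """Density of F(n, m) at x > 0."""
    log_c = (math.lgamma((n + m) / 2) - math.lgamma(n / 2) - math.lgamma(m / 2)
             + (n / 2) * math.log(n / m))
    return math.exp(log_c + (n / 2 - 1) * math.log(x)
                    - (n + m) / 2 * math.log1p(n * x / m))

def f_cdf(c, n, m, steps=2000):
    """P(F <= c) by the midpoint rule, which never evaluates the pdf at x = 0."""
    h = c / steps
    return h * sum(f_pdf((i + 0.5) * h, n, m) for i in range(steps))

def f_quantile(alpha, n, m, lo=0.0, hi=50.0):
    """Bisection for F_{alpha,n,m}, where P(F > F_{alpha,n,m}) = alpha."""
    for _ in range(50):
        mid = (lo + hi) / 2
        if 1 - f_cdf(mid, n, m) > alpha:
            lo = mid   # upper tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    upper = f_quantile(0.05, 5, 10)   # F_{0.05,5,10}
    lower = f_quantile(0.95, 10, 5)   # F_{0.95,10,5}
    print(round(upper, 3), round(lower, 4), round(1 / upper, 4))
```

The first number should reproduce the tabled 3.326, and the last two should agree with each other, as the identity $F_{1-\alpha,m,n} = 1/F_{\alpha,n,m}$ predicts.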
