ENGR 610 Applied Statistics Fall 2007 - Week 2 Marshall University CITE Jack Smith

Post on 17-Jan-2016

ENGR 610 Applied Statistics

Fall 2007 - Week 2

Marshall University

CITE

Jack Smith

Overview for Today
- Homework problems 1.25, 2.54, 2.55
- Review of Ch 3
- Homework problems 3.27, 3.31
- Probability and Discrete Probability Distributions (Ch 4)
- Homework assignment

Homework problems

1.25 2.54 2.55

Chapter 3 Review
- Measures of…
  - Central Tendency
  - Variation
  - Shape
    - Skewness
    - Kurtosis
- Box-and-Whisker Plots

Measures of Central Tendency
- Mean (arithmetic)
  - Average value: $\frac{1}{N}\sum_{i=1}^{N} X_i$
- Median
  - Middle value - 50th percentile (2nd quartile)
- Mode
  - Most popular (peak) value(s) - can be multi-modal
- Midrange
  - (Max+Min)/2
- Midhinge
  - (Q3+Q1)/2 - average of 1st and 3rd quartiles
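As a rough illustration of the five measures on this slide, the sketch below computes them with Python's standard `statistics` module. The data values and the helper name `central_tendency` are made up for this example. The quartiles use `method="exclusive"`, which interpolates linearly around position k(N+1)/4; the textbook's rounding rules for quartiles may give slightly different values.

```python
import statistics

def central_tendency(data):
    """Return the measures of central tendency listed on the slide."""
    ordered = sorted(data)
    # Assumption: "exclusive" quantiles (position k(N+1)/4 with linear
    # interpolation) stand in for the textbook's quartile rules.
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "modes": statistics.multimode(ordered),   # a list: can be multi-modal
        "midrange": (ordered[-1] + ordered[0]) / 2,
        "midhinge": (q3 + q1) / 2,
    }

sample = [2, 4, 4, 5, 7, 9, 10]  # hypothetical observations
summary = central_tendency(sample)
print(summary)
```

For this sample the median is 5, the only mode is 4, the midrange is 6 and the midhinge is 6.5.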

Measures of Variation
- Range (max-min)
- Inter-Quartile Range (Q3-Q1)
- Variance
  - Sum of squares (SS) of the deviations from the mean divided by the degrees of freedom (df) - see pp 113-5
    - df = N, for the whole population
    - df = n-1, for a sample
  - 2nd moment about the mean (dispersion) (1st moment about the mean is zero!)
- Standard Deviation
  - Square root of variance (same units as variable)
- Sample ($s^2$, $s$, $n$) vs Population ($\sigma^2$, $\sigma$, $N$)
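The difference between the two choices of df can be seen in a small sketch. The function name `variance` and the data are assumptions made for this example; the calculation is just SS divided by df as described on the slide.

```python
import math

def variance(data, sample=True):
    """Sum of squared deviations from the mean, divided by df."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)   # sum of squares (SS)
    df = n - 1 if sample else n               # n-1 for a sample, N for a population
    return ss / df

values = [2, 4, 4, 4, 5, 5, 7, 9]            # hypothetical observations
pop_var = variance(values, sample=False)     # SS / N
samp_var = variance(values, sample=True)     # SS / (n-1)
pop_sd = math.sqrt(pop_var)                  # same units as the variable
print(pop_var, samp_var, pop_sd)
```

Here SS = 32, so the population variance is 32/8 = 4 (standard deviation 2), while the sample variance is 32/7.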

Quantiles
- Equipartitions of ranked array of observations
  - Percentiles - 100
  - Deciles - 10
  - Quartiles - 4 (25%, 50%, 75%)
  - Median - 2
- $P_n$ = n(N+1)/100 -th ordered observation
- $D_n$ = n(N+1)/10 -th ordered observation
- $Q_n$ = n(N+1)/4 -th ordered observation
- Median = (N+1)/2 -th ordered observation = $Q_2$ = $D_5$ = $P_{50}$
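The position rule can be sketched in a few lines. The helper names `ordered_position` and `quantile` are hypothetical, and the linear interpolation used when the position falls between two observations is an assumption of this sketch; the textbook may round the position instead.

```python
def ordered_position(k, parts, n_obs):
    """1-based position k(N+1)/parts of the k-th quantile in the ranked array."""
    return k * (n_obs + 1) / parts

def quantile(data, k, parts):
    """k-th of `parts` quantiles, e.g. parts=4 for quartiles, 100 for percentiles."""
    ordered = sorted(data)
    n_obs = len(ordered)
    # Clamp so the position always lies inside the ranked array.
    pos = min(max(ordered_position(k, parts, n_obs), 1), n_obs)
    lower = int(pos)          # index of the observation at or below the position
    frac = pos - lower
    if frac == 0:
        return ordered[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring observations.
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

data = list(range(1, 10))     # hypothetical ranked observations 1..9 (N = 9)
print(quantile(data, 1, 4), quantile(data, 2, 4), quantile(data, 3, 4))
```

With N = 9 the median sits at position (9+1)/2 = 5, and `quantile(data, 2, 4)`, `quantile(data, 5, 10)` and `quantile(data, 50, 100)` all return the same value, matching Median = Q2 = D5 = P50.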

Measures of Shape
- Symmetry
  - Skewness - extended tail in one direction
  - 3rd moment about the mean
- Kurtosis
  - Flatness, peakedness
    - Leptokurtic - highly peaked, long tails
    - Mesokurtic - “normal”, triangular, short tails
    - Platykurtic - broad, even
  - 4th moment about the mean
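A minimal sketch of the moment-based definitions follows. The function names are made up for this example, and the normalization (dividing the 3rd and 4th central moments by powers of the 2nd) is one common convention; statistical packages often apply sample-size corrections or report "excess" kurtosis instead, so their numbers may differ.

```python
def central_moment(data, order):
    """Average of (x - mean)**order over the data (moment about the mean)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** order for x in data) / n

def skewness(data):
    """Standardized 3rd moment: positive for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def kurtosis(data):
    """Standardized 4th moment: about 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

symmetric = [1, 2, 3, 4, 5]        # hypothetical data, no skew
right_tail = [1, 1, 1, 2, 10]      # hypothetical data, extended right tail
print(skewness(symmetric), skewness(right_tail), kurtosis(symmetric))
```

The symmetric data give a skewness of 0, while the data with one large value give a positive skewness.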

Box-and-Whisker Plots
- Graphical representation of five-number summary
  - Min, Max (full range)
  - Q1, Q3 (middle 50%)
  - Median (50th %-ile)
- Shows symmetry (skewness) of distribution
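The five numbers behind a box-and-whisker plot can be computed as in the sketch below. The helper name and data are assumptions for illustration, and the quartiles come from `statistics.quantiles` with `method="exclusive"`, which may not match the textbook's quartile rules exactly.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for the observations."""
    ordered = sorted(data)
    # Assumption: "exclusive" quantiles approximate the n(N+1)/4 position rule.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, median, q3, ordered[-1]

low, q1, median, q3, high = five_number_summary([2, 4, 4, 5, 7, 9, 10])
# A longer upper half of the box (Q3 - median > median - Q1) hints at right skew.
print(low, q1, median, q3, high, (q3 - median) > (median - q1))
```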

Other Resources
- SPSS Tutorial at Statistical Consulting Services
  - http://www.stats-consult.com/tutorials.html
- MathWorld
  - http://mathworld.wolfram.com
  - See Probability and Statistics
- Wikipedia
  - http://en.wikipedia.org/wiki/Category:Probability_and_statistics

Homework Problems

3.27 3.31

Chapter 4
- Probability
  - Introduction to Probability
  - Rules of Probability
- Discrete Probability Distributions
  - Probability Distributions
  - Binomial Distribution
  - Poisson Distribution
  - Hypergeometric, Negative Binomial, Geometric Distributions

Introduction to Probability
- Probability - numeric value representing the chance, likelihood, or possibility that an event will occur
  - Classical, theoretical
  - Empirical
  - Subjective
- Elementary event - a distinct individual outcome
- Event - a set of elementary events
- Joint event - defined by two or more characteristics

Rules of Probability
1. A probability P(A) for event A is between 0 (null event) and 1 (certain event)
2. The complement of event A (not-A) is the event that A will not occur, and P(not-A) = 1 - P(A)
3. Two events are mutually exclusive if P(A and B) = 0
4. If two events are mutually exclusive, then P(A or B) = P(A) + P(B)
5. If a set of events is mutually exclusive and collectively exhaustive, then $\sum_i P(A_i) = 1$

Rules of Probability
6. If two events are not mutually exclusive, then P(A or B) = P(A) + P(B) - P(A and B), where P(A and B) is the joint probability of A and B.
7. The conditional probability of B occurring, given that A has occurred, is given by P(B|A) = P(A and B)/P(A)
8. If two events are independent, then P(A and B) = P(A) x P(B), and P(A) = P(A|B) and P(B) = P(B|A)
9. If two events are not independent, then P(A and B) = P(A) x P(B|A)
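The rules can be checked on a small classical example. The sketch below assumes one roll of a fair die, with events A (even) and B (greater than 3) chosen purely for illustration, and uses exact fractions so the identities hold without rounding.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}      # elementary events: one roll of a fair die

def prob(event):
    """Classical probability: favourable outcomes over all outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}                      # hypothetical event: roll is even
B = {4, 5, 6}                      # hypothetical event: roll is greater than 3

p_a, p_b = prob(A), prob(B)
p_not_a = prob(outcomes - A)       # rule 2: complement
p_a_and_b = prob(A & B)            # joint probability
p_a_or_b = prob(A | B)             # rule 6: general addition rule
p_b_given_a = p_a_and_b / p_a      # rule 7: conditional probability
independent = p_a_and_b == p_a * p_b   # rule 8 test

print(p_a_or_b, p_b_given_a, independent)
```

Here P(A and B) = 1/3 while P(A) x P(B) = 1/4, so A and B are not independent, and rule 9 gives P(A and B) = P(A) x P(B|A) = 1/2 x 2/3.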

Probability Distributions
- A probability distribution for a discrete random variable is the complete set of all possible distinct outcomes and their probabilities of occurring, whose sum is 1.
- The expected value of a discrete random variable is its weighted average over all possible values, where the weights are given by the probability distribution.

$\mu_X = E(X) = \sum_i X_i P(X_i)$

Probability Distributions
- The variance of a discrete random variable is the weighted average of the squared difference between each possible outcome and the mean ($\mu_X$) over all possible values, where the weights (frequencies) are given by the probability distribution.
- The standard deviation ($\sigma_X$) is then the square root of the variance.

$\sigma_X^2 = \sum_i (X_i - \mu_X)^2 P(X_i)$
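As a worked example of the expected value and variance of a discrete random variable, the sketch below uses a fair six-sided die. The distribution is stored as a dictionary mapping each outcome to its probability; the function names and the choice of a die are assumptions made for illustration.

```python
from fractions import Fraction
import math

def expected_value(dist):
    """Probability-weighted average of the outcomes."""
    return sum(x * p for x, p in dist.items())

def dist_variance(dist):
    """Probability-weighted average of squared deviations from the mean."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}   # fair die (illustrative)
total = sum(die.values())          # probabilities must sum to 1
mu = expected_value(die)
var = dist_variance(die)
sd = math.sqrt(var)                # standard deviation
print(total, mu, var, sd)
```

For the die the expected value is 7/2 and the variance is 35/12.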

Binomial Distribution
- Each elementary event is a Bernoulli event, with one of two mutually exclusive and collectively exhaustive possible outcomes.
- The probability of “success” (p) is constant from trial to trial, and the probability of “failure” is 1-p.
- The outcome for each trial is independent of any other trial.
- The probability of obtaining x successes in n trials, with a constant probability of success p, is given by:

$P(X = x \mid n, p) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$

Binomial Distribution, cont’d
- Binomial coefficients follow Pascal’s Triangle

        1
       1 1
      1 2 1
     1 3 3 1

- Distribution nearly bell-shaped for large n and p=1/2.
- Skewed right (positive) for p<1/2, and left (negative) for p>1/2
- Mean ($\mu$) = np
- Variance ($\sigma^2$) = np(1-p)
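The binomial formula and its mean and variance can be checked numerically. This is a minimal sketch: the function names and the values n = 4, p = 0.5 are chosen only for illustration, and `math.comb` supplies the binomial coefficient n!/(x!(n-x)!).

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x | n, p) for x successes in n independent Bernoulli trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def pascal_row(n):
    """Row n of Pascal's Triangle: the binomial coefficients for n trials."""
    return [math.comb(n, x) for x in range(n + 1)]

n, p = 4, 0.5                                   # hypothetical parameters
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(pmf))    # should equal np
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))  # should equal np(1-p)
print(pascal_row(3), pmf, mean, var)
```

`pascal_row(3)` reproduces the last row shown on the slide, 1 3 3 1, and the computed mean and variance agree with np = 2 and np(1-p) = 1.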

Poisson Distribution
- Probability for a particular number of discrete events over a continuous interval (area of opportunity)
- Assumes a Poisson process (“isolable” event)
- Limit case of Binomial distribution for large n (and small p, with np = $\lambda$ held fixed)
- Based only on expectation value ($\lambda$)

$P(X = x \mid \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}$

Poisson Distribution, cont’d
- Mean ($\mu$) = variance ($\sigma^2$) = $\lambda$
- Right-skewed, but approaches symmetric bell-shape as $\lambda$ gets large
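The sketch below evaluates the Poisson probabilities and compares them with a binomial distribution that has many trials and the same expected count. The value of 2 for the expected count, the 10,000 trials and the function names are assumptions chosen for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with expected count lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def binomial_pmf(x, n, p):
    """P(X = x | n, p) for the binomial distribution."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

lam = 2.0            # hypothetical expected number of events per interval
n = 10_000           # many trials, each with small probability lam / n

# Largest gap between the two distributions over x = 0..9.
gap = max(abs(poisson_pmf(x, lam) - binomial_pmf(x, n, lam / n)) for x in range(10))

# Mean and variance, summed far enough into the tail to be effectively complete.
mean = sum(x * poisson_pmf(x, lam) for x in range(50))
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in range(50))
print(gap, mean, var)
```

The gap is small, which is the sense in which the Poisson is a limiting case of the binomial, and both the mean and the variance come out equal to the expected count.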

Other Discrete Probability Distributions
- Hypergeometric (pp 159-160)
  - Bernoulli events, but selected from finite population without replacement
  - p = A/N, where A = number of successes in population N
  - Approaches binomial for n < 5% of N
- Negative Binomial (pp 162-163)
  - Number of trials (n) until xth success
  - Binomial with last trial constrained to be a success
- Geometric (pp 164-165)
  - Special case of negative binomial for x = 1 (1st success)
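The three distributions can be written out as short functions using their standard textbook forms. The function names and argument order are assumptions of this sketch; the symbols follow the slide (A successes in a population of N, sample of n, xth success).

```python
import math

def hypergeometric_pmf(x, n, A, N):
    """P(x successes in a sample of n drawn without replacement from N,
    where the population contains A successes)."""
    return math.comb(A, x) * math.comb(N - A, n - x) / math.comb(N, n)

def negative_binomial_pmf(n, x, p):
    """P(the x-th success occurs on trial n): x-1 successes in the first
    n-1 trials, then a success on the last trial."""
    return math.comb(n - 1, x - 1) * p**x * (1 - p) ** (n - x)

def geometric_pmf(n, p):
    """P(the first success occurs on trial n)."""
    return p * (1 - p) ** (n - 1)

print(hypergeometric_pmf(1, 2, 2, 4))      # 1 success when drawing 2 of 4 items
print(negative_binomial_pmf(3, 2, 0.5))    # 2nd success on the 3rd trial
print(geometric_pmf(3, 0.5))               # 1st success on the 3rd trial
```

Setting x = 1 in `negative_binomial_pmf` gives the same value as `geometric_pmf`, which is the special case noted on the slide.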

Cumulative Probabilities

P(X<x) = P(X=0) + P(X=1) +…+ P(X=x-1)

P(X>x) = P(X=x+1) + P(X=x+2) +…+ P(X=n)
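Cumulative probabilities are just sums of individual probabilities. The sketch below assumes a binomial variable, whose possible values run from 0 to n; the function names and the values n = 4, p = 0.5, x = 2 are chosen for illustration.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x | n, p) for the binomial distribution."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def prob_less_than(x, n, p):
    """P(X < x): sum of P(X = k) for k = 0 .. x-1."""
    return sum(binomial_pmf(k, n, p) for k in range(0, x))

def prob_greater_than(x, n, p):
    """P(X > x): sum of P(X = k) for k = x+1 .. n."""
    return sum(binomial_pmf(k, n, p) for k in range(x + 1, n + 1))

n, p, x = 4, 0.5, 2                       # hypothetical parameters
below = prob_less_than(x, n, p)           # P(X=0) + P(X=1)
above = prob_greater_than(x, n, p)        # P(X=3) + P(X=4)
print(below, binomial_pmf(x, n, p), above)
```

The three pieces P(X<x), P(X=x) and P(X>x) add up to 1, which gives a quick check on any hand calculation.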

Homework
- Ch 4
  - Appendix 4.1
  - Problems: 4.57, 4.60, 4.61, 4.64
- Read Ch 5
  - Continuous Probability Distributions and Sampling Distributions