Engineering Statistics Study Sheet


Upload: ryan

Post on 14-Apr-2018


Page 1: Engineering Statistics Study Sheet

7/27/2019 Engineering Statistics Study Sheet

http://slidepdf.com/reader/full/engineering-statistics-study-sheet 1/2

Population: The set of all possible observations of interest to the problem at hand.

Sample: The part of the population from which we collect information.

Factor: the explanatory variable(s), often with multiple levels (e.g., if temperature is the factor, levels could be 50 and 75)

Treatment/Treatment Combination: a specific combination of the levels of each factor.

Example: 2 factors

Pressure (4 atm, 5 atm, 6 atm)

Temperature (50 and 75)

How many treatment combinations? 3 pressures × 2 temperatures = 6.
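The count of treatment combinations is just the product of the numbers of levels of each factor. A quick sketch, enumerating the pairs from the example above directly:

```python
from itertools import product

pressures = [4, 5, 6]      # atm
temperatures = [50, 75]

# Each treatment combination is one (pressure, temperature) pair.
treatments = list(product(pressures, temperatures))
print(len(treatments))  # 3 * 2 = 6
```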

Categorical or Continuous: Categorical factors have a limited number of levels; continuous factors do not.

Experimental Unit (EU): The smallest unit to which we apply a treatment combination.

Observational Unit (OU): The unit upon which we take the measurement.

Sampling Techniques: Simple Random Sample (SRS): every possible sample of n observations from the population has the same chance of being selected.

Stratified Random Sample: when there are several groups or strata (grouping them can remove some of the variability), a simple random sample is performed on each in order to get at least one representative from each group. Also useful for comparing strata.

Systematic Random Sample: randomly select an item within the first m items. Thereafter, sample every m-th item. Lends itself well to high-speed part manufacturing.
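A minimal sketch of simple random and systematic sampling with Python's random module; the population of 100 numbered parts and the sample sizes are made-up values for illustration:

```python
import random

population = list(range(1, 101))  # hypothetical: 100 numbered parts

# Simple random sample: every possible sample of size n is equally likely.
srs = random.sample(population, 10)

# Systematic random sample: random start within the first m items,
# then every m-th item thereafter.
m = 10
start = random.randrange(m)
systematic = population[start::m]

print(sorted(srs), systematic)
```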

Pairing sampling units: Pairing sampling units allows us to remove the unit-to-unit variability and focus on the real issues – useful when the sampling units differ widely.

Types of Designs: Completely Randomized Design (CRD)- A design which randomly allocates all treatment combinations to the EUs. Each EU has the same chance of receiving any treatment combination.

Randomized Complete Block Design (RCBD)- A design whose blocks contain a single observation on each treatment.

Stemplots: Pros: can evaluate the “shape” of the data; don’t lose original data; can suggest natural groupings

Cons: loses time order of data; cumbersome to construct by hand with very large data sets

Subset: A ⊂ B means every outcome in A is also in B.

Complement: A′ is the set of outcomes not in A; P(A′) = 1 - P(A).

Union: A ∪ B is the set of outcomes in A or B (or both).

Intersection: A ∩ B is the set of outcomes in both A and B.

Mutually Exclusive: A and B share no outcomes: A ∩ B = ∅, so P(A ∩ B) = 0.

Conditional Probability: P(A|B) = P(A ∩ B) / P(B).

Dependent Probability: P(A|B) ≠ P(A); the occurrence of B changes the probability of A.

Independent Probability: P(A ∩ B) = P(A)P(B), equivalently P(A|B) = P(A).

Calculate P(A) = 1 - P(A′) if P(A) is difficult to find directly.

Addition Rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

Mutually Exclusive Addition: P(A ∪ B) = P(A) + P(B)

Law of Total Probability: P(A) = Σᵢ P(A|Bᵢ)P(Bᵢ)

Bayes’ Rule: P(B|A) = P(A|B)P(B) / P(A) = P(A|B)P(B) / Σᵢ P(A|Bᵢ)P(Bᵢ)
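A numeric check of the total-probability and Bayes' rule formulas; the two machines and their defect rates are made-up numbers for illustration:

```python
# Hypothetical: a part comes from machine B1 (60%) or B2 (40%),
# and the two machines have different defect rates.
p_B = {"B1": 0.6, "B2": 0.4}
p_A_given_B = {"B1": 0.01, "B2": 0.05}  # P(defective | machine)

# Law of Total Probability: P(A) = sum over i of P(A|Bi) P(Bi)
p_A = sum(p_A_given_B[b] * p_B[b] for b in p_B)

# Bayes' Rule: P(B1|A) = P(A|B1) P(B1) / P(A)
p_B1_given_A = p_A_given_B["B1"] * p_B["B1"] / p_A
print(round(p_A, 4), round(p_B1_given_A, 4))  # 0.026 0.2308
```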

Cumulative Distribution Function (CDF): F(x) = P(X ≤ x).

Discrete random variables can assume, at most, a countable set of numbers.

Continuous random variables can assume any possible real value.

The distinction between a cdf (big F) and a pmf (little f) is that the pmf gives you the probability of one particular point, P(X = x), whereas the cdf gives you the probability of that point plus everything before it, P(X ≤ x).

In addition to a CDF, every continuous r.v. has a probability density function (PDF): f(x) = F′(x).

If we know the pdf of X, then we can find the cdf by integrating: F(x) = ∫ from -∞ to x of f(t) dt. This implies that for some interval [a,b], we have that P(a ≤ X ≤ b) = F(b) - F(a) = ∫ from a to b of f(x) dx.

We define the expected value or the population mean of X (denoted µ = E(X)) as

E(X) = Σ x f(x) if X is discrete,

E(X) = ∫ x f(x) dx if X is continuous.
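Both expected-value formulas can be checked numerically; the fair-die pmf and the uniform(0, 1) density below are illustrative choices:

```python
# Discrete: E(X) = sum of x * f(x); pmf of a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}
mean_discrete = sum(x * p for x, p in pmf.items())

# Continuous: E(X) = integral of x * f(x) dx; midpoint-rule
# approximation for the uniform(0, 1) density f(x) = 1.
n = 100_000
dx = 1 / n
mean_continuous = sum(((i + 0.5) * dx) * 1.0 * dx for i in range(n))

print(round(mean_discrete, 4), round(mean_continuous, 4))  # 3.5 0.5
```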


A measurement of the deviation from µ = E(X) is called the variance of X, denoted by Var(X). The variance is defined by the formula Var(X) = E[(X - µ)²] = E(X²) - µ².

Standard Deviation: σ = √Var(X)
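The two forms of the variance formula give the same answer; a sketch for a fair-die pmf:

```python
pmf = {x: 1 / 6 for x in range(1, 7)}  # fair six-sided die
mu = sum(x * p for x, p in pmf.items())

var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())      # E[(X - mu)^2]
var_short = sum(x * x * p for x, p in pmf.items()) - mu ** 2  # E[X^2] - mu^2
sd = var_def ** 0.5

print(round(var_def, 4), round(var_short, 4), round(sd, 4))
```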

Discrete Distributions:

Bernoulli Distribution: For an event with just two outcomes, 1 indicates success (or yes, or category a), 0 indicates failure (or no, or category b). It is a discrete distribution; shorthand: X ~ Bernoulli(p).

PMF: f(x) = p^x (1 - p)^(1 - x), where x ∈ {0, 1}. Mean or expectation µ = E(X) = p. Variance = Var(X) = p(1 - p).

If X1, …, Xn are independent and identically distributed random variables from a Bernoulli distribution with probability of success p, then Y = Σᵢ Xᵢ ~ Binomial(n, p), where each Xᵢ ∈ {0, 1}.

The Binomial Distribution: The single most important discrete distribution. Has parameters n and p. Models data that can be classified as a success (p) or failure (1 - p). The r.v. X is the number of successes in n independent trials. Shorthand: X ~ Binomial(n, p).

The PMF for the binomial distribution is f(x) = (n choose x) p^x (1 - p)^(n - x), 0 ≤ p ≤ 1; x = 0, 1, …, n, where (n choose x) = n! / (x!(n - x)!) is the number of ways to get x successes in n trials. It can be shown that E(X) = np and Var(X) = np(1 - p).

TI calculator: MATH → PRB → 3:nCr
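A sketch of the binomial pmf using math.comb (the same nCr the calculator tip computes); n = 10 and p = 0.3 are arbitrary example values:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
probs = [binom_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(probs))

print(round(sum(probs), 6), round(mean, 6))  # pmf sums to 1; mean = np = 3
```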

The Poisson Distribution: Has parameter λ. λ is the average rate for a time frame. Models independent count data. Shorthand: X ~ Poisson(λ).

The PMF: f(x) = e^(-λ) λ^x / x!, x = 0, 1, 2, …; E(X) = λ, Var(X) = λ.
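A standard-library sketch of the Poisson pmf, f(x) = e^(-λ) λ^x / x!, with an arbitrary rate λ = 2.5; the infinite support is truncated where the tail is negligible:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.5
support = range(51)  # the tail beyond 50 is negligible for lam = 2.5
total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)

print(round(total, 6), round(mean, 6))  # approximately 1 and lam
```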

The Geometric Distribution: Has parameter p. Models the number of independent trials to obtain a “success”. p is the probability of success on any given trial.

The PMF: f(x) = p(1 - p)^(x - 1), 0 < p < 1; x = 1, 2, …; E(X) = 1/p, Var(X) = (1 - p)/p².
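A check that the mean formula E(X) = 1/p matches the geometric pmf; p = 0.2 is an arbitrary choice, and the infinite sum is truncated where the terms are negligible:

```python
def geom_pmf(x, p):
    """P(first success occurs on trial x) for X ~ Geometric(p)."""
    return p * (1 - p) ** (x - 1)

p = 0.2
# Truncate the infinite sum; (1 - p)^500 is vanishingly small here.
mean = sum(x * geom_pmf(x, p) for x in range(1, 500))
print(round(mean, 4))  # close to 1/p = 5
```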

The Negative Binomial Distribution: Has parameters r and p. Like the geometric except it models the number of trials needed to get r successes. p is the probability of success on any given trial.

The PMF: f(x) = ((x - 1) choose (r - 1)) p^r (1 - p)^(x - r), 0 < p < 1; x = r, r + 1, …; E(X) = r/p, Var(X) = r(1 - p)/p².

The Hypergeometric Distribution: Has parameters N, n, and r. Models the number of successes out of n trials when sampling without replacement from a finite population of N objects that contains exactly r successes.

The PMF: f(x) = (r choose x)((N - r) choose (n - x)) / (N choose n), x ≤ r; n - x ≤ N - r; x = 0, 1, 2, …, n; E(X) = nr/N, Var(X) = nr(N - r)(N - n) / (N²(N - 1)).
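The hypergeometric pmf and its mean nr/N, sketched with math.comb; the lot size, sample size, and defect count below are made-up numbers:

```python
from math import comb

def hyper_pmf(x, N, n, r):
    """P(X = x) successes when drawing n without replacement
    from N objects containing exactly r successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, n, r = 50, 10, 15  # hypothetical: lot of 50 parts, 15 defective, sample 10
probs = [hyper_pmf(x, N, n, r) for x in range(n + 1)]
mean = sum(x * f for x, f in enumerate(probs))

print(round(sum(probs), 6), round(mean, 6))  # pmf sums to 1; mean = n*r/N = 3
```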

Continuous Distributions

For any continuous r.v. X, P(a ≤ X ≤ b) = ∫ from a to b of f(x) dx.

The Exponential Distribution: Has rate λ. λ is the rate parameter, interpreted as the rate at which the event happens; 1/λ is the expected lifetime between two consecutive events.

The PDF and CDF: f(x) = λe^(-λx) and F(x) = 1 - e^(-λx), x > 0; λ > 0; E(X) = 1/λ, Var(X) = 1/λ².
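The CDF difference F(b) - F(a) and direct numeric integration of the pdf should agree; the rate λ = 0.5 and the interval [2, 4] are arbitrary example values:

```python
from math import exp

lam = 0.5  # hypothetical rate parameter

def exp_pdf(x):
    return lam * exp(-lam * x)

def exp_cdf(x):
    return 1 - exp(-lam * x)

# P(2 <= X <= 4) two ways: CDF difference vs. midpoint-rule integral.
prob_cdf = exp_cdf(4) - exp_cdf(2)
steps = 20_000
dx = 2 / steps
prob_num = sum(exp_pdf(2 + (i + 0.5) * dx) * dx for i in range(steps))

print(round(prob_cdf, 6), round(prob_num, 6))
```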

The Uniform Distribution: Has parameters a and b. Useful when an event has occurred within a time frame but you have no information on the exact time. The time frame, or interval, is given by [a,b] or (a,b).

The PDF and CDF: f(x) = 1/(b - a) and F(x) = (x - a)/(b - a), a ≤ x ≤ b; E(X) = (a + b)/2, Var(X) = (b - a)²/12.

The Weibull Distribution (optional): Has parameters λ and β. λ is the scale parameter and β is the shape parameter. If the data follow a Weibull distribution, these parameters can be adjusted to better fit the data. Also models the time between events and interarrival times.

The PDF and CDF: f(x) = βλ(λx)^(β - 1) e^(-(λx)^β) and F(x) = 1 - e^(-(λx)^β), x, λ, β > 0;

E(X) = (1/λ)Γ(1 + 1/β), where Γ is the gamma function.

The Gamma Distribution (optional): Has parameters λ (scale) and α (shape). Also models the time between events and interarrival times. Shorthand: X ~ Γ(α, λ).

The PDF: f(x) = λ^α x^(α - 1) e^(-λx) / Γ(α), x, α, λ > 0.

The Normal Distribution: The single most important continuous distribution. Has parameters µ (the mean) and σ² (the variance). Models many phenomena; in certain cases, can model the behavior of averages.

The PDF: f(x) = (1/(σ√(2π))) e^(-(x - µ)²/(2σ²)).

When µ = 0 and σ² = σ = 1, the normal distribution becomes the standard normal distribution, denoted by Z.

The PDF becomes: f(z) = (1/√(2π)) e^(-z²/2).
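A numeric check of the standard normal pdf: integrating it over [-1.96, 1.96] should give the familiar ≈0.95 of probability:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu=0.0, sigma2=1.0):
    """Normal density with mean mu and variance sigma2."""
    return exp(-(x - mu) ** 2 / (2 * sigma2)) / sqrt(2 * pi * sigma2)

# Midpoint-rule integral of the standard normal pdf over [-1.96, 1.96].
a, b, n = -1.96, 1.96, 100_000
dx = (b - a) / n
prob = sum(normal_pdf(a + (i + 0.5) * dx) * dx for i in range(n))

print(round(prob, 4))  # approximately 0.95
```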

Sample Mean: x̄ = (1/n) Σᵢ xᵢ; E(x̄) = µ, Var(x̄) = σ²/n.

Central Limit Theorem: Suppose that X1, X2, …, Xn are a random sample from a population with mean µ and variance σ². Then, if n is “large enough” (typically n ≥ 30), the distribution of the sample average x̄ is approximately normal with mean µ and variance σ²/n; equivalently,

Z = (x̄ - µ) / (σ/√n) is approximately standard normal.
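A small simulation of the CLT; the uniform(0, 1) population (mean 0.5, variance 1/12) and the simulation sizes are arbitrary choices:

```python
import random
import statistics

random.seed(1)  # reproducible illustration

# Population: uniform on (0, 1), with mean 0.5 and variance 1/12.
n, reps = 30, 5000
means = [statistics.fmean(random.random() for _ in range(n))
         for _ in range(reps)]

# CLT prediction: x-bar has mean 0.5 and variance (1/12)/30.
print(round(statistics.fmean(means), 3),
      round(statistics.variance(means), 5))
```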