Engineering Statistics Study Sheet
7/27/2019 Engineering Statistics Study Sheet
http://slidepdf.com/reader/full/engineering-statistics-study-sheet 1/2
Population: The set of all possible observations of interest to the problem at hand.
Sample: The part of the population from which we collect information
Factor: the explanatory variable(s), often with multiple levels (e.g., if temperature is the factor, the levels could be 50 and 75)
Treatment/Treatment Combination: a specific combination of the levels of each factor.
Example: 2 factors
Pressure (4 atm, 5 atm, 6 atm)
Temperature (50 and 75)
How many treatment combinations? 3 levels × 2 levels = 6
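One way to enumerate (and count) the treatment combinations is to take the Cartesian product of the factor levels. A minimal Python sketch using the pressure/temperature example:

```python
from itertools import product

# Factor levels from the example (pressure in atm, temperature in degrees)
pressure = [4, 5, 6]
temperature = [50, 75]

# Each treatment combination is one (pressure, temperature) pair
treatments = list(product(pressure, temperature))
print(treatments)        # (4, 50), (4, 75), (5, 50), ..., (6, 75)
print(len(treatments))   # 3 levels x 2 levels = 6 combinations
```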
Categorical or Continuous: Categorical factors have a limited number of levels; continuous factors do not.
Experimental Unit (EU): The smallest unit to which we apply a treatment combination.
Observational Unit (OU): The unit upon which we take the measurement.
Sampling Techniques: Simple Random Sample (SRS): every possible sample of n observations from the population has the same chance of
being selected
Stratified Random Sample: when there are several groups or strata (grouping them can remove some of the
variability) a simple random sample is performed on each in order to get at least one representative from each group. Also useful for
comparing strata.
Systematic Random Sample: randomly select an item within the first m items. Thereafter, sample each mth
item.
Lends itself well to high-speed part manufacturing.
Pairing sampling units: Pairing sampling units allows us to remove the unit to unit variability and focus on the real
issues – useful when the sampling units differ widely.
Types of Designs: Completely Randomized Design (CRD)- A design which randomly allocates all treatment combinations to the EUs. Each EU has
the same chance of receiving any treatment combination.
Randomized Complete Block Design (RCBD)- A design whose blocks contain a single observation on each treatment.
Stemplots: Pros: can evaluate the “shape” of the data; don’t lose original data; can suggest natural groupings
Cons: loses time order of data; cumbersome to construct by hand with very large data sets
Subset: A ⊂ B means every outcome in A is also in B, so P(A) ≤ P(B)
Complement: A′ is the event that A does not occur; P(A′) = 1 − P(A)
Union: A ∪ B, the event that A or B (or both) occurs
Intersection: A ∩ B, the event that both A and B occur
Mutually Exclusive: A and B cannot occur together; A ∩ B = ∅, so P(A ∩ B) = 0
Conditional Probability: P(A|B) = P(A ∩ B)/P(B), provided P(B) > 0
Dependent Probability: P(A|B) ≠ P(A); knowing B changes the probability of A
Independent Probability: P(A ∩ B) = P(A)P(B), equivalently P(A|B) = P(A)
Calculate P(A) = 1 − P(A′) if P(A) is difficult to find directly
Addition Rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Mutually Exclusive Addition: P(A ∪ B) = P(A) + P(B) Law of Total Probability: P(B) = P(B|A)P(A) + P(B|A′)P(A′) Bayes’ Rule: P(B|A) = P(A|B)P(B)/P(A)
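A worked illustration of the Law of Total Probability and Bayes’ Rule (the numbers here are made up for the example): suppose 1% of parts are defective, an inspection flags 95% of defective parts, and it falsely flags 2% of good parts.

```python
# Hypothetical numbers, chosen only for illustration
p_defect = 0.01               # P(D)
p_flag_given_defect = 0.95    # P(F|D)
p_flag_given_good = 0.02      # P(F|D')

# Law of Total Probability: P(F) = P(F|D)P(D) + P(F|D')P(D')
p_flag = p_flag_given_defect * p_defect + p_flag_given_good * (1 - p_defect)

# Bayes' Rule: P(D|F) = P(F|D)P(D) / P(F)
p_defect_given_flag = p_flag_given_defect * p_defect / p_flag
print(round(p_defect_given_flag, 3))   # about 0.324
```

Even with a good test, most flagged parts are not defective, because defects are rare to begin with.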
Cumulative Distribution Function (CDF): F(x) = P(X ≤ x)
Discrete random variables can assume, at most, a countable set of numbers.
Continuous random variables can assume any possible real value.
The distinction between a cdf (big F) and a pmf (little f) is that the pmf gives you the probability of one particular point, f(x) = P(X = x), whereas the cdf gives
you the probability of that point plus everything before it, F(x) = P(X ≤ x).
In addition to a CDF, every continuous r.v. has a probability density function (PDF): f(x) = F′(x)
If we know the pdf of X, then we can find the cdf by integrating: F(x) = ∫₋∞ˣ f(t) dt. This implies that for some interval [a,b], we have that
P(a ≤ X ≤ b) = ∫ₐᵇ f(x) dx = F(b) − F(a)
We define the expected value or the population mean of X (denoted μ = E(X)) as
μ = Σₓ x f(x) if X is discrete,
μ = ∫ x f(x) dx if X is continuous.
A measurement of the deviation from μ = E(X) is called the variance of X, denoted by Var(X). The variance is defined by the formula
Var(X) = E[(X − μ)²] = E(X²) − μ²
Standard Deviation: σ = √Var(X)
Discrete Distributions:
Bernoulli Distribution: For an event that has just two outcomes, 1 indicates success (or yes, or category a) and 0 indicates failure (or
no, or category b). Shorthand: X ~ Bernoulli(p). PMF: f(x) = pˣ(1−p)^(1−x)
where x ∈ {0, 1}. Mean or expectation: E(X) = p. Variance: Var(X) = p(1−p). If X₁, …, Xₙ are independent and identically distributed
random variables from a Bernoulli distribution with probability of success p, then Y = ΣXᵢ ~ Binomial(n, p), where each Xᵢ ∈ {0, 1}
The Binomial Distribution: The single most important discrete distribution. Has parameters n and p. Models data that can be classified as
a success (p) or failure (1−p). The r.v. X is the number of successes in n independent trials. Shorthand: X ~ Binomial(n, p).
The PMF: f(x) = (n choose x) pˣ(1−p)^(n−x), 0 ≤ p ≤ 1; x = 0, 1, …, n, where (n choose x) = n!/(x!(n−x)!) is the number of
ways to get x successes in n trials, and it can be shown that E(X) = np and Var(X) = np(1−p)
TI: MATH → PRB → 3:nCr
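The binomial PMF and its moments can be checked numerically with the standard library’s math.comb (a sketch with n and p chosen arbitrarily):

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

print(sum(pmf))                       # the probabilities sum to 1
mu = sum(x * f for x, f in enumerate(pmf))
print(mu)                             # matches E(X) = n*p = 3.0
```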
The Poisson Distribution: Has parameter λ. λ is the average rate (average count per time frame). Models independent count data. Shorthand:
X ~ Poisson(λ). The PMF: f(x) = e^(−λ)λˣ/x!, λ > 0; x = 0, 1, 2, … E(X) = λ Var(X) = λ
The Geometric Distribution: Has parameter p. Models the number of independent trials to obtain a “success”. p is the probability of
success on any given trial.
The PMF: f(x) = p(1−p)^(x−1), 0 < p < 1; x = 1, 2, … E(X) = 1/p Var(X) = (1−p)/p²
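The geometric mean E(X) = 1/p can be verified by summing x·f(x) over the PMF (a sketch; the infinite series is truncated, so the match is approximate):

```python
def geometric_pmf(x, p):
    """P(X = x) for the number of trials needed to get the first success."""
    return p * (1 - p)**(x - 1)

p = 0.25
# Truncate the infinite series at a large x; the remaining tail is negligible
approx_mean = sum(x * geometric_pmf(x, p) for x in range(1, 1000))
print(approx_mean)   # close to E(X) = 1/p = 4.0
```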
The Negative Binomial Distribution: Has parameters r and p. Like the geometric except it models the number of observations needed to
get r successes. p is the probability of success on any given trial.
The PMF: f(x) = (x−1 choose r−1) pʳ(1−p)^(x−r), 0 < p < 1; x = r, r+1, … E(X) = r/p Var(X) = r(1−p)/p²
The Hypergeometric Distribution: Has parameters N, n, and r. Models the number of successes out of n trials when sampling without
replacement from a finite population of N objects that contains exactly r successes.
The PMF: f(x) = (r choose x)(N−r choose n−x)/(N choose n), x ≤ r; n−x ≤ N−r; x = 0, 1, 2, …, n E(X) = nr/N Var(X) = nr(N−r)(N−n)/(N²(N−1))
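The hypergeometric PMF is easy to evaluate with math.comb. As a sketch with hypothetical numbers: drawing n = 5 parts without replacement from a lot of N = 20 that contains r = 8 defectives:

```python
from math import comb

def hypergeom_pmf(x, N, n, r):
    """P(X = x) successes in n draws, without replacement, from N items containing r successes."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, n, r = 20, 5, 8
pmf = [hypergeom_pmf(x, N, n, r) for x in range(0, min(n, r) + 1)]

print(sum(pmf))                      # the probabilities sum to 1
mu = sum(x * f for x, f in enumerate(pmf))
print(mu)                            # matches E(X) = n*r/N = 2.0
```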
Continuous Distributions
The Exponential Distribution: Has rate λ. λ is the rate parameter, the rate at which the event happens. 1/λ is the expected
lifetime, i.e., the expected time between two consecutive events.
The PDF and CDF: f(x) = λe^(−λx) and F(x) = 1 − e^(−λx), x > 0; λ > 0 E(X) = 1/λ Var(X) = 1/λ²
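A quick numerical check that the exponential CDF follows from integrating the PDF (pure-Python sketch; λ = 2 and the interval endpoint are chosen arbitrarily):

```python
from math import exp

lam = 2.0   # rate parameter, arbitrary for the example
b = 1.5     # upper limit of integration

# Crude midpoint Riemann sum of the PDF f(x) = lam * e^(-lam*x) over [0, b]
steps = 100_000
dx = b / steps
riemann = sum(lam * exp(-lam * (i + 0.5) * dx) * dx for i in range(steps))

cdf = 1 - exp(-lam * b)   # F(b) = 1 - e^(-lam*b)
print(riemann, cdf)       # the two agree closely
```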
The Uniform Distribution: Has parameters a and b. Models time between events and interarrival times. Useful when an event has
occurred within a time frame but you have no information on the exact time. The time frame, or interval, is given by [a,b] or (a,b).
The PDF and CDF: f(x) = 1/(b−a) and F(x) = (x−a)/(b−a), a ≤ x ≤ b E(X) = (a+b)/2 Var(X) = (b−a)²/12
The Weibull Distribution (optional): Has parameters λ and β. λ is the scale parameter and β is the shape parameter. If the data follow a
Weibull distribution, these parameters can be adjusted to better fit the data. Also models the time between events and interarrival times.
The PDF and CDF: f(x) = λβ(λx)^(β−1) e^(−(λx)^β) and F(x) = 1 − e^(−(λx)^β), x, λ, β > 0
E(X) = (1/λ)Γ(1 + 1/β), where Γ is the gamma function.
The Gamma Distribution (optional): Has parameters λ (scale) and α (shape). Also models the time between events and interarrival times.
Shorthand: X ~ Γ(α, λ)
The PDF: f(x) = λ^α x^(α−1) e^(−λx)/Γ(α), x, λ, α > 0 E(X) = α/λ Var(X) = α/λ²
The Normal Distribution: The single most important continuous distribution. Has parameters μ (the mean) and σ² (the variance).
Models many phenomena; in certain cases, can model the behavior of averages.
The PDF: f(x) = (1/(σ√(2π))) e^(−(x−μ)²/(2σ²)), −∞ < x < ∞
When μ = 0 and σ² = σ = 1, the normal distribution becomes the standard normal distribution, denoted by Z.
The PDF becomes: f(z) = (1/√(2π)) e^(−z²/2), −∞ < z < ∞
Sample Mean: X̄ = (1/n) Σ Xᵢ E(X̄) = μ Var(X̄) = σ²/n
Central Limit Theorem: Suppose that X₁, X₂, …, Xₙ are a random sample from a population with mean µ and variance σ². Then, if n is
“large enough” (typically n ≥ 30), the distribution of the sample average X̄ is approximately normal with mean µ and variance σ²/n, so
Z = (X̄ − µ)/(σ/√n) is approximately standard normal.
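The CLT can be seen in a small simulation using only the standard library (a sketch; the Uniform(0, 1) population, sample size, and repetition count are chosen arbitrarily):

```python
import random
from statistics import mean, stdev

random.seed(0)

n = 30        # sample size ("large enough")
reps = 2000   # number of sample means to collect

# Population: Uniform(0, 1), which has mu = 0.5 and sigma^2 = 1/12
sample_means = [mean(random.random() for _ in range(n)) for _ in range(reps)]

print(mean(sample_means))    # close to mu = 0.5
print(stdev(sample_means))   # close to sigma/sqrt(n) = (1/12)**0.5 / 30**0.5
```

Plotting a histogram of sample_means would show the familiar bell shape even though the underlying population is flat.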