selected statistics examples from lectures. matlab: histograms >> y = [1 3 5 8 2 4 6 7 8 3 2 9...
TRANSCRIPT
Selected Statistics Examples from Lectures
Matlab: Histograms
>> y = [1 3 5 8 2 4 6 7 8 3 2 9 4 3 6 7 4 1 5 3 5 8 9 6 2 4 6 1 5 6 9 8 7 5 3 4 5 2 9 6 5 9 4 1 6 7 8 5 4 2 9 6 7 9 2 5 3 1 9 6 8 4 3 6 7 9 1 3 4 7 5 2 9 8 5 7 4 5 4 3 6 7 9 3 1 6 9 5 6 7 3 2 1 5 7 8 5 3 1 9 7 5 3 4 7 9 1]’;
>> mean(y) ans = 5.1589
>> var(y) ans = 6.1726
>> std(y) ans = 2.4845
>> hist(y,9) histogram plot with 9 bins
>> n = hist(y,9) store result in vector n
>> x = [2 4 6 8]’
>> n = hist(y,x) create histogram with bin centers specified by vector x 2 4 6 8
0
5
10
15
20
25
30
35
1 2 3 4 5 6 7 8 90
2
4
6
8
10
12
14
16
Probability Examples
Probability that at least one coin will turn heads up from five tossed coins» Number of outcomes: 25 = 32» Probability of each outcome: 1/32» Probability of no heads: P(AC) = 1/32» Probability at least one head: P(A) = 1-P(AC) = 31/32
Probability of getting an odd number or a number less than 4 from a single dice toss» Probability of odd number: P(A) = 3/6» Probability of number less than 4: P(B) = 3/6» Probability of both:» Probability of either:
3/26/26/36/3)()()()( BAPBPAPBAP
6/2)( BAP
complement set
Permutation Examples
Process for manufacturing polymer thin films» Compute the probability that the first 6 films will be too thin
and the next 4 films will be too thick if the thickness is a random variable with the mean equal to the desired thickness
» n1 = 6, n2 = 4
» Probability
Encryption cipher» Letters arranged in five-letter words: n = 26, k = 5
» Total number of different words: nk = 265 = 11,881,376
» Total number of different words containing each letter no more than once:
%5.0005.0!10
!4!6
!!!
1
21
nnn
P
600,893,7)!526(
!26
)!(
!
kn
n
Combination Examples
Effect of repetitions» Three letters a, b, c taken two at a time (n = 3, k = 2)
» Combinations without repetition
» Combinations with repetitions
500 thin films taken 5 at a time» Combinations without repetitions
bcacabk
n3
)!23(!2
!3
2
3
ccbbaa
bcacab
k
kn6
)!24(!2
!4
2
41
600,686,244,255)!5500(!5
!500
5
500
k
n
Matlab: Permutations and Combinations
>> perms([2 4 6]) all possible permutations of 2, 4, 6
6 4 26 2 44 6 24 2 62 4 62 6 4
>> randperm(6) returns one possible permutation of 1-65 1 2 3 4 6
>> nchoosek(5,4) number of combinations of 5 things taken 4 at a time without repetitions
ans = 5
>> nchoosek(2:2:10,4) all possible combinations of 2, 4, 6, 8, 10 taken 4 at a time without repetitions
2 4 6 82 4 6 102 4 8 102 6 8 104 6 8 10
Continuous Distribution Example
Probability density function
Cumulative distribution function
Probability of events
1)1(75.0)(otherwise0
11)1(75.0)(
1
1
22
dvvdvvfxx
xf
11
1125.075.05.0
10
)(
25.075.05.0)1(75.0)()(
3
3
1
2
x
xxx
x
xF
xxdvvdvvfxFxx
73.025.075.05.095.0)()(
%75.68)1(75.0)5.0()5.0()5.05.0(3
5.0
5.0
2
xxxxFxXP
dvvFFxP
Matlab: Normal Distribution
Normal distribution: normpdf(x,mu,sigma)» normpdf(8,10,2) ans = 0.1210» normpdf(9,10,2) ans = 0.1760» normpdf(8,10,4) ans = 0.0880
Normal cumulative distribution: normcdf(x,mu,sigma)» normcdf(8,10,2) ans = 0.1587» normcdf(12,10,2) ans = 0.8413
Inverse normal cumulative distribution: norminv(p,mu,sigma)» norminv([0.025 0.975],10,2) ans = 6.0801 13.9199
Random number from normal distribution: normrnd(mu,sigma,v)» normrnd(10,2,[1 5]) ans = 9.1349 6.6688
10.2507 10.5754 7.7071
Matlab: Normal Distribution Example
The temperature of a bioreactor follows a normal distribution with an average temperature of 30oC and a standard deviation of 1oC. What percentage of the reactor operating time will the temperature be within +/-0.5oC of the average?
Calculate probability at 29.5oC and 30.5oC, then calculate the difference:» p=normcdf([29.5 30.5],30,1)p = [0.3085 0.6915]» p(2) – p(1)0.3829
The reactor temperature will be within +/- 0.5oC of the average ~38% of the operating time
25 26 27 28 29 30 31 32 33 34 350
0.1
0.2
0.3
0.4
Temperature
Discrete Distribution Example
Matlab function: unicdf(x,n),
>> x = (0:6);
>> y = unidcdf(x,6);
>> stairs(x,y)
Poisson Distribution Example
Probability of a defective thin polymer film p = 0.01 What is the probability of more than 2 defects in a
lot of 100 samples? Binomial distribution: = np = (100)(0.01) = 1 Since p <<1, can use Poisson distribution to
approximate the solution.
%03.8)2(1)2(
9197.0!
)()2()2(2
0!2
1!1
1!0
112
0
210
XPXP
eex
xfFXPj j
x
jj
j
Matlab: Maximum Likelihood
In a chemical vapor deposition process to make a solar cell, there is an 87% probability that a sufficient amount of silicon will be deposited in each cell. Estimate the maximum likelihood of success in 5000 cells.
s = binornd(n,p) – randomly generate the number of positive outcomes given n trials each with probability p of success» s = binornd(5000,0.87)
s=4338
phat = binofit(s,n) – returns the maximum likelihood estimate given s successful outcomes in n trials» phat = binofit(s,5000)
phat = 0.8676
Mean Confidence Interval Example
Measurements of polymer molecular weight (scaled by 10-5)
Confidence interval
26.131.127.127.112.133.119.122.136.125.1
31.121.1050.010
0049.0)26.2(
0049.026.1
26.2975.0)(911095.0
95.0
2
CONFk
sx
ccFm
Variance Confidence Interval Example
Measurements of polymer molecular weight (scaled by 10-5)
Confidence interval
26.131.127.127.112.133.119.122.136.125.1
016.0002.0
0023.002.19
)0049.0)(9(0163.0
70.2
)0049.0)(9(0049.0
02.19975.0)(70.2025.0)(995.0
295.0
212
2211
CONF
kks
ccFccFm
Matlab: Confidence Intervals
>> [muhat,sigmahat,muci,sigmaci] = normfit(data,alpha) data: vector or matrix of data alpha: confidence level = 1-alpha muhat: estimated mean sigmahat: estimated standard deviation muci: confidence interval on the mean sigmaci: confidence interval on the standard deviation
>> [muhat,sigmahat,muci,sigmaci] = normfit([1.25 1.36 1.22 1.19 1.33 1.12 1.27 1.27 1.31 1.26],0.05)
muhat = 1.2580sigmahat = 0.0697muci = 1.2081
1.3079sigmaci = 0.0480
0.1273
Mean Hypothesis Test Example
Measurements of polymer molecular weight
Hypothesis: 0 = 1.3 instead of the alternative 1 < 0
Significance level: = 0.10 Degrees of freedom: m = 9 Critical value
Sample t
Reject hypothesis
0049.0258.1 2 sx
38.1~10.0)(
38.1~90.01)~(
0
0
cccTP
ccTP
38.1897.1100049.0
3.1258.1
ct
Polymerization Reactor Control
Lab measurements» Viscosity and monomer content of polymer every four hours» Three monomer content measurements: 23%, 18%, 22%
Mean content control» Expected mean and known variance: 0 = 20%, = 1» Control limits:
» Sample mean: 21% process is in control
MonomersCatalysts
SolventHydrogen
On-linemeasurements
Labmeasurements
Polymer
Monomer ContentViscosity
%5.2158.2%5.183
158.22058.2
2
0
2
0 n
UCLn
LCL
Goodness of Fit Example
Maximum likelihood estimates
n
jj
n
jj
xxn
xxn
1
22
1
9.712)(1
ˆ
7.3641
ˆ
Goodness of Fit Example cont.
Matlab: Goodness of Fit Example
Find maximum likelihood estimates for µ and σ of a normal distribution with function mle
>> data=[320 … 360];
>> phat = mle(data)
phat = 364.7 26.7 Analyze dataset for goodness of fit using function chi2gof – usage:
[h,p,stats]=chi2gof(x,’edges’,edges) for data in vector x divided into intervals with the specified edges» If h = 1, reject hypothesis at 5% significance» If h = 0, accept hypothesis at 5% significance» p: probability of observing given statistic» stats: includes chi-square statistic and degrees of freedom
>> [h,p,stats]=chi2gof(data,’edges’,[-inf,325:10:405,inf]);
>> h = 0
>> p = 0.8990
>> chi2stat = 2.8440
>> df = 7