CM3215 Statistics 5: Linear Regression and LINEST (Faith A. Morrison)
1/26/2016
1
CM3215
Fundamentals of Chemical Engineering Laboratory
Professor Faith Morrison
Department of Chemical Engineering
Michigan Technological University
© Faith A. Morrison, Michigan Tech U.
https://www.youtube.com/watch?v=ceNp9meHTmY
Typing Equations in
MS Word 2010
Where are we in our discussion of error
analysis?
Let’s revisit:
From Lecture 4 (Error Propagation):
Summary: Error Analysis with Real Numbers
• To understand the accuracy of our numbers, we need to determine a confidence interval: $x = \bar{x} \pm 2e_s$ with 95% confidence. For replicate data with $n < 7$, replace “2” with $t_{0.025,\,n-1}$.
• The standard error $e_s$ for a measured quantity is the largest of:
  - $s/\sqrt{n}$, determined by replicates, or
  - (reading error)$/\sqrt{3}$, by estimate of reading error, or
  - (max error)$/2$, by estimate of calibration error.
• Standard error for derived quantities (arrived at from equations) is obtained through error propagation, which is a combination of variances.
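Error Propagation
$y = f(x_1, x_2, x_3, \ldots)$
We use an analysis based on the Taylor series expansion of a nonlinear function.
Taylor series:
$$f(x_1,x_2,x_3) = f(\bar{x}_1,\bar{x}_2,\bar{x}_3) + \frac{\partial f}{\partial x_1}(x_1-\bar{x}_1) + \frac{\partial f}{\partial x_2}(x_2-\bar{x}_2) + \frac{\partial f}{\partial x_3}(x_3-\bar{x}_3) + (\text{higher order terms})$$
A calculation of the function $f(x_1,x_2,x_3)$ from uncertain values of $x_1, x_2, x_3$ is a random variable of mean $\bar{f}$ and variance $\sigma_f^2$:
$$\sigma_f^2 = \left(\frac{\partial f}{\partial x_1}\right)^2\sigma_{x_1}^2 + \left(\frac{\partial f}{\partial x_2}\right)^2\sigma_{x_2}^2 + \left(\frac{\partial f}{\partial x_3}\right)^2\sigma_{x_3}^2 + (\text{covariance terms, if the } x_i \text{ are correlated})$$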
Neglecting the covariance terms, the variance of $f(x_1,x_2,x_3)$ reduces to
$$\sigma_f^2 = \left(\frac{\partial f}{\partial x_1}\right)^2\sigma_{x_1}^2 + \left(\frac{\partial f}{\partial x_2}\right)^2\sigma_{x_2}^2 + \left(\frac{\partial f}{\partial x_3}\right)^2\sigma_{x_3}^2$$
Note: covariance terms are not always zero or small; but they often are. For now, this is fine.
Worksheet for error
propagation
www.chem.mtu.edu/~fmorriso/cm3215/ErrorPropagationWorksheet.pdf
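Example 1: What is the uncertainty (95% confidence interval) in $\rho_{BF}$ as determined in the lab?
$$\rho_{BF} = \frac{M_F - M_E}{V_{pyc}}, \qquad M_F = 30.800\ \text{g},\quad M_E = 13.410\ \text{g},\quad V_{pyc} = 10.00\ \text{ml}$$
Partial derivatives:
$$\frac{\partial \rho_{BF}}{\partial M_F} = \frac{1}{V_{pyc}}, \qquad \frac{\partial \rho_{BF}}{\partial M_E} = -\frac{1}{V_{pyc}}, \qquad \frac{\partial \rho_{BF}}{\partial V_{pyc}} = -\frac{M_F - M_E}{V_{pyc}^2}$$
Standard errors: $e_{M_F} = e_{M_E} = 5.8\times10^{-5}$ g, $e_{V_{pyc}} = 0.02$ ml.
$$e_s^2 = 3.3\times10^{-11} + 3.3\times10^{-11} + 1.21\times10^{-5} = 1.21\times10^{-5}\ \text{g}^2/\text{ml}^2$$
$$e_s = 0.0035\ \text{g/ml}$$
$$\rho_{BF} = 1.739 \pm 0.007\ \text{g/ml}$$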
Error propagation worksheet (Example 1)
f(x1, x2, x3) = ρ_BF = 1.739 g/ml        2·e_s = 0.007 g/ml

x_i  symbol  value       df/dx_i   (df/dx_i)^2   e_xi      e_xi^2    (df/dx_i)^2 · e_xi^2
x1   M_F     30.800 g     0.10     0.010         5.8E-05   3.3E-09   3.33E-11 g2/ml2
x2   M_E     13.410 g    -0.10     0.010         5.8E-05   3.3E-09   3.33E-11 g2/ml2
x3   V_pyc   10.000 ml   -0.174    0.0302        0.02      4.0E-04   1.210E-05 g2/ml2
                                                           e_s^2     1.21E-05 g2/ml2
                                                           e_s       0.0035 g/ml
Excel is an excellent tool for error propagation
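The same worksheet arithmetic can also be sketched outside Excel. The following is a minimal Python illustration, not part of the original lecture; the variable names are chosen here for readability, and the covariance terms are neglected as in the worksheet.

```python
import math

# Measured values from Example 1 (masses in g, volume in ml).
M_F, M_E, V_pyc = 30.800, 13.410, 10.000
# Standard errors taken from the worksheet.
e_M, e_V = 5.8e-5, 0.02

# Density of the fluid from the pycnometer measurements.
rho = (M_F - M_E) / V_pyc

# Analytical partial derivatives of rho = (M_F - M_E) / V_pyc.
partials = [1.0 / V_pyc, -1.0 / V_pyc, -(M_F - M_E) / V_pyc**2]
errors = [e_M, e_M, e_V]

# Combine the variances, neglecting covariance terms.
variance = sum((d * e) ** 2 for d, e in zip(partials, errors))
e_s = math.sqrt(variance)

print(f"rho = {rho:.3f} +/- {2 * e_s:.3f} g/ml")
```

The volume term dominates the sum, which matches the last column of the worksheet.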
Summary: Error Analysis with Real Numbers
• To understand the accuracy of our numbers, we need to determine a confidence interval: $x = \bar{x} \pm 2e_s$ with 95% confidence. For replicate data with $n < 7$, replace “2” with $t_{0.025,\,n-1}$.
• Replication improves the estimation of the mean. The answer from replicates is more reliable than single values (if there are no systematic errors).
• The standard error for a measured quantity is the sum, in quadrature, of the standard errors determined by replicates, by estimate of reading error, and by estimate of calibration error.
• Standard error for derived quantities (arrived at from equations) is obtained through error propagation, which is a combination of variances.
• The weighting values indicate the impact of individual errors on the final value.
• The prediction interval of the next value of $x$ should encompass 95% of all measured values. 95% PI: [formula not recoverable from the transcript]
• Estimates for $e_s$ (particularly those obtained through [symbol not recoverable]) may need to be re‐evaluated if unreasonably narrow confidence intervals are identified.
Now, how do we determine uncertainty from numbers
that we obtain as parameters in a curve‐fit?
Uncertainty in Least Squares Curve Fitting: Excel’s LINEST
Reference: www.chem.mtu.edu/~fmorriso/cm3215/UncertaintySlopeInterceptOfLeastSquaresFit.pdf
1. Quick start: Replicate error
2. Reading Error
3. Calibration Error
4. Error Propagation
5. Least Squares Curve Fitting
Ordinary Least Squares Linear Regression
$y = mx + b$ ($m$ = slope, $b$ = intercept)
Question: For a dataset of $n$ data pairs $(x_i, y_i)$, $i = 1, 2, \ldots, n$, that is expected to show a linear relationship between $x$ and $y$, what are the parameters $m$ and $b$ of the equation for the line?
Solution:
• Assume you know the $x_i$ with certainty (“ordinary” least squares).
• Guess a line, $\hat{y} = mx + b$.
• Create a measure of the error between the guess and the data (the error measure should always be positive, so square it): $(y_i - \hat{y}_i)^2$, where $y_i$ is the data and $\hat{y}_i = mx_i + b$ is the line.
• Add these individual error measures to calculate a sum of squared errors, $SS_E \equiv \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.
• Use calculus (derivatives) to find the values of $m$ and $b$ that result in the least sum of squared error.
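Result:
Least squares slope:
$$m = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
Least squares intercept:
$$b = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
In Excel: SLOPE(y‐range, x‐range), INTERCEPT(y‐range, x‐range)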
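As a rough cross-check on SLOPE and INTERCEPT, the closed-form least-squares sums can be written out directly. This is an illustrative Python sketch; the function name `least_squares_line` and the five data pairs are made up for the example and do not come from the lecture.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) from the closed-form least-squares sums."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xx = sum(xi * xi for xi in x)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    denom = n * sum_xx - sum_x**2
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_xx * sum_y - sum_x * sum_xy) / denom
    return slope, intercept


# Hypothetical data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, b = least_squares_line(x, y)
print(f"m = {m:.3f}, b = {b:.3f}")
```

Both results share the same denominator, so it is computed once.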
$m$ and $b$ are calculated from the $(x_i, y_i)$.
These are the formulas used in Excel trendlines.
But what are the error limits on $m$ and $b$?
slope $= m \pm\ ?$, intercept $= b \pm\ ?$
Answer:
slope $= m \pm 2s_m$, intercept $= b \pm 2s_b$
But what is $s_m$?
(Later we will correct the “2” for small $n$.)
Error limits on $m$
Answer: apply error propagation to the least-squares formula for the slope, treating $m$ as a function of the data, $m = f(x_1, \ldots, x_n, y_1, \ldots, y_n)$:
$$m = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
$$s_m^2 = \sum_{i=1}^{n}\left(\frac{\partial m}{\partial x_i}\right)^2 s_{x_i}^2 + \sum_{i=1}^{n}\left(\frac{\partial m}{\partial y_i}\right)^2 s_{y_i}^2$$
Only the $y_i$ are variables; we assumed we knew the $x_i$ with certainty, so the first sum is zero.
Assume that the variances of the $y_i$ are the same for all $x_i$, that is, $s_{y_i}^2 = s_{y,x}^2$:
$$s_m^2 = s_{y,x}^2 \sum_{i=1}^{n}\left(\frac{\partial m}{\partial y_i}\right)^2$$
($s_{y,x}$ is the standard deviation of $y$ at a given value of $x$.)
Ordinary Least Squares Linear Regression: $y = mx + b$
The variance of $y$, given $x$ (the variance about the mean value of $y$ at a given $x$):
$$s_{y,x}^2 \equiv \frac{1}{n-2}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
(This formula comes from the definition of variance.)
Here $\hat{y}_i = mx_i + b$ is the best value of $y$ at a given $x_i$; it is calculated from the $(x_i, y_i)$.
($s_{y,x}$ is the standard deviation of $y$ at a given value of $x$; ordinary least squares assumes it is constant.)
In Excel:
• $s_{y,x}$ = STEYX(y‐range, x‐range), or
• use LINEST
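To make the $n-2$ divisor concrete, here is a small Python sketch of the quantity that STEYX reports. The data pairs are hypothetical, and the slope and intercept are the least-squares values for those particular pairs.

```python
import math

# Hypothetical data pairs, with their least-squares slope and intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, b = 1.99, 0.05

n = len(x)
# Residual of each data point from the fitted line.
residuals = [yi - (m * xi + b) for xi, yi in zip(x, y)]
sse = sum(r * r for r in residuals)

# Divide by n - 2 because two parameters (m and b) were fitted.
s_yx = math.sqrt(sse / (n - 2))
print(f"s_yx = {s_yx:.4f}")
```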
What are the error limits on $m$?
Answer:
slope $= m \pm 2s_m$, with
$$s_m^2 = \frac{s_{y,x}^2}{SS_{xx}}, \qquad SS_{xx} \equiv \sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$
In Excel:
• $s_m$ = STEYX(y‐range, x‐range)/SQRT(DEVSQ(x‐range)), or
• use LINEST
For $n - 2 \le 6$: slope $= m \pm t_{0.025,\,n-2}\, s_m$
(This is the final result of the algebra indicated on the error propagation slide.)
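The slope error limits can be sketched in Python as follows. This is an illustration under stated assumptions: the data are hypothetical, the function name is invented here, the slope is computed in the algebraically equivalent form $SS_{xy}/SS_{xx}$, and the small lookup table holds standard two-sided 95% Student's t values.

```python
import math

# Two-sided 95% Student's t values, t_{0.025, dof}, for small dof.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def slope_with_error(x, y):
    """Return (m, s_m, half_width) for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)  # Excel: DEVSQ(x-range)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = ss_xy / ss_xx
    b = y_bar - m * x_bar
    sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))  # Excel: STEYX(y-range, x-range)
    s_m = s_yx / math.sqrt(ss_xx)
    # Use the t value for small n; fall back to "2" otherwise.
    t = T_025.get(n - 2, 2.0)
    return m, s_m, t * s_m


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
m, s_m, half_width = slope_with_error(x, y)
print(f"slope = {m:.2f} +/- {half_width:.2f}")
```

With only five points the t value (3.182) is well above 2, so using "2" here would understate the interval.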
What are the error limits on $b$?
Answer:
intercept $= b \pm 2s_b$
What is $s_b$? Solve the same way: error propagation on the formula for $b$,
$$b = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
What are the error limits on $b$?
Answer:
intercept $= b \pm 2s_b$, with
$$s_b^2 = s_{y,x}^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)$$
In Excel:
• Calculate $s_b$ from STEYX(y‐range, x‐range) and DEVSQ(x‐range) and the formula above, or
• use LINEST
For $n - 2 \le 6$: intercept $= b \pm t_{0.025,\,n-2}\, s_b$
(This is the final result of the algebra indicated on the error propagation slide.)
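The intercept error limits follow the same pattern. Below is a hedged Python sketch with hypothetical data and an invented function name; it evaluates $s_b$ from $s_{y,x}$, $\bar{x}$, and $SS_{xx}$ as in the formula above.

```python
import math

# Two-sided 95% Student's t values, t_{0.025, dof}, for small dof.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def intercept_with_error(x, y):
    """Return (b, s_b, half_width) for a least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = ss_xy / ss_xx
    b = y_bar - m * x_bar
    sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))
    s_b = s_yx * math.sqrt(1.0 / n + x_bar**2 / ss_xx)
    # Use the t value for small n; fall back to "2" otherwise.
    t = T_025.get(n - 2, 2.0)
    return b, s_b, t * s_b


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b, s_b, half_width = intercept_with_error(x, y)
print(f"intercept = {b:.2f} +/- {half_width:.2f}")
```

For this data set the interval on the intercept comfortably includes zero, which is the kind of conclusion the error limits are meant to support.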
For instructions on how to use Microsoft Excel’s LINEST function, see the handout on the web:
www.chem.mtu.edu/~fmorriso/cm3215/UncertaintySlopeInterceptOfLeastSquaresFit.pdf
(the appendix has some derivations, if you’re interested)
What are the error limits on a value of $y$ obtained from the equation $\hat{y}_0 = mx_0 + b$?
At a chosen $x_0$: $\hat{y}_0 \pm 2s_{\hat{y}_0}$. What is $s_{\hat{y}_0}$?
Answer: propagate the error through $\hat{y}_0 = mx_0 + b$, with $\partial \hat{y}_0/\partial m = x_0$ and $\partial \hat{y}_0/\partial b = 1$ (the chosen $x_0$ carries no error):
$$s_{\hat{y}_0}^2 = x_0^2\, s_m^2 + s_b^2 + \ldots$$
But $m$ and $b$ are not independent (both are calculated from the $y_i$), so the covariance term must be kept:
$$s_{\hat{y}_0}^2 = x_0^2\, s_m^2 + s_b^2 + 2x_0\, \mathrm{Cov}(m, b)$$
What are the error limits on a value of $y$ obtained from the equation $\hat{y}_0 = mx_0 + b$?
Answer:
at $x_0$: $\hat{y}_0 \pm 2s_{\hat{y}_0}$, with
$$s_{\hat{y}_0}^2 = s_{y,x}^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}\right)$$
In Excel:
• $s_{y,x}$ = STEYX(y‐range, x‐range)
• $SS_{xx}$ = DEVSQ(x‐range)
• $\bar{x}$ = AVERAGE(x‐range)
For $n - 2 \le 6$, replace “2” with $t_{0.025,\,n-2}$.
(This is the final result of the algebra indicated on the previous slide; see Appendix B of the handout.)
Use this for error limits on values obtained from the fit.
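A sketch of the confidence interval on a fitted value, in Python, is given below. The data and the function name `fit_value_ci` are hypothetical; the point of the example is that the interval is narrowest at $x_0 = \bar{x}$ and widens toward the ends of the data.

```python
import math

# Two-sided 95% Student's t values, t_{0.025, dof}, for small dof.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def fit_value_ci(x, y, x0):
    """Return (y0_hat, half_width) of the 95% CI on the fit at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = ss_xy / ss_xx
    b = y_bar - m * x_bar
    sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))
    s_y0 = s_yx * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / ss_xx)
    # Use the t value for small n; fall back to "2" otherwise.
    t = T_025.get(n - 2, 2.0)
    return m * x0 + b, t * s_y0


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_mid, ci_mid = fit_value_ci(x, y, 3.0)  # x0 at the mean of the x data
y_end, ci_end = fit_value_ci(x, y, 5.0)  # x0 at the edge of the x data
print(f"x0 = 3: {y_mid:.2f} +/- {ci_mid:.2f}")
print(f"x0 = 5: {y_end:.2f} +/- {ci_end:.2f}")
```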
What are the error limits on a predicted next experimental value of $y$?
Answer:
Solve with the same approach as we have been using: write the equation to calculate the quantity, then propagate the error. (See Appendix B of the handout.)
At $x_0$, we predict a new measurement of $y$ will fall in the prediction interval:
$$\hat{y}_0 \pm 2\, s_{y,x}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$$
For $n - 2 \le 6$, replace “2” with $t_{0.025,\,n-2}$.
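The prediction interval differs from the confidence interval on the fit only by the extra "1" under the square root. The following Python sketch (hypothetical data, invented function name) computes both half-widths side by side so the difference is visible.

```python
import math

# Two-sided 95% Student's t values, t_{0.025, dof}, for small dof.
T_025 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447}


def ci_and_pi(x, y, x0):
    """Return (y0_hat, ci_half_width, pi_half_width) at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = ss_xy / ss_xx
    b = y_bar - m * x_bar
    sse = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(sse / (n - 2))
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / ss_xx
    # Use the t value for small n; fall back to "2" otherwise.
    t = T_025.get(n - 2, 2.0)
    ci = t * s_yx * math.sqrt(leverage)        # error range on the fit
    pi = t * s_yx * math.sqrt(1.0 + leverage)  # range for the next data point
    return m * x0 + b, ci, pi


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y0, ci, pi = ci_and_pi(x, y, 3.0)
print(f"fit value {y0:.2f}: CI +/- {ci:.2f}, PI +/- {pi:.2f}")
```

The prediction interval is always the wider of the two, because it includes the scatter of an individual measurement in addition to the uncertainty of the fitted line.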
Ordinary Least Squares Linear Regression
[Figure: Aqueous Sugar Solutions, 20°C, 2014. Density (g/ml) versus wt % sugar, showing the CM3215 Fall 2014 data, the trendline, the ±95% CI curves, and the ±95% PI curves.]
Prediction interval of data: notice that 95% of the data points fall within the PI; that’s what it means to be a PI. The next data point likely will fall here too.
Confidence interval for values from the fit: for large $n$, the values of $\hat{y}$ at each $x$ are well predicted (the CI is narrow).
Note: if your data are replicates (data taken repeatedly at chosen $x$ values), do not pre‐average the $y$‐data and follow up with a least‐squares curve fit. Instead, use all the replicates as individual $(x_i, y_i)$ values, and let LINEST find the least squared error.
Summary: Uncertainty in Ordinary Least Squares Linear Regression ($y = mx + b$)
• The Ordinary Least Squares Linear Regression method provides the equations needed to obtain the model parameters slope $m$ and intercept $b$.
• The equations for the parameters may be used with error propagation to obtain the variances associated with the parameters, $s_m^2$ and $s_b^2$.
• 95% confidence intervals on the parameters are constructed with $\pm 2s_m$ and $\pm 2s_b$ for large $n$. For $n - 2 \le 6$, the 95% CI is constructed with $\pm t_{0.025,\,n-2}\, s$.
• We can construct 95% CI on the best values of $\hat{y}$ at a chosen $x_0$. These CI are used for the error range on the fit.
• We can construct 95% prediction intervals (PI) on a next value of $y$ at a chosen $x_0$; use these to evaluate the next experimental point acquired.
Excel Summary: Uncertainty in Ordinary Least Squares Linear Regression ($y = mx + b$)
• AVERAGE(range)
• VAR.S(range)
• STDEV.S(range)
• COUNT(range)
• DEVSQ(x‐range) (gives $SS_{xx}$)
• SLOPE(y‐range, x‐range)
• INTERCEPT(y‐range, x‐range)
• $s_{y,x}$ = STEYX(y‐range, x‐range)
• LINEST (see handout)
• LOGEST (look it up)
• $s_m^2 = s_{y,x}^2 / SS_{xx}$
• $s_b^2 = s_{y,x}^2\left(\frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}}\right)$
• $s_{y,x}^2\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}\right)$: use for CI error bars on $\hat{y}$‐values obtained from a fit
• $s_{y,x}^2\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}\right)$: use for PI of the next measured value of $y$
Excel Handy List: Uncertainty in Ordinary Least Squares Linear Regression
• TREND(known‐y’s, known‐x’s, new‐x’s) for $x$ and $y$ related by $y = mx + b$
• GROWTH(known‐y’s, known‐x’s, new‐x’s) for $x$ and $y$ related by $y = b\,m^{x}$
One final piece of advice:
Often, you can transform your data to make it linear, allowing you to use linear regression. For example, if you know the $y$‐data vary as the square root of the $x$‐data, then $y$ versus $\sqrt{x}$ will be linear. If data plotted with log‐log scaling (using scatterplot) look quadratic, then $\log y$ versus $\log x$ will be quadratic, and we can use trendline to obtain a fit:
$$\log y = c_2 (\log x)^2 + c_1 \log x + c_0$$
Transforming data can greatly broaden our ability to fit empirical models to data.
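The square-root case can be illustrated with a short Python sketch. The data below are synthetic, generated from an assumed relationship $y = 3\sqrt{x} + 1$, and the helper `fit_line` is written for this example; fitting $y$ against the transformed variable $\sqrt{x}$ recovers the assumed coefficients.

```python
import math


def fit_line(u, v):
    """Least-squares slope and intercept of v versus u."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    ss_uu = sum((ui - u_bar) ** 2 for ui in u)
    ss_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    slope = ss_uv / ss_uu
    return slope, v_bar - slope * u_bar


# Synthetic data that vary as the square root of x (assumed model).
x = [1.0, 4.0, 9.0, 16.0, 25.0]
y = [3.0 * math.sqrt(xi) + 1.0 for xi in x]

# Transform x, then fit a straight line to y versus sqrt(x).
sqrt_x = [math.sqrt(xi) for xi in x]
m, b = fit_line(sqrt_x, y)
print(f"y = {m:.2f} * sqrt(x) + {b:.2f}")
```

Because the fit is linear in the transformed variable, the error limits on slope and intercept from the earlier slides apply to it unchanged.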
Done!
Comment on Curve Fitting: Coefficient of Determination, $R^2$
Which data set has a larger $R^2$?
[Figure: two data sets plotted as y‐data versus x‐data, each with a linear trendline.]
Data set 1: y = 2.00832x + 6.42864, R² = 0.86297
Data set 2: y = 0.0117x + 6.8857, R² = 0.0053
Comment on Curve Fitting: Coefficient of Determination, $R^2$
From page 6:
$R^2$ is a measure of the comparison of the hypothesized linear relationship $y = mx + b$ and the relationship $y = \text{constant}$ (horizontal line). So, if it is a horizontal line, $R^2$ will be zero.
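This behavior can be checked numerically. The Python sketch below assumes the common definition $R^2 = 1 - SS_E/SS_T$, where $SS_T$ is the sum of squared deviations of the $y_i$ from their mean (the horizontal-line model); the function name and both data sets are hypothetical.

```python
def r_squared(x, y):
    """R^2 of the least-squares line, as 1 - SS_E / SS_T."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    m = ss_xy / ss_xx
    b = y_bar - m * x_bar
    # Squared error about the fitted line.
    ss_e = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    # Squared error about the horizontal line y = y_bar.
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_e / ss_t


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_trend = [2.1, 3.9, 6.2, 7.8, 10.1]  # clear linear trend
y_flat = [5.0, 6.0, 5.0, 6.0, 5.0]    # scatter about a horizontal line

r2_trend = r_squared(x, y_trend)
r2_flat = r_squared(x, y_flat)
print(f"trend: R^2 = {r2_trend:.4f}, flat: R^2 = {r2_flat:.4f}")
```

A value of $R^2$ near zero therefore says the data are best described by a constant, not that the data are noisy.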
Which is the correct fit?
[Figure: the same y‐data versus x‐data, fit two ways.]
Linear fit: y = 2.005x + 6.6572, R² = 0.9458
Fourth‐order polynomial fit: y = ‐0.0106x⁴ + 0.2386x³ ‐ 1.7052x² + 6.1052x + 4.4855, R² = 0.9625
• It depends on the error bars.
• It is likely that the linear fit is a “truer” relationship to be used for interpolation.