Nonlinear Equations

Jyun-Ming Chen


Contents

• Bisection

• False Position

• Newton

• Quasi-Newton

• Inverse Interpolation

• Method Comparison

Fall 2015


Solve the Problem Numerically

• Consider the problem in the following general form:

f(x) = 0

• Many methods to choose from:

– Interval Bisection Method

– Newton

– Secant

– …

Interval Bisection Method

• Recall the following theorem from calculus

• Intermediate Value Theorem

– If f(x) is continuous on [a,b] and k is a constant that lies between f(a) and f(b), then there is a value x ∈ [a,b] such that f(x) = k

Bisection Method (cont)

• Simply setting k = 0

• Observe:

– if f(x) is continuous on [a,b] and sign( f(a) ) ≠ sign( f(b) )

– then there is a point x ∈ [a, b] such that f(x) = 0

Definition

• non-trivial interval [a,b]: f(a) ≠ 0, f(b) ≠ 0 and sign( f(a) ) ≠ sign( f(b) )

• Examples of the sign function: sign(-2) = -1, sign(+5) = 1

Idea

• Start with a non-trivial interval [a,b]

• Set c = (a+b)/2

• Three possible cases:

⑴ f(c) = 0, solution found

⑵ f(c) ≠ 0, [c,b] nontrivial

⑶ f(c) ≠ 0, [a,c] nontrivial

• Keep shrinking the interval until convergence

– Case ⑴: problem solved

– Cases ⑵ and ⑶: a new, smaller nontrivial interval of ½ the size
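The idea above can be sketched in a few lines of Python. This is only an illustration, not the code from the Algorithm slide: the names `bisect`, `cubic` and the tolerance `eps` are assumptions made here, and the interval is assumed to satisfy a < b.

```python
def bisect(f, a, b, eps=1e-6):
    """Shrink a non-trivial interval [a, b] (a < b) until its length is at most eps."""
    fa, fb = f(a), f(b)
    if fa == 0 or fb == 0 or (fa > 0) == (fb > 0):
        raise ValueError("[a, b] is not a non-trivial interval")
    iterations = 0
    while (b - a) > eps:
        c = (a + b) / 2
        fc = f(c)
        iterations += 1
        if fc == 0:                 # case (1): solution found
            return c, iterations
        if (fc > 0) == (fb > 0):    # case (3): [a, c] is non-trivial
            b, fb = c, fc
        else:                       # case (2): [c, b] is non-trivial
            a, fa = c, fc
    return (a + b) / 2, iterations


def cubic(x):
    # Illustrative test function: x^3 - 3x^2 - x + 9 (assumed here as an example)
    return x**3 - 3 * x**2 - x + 9


root, iterations = bisect(cubic, -10.0, 10.0)
print(root, iterations)
```

Comparing signs instead of multiplying f(a) by f(c) avoids overflow or underflow when the function values are very large or very small.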


Algorithm

What’s wrong with this code?


Remarks

• Convergence

– Guaranteed once a nontrivial interval is found

• Convergence Rate

– A quantitative measure of how fast the algorithm is

– An important characteristic for comparing algorithms

Convergence Rate of Bisection

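• Let:

– Length of initial interval: L0

– After k iterations, length of interval is Lk

– Lk = L0/2^k

– Algorithm stops when Lk ≤ eps

• Plug in some values…

$$\text{Let } L_0 = 1,\ \text{eps} = 10^{-6}: \quad k \ge \log_2\frac{L_0}{\text{eps}} = \log_2 10^{6} \approx 19.93$$

– So about 20 iterations are needed

• Meaning of eps: the tolerance on the length of the final interval

• This is quite slow, compared to other methods…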

How to get initial (nontrivial) interval [a,b] ?

• Hint from the physical problem

• For a polynomial equation, the following theorem is applicable:

– The roots (real and complex) of the polynomial

f(x) = a_n x^n + a_{n-1} x^{n-1} + … + a_1 x + a_0

satisfy the bound:

$$|x| \le 1 + \frac{1}{|a_n|}\,\mathrm{Max}\bigl(|a_0|, |a_1|, \ldots, |a_n|\bigr)$$

Example

• Equation: x^3 − 3x^2 − x + 9 = 0

• Roots are bounded by

$$|x| \le 1 + \frac{1}{1}\max(1, 3, 1, 9) = 10$$

• Hence, real roots are in [-10,10]

• Roots are –1.5251 and 2.2626 ± 0.8844i (complex)
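As a quick check of the bound, here is a minimal Python sketch. The function name `root_bound` and the highest-power-first coefficient order are assumptions made for this illustration.

```python
def root_bound(coeffs):
    """Bound on |x| for all roots; coeffs = [a_n, ..., a_1, a_0], highest power first."""
    a_n = coeffs[0]
    if a_n == 0:
        raise ValueError("leading coefficient must be non-zero")
    return 1 + max(abs(c) for c in coeffs) / abs(a_n)


# x^3 - 3x^2 - x + 9
bound = root_bound([1, -3, -1, 9])
print(bound)  # 10.0
```

Every real root then lies in [-bound, bound], which can serve as the search range when looking for a non-trivial interval.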


Other Theorems for Polynomial Equations

• Sturm theorem:

– The number of real roots of an algebraic equation with real coefficients whose real roots are simple over an interval, the endpoints of which are not roots, is equal to the difference between the number of sign changes of the Sturm chains formed for the interval ends.

Sturm Chain


Example

$$f(x) = x^5 - 3x - 1$$

• Roots: x = -1.21465, -0.334734, 1.38879, 0.0802951 ± 1.32836i
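To make the theorem concrete, the following Python sketch builds a Sturm chain with exact rational arithmetic and counts the real roots of x^5 - 3x - 1 in an interval. It is an illustration under stated assumptions, not the slide's own construction: the helper names are made up here, coefficients are listed highest power first, the polynomial has degree at least 1 with simple roots, and the interval endpoints are not roots.

```python
from fractions import Fraction


def derivative(p):
    """Derivative of a polynomial given as coefficients, highest power first."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def remainder(a, b):
    """Remainder of the polynomial division a / b (exact, using Fractions)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                      # the leading term has been cancelled
    while a and a[0] == 0:            # drop any leading zeros that remain
        a.pop(0)
    return a


def sturm_chain(p):
    """p0 = p, p1 = p', p_{i+1} = -rem(p_{i-1}, p_i), until the remainder vanishes."""
    chain = [[Fraction(c) for c in p]]
    chain.append(derivative(chain[0]))
    while True:
        r = remainder(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])


def evaluate(p, x):
    value = Fraction(0)
    for c in p:                       # Horner's rule
        value = value * x + c
    return value


def sign_changes(chain, x):
    signs = [v > 0 for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))


def count_real_roots(p, a, b):
    chain = sturm_chain(p)
    return sign_changes(chain, Fraction(a)) - sign_changes(chain, Fraction(b))


f = [1, 0, 0, 0, -3, -1]              # x^5 - 3x - 1
print(count_real_roots(f, -2, 2))     # 3
```

Using `Fraction` keeps the chain exact, so the sign counts are not affected by rounding in the polynomial division.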


Sturm Theorem (cont)

• For roots with multiplicity:

– The theorem does not apply, but …

– Use the new equation f(x)/gcd(f(x), f'(x)) = 0

• All of its roots are simple

• Its roots are the same as those of f(x)

Sturm Chain by Maxima


Maxima (cont)


Descartes' Sign Rule

• A method of determining the maximum number of positive and negative real roots of a polynomial.

• For positive roots, start with the sign of the coefficient of the lowest power. Count the number of sign changes n as you proceed from the lowest to the highest power (ignoring powers which do not appear). Then n is the maximum number of positive roots.

• For negative roots, starting with a polynomial f(x), write a new polynomial f(-x) with the signs of all odd powers reversed, while leaving the signs of the even powers unchanged. Then proceed as before to count the number of sign changes n. Then n is the maximum number of negative roots.

• Slide example (figure): at most 3 positive roots and 4 negative roots
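A short Python sketch of the sign rule follows. The function names and the highest-power-first coefficient order are assumptions for this illustration; counting sign changes from high to low power gives the same number as from low to high.

```python
def sign_changes(coeffs):
    """Number of sign changes in a coefficient list, ignoring zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(s != t for s, t in zip(signs, signs[1:]))


def descartes_bounds(coeffs):
    """coeffs = [a_n, ..., a_0]. Returns (max positive roots, max negative roots)."""
    n = len(coeffs) - 1
    # f(-x): reverse the sign of every odd power, keep the even powers.
    flipped = [-c if (n - i) % 2 else c for i, c in enumerate(coeffs)]
    return sign_changes(coeffs), sign_changes(flipped)


# x^3 - 3x^2 - x + 9
print(descartes_bounds([1, -3, -1, 9]))  # (2, 1)
```

For this cubic the rule allows at most 2 positive roots and at most 1 negative root; a polynomial with one real root -1.5251 and a complex pair is consistent with these bounds.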


False Position Method

• x2 is defined as the intersection of the x-axis and the line through (x0, f0) and (x1, f1)

• Choose [x0,x2] or [x2,x1], whichever is non-trivial

• Continue in the same way as bisection

• Compare with bisection, where x2 = (x1+x0)/2

False Position (cont)

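Determine intersection point

• Using similar triangles (f0 and f1 have opposite signs):

$$\frac{x_2 - x_0}{f_0} = \frac{x_2 - x_1}{f_1}$$

$$x_2\left(\frac{1}{f_0} - \frac{1}{f_1}\right) = \frac{x_0}{f_0} - \frac{x_1}{f_1}$$

$$x_2\,\frac{f_1 - f_0}{f_0 f_1} = \frac{x_0}{f_0} - \frac{x_1}{f_1}$$

$$x_2 = \frac{1}{f_1 - f_0}\,(x_0 f_1 - x_1 f_0)$$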

False Position (cont)

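Alternatively, the straight line passing thru (x0,f0) and (x1,f1) is

$$y = f_0 + \frac{f_1 - f_0}{x_1 - x_0}\,(x - x_0)$$

Intersection: simply set y=0 to get x

$$0 = f_0 + \frac{f_1 - f_0}{x_1 - x_0}\,(x - x_0) \quad\Rightarrow\quad x = x_0 - \frac{x_1 - x_0}{f_1 - f_0}\,f_0$$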

Example

$$f(x) = 3x + \sin x - e^x = 0, \qquad x_0 = 0,\ x_1 = 1,\ f_0 f_1 < 0$$

k   xk (Bisection)   fk       xk (False Position)   fk
1   0.5                       0.471
2   0.25                      0.372
3   0.375                     0.362
4   0.3125                    0.360
5   0.34375          -0.042   0.360                 2.93×10^-5
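The false position column of the table can be reproduced with a small Python sketch. The function name `false_position` and its arguments are assumptions for this illustration; the update uses x2 = (x0 f1 - x1 f0)/(f1 - f0).

```python
import math


def false_position(f, x0, x1, n_steps):
    """Return the list of false position iterates for a non-trivial interval [x0, x1]."""
    f0, f1 = f(x0), f(x1)
    if f0 == 0 or f1 == 0 or (f0 > 0) == (f1 > 0):
        raise ValueError("[x0, x1] is not a non-trivial interval")
    iterates = []
    for _ in range(n_steps):
        x2 = (x0 * f1 - x1 * f0) / (f1 - f0)   # intersection with the x-axis
        f2 = f(x2)
        iterates.append(x2)
        if f2 == 0:
            break
        if (f2 > 0) == (f1 > 0):   # [x0, x2] is still non-trivial
            x1, f1 = x2, f2
        else:                      # [x2, x1] is still non-trivial
            x0, f0 = x2, f2
    return iterates


def g(x):
    return 3 * x + math.sin(x) - math.exp(x)


iterates = false_position(g, 0.0, 1.0, 5)
for k, xk in enumerate(iterates, start=1):
    print(k, round(xk, 3), g(xk))
```

On this example one endpoint stays fixed while the other moves toward the root, which is typical of false position on a function that is convex or concave over the interval.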


False Position

• Always better than bisection?


Newton's Method

Graphical Derivation

• Tangent line thru (x0, f0), with slope f'(x0) = f0':

$$y - f_0 = f_0'\,(x - x_0)$$

• Intersection with the x-axis (set y = 0):

$$-f_0 = f_0'\,(x_1 - x_0) \quad\Rightarrow\quad x_1 = x_0 - \frac{f_0}{f_0'}$$

• In general:

$$x_{k+1} = x_k - \frac{f_k}{f_k'}, \quad k = 0, 1, 2, \ldots$$

• Also known as Newton-Raphson method

Newton’s Method (cont)

• Derived using Taylor's expansion (ξ between x0 and x):

$$f(x) = f(x_0) + f'(x_0)(x - x_0) + \frac{f''(\xi)}{2}(x - x_0)^2$$

• If x is near x0 and f'' is not too large, then

$$\hat f(x) = f(x_0) + f'(x_0)(x - x_0)$$

is a good approximation of f(x)

Taylor’s Expansion (cont)

• Take the root of $\hat f(x) = 0$ as the next iterate for f(x) = 0

$$\hat f(x) = 0 \quad\Rightarrow\quad x_{k+1} = x_k - \frac{f_k}{f_k'}, \quad k = 0, 1, 2, \ldots$$

Example

• The ancient Babylonians used the following formula to compute the square root of a number a; explain why this works:

$$x_{k+1} = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)$$

• Finding the square root of 1 (a = 1) with x0 = 2:

k   xk                                   Error
0   2                                    1
1   (1/2)(2 + 1/2) = 1.25                2.5×10^-1
2   (1/2)(1.25 + 1/1.25) = 1.025         2.5×10^-2
3   (1/2)(1.025 + 1/1.025) = 1.00030     3×10^-4
4   1 + 4.7×10^-8                        4.7×10^-8

Newton’s Method

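• √a is one of the roots of f(x) = x^2 − a = 0

• Use Newton's Method to solve f(x) = 0, with f'(x) = 2x:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)} = x_k - \frac{x_k^2 - a}{2x_k} = x_k - \left(\frac{x_k}{2} - \frac{a}{2x_k}\right) = \cdots = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)$$

• For the square root of 1 (a = 1): f(x) = x^2 − 1 = 0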
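A minimal Python sketch of the Newton iteration, applied to f(x) = x^2 - a with a = 1 and x0 = 2, is given here for illustration. The function name `newton`, the stopping rule on |x_{k+1} - x_k| and the default tolerances are assumptions, not part of the slides.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Return the list of Newton iterates x0, x1, ... for f(x) = 0."""
    xs = [x0]
    for _ in range(max_iter):
        x = xs[-1]
        slope = df(x)
        if slope == 0:
            break                      # horizontal tangent: no intersection with the x-axis
        x_next = x - f(x) / slope
        xs.append(x_next)
        if abs(x_next - x) < tol:
            break
    return xs


a = 1.0
xs = newton(lambda x: x * x - a, lambda x: 2 * x, x0=2.0)
for k, xk in enumerate(xs):
    print(k, xk, abs(xk - 1.0))
```

The printed errors roughly square from one step to the next, and the first iterates agree with the square-root recurrence x_{k+1} = (x_k + a/x_k)/2.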


Definition

• Error of the ith iterate

$$\varepsilon_i = x_i - \alpha, \quad \text{where } \alpha \text{ is the converged value (i.e. } \lim_{i\to\infty} x_i = \alpha)$$

• Order of a method, m, satisfies

$$\lim_{k\to\infty} \frac{E_{k+1}}{E_k^{\,m}} = \text{constant}$$

where E_k is an error bound of ε_k

Linear Convergence of Bisection

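[Figure: nested intervals [a0, b0], [a1, b1], [a2, b2] around the root, with lengths L0, L1, L2]

$$L_0 = b_0 - a_0, \qquad L_1 = b_1 - a_1 = \frac{L_0}{2}, \qquad L_2 = b_2 - a_2 = \frac{L_1}{2}$$

• With [a0, b0], a reasonable approx. of the root is (a0 + b0)/2

• The error bound is

$$E_0 = \frac{b_0 - a_0}{2} = \frac{L_0}{2}$$

Linear Convergence of Bisection (cont)

$$E_0 = \frac{L_0}{2},\quad E_1 = \frac{L_1}{2},\quad \ldots,\quad E_k = \frac{L_k}{2}$$

$$\frac{E_{k+1}}{E_k} = \frac{L_{k+1}/2}{L_k/2} = \frac{1}{2}, \quad\text{or}\quad \lim_{k\to\infty}\frac{E_{k+1}}{E_k^{\,1}} = \frac{1}{2}$$

• We say the order of bisection method is one, or the method has linear convergence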

Quadratic Convergence of Newton

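• Let x* be the converged solution, and ε_k = x_k − x*

• Recall the Taylor expansion about x_k (ξ between x_k and x*):

$$0 = f(x^*) = f(x_k) - f'(x_k)\,\varepsilon_k + \frac{1}{2} f''(\xi)\,\varepsilon_k^2$$

$$f(x_k) = f'(x_k)\,\varepsilon_k - \frac{1}{2} f''(\xi)\,\varepsilon_k^2$$

• and the Newton iteration

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$

Quadratic Convergence of Newton (cont)

• Subtracting x*:

$$x_{k+1} - x^* = x_k - x^* - \frac{f(x_k)}{f'(x_k)}$$

$$\varepsilon_{k+1} = \varepsilon_k - \frac{f'(x_k)\,\varepsilon_k - \frac{1}{2} f''(\xi)\,\varepsilon_k^2}{f'(x_k)} = \frac{1}{2}\,\frac{f''(\xi)}{f'(x_k)}\,\varepsilon_k^2$$

$$\frac{\varepsilon_{k+1}}{\varepsilon_k^2} = \frac{1}{2}\,\frac{f''(\xi)}{f'(x_k)}$$

• The order is two; or we say Newton's method has quadratic convergence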

Example: Newton’s Method

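• f(x) = x^3 − 3x^2 − x + 9 = 0

• Recall thm: with a3, a2, a1, a0 = 1, −3, −1, 9,

$$|x| \le 1 + \frac{1}{1}\max(1, 3, 1, 9) = 10 \quad\Rightarrow\quad x \in [-10, 10]$$

• Choose x0 = 0, with

$$f(x) = x^3 - 3x^2 - x + 9, \qquad f'(x) = 3x^2 - 6x - 1$$

k    xk
0    0
1    9
2    6.41
…    …
46   -1.5250

• Worse than bisection !?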

Why?

• Plot f(x)

• Plot xk vs. k

[Figure: plots of f(x) and of xk versus k; the iterates end at the root -1.525]

[Figure: "Newton Iteration" chart of xk versus k, for k from 0 to 70]

Case 1:


Case 2:

Diverge to


Recall Quadratic Convergence of Newton’s

• The previous example showed the importance of initial guess x0

• If you have a good x0, will you always get quadratic convergence?

– The problem of multiple roots

$$\varepsilon_{k+1} = \frac{1}{2}\,\frac{f''(\xi)}{f'(x_k)}\,\varepsilon_k^2$$

Example

• f(x) = (x+1)^3 = 0

• Convergence is linear near multiple roots

Prove this!!
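For this particular f(x), one way to see the linear rate is to write out the Newton step directly. The short derivation here is a sketch for this single example, not a general proof:

```latex
\begin{align*}
f(x) &= (x+1)^3, \qquad f'(x) = 3(x+1)^2, \qquad x^* = -1 \\
x_{k+1} &= x_k - \frac{(x_k+1)^3}{3(x_k+1)^2} = x_k - \frac{x_k + 1}{3} \\
x_{k+1} + 1 &= \frac{2}{3}\,(x_k + 1)
\quad\Longrightarrow\quad
\frac{|x_{k+1} - x^*|}{|x_k - x^*|} = \frac{2}{3}
\end{align*}
```

The error shrinks by the constant factor 2/3 at every step, so the order is one.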


Multiple Root

• If x* is a root of f(x)=0, then (x-x*) can be factored out of f(x)

– f(x) = (x-x*) g(x)

• For multiple roots:

– f(x) = (x-x*)^k g(x)

– k>1 and g(x) has no factor of (x-x*)

Multiple Root (cont)

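$$f'(x) = k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x) = (x - x^*)^{k-1}\bigl[k\,g(x) + (x - x^*)\,g'(x)\bigr]$$

$$k = 1:\quad f'(x) = g(x) + (x - x^*)\,g'(x), \qquad f'(x^*) = g(x^*) \ne 0$$

$$k > 1:\quad f'(x^*) = (x^* - x^*)^{k-1}\bigl[\,\cdots\bigr] = 0$$

Implication: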

Remedies for Multiple Roots

$$x_{n+1} = x_n - k\,\frac{f(x_n)}{f'(x_n)}$$

• where k is the multiplicity of the root

• Get quadratic convergence!

• Problem: do not know k in advance!

Modified Newton’s Method

• f(x) has a root x* of multiplicity k:

$$f(x) = (x - x^*)^k g(x), \qquad f'(x) = k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x)$$

• Define a new function

$$F(x) = \frac{f(x)}{f'(x)}$$

and use Newton's method to find the root of F(x) = 0

• Check: the root of f(x) = 0 is also the root of F(x) = 0

Modified Newton's Method (cont)

$$F(x) = \frac{f(x)}{f'(x)} = \frac{(x - x^*)^k g(x)}{k(x - x^*)^{k-1} g(x) + (x - x^*)^k g'(x)} = \frac{(x - x^*)\,g(x)}{k\,g(x) + (x - x^*)\,g'(x)}$$

• F(x) = 0 has no multiple roots (hence, Newton converges quadratically near x*)
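The effect of switching from f to F = f/f' can be sketched in Python. Applying Newton to F gives the update x_{k+1} = x_k - f f' / (f'^2 - f f''), which needs the second derivative. The function names, the stopping rule on |f(x)| and the test function (x+1)^3 are assumptions chosen for this illustration.

```python
def newton(f, df, x0, tol=1e-12, max_iter=100):
    """Plain Newton; returns (x, number of steps taken)."""
    x = x0
    for k in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        x = x - fx / df(x)
    return x, max_iter


def modified_newton(f, df, d2f, x0, tol=1e-12, max_iter=100):
    """Newton applied to F = f/f'; returns (x, number of steps taken)."""
    x = x0
    for k in range(max_iter):
        fx, dfx = f(x), df(x)
        if abs(fx) < tol:
            return x, k
        denominator = dfx * dfx - fx * d2f(x)
        if denominator == 0:
            return x, k                # cannot take a step from here
        x = x - fx * dfx / denominator
    return x, max_iter


def f(x):
    return (x + 1) ** 3                # triple root at x = -1


def df(x):
    return 3 * (x + 1) ** 2


def d2f(x):
    return 6 * (x + 1)


x_plain, steps_plain = newton(f, df, 0.0)
x_mod, steps_mod = modified_newton(f, df, d2f, 0.0)
print(steps_plain, steps_mod)
```

For this f(x), F(x) = (x+1)/3 is linear, so the modified iteration lands on the root in a single step, while plain Newton needs many steps because its error only shrinks by 2/3 each time.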

Example

• f(x) = (x–1)^3 sin((x–1)^2)

Quasi-Newton’s Method

• Recall Newton:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$

• The denominator requires deriving f'(x) and extra coding

• The derivative might not be explicitly available (e.g., tabulated data)

• May be too time-consuming to compute in higher dimensions

Quasi-Newton (cont)

• Quasi:

$$x_{k+1} = x_k - \frac{f(x_k)}{g_k}$$

• where g_k is a good and easily computed approx. to f'(xk)

• The convergence rate is usually inferior to that of Newton's

Secant Method

– Use the slope of the secant to replace the slope of the tangent

– Need two points to start

$$g_k = \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$$

Or,

$$x_{k+1} = x_k - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}$$

Order: 1.62
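A minimal Python sketch of the secant update follows. The function name `secant`, the stopping rule and the test function x^2 - 2 are assumptions for this illustration.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Secant iteration: replace f'(x_k) by the slope through the last two points."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break                      # horizontal secant: cannot continue
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        x0, f0 = x1, f1
        x1, f1 = x2, f(x2)
        if abs(x1 - x0) < tol:
            break
    return x1


root = secant(lambda x: x * x - 2.0, 1.0, 2.0)
print(root)
```

Only one new function evaluation is needed per step, since f(x_{k-1}) is reused from the previous iteration.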

Idea:

• x2: intersection of the x-axis and the line interpolating (x0, f0) & (x1, f1)

• x3: intersection of the x-axis and the line interpolating (x1, f1) & (x2, f2)

• xk+1: intersection of the x-axis and the line interpolating (xk-1, fk-1) & (xk, fk)

[Figure: secant line through the points at x0 and x1, crossing the x-axis at x2]

Comparison

• Newton's method

• False Position

[Figure: one panel per method; the Newton panel marks xk, xk+1 and the tangent slope f '(xk)]

Secant vs. False Position

[Figure: two panels, labelled False Position and Secant]

Beyond Linear Approximations

• Both secant and Newton use linear approximations

• Higher order approximation yields better accuracy?

• Try to fit a quadratic polynomial g(x) thru the following three points: g(xi) = f(xi), i = k, k–1, k–2

• Let xk+1 be the root of g(x) = 0

– Could have two roots; choose the one near xk

• This is called Muller's Method

Muller's Method

• See Textbook

• g(x) passes through (xk-2, fk-2), (xk-1, fk-1), (xk, fk)

• Order: 1.84

Finding the Interpolating Quadratic Polynomial g(x)

$$g(x) = a_2 x^2 + a_1 x + a_0$$

$$g(x_{k-2}) = a_2 x_{k-2}^2 + a_1 x_{k-2} + a_0 = f_{k-2}$$

$$g(x_{k-1}) = a_2 x_{k-1}^2 + a_1 x_{k-1} + a_0 = f_{k-1}$$

$$g(x_k) = a_2 x_k^2 + a_1 x_k + a_0 = f_k$$

• 3 eqns to solve; unknowns: a2, a1, a0

Or,

$$g(x) = f_k + \frac{f_k - f_{k-1}}{x_k - x_{k-1}}\,(x - x_k) + \frac{1}{x_k - x_{k-2}}\left[\frac{f_k - f_{k-1}}{x_k - x_{k-1}} - \frac{f_{k-1} - f_{k-2}}{x_{k-1} - x_{k-2}}\right](x - x_k)(x - x_{k-1})$$

Double-check !
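The divided-difference form translates fairly directly into code. The sketch here is one common way to arrange Muller's method, not necessarily the textbook's version: the parabola is rewritten about x_k as c + b(x - x_k) + a(x - x_k)^2, complex arithmetic is used so that a negative discriminant does not stop the iteration, and the function name and stopping rule are assumptions.

```python
import cmath


def muller(f, x0, x1, x2, tol=1e-12, max_iter=50):
    """Muller's method; x0, x1, x2 play the roles of x_{k-2}, x_{k-1}, x_k."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        d1 = (f1 - f0) / (x1 - x0)          # first divided differences
        d2 = (f2 - f1) / (x2 - x1)
        a = (d2 - d1) / (x2 - x0)           # second divided difference
        b = d2 + a * (x2 - x1)              # slope of the parabola at x2
        c = f2
        disc = cmath.sqrt(b * b - 4 * a * c)
        # The larger-magnitude denominator gives the root of g closest to x2.
        den = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if den == 0:
            break
        x3 = x2 - 2 * c / den
        x0, x1, x2 = x1, x2, x3
        if abs(f(x2)) < tol:
            break
    return x2


root = muller(lambda x: x * x - 2.0, 0.0, 1.0, 2.0)
print(root)
```

Because the test function is itself a quadratic, the interpolating parabola coincides with it and the first step already lands on the root nearest x2.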


Summary

Iterative Methods for Solving f(x)=0

• Basic Idea:

– Local approximation + iterative computation

– At kth step, construct a polynomial p(x) of degree n, then solve p(x) = 0; take one of the roots as the next iterate, xk+1

• In other words,

– construct p(x)

– solve p(x) = 0; find the intersection between y=p(x) and x-axis

– choose one root

Revisit Newton

$$p(x) = f(x_k) + f'(x_k)(x - x_k)$$

• xk+1 is the root of p(x) = 0:

$$x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$$

• p(x) is a linear approximation passing thru (xk, fk) with the slope fk'

[Figure: tangent at xk crossing the x-axis at xk+1]

Revisit Secant

$$p(x) = f_k + \frac{f_k - f_{k-1}}{x_k - x_{k-1}}\,(x - x_k)$$

• p(x) is a linear approximation passing thru (xk-1, fk-1) and (xk, fk) with the secant slope

[Figure: secant line through the points at xk-1 and xk]

Revisit Muller

• p(x) is a parabola (2nd degree approximation) passing thru three points

• Heuristic: choose the root that is closer to the previous iterate

• Potential problem:

– No solution (parabola and x-axis do not intersect!)

Categorize by Starting Condition

• Bisection and False Position

– Require non-trivial interval [a,b]

– Convergence guaranteed

• Newton: one point

– x0 → x1 → …

• Secant: two points

– x0, x1 → x2 → …

• Muller: three points

– x0, x1, x2 → x3 → …

• Newton, Secant and Muller converge faster but can diverge …

A Slightly Different Method: Inverse Interpolation

• Basic Idea (still the same)

– Local approximation + iterative computation

• Method:

– At kth step, construct a polynomial x = g(y) of degree n; then compute the next iterate by setting y = 0:

$$x_{k+1} = g(y = 0)$$

Inverse Linear Interpolation

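• Secant: inverse linear interpolation

[Figure: points (xk-1, fk-1), (xk, fk) and (xk+1, fk+1) in the x-y plane]

$$y - f_k = \frac{f_k - f_{k-1}}{x_k - x_{k-1}}\,(x - x_k)$$

$$x = x_k + \frac{x_k - x_{k-1}}{f_k - f_{k-1}}\,(y - f_k) = g(y)$$

$$x_{k+1} = g(0) = x_k - \frac{x_k - x_{k-1}}{f_k - f_{k-1}}\,f_k$$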

Inverse Quadratic Interpolation

• Find another parabola: x = g(y)

• Set the next iterate xk+1 = g(0)

$$g(y) = x_k + \frac{x_k - x_{k-1}}{f_k - f_{k-1}}\,(y - f_k) + \frac{1}{f_k - f_{k-2}}\left[\frac{x_k - x_{k-1}}{f_k - f_{k-1}} - \frac{x_{k-1} - x_{k-2}}{f_{k-1} - f_{k-2}}\right](y - f_k)(y - f_{k-1})$$

Example (IQI)

• Solve f(x) = x^3 – x = 0

– x0 = 2, x1 = 1.2, x2 = 0.5

k  xk+1             xk-2      fk-2      xk-1      fk-1      xk        fk
1  0.8102335181319  2         6         1.2       0.528     0.5       -0.375
2  1.3884057934643  1.2       0.528     0.5       -0.375    0.810234  -0.278333
3  1.5252259894989  0.5       -0.375    0.810234  -0.27833  1.388406  1.287983
4  0.9414762279683  0.810234  -0.27833  1.388406  1.287983  1.525226  2.022929
5  0.9844320642825  1.388406  1.287983  1.525226  2.022929  0.941476  -0.106973
6  1.0010409722070  1.525226  2.022929  0.941476  -0.10697  0.984432  -0.030413
7  1.0000043435326  0.941476  -0.10697  0.984432  -0.03041  1.001041  0.002085
8  0.9999999997110  0.984432  -0.03041  1.001041  0.002085  1.000004  8.69E-06
9  1.0000000000000  1.001041  0.002085  1.000004  8.69E-06  1         -5.78E-10
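The table can be reproduced with a short Python sketch of inverse quadratic interpolation, using the divided-difference form of x = g(y) evaluated at y = 0. The function names `iqi_step` and `iqi`, and the stopping rule on |f|, are assumptions for this illustration.

```python
def iqi_step(x0, f0, x1, f1, x2, f2):
    """g(0) for the parabola x = g(y) through (f0, x0), (f1, x1), (f2, x2).

    x0, x1, x2 play the roles of x_{k-2}, x_{k-1}, x_k.
    """
    s1 = (x2 - x1) / (f2 - f1)
    s0 = (x1 - x0) / (f1 - f0)
    second = (s1 - s0) / (f2 - f0)
    return x2 + s1 * (0 - f2) + second * (0 - f2) * (0 - f1)


def iqi(f, x0, x1, x2, tol=1e-9, max_iter=50):
    """Repeat the step, always keeping the three most recent points."""
    f0, f1, f2 = f(x0), f(x1), f(x2)
    for _ in range(max_iter):
        if abs(f2) < tol:
            break
        x3 = iqi_step(x0, f0, x1, f1, x2, f2)
        x0, f0, x1, f1 = x1, f1, x2, f2
        x2, f2 = x3, f(x3)
    return x2


first = iqi_step(2.0, 6.0, 1.2, 0.528, 0.5, -0.375)
root = iqi(lambda x: x**3 - x, 2.0, 1.2, 0.5)
print(first, root)
```

The step divides by differences of function values, so it breaks down if two of the three f values coincide; stopping once |f| is small avoids that situation here.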


Professional Root Finder

• Need guaranteed convergence and high convergence rate

• Combine bisection and Newton (or inverse quadratic interpolation)

– Perform a Newton step whenever possible (i.e., while it is converging)

– If it diverges, switch to bisection
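One simple way to combine the two is sketched here in Python. This is an illustrative safeguard only, not Brent's method or any library routine: a bracket with a sign change is maintained, a Newton step is accepted only if it stays inside the bracket, and bisection is used otherwise. The function name, tolerances and test polynomial are assumptions.

```python
def newton_with_bisection(f, df, a, b, tol=1e-10, max_iter=200):
    """Newton iteration safeguarded by a non-trivial interval [a, b] (a < b)."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa > 0) == (fb > 0):
        raise ValueError("[a, b] is not a non-trivial interval")
    x = (a + b) / 2
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol or (b - a) < tol:
            break
        # Keep the part of the bracket that still contains a sign change.
        if (fx > 0) == (fb > 0):
            b, fb = x, fx
        else:
            a, fa = x, fx
        slope = df(x)
        x_newton = x - fx / slope if slope != 0 else None
        if x_newton is not None and a < x_newton < b:
            x = x_newton               # Newton step stays inside the bracket
        else:
            x = (a + b) / 2            # otherwise fall back to bisection
    return x


def cubic(x):
    return x**3 - 3 * x**2 - x + 9


def cubic_prime(x):
    return 3 * x**2 - 6 * x - 1


root = newton_with_bisection(cubic, cubic_prime, -10.0, 10.0)
print(root)
```

Starting from the midpoint x = 0, the first Newton step would jump to x = 9, outside the bracket, so the sketch bisects instead; after that the Newton steps stay inside and converge quickly.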


Brent’s Method

• Guaranteed to converge

• Combines root bracketing, bisection and inverse quadratic interpolation

– van Wijngaarden-Dekker-Brent method

– zbrent in NR (Numerical Recipes)

• Brent uses a similar idea for one-dimensional optimization problems

– brent in NR

GSL Library


http://www.gnu.org/software/gsl/manual/html_node/One-dimensional-Root_002dFinding.html


Fast Inverse Square Root

To understand the magic number 0x5f3759df, read: Chris Lomont or Paul Hsieh

http://www.lomont.org/Math/Papers/2003/InvSqrt.pdf

http://www.mceniry.net/papers/Fast%20Inverse%20Square%20Root.pdf
