
Page 1: Introduction to Function Minimization

Introduction to Function Minimization

Page 2: Introduction to Function Minimization

Motivation example

Data on the height of a group of 10000 people, men and women.

Gender was not recorded: it is not known who was a man and who was a woman.

Can one estimate number of men in the group?

Asymmetric histogram

Non-Gaussian: two subgroups (men & women)? Two superposed Gaussians?

______________________
This is artificially simulated data, just a demo: two Gaussians with different means, mixed randomly with men/women = 7/3.

Page 3: Introduction to Function Minimization

See the error bars!

[Figure: height histogram with error bars; vertical axis N]

Page 4: Introduction to Function Minimization

Best Gaussian fit

Page 5: Introduction to Function Minimization

Two Gaussians: best fit

This is artificially simulated data, just a demo; two Gaussians with different means were used for the simulation, mixed randomly with men/women = 7/3.

Page 6: Introduction to Function Minimization

Page 7: Introduction to Function Minimization

Page 8: Introduction to Function Minimization

Press-the-button user: find the best fit by two Gaussians. But: how is it done?

Gaussian:

$$\mathrm{N}(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

Two Gaussians

$$\mathrm{F}(x;p_0,p_1,p_2,p_3,p_4,p_5) = p_0\,\mathrm{N}(x;p_1,p_2) + p_3\,\mathrm{N}(x;p_4,p_5)$$

Find the best values of the parameters

Needs goodness-of-fit criterion

Fit function is nonlinear in its parameters
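A minimal C++ sketch of this two-Gaussian model (the helper names gauss and model are illustrative, not part of the original slides):

#include <cmath>

const double PI = 3.14159265358979323846;

// N(x; mu, sigma): Gaussian density
double gauss(double x, double mu, double sigma) {
    const double z = (x - mu) / sigma;
    return std::exp(-0.5 * z * z) / (std::sqrt(2.0 * PI) * sigma);
}

// F(x; p0..p5) = p0 * N(x; p1, p2) + p3 * N(x; p4, p5)
double model(double x, const double p[6]) {
    return p[0] * gauss(x, p[1], p[2]) + p[3] * gauss(x, p[4], p[5]);
}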

Page 9: Introduction to Function Minimization

50 histogram bins; each bin represents "one data point":

$x_j$ : $j$-th bin position, $h_j$ : bin content, $\sigma_j$ : its error

Goodness-of-fit (least squares):

$$\chi^2(p_0,\ldots,p_5) = \sum_{j=1}^{50} \frac{\bigl(h_j - \mathrm{F}(x_j;p_0,\ldots,p_5)\bigr)^2}{\sigma_j^2}$$
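A direct C++ transcription of this least-squares criterion might look like the following sketch (the function-pointer interface and names are an illustrative choice):

#include <cstddef>
#include <vector>

// chi^2(p0..p5) = sum over bins of (h_j - F(x_j; p))^2 / sigma_j^2
double chi2(const std::vector<double>& x,      // bin positions x_j
            const std::vector<double>& h,      // bin contents h_j
            const std::vector<double>& sigma,  // bin errors sigma_j
            const double p[6],
            double (*F)(double, const double*)) {
    double sum = 0.0;
    for (std::size_t j = 0; j < x.size(); ++j) {
        const double r = (h[j] - F(x[j], p)) / sigma[j];
        sum += r * r;
    }
    return sum;
}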

Page 10: Introduction to Function Minimization

Mathematically, the problem is the following

I have a function of n variables (the parameters of the goodness-of-fit criterion):

$$\chi^2(p_0,p_1,p_2,p_3,p_4,p_5)$$

I want to find the point $(p_0,p_1,p_2,p_3,p_4,p_5)$ for which the function achieves its minimum.

The function is often nonlinear, so an analytical solution of the problem is usually hopeless. I need numerical methods to attack the problem.

Page 11: Introduction to Function Minimization

Numerically, the problem is the following

I have a function (in the sense of a program subroutine) of n parameters:

$$\mathrm{FCN}(p_0,p_1,p_2,p_3,p_4,p_5)$$

I want to find the point $(p_0,p_1,p_2,p_3,p_4,p_5)$ for which the function achieves its minimum. Each call to evaluate the function value is often time-consuming, having in mind a definition like

$$\mathrm{FCN}(p_0,\ldots,p_5) = \sum_{j=1}^{50} \frac{\bigl(h_j - \mathrm{F}(x_j;p_0,\ldots,p_5)\bigr)^2}{\sigma_j^2}$$

I need a numerical procedure which can call the function FCN and, by repeating calls with different values of the parameters, finally find the minimizing set of parameter values.

Page 12: Introduction to Function Minimization

Stepping algorithms

• Start at some point in the parameter space
• Choose direction and step size
• Walk according to some clever strategy, doing iterative steps and looking for small values of the minimized function

Page 13: Introduction to Function Minimization

One-dimensional optimization problem

Stepping algorithm with adaptable step size

fcurrent = FCN(xcurrent);

repeat forever {
    xtrial = xcurrent + step;
    ftrial = FCN(xtrial);
    if (ftrial < fcurrent)    // success
        { xcurrent = xtrial; fcurrent = ftrial; step = 3*step; }
    else                      // failure
        { step = -0.4*step; }
}

Fast approach to the region of the minimum, slow convergence at the end.
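A complete, runnable C++ version of this stepping algorithm is sketched below; an end criterion (discussed on page 19) is added so the loop terminates, and the tolerance and demo function are illustrative assumptions:

#include <cmath>
#include <cstdio>

// Success-failure minimization in one dimension with adaptive step size.
// End criterion: stop when |step| falls below a tolerance.
double minimize1d(double (*FCN)(double), double x, double step, double tol) {
    double f = FCN(x);
    while (std::fabs(step) > tol) {
        const double xtrial = x + step;
        const double ftrial = FCN(xtrial);
        if (ftrial < f) {            // success: accept point, enlarge step
            x = xtrial;
            f = ftrial;
            step *= 3.0;
        } else {                     // failure: reverse and shrink step
            step *= -0.4;
        }
    }
    return x;
}

int main() {
    // demo: the minimum of (x - 2)^2 is at x = 2
    const double xmin = minimize1d(
        [](double x) { return (x - 2.0) * (x - 2.0); }, 0.0, 0.5, 1e-9);
    std::printf("xmin = %.6f\n", xmin);
    return 0;
}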

Page 14: Introduction to Function Minimization
Page 15: Introduction to Function Minimization
Page 16: Introduction to Function Minimization

Success-failure method

[Figure: sampled points of the success-failure method, with a fitted parabola]

Page 17: Introduction to Function Minimization

[Figure: line and parabola fitted through the sampled points]

line ............ estimates the first derivative (gradient)

parabola ...... estimates the first as well as the second derivative

Page 18: Introduction to Function Minimization

A stepping method can estimate the gradient as well as the second derivative.

Around the minimum, all smooth functions look like a parabola.

Newton: go straight to the minimum.

$$f(x) \approx f(x_0) + g\,(x - x_0) + \tfrac{1}{2}\,G\,(x - x_0)^2$$

$$f'(x_{\min}) = g + G\,(x_{\min} - x_0) = 0$$

$$x_{\min} = x_0 - G^{-1}\,g$$

The inverse of the second derivative makes it possible to jump straight to the minimum, in the direction of the negative gradient.
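A sketch of one such Newton step in one dimension, estimating both derivatives from a parabola through three nearby points (the function name and the probe distance d are illustrative):

// One 1D Newton step: estimate g and G from a parabola through three
// nearby points (x - d, x, x + d) and jump to x - g/G.
double newton_step(double (*FCN)(double), double x, double d) {
    const double fm = FCN(x - d);
    const double f0 = FCN(x);
    const double fp = FCN(x + d);
    const double g = (fp - fm) / (2.0 * d);           // first derivative
    const double G = (fp - 2.0 * f0 + fm) / (d * d);  // second derivative
    return x - g / G;   // x_min = x - G^{-1} g, exact for a parabola
}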

Page 19: Introduction to Function Minimization

Any stepping method needs

• starting point
• initial step size
• end criterion (otherwise infinite loop), e.g.
    • step size < δ
    • improvement in the function values < ε

Page 20: Introduction to Function Minimization

Problem of local and/or boundary minima

Page 21: Introduction to Function Minimization

Many-dimensional minimization

$$\mathrm{FCN}(p_0, p_1)$$

• start at point (p0,p1)

• fix value p1

• perform minimization with respect to p0

• fix value p0

• perform minimization with respect to p1, and repeat (a sketch of this strategy follows below)
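A minimal C++ sketch of this alternating strategy, reusing a success-failure minimizer as the one-dimensional building block (all names, step sizes, and tolerances are illustrative assumptions):

#include <cmath>
#include <functional>

// Success-failure minimizer in one dimension (as sketched earlier).
static double min1d(const std::function<double(double)>& f,
                    double x, double step = 0.1, double tol = 1e-7) {
    double fx = f(x);
    while (std::fabs(step) > tol) {
        const double xt = x + step;
        const double ft = f(xt);
        if (ft < fx) { x = xt; fx = ft; step *= 3.0; }
        else         { step *= -0.4; }
    }
    return x;
}

// Alternating one-dimensional minimization of FCN(p0, p1).
void minimize2d(double (*FCN)(double, double),
                double& p0, double& p1, int iterations) {
    for (int it = 0; it < iterations; ++it) {
        p0 = min1d([&](double a) { return FCN(a, p1); }, p0);  // p1 fixed
        p1 = min1d([&](double b) { return FCN(p0, b); }, p1);  // p0 fixed
    }
}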

Page 22: Introduction to Function Minimization

The axes point in the wrong directions: not in the direction of the gradient.

Cure: rotate the axes after each iteration so that the first axis points in the direction of the (estimated) gradient.

Page 23: Introduction to Function Minimization
Page 24: Introduction to Function Minimization

Simplex minimization

Get rid of the worst point: reflect it along the estimated gradient direction to obtain a trial point.

Page 25: Introduction to Function Minimization
Page 26: Introduction to Function Minimization

Gradient methods

Stepping methods which use, in addition to the function value at the current point, also the local gradient at that point.

The local gradient can be obtained

• by calling a user-supplied procedure which returns the vector (n-dimensional array) of first order derivatives

• by estimating the gradient numerically, evaluating the function value at n points in a very small neighborhood of the current point

Performing one-dimensional minimization in the direction of the negative gradient significantly improves the current point position
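A minimal sketch of the numerical gradient estimate described above, using forward differences (the function name and the step size eps are illustrative):

#include <cstddef>
#include <functional>
#include <vector>

// Numerical gradient: evaluate FCN at n points in a very small
// neighborhood of p and form forward-difference quotients.
std::vector<double> numericalGradient(
        const std::function<double(const std::vector<double>&)>& FCN,
        std::vector<double> p, double eps = 1e-6) {
    const double f0 = FCN(p);
    std::vector<double> g(p.size());
    for (std::size_t i = 0; i < p.size(); ++i) {
        p[i] += eps;
        g[i] = (FCN(p) - f0) / eps;   // estimate of dF/dp_i
        p[i] -= eps;                  // restore the point
    }
    return g;
}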

Page 27: Introduction to Function Minimization

Quadratic approximation to the minimized function at the current point p0 in n dimensions

$$\mathrm{F}(p) \approx f_0 + \sum_i g_i\,(p_i - p_{0,i}) + \tfrac{1}{2}\sum_{i,j}(p_i - p_{0,i})\,G_{ij}\,(p_j - p_{0,j})$$

Knowing $g$, one can perform a one-dimensional optimization with respect to the "step size" $\alpha$ in search of the best point

$$p_i = p_{0,i} - \alpha\, g_i$$

If $G_0$ were known, the minimum would be at

$$p_i = p_{0,i} - \sum_j (G_0^{-1})_{ij}\, g_j$$

It would be useful if the gradient method could also estimate the matrix $G^{-1}$.

Page 28: Introduction to Function Minimization

If the minimized function is exactly quadratic, then G is constant in the whole parameter space

$$\mathrm{F}(p) \approx f_0 + \sum_i g_i\,(p_i - p_{0,i}) + \tfrac{1}{2}\sum_{i,j}(p_i - p_{0,i})\,G_{ij}\,(p_j - p_{0,j})$$

If the minimized function is not exactly quadratic, we expect slow variations of G in the region not far from the minimum

A local numerical estimate of $G$ is costly: it needs many calls to F, and then a matrix inversion is needed.

Idea: can one iteratively estimate $G^{-1}$?

Page 29: Introduction to Function Minimization

Example of a variable metric method

Current point: $p_0$, gradient $g_0$, estimated $(G^{-1})$.

Go to

$$p_i = p_{0,i} - \alpha \sum_j (G^{-1})_{ij}\, g_{0,j}$$

for the optimized value of the step size $\alpha$.

Evaluate at the new point $p$: the gradient $g$.

Calculate

$$\delta_i = p_i - p_{0,i}, \qquad \gamma_i = g_i - g_{0,i}$$

Iterate, updating the estimate of $G^{-1}$:

$$(G^{-1})_{ij} \;\leftarrow\; (G^{-1})_{ij} + \frac{\delta_i\,\delta_j}{\sum_k \delta_k\,\gamma_k} - \frac{\Bigl(\sum_k (G^{-1})_{ik}\,\gamma_k\Bigr)\Bigl(\sum_l (G^{-1})_{jl}\,\gamma_l\Bigr)}{\sum_{k,l}\,\gamma_k\,(G^{-1})_{kl}\,\gamma_l}$$

For quadratic functions the iteration converges to the minimum and to the true $G^{-1}$.
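The rank-two update reconstructed above matches the classic Davidon-Fletcher-Powell formula; a minimal C++ sketch of one such update, assuming dense vectors and matrices (all names are illustrative):

#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// One update of the inverse-Hessian estimate H (H ~ G^{-1}) from the
// step delta = p - p0 and the gradient change gamma = g - g0.
void updateInverseHessian(Mat& H, const Vec& delta, const Vec& gamma) {
    const std::size_t n = delta.size();
    Vec Hg(n, 0.0);             // Hg = H * gamma
    double gHg = 0.0;           // gamma^T H gamma
    double dg  = 0.0;           // delta^T gamma
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) Hg[i] += H[i][j] * gamma[j];
        gHg += gamma[i] * Hg[i];
        dg  += delta[i] * gamma[i];
    }
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            H[i][j] += delta[i] * delta[j] / dg - Hg[i] * Hg[j] / gHg;
}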

Page 30: Introduction to Function Minimization

Variable metric methods

• fast convergence in the almost quadratic region around minimum

• added value: good knowledge of $G^{-1}$ at the minimum, which means that the shape of the optimized function around the minimum is known

Page 31: Introduction to Function Minimization

MINUIT
Author: F. James (CERN)

Complex minimization program (package) comprising various minimization procedures as well as other useful data-fitting tools

Among them

• SIMPLEX

• MIGRAD (variable metric method)

Page 32: Introduction to Function Minimization

MINUIT
Originally a FORTRAN program in the CERN library

Now available in C++

• stand alone (SEAL project)

http://seal.web.cern.ch/seal/MathLibs/Minuit2/html/index.html

• contained in the CERN ROOT package

http://root.cern.ch

Available in Java (FreeHEP JAIDA project)

http://java.freehep.org/freehep-jminuit/index.html
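For illustration, a minimal sketch of driving such a fit through ROOT's classic TMinuit interface follows; the parameter names, starting values, and the chi2 helper are illustrative assumptions, not part of the original slides:

#include "TMinuit.h"

double chi2(const Double_t* p);   // hypothetical: chi^2 of the two-Gaussian fit

// User-supplied FCN in the form TMinuit expects: it receives the current
// parameter values and returns the function value through fval.
void fcn(Int_t& npar, Double_t* grad, Double_t& fval, Double_t* p, Int_t iflag) {
    fval = chi2(p);
}

void fit() {
    TMinuit minuit(6);                // six parameters p0..p5
    minuit.SetFCN(fcn);
    // index, name, start value, initial step, lower/upper limit (0, 0 = none)
    minuit.DefineParameter(0, "norm1",  7000.0, 10.0, 0.0, 0.0);  // illustrative
    minuit.DefineParameter(1, "mean1",   180.0,  1.0, 0.0, 0.0);
    minuit.DefineParameter(2, "sigma1",    7.0,  0.1, 0.0, 0.0);
    minuit.DefineParameter(3, "norm2",  3000.0, 10.0, 0.0, 0.0);
    minuit.DefineParameter(4, "mean2",   167.0,  1.0, 0.0, 0.0);
    minuit.DefineParameter(5, "sigma2",    7.0,  0.1, 0.0, 0.0);
    minuit.Migrad();                  // variable metric minimization
}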

Page 33: Introduction to Function Minimization

References

F. James and M. Winkler, C++ MINUIT User's Guide: http://seal.cern.ch/documents/minuit/mnusersguide.pdf

F. James, Minuit Tutorial on Function Minimization: http://seal.cern.ch/documents/minuit/mntutorial.pdf

F. James, The Interpretation of Errors in Minuit: http://seal.cern.ch/documents/minuit/mnerror.pdf

Microsoft Visual C++ Express is a free C++ for Windows: http://www.microsoft.com/express/vc/