NASSP Masters 5003F - Computational Astronomy - 2010
Lecture 5
• Minimization continued
• Interpolation & regridding.
NASSP Masters 5003F - Computational Astronomy - 2009
Minimization issues:
1. Robust setting of starting bounds: how to make sure the minimum is within the box.
• Once you know that, you can, in the words of Press et al, “hunt it down like a scared rabbit.”
• For only 1 parameter, there are algorithms to find a bracket which must contain the minimum (Press et al ch. 10.1).
• This is more difficult to arrange in more than 1 dimension because one then has to deal with a bounding surface, possibly of complicated shape, rather than 2 scalar values.
2. Speed – if you have to find a series of minima in multi-dimensional functions, a good choice of algorithm (and its settings) can give a valuable speed dividend.
• Eg the XMM-Newton source-fitting task – fits 7 or more parameters per source – it’s one of the slowest tasks in the pipeline. It could have been a lot worse, though, with a poor choice of algorithm (it uses Levenberg-Marquardt).
Minimization issues continued:
3. Constraints.
4. Stability.
• There is often a trade-off between speed and stability.
5. When to stop – ie convergence tests.
• Basically the time to stop is when the step size per iteration is smaller than the dimensions of the uncertainty ellipsoid.
6. Is it a local minimum or a global one?
• Many algorithms will be perfectly happy stopping in a tiny local dip in the function.
• Extra intelligence is needed to hunt around and make sure your minimum is the best in the neighbourhood. Annealing algorithms are one such approach.
Minimization with 1 parameter
• Can ∂U/∂θ = 0 be directly inverted?
– Yes: do so. QED.
– No: can U’ and U’’ be evaluated? If yes, use Newton-Raphson (find the root of U’); if no, use Brent’s method.
1-parameter methods:
• Newton-Raphson method (Press et al ch 9.4).
– Finding a minimum in U is the same as finding a root of U’.
• Brent’s method (Press et al ch 10.2). This consists of:
– Parabolic fit to three samples of U;
– Golden-section search when the parabolic fit is not stable.
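The golden-section half of Brent’s method can be sketched as follows. This is a minimal illustration, assuming the minimum is already bracketed by [a, b] and that U has a single minimum there; the function name `golden_section_minimize` and its tolerance arguments are made up for this example, and the parabolic-fit step is left out.

```python
import math


def golden_section_minimize(f, a, b, tol=1e-8, max_iter=200):
    """Shrink the bracket [a, b] around the minimum of a unimodal function f."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # 1/golden ratio, about 0.618
    # Two interior points placed symmetrically inside the bracket.
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if abs(b - a) < tol:
            break
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)
```

For instance, `golden_section_minimize(lambda t: (t - 2.0) ** 2 + 1.0, 0.0, 5.0)` returns a value close to 2. Only one new function evaluation is needed per iteration, because one of the two interior points is always reused.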
An example:
• A problem in x-ray source detection gave rise to the following requirement:
– Find θ that minimizes
$$U = \theta \sum_{i,j}^{\mathrm{Pixels}} s_{i,j} - \sum_{k}^{\mathrm{Events}} \log\left(\frac{b_k + \theta s_k}{b_k}\right)$$
subject to the constraint that
$$b_k + \theta s_k > 0$$
for all k. The b_k are known background values and s_k ≡ s_{i,j} are known PSF values. See I Stewart (yes, me) A&A (2009) for a detailed description of the problem.
continued...
• The minimum is the place at which ∂U/∂θ = 0. Ie, θ at the minimum is the solution of the equation
$$\sum_{k}^{\mathrm{Events}} \frac{1}{\theta + b_k/s_k} = \sum_{i,j}^{\mathrm{Pixels}} s_{i,j}.$$
This can’t (I don’t think) be solved ‘in closed form’ but it is easy to solve via Newton’s method.
– Note there are in general MANY solutions of the equation (see how many roots there are in the following graph!) but only one (at most) which satisfies the constraint.
A diagrammatic example:
[Figure: $y = \sum_{k}^{\mathrm{Events}} 1/(\theta + b_k/s_k)$ plotted against θ, with the solution θmin marked.]
A trick to help the solution:
• Newton’s method works best if the function doesn’t curve too wildly. The present hyperbola-like function is not very promising material.
• But! If we define a coordinate transform
$$x = \frac{1}{\theta + (b/s)_{\min}}$$
• The result is much more docile:
$$\sum_{k}^{\mathrm{Events}} \frac{x}{1 + x\left[b_k/s_k - (b/s)_{\min}\right]} = \sum_{i,j}^{\mathrm{Pixels}} s_{i,j}.$$
Bounds and special cases:
• A little thought shows that θ is bounded as follows:
$$\frac{1}{\sum_{i,j} s_{i,j}} - \left(\frac{b}{s}\right)_{\min} \le \theta \le \frac{N_{\mathrm{events}}}{\sum_{i,j} s_{i,j}} - \left(\frac{b}{s}\right)_{\min}$$
• Either bound can be taken as a starting value.
• Special cases:
– If there are no events, there is no solution to the equation: from the physics of the problem one can deduce that θ = −(b/s)min.
– If there is just one event, the equation is trivially solvable.
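A minimal sketch of how this example might be solved with Newton’s method. It assumes the condition to be solved is Σ_k 1/(θ + b_k/s_k) = Σ s_{i,j}, uses the substitution x = 1/(θ + (b/s)min), and starts from x = Σ s_{i,j}/N_events, which corresponds to the upper bound on θ. The function name `solve_theta` and the argument `psf_sum` (standing for Σ s_{i,j}) are invented for this illustration and are not taken from the paper.

```python
def solve_theta(b, s, psf_sum, tol=1e-12, max_iter=50):
    """Solve sum_k 1/(theta + b_k/s_k) = psf_sum for theta > -(b/s)_min.

    b, s    : background and PSF values at the event positions
    psf_sum : sum of the PSF values over all pixels (an input assumption)
    """
    ratios = [bk / sk for bk, sk in zip(b, s)]
    n_events = len(ratios)
    if n_events == 0:
        raise ValueError("no events: the equation has no solution")
    r_min = min(ratios)
    d = [r - r_min for r in ratios]  # all >= 0, and zero for one event

    # In terms of x = 1/(theta + r_min) the function
    #   g(x) = sum_k x / (1 + x*d_k) - psf_sum
    # is increasing and concave for x > 0, so Newton's method started at
    # x = psf_sum / n_events (where g <= 0) approaches the root from below.
    x = psf_sum / n_events
    for _ in range(max_iter):
        g = sum(x / (1.0 + x * dk) for dk in d) - psf_sum
        dg = sum(1.0 / (1.0 + x * dk) ** 2 for dk in d)
        step = g / dg
        x -= step
        if abs(step) <= tol * abs(x):
            break
    return 1.0 / x - r_min
```

With a single event the starting value is already the root, so the loop exits after one pass, which matches the remark that this case is trivially solvable.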
Minimization with >1 parameter
• The basic idea is to
1. Measure the gradient,
2. then head down-hill.
You’re bound to hit at least a local minimum eventually.
• But how far should you go at each step, before stopping to sample U and ∇U again?
– Suppose you find you have gone wayyy too far - have climbed through a gully and higher up the opposite wall than the height you started?
– Or suppose you have only crept a few inches down a slope which seems to extend for miles?
• Solution: treat each linear advance like a 1-D minimization.
Steepest descent
• This method of steepest descent is pretty fool-proof – it will generally get there in the end.
• It isn’t very efficient though - you can find yourself engaged in time-consuming ricochets back and forth across a gully.
[Figure: from Press et al, “Numerical Recipes”]
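Steepest descent with each advance treated as a 1-D minimization can be sketched as below. This is an illustration only: the helper names (`golden`, `steepest_descent`), the bracket-doubling rule and the stopping threshold are assumptions made for the example, and the line search is a plain golden-section search rather than Brent’s method.

```python
import math


def golden(phi, a, b, n_iter=80):
    """Golden-section search for the minimum of phi on [a, b]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - r * (b - a), a + r * (b - a)
    fc, fd = phi(c), phi(d)
    for _ in range(n_iter):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - r * (b - a)
            fc = phi(c)
        else:
            a, c, fc = c, d, fd
            d = a + r * (b - a)
            fd = phi(d)
    return 0.5 * (a + b)


def steepest_descent(U, grad, theta, gtol=1e-8, max_iter=1000):
    """Repeat: measure the gradient, then minimize U along the downhill line."""
    theta = list(theta)
    for iteration in range(max_iter):
        g = grad(theta)
        if math.sqrt(sum(gi * gi for gi in g)) < gtol:
            return theta, iteration

        def phi(t):
            # U along the downhill line theta - t*g, for t >= 0.
            return U([p - t * gi for p, gi in zip(theta, g)])

        # Double the step until U stops decreasing, so that [0, 2t]
        # brackets the minimum along the line.
        t = 1.0
        while phi(2.0 * t) < phi(t) and t < 1e6:
            t *= 2.0
        t_best = golden(phi, 0.0, 2.0 * t)
        theta = [p - t_best * gi for p, gi in zip(theta, g)]
    return theta, max_iter
```

Running it on an elongated quadratic valley such as U = (x − 1)² + 10(y + 2)² and inspecting the returned iteration count gives a feel for the ricochets mentioned above.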
Powell’s solution to this:
(i) Cycle through each of a set of N directions, finding the minimum in each direction.
(ii) Discard the direction with the longest step, and replace it with the direction of the vector sum.
[Figure: the new direction.]
Levenberg method:
• It’s a bit like Newton’s method – it keeps trying until it converges.
• The problem of course is that, like Newton’s method, it may not converge.
• Marquardt’s modification is extremely cunning. By means of a single ‘gain’ variable, his algorithm veers between a rapid but undependable Levenberg search and the slower but reliable Steepest Descent.
– You get both the hare and the tortoise.
$$\mathbf{H}\,(\boldsymbol{\Theta}_{\mathrm{new}} - \boldsymbol{\Theta}_{\mathrm{old}}) = -\nabla U$$
(H is the Hessian, ie the matrix of second derivatives, of U.)
Levenberg-Marquardt
• At each iteration, the algorithm solves
$$(\mathbf{H} + \lambda \mathbf{I})\,\Delta\boldsymbol{\Theta} = -\nabla U$$
where I is the identity matrix and λ is a gain parameter.
• If the next U is larger than previous, this means the Levenberg assumption was too bold: λ is increased to make the algorithm more like steepest descent (more tortoise).
• But if U is smaller than previous, λ is reduced, to make the algorithm more Levenberg-like (more hare).
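A minimal sketch of the λ-adjustment loop, for a two-parameter fit of the hypothetical model y = a·exp(−b·t) with U taken as the sum of squared residuals. Several things here are assumptions made for the illustration: the model, the function name `lm_fit`, the factors of 10 used to change λ, and the use of the Gauss-Newton approximation JᵀJ in place of the full Hessian.

```python
import math


def lm_fit(t, y, a, b, lam=1e-3, max_iter=200, tol=1e-10):
    """Fit y ~ a*exp(-b*t) by minimizing U = sum of squared residuals."""

    def U(a_, b_):
        return sum((a_ * math.exp(-b_ * ti) - yi) ** 2 for ti, yi in zip(t, y))

    u = U(a, b)
    for _ in range(max_iter):
        # Accumulate half the gradient (g) and half the Gauss-Newton
        # approximation to the Hessian (h). The common factor of 2 only
        # rescales lam.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for ti, yi in zip(t, y):
            e = math.exp(-b * ti)
            r = a * e - yi            # residual: model minus data
            j0, j1 = e, -a * ti * e   # d(model)/da and d(model)/db
            g0 += j0 * r
            g1 += j1 * r
            h00 += j0 * j0
            h01 += j0 * j1
            h11 += j1 * j1

        # Solve (h + lam*I) delta = -g for the 2x2 case by Cramer's rule.
        m00, m11 = h00 + lam, h11 + lam
        det = m00 * m11 - h01 * h01
        da = (-g0 * m11 + g1 * h01) / det
        db = (-g1 * m00 + g0 * h01) / det
        if max(abs(da), abs(db)) < tol:
            break  # step is negligible: stop

        u_new = U(a + da, b + db)
        if u_new < u:
            # Success: accept the step and be bolder (more hare).
            a, b, u = a + da, b + db, u_new
            lam *= 0.1
        else:
            # Failure: reject the step and be more cautious (more tortoise).
            lam *= 10.0
    return a, b
```

A rejected step costs one extra evaluation of U but never makes the fit worse, which is where the robustness comes from.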
Interpolation
• The difference between interpolation and fitting is the comparison between the number Npar of parameters and the number Ndata of data points.
– If Ndata > Npar, you fit.
– If Ndata = Npar, you interpolate.
– If Ndata < Npar, you either kill off some parameters... or use Singular Value Decomposition... or resort to Bayesian black magic.
• Because interpolation is only loosely constrained by the data (think of a strait jacket with most of the buttons undone), it can be unstable. See the examples in a few slides.
An example technique: cubic splines.
• FYI: a spline originally was a flexible bit of metal used by boat designers to draw smooth curves between pegs.
• The space between any two points is interpolated by a cubic function.
– A cubic has 4 parameters; thus you need 4 pieces of information to specify it.
– These are obtained from the 4 neighbouring function values.
• with special treatment of the ends, where only 3 points are available.
– Cubic interpolation matches f, f’ and f” at the points.
Cubic Splines algorithm
• See Press et al chapter 3.3 for details.
• Diagrammatic definition of the xs and ys:
[Figure: a curve through the sample points (x_{j-1}, y_{j-1}), (x_j, y_j), (x_{j+1}, y_{j+1}) and (x_{j+2}, y_{j+2}).]
• In the formula which is simplest to calculate, the cubic is heavily disguised:
$$y = A(x)\,y_j + B(x)\,y_{j+1} + C(x)\,y''_j + D(x)\,y''_{j+1}.$$
Cubic Splines algorithm
• To evaluate this, obviously we need to calculate A, B, C and D, plus y”j and y”j+1:
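$$A = \frac{x_{j+1} - x}{x_{j+1} - x_j}, \qquad B = 1 - A = \frac{x - x_j}{x_{j+1} - x_j},$$
$$C = \frac{1}{6}\left(A^3 - A\right)\left(x_{j+1} - x_j\right)^2, \qquad D = \frac{1}{6}\left(B^3 - B\right)\left(x_{j+1} - x_j\right)^2.$$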
Cubic Splines algorithm
• All the y”s are obtained at one time by solving a tri-diagonal system of equations specified by:
$$\frac{x_j - x_{j-1}}{6}\,y''_{j-1} + \frac{x_{j+1} - x_{j-1}}{3}\,y''_j + \frac{x_{j+1} - x_j}{6}\,y''_{j+1} = \frac{y_{j+1} - y_j}{x_{j+1} - x_j} - \frac{y_j - y_{j-1}}{x_j - x_{j-1}}$$
with (at its simplest) y”1 and y”N both set to zero.
• For examples of instability....
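The spline recipe (tri-diagonal solve for the y”s, then the A, B, C, D formula) can be sketched in a few lines. This is an illustration in the spirit of Press et al chapter 3.3, not a copy of their routines: the function names `spline_second_derivs` and `spline_eval` are invented here, the end condition is the simplest one (both end second derivatives set to zero), and x is assumed to be sorted in increasing order.

```python
import bisect


def spline_second_derivs(x, y):
    """Solve the tri-diagonal system for y'' with y''[0] = y''[-1] = 0."""
    n = len(x)
    cp = [0.0] * n  # modified upper-diagonal coefficients
    rp = [0.0] * n  # modified right-hand sides
    # Forward sweep over the interior points.
    for j in range(1, n - 1):
        a = (x[j] - x[j - 1]) / 6.0
        b = (x[j + 1] - x[j - 1]) / 3.0
        c = (x[j + 1] - x[j]) / 6.0
        r = ((y[j + 1] - y[j]) / (x[j + 1] - x[j])
             - (y[j] - y[j - 1]) / (x[j] - x[j - 1]))
        denom = b - a * cp[j - 1]
        cp[j] = c / denom
        rp[j] = (r - a * rp[j - 1]) / denom
    # Back-substitution; the two end values stay at zero.
    y2 = [0.0] * n
    for j in range(n - 2, 0, -1):
        y2[j] = rp[j] - cp[j] * y2[j + 1]
    return y2


def spline_eval(x, y, y2, xq):
    """Evaluate y = A*y_j + B*y_{j+1} + C*y''_j + D*y''_{j+1} at xq."""
    # Find the interval [x_j, x_{j+1}] containing xq (clamped at the ends).
    j = bisect.bisect_right(x, xq) - 1
    j = max(0, min(j, len(x) - 2))
    h = x[j + 1] - x[j]
    A = (x[j + 1] - xq) / h
    B = 1.0 - A
    C = (A ** 3 - A) * h * h / 6.0
    D = (B ** 3 - B) * h * h / 6.0
    return A * y[j] + B * y[j + 1] + C * y2[j] + D * y2[j + 1]
```

The second derivatives are computed once per data set; each subsequent evaluation only needs the interval lookup and the four coefficients.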
But splines are Evil!
[Figures. Credit: Andy Read, U Leics]
Regridding
• Suppose you have data at irregularly spaced samples which you want to have on regular samples (ie with the same space between all adjacent samples).
– You have to regrid the data – somehow interpolate between the input sample points and resample on the regular grid.
– One way is to convolve the samples with a regridding function, then sample the resulting smooth function on the regular grid.
– This is often done in radio interferometry.
• The reason is to allow a discrete Fourier transform to be used (because it is faster).
• Thus, much attention is paid to choosing a regridding function which has a well-behaved Fourier transform.
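A minimal 1-D sketch of regridding by convolution. The Gaussian regridding function, its width `sigma`, the function name `regrid`, and the division by the summed weights are all choices made for this illustration; they are not the functions actually used in radio interferometry packages.

```python
import math


def regrid(xs, vs, grid_x, sigma):
    """Convolve irregular samples (xs, vs) with a Gaussian, sample on grid_x."""
    out = []
    for gx in grid_x:
        wsum = 0.0
        vsum = 0.0
        for x, v in zip(xs, vs):
            # Weight of this sample at the grid point (Gaussian kernel).
            w = math.exp(-0.5 * ((gx - x) / sigma) ** 2)
            wsum += w
            vsum += w * v
        # Normalize by the summed weights so that a constant input stays
        # constant; grid points with no nearby samples are undefined.
        out.append(vsum / wsum if wsum > 0.0 else float("nan"))
    return out
```

The kernel width sets the trade-off: a wide kernel smooths over gaps between samples but blurs real structure, while a narrow one leaves poorly sampled grid points dominated by a single input value.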
A trap when regridding data which is already regularly sampled:
Original binned data: bin widths are 2 units.
Re-binned data: bin widths are 3.5 units. Moiré effect causes a dip every 4th bin (since 4 is the smallest integer n such that n×3.5 is exactly divisible by 2).
Also happens in 2D images.
One solution: dithering.
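The trap and the dithering remedy can both be reproduced with a toy example. This sketch assumes the naive re-binning assigns all the counts of an old bin to the new bin containing its centre, and that dithering means giving each count a random position within its old bin; the function names are made up for the illustration.

```python
import random


def rebin_by_centre(counts, old_width, new_width, n_new):
    """Naive re-binning: each old bin goes wholly to the new bin holding its centre."""
    out = [0] * n_new
    for i, c in enumerate(counts):
        centre = (i + 0.5) * old_width
        out[int(centre // new_width)] += c
    return out


def rebin_dithered(counts, old_width, new_width, n_new, rng):
    """Dithered re-binning: each count gets a random position inside its old bin."""
    out = [0] * n_new
    for i, c in enumerate(counts):
        for _ in range(c):
            pos = (i + rng.random()) * old_width
            # Clamp guards against rounding at the very top edge.
            out[min(int(pos // new_width), n_new - 1)] += 1
    return out


flat = [1000] * 14  # 14 old bins of width 2, all with equal counts
naive = rebin_by_centre(flat, 2.0, 3.5, 8)
dithered = rebin_dithered(flat, 2.0, 3.5, 8, random.Random(0))
```

Here `naive` shows the pattern 2000, 1000, 2000, 2000 repeated twice, ie a dip every 4th bin even though the input is flat, whereas the entries of `dithered` scatter around 1750 with ordinary counting noise.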