Lecture 4: The L2 Norm and Simple Least Squares


TRANSCRIPT

Page 1: Lecture 4 The L2 Norm and Simple Least Squares

Lecture 4

The L2 Norm and Simple Least Squares

Page 2: Lecture 4 The L2 Norm and Simple Least Squares

Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade-Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton's Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon's Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems

Page 3: Lecture 4 The L2 Norm and Simple Least Squares

Purpose of the Lecture

Introduce the concept of prediction error and the norms that quantify it

Develop the Least Squares Solution

Develop the Minimum Length Solution

Determine the covariance of these solutions

Page 4: Lecture 4 The L2 Norm and Simple Least Squares

Part 1

prediction error and norms

Page 5: Lecture 4 The L2 Norm and Simple Least Squares

The Linear Inverse Problem

Gm = d

Page 6: Lecture 4 The L2 Norm and Simple Least Squares

The Linear Inverse Problem

Gm = d

d: data; m: model parameters; G: data kernel

Page 7: Lecture 4 The L2 Norm and Simple Least Squares

G m^est = d^pre

an estimate of the model parameters can be used to predict the data

but the prediction may not match the observed data (e.g. due to observational error): d^pre ≠ d^obs

Page 8: Lecture 4 The L2 Norm and Simple Least Squares

this mismatch leads us to define the prediction error

e = d^obs − d^pre

e = 0 when the model parameters exactly predict the data

Page 9: Lecture 4 The L2 Norm and Simple Least Squares

example of prediction error for a line fit to data

[Figure: panels (A) and (B) plot d against z; at each z_i, the error e_i is the difference between d_i^obs and d_i^pre.]

Page 10: Lecture 4 The L2 Norm and Simple Least Squares

“norm”: a rule for quantifying the overall size of the error vector e

lots of possible ways to do it

Page 11: Lecture 4 The L2 Norm and Simple Least Squares

Ln family of norms

Page 12: Lecture 4 The L2 Norm and Simple Least Squares

Ln family of norms

Euclidean length
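For reference, the norms in this family can be written out as follows (a LaTeX sketch of the standard definitions used throughout this lecture series):

\|e\|_1 = \sum_i |e_i|

\|e\|_2 = \left[ \sum_i |e_i|^2 \right]^{1/2} \quad \text{(Euclidean length)}

\|e\|_n = \left[ \sum_i |e_i|^n \right]^{1/n}

\|e\|_\infty = \lim_{n \to \infty} \left[ \sum_i |e_i|^n \right]^{1/n} = \max_i |e_i| \quad \text{(the limiting case)}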

Page 13: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: an error vector e plotted against z, together with |e|, |e|^2, and |e|^10.]

higher norms give increasing weight to the largest element of e

Page 14: Lecture 4 The L2 Norm and Simple Least Squares

limiting case

Page 15: Lecture 4 The L2 Norm and Simple Least Squares

guiding principle for solving an inverse problem

find the m^est that minimizes E = ||e||

with e = d^obs − d^pre and d^pre = G m^est

Page 16: Lecture 4 The L2 Norm and Simple Least Squares

but which norm to use?

it makes a difference!

Page 17: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: straight-line fits to d vs z data containing one outlier; the L1, L2, and L∞ fits differ, with the higher norms pulled more strongly toward the outlier.]

Page 18: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: (A) a long-tailed and (B) a short-tailed probability density function p(d).]

The answer is related to the distribution of the error. Are outliers common or rare?

long tails: outliers common, outliers unimportant; use a low norm, which gives low weight to outliers

short tails: outliers uncommon, outliers important; use a high norm, which gives high weight to outliers
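To see the difference concretely, here is a minimal MATLAB sketch (the data and variable names are my own, not from the lecture) that fits a straight line under the L2 norm and under the L1 norm, the latter via iteratively reweighted least squares, a technique of my choosing:

% synthetic straight-line data with one outlier (hypothetical example)
z = (0:9)';
dobs = 2 + 1.2*z + 0.2*randn(10,1);
dobs(8) = dobs(8) + 8;              % inject an outlier

G = [ones(10,1), z];                % data kernel: intercept and slope

% L2 fit: minimizes the sum of squared errors
mL2 = (G'*G)\(G'*dobs);

% L1 fit via iteratively reweighted least squares
mL1 = mL2;                          % start from the L2 solution
for k = 1:20
    e = dobs - G*mL1;
    w = 1 ./ max(abs(e), 1e-6);     % weights ~ 1/|e| downweight large errors
    W = diag(w);
    mL1 = (G'*W*G)\(G'*W*dobs);
end

The L2 estimate is pulled noticeably toward the outlier; the L1 estimate is not.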

Page 19: Lecture 4 The L2 Norm and Simple Least Squares

as we will show later in the class …

use the L2 norm when the data have Gaussian-distributed error

Page 20: Lecture 4 The L2 Norm and Simple Least Squares

Part 2

Least Squares Solution to Gm=d

Page 21: Lecture 4 The L2 Norm and Simple Least Squares

the L2 norm of the error is its Euclidean length

so E = e^T e is the square of the Euclidean length

Principle of Least Squares: minimize E

Page 22: Lecture 4 The L2 Norm and Simple Least Squares

Least Squares Solution to Gm = d

minimize E with respect to m_q: ∂E/∂m_q = 0

Page 23: Lecture 4 The L2 Norm and Simple Least Squares

so, multiply out
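Written out (a LaTeX sketch, consistent with the expansion that reappears on Page 66):

E = e^T e = (d - Gm)^T (d - Gm) = m^T G^T G\, m - d^T G\, m - m^T G^T d + d^T d

Since the two middle terms are equal scalars, E = m^T G^T G\, m - 2\, d^T G\, m + d^T d.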

Page 24: Lecture 4 The L2 Norm and Simple Least Squares

first term

Page 25: Lecture 4 The L2 Norm and Simple Least Squares

first term

∂m_j/∂m_q = δ_jq since m_j and m_q are independent variables

Page 26: Lecture 4 The L2 Norm and Simple Least Squares

Kronecker delta (elements of the identity matrix): [I]_ij = δ_ij

a = I b = b, that is, a_i = Σ_j δ_ij b_j = b_i

Page 27: Lecture 4 The L2 Norm and Simple Least Squares

second term

third term

Page 28: Lecture 4 The L2 Norm and Simple Least Squares

putting it all together

or
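The result of setting the derivative to zero (a LaTeX sketch of the standard normal equations, consistent with the solution on the next slide):

\partial E / \partial m_q = 2\, [G^T G\, m]_q - 2\, [G^T d]_q = 0 \quad \Rightarrow \quad G^T G\, m = G^T d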

Page 29: Lecture 4 The L2 Norm and Simple Least Squares

presuming [G^T G] has an inverse:

Least Squares Solution

m^est = [G^T G]^-1 G^T d

Page 30: Lecture 4 The L2 Norm and Simple Least Squares

presuming [G^T G] has an inverse:

Least Squares Solution

m^est = [G^T G]^-1 G^T d

memorize

Page 31: Lecture 4 The L2 Norm and Simple Least Squares

example: straight line problem

Gm = d

Page 32: Lecture 4 The L2 Norm and Simple Least Squares
Page 33: Lecture 4 The L2 Norm and Simple Least Squares
Page 34: Lecture 4 The L2 Norm and Simple Least Squares
Page 35: Lecture 4 The L2 Norm and Simple Least Squares

in practice, there is no need to multiply the matrices analytically

just use MATLAB:

mest = (G'*G)\(G'*d);
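A complete minimal sketch for the straight-line problem d_i = m1 + m2*z_i (the numbers here are hypothetical):

% straight line: d_i = m1 + m2*z_i
z = [1 2 3 4 5]';                    % hypothetical measurement points
dobs = [2.1 3.9 6.2 7.8 10.1]';      % hypothetical observed data

G = [ones(length(z),1), z];          % data kernel: a column of ones and a column of z
mest = (G'*G)\(G'*dobs);             % least squares solution [m1; m2]
dpre = G*mest;                       % predicted data
e = dobs - dpre;                     % prediction error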

Page 36: Lecture 4 The L2 Norm and Simple Least Squares

another example: fitting a plane surface

Gm = d
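A sketch of the corresponding data kernel, assuming the plane is parameterized as d_i = m1 + m2*x_i + m3*y_i (the coordinates and data below are hypothetical):

% plane surface: d_i = m1 + m2*x_i + m3*y_i
x = 10*rand(50,1);                   % hypothetical x coordinates, km
y = 10*rand(50,1);                   % hypothetical y coordinates, km
dobs = 1 + 0.5*x - 0.3*y + 0.1*randn(50,1);  % hypothetical data, km

G = [ones(50,1), x, y];              % data kernel for the plane
mest = (G'*G)\(G'*dobs);             % least squares estimate [m1; m2; m3]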

Page 37: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: a plane surface fit to data, with axes x, y, and z in km.]

Page 38: Lecture 4 The L2 Norm and Simple Least Squares

Part 3

Minimum Length Solution

Page 39: Lecture 4 The L2 Norm and Simple Least Squares

but Least Squares will fail when [G^T G] has no inverse

Page 40: Lecture 4 The L2 Norm and Simple Least Squares

example: fitting a line to a single point

[Figure: a single (z, d) data point; many different lines pass through it.]

Page 41: Lecture 4 The L2 Norm and Simple Least Squares
Page 42: Lecture 4 The L2 Norm and Simple Least Squares

zero determinant, hence no inverse
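For a single observation (z_1, d_1), the straight-line data kernel has one row, and the determinant vanishes (a LaTeX sketch):

G = [\,1 \;\; z_1\,], \qquad G^T G = \begin{bmatrix} 1 & z_1 \\ z_1 & z_1^2 \end{bmatrix}, \qquad \det(G^T G) = z_1^2 - z_1^2 = 0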

Page 43: Lecture 4 The L2 Norm and Simple Least Squares

Least Squares will fail

when more than one solution minimizes the error

the inverse problem is “underdetermined”

Page 44: Lecture 4 The L2 Norm and Simple Least Squares

simple example of an underdetermined problem

Page 45: Lecture 4 The L2 Norm and Simple Least Squares

What to do?

use another guiding principle

“a priori” information about the solution

Page 46: Lecture 4 The L2 Norm and Simple Least Squares

in this case, choose a solution that is small

minimize ||m||₂²

Page 47: Lecture 4 The L2 Norm and Simple Least Squares

simplest case: “purely underdetermined”

more than one solution has zero error

Page 48: Lecture 4 The L2 Norm and Simple Least Squares

minimize L = ||m||₂² with the constraint that e = 0

Page 49: Lecture 4 The L2 Norm and Simple Least Squares

Method of Lagrange Multipliers: minimize L with constraints C₁ = 0, C₂ = 0, …

equivalent to

minimize Φ = L + λ₁C₁ + λ₂C₂ + … with no constraints

the λs are called “Lagrange multipliers”
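Applied to the purely underdetermined problem, with L = ||m||₂² = m^T m and the constraints collected as Gm = d, the function to minimize is (a LaTeX sketch, writing the multipliers as a vector λ):

\Phi(m, \lambda) = m^T m + \lambda^T (d - Gm), \qquad \partial\Phi/\partial m = 0 \;\Rightarrow\; 2m = G^T \lambda

which is the starting point of the derivation on Page 52.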

Page 50: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: contours of L(x, y) in the (x, y) plane, with the constraint curve e(x, y) = 0; the constrained minimum is at (x₀, y₀).]

Page 51: Lecture 4 The L2 Norm and Simple Least Squares
Page 52: Lecture 4 The L2 Norm and Simple Least Squares

2m = G^T λ and Gm = d, so ½ G G^T λ = d

λ = 2 [G G^T]^-1 d and m = G^T [G G^T]^-1 d

Page 53: Lecture 4 The L2 Norm and Simple Least Squares

presuming [G G^T] has an inverse:

Minimum Length Solution

m^est = G^T [G G^T]^-1 d

Page 54: Lecture 4 The L2 Norm and Simple Least Squares

presuming [G G^T] has an inverse:

Minimum Length Solution

m^est = G^T [G G^T]^-1 d

memorize
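In MATLAB, using the same backslash convention as the least squares slide (a minimal sketch on a hypothetical one-datum, two-parameter problem):

% minimum length solution for an underdetermined problem
G = [1 1];                           % hypothetical: one datum, d = m1 + m2
dobs = 2;
mest = G'*((G*G')\dobs);             % returns [1; 1], the smallest solution with zero error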

Page 55: Lecture 4 The L2 Norm and Simple Least Squares

Part 4

Covariance

Page 56: Lecture 4 The L2 Norm and Simple Least Squares

Least Squares Solution: m^est = [G^T G]^-1 G^T d

Minimum Length Solution: m^est = G^T [G G^T]^-1 d

both have the linear form m = M d

Page 57: Lecture 4 The L2 Norm and Simple Least Squares

but if m = M d, then [cov m] = M [cov d] M^T

when the data are uncorrelated with uniform variance σ_d²: [cov d] = σ_d² I

so

Page 58: Lecture 4 The L2 Norm and Simple Least Squares

Least Squares Solution
[cov m] = [G^T G]^-1 G^T σ_d² G [G^T G]^-1 = σ_d² [G^T G]^-1

Minimum Length Solution
[cov m] = G^T [G G^T]^-1 σ_d² [G G^T]^-1 G = σ_d² G^T [G G^T]^-2 G

Page 59: Lecture 4 The L2 Norm and Simple Least Squares

Least Squares Solution
[cov m] = [G^T G]^-1 G^T σ_d² G [G^T G]^-1 = σ_d² [G^T G]^-1

Minimum Length Solution
[cov m] = G^T [G G^T]^-1 σ_d² [G G^T]^-1 G = σ_d² G^T [G G^T]^-2 G

memorize
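A sketch of these formulas in MATLAB (sd2 stands for σ_d² and is assumed known; inv is used for clarity rather than speed):

sd2 = 0.01;                           % hypothetical data variance sigma_d^2
covm_ls = sd2 * inv(G'*G);            % least squares: sigma_d^2 [G'G]^-1
covm_ml = sd2 * G' * inv(G*G')^2 * G; % minimum length: sigma_d^2 G'[GG']^-2 G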

Page 60: Lecture 4 The L2 Norm and Simple Least Squares

where to obtain the value of σ_d²

a priori value – based on knowledge of the accuracy of the measurement technique

my ruler has 1 mm divisions, so σ_d ≈ ½ mm

a posteriori value – based on the prediction error
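For the a posteriori route, a common rule (standard practice, though not spelled out on this slide) is to divide the minimum prediction error by the number of degrees of freedom N − M:

% a posteriori estimate of sigma_d^2 from the prediction error
e = dobs - G*mest;                   % prediction error
[N, M] = size(G);                    % N data, M model parameters
sd2est = (e'*e) / (N - M);           % estimated sigma_d^2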

Page 61: Lecture 4 The L2 Norm and Simple Least Squares

variance critically dependent on experiment design (structure of G)

[Figure: two schemes for weighing a set of boxes: one weighs cumulative stacks (box 1, boxes 1+2, boxes 1+2+3, boxes 1+2+3+4, …), the other weighs successive pairs (box 1, boxes 1+2, boxes 2+3, boxes 3+4, …).]

which is the better way to weigh a set of boxes?

Page 62: Lecture 4 The L2 Norm and Simple Least Squares

[Figure: (A) estimated model parameters m_i^est and (B) their standard deviations σ_mi, plotted against index i.]

Page 63: Lecture 4 The L2 Norm and Simple Least Squares

Relationship between [cov m] and the Error Surface

[Figure: panels (A)–(D): two line-fit datasets d vs z and the corresponding error surfaces E(m1, m2).]

Page 64: Lecture 4 The L2 Norm and Simple Least Squares

Taylor Series expansion of the error about its minimum

Page 65: Lecture 4 The L2 Norm and Simple Least Squares

Taylor series expansion of the error about its minimum

curvature matrix with elements ∂²E/∂m_i∂m_j
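The expansion being referred to (a LaTeX sketch, with Δm = m − m^est; the first-derivative term vanishes at the minimum):

E(m) \approx E(m^{est}) + \tfrac{1}{2}\, \Delta m^T \left[ \partial^2 E / \partial m\, \partial m \right] \Delta m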

Page 66: Lecture 4 The L2 Norm and Simple Least Squares

for a linear problem, the curvature is related to G^T G:

E = (Gm − d)^T (Gm − d) = m^T [G^T G] m − d^T G m − m^T G^T d + d^T d

so

∂²E/∂m_i∂m_j = 2 [G^T G]_ij

Page 67: Lecture 4 The L2 Norm and Simple Least Squares

and since

[cov m] = σ_d² [G^T G]^-1

we have

[cov m] = 2 σ_d² [∂²E/∂m∂m]^-1

Page 68: Lecture 4 The L2 Norm and Simple Least Squares

the sharper the minimum, the higher the curvature, and the smaller the covariance