

Methods for solving Linear Least Squares problems

Anibal Sosa

IPM for Linear Programming, September 2009


Outline

1 The Least Square Problem (LSQ)
    Linear Least Square Problems

2 Methods for solving Linear LSQ
    Normal Equations
    QR Factorization
    Singular Value Decomposition (SVD)

3 Comments on the three methods

4 Regularization techniques
    Tikhonov regularization and Damped SVD
    Tikhonov regularization order one and two



The Least Square Problem (LSQ)

The objective function has the following special form:

f(x) = (1/2) ∑_{j=1}^{m} r_j(x)^2, where r_j : R^n → R are the residuals, i.e.,

min_{x ∈ R^n} f(x) = min_{x ∈ R^n} (1/2) r(x)^T r(x) = min_{x ∈ R^n} (1/2) ||r(x)||_2^2

r : R^n → R^m is called the residual vector, i.e., r(x) = (r_1(x), r_2(x), ..., r_m(x))^T

Least squares problems arise in many areas of application

They are the largest source of unconstrained optimization problems



Linear Least Square Problems

Let φ(x; ρ) be a model function that predicts experimental values, for some fixed parameters ρ. Usually we want to minimize the differences between the observed values y ∈ R^m (the data) and the predicted values φ(x; ρ) ∈ R^m.

We can use LSQ by setting r(x) = φ(x; ρ) − y:

min_{x ∈ R^n} (1/2) ||φ(x; ρ) − y||_2^2     (1)

If φ in (1) is nonlinear, then we have a nonlinear LSQ problem

In our case φ(x) = Ax, so we say this is a linear LSQ problem
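
As a concrete illustration, here is a minimal NumPy sketch (not from the original slides; the quadratic model, the sample points t, and the noise level are made-up placeholders) that builds a design matrix A so that φ(x) = Ax fits a quadratic to data, and solves the resulting linear LSQ problem:

import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)                                   # sample points (placeholder data)
y = 1.0 + 2.0 * t - 3.0 * t**2 + 0.05 * rng.standard_normal(t.size)  # noisy observations

# model phi(x) = A x, with columns [1, t, t^2] of the design matrix
A = np.vander(t, 3, increasing=True)                            # m x n, m >= n

# minimize (1/2) ||A x - y||_2^2
x_star, *_ = np.linalg.lstsq(A, y, rcond=None)
print("fitted coefficients:", x_star)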



Preliminaries for solving the LSQ problem

Observe that

f(x) = (1/2) ||Ax − y||_2^2 = (1/2) (Ax − y)^T (Ax − y) = (1/2) x^T A^T A x − x^T A^T y + (1/2) y^T y

It is easy to prove that

∇f(x) = A^T (Ax − y)        ∇²f(x) = A^T A

Since f is a convex function, it is well known that any x* such that ∇f(x*) = 0 is a global minimizer of f; therefore x* satisfies the normal equations

A^T A x = A^T y

Next we discuss three major algorithms for solving linear LSQ problems, assuming: i) m ≥ n and ii) A has full rank
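
A quick numerical sanity check of these formulas (a sketch on randomly generated placeholder A and y, not part of the original slides): the finite-difference gradient of f should match A^T (Ax − y), and the gradient should vanish at any solution of the normal equations.

import numpy as np

rng = np.random.default_rng(1)
m, n = 30, 5
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

f = lambda x: 0.5 * np.linalg.norm(A @ x - y) ** 2

x = rng.standard_normal(n)
grad = A.T @ (A @ x - y)                                        # analytic gradient

# central finite differences, one coordinate at a time
h = 1e-6
fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(n)])
print(np.allclose(grad, fd, atol=1e-4))                         # True: gradient formula checks out

# a solution of the normal equations A^T A x = A^T y makes the gradient (approximately) zero
x_star = np.linalg.solve(A.T @ A, A.T @ y)
print(np.allclose(A.T @ (A @ x_star - y), 0, atol=1e-8))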


Normal Equations

Step 1: Compute A^T A and A^T y
Step 2: Compute the Cholesky factorization of A^T A > 0:

    A^T A = R^T R, where R is an upper triangular matrix (R_ii > 0)

Step 3: Perform two triangular substitutions:

    R^T z = A^T y  =⇒  R x* = z

Disadvantages:
    Relative error of x* ≈ κ(A)^2 ¹
    Sensitive to ill-conditioned matrices

¹ κ(A) = ||A|| ||A^{−1}|| ≈ σ_1/σ_n = κ_2(A)
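
A minimal NumPy/SciPy sketch of the three steps above, on placeholder A and y (note that numpy.linalg.cholesky returns the lower-triangular factor L, so R = L^T):

import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(2)
m, n = 100, 8
A = rng.standard_normal((m, n))                                 # full rank, m >= n (placeholder)
y = rng.standard_normal(m)

# Step 1: form A^T A and A^T y
AtA = A.T @ A
Aty = A.T @ y

# Step 2: Cholesky factorization A^T A = L L^T  (R = L^T is upper triangular)
L = np.linalg.cholesky(AtA)

# Step 3: two triangular substitutions  L z = A^T y,  L^T x* = z
z = solve_triangular(L, Aty, lower=True)
x_star = solve_triangular(L.T, z, lower=False)

print(np.allclose(AtA @ x_star, Aty))                           # x* satisfies the normal equations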


QR Factorization

Notice that || · || is invariant under orthogonal transformations:

||Ax − y||_2^2 = ||Q^T (Ax − y)||_2^2

where Q ∈ R^{m×m} is orthogonal. The QR factorization is done as follows:

A Π = Q [R; 0] = [Q1 Q2] [R; 0] = Q1 R     (2)

where Π ∈ R^{n×n} is a permutation matrix, Q1 contains the first n columns of Q, and R ∈ R^{n×n} is upper triangular with R_ii > 0

Using (2) we have

||Ax − y||_2^2 = || [Q1^T; Q2^T] (A Π Π^T x − y) ||_2^2


QR Factorization (2)

|| [Q1^T; Q2^T] ([Q1 Q2] [R; 0] Π^T x − y) ||_2^2     (note that [Q1 Q2][R; 0] = A Π)
    = || [R; 0] Π^T x − [Q1^T; Q2^T] y ||_2^2
    = ||R Π^T x − Q1^T y||_2^2 + ||Q2^T y||_2^2

Notice that from the last equation:
    The last term does not depend on x
    The minimum value is reached when R Π^T x − Q1^T y = 0, therefore

x* = Π R^{−1} Q1^T y


QR Factorization Algorithm

Step 1: Compute the QR factorization of A
Step 2: Extract Q1, identify Π and R
Step 3: Perform one triangular substitution and one permutation:

    R z = Q1^T y  =⇒  x* = Π z

Advantage:
    Relative error of x* ≈ κ(A)

Disadvantage:
    Sometimes more information about data sensitivity is necessary
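
A sketch of the same steps with SciPy's column-pivoted QR, on placeholder data (scipy.linalg.qr with pivoting=True returns the permutation as an index array p with A[:, p] = Q1 R):

import numpy as np
from scipy.linalg import qr, solve_triangular

rng = np.random.default_rng(3)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

# Steps 1-2: economic, column-pivoted QR:  A Pi = Q1 R
Q1, R, p = qr(A, mode='economic', pivoting=True)

# Step 3: triangular substitution  R z = Q1^T y,  then undo the permutation: x* = Pi z
z = solve_triangular(R, Q1.T @ y, lower=False)
x_star = np.empty(n)
x_star[p] = z

print(np.allclose(A.T @ (A @ x_star - y), 0, atol=1e-8))        # x* satisfies the normal equations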


Singular Value Decomposition (SVD)

Theorem: If A ∈ R^{m×n} is real, then there exist orthogonal matrices

U = [u_1 . . . u_m] ∈ R^{m×m} and V = [v_1 . . . v_n] ∈ R^{n×n}

such that A = U Σ V^T, where Σ = diag(σ_1, . . . , σ_p) ∈ R^{m×n}, p = min{m, n} and σ_1 ≥ σ_2 ≥ . . . ≥ σ_p ≥ 0

In our case σ_1 ≥ σ_2 ≥ . . . ≥ σ_n > 0, since A is full rank and m ≥ n, thus

A = U [Σ_1; 0] V^T = [U1 U2] [Σ_1; 0] V^T = U1 Σ_1 V^T     (3)

where U1 has the first n columns of U and Σ_1 = diag(σ_1, . . . , σ_n).
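
NumPy computes exactly this thin factorization (3); a small check on a placeholder matrix (full_matrices=False returns U1, the singular values in decreasing order, and V^T):

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((100, 8))                               # placeholder full-rank matrix, m >= n

U1, s, Vt = np.linalg.svd(A, full_matrices=False)               # thin SVD: A = U1 Sigma_1 V^T
print(U1.shape, s.shape, Vt.shape)                              # (100, 8) (8,) (8, 8)
print(np.all(s[:-1] >= s[1:]) and s[-1] > 0)                    # sigma_1 >= ... >= sigma_n > 0
print(np.allclose((U1 * s) @ Vt, A))                            # reconstruction A = U1 diag(s) V^T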


The thin SVD

Using (3) and similar ideas as for QR:

||Ax − y||_2^2 = || [Σ_1; 0] (V^T x) − [U1^T; U2^T] y ||_2^2
              = ||Σ_1 (V^T x) − U1^T y||_2^2 + ||U2^T y||_2^2

Again, from the last equation:
    The last term does not depend on x
    The minimum value is reached when Σ_1 (V^T x) − U1^T y = 0, therefore

x* = V Σ_1^{−1} U1^T y

or equivalently

x* = ∑_{i=1}^{n} (u_i^T y / σ_i) v_i     (4)
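
The closed form (4) in NumPy, on placeholder data; numpy.linalg.lstsq, which also relies on an SVD-based LAPACK routine, should agree with it:

import numpy as np

rng = np.random.default_rng(5)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)

U1, s, Vt = np.linalg.svd(A, full_matrices=False)

# x* = V Sigma_1^{-1} U1^T y  =  sum_i (u_i^T y / sigma_i) v_i
x_star = Vt.T @ ((U1.T @ y) / s)

print(np.allclose(x_star, np.linalg.lstsq(A, y, rcond=None)[0]))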


SVD

Equation (4) gives useful information about the sensitivity of x*:
    Small changes in A or y can induce large changes in x* if σ_i is small
    A is rank-deficient when σ_n/σ_1 ≪ 1 (σ_n is the distance from A to the set of singular matrices)
    x* calculated as in (4) has the smallest 2-norm of all minimizers

Advantage:
    Most robust and reliable

Disadvantage:
    Most expensive


Normal Eq. vs QR vs SVD

The Cholesky-based algorithm is practical if m ≫ n (it is easier to store A^T A), even if A is sparse

The QR algorithm avoids squaring κ(A)

When A is rank-deficient, some σ_i ≈ 0; thus any vector

x* = ∑_{σ_i ≠ 0} (u_i^T y / σ_i) v_i + ∑_{σ_i = 0} τ_i v_i

is also a minimizer of ||Ax − y|| for any choice of the coefficients τ_i. Thus, setting τ_i = 0, we get the minimum-norm solution ²

Remark: For very large problems it is recommended to use iterative methods such as Conjugate Gradient

² This is a type of filtering by truncation
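
A sketch of the truncation filter mentioned above, on a deliberately rank-deficient placeholder matrix: singular values below a small relative tolerance are treated as zero and their terms dropped, which yields the minimum-norm minimizer. (For very large sparse problems one would instead reach for an iterative solver such as scipy.sparse.linalg.lsqr.)

import numpy as np

rng = np.random.default_rng(6)
m, n, r = 100, 8, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # rank-deficient: rank r < n
y = rng.standard_normal(m)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# keep only sigma_i above a relative tolerance (truncation: tau_i = 0 for the discarded directions)
keep = s > 1e-10 * s[0]
x_star = Vt[keep].T @ ((U[:, keep].T @ y) / s[keep])

# same minimum-norm solution as an SVD-based least squares solver
print(np.allclose(x_star, np.linalg.lstsq(A, y, rcond=None)[0]))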


Tikhonov regularization (also known as ridge regression)

Most commonly used method for ill-posed problems

The ill-conditioned problem (1) is posed as

min (1/2) ||Ax − y||_2^2 + (1/2) α^2 ||x||_2^2     (5)

for some suitable regularization parameter α > 0

This improves the conditioning of the problem, even if A is rank-deficient, by shifting the small singular values:

(A^T A + α I_n) x = A^T A x + α x = λ x + α x = (λ + α) x

for any eigenvalue λ and eigenvector x of A^T A
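
A minimal sketch of (5) on placeholder data with an arbitrary α = 0.1: the regularized solution solves the shifted normal equations (A^T A + α^2 I_n) x = A^T y derived on the next slide, and is equivalently the plain LSQ solution of an augmented system, which is the numerically preferable way to compute it.

import numpy as np

rng = np.random.default_rng(7)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
alpha = 0.1                                                     # regularization parameter (arbitrary here)

# Tikhonov / ridge solution: (A^T A + alpha^2 I) x = A^T y
x_ridge = np.linalg.solve(A.T @ A + alpha**2 * np.eye(n), A.T @ y)

# equivalent augmented least squares problem  min || [A; alpha I] x - [y; 0] ||_2^2
A_aug = np.vstack([A, alpha * np.eye(n)])
y_aug = np.concatenate([y, np.zeros(n)])
x_aug, *_ = np.linalg.lstsq(A_aug, y_aug, rcond=None)

print(np.allclose(x_ridge, x_aug))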


Tikhonov regularization and Damped SVD

A little algebra shows that the minimizer of (5) is given by the nonsingular system

(A^T A + α^2 I_n) x = A^T y

and from (4) we can show that

x* = ∑_{i=1}^{n} f_i (u_i^T y / σ_i) v_i

where f_i = σ_i^2 / (σ_i^2 + α^2) are known as filter factors ³

The impact of a small α on the filter factors is:
    None for large σ_i (α ≪ σ_i), i.e. σ_i^2 / (σ_i^2 + α^2) ≈ 1
    It reduces the magnification of 1/σ_i for small σ_i, since then σ_i^2 / (σ_i^2 + α^2) ≈ σ_i^2 / α^2 ≪ 1

A "good" choice of α may provide enough numerical stability to expect a good approximate solution

³ In signal processing these are known as Wiener filters
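
The damped-SVD form of the same solution in NumPy (placeholder data, arbitrary α); the filter factors f_i show how much each SVD component is attenuated:

import numpy as np

rng = np.random.default_rng(8)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
alpha = 0.1

U1, s, Vt = np.linalg.svd(A, full_matrices=False)
f = s**2 / (s**2 + alpha**2)                                    # filter factors: ~1 where sigma_i >> alpha

# damped SVD solution  x* = sum_i f_i (u_i^T y / sigma_i) v_i
x_damped = Vt.T @ (f * (U1.T @ y) / s)

# agrees with solving the regularized normal equations (A^T A + alpha^2 I) x = A^T y
x_ridge = np.linalg.solve(A.T @ A + alpha**2 * np.eye(n), A.T @ y)
print(np.allclose(x_damped, x_ridge))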


Tikhonov regularization order one

Damping components that are large in magnitude may not be enough to suppress the undesirable behavior coming from the small singular values. Stronger regularization is needed, penalizing rapid changes in the x_i:

min (1/2) ||Ax − y||_2^2 + (1/2) α^2 ∑_{i=2}^{n} (x_i − x_{i−1})^2

Again, this expression is minimized by the solution of

(A^T A + α^2 B1^T B1) x = A^T y

where

B1 = [ 1 −1  0  ⋯  0 ]
     [ 0  1 −1  ⋯  0 ]
     [ ⋮     ⋱   ⋱  ⋮ ]
     [ 0  ⋯  0  1 −1 ]   ∈ R^{(n−1)×n}
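
A sketch constructing B1 and solving the order-one problem on placeholder data (np.diff builds the first-difference operator with rows [−1, 1, ...], i.e. the opposite sign convention to the matrix above, which does not change B1^T B1):

import numpy as np

rng = np.random.default_rng(9)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
alpha = 0.1

B1 = np.diff(np.eye(n), axis=0)                                 # (n-1) x n first-difference operator

# order-one Tikhonov: (A^T A + alpha^2 B1^T B1) x = A^T y
x_star = np.linalg.solve(A.T @ A + alpha**2 * B1.T @ B1, A.T @ y)

# the penalized solution is never rougher (in the B1 seminorm) than the plain LSQ solution
x_ls = np.linalg.solve(A.T @ A, A.T @ y)
print(np.linalg.norm(B1 @ x_star) <= np.linalg.norm(B1 @ x_ls))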


Tikhonov regularization order two

An even stronger regularization is

min (1/2) ||Ax − y||_2^2 + (1/2) α^2 ∑_{i=2}^{n−1} (x_{i+1} − 2x_i + x_{i−1})^2

Again, this expression is minimized by the solution of

(A^T A + α^2 B2^T B2) x = A^T y

where

B2 = [ 1 −2  1  0  ⋯  0 ]
     [ 0  1 −2  1  ⋯  0 ]
     [ ⋮        ⋱  ⋱   ⋮ ]
     [ 0  ⋯  0  1 −2  1 ]   ∈ R^{(n−2)×n}
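
The order-two operator follows the same pattern (a sketch; np.diff applied twice gives the (n−2)×n second-difference matrix with rows [1, −2, 1, ...]):

import numpy as np

rng = np.random.default_rng(10)
m, n = 100, 8
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
alpha = 0.1

B2 = np.diff(np.eye(n), 2, axis=0)                              # (n-2) x n second-difference operator

# order-two Tikhonov: (A^T A + alpha^2 B2^T B2) x = A^T y
x_star = np.linalg.solve(A.T @ A + alpha**2 * B2.T @ B2, A.T @ y)
print(x_star)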


References

Numerical Optimization. J. Nocedal, S. Wright. Second Edition. Springer, 2006.

Matrix Computations. G. Golub, C. Van Loan. Third Edition. Johns Hopkins University Press, 1996.
