
  • Model Reduction in PDE-Constrained Optimization

    Matthias Heinkenschloss

    Department of Computational and Applied Mathematics, Rice University, Houston, Texas

    [email protected]

    June 14, 2016

    Funded in part by AFOSR and NSF

    2016 EU Regional School
    Aachen Institute for Advanced Study in Computational Engineering Science (AICES), RWTH Aachen University

    Matthias Heinkenschloss June 14, 2016 1

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 2

  • Overview

    I PDE-constrained optimization arises in many science and engineering applications.
    I Numerical solution is iterative and requires the solution of many PDEs.
    I Can we systematically replace the underlying PDE by a (projection based) reduced order model to reduce the computational cost?

    Flow control
    Reservoir optimization
    Control of semilinear reaction-advection-diffusion
    Shape optimization of biochips

    Matthias Heinkenschloss June 14, 2016 3

  • Reduced order modeling has a long history in optimization

    I Newton's method uses a 'reduced order' (reduced nonlinearity) model.
    I Surrogate optimization.
    I Multilevel optimization.
    I ...

    This course

    I Focuses on projection based reduced order models.
    I There is a close connection with surrogate optimization and especially with multilevel optimization.

    Matthias Heinkenschloss June 14, 2016 4

  • I Original problem

        min  J(y, u)
        s.t. c(y, u) = 0,   (PDE, size n)
             u ∈ Uad,       (control constraints)

      where y ∈ R^n, n large, are the states and u ∈ R^m are the controls.

    I Reduced order problem
      Construct W, V ∈ R^{n×r}, r ≪ n, rank(V) = rank(W) = r.
      Reduced order problem:

        min  J(Vŷ, u)
        s.t. Wᵀ c(Vŷ, u) = 0,   (ROM PDE, size r)
             u ∈ Uad.

      Optimization variables: states ŷ ∈ R^r, r ≪ n, and controls u ∈ R^m.
      Reduced state equation Wᵀ c(Vŷ, u) = 0 ∈ R^r.

    Matthias Heinkenschloss June 14, 2016 5

  • Rich literature on projection based ROMs in optimization, incl.:

    I (Strongly convex) Linear-Quadratic Optimal Control Problems
      [Antil et al., 2010], [Chen and Quarteroni, 2014], [Gubisch and Volkwein, 2013], [Kammann et al., 2013], [Kärcher and Grepl, 2014b], [Kärcher and Grepl, 2014a], [Kärcher et al., 2014], [Negri et al., 2013], [Tröltzsch and Volkwein, 2009], [Volkwein, 2011], ...

    I Shape/Design Optimization [Amsallem et al., 2015], [Antil et al., 2011], [Choi et al., 2015], [Rozza and Manzoni, 2010], [Zahr and Farhat, 2015], ...

    I Flow control [Borggaard and Gugercin, 2015], [Kunisch and Volkwein, 1999], [Ravindran, 2000], [Rowley et al., 2004], ...

    I Newton-Kantorovich type estimates [Dihlmann and Haasdonk, 2015], [Gohlke, 2013].

    I Optimality system based [Kunisch and Volkwein, 2008], [Grimm et al., 2015], [Kunisch and Müller, 2015], ...

    I Balanced Truncation: [Antil et al., 2010], [Antil et al., 2011], [Antil et al., 2012], [Sun et al., 2008].

    I Trust-region based approaches [Afanasiev and Hinze, 2001], [Arian et al., 2000], [Fahl and Sachs, 2003], [Gohlke, 2013], [Yue and Meerbergen, 2013], ...

    I Model Predictive control [Ghiglieri and Ulbrich, 2014], [Alla and Volkwein, 2014], ...

    I Feedback control [Kunisch et al., 2004], [Kunisch and Xie, 2005], [Alla and Falcone, 2013], ...

    Matthias Heinkenschloss June 14, 2016 6

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 7

  • Overview

    Some examples of optimization problems governed by partial differential equations (PDEs)

    I Linear Quadratic Elliptic Problem

    I Linear Quadratic Parabolic Problem

    I Shape Optimization with Local Parameter Dependence

    I Oil Reservoir Waterflooding Optimization

    Matthias Heinkenschloss June 14, 2016 8

  • Linear Quadratic Elliptic Problem

        minimize  (1/2) ∫_{Ωs} (y(x) − ŷ(x))² dx + (α/2) ∫_{∂Ωc} u²(x) dx

    subject to

        −κf Δy(x) + a(x)·∇y(x) = 0,    x ∈ Ωf,
        −κs Δy(x) = f(x),              x ∈ Ωs,
        κ ∂y(x)/∂n = 0,                x ∈ ∂Ω \ ∂Ωc,
        y(x) = d(x) + u(x),            x ∈ ∂Ωc.

    [Figures: solid/fluid model geometry with velocity field a(x) and control boundary ∂Ωc (bold vertical line); contour plot of the uncontrolled temperature.]

    Matthias Heinkenschloss June 14, 2016 9

  • Finite element approximation

        min  (1/2) yᵀQy + yᵀc + (1/2) uᵀRu
        s.t. Ay + Bu = f.

    Strongly convex problem.

    Matthias Heinkenschloss June 14, 2016 10

  • Linear Quadratic Parabolic Problem
    ([Antil et al., 2010], modeled after [Dedé and Quarteroni, 2005])

        Minimize  (1/2) ∫₀ᵀ ∫_D (y(x,t) − d(x,t))² dx dt + (10⁻⁴/2) ∫₀ᵀ ∫_{U1∪U2} u²(x,t) dx dt,

    subject to

        ∂t y(x,t) − ∇·(κ∇y(x,t)) + a(x)·∇y(x,t) = u(x,t)χ_{U1}(x) + u(x,t)χ_{U2}(x)   in Ω × (0,T),

    with boundary conditions y(x,t) = 0 on ΓD × (0,T), κ ∂y(x,t)/∂n = 0 on ΓN × (0,T), and initial condition y(x,0) = 0 in Ω.

    [Figures from [Dedé and Quarteroni, 2005]: the reference domain Ω with boundary conditions for the advection-diffusion equation, and the velocity field a (obtained by solving a steady Stokes equation).]

    Matthias Heinkenschloss June 14, 2016 11

  • Finite element discretization in space

        min j(u) ≡ (1/2) ∫₀ᵀ ‖Cy(t) − d(t)‖² + (1/2) u(t)ᵀDu(t) dt,

    where y(t) = y(u; t) is the solution of

        My′(t) = Ay(t) + Bu(t),  t ∈ (0,T),
        y(0) = y0.

    Here y(t) ∈ R^n, M ∈ R^{n×n} invertible, A ∈ R^{n×n}, B ∈ R^{n×m}, n large.
    Strongly convex problem.

    Matthias Heinkenschloss June 14, 2016 12

  • Shape Optimization with Local Parameter Dependence
    ([Antil et al., 2011, Antil et al., 2012])

    Geometry motivated by biochip

    Problems where the shape param. θ only influences a (small) subdomain:

    Ω̄(θ) := Ω̄1 ∪ Ω̄2(θ),  Ω1 ∩ Ω2(θ) = ∅,  Γ = Ω̄1 ∩ Ω̄2(θ).
    Here Ω2(θ) is the top left yellow, square domain.

    Matthias Heinkenschloss June 14, 2016 13

  • min_{θmin ≤ θ ≤ θmax}  J(θ) = ∫₀ᵀ [ ∫_{Ωobs} (1/2)|∇×v(x,t;θ)|² dx + ∫_{Ω2(θ)} (1/2)|v(x,t;θ) − v_d(x,t)|² dx ] dt

    where v(θ) and p(θ) solve the Stokes equations

        v_t(x,t) − µΔv(x,t) + ∇p(x,t) = f(x,t),   in Ω(θ) × (0,T),
        ∇·v(x,t) = 0,                             in Ω(θ) × (0,T),
        v(x,t) = v_in(x,t)                        on Γ_in × (0,T),
        v(x,t) = 0                                on Γ_lat × (0,T),
        −(µ∇v(x,t) − p(x,t)I)n = 0                on Γ_out × (0,T),
        v(x,0) = 0                                in Ω(θ).

    Here Ω(θ) = Ω1 ∪ Ω2(θ) and Ω2(θ) is the top left yellow, square domain.
    The observation region Ωobs is part of the two reservoirs.

    Matthias Heinkenschloss June 14, 2016 14

  • The semi-discretized minimization problem is

        min_{θ ∈ Θad}  J(θ) := ∫₀ᵀ (1/2) ‖Cv(t,θ) + Fp(t,θ) + Du(t) − d‖² dt

    where v(·,θ), p(·,θ) solve the semi-discretized Stokes equations

        M(θ) d/dt v(t) + A(θ)v(t) + B(θ)p(t) = K(θ)u(t),   t ∈ [0,T],
        Bᵀ(θ)v(t) = L(θ)u(t),                              t ∈ [0,T],
        M(θ)v(0) = M(θ)v0,
        θ ∈ Θad.

    Matthias Heinkenschloss June 14, 2016 15

  • Oil Reservoir Waterflooding Optimization
    ([Deng, 2017, Magruder, 2017, Wiegand, 2010])

    From : http://plant-engineering.tistory.com/267

    Matthias Heinkenschloss June 14, 2016 16

  • Reservoir Model: Two-phase immiscible incompressible flow with capillary pressure (see, e.g., [Peaceman, 1977], [Chen et al., 2006]).
    States: saturation sw, pressure p, velocity v.

        φ(x) ∂/∂t sw(x,t) + ∇·( fw(sw(x,t)) [ v(x,t) + d(sw(x,t)) ] ) = qw(x,t),   x ∈ Ω, t > 0,
        v(x,t) + K(x) λ(sw(x,t)) ∇p(x,t) = 0,                                      x ∈ Ω, t > 0,
        ∇·v(x,t) = q(x,t),                                                         x ∈ Ω, t > 0,
        v(x,t)·n = 0,                                                              x ∈ ∂Ω, t > 0,
        vw(x,t)·n = 0,                                                             x ∈ ∂Ω, t > 0,
        sw(x,0) = sw,init(x),                                                      x ∈ Ω.

    I Porosity φ(x), diagonal permeability K(x) from the SPE 10 dataset.
    I Phase mobility λα = k_{rα}/µα; total mobility λ = λo + λw; water fractional flow function fw = λw/λ.
    I Capillary pressure, Brooks-Corey formula: pc = Pd ( (sw − swc)/(1 − sor − swc) )^{−1/2}.

    Matthias Heinkenschloss June 14, 2016 17

  • Optimization Model Problem (Well Rate Optimization)

    I Maximize Net Present Value (NPV)

        ∫₀ᵀ (1 + rdis)^{−t} [ −rinj Σ_{i∈Iinj} γ q(xi,t)                        (water injection cost)
                              − roper Σ_{i∈Iprod} γ |q(xi,t)| fw(sw(xi,t))      (water treatment cost)
                              + roil Σ_{i∈Iprod} γ |q(xi,t)| fo(sw(xi,t)) ] dt  (oil revenue)

    I subject to
      I two-phase immiscible incompressible flow,
      I well rates sum up to zero (reservoir is closed):

        Σ_{i ∈ Iinj ∪ Iprod} q(xi,t) = 0,   t ∈ (0,T),

      I well rate bounds on each well:

        q_{i,low} ≤ q(xi,t) ≤ q_{i,upp},   i ∈ Iinj ∪ Iprod,  t ∈ (0,T).

    I Data: daily discount rate rdis = 2×10⁻⁴, oil price roil = 80, injection cost rinj = 5, production cost rpro = 5.

    Matthias Heinkenschloss June 14, 2016 18

  • Example Result

    I 500 days;
    I 1000 time steps;
    I 1200 × 600 × 10 ft³;
    I 60 × 60 × 5 grid;
    I 10 + 10 = 20 wells;
    I 25-day constant rates.

    Matthias Heinkenschloss June 14, 2016 19

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 20

  • Overview

    I Formulate abstract optimization problem to introduce notation.

    I Introduce the adjoint equation method to compute gradient and Hessian information.

    I Illustrate adjoint equation method on example problems.

    Matthias Heinkenschloss June 14, 2016 21

  • Abstract Optimization Problem
    I Original problem

        min  J(y, u)
        s.t. c(y, u) = 0,   (governing PDE, state eqn.)
             u ∈ Uad,       (control constraints)

      where y: states, u: controls,
      I (Y, ‖·‖_Y), (C, ‖·‖_C) Banach spaces, (U, ‖·‖_U) Hilbert space,
      I Uad ⊂ U nonempty, closed, convex set,
      I J : Y × U → R, c : Y × U → C are smooth mappings.
      I Can think of (Y, ‖·‖_Y) = (R^n, ‖·‖₂), (C, ‖·‖_C) = (R^n, ‖·‖₂), (U, ‖·‖_U) = (R^m, ‖·‖₂).

    I Reduced order problem

        min  Ĵ(ŷ, u)
        s.t. ĉ(ŷ, u) = 0,
             u ∈ Uad,

      where ŷ: ROM states, u: controls,
      I (Ŷ, ‖·‖_Ŷ), (Ĉ, ‖·‖_Ĉ) Banach spaces,
      I Ĵ : Ŷ × U → R, ĉ : Ŷ × U → Ĉ are smooth mappings.

    Matthias Heinkenschloss June 14, 2016 22

  • Problem Formulation

    Original problem

        min  J(y, u)
        s.t. c(y, u) = 0,
             u ∈ Uad.

    ⇓  y(u) unique solution of c(y, u) = 0

        min  j(u)
        s.t. u ∈ Uad,

    where j(u) := J(y(u), u).

    Reduced order problem

        min  Ĵ(ŷ, u)
        s.t. ĉ(ŷ, u) = 0,
             u ∈ Uad.

    ⇓  ŷ(u) unique solution of ĉ(ŷ, u) = 0

        min  ĵ(u)
        s.t. u ∈ Uad,

    where ĵ(u) := Ĵ(ŷ(u), u).

    Matthias Heinkenschloss June 14, 2016 23

  • Derivative Computation

    Compute gradient and Hessian information for

    j(u) = J(y(u), u), where y(u) solves c(y, u) = 0.

    Assumption

    I J and c are twice continuously differentiable,

    I cy(y, u) is continuously invertible.

    Consider problems with a large number of controls u. Use the adjoint equation approach for derivative computation. See, e.g., chapter 1 in [Hinze et al., 2009] or [Heinkenschloss, 2008].

    Matthias Heinkenschloss June 14, 2016 24

  • Gradient Computation

    I Derivative

        ⟨Dj(u), v⟩_{U*,U} = ⟨DyJ(y(u), u), Dy(u)v⟩_{Y*,Y} + ⟨DuJ(y(u), u), v⟩_{U*,U}

    I Implicit function theorem applied to c(y(u), u) = 0 gives

        cy(y, u)(Dy(u)v) + cu(y, u)v = 0   ⟹   Dy(u)v = −cy(y, u)⁻¹ cu(y, u)v.

    I Derivative (y = y(u))

        ⟨Dj(u), v⟩_{U*,U}
          = ⟨DyJ(y, u), −cy(y, u)⁻¹ cu(y, u)v⟩_{Y*,Y} + ⟨DuJ(y, u), v⟩_{U*,U}
          = ⟨−cy(y, u)⁻* DyJ(y, u) (=: p), cu(y, u)v⟩_{C*,C} + ⟨DuJ(y, u), v⟩_{U*,U}
          = ⟨cu(y, u)* p + DuJ(y, u), v⟩_{U*,U}.

    Matthias Heinkenschloss June 14, 2016 25


  • I Connection with the Lagrangian L(y, u, p) = J(y, u) + ⟨p, c(y, u)⟩_{C*,C}:
      I The adjoint variable p solves cy(y, u)* p = −DyJ(y, u), which is equivalent to DyL(y, u, p) = 0.
      I Derivative

        ⟨Dj(u), v⟩_{U*,U} = ⟨cu(y, u)* p + DuJ(y, u), v⟩_{U*,U} = ⟨DuL(y, u, p), v⟩_{U*,U}.

      I The gradient ∇j(u) is the vector in U such that

        ⟨∇j(u), v⟩_U = ⟨Dj(u), v⟩_{U*,U} = ⟨cu(y, u)* p + DuJ(y, u), v⟩_{U*,U}   ∀v ∈ U

        (Riesz representation).

    Gradient Computation Using Adjoints

    1. Given u, solve c(y, u) = 0 for y (if not done already).
    2. Solve the adjoint equation cy(y(u), u)* p = −DyJ(y(u), u) for p. Denote the solution by p(u).
    3. Compute Dj(u) = DuJ(y(u), u) + cu(y(u), u)* p(u).

    Two PDE solves (possibly nonlinear PDE in step 1, linear PDE in step 2)

    Matthias Heinkenschloss June 14, 2016 26
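    As an illustration (not part of the original slides), here is a minimal numpy sketch of the three adjoint steps above for the discrete linear-quadratic case c(y,u) = Ay + Bu − b, J(y,u) = (1/2)‖Cy − d‖² + (α/2)‖u‖². All matrices and sizes are random placeholders; the adjoint-based gradient is checked against a finite difference.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k, alpha = 50, 4, 6, 1e-2                      # state, control, observation sizes (illustrative)
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n))    # invertible state operator (assumed)
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((k, n))
    b = rng.standard_normal(n)
    d = rng.standard_normal(k)

    def gradient(u):
        # 1. state solve:   c(y, u) = Ay + Bu - b = 0
        y = np.linalg.solve(A, b - B @ u)
        # 2. adjoint solve:  c_y^* p = -D_y J, here A^T p = -C^T (Cy - d)
        p = np.linalg.solve(A.T, -C.T @ (C @ y - d))
        # 3. gradient:       D_u J + c_u^* p = alpha*u + B^T p
        return alpha * u + B.T @ p

    def j(u):
        y = np.linalg.solve(A, b - B @ u)
        return 0.5 * np.linalg.norm(C @ y - d) ** 2 + 0.5 * alpha * u @ u

    u = rng.standard_normal(m)
    g = gradient(u)
    v = rng.standard_normal(m); eps = 1e-6
    # finite difference check of the adjoint gradient (should print a tiny number)
    print(abs((j(u + eps * v) - j(u - eps * v)) / (2 * eps) - g @ v))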


  • Hessian Computation
    I Apply implicit differentiation to

        Dj(u) = DuJ(y(u), u) + cu(y(u), u)* p(u)

      to compute Hessian information.

    I Hessian-Times-Vector Computation
      1. Given u, solve c(y, u) = 0 for y (if not done already).
      2. Solve the adjoint eqn. cy(y, u)* p = −DyJ(y, u) for p (if not done already).
      3. Solve cy(y, u) w = −cu(y, u) v.
      4. Solve cy(y, u)* q = −DyyL(y, u, p) w − DyuL(y, u, p) v.
      5. Compute D²j(u) v = cu(y, u)* q + DuyL(y, u, p) w + DuuL(y, u, p) v.

      Two linear PDE solves in steps 3+4 per direction v.

    I The vector su solves the Newton equation ∇²j(u) su = −∇j(u) if and only if (sy, su) solves the quadratic program

        min  ⟨ (DyJ(.), DuJ(.)), (sy, su) ⟩ + (1/2) ⟨ (sy, su), [ DyyL(..)  DyuL(..) ; DuyL(..)  DuuL(..) ] (sy, su) ⟩
        s.t. cy(.) sy + cu(.) su = 0,

      where (.) = (y(u), u) and (..) = (y(u), u, p(u)).

    Matthias Heinkenschloss June 14, 2016 27
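    For the same hypothetical linear-quadratic setting as before (c(y,u) = Ay + Bu − b, DyyL = CᵀC, DyuL = DuyL = 0, DuuL = αI), steps 3-5 reduce to two linear solves per direction. A rough sketch (placeholder data, not from the slides):

    import numpy as np

    rng = np.random.default_rng(1)
    n, m, k, alpha = 50, 4, 6, 1e-2
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((k, n))

    def hessvec(v):
        # steps 1+2 (state/adjoint) are not needed here: for this quadratic J the
        # second derivatives D_yy L = C^T C and D_uu L = alpha*I do not depend on (y, p)
        w = np.linalg.solve(A, -B @ v)             # step 3: c_y w = -c_u v
        q = np.linalg.solve(A.T, -C.T @ (C @ w))   # step 4: c_y^* q = -D_yy L w
        return B.T @ q + alpha * v                 # step 5: c_u^* q + D_uu L v

    v = rng.standard_normal(m)
    print(hessvec(v))    # one Hessian-vector product, obtained from two linear solves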


  • Application to Example Problems

    Matthias Heinkenschloss June 14, 2016 28

  • Example: Elliptic Optimal Control Problem
    I Problem:

        Minimize  j(u) = (1/2) ∫_D (y(x) − d(x))² dx + (α/2) ∫_{Γc} |∇u(x)|² dσ

      where y solves

        −∇·(κ∇y(x)) + a·∇y(x) = f(x)   in Ω,
        y(x) = u(x)                    on Γc,
        y(x) = 0                       on ΓD,
        κ∇y(x)·n = 0                   on ΓN.

    I Control space H¹₀(Γc). State space V = { v ∈ H¹(Ω) : v = 0 on Γc ∪ ΓD }.

    I Handle the Dirichlet boundary condition using the inverse trace theorem (in the finite element approximation: via interpolation):
      For every u ∈ H¹₀(Γc) there exists y(u; ·) ∈ H¹(Ω) such that y(u; x) = u(x) on Γc.
      Moreover, H¹₀(Γc) ∋ u ↦ y(u; ·) ∈ H¹(Ω) is bounded and linear.

    I Write y = y0 + y(u; ·), where y0 ∈ V.

    Matthias Heinkenschloss June 14, 2016 29


  • I Lagrangian:

        L(y, u, p) = (1/2) ∫_D (y0(x) + y(u; x) − yd(x))² dx + (α/2) ∫_{Γc} |∇u(x)|² dσ
                     + ∫_Ω κ∇y0·∇p + a·∇y0 p dx
                     + ∫_Ω κ∇y(u; ·)·∇p + a·y(u; ·)∇p − f p dx

    I Adjoint equation:

        −∇·(κ∇p(x)) − a·∇p(x) = −(y0(x) + y(u; x) − yd(x))|_D,   in Ω,
        p(x) = 0,                                                on Γc ∪ ΓD,
        κ∇p(x)·n + a·n p(x) = 0,                                 on ΓN.

    I Derivative

        ⟨Dj(u), v⟩_{H¹₀(Γc)*, H¹₀(Γc)}
          = ∫_{∂Ω} α∇u(x)·∇v(x) dσ + ∫_Ω κ∇y(v; ·)·∇p + a·y(v; ·)∇p dx
            + ∫_D (y0(x) + y(u; x) − yd(x)) y(v; x) dx

    What's the gradient?

    Matthias Heinkenschloss June 14, 2016 30


  • I The gradient ∇j(u) = g ∈ H¹₀(Γc) is a function that satisfies

        ⟨Dj(u), v⟩_{H¹₀(Γc)*, H¹₀(Γc)}
          = ∫_{∂Ω} α∇u(x)·∇v(x) dσ + ∫_Ω κ∇y(v; ·)·∇p + a·y(v; ·)∇p dx
            + ∫_D (y0(x) + y(u; x) − yd(x)) y(v; x) dx
          = ∫_{∂Ω} g(x)v(x) + ∇g(x)·∇v(x) dσ = ⟨g, v⟩_{H¹₀(Γc)}   ∀v ∈ H¹₀(Γc).

    I Solve a Laplace equation on the boundary,

        ∫_{Γc} ∇g̃(x)·∇v(x) dσ = ∫_Ω κ∇y(v; ·)·∇p + a·y(v; ·)∇p dx
                                 + ∫_D (y0(x) + y(u; x) − yd(x)) y(v; x) dx   ∀v ∈ H¹₀(Γc),

      and then ∇j(u) = αu + g̃.

    I Note: Other ways to incorporate Dirichlet boundary controls (Lagrange multipliers, very weak form of the Laplace equation) may lead to different weak forms and to different control and state spaces.

    Matthias Heinkenschloss June 14, 2016 31


  • Example: Parabolic Optimal Control Problem
    I Problem:

        Minimize  (1/2) ∫₀ᵀ ∫_D (y(x,t) − d(x,t))² dx dt + (α/2) ∫₀ᵀ ∫_U u²(x,t) dx dt,

      where y solves

        ∂t y(x,t) − ∇·(κ∇y(x,t)) + a·∇y(x,t) = u(x,t)χ_U(x)   in Ω × (0,T),
        y(x,t) = 0 on ΓD × (0,T),   κ∇y(x,t)·n = 0 on ΓN × (0,T),
        y(x,0) = 0 in Ω.

    I Lagrangian (formally)

        L(y, u, p) = (1/2) ∫₀ᵀ ∫_D (y(x,t) − d(x,t))² dx dt + (α/2) ∫₀ᵀ ∫_U u²(x,t) dx dt
                     + ∫₀ᵀ ∫_Ω ∂t y(x,t) p(x,t) + κ∇y(x,t)·∇p(x,t) + a·∇y(x,t) p(x,t) dx dt
                     − ∫₀ᵀ ∫_U u(x,t) p(x,t) dx dt

    (For details see chapter 1 in [Hinze et al., 2009] or [Tröltzsch, 2010a].)

    Matthias Heinkenschloss June 14, 2016 32

  • I Adjoint equation

        −∂t p(x,t) − ∇·(κ∇p(x,t)) − a·∇p(x,t) = −(y(x,t) − d(x,t))χ_D(x)   in Ω × (0,T),
        p(x,t) = 0,                      on ΓD × (0,T),
        (κ∇p(x,t) + a p(x,t))·n = 0,     on ΓN × (0,T),
        p(x,T) = 0,                      in Ω.

    I Gradient

        ∇j(u) = αu(x,t) − p(x,t),   x ∈ U, t ∈ (0,T).

    Matthias Heinkenschloss June 14, 2016 33

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 34

  • Reduced-Order Dynamical Systems

        ẏ(t) = Ay(t) + Bu(t) + f(t)          ẏ(t) = f(y(t), u(t), t)
        s(t) = Cy(t)                         s(t) = g(y(t), t)

    Replace y(t) ∈ R^n by Vŷ(t) = Σ_{i=1}^r vi ŷi(t), ŷ ∈ R^r, where r ≪ n, and multiply the state equation by Wᵀ. (Often W = V.)

        ŷ′ = WᵀAVŷ + WᵀBu + Wᵀf(t)           ŷ′ = Wᵀ f(Vŷ, u)
        ŝ = CVŷ                              ŝ = g(Vŷ)

    Two main questions:

    I Accuracy of the reduced order model? Approximation of the input-to-output map u ↦ s.
    I Efficiency of the reduced order model?

    Matthias Heinkenschloss June 14, 2016 35
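    A small sketch (not from the slides) of the Galerkin case W = V for a linear time-invariant system: the full and reduced systems are both integrated by implicit Euler so the two output maps u ↦ s can be compared. The matrices, the snapshot-based basis, and the input signal are illustrative placeholders only.

    import numpy as np

    rng = np.random.default_rng(2)
    n, m, r = 200, 2, 10
    A = -np.eye(n) + 0.05 * rng.standard_normal((n, n))   # roughly stable full operator (illustrative)
    B = rng.standard_normal((n, m))
    C = rng.standard_normal((1, n))

    # basis from a crude snapshot surrogate, orthonormalized; W = V (Galerkin projection)
    snaps = np.linalg.solve(np.eye(n) - 0.1 * A, B)
    V, _ = np.linalg.qr(np.hstack([snaps, rng.standard_normal((n, r - m))]))
    Ahat, Bhat, Chat = V.T @ A @ V, V.T @ B, C @ V         # reduced operators

    def simulate(Amat, Bmat, Cmat, dim, T=1.0, nt=100):
        h, y, out = T / nt, np.zeros(dim), []
        I = np.eye(dim)
        for kstep in range(nt):
            u = np.array([np.sin(kstep * h), 1.0])          # some input signal
            y = np.linalg.solve(I - h * Amat, y + h * Bmat @ u)   # implicit Euler step
            out.append(Cmat @ y)
        return np.array(out)

    s_full = simulate(A, B, C, n)
    s_rom = simulate(Ahat, Bhat, Chat, r)
    print(np.max(np.abs(s_full - s_rom)))   # how well this (ad hoc) ROM reproduces the output map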

  • Projection Based Reduced Order Models (ROMs) - Overview

    I Reduced Basis Method. See the books [Hesthaven et al., 2015], [Patera and Rozza, 2007], [Quarteroni et al., 2016].

    I Proper Orthogonal Decomposition (POD). See the survey article [Hinze and Volkwein, 2005] and sections in the books [Hesthaven et al., 2015], [Quarteroni et al., 2016].

    I Balanced Truncation Model Reduction (BTMR) for linear time invariant problems. See the book [Antoulas, 2005] and, for connections with POD, [Rowley, 2005].

    I Interpolation Based Model Reduction. See the survey articles [Antoulas et al., 2010], [Benner et al., 2015].

    Matthias Heinkenschloss June 14, 2016 36

  • Reduced Order Model (ROM) of a Parametric Elliptic PDE
    I Given a Hilbert space V, a bounded coercive bilinear form a(·,·;µ) on V × V, and a bounded linear functional f(·;µ) on V.
    I Find y ∈ V that satisfies the variational formulation:

        a(y, v; µ) = f(v; µ)   ∀v ∈ V.

    I Given a bounded linear functional ℓ(·) on V, we are often interested in

        s(µ) = ℓ(y(µ))   (quantity of interest).

    I Example

        −∇²y(µ) + (µ, 0)ᵀ·∇y(µ) = 100 e^{−5√‖x‖₂},   in Ω = [−1,1]², µ ∈ [−10, 10],
        y(µ) = 0,   on ∂Ω,
        s(µ) = ∫_Ω y(x; µ) dx.

      Here V = H¹₀(Ω), f(v; µ) = ∫_Ω 100 e^{−5√‖x‖₂} v(x) dx, and
      a(y, v; µ) = ∫_Ω ∇y(x)·∇v(x) + (µ, 0)ᵀ·∇y(x) v(x) dx.

    Matthias Heinkenschloss June 14, 2016 37

  • Finite Element Approximation
    I Vn = span{φ1, ..., φn} ⊂ V. Find y = y(µ) ∈ Vn such that

        a(y, v; µ) = f(v; µ)   ∀v ∈ Vn.

    I Linear system for y = Σ_{i=1}^n yi φi:

        A(µ) y = f(µ),   (n × n)

      where A(µ)_{ij} = a(φj, φi; µ) and f(µ)_i = f(φi; µ).

    I Well posedness: If there exists α > 0 such that

        a(v, v; µ) ≥ α‖v‖²_V   for all v ∈ V and µ ∈ Γ,

      then yᵀA(µ)y ≥ α‖y‖²_V (norm ‖y‖²_V = Σ_{i,j=1}^n yi yj ⟨φi, φj⟩_V) and

        A(µ)y = f(µ)
        ⟹ α‖y‖²_V ≤ yᵀA(µ)y = yᵀf(µ) ≤ ‖y‖_V ‖f(µ)‖_{V⁻¹}
        ⟹ ‖y‖_V ≤ α⁻¹‖f(µ)‖_{V⁻¹}   ⟹   ‖A(µ)⁻¹‖_V ≤ α⁻¹.

    Matthias Heinkenschloss June 14, 2016 38

  • Reduced Order Model (ROM)
    I Subspace Vr = span{ζ1, ..., ζr} ⊂ Vn, r ≪ n. Find ŷ = ŷ(µ) ∈ Vr such that

        a(ŷ, v; µ) = f(v; µ)   ∀v ∈ Vr.

    I Linear system for ŷ = Σ_{i=1}^r ŷi ζi:
      I Represent the Vr basis: ζi = Σ_{k=1}^n v_{ki} φk, V ≡ (v_{ij}) ∈ R^{n×r}.
      I Insert into the bilinear/linear forms:

        a(ζj, ζi; µ) = Σ_{k=1}^n Σ_{k̂=1}^n v_{ki} v_{k̂j} a(φ_{k̂}, φk; µ) = (VᵀA(µ)V)_{ij},
        f(ζi; µ) = Σ_{k=1}^n v_{ki} f(φk; µ) = (Vᵀf(µ))_i.

    I ROM

        a(ŷ, ζi; µ) = Σ_{j=1}^r ŷj a(ζj, ζi; µ) = f(ζi; µ),   i = 1, ..., r,

      is equivalent to

        VᵀA(µ)Vŷ = Vᵀf(µ).   (r × r)

    I Well posedness inherited: ŷᵀVᵀA(µ)Vŷ ≥ α‖Vŷ‖²_V ≡ α‖ŷ‖²_V.
    Matthias Heinkenschloss June 14, 2016 39

  • Basic ROM Algorithm
    1. Compute Snapshots: Given {µ1, ..., µr}, compute full solutions:

        A(µi) y(µi) = f(µi).

    2. Orthogonalize: Find V ∈ R^{n×r} with VᵀV = I and

        Ran(V) = span{y(µ1), ..., y(µr)}.

    3. Construct Reduced Order System:

        Â(µ) = VᵀA(µ)V ∈ R^{r×r},   f̂(µ) = Vᵀf(µ) ∈ R^r.

    4. ROM Solution: Cheaply solve the reduced order system for out-of-sample parameter choices µ:

        Â(µ)ŷ = f̂(µ).

      Approximation y(µ) ≈ Vŷ(µ).

    The above is the basic algorithm.
    I At what parameters µi do we sample?
    I The ROM Â(µ), f̂(µ) is smaller, but evaluation of Â(µ), f̂(µ) is not cheap.

    Matthias Heinkenschloss June 14, 2016 40
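    A compact numpy sketch (not from the slides) of the four steps for an affine parametric family A(µ) = A0 + µ A1. The matrices are random coercive placeholders, not the convection-diffusion matrices of the example.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 300
    M0 = rng.standard_normal((n, n))
    A0 = M0 @ M0.T + n * np.eye(n)                 # SPD "diffusion" part (illustrative)
    A1 = rng.standard_normal((n, n)); A1 = 0.5 * (A1 - A1.T)   # skew "advection" part
    f = rng.standard_normal(n)
    Amu = lambda mu: A0 + mu * A1
    fmu = lambda mu: f                             # parameter-independent right hand side here

    # 1. snapshots at a few sample parameters
    mus = np.linspace(-10.0, 10.0, 5)
    Y = np.column_stack([np.linalg.solve(Amu(mu), fmu(mu)) for mu in mus])

    # 2. orthogonalize: Ran(V) = span of the snapshots, V^T V = I
    V, _ = np.linalg.qr(Y)

    # 3./4. reduced system at an out-of-sample parameter
    mu = 3.7
    yhat = np.linalg.solve(V.T @ Amu(mu) @ V, V.T @ fmu(mu))

    y = np.linalg.solve(Amu(mu), fmu(mu))          # full solve, only for the error check
    print(np.linalg.norm(y - V @ yhat) / np.linalg.norm(y))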

  • ROM Error Estimates

    I Error in the solution

        A(µ)(y − Vŷ) = f(µ) − A(µ)Vŷ
        ⟹ ‖y − Vŷ‖ ≤ ‖A(µ)⁻¹‖ ‖f(µ) − A(µ)Vŷ‖.

    I Error in the output of interest s(µ) = lᵀy(µ):

        |s(µ) − ŝ(µ)| = |lᵀy − l̂ᵀŷ| ≤ ‖l‖ ‖A(µ)⁻¹‖ ‖f(µ) − A(µ)Vŷ‖.

    But one can do much better [Machiels et al., 2001].

    Matthias Heinkenschloss June 14, 2016 41

  • ROM Error Estimate - Quantity of Interest

    Primal: A(µ)y(µ) = f(µ),   s(µ) = lᵀy(µ).
    Primal RB: Â(µ)ŷ(µ) = f̂(µ),   ŝ(µ) = l̂ᵀŷ(µ).
    Primal Residual: ‖f(µ) − A(µ)Vŷ(µ)‖.
    Primal Bound: ‖y − Vŷ‖ = ‖A⁻¹(f − AVŷ)‖ ≤ ‖A⁻¹‖‖f − AVŷ‖.

    Dual: p(µ)ᵀA(µ) = −l(µ)ᵀ.
    Dual RB: p̂(µ)ᵀÂ(µ) = −l̂(µ)ᵀ.
    Dual Residual: ‖−l(µ) − A(µ)ᵀVp̂(µ)‖.
    Dual Bound: ‖p − Vp̂‖ = ‖A⁻ᵀ(−l − AᵀVp̂)‖ ≤ ‖A⁻ᵀ‖‖−l − AᵀVp̂‖.

    Matthias Heinkenschloss June 14, 2016 42

  • ROM Error Estimate - Quantity of Interest

    Galerkin orthogonality

        a(y − ŷ, p̂) = f(p̂) − f(p̂) = 0   ⟹   (Vp̂)ᵀA(y − Vŷ) = 0,
        a(ŷ, p − p̂) = −ℓ(ŷ) + ℓ(ŷ) = 0   ⟹   (p − Vp̂)ᵀAVŷ = 0.

    Error bound

        |s(µ) − ŝ(µ)| = |lᵀy − l̂ᵀŷ|
                      = |pᵀAy − (Vp̂)ᵀAVŷ|
                      = |pᵀA(y − Vŷ) + (p − Vp̂)ᵀAVŷ|
                      = |pᵀA(y − Vŷ) − (Vp̂)ᵀA(y − Vŷ)|
                      = |(p − Vp̂)ᵀA(y − Vŷ)|
                      ≤ ‖A⁻ᵀ‖ ‖Aᵀ(p − Vp̂)‖ ‖A(y − Vŷ)‖

        |s(µ) − ŝ(µ)| ≤ ∆(µ) ≡ ‖A⁻¹‖ ‖f − AVŷ‖ ‖−l − AᵀVp̂‖.

    Matthias Heinkenschloss June 14, 2016 43

  • Petrov-Galerkin Reduced Order Model (ROM)

    I Could also generate two subspaces

        Vr = span{ζ1, ..., ζr} ⊂ Vn,   Wr = span{ξ1, ..., ξr} ⊂ Vn,   r ≪ n.

      For example

        Vr = span{y(µ1), ..., y(µr)} ⊂ Vn,   Wr = span{p(µ1), ..., p(µr)} ⊂ Vn.

      Find ŷ = ŷ(µ) ∈ Vr such that

        a(ŷ, w; µ) = f(w; µ)   ∀w ∈ Wr.

    I Linear system for ŷ = Σ_{j=1}^r ŷj ζj:

        WᵀA(µ)Vŷ = Wᵀf(µ).   (r × r)

    I But well posedness is no longer inherited: it is not automatically guaranteed that WᵀA(µ)V is invertible or that its inverse is uniformly bounded in r.

    Matthias Heinkenschloss June 14, 2016 44

  • Reduced Basis Method - Greedy Selection

    Recall |s(µ) − ŝ(µ)| ≤ ∆(µ) ≡ ‖A⁻¹‖ ‖f − AVŷ‖ ‖−l − AᵀVp̂‖.
    Choose Γtrain ⊂ Γ, tolerance ε > 0, and maximum ROM size rmax.
    Given r > 0, V ∈ R^{n×r}.
    While r < rmax:
    1. Next sample point via greedy search and error estimate:

        µ_{r+1} = argmax_{µ ∈ Γtrain} ∆(µ).

    2. If ∆(µ_{r+1}) < ε stop. Computed ROM of desired accuracy.
    3. Compute new basis vector y(µ_{r+1}) by solving A(µ_{r+1}) y = f(µ_{r+1}).
    4. Update old basis: compute V ∈ R^{n×(r+1)} with VᵀV = I and Ran(V) = span{y(µ1), ..., y(µ_{r+1})}.
    5. Update Reduced Order System: Â(µ) = VᵀA(µ)V ∈ R^{(r+1)×(r+1)}, f̂(µ) = Vᵀf(µ) ∈ R^{r+1}.
    6. Set r ← r + 1.

    The constructed y(µ1), ..., y(µ_{r+1}) are linearly independent.
    Convergence of the greedy selection: [Binev et al., 2011].

    Matthias Heinkenschloss June 14, 2016 45
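    A sketch of the greedy loop (not from the slides); for simplicity the primal residual norm ‖f − A(µ)Vŷ(µ)‖ is used as an error indicator in place of the full primal-dual bound ∆(µ). The affine matrices are the same kind of random placeholders used earlier.

    import numpy as np

    rng = np.random.default_rng(4)
    n = 300
    M0 = rng.standard_normal((n, n))
    A0 = M0 @ M0.T + n * np.eye(n)
    A1 = rng.standard_normal((n, n)); A1 = 0.5 * (A1 - A1.T)
    f = rng.standard_normal(n)
    Amu = lambda mu: A0 + mu * A1

    Gamma_train = np.linspace(-10.0, 10.0, 101)
    eps, rmax = 1e-8, 20
    V = np.linalg.solve(Amu(Gamma_train[0]), f).reshape(-1, 1)   # start from one snapshot
    V, _ = np.linalg.qr(V)

    while V.shape[1] < rmax:
        def indicator(mu):
            # residual-based surrogate for Delta(mu), computed from the current ROM
            yhat = np.linalg.solve(V.T @ Amu(mu) @ V, V.T @ f)
            return np.linalg.norm(f - Amu(mu) @ (V @ yhat))
        errs = np.array([indicator(mu) for mu in Gamma_train])
        imax = int(np.argmax(errs))                        # 1. greedy: worst training parameter
        if errs[imax] < eps:                               # 2. ROM accurate enough
            break
        ynew = np.linalg.solve(Amu(Gamma_train[imax]), f)  # 3. new snapshot
        V, _ = np.linalg.qr(np.column_stack([V, ynew]))    # 4. re-orthogonalize the basis

    print("ROM size:", V.shape[1], "max indicator:", errs.max())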

  • Example

        −∇²y(µ) + (µ, 0)ᵀ·∇y(µ) = 100 e^{−5√‖x‖₂},   in Ω = (−1,1)², µ ∈ [−10, 10],
        y(µ) = 0,   on ∂Ω,
        s(µ) = ∫_Ω y(µ).

    [Figures: solutions for (a) µ = −10, (b) µ = 0, (c) µ = 10.]

    Matthias Heinkenschloss June 14, 2016 46

  • Figure: Convection-Diffusion Equation: Output, s(µ) vs. Parameter, µ
    (This figure slide is repeated on slides 47-53 with updated plots.)

    Matthias Heinkenschloss June 14, 2016 47-53

  • Proper Orthogonal Decomposition

    I Given snapshots y(µ1), ..., y(µm) ∈ Vn, m > r.
    I Compute an orthonormal basis v1, ..., vr ∈ Vn as the solution of

        min  Σ_{k=1}^m ‖yk − Σ_{i=1}^r ⟨yk, vi⟩_V vi‖²_V
        s.t. ⟨vi, vj⟩_V = δij.

    I Solution
      I Compute eigenvectors v1, v2, ... ∈ Vn and eigenvalues λ1 ≥ λ2 ≥ ... ≥ λm ≥ 0 of the linear operator

        ψ ↦ Kψ = Σ_{k=1}^m yk ⟨yk, ψ⟩_V.

      I The solution is v1, v2, ..., vr, with

        Σ_{k=1}^m ‖yk − Σ_{i=1}^r ⟨yk, vi⟩_V vi‖²_V = Σ_{i=r+1}^m λi.

    Matthias Heinkenschloss June 14, 2016 54

  • I Finite dimensional representation of the snapshots y(µ1), ..., y(µm) ∈ R^n, m > r.
    I Inner product ⟨v, w⟩_V = vᵀMw, M s.p.d. (not necessarily the mass matrix).
    I Solution
      I Define Y = [y(µ1), ..., y(µm)] ∈ R^{n×m}.
      I Compute M-orthonormal eigenvectors v1, v2, ... ∈ R^n and eigenvalues λ1 ≥ ... ≥ λm ≥ 0 of the generalized n × n eigenvalue problem

        M Y Yᵀ M vi = λi M vi.

      I Alternatively, if n > m, compute eigenvectors w1, w2, ... ∈ R^m and eigenvalues λ1 ≥ λ2 ≥ ... ≥ λ_{min{m,n}} ≥ 0 of

        Yᵀ M Y wi = λi wi,    vi = λi^{−1/2} Y wi,   i = 1, ..., m.

    I Usually, fix a tolerance ε > 0. Compute eigenvectors v1, v2, ... ∈ R^n and eigenvalues λ1 ≥ λ2 ≥ ... ≥ λ_{min{m,n}} ≥ 0.
      I Find the smallest r such that Σ_{i=r+1}^m λi < ε.
      I If only some of the largest eigenvalues and eigenvectors are computed: find the smallest r such that λ_{r+1}/λ1 < ε.

    Reduced order model V = [v1, ..., vr] ∈ R^{n×r}.
    Matthias Heinkenschloss June 14, 2016 55
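    A sketch (not from the slides) of the second variant ("method of snapshots") with an M-weighted inner product; the snapshot matrix Y and the s.p.d. matrix M below are random placeholders.

    import numpy as np

    rng = np.random.default_rng(5)
    n, m = 500, 30
    Y = rng.standard_normal((n, 5)) @ rng.standard_normal((5, m))   # snapshots with (near) low rank
    Y = Y + 1e-6 * rng.standard_normal((n, m))
    R = rng.standard_normal((n, n))
    M = R @ R.T + n * np.eye(n)                    # s.p.d. matrix defining <v, w> = v^T M w

    # method of snapshots: eigen-decomposition of the small m x m matrix Y^T M Y
    lam, Wm = np.linalg.eigh(Y.T @ M @ Y)
    lam, Wm = lam[::-1], Wm[:, ::-1]               # sort eigenvalues descending

    eps = 1e-10 * lam.sum()
    r = next(k for k in range(m + 1) if lam[k:].sum() < eps)   # smallest r with discarded tail < eps
    V = Y @ Wm[:, :r] / np.sqrt(lam[:r])           # v_i = lambda_i^{-1/2} Y w_i

    # checks: M-orthonormal basis, and projection error equals the sum of discarded eigenvalues
    print(np.allclose(V.T @ M @ V, np.eye(r), atol=1e-8))
    E = Y - V @ (V.T @ M @ Y)
    print(np.einsum('ik,ij,jk->', E, M, E), lam[r:].sum())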

  • I POD is often used to 'compress' the solution of a (linear or nonlinear) dynamical system

        My′(t) = Ay(t) + f(t),   t ∈ (0,T),
        y(0) = y0.

    I Solutions y(t0), ..., y(tm) ∈ R^n at time steps 0 = t0 < ... < tm = T are used as snapshots.

    I Can combine the Reduced Basis Method and POD for parameterized dynamical systems

        M(µ)y′(t; µ) = A(µ)y(t; µ) + f(t; µ),   t ∈ (0,T),
        y(0; µ) = y0,
        s(µ) = ∫₀ᵀ c(t; µ)ᵀ y(t; µ) dt.   (output of interest)

      Corresponding dual:

        −M(µ)ᵀp′(t; µ) = A(µ)ᵀp(t; µ) − c(t; µ),   t ∈ (0,T),
        p(T; µ) = 0.

    Matthias Heinkenschloss June 14, 2016 56

  • Balanced Truncation Model Reduction (BTMR)
    I Consider

        d/dt y(t) = Ay(t) + Bu(t),   t ∈ (0,T),
        z(t) = Cy(t) + Du(t),        t ∈ (0,T),
        y(0) = 0.

    I Projection methods for model reduction produce n × r matrices V, W with r ≪ n and with WᵀV = Ir.
    I One obtains a reduced form by setting y = Vŷ and projecting so that

        Wᵀ[ V d/dt ŷ(t) − AVŷ(t) − Bu(t) ] = 0,   t ∈ (0,T).

    I This leads to a reduced order system of order r given by

        d/dt ŷ(t) = Âŷ(t) + B̂u(t),   t ∈ (0,T),
        ẑ(t) = Ĉŷ(t) + Du(t),        t ∈ (0,T),
        ŷ(0) = 0,

      with Â = WᵀAV, B̂ = WᵀB, and Ĉ = CV.

    Matthias Heinkenschloss June 14, 2016 57

  • Controllability and Observability Gramians

    I Recall

        y′(t) = Ay(t) + Bu(t),   t ∈ (0,T),
        z(t) = Cy(t) + Du(t),    t ∈ (0,T).

      Assume the system is stable (Re(λ(A)) < 0), controllable, and observable.

    I Controllability Gramian.
      I P = ∫₀^∞ e^{At} B Bᵀ e^{Aᵀt} dt.
      I Eigenspaces corresponding to large eigenvalues are 'easy' to control (the control has smaller energy).
      I The controllability Gramian solves the Lyapunov equation

        AP + PAᵀ + BBᵀ = 0.

    I Observability Gramian.
      I Q = ∫₀^∞ e^{Aᵀt} Cᵀ C e^{At} dt.
      I Eigenspaces corresponding to large eigenvalues are 'easy' to observe.
      I The observability Gramian solves the Lyapunov equation

        AᵀQ + QA + CᵀC = 0.

    Matthias Heinkenschloss June 14, 2016 58


  • I Compute the controllability and observability Gramians P, Q in factored form, P = UUᵀ and Q = LLᵀ, i.e., solve

        AP + PAᵀ + BBᵀ = 0,
        AᵀQ + QA + CᵀC = 0.

    I Compute the SVD UᵀL = ZSYᵀ, where S = diag(σ1, σ2, ...) with σ1 ≥ σ2 ≥ ..., and Sr = diag(σ1, ..., σr).

    I Set V = UZrSr^{−1/2}, W = LYrSr^{−1/2}, where r is selected to be the smallest positive integer such that σ_{r+1} < τσ1. Here τ > 0 is a prespecified constant. The matrices Zr, Yr consist of the corresponding leading r columns of Z, Y.

    I It is easily verified that PW = VSr and QV = WSr.
    I Hence

        0 = Wᵀ(AP + PAᵀ + BBᵀ)W = ÂSr + SrÂᵀ + B̂B̂ᵀ,
        0 = Vᵀ(AᵀQ + QA + CᵀC)V = ÂᵀSr + SrÂ + ĈᵀĈ.

    Matthias Heinkenschloss June 14, 2016 59
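    A sketch of these steps (not from the slides) using SciPy's Lyapunov solver scipy.linalg.solve_continuous_lyapunov, which solves A X + X Aᵀ = Q, hence the negated right hand sides below. A, B, C are small random placeholders for a stable system, and the Gramian factors are obtained from a clipped eigen-decomposition rather than a Cholesky factorization, since the Gramians are only numerically semidefinite.

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov, svd

    rng = np.random.default_rng(6)
    n, tau = 60, 1e-4
    A = -np.eye(n) + 0.05 * rng.standard_normal((n, n))   # stable system matrix (illustrative)
    B = rng.standard_normal((n, 2))
    C = rng.standard_normal((3, n))

    P = solve_continuous_lyapunov(A, -B @ B.T)      # A P + P A^T + B B^T = 0
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)    # A^T Q + Q A + C^T C = 0

    def factor(X):
        # square-root factor F with F F^T ~= X; tiny eigenvalues are clipped
        w, Qx = np.linalg.eigh((X + X.T) / 2)
        w = np.clip(w, 1e-14 * w.max(), None)
        return Qx * np.sqrt(w)

    U, L = factor(P), factor(Q)
    Z, sigma, Yt = svd(U.T @ L)                     # Hankel singular values sigma_1 >= sigma_2 >= ...
    r = int(np.searchsorted(-sigma, -tau * sigma[0]))    # smallest r with sigma_{r+1} < tau*sigma_1
    Sr_inv_sqrt = np.diag(sigma[:r] ** -0.5)
    V = U @ Z[:, :r] @ Sr_inv_sqrt
    W = L @ Yt.T[:, :r] @ Sr_inv_sqrt

    Ahat, Bhat, Chat = W.T @ A @ V, W.T @ B, C @ V
    print(r, np.max(np.abs(W.T @ V - np.eye(r))))   # biorthogonality W^T V = I
    print(np.max(np.linalg.eigvals(Ahat).real))     # reduced system inherits stability (< 0)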

  • Two important properties of balanced truncation model reduction:

    I Â is stable.
    I For any given input u we have

        ‖z − ẑ‖_{L²} ≤ 2‖u‖_{L²} (σ_{r+1} + ... + σ_n),

      where ẑ is the output (response) of the reduced model [Glover, 1984].

    Matthias Heinkenschloss June 14, 2016 60

  • Empirical Interpolation Method

    I VᵀA(µ)V ∈ R^{r×r}, but the evaluation µ ↦ VᵀA(µ)V requires the evaluation µ ↦ A(µ) ↦ VᵀA(µ)V at a cost dependent on n.
    I If A(µ) = A0 + Σ_{j=1}^k µj Aj, then

        VᵀA(µ)V = VᵀA0V + Σ_{j=1}^k µj VᵀAjV.

      Precompute VᵀAjV ∈ R^{r×r}, j = 0, ..., k; afterwards evaluate µ ↦ VᵀA(µ)V at cost O(r²).
    I For example, the finite element discretization of

        −∇²y(µ) + (µ, 0)ᵀ·∇y(µ) = 100 e^{−5√‖x‖₂},   in Ω = (−1,1)²,
        y(µ) = 0,   on ∂Ω,

      leads to A(µ) = Adiff + µ Aadv.

    Matthias Heinkenschloss June 14, 2016 61
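    A tiny sketch (not from the slides) of the offline/online split for the affine case A(µ) = Adiff + µ Aadv; both matrices and the basis are random placeholders.

    import numpy as np

    rng = np.random.default_rng(7)
    n, r = 1000, 8
    Adiff = np.diag(2.0 + rng.random(n))                     # placeholder "diffusion" matrix
    Aadv = np.triu(rng.standard_normal((n, n)), 1) * 1e-2    # placeholder "advection" matrix
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))

    # offline (cost depends on n): precompute the r x r blocks once
    A0hat = V.T @ Adiff @ V
    A1hat = V.T @ Aadv @ V

    # online (cost O(r^2)): assemble the ROM operator for any mu without touching n-dim objects
    def Ahat(mu):
        return A0hat + mu * A1hat

    print(np.allclose(Ahat(3.0), V.T @ (Adiff + 3.0 * Aadv) @ V))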

  • Empirical Interpolation Method

    I Empirical Interpolation Method (EIM): [Barrault et al., 2004], [Eftang et al., 2010].

    I Discrete Empirical Interpolation Method (DEIM): [Chaturantabut and Sorensen, 2010].

    I Application of DEIM for finite element approximations: [Antil et al., 2014], [Tiso and Rixen, 2013].

    I Element based (compared to nodal/point based) version: [Farhat et al., 2015].

    Matthias Heinkenschloss June 14, 2016 62

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 63

  • Recall Optimization Problem

    Original problem

        min  J(y, u)
        s.t. c(y, u) = 0,
             u ∈ Uad.

    ⇓  y(u) unique solution of c(y, u) = 0

        min  j(u)
        s.t. u ∈ Uad,

    where j(u) := J(y(u), u).

    Reduced order problem

        min  Ĵ(ŷ, u)
        s.t. ĉ(ŷ, u) = 0,
             u ∈ Uad.

    ⇓  ŷ(u) unique solution of ĉ(ŷ, u) = 0

        min  ĵ(u)
        s.t. u ∈ Uad,

    where ĵ(u) := Ĵ(ŷ(u), u).

    Matthias Heinkenschloss June 14, 2016 64

  • Gradient Computation (Using Adjoints)

    1. Given u, solve state PDE c(y, u) = 0 for y = y(u).

    2. Solve the adjoint PDE cy(y(u), u)∗ p = −DyJ(y(u), u) for p = p(u).

    3. Compute Dj(u) = DuJ(y(u), u) + cu(y(u), u)∗p(u).

        ⟨∇j(u), v⟩_U = ⟨Dj(u), v⟩_{U*,U} = ⟨cu(y, u)* p + DuJ(y, u), v⟩_{U*,U}   ∀v ∈ U

    Matthias Heinkenschloss June 14, 2016 65

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 66

  • Overview

    I Optimization problem: more parameters - controls u ∈ Uad; the objective function j(u) is the quantity of interest.

    I Approximating just the objective function j(u) is not enough. We also need to approximate the gradient information ∇j(u).

    Matthias Heinkenschloss June 14, 2016 67

  • Error Estimate (Strongly Convex Function) I

    I u* minimizer of the original objective j, û* minimizer of the reduced objective ĵ. Want to estimate the error ‖û* − u*‖_U.

    I Optimality conditions

        ⟨∇j(u*), u − u*⟩_U ≥ 0   ∀u ∈ Uad,
        ⟨∇ĵ(û*), u − û*⟩_U ≥ 0   ∀u ∈ Uad.

    I Assume j is a strongly convex function on a convex set C ⊂ U: there exists κ > 0 such that

        ⟨u − w, ∇j(u) − ∇j(w)⟩_U ≥ κ‖u − w‖²_U   for all u, w ∈ C ⊂ U.

    I Let ξ ∈ U be such that

        ⟨∇j(û*) + ξ, u − û*⟩_U ≥ 0   ∀u ∈ Uad.

      ξ = ∇ĵ(û*) − ∇j(û*) always works; sometimes one can find a better ξ.
      (û* solves the perturbed optimization problem min_{u ∈ Uad} j(u) + ⟨ξ, u⟩_U.)
    Matthias Heinkenschloss June 14, 2016 68

  • Error Estimate (Strongly Convex Function) II

    I Combine optimality conditions & convexity: If u*, û* ∈ C,

        κ‖u* − û*‖² ≤ ⟨u* − û*, ∇j(u*) − ∇j(û*)⟩
                    ≤ ⟨u* − û*, ∇j(u*) − ∇j(û*)⟩ + ⟨u* − û*, ∇j(û*) + ξ⟩
                    = ⟨u* − û*, ∇j(u*)⟩ + ⟨u* − û*, ξ⟩ ≤ ⟨u* − û*, ξ⟩.

    I Hence ‖u* − û*‖_U ≤ κ⁻¹‖ξ‖_U ( ≤ κ⁻¹‖∇ĵ(û*) − ∇j(û*)‖_U ).
    I Estimate the error in the gradients to get an estimate for the error in the solution.
    I Applies when
      I j is a strongly convex function on the admissible set Uad, e.g., convex linear-quadratic problems.
      I j satisfies strong second order optimality conditions at u* and û* is in a neighborhood of u*.

    Matthias Heinkenschloss June 14, 2016 69
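    A small numerical illustration (not from the slides, unconstrained so Uad = U): for a strongly convex quadratic j and a perturbed surrogate ĵ, the minimizers satisfy ‖u* − û*‖ ≤ κ⁻¹‖∇ĵ(û*) − ∇j(û*)‖. All data are random placeholders.

    import numpy as np

    rng = np.random.default_rng(8)
    m = 10
    G = rng.standard_normal((m, m))
    H = G @ G.T + np.eye(m)            # Hessian of j; strong convexity constant kappa = lambda_min(H)
    g = rng.standard_normal(m)
    E = 1e-3 * rng.standard_normal((m, m)); e = 1e-3 * rng.standard_normal(m)

    grad_j = lambda u: H @ u + g                   # gradient of j(u) = 0.5 u^T H u + g^T u
    grad_jhat = lambda u: (H + E) @ u + (g + e)    # gradient of a perturbed "reduced" objective

    u_star = np.linalg.solve(H, -g)                # minimizer of j
    u_hat = np.linalg.solve(H + E, -(g + e))       # minimizer of jhat
    kappa = np.linalg.eigvalsh(H).min()

    lhs = np.linalg.norm(u_star - u_hat)
    rhs = np.linalg.norm(grad_jhat(u_hat) - grad_j(u_hat)) / kappa
    print(lhs, "<=", rhs, lhs <= rhs)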

  • Error Estimate for Unconstrained Problems

    I û* = argmin_u ĵ(u) minimizer of the unconstrained reduced problem.
    I Newton-Kantorovich Theorem: Let r > 0 and ∇²j ∈ Lip_L(B_r(û*)). Let ∇²j(û*) be nonsingular and let ζ, η ≥ 0 be constants such that

        ‖∇²j(û*)⁻¹‖ = ζ,   ‖∇²j(û*)⁻¹∇j(û*)‖ ≤ η.

      If Lζη ≤ 1/2, there is a unique local minimum u* of j in the ball around û* with radius

        min{ r, (1 − √(1 − 2Lζη))/(Lζ) } ≤ min{ r, 2η }.

    I Estimate η (using ∇ĵ(û*) = 0):

        ‖∇²j(û*)⁻¹(∇j(û*) − ∇ĵ(û*))‖ ≤ ζ‖∇j(û*) − ∇ĵ(û*)‖ = η.

    I Hence ‖u* − û*‖_U ≤ 2ζ‖∇ĵ(û*) − ∇j(û*)‖_U.
    I Estimate the error in the gradients to get an estimate for the error in the solution.
      I Need Lζη ≤ 1/2, i.e., ∇j(û*) small enough.
      I Can estimate the error using convergence properties of Newton's method started with û* applied to the original problem.

    Matthias Heinkenschloss June 14, 2016 70

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 71

  • Elliptic Linear-Quadratic Model Problem

    I Original problem

        min  (1/2) yᵀQy + cᵀy + (α/2) uᵀRu
        s.t. Ay + Bu = b,
             u ∈ Uad,

      where A ∈ R^{n×n} invertible, B ∈ R^{n×m}, b, c ∈ R^n, Q = Qᵀ ∈ R^{n×n} positive semidefinite, R = Rᵀ ∈ R^{m×m} positive definite, n large.

    I Reduced order problem

        min  (1/2) ŷᵀVᵀQVŷ + cᵀVŷ + (α/2) uᵀRu
        s.t. WᵀAVŷ + WᵀBu = Wᵀb,
             u ∈ Uad,

      where V, W ∈ R^{n×r}, r ≪ n, Â := WᵀAV invertible.

    Matthias Heinkenschloss June 14, 2016 72

  • I Gradient, original problem:

        solve Ay + Bu = b,
        solve Aᵀp = −Qy − d,
        ∇j(u) = αRu + Bᵀp.

    I Gradient, reduced order problem:

        solve WᵀAVŷ + WᵀBu = Wᵀb,
        solve VᵀAᵀWp̂ = −VᵀQVŷ − Vᵀd,
        ∇ĵ(u) = αRu + BᵀWp̂.

    I Error

        ∇ĵ(u) − ∇j(u) = Bᵀ(Wp̂ − p) = Bᵀ(Wp̂ − p̃ + p̃ − p),

      where p̃ solves Aᵀp̃ = −QVŷ − d (full adjoint with reduced Vŷ input).

    I Error bound

        ‖∇ĵ(u) − ∇j(u)‖ ≤ ‖B‖ ( ‖A⁻ᵀ‖ ‖AᵀWp̂ + QVŷ + d‖ + ‖A⁻ᵀ‖ ‖A⁻¹‖ ‖Q‖ ‖AVŷ + Bu − b‖ ).

    Matthias Heinkenschloss June 14, 2016 73
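    A sketch (not from the slides) that evaluates both gradients and the error bound above for one random instance with Q = I, R = I and a Galerkin basis W = V; in practice ‖A⁻¹‖ would come from a coercivity bound, here it is simply computed from A.

    import numpy as np

    rng = np.random.default_rng(9)
    n, m, r, alpha = 120, 3, 6, 1e-2
    A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
    B = rng.standard_normal((n, m))
    Qm = np.eye(n)                                      # Q = I for simplicity (R = I as well)
    b, d = rng.standard_normal(n), rng.standard_normal(n)
    V, _ = np.linalg.qr(rng.standard_normal((n, r)))    # placeholder basis, W = V (Galerkin)
    W = V
    u = rng.standard_normal(m)

    # full gradient
    y = np.linalg.solve(A, b - B @ u)
    p = np.linalg.solve(A.T, -Qm @ y - d)
    gj = alpha * u + B.T @ p

    # reduced gradient
    yhat = np.linalg.solve(W.T @ A @ V, W.T @ (b - B @ u))
    phat = np.linalg.solve(V.T @ A.T @ W, -V.T @ (Qm @ (V @ yhat) + d))
    gjhat = alpha * u + B.T @ (W @ phat)

    # computable error bound: adjoint residual term + state residual term
    nAinv = np.linalg.norm(np.linalg.inv(A), 2)
    bound = np.linalg.norm(B, 2) * (
        nAinv * np.linalg.norm(A.T @ (W @ phat) + Qm @ (V @ yhat) + d)
        + nAinv * nAinv * np.linalg.norm(Qm, 2) * np.linalg.norm(A @ (V @ yhat) + B @ u - b))
    print(np.linalg.norm(gjhat - gj), "<=", bound)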

  • I The basis V represents state information, W represents adjoint information. In principle one can construct V ≠ W, but WᵀAV must be invertible.
      Often V = W is computed from samples/snapshots of states and adjoints.
      Careful: states and adjoints represent different objects and have different scales (more later).

    I Application to parameterized optimal control problems, µ ∈ D:

        min  (1/2) yᵀQ(µ)y + c(µ)ᵀy + (α/2) uᵀR(µ)u
        s.t. A(µ)y + B(µ)u = b(µ).

      I Assume computable uniform bounds, e.g., ‖A(µ)⁻¹‖ ≤ a, µ ∈ D.
      I Have error estimates, can now use the ROM machinery developed for equations.
      I Use a greedy procedure to sample µ ∈ Γ. Use the error estimate.
      I Add the state y(µ) and adjoint p(µ) at sample µ to the basis V = W.

    Matthias Heinkenschloss June 14, 2016 74

  • Parabolic Linear-Quadratic Model Problem
    I Consider an optimal control problem governed by the advection diffusion PDE

        ∂t y(x,t) − ∇·(k(x)∇y(x,t)) + a(x)·∇y(x,t) = f(x,t)

      in Ω × (0,T). The optimization variables are related to the right hand side f or to boundary data.

    I After (finite element) discretization in space the optimal control problems are of the form

        min j(u) ≡ (1/2) ∫₀ᵀ ‖Cy(t) + Du(t) − d(t)‖² dt,

      where y(t) = y(u; t) is the solution of

        My′(t) = Ay(t) + Bu(t),   t ∈ (0,T),
        y(0) = y0.

      Here y(t) ∈ R^n, M ∈ R^{n×n} invertible, A ∈ R^{n×n}, B ∈ R^{n×m}, n large, D ∈ R^{m×m} invertible. Strongly convex problem.

    Matthias Heinkenschloss June 14, 2016 75

  • I Reduced optimal control problem

        min ĵ(u) ≡ (1/2) ∫₀ᵀ ‖Ĉŷ(t) + Du(t) − d(t)‖² dt,   Ĉ := CV,

      where ŷ(t) = ŷ(u; t) solves

        M̂ŷ′(t) = Âŷ(t) + B̂u(t),   t ∈ (0,T),
        ŷ(0) = ŷ0,

      with M̂ := WᵀMV, Â := WᵀAV, B̂ := WᵀB.
      Here ŷ(t) ∈ R^r, M̂, Â ∈ R^{r×r}, B̂ ∈ R^{r×m}, with r ≪ n small.

    Matthias Heinkenschloss June 14, 2016 76

  • I Gradient computation, original problem:

        My′(t) = Ay(t) + Bu(t),   t ∈ (0,T),   y(0) = y0,
        z(t) = Cy(t) + Du(t) − d(t),   t ∈ (0,T),
        −Mᵀp′(t) = Aᵀp(t) + Cᵀz(t),   t ∈ (0,T),   p(T) = 0,
        ∇j(u) = q(t) = Bᵀp(t) + Dᵀz(t),   t ∈ (0,T).

    I Gradient computation, reduced problem:

        M̂ŷ′(t) = Âŷ(t) + B̂u(t),   t ∈ (0,T),   ŷ(0) = ŷ0,
        ẑ(t) = Ĉŷ(t) + Du(t) − d(t),   t ∈ (0,T),
        −M̂ᵀp̂′(t) = Âᵀp̂(t) + Ĉᵀẑ(t),   t ∈ (0,T),   p̂(T) = 0,
        ∇ĵ(u) = q̂(t) = B̂ᵀp̂(t) + Dᵀẑ(t),   t ∈ (0,T).

    I 'Duality' in the input-output maps u ↦ z, w ↦ q of the state and adjoint systems.
    I Need to approximate the input-to-output maps u ↦ z, w ↦ q.

    Matthias Heinkenschloss June 14, 2016 77

  • I Balanced Truncation Model Reduction (BTMR) error bound:
      If the system is stable (Re(λ(A)) < 0), controllable, and observable (true for the model problem), one can use BTMR to compute W, V ∈ R^{n×r} with

        ‖z − ẑ‖_{L²} ≤ 2(σ_{r+1} + ... + σ_n) ‖u‖_{L²}   ∀u,
        ‖q − q̂‖_{L²} ≤ 2(σ_{r+1} + ... + σ_n) ‖w‖_{L²}   ∀w,

      where σ1 ≥ ... ≥ σr ≥ σ_{r+1} ≥ ... ≥ σ_n ≥ 0 are the Hankel singular values.

    I Introduce an auxiliary adjoint for the error estimate.
      I Original problem:

        −Mᵀp′(t) = Aᵀp(t) + Cᵀz(t),   t ∈ (0,T),   p(T) = 0,
        ∇j(u) = q(t) = Bᵀp(t) + Dᵀz(t),   t ∈ (0,T).

      I Reduced order problem:

        −M̂ᵀp̂′(t) = Âᵀp̂(t) + Ĉᵀẑ(t),   t ∈ (0,T),   p̂(T) = 0,
        ∇ĵ(u) = q̂(t) = B̂ᵀp̂(t) + Dᵀẑ(t),   t ∈ (0,T).

      I The BTMR bound requires the same input in the full and reduced adjoint systems.
      I Easy to fix: introduce an auxiliary adjoint p̃ as the solution of the original adjoint, but with input ẑ instead of z.

    Matthias Heinkenschloss June 14, 2016 78

  • I Assume that there exists γ > 0 such that

        vᵀAv ≤ −γ vᵀMv,   ∀v ∈ R^n.

      (Satisfied for the model problem.)

    I Gradient error:

        ‖∇j(u) − ∇ĵ(u)‖_{L²} ≤ 2 (c‖u‖_{L²} + ‖ẑ(u)‖_{L²}) (σ_{r+1} + ... + σ_n)

      for all u ∈ L²! (ẑ(u) is the output of the reduced order state equation with input u.)

    I Solution error:

        ‖u* − û*‖_{L²} ≤ (2/κ) (c‖û*‖_{L²} + ‖ẑ*‖_{L²}) (σ_{r+1} + ... + σ_n).

    Matthias Heinkenschloss June 14, 2016 79

  • Example Problem (modeled after [Dedé and Quarteroni, 2005])

        Minimize  (1/2) ∫₀ᵀ ∫_D (y(x,t) − d(x,t))² dx dt + (10⁻⁴/2) ∫₀ᵀ ∫_{U1∪U2} u²(x,t) dx dt,

    subject to

        ∂t y(x,t) − ∇·(κ∇y(x,t)) + a(x)·∇y(x,t) = u(x,t)χ_{U1}(x) + u(x,t)χ_{U2}(x)   in Ω × (0,T),

    with boundary conditions y(x,t) = 0 on ΓD × (0,T), ∂y(x,t)/∂n = 0 on ΓN × (0,T), and initial condition y(x,0) = 0 in Ω.

    [Figures from [Dedé and Quarteroni, 2005]: the domain Ω with boundary conditions for the advection diffusion equation, and the velocity field a.]

    Matthias Heinkenschloss June 14, 2016 80

  • grid   k     m    n      r
    1      168   9    1545   9
    2      283   16   2673   9
    3      618   29   6036   9

    Number k of observations, number m of controls, size n of the full order system, and size r of the reduced order system for three discretizations.

    [Figure: Hankel singular values — the largest Hankel singular values and the threshold 10⁻⁴σ_1 (grid #3)]

    Matthias Heinkenschloss June 14, 2016 81

  • [Figure: integrals ∫_{U1} u∗²(x, t) dx (solid blue line) and ∫_{U1} û∗²(x, t) dx (dashed red line) of the optimal controls computed using the full and the reduced order model, versus time]

    [Figure: integrals ∫_{U2} u∗²(x, t) dx (solid blue line) and ∫_{U2} û∗²(x, t) dx (dashed red line) of the optimal controls computed using the full and the reduced order model, versus time]

    Full and reduced order model solutions are in excellent agreement: ‖u∗ − û∗‖²_{L²} = 6 · 10⁻³.

    Matthias Heinkenschloss June 14, 2016 82

  • [Figure: CG residual versus iteration — convergence histories of the Conjugate Gradient algorithm applied to the full (+) and reduced (o) order optimal control problems]

    Recall error bound for the gradients:

      ‖∇j(u) − ∇ĵ(u)‖_{L²} ≤ 2 (c‖u‖_{L²} + ‖ẑ(u)‖_{L²}) (σ_{r+1} + … + σ_n)  ∀ u ∈ L²!

    Matthias Heinkenschloss June 14, 2016 83

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 84

  • Shape Optimization with Local Parameter Dependence ([Antil et al., 2010, Antil et al., 2011, Antil et al., 2012])

    I Consider the minimization problem

      min_{θ∈Θad}  j(θ) := ∫_0^T ∫_{Ω(θ)} ℓ(y(x, t; θ), t, θ) dx dt

    where y(x, t; θ) solves

      ∂_t y(x, t) − ∇·(κ(x)∇y(x, t)) + V(x) · ∇y(x, t) = f(x, t),  (x, t) ∈ Ω(θ) × (0, T),
      κ(x)∇y(x, t) · n = g(x, t),  (x, t) ∈ Γ_N(θ) × (0, T),
      y(x, t) = d(x, t),  (x, t) ∈ Γ_D(θ) × (0, T),
      y(x, 0) = y_0(x),  x ∈ Ω(θ).

    I Semidiscretization in space leads to

      min_{θ∈Θad}  j(θ) := ∫_0^T ℓ(y(t; θ), t, θ) dt

    where y(t; θ) solves

      M(θ) (d/dt) y(t) + A(θ)y(t) = B(θ)u(t),  t ∈ [0, T],
      M(θ)y(0) = M(θ)y_0.

    (A minimal time-stepping sketch for this semi-discrete system follows the slide.)

    Matthias Heinkenschloss June 14, 2016 85
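The semi-discrete system M(θ) y′(t) + A(θ) y(t) = B(θ) u(t) is what the full order model actually integrates in time. The following Python sketch shows one way to do this (implicit Euler, dense matrices); the matrices, control, and all names are purely illustrative assumptions, not the slides' data.

```python
import numpy as np

def implicit_euler(M, A, B, u, y0, T, nt):
    """Integrate M y'(t) + A y(t) = B u(t), y(0) = y0, with nt implicit Euler steps."""
    dt = T / nt
    Y = np.zeros((nt + 1, y0.size))
    Y[0] = y0
    K = M + dt * A                          # (M + dt A) y_{k+1} = M y_k + dt B u(t_{k+1})
    for k in range(nt):
        rhs = M @ Y[k] + dt * (B @ u((k + 1) * dt))
        Y[k + 1] = np.linalg.solve(K, rhs)  # factor K once in a real implementation
    return Y

# Illustrative use: 1D heat equation with identity mass matrix and one control input.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)      # scaled Laplacian stiffness matrix
B = np.ones((n, 1)) / n
Y = implicit_euler(np.eye(n), A, B, u=lambda t: np.array([np.sin(t)]),
                   y0=np.zeros(n), T=1.0, nt=200)
```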

  • I We would like to replace the large scale problem

      min_{θ∈Θad}  j(θ) := ∫_0^T ℓ(y(t; θ), t, θ) dt

    where y(t; θ) solves

      M(θ) (d/dt) y(t) + A(θ)y(t) = B(θ)u(t),  t ∈ [0, T],
      M(θ)y(0) = M(θ)y_0

    I by a reduced order problem

      min_{θ∈Θad}  Ĵ(θ) := ∫_0^T ℓ(ŷ(t; θ), t, θ) dt

    where ŷ(t; θ) solves

      M̂(θ) (d/dt) ŷ(t) + Â(θ)ŷ(t) = B̂(θ)u(t),  t ∈ [0, T],
      M̂(θ)ŷ(0) = M̂(θ)ŷ_0.

    I The problem is that we need a reduced order model that approximates the full order model for all θ ∈ Θad! This cannot be done using BTMR. I am not aware of any MR method that can do this with guaranteed error bounds.

    Matthias Heinkenschloss June 14, 2016 86

  • Localized parameters (nonlinearity)

    I Consider classes of problems where the shape parameter θ only influences a (small) subdomain:

      Ω̄(θ) := Ω̄_1 ∪ Ω̄_2(θ),  Ω_1 ∩ Ω_2(θ) = ∅,  Γ = Ω̄_1 ∩ Ω̄_2(θ).

    [Sketch: fixed subdomain Ω_1, variable subdomain Ω_2(θ), interface Γ]

    I The FE stiffness matrix times vector can be decomposed into

      Ay = [ A^{II}_1    A^{IΓ}_1      0           ] [ y^I_1 ]
           [ A^{ΓI}_1    A^{ΓΓ}(θ)     A^{ΓI}_2(θ) ] [ y^Γ   ]
           [ 0           A^{IΓ}_2(θ)   A^{II}_2(θ) ] [ y^I_2 ]

    where A^{ΓΓ}(θ) = A^{ΓΓ}_1 + A^{ΓΓ}_2(θ).
    The matrices M, B admit similar representations. (A short code sketch of this block product follows the slide.)

    I Consider objective functions of the type

      ∫_0^T ℓ(y(t), t, θ) dt = (1/2) ∫_0^T ‖C^I_1 y^I_1 − d^I_1(t)‖²_2 + ℓ̃(y^Γ(t), y^I_2(t), t, θ) dt.

    Matthias Heinkenschloss June 14, 2016 87
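To illustrate how the θ-dependence stays confined to the Ω_2(θ) blocks, here is a small Python sketch of the block matrix–vector product Ay from this slide. The dictionaries, block names, and shapes are illustrative assumptions; only the blocks carrying a θ argument would be reassembled when θ changes.

```python
import numpy as np

def apply_A(fixed, var, yI1, yG, yI2):
    """Block matrix-vector product with A(theta) for the decomposition on the slide.

    fixed : theta-independent blocks of subdomain 1 (AII1, AIG1, AGI1, AGG1)
    var   : theta-dependent blocks of subdomain 2 (AII2, AIG2, AGI2, AGG2),
            reassembled whenever theta changes; A^{GG}(theta) = AGG1 + AGG2(theta)."""
    rI1 = fixed["AII1"] @ yI1 + fixed["AIG1"] @ yG                      # interior of fixed Omega_1
    rG = (fixed["AGI1"] @ yI1 + (fixed["AGG1"] + var["AGG2"]) @ yG
          + var["AGI2"] @ yI2)                                          # interface Gamma
    rI2 = var["AIG2"] @ yG + var["AII2"] @ yI2                          # interior of Omega_2(theta)
    return rI1, rG, rI2
```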

  • Our optimization problem

      min_{θ∈Θad}  j(θ) := ∫_0^T ℓ(y(t; θ), t, θ) dt

    where y(t; θ) solves

      M(θ) (d/dt) y(t) + A(θ)y(t) = B(θ)u(t),  t ∈ [0, T],
      M(θ)y(0) = M(θ)y_0

    can now be written as

      min_{θ∈Θad}  j(θ) := (1/2) ∫_0^T ‖C^I_1 y^I_1 − d^I_1(t)‖²_2 + ℓ̃(y^Γ(t), y^I_2(t), t, θ) dt

    where y(t; θ) solves

      M^{II}_1 (d/dt) y^I_1(t) + M^{IΓ}_1 (d/dt) y^Γ(t) + A^{II}_1 y^I_1(t) + A^{IΓ}_1 y^Γ(t) = B^I_1 u^I_1(t)

      M^{II}_2(θ) (d/dt) y^I_2(t) + M^{IΓ}_2(θ) (d/dt) y^Γ(t) + A^{II}_2(θ) y^I_2(t) + A^{IΓ}_2(θ) y^Γ(t) = B^I_2(θ) u^I_2(t)

      M^{ΓI}_1 (d/dt) y^I_1(t) + M^{ΓΓ}(θ) (d/dt) y^Γ(t) + M^{ΓI}_2(θ) (d/dt) y^I_2(t)
        + A^{ΓI}_1 y^I_1(t) + A^{ΓΓ}(θ) y^Γ(t) + A^{ΓI}_2(θ) y^I_2(t) = B^Γ(θ) u^Γ(t)

    Dependence on θ ∈ Θad is now localized. The fixed subsystem 1 is large. The variable subsystem 2 is small. Idea: reduce subsystem 1 only.

    Matthias Heinkenschloss June 14, 2016 88

  • First Order Optimality Conditions

    I Lagrangian

      L(y, p, θ) = ∫_0^T ℓ(y(t), t, θ) dt + ∫_0^T p(t)^T ( M(θ) (d/dt) y(t) + A(θ)y(t) − B(θ)u(t) ) dt

    I The first order necessary optimality conditions are

      M(θ) (d/dt) y(t) + A(θ)y(t) = B(θ)u(t),  t ∈ [0, T],
      M(θ)y(0) = y_0,

      −M(θ) (d/dt) p(t) + A^T(θ)p(t) = −∇_y ℓ(y(t), t, θ),  t ∈ [0, T],
      M(θ)p(T) = 0,

      ∇_θ L(y, p, θ)(θ̃ − θ) ≥ 0,  ∀ θ̃ ∈ Θad.

    I The gradient of j is given by ∇j(θ) = ∇_θ L(y, p, θ).

    Matthias Heinkenschloss June 14, 2016 89


  • Using the DD structure, the state and adjoint equations can be written as

      M^{II}_1 (d/dt) y^I_1(t) + M^{IΓ}_1 (d/dt) y^Γ(t) + A^{II}_1 y^I_1(t) + A^{IΓ}_1 y^Γ(t) = B^I_1 u^I_1(t)

      M^{II}_2(θ) (d/dt) y^I_2(t) + M^{IΓ}_2(θ) (d/dt) y^Γ(t) + A^{II}_2(θ) y^I_2(t) + A^{IΓ}_2(θ) y^Γ(t) = B^I_2(θ) u^I_2(t)

      M^{ΓI}_1 (d/dt) y^I_1(t) + M^{ΓΓ}(θ) (d/dt) y^Γ(t) + M^{ΓI}_2(θ) (d/dt) y^I_2(t)
        + A^{ΓI}_1 y^I_1(t) + A^{ΓΓ}(θ) y^Γ(t) + A^{ΓI}_2(θ) y^I_2(t) = B^Γ(θ) u^Γ(t),

      −M^{II}_1 (d/dt) p^I_1(t) − M^{IΓ}_1 (d/dt) p^Γ(t) + A^{II}_1 p^I_1(t) + A^{IΓ}_1 p^Γ(t) = −(C^I_1)^T (C^I_1 y^I_1(t) − d^I_1)

      −M^{II}_2(θ) (d/dt) p^I_2(t) − M^{IΓ}_2(θ) (d/dt) p^Γ(t) + A^{II}_2(θ) p^I_2(t) + A^{IΓ}_2(θ) p^Γ(t) = −∇_{y^I_2} ℓ̃(.)

      −M^{ΓI}_1 (d/dt) p^I_1(t) − M^{ΓΓ}(θ) (d/dt) p^Γ(t) − M^{ΓI}_2(θ) (d/dt) p^I_2(t)
        + A^{ΓI}_1 p^I_1(t) + A^{ΓΓ}(θ) p^Γ(t) + A^{ΓI}_2(θ) p^I_2(t) = −∇_{y^Γ} ℓ̃(.),

    To apply model reduction to the system corresponding to the fixed subdomain Ω_1, we have to identify how y^I_1 and p^I_1 interact with the other components.

    Matthias Heinkenschloss June 14, 2016 90


  • Model Reduction of Fixed Subdomain Problem

    We need to reduce

      M^{II}_1 (d/dt) y^I_1(t) = −A^{II}_1 y^I_1(t) − M^{IΓ}_1 (d/dt) y^Γ(t) + B^I_1 u^I_1(t) − A^{IΓ}_1 y^Γ(t)
      z^I_1 = C^I_1 y^I_1(t) − d^I_1
      z^Γ_1 = −M^{ΓI}_1 (d/dt) y^I_1 − A^{ΓI}_1 y^I_1,

      −M^{II}_1 (d/dt) p^I_1(t) = −A^{II}_1 p^I_1(t) + M^{IΓ}_1 (d/dt) p^Γ(t) − (C^I_1)^T z^I_1 − A^{IΓ}_1 p^Γ(t)
      q^I_1 = (B^I_1)^T p^I_1
      q^Γ_1 = M^{ΓI}_1 (d/dt) p^I_1 − A^{ΓI}_1 p^I_1

    For simplicity we assume that

      M^{IΓ}_1 = 0,  M^{ΓI}_1 = 0,

    Matthias Heinkenschloss June 14, 2016 91

  • we get

      M^{II}_1 (d/dt) y^I_1(t) = −A^{II}_1 y^I_1(t) + ( B^I_1 | −A^{IΓ}_1 ) [u^I_1; y^Γ],

      [z^I_1; z^Γ_1] = [C^I_1; −A^{ΓI}_1] y^I_1 + [−I; 0] d^I_1,

      −M^{II}_1 (d/dt) p^I_1(t) = −A^{II}_1 p^I_1(t) + ( −(C^I_1)^T | −A^{IΓ}_1 ) [z^I_1; p^Γ],

      [q^I_1; q^Γ_1] = [(B^I_1)^T; −A^{ΓI}_1] p^I_1.

    The system is exactly of the form needed for balanced truncation model reduction. (A generic square-root BTMR sketch follows this slide.)

    Matthias Heinkenschloss June 14, 2016 92
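Since the fixed-subdomain system has exactly this LTI input/output form, the standard square-root balanced truncation algorithm yields the projection matrices W, V and the reduced matrices. The following Python sketch is a generic dense implementation under illustrative assumptions (A stable, Gramians positive definite, and the mass matrix M^{II}_1 already absorbed, e.g. A ← (M^{II}_1)⁻¹A); large-scale computations would instead use low-rank Gramian factors.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Square-root balanced truncation of y' = A y + B u, z = C y (A stable).

    Returns trial/test bases V, W (with W.T @ V = I_r), the reduced matrices,
    and the Hankel singular values."""
    # Gramians: A P + P A^T + B B^T = 0 and A^T Q + Q A + C^T C = 0.
    P = solve_continuous_lyapunov(A, -B @ B.T)
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)
    # Square-root factors P = S S^T, Q = R R^T (dense Cholesky; low-rank factors in practice).
    S = cholesky(P, lower=True)
    R = cholesky(Q, lower=True)
    # SVD of R^T S: its singular values are the Hankel singular values.
    U, sig, Zt = svd(R.T @ S)
    D = np.diag(sig[:r] ** -0.5)
    V = S @ Zt[:r].T @ D                    # trial space
    W = R @ U[:, :r] @ D                    # test space, W.T @ V = I_r
    Ar, Br, Cr = W.T @ A @ V, W.T @ B, C @ V
    return V, W, Ar, Br, Cr, sig
```

For the fixed-subdomain system on the slide, the inputs would be (u^I_1, y^Γ) and the outputs (z^I_1, z^Γ_1), i.e. B built from (B^I_1 | −A^{IΓ}_1) and C stacked from C^I_1 and −A^{ΓI}_1.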

  • Reduced Optimization Problem

    I We apply BTMR to the fixed subdomain problem with inputs and outputs determined by the original inputs to subdomain 1 as well as the interface conditions.
    I In the optimality conditions, replace the fixed subdomain problem by its reduced order model.
    I We can interpret the resulting reduced optimality system as the optimality system of the following reduced optimization problem:

      min  ∫_0^T (1/2)‖Ĉ^I_1 ŷ^I_1 − d^I_1(t)‖²_2 + ℓ̃(y^Γ(t), y^I_2(t), t, θ) dt

    subject to

      M̂^{II}_1 (d/dt) ŷ^I_1(t) + M̂^{IΓ}_1 (d/dt) y^Γ(t) + Â^{II}_1 ŷ^I_1(t) + Â^{IΓ}_1 y^Γ(t) = B̂^I_1 u^I_1(t)

      M^{II}_2(θ) (d/dt) y^I_2(t) + M^{IΓ}_2(θ) (d/dt) y^Γ(t) + A^{II}_2(θ) y^I_2(t) + A^{IΓ}_2(θ) y^Γ(t) = B^I_2(θ) u^I_2(t)

      M̂^{ΓI}_1 (d/dt) ŷ^I_1(t) + M^{ΓΓ}(θ) (d/dt) y^Γ(t) + M^{ΓI}_2(θ) (d/dt) y^I_2(t)
        + Â^{ΓI}_1 ŷ^I_1(t) + A^{ΓΓ}(θ) y^Γ(t) + A^{ΓI}_2(θ) y^I_2(t) = B^Γ(θ) u^Γ(t)

      ŷ^I_1(0) = ŷ^I_{1,0},  y^I_2(0) = y^I_{2,0},  y^Γ(0) = y^Γ_0,
      θ ∈ Θad

    Matthias Heinkenschloss June 14, 2016 93

  • Error Estimate

    If
    I there exists α > 0 such that

      v^T A v ≤ −α v^T M v,  ∀ v ∈ R^N,

    I the gradients ∇_{y^I_2} ℓ̃(y^I_2, y^Γ, t, θ), ∇_{y^Γ} ℓ̃(y^I_2, y^Γ, t, θ), ∇_θ ℓ̃(y^I_2, y^Γ, t, θ) are Lipschitz continuous in y^I_2, y^Γ,
    I for all ‖θ̃‖ ≤ 1 and all θ ∈ Θ the following bound holds

      max{ ‖D_θ M^{(2)}(θ)θ̃‖, ‖D_θ A^{(2)}(θ)θ̃‖, ‖D_θ B^{(2)}(θ)θ̃‖ } ≤ γ,

    then there exists c > 0, dependent on u, ŷ, and p̂, such that

      ‖∇J(θ) − ∇Ĵ(θ)‖_{L²} ≤ (c/α) (σ_{r+1} + … + σ_n).

    If we assume the convexity condition

      (∇J(θ̂∗) − ∇J(θ∗))^T (θ̂∗ − θ∗) ≥ κ‖θ̂∗ − θ∗‖²,

    then we obtain the error bound

      ‖θ∗ − θ̂∗‖ ≤ (c/(ακ)) (σ_{r+1} + … + σ_n).

    Matthias Heinkenschloss June 14, 2016 94

  • Example

    I Reference domain Ωref

    [Figure: reference domain with subdomains Ω_A, Ω_B, Ω_C, Ω_H, interface Γ_I, and boundaries Γ_L, Γ_R, Γ_T, Γ_B]

    I Optimization problem

      min  ∫_0^T ∫_{Γ_L∪Γ_R} |y − y_d|² ds dt + ∫_0^T ∫_{Ω_2(θ)} |y − y_d|² dx dt

    subject to the differential equation

      y_t(x, t) − ∆y(x, t) + y(x, t) = 100  in Ω(θ) × (0, T),
      n · ∇y(x, t) = 0  on ∂Ω(θ) × (0, T),
      y(x, 0) = 0  in Ω(θ)

    and design parameter constraints θ_min ≤ θ ≤ θ_max.

    I We use k_T = 3, k_B = 3 Bézier control points to specify the top and the bottom boundary of the variable subdomain Ω_2(θ).
      The desired temperature y_d is computed by specifying the optimal parameter θ∗ and solving the state equation on Ω(θ∗).

    Matthias Heinkenschloss June 14, 2016 95

  • I We use automatic differentiation to compute the derivatives with respect to the design variables θ.

    I The semi-discretized optimization problems are solved using a projected BFGS method with Armijo line search. The optimization algorithm is terminated when the norm of the projected gradient is less than ε = 10⁻⁴. (A simplified sketch of such a projected descent method follows this slide.)

    I The optimal domain

    [Figure: the optimal domain]

    Matthias Heinkenschloss June 14, 2016 96
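The slides use a projected BFGS method with Armijo line search; as a simplified, illustrative stand-in, the following Python sketch implements a projected gradient method with Armijo backtracking for box constraints θ_min ≤ θ ≤ θ_max and the same stopping test on the projected gradient. The function and parameter names are assumptions for illustration; j and grad would be supplied by the (full or reduced) model, e.g. with the gradient obtained via automatic differentiation or the adjoint formula.

```python
import numpy as np

def projected_gradient_armijo(j, grad, theta0, lb, ub,
                              tol=1e-4, max_iter=200, beta=0.5, c1=1e-4):
    """Projected gradient method with Armijo backtracking for min j(theta), lb <= theta <= ub.

    Simplified stand-in for the projected BFGS method mentioned on the slide;
    stops when the projected gradient norm ||theta - P(theta - grad)|| drops below tol."""
    proj = lambda x: np.clip(x, lb, ub)
    theta = proj(np.asarray(theta0, dtype=float))
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(theta - proj(theta - g)) < tol:
            break
        alpha, j_old = 1.0, j(theta)
        while True:                          # backtrack along the projection arc
            theta_new = proj(theta - alpha * g)
            if j(theta_new) <= j_old - c1 * g @ (theta - theta_new) or alpha < 1e-12:
                break
            alpha *= beta
        theta = theta_new
    return theta
```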

  • Sizes of the full and the reduced order problems:

                 N^(1)_dof   N_dof
      Reduced    147         581
      Full       4280        4714

    [Figure: the largest Hankel singular values and the threshold 10⁻⁴σ_1]

    Error in solution between the full and the reduced order problem: ‖θ∗ − θ̂∗‖_2 = 2.325 · 10⁻⁴.

    Optimal shape parameters θ∗ and θ̂∗ (rounded to 5 digits) computed by minimizing the full and the reduced order model:

      θ∗   (1.00, 2.0000, 2.0000, -2.0000, -2.0000, -1.00)
      θ̂∗   (1.00, 1.9999, 2.0001, -2.0001, -1.9998, -1.00)

    Matthias Heinkenschloss June 14, 2016 97

  • The convergence histories of the projected BFGS algorithm applied to the full and the reduced order problems.

    [Figure: convergence history of the objective functionals J(x_k) for the full (+) and reduced (o) order model]

    [Figure: convergence history of the projected gradients ‖P∇J(x_k)‖ for the full (+) and reduced (o) order model]

    Matthias Heinkenschloss June 14, 2016 98

  • Example - Stokes

    Geometry motivated by biochip

    Problems where the shape param. θ only influences a (small) subdomain:

    Ω̄(θ) := Ω̄1 ∪ Ω̄2(θ), Ω1 ∩ Ω2(θ) = ∅, Γ = Ω̄1 ∩ Ω̄2(θ).

    Here Ω_2(θ) is the top left yellow square domain.

    Matthias Heinkenschloss June 14, 2016 99

  • min_{θmin ≤ θ ≤ θmax}  J(θ) = ∫_0^T [ ∫_{Ωobs} (1/2)|∇×v(x, t; θ)|² dx + ∫_{Ω_2(θ)} (1/2)|v(x, t; θ) − v^d(x, t)|² dx ] dt

    where v(θ) and p(θ) solve the Stokes equations

      v_t(x, t) − µ∆v(x, t) + ∇p(x, t) = f(x, t)  in Ω(θ) × (0, T),
      ∇ · v(x, t) = 0  in Ω(θ) × (0, T),
      v(x, t) = v_in(x, t)  on Γin × (0, T),
      v(x, t) = 0  on Γlat × (0, T),
      −(µ∇v(x, t) − p(x, t)I)n = 0  on Γout × (0, T),
      v(x, 0) = 0  in Ω(θ).

    Here Ω(θ) = Ω_1 ∪ Ω_2(θ) and Ω_2(θ) is the top left yellow square domain. The observation region Ωobs is part of the two reservoirs.

    The Stokes equation requires additional care:
    I Domain decomposition ([Pavarino and Widlund, 2002]).
    I Balanced truncation ([Stykel, 2006], [Heinkenschloss et al., 2008]).
    I See [Antil et al., 2011].

    Matthias Heinkenschloss June 14, 2016 100

  • We have 12 shape parameters, θ ∈ R^12.

    [Figures: reference domain Ωref (left) and optimal domain (right)]

    Matthias Heinkenschloss June 14, 2016 101

  • grid   m     N^(1)_{v,dof}   N^(1)_{v̂,dof}   N_{v,dof}   N_{v̂,dof}
    1      149   4752            23               4862        133
    2      313   7410            25               7568        183
    3      361   11474           26               11700       252
    4      537   16472           29               16806       363

    The number m of observations in Ωobs, the number of velocities N^(1)_{v,dof}, N^(1)_{v̂,dof} in the fixed subdomain Ω_1 for the full and reduced order model, and the number of velocities N_{v,dof}, N_{v̂,dof} in the entire domain Ω for the full and reduced order model, for five discretizations.

    [Figure: the largest Hankel singular values and the threshold 10⁻³σ_1]

    Matthias Heinkenschloss June 14, 2016 102

  • I Error in the optimal parameter computed using the full and the reduced order model (rounded to 5 digits):

      θ∗   (9.8987, 9.7510, 9.7496, 9.8994, 9.0991, 9.2499, 9.2504, 9.0989)
      θ̂∗   (9.9026, 9.7498, 9.7484, 9.9021, 9.0940, 9.2514, 9.2511, 9.0956)

    I The convergence histories of the projected BFGS algorithm applied to the full and the reduced order problems.

    [Figure: convergence history of the objective functionals J(x_k) for the full (+) and reduced (o) order model]

    [Figure: convergence history of the projected gradients ‖P∇J(x_k)‖ for the full (+) and reduced (o) order model]

    Matthias Heinkenschloss June 14, 2016 103

  • Recap of Part I

    I Reviewed projection based model reduction for simulation.
    I Reviewed the adjoint equation approach for gradient and Hessian computation.
      I Gradient computation requires the solution of the adjoint PDE.
      I Hessian times vector computation requires the solution of the linearized state PDE and the 2nd order adjoint PDE.
    I Reduced order models for optimization must approximate the objective function j(u) and its gradient ∇j(u).
    I Considered two classes of optimization problems:
      I Parameterized linear quadratic problems.
        Sample optimization problems to generate a reduced order model that allows fast on-line solution of the linear quadratic problem at an out-of-sample parameter.
      I Linear quadratic problems, or problems with localized nonlinearity, for which reduced order models can be computed that are good approximations for all controls u.

    Matthias Heinkenschloss June 14, 2016 104

  • Review Part I

    I Overview

    I Example Optimization Problems

    I Optimization Problem

    I Projection Based Model Reduction

    I Back to Optimization

    I Error Estimates

    I Linear-Quadratic Problems

    I Shape Optimization with Local Parameter Dependence

    Matthias Heinkenschloss June 14, 2016 105

  • I Original problem

      min j(u)
      s.t. u ∈ Uad,

    where j(u) = J(y(u), u), and y(u) ∈ R^n solves c(y, u) = 0; n is large.

    I Reduced order problem
      Construct V ∈ R^{n×r}, r ≪ n, rank(V) = r.
      Reduced order problem:

      min ĵ(u)
      s.t. u ∈ Uad,

    where ĵ(u) = J(Vŷ(u), u), and ŷ(u) ∈ R^r solves the reduced state equation V^T c(Vŷ, u) = 0 ∈ R^r. (A small sketch of evaluating ĵ(u) this way follows this slide.)

    Matthias Heinkenschloss June 14, 2016 106
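For a nonlinear state equation, evaluating ĵ(u) means solving the Galerkin-projected equation V^T c(Vŷ, u) = 0 for ŷ and then evaluating J. A minimal Python sketch using Newton's method on the reduced residual; the callables J, c, c_y and all names are illustrative assumptions supplied by the user, not part of the slides.

```python
import numpy as np

def reduced_objective(J, c, c_y, V, u, yhat0, tol=1e-10, max_iter=20):
    """Evaluate j_hat(u) = J(V yhat(u), u), where yhat solves V^T c(V yhat, u) = 0.

    c(y, u)   : residual of the full state equation, a vector in R^n
    c_y(y, u) : Jacobian of c with respect to y, an n-by-n matrix
    V         : n-by-r reduced basis.  Newton's method on the r-dimensional system."""
    yhat = np.asarray(yhat0, dtype=float).copy()
    for _ in range(max_iter):
        y = V @ yhat
        res = V.T @ c(y, u)                       # reduced residual in R^r
        if np.linalg.norm(res) < tol:
            break
        Jred = V.T @ (c_y(y, u) @ V)              # reduced Jacobian, r-by-r
        yhat -= np.linalg.solve(Jred, res)
    return J(V @ yhat, u), yhat
```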

  • Review Proper Orthogonal Decomposition I

    I Finite dimensional representation of snapshots y(t_1), . . . , y(t_m) ∈ R^n, m > r.
    I Inner product v^T M w and norm ‖ · ‖_M. M is s.p.d., but not necessarily the mass matrix.
    I Compute an orthonormal basis v_1, . . . , v_r as the solution of

      min  Σ_{k=1}^m ‖ y(t_k) − Σ_{i=1}^r (y(t_k)^T M v_i) v_i ‖²_M
      s.t. v_i^T M v_j = δ_ij.

    I Solution
      I Define Y = [y(t_1), . . . , y(t_m)] ∈ R^{n×m}.
      I Compute M-orthonormal eigenvectors v_1, v_2, . . . ∈ R^n and eigenvalues λ_1 ≥ . . . ≥ λ_m ≥ 0 of the generalized n × n eigenvalue problem

      M Y Y^T M v_i = λ_i M v_i.

    Matthias Heinkenschloss June 14, 2016 107

  • Review Proper Orthogonal Decomposition II

    I Alternatively, if n > m, compute eigenvectors w_1, w_2, . . . ∈ R^m and eigenvalues λ_1 ≥ λ_2 ≥ . . . ≥ λ_{min{m,n}} ≥ 0 of

      Y^T M Y w_i = λ_i w_i,     v_i = λ_i^{−1/2} Y w_i,  i = 1, . . . , m.

    I Usually, fix a tolerance ε > 0. Compute eigenvectors v_1, v_2, . . . ∈ R^n and eigenvalues λ_1 ≥ λ_2 ≥ . . . ≥ λ_{min{m,n}} ≥ 0.
      I Find the smallest r such that Σ_{i=r+1}^m λ_i < ε.
      I If only some of the largest eigenvalues and eigenvectors are computed: find the smallest r such that λ_{r+1}/λ_1 < ε.

    Reduced order model V = [v_1, . . . , v_r] ∈ R^{n×r}.
    Error

      Σ_{k=1}^m ‖ y(t_k) − Σ_{i=1}^r (y(t_k)^T M v_i) v_i ‖²_M = Σ_{i=r+1}^{min{m,n}} λ_i

    (A short code sketch of the method of snapshots follows this slide.)

    Matthias Heinkenschloss June 14, 2016 108
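A compact Python sketch of the method of snapshots with an M-weighted inner product, following the formulas on the two POD review slides; the snapshot matrix, weight matrix, and tolerance in the example call are illustrative assumptions.

```python
import numpy as np

def pod_basis(Y, M, eps=1e-8):
    """POD basis via the method of snapshots for the M-weighted inner product.

    Y : n-by-m snapshot matrix [y(t_1), ..., y(t_m)],  M : s.p.d. weight matrix.
    Returns V = [v_1, ..., v_r] with v_i^T M v_j = delta_ij and all eigenvalues lam."""
    G = Y.T @ (M @ Y)                                # m-by-m Gram matrix Y^T M Y
    lam, W = np.linalg.eigh(G)                       # eigenvalues in ascending order
    lam, W = np.maximum(lam[::-1], 0.0), W[:, ::-1]  # reorder to lambda_1 >= lambda_2 >= ...
    r = len(lam)
    for i in range(1, len(lam) + 1):                 # smallest r with neglected energy below eps
        if lam[i:].sum() < eps:
            r = i
            break
    V = Y @ W[:, :r] / np.sqrt(lam[:r])              # v_i = lambda_i^{-1/2} Y w_i
    return V, lam

# Illustrative use: 200 snapshots of a decaying sine profile, identity weight matrix.
x = np.linspace(0.0, 1.0, 400)
t = np.linspace(0.0, 1.0, 200)
Y = np.sin(np.pi * x)[:, None] * np.exp(-t)[None, :]
V, lam = pod_basis(Y, np.eye(400))
```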

  • Review Error Estimate for Unconstrained Problems

    I û∗ = argmin_u ĵ(u) is a minimizer of the unconstrained reduced problem.
    I Newton-Kantorovich Theorem: Let r > 0, ∇²j ∈ Lip_L(B_r(û∗)), ∇²j(û∗) be nonsingular, and let ζ, η ≥ 0 be constants such that

      ‖∇²j(û∗)⁻¹‖ = ζ,   ‖∇²j(û∗)⁻¹∇j(û∗)‖ ≤ η.

    If Lζη ≤ 1/2, there is a unique local minimum u∗ of j in a ball around û∗ with radius

      min{ r, (1 − √(1 − 2Lζη)) / (Lζ) } ≤ min{ r, 2η }.

    (A small numerical illustration of this radius bound follows this slide.)

    I Estimate η:

      ‖∇²j(û∗)⁻¹ (∇j(û∗) − ∇ĵ(û∗))‖ ≤ ζ ‖∇j(û∗) − ∇ĵ(û∗)‖ = η,   where ∇ĵ(û∗) = 0.

    I Hence ‖u∗ − û∗‖_U ≤ 2ζ ‖∇ĵ(û∗) − ∇j(û∗)‖_U.
    I Estimate the error in the gradients to get an estimate for the error in the solution.
    I Need Lζη ≤ 1/2, i.e., ∇j(û∗) small enough.
    I Can estimate the error using convergence properties of Newton's method started with û∗ applied to the original problem.

    Matthias Heinkenschloss June 14, 2016 109
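A small numerical illustration of the radius bound; all constants below are made up for illustration and are not taken from any computation on the slides.

```python
import numpy as np

def nk_radius(L, zeta, eta, r):
    """Ball radius around u_hat* guaranteed by the Newton-Kantorovich argument.

    Requires L*zeta*eta <= 1/2; the radius is bounded above by min(r, 2*eta)."""
    h = L * zeta * eta
    assert h <= 0.5, "Newton-Kantorovich condition L*zeta*eta <= 1/2 violated"
    return min(r, (1.0 - np.sqrt(1.0 - 2.0 * h)) / (L * zeta))

# Made-up constants: eta = zeta * ||grad j(u_hat*)|| with a gradient error of 1e-3.
L, zeta, grad_err = 10.0, 0.5, 1e-3
eta = zeta * grad_err
print(nk_radius(L, zeta, eta, r=1.0), "<=", 2.0 * eta)
```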

  • Outline

    Overview

    Example Optimization Problems

    Optimization Problem

    Projection Based Model Reduction

    Back to Optimization

    Error Estimates

    Linear-Quadratic Problems

    Shape Optimization with Local Parameter Dependence

    Semilinear Parabolic Problems

    Trust-Region Framework

    Matthias Heinkenschloss June 14, 2016 110

  • Semilinear Parabolic Model Problems

    I Distributed control

      min  (1/2) ∫_0^T ∫_Ω |y(x, t) − z(x, t)|² dx dt + (α/2) ∫_0^T ∫_Ω |u(x, t)|² dx dt,

    where y = y(u) solves

      y_t(x, t) − ν∆y(x, t) + y(x, t)³ + u(x, t) = f(x, t),  (x, t) ∈ Ω × (0, T),
      y(x, t) = 0,  (x, t) ∈ Γ × (0, T),
      y(x, 0) = y_0(x),  x ∈ Ω.

    I Robin boundary control of Burgers equation

      min  (1/2) ∫_0^T ∫_0^1 |y(x, t) − z(x, t)|² dx dt + (α/2) ∫_0^T (|u_0(t)|² + |u_1(t)|²) dt,

    where y = y(u) solves

      y_t(x, t) − ν y_xx(x, t) + y(x, t) y_x(x, t) = f(x, t),  (x, t) ∈ (0, 1) × (0, T),
      ν y_x(0, t) + σ_0 y(0, t) = u_0(t),  t ∈ (0, T),
      ν y_x(1, t) + σ_1 y(1, t) = u_1(t),  t ∈ (0, T),
      y(x, 0) = y_0(x),  x ∈ (0, 1).

    Matt