application h-matrices for solving pdes with multi-scale coefficients, jumping and strongly...

Post on 08-Feb-2017


Application of H-matrices for solving multiscale problems

Litvinenko Alexander,

Dissertation work

Max-Planck-Institut für Mathematik in den Naturwissenschaften,

Leipzig, 10 August, 2006.

www.hlib.org www.mis.mpg.de

[Figure: diagram of the application areas of H-matrices: integral equations, BEM 3D; parallel implementation of H-matrices; Helmholtz equation; convection-diffusion problems; multigrid + H-matrices; H-matrix approximation of sign(A), exp(A), etc.; a posteriori error estimation + efficient H-matrix update; Lyapunov and Riccati equations; DD methods; Schur complement methods; hierarchical domain decomposition for multiscale problems*; 3D skin problem*; multidimensional problems.]

Fig. 1 – Main directions of applications of H-matrices. The symbol * refers to the projects in which I took part.


Contents

1. Examples of multiscale problems

2. Multiscale methods

3. HDD method

4. Hierarchical matrices

5. Application of H-matrices to HDD

6. Complexity and storage of HDD

7. Modifications of HDD

– Two scales

– Truncation of the small scales

8. Numerical results

Examples of multiscale problems

[Figure: (a) macroscopic scale, (b) microscopic scale.]

Different scales in a porous medium [Bastian 99].

[Figure: time scales from 10^-6 s to 10^3 s and length scales from 10^-12 m to 10^-3 m (atom, protein, cell, tissue), with the processes: molecular events (ion channel gating), diffusion, cell signalling, mitosis.]

Example of time and length scales for modeling tumor growth [Alarcon, Byrne, Maini 05].

[Figure: plot over x from 0 to 6 with values between -0.6 and 0.6.]

Fig. 2 – Fine properties of the solution are not of interest.

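Multiscale methods

The equation is:

$$ -\nabla \cdot (a(x) \nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega. \tag{1} $$

Homogenisation [Babuska 75], [Bensoussan, Lions, Papanicolaou 78], [Jikov, Kozlov, Oleinik 94]

The solution is

$$ u_\varepsilon(x) = u_0(x) + \varepsilon\, u_1(x, x/\varepsilon) + O(\varepsilon^2), $$

where $u_0$ is the solution of the homogenized equation

$$ -\nabla \cdot (a^* \nabla u_0) = f \ \text{in } \Omega, \qquad u_0 = 0 \ \text{on } \partial\Omega. \tag{2} $$

Resonance effect in MsFEM [T. Hou, X. Wu 97]:

$$ \| u - u_h \|_{0,\Omega} = O(h^2 + \varepsilon / h). \tag{3} $$

Heterogeneous multiscale method [Weinan E, B. Engquist 03]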
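The homogenisation statement can be checked numerically in one dimension. The sketch below is only an illustration under assumptions that are not in the slides: the coefficient a(y) = 1/(2 + sin(2πy)), the right-hand side f = 1, and a simple finite-difference discretisation. For this a, the homogenized coefficient is the harmonic mean a* = 1/2, so the homogenized solution is u_0(x) = x(1 - x), and the distance to the fine-scale solution should shrink with ε.

```python
import numpy as np

def solve_fine_scale(eps, n=1000):
    """Finite-difference solution of -(a(x/eps) u')' = 1 on (0, 1), u(0) = u(1) = 0."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    x_mid = 0.5 * (x[:-1] + x[1:])
    # Assumed coefficient a(y) = 1 / (2 + sin(2*pi*y)), sampled at the cell midpoints.
    a_mid = 1.0 / (2.0 + np.sin(2.0 * np.pi * x_mid / eps))
    # Tridiagonal system for the n - 1 interior nodes.
    main = (a_mid[:-1] + a_mid[1:]) / h**2
    off = -a_mid[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    u = np.zeros(n + 1)
    u[1:-1] = np.linalg.solve(A, np.ones(n - 1))
    return x, u

def homogenization_error(eps, n=1000):
    """Maximum nodal distance between u_eps and the homogenized solution u_0."""
    x, u_eps = solve_fine_scale(eps, n)
    u_0 = x * (1.0 - x)  # solves -a* u_0'' = 1 with a* = 1/2
    return np.max(np.abs(u_eps - u_0))

for eps in (0.05, 0.025):
    print(eps, homogenization_error(eps))
```

In one dimension the homogenized coefficient is known in closed form, which is why this check is so short; in two or three dimensions a cell problem has to be solved instead.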

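Problem setup

The Poisson problem: find $u \in H^1(\Omega)$ s.t.

$$ \sum_{1 \le i,j \le 2} \frac{\partial}{\partial x_i}\, a_{i,j}(x)\, \frac{\partial}{\partial x_j} u = f \ \text{in } \Omega, \qquad u = g \ \text{on } \Gamma, \tag{4} $$

where $a_{i,j} \in L^\infty(\Omega)$ are such that $A(x) = (a_{i,j})_{i,j=1,\dots,d}$ satisfies

$$ 0 < \underline{\lambda} \le \lambda_{\min}(A(x)) \le \lambda_{\max}(A(x)) \le \overline{\lambda}, \quad \forall x \in \Omega. $$

⇒ Oscillatory or jumping coefficients are allowed.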

The idea of HDD

Find operators $B_h$, $C_h$ s.t.

$$ u_h = B_h f_h + C_h g_h, \tag{5} $$

where $f_h$ is the right-hand side and $g_h$ are the Dirichlet boundary values.

The composed matrix $(B_h, C_h)$ is the 'inverse' of the stiffness matrix $A_h$.

The complete inverse $(B_h, C_h)$ contains too much information. We might be interested only in a few functionals of the solution.

Example: we want to know $u_h(f_h, g_h)$ only for $f_h$ in a smaller space $V_H \subset V_h$.
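To make the splitting (5) concrete, here is a minimal sketch for a 1D model problem -u'' = f with linear finite elements. It is an illustration only: the 1D setting, the dense inverse and the interpretation of f_h as the assembled load vector are assumptions of this example. HDD itself never forms a dense inverse.

```python
import numpy as np

n = 8                         # number of elements (assumed small 1D example)
h = 1.0 / n
# Stiffness matrix of -u'' with linear elements on all n + 1 nodes.
K = (2.0 * np.eye(n + 1) - np.diag(np.ones(n), 1) - np.diag(np.ones(n), -1)) / h
K[0, 0] = K[-1, -1] = 1.0 / h

interior = np.arange(1, n)
boundary = np.array([0, n])
K_II = K[np.ix_(interior, interior)]
K_IB = K[np.ix_(interior, boundary)]

B = np.linalg.inv(K_II)       # maps the load vector f_h to the interior values
C = -B @ K_IB                 # maps the Dirichlet values g_h to the interior values

f = h * np.ones(n - 1)        # load vector for f = 1
g = np.array([1.0, 3.0])      # Dirichlet values at x = 0 and x = 1
u = B @ f + C @ g             # u_h = B_h f_h + C_h g_h

x = interior * h
u_exact = 1.0 + 2.0 * x + 0.5 * x * (1.0 - x)
print(np.max(np.abs(u - u_exact)))
```

Once B and C are available, any new pair (f, g) costs only two matrix-vector products, which is the motivation for working with the operators rather than with a single solve.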

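Domain decomposition tree $T_{\mathcal{T}_h}$

FE discretisation: triangulation $\mathcal{T}_h$, $\Omega = \bigcup_{t \in \mathcal{T}_h} t$.

[Figure: a triangulated domain with numbered vertices and its successive splitting into subdomains.]

• Ω is the root of the tree,
• $T_{\mathcal{T}_h}$ is a binary tree,
• if $\omega \in T_{\mathcal{T}_h}$ has two sons $\omega_1, \omega_2 \in T_{\mathcal{T}_h}$, then $\omega = \omega_1 \cup \omega_2$ and $\gamma_\omega = \partial\omega_1 \cap \partial\omega_2$,
• $\omega \in T_{\mathcal{T}_h}$ is a leaf if and only if $\omega \in \mathcal{T}_h$.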
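A binary tree with these properties can be sketched by recursive bisection. The example below is hypothetical: it represents each triangle only by its centroid and splits along the longer coordinate direction, which is one simple choice and not necessarily the one used in the talk. In particular, it does not guarantee connected subdomains.

```python
import numpy as np

def build_tree(elements, centroids):
    """Recursively bisect a set of element indices; every leaf holds one element."""
    node = {"elements": elements, "sons": []}
    if len(elements) > 1:
        pts = centroids[elements]
        # Split along the coordinate direction with the larger extent.
        axis = int(np.argmax(pts.max(axis=0) - pts.min(axis=0)))
        order = elements[np.argsort(pts[:, axis], kind="stable")]
        half = len(order) // 2
        node["sons"] = [build_tree(order[:half], centroids),
                        build_tree(order[half:], centroids)]
    return node

def leaves(node):
    """Collect the leaves of the tree from left to right."""
    if not node["sons"]:
        return [node]
    return [leaf for son in node["sons"] for leaf in leaves(son)]

rng = np.random.default_rng(0)
centroids = rng.random((32, 2))          # stand-in for triangle centroids
tree = build_tree(np.arange(32), centroids)
print(len(leaves(tree)))
```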

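Notations

Let $\omega \in T_{\mathcal{T}_h}$, $\omega = \omega_1 \cup \omega_2$.

$\Gamma_{\omega,1} := \partial\omega \cap \omega_1$, $\Gamma_{\omega,2} := \partial\omega \cap \omega_2$ and $\gamma_\omega := \partial\omega_1 \setminus \partial\omega = \partial\omega_2 \setminus \partial\omega$.

[Figure: subdomain ω split into ω_1 and ω_2, with the labels γ_ω, Γ_{ω,1}, Γ_{ω,2} and Γ_ω.]

$I = I(\Omega)$ = set of all vertices of Ω.

$I(\omega) = \{ i \in I : x_i \in \omega \}$.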

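Discretisation

Let $\omega \in T_{\mathcal{T}_h}$. Denote $d_\omega := \big( (f_i)_{i \in I(\omega)}, (g_i)_{i \in I(\partial\omega)} \big)$. Define the following discrete problem in the variational form:

$$ a_\omega(u_h, b_j) = (f_\omega, b_j)_{L^2(\omega)} \quad \forall j \in I(\omega), \qquad u_h(x_j) = g_j \quad \forall j \in I(\partial\omega). \tag{6} $$

$$ a(b_i, b_j) = \int_\Omega \alpha(x) (\nabla b_i, \nabla b_j)\, dx, \qquad (f, b_j) = \int_{\mathrm{supp}\, b_j} f\, b_j\, dx. $$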

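1. Mapping $\Psi_\omega$

$\Psi_\omega(d_\omega) = \big( (\Psi_\omega(d_\omega))_i \big)_{i \in I(\partial\omega)}$ with

$$ (\Psi_\omega(d_\omega))_i = a_\omega(u_h, b_i) - (f_\omega, b_i)_{L^2(\omega)}, $$

$$ \Psi_\omega d_\omega = \Psi^f_\omega f_\omega + \Psi^g_\omega g_\omega. $$

2. Mapping $\Phi_\omega$

$$ (\Phi_\omega(d_\omega))_i := u_h(x_i), \quad \forall i \in I(\gamma_\omega). $$

Hence, $\Phi_\omega(d_\omega)$ is the trace of $u_h$ on $\gamma_\omega$.

The goal of HDD is to build the set of mappings $\Phi_0, \Phi_1, \Phi_2, \dots, \Phi_n$, which then produce sequentially the solution on $\gamma_{\omega_0}, \gamma_{\omega_1}, \gamma_{\omega_2}, \dots, \gamma_{\omega_n}$.

[Figure: subdomain ω with sons ω_1, ω_2 and nodes x_j on the interface γ_ω.]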

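Construction of the mappings $\Psi_\omega$ and $\Phi_\omega$

Let $\omega_1$ and $\omega_2$ be the two sons of $\omega \in T_{\mathcal{T}_h}$. Let $d_{\omega_1}$ and $d_{\omega_2}$ be the data associated to $\omega_1$ and $\omega_2$ s.t.:

• (consistency conditions for the Dirichlet data)

$$ g_{1,i} = g_{2,i}, \quad \forall i \in I(\omega_1) \cap I(\omega_2), \tag{7} $$

• (consistency conditions for the right-hand side)

$$ f_{1,i} = f_{2,i}, \quad \forall i \in I(\omega_1) \cap I(\omega_2). \tag{8} $$

Let $u_{\omega_1}$ and $u_{\omega_2}$ be the local FE solutions of the problem (6) for the data $d_{\omega_1}$, $d_{\omega_2}$.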

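If $u_{\omega_1}$, $u_{\omega_2}$ satisfy the Neumann condition

$$ {}^{\gamma}\Psi_{\omega_1}(d_{\omega_1}) + {}^{\gamma}\Psi_{\omega_2}(d_{\omega_2}) = 0, $$

then $u_\omega$ defined by

$$ u_\omega(x_i) := \begin{cases} u_{\omega_1}(x_i) & \text{for } i \in I(\omega_1), \\ u_{\omega_2}(x_i) & \text{for } i \in I(\omega_2) \end{cases} \tag{9} $$

is the solution of (6) for the data $d_\omega := (f_\omega, g_\omega)$ given by

$$ (f_\omega)_i := \begin{cases} f_{1,i} & \text{for } i \in I(\omega_1), \\ f_{2,i} & \text{for } i \in I(\omega_2), \end{cases} \qquad (g_\omega)_i := \begin{cases} g_{1,i} & \text{for } i \in I(\partial\omega_1), \\ g_{2,i} & \text{for } i \in I(\partial\omega_2). \end{cases} $$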

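$$ \big( {}^{\gamma}\Psi^{\gamma}_{\omega_1} + {}^{\gamma}\Psi^{\gamma}_{\omega_2} \big)\, g_\gamma = -\Psi^f_{\omega_1} f_1 - \Psi^\Gamma_{\omega_1} g_{1,\Gamma} - \Psi^f_{\omega_2} f_2 - \Psi^\Gamma_{\omega_2} g_{2,\Gamma}. $$

We set

$$ M := -\big( {}^{\gamma}\Psi^{\gamma}_{\omega_1} + {}^{\gamma}\Psi^{\gamma}_{\omega_2} \big), $$

compute $M^{-1}$ and solve for $g_\gamma$:

$$ g_\gamma = M^{-1} \big( \Psi^f_{\omega_1} f_1 + \Psi^\Gamma_{\omega_1} g_{1,\Gamma} + \Psi^f_{\omega_2} f_2 + \Psi^\Gamma_{\omega_2} g_{2,\Gamma} \big). $$

For given mappings $\Psi_{\omega_1}$, $\Psi_{\omega_2}$, defined on the sons $\omega_1$, $\omega_2$, we can compute $\Phi_\omega$ and $\Psi_\omega$ for the father ω. This recursion ends as soon as ω = Ω.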

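Hierarchical Process

1. Leaves to Root

1. Compute $\Psi_\omega$ on all leaves (3 × 3 matrices).

2. Recursion from the leaves to the root:

(a) Compute and store $\Phi_\omega$ and $\Psi_\omega$ from $\Psi_{\omega_1}$, $\Psi_{\omega_2}$.

(b) Delete $\Psi_{\omega_1}$, $\Psi_{\omega_2}$.

2. Root to Leaves

1. Given $d_\omega = (f_\omega, g_\omega)$, compute the solution $u_h$ on the interior boundary $\gamma_\omega$ by $\Phi_\omega(d_\omega)$.

2. Build the data $d_{\omega_1} = (f_{\omega_1}, g_{\omega_1})$, $d_{\omega_2} = (f_{\omega_2}, g_{\omega_2})$ from $d_\omega = (f_\omega, g_\omega)$ and $g_\gamma := \Phi_\omega(d_\omega)$.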
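The two passes can be illustrated on a 1D analogue. The sketch below is a simplification made for this example, not the method of the talk as implemented: the leaves are 1D linear elements (2 × 2 instead of 3 × 3 matrices), the right-hand side is fixed in advance so that the f-part of each mapping is a vector rather than an operator, and no H-matrix compression is used. All function names are hypothetical.

```python
import numpy as np

def leaf(a_e, f_e, h):
    """Psi on a leaf: element stiffness matrix and element load vector."""
    S = (a_e / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    c = 0.5 * h * f_e * np.ones(2)
    return S, c

def merge(psi1, psi2):
    """Glue two sons at their shared node and eliminate that node."""
    (S1, c1), (S2, c2) = psi1, psi2
    S = np.zeros((3, 3))
    c = np.zeros(3)
    S[:2, :2] += S1
    c[:2] += c1
    S[1:, 1:] += S2
    c[1:] += c2
    b = [0, 2]                                   # outer nodes of the father
    phi_f = c[1] / S[1, 1]                       # Phi: u_mid = phi_f + phi_g @ u_outer
    phi_g = -S[1, b] / S[1, 1]
    S_new = S[np.ix_(b, b)] - np.outer(S[b, 1], S[1, b]) / S[1, 1]
    c_new = c[b] - S[b, 1] * phi_f
    return (S_new, c_new), (phi_f, phi_g)

def leaves_to_root(lo, hi, a, f, h):
    """Return Psi of the elements lo..hi-1 and the tree of stored Phi mappings."""
    if hi - lo == 1:
        return leaf(a[lo], f[lo], h), None
    mid = (lo + hi) // 2
    psi1, t1 = leaves_to_root(lo, mid, a, f, h)
    psi2, t2 = leaves_to_root(mid, hi, a, f, h)
    psi, phi = merge(psi1, psi2)                 # psi1, psi2 are not kept
    return psi, (mid, phi, t1, t2)

def root_to_leaves(tree, lo, hi, u):
    """Fill u at the interface nodes, given the values u[lo] and u[hi]."""
    if tree is None:
        return
    mid, (phi_f, phi_g), t1, t2 = tree
    u[mid] = phi_f + phi_g @ np.array([u[lo], u[hi]])
    root_to_leaves(t1, lo, mid, u)
    root_to_leaves(t2, mid, hi, u)

def hdd_solve(a, f, g_left, g_right):
    n = len(a)
    _, tree = leaves_to_root(0, n, a, f, 1.0 / n)
    u = np.zeros(n + 1)
    u[0], u[n] = g_left, g_right
    root_to_leaves(tree, 0, n, u)
    return u

def direct_solve(a, f, g_left, g_right):
    """Reference: assemble the global system and solve for the interior nodes."""
    n = len(a)
    K = np.zeros((n + 1, n + 1))
    F = np.zeros(n + 1)
    for e in range(n):
        S, c = leaf(a[e], f[e], 1.0 / n)
        K[e:e + 2, e:e + 2] += S
        F[e:e + 2] += c
    u = np.zeros(n + 1)
    u[0], u[n] = g_left, g_right
    rhs = F[1:n] - K[1:n, [0, n]] @ np.array([g_left, g_right])
    u[1:n] = np.linalg.solve(K[1:n, 1:n], rhs)
    return u

n = 16
x_mid = (np.arange(n) + 0.5) / n
a = 1.0 + 0.5 * np.sin(50.0 * x_mid)             # oscillatory coefficient per element
f = np.ones(n)
u_hdd = hdd_solve(a, f, 0.0, 1.0)
u_ref = direct_solve(a, f, 0.0, 1.0)
print(np.max(np.abs(u_hdd - u_ref)))
```

Without truncation the elimination is exact, so the two solutions agree up to rounding; in HDD proper the approximation error enters through the H-matrix arithmetic.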

Rank-k matrices

1. $R \in \mathbb{R}^{n \times m}$, $R = AB^T$, where $A \in \mathbb{R}^{n \times k}$, $B \in \mathbb{R}^{m \times k}$, $k \ll \min(n, m)$.

The storage for A and B is k(n + m) instead of n · m.
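As a small illustration of the storage count, the sketch below compresses one matrix block with a truncated SVD. The kernel 1/|x - y| on two well-separated point sets and the sizes n, m, k are assumptions chosen for the example.

```python
import numpy as np

def low_rank_factors(R, k):
    """Return A (n x k) and B (m x k) such that A @ B.T is the best rank-k approximation of R."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    A = U[:, :k] * s[:k]
    B = Vt[:k, :].T
    return A, B

n, m, k = 200, 150, 5
x = np.linspace(0.0, 1.0, n)
y = np.linspace(2.0, 3.0, m)                  # well separated from x
R = 1.0 / np.abs(x[:, None] - y[None, :])     # assumed smooth kernel block

A, B = low_rank_factors(R, k)
rel_err = np.linalg.norm(R - A @ B.T) / np.linalg.norm(R)
storage_full = n * m
storage_low_rank = k * (n + m)
print(rel_err, storage_full, storage_low_rank)
```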

[Figure: R (n × m) = A (n × k) * B^T (k × m).]

H-matrices (Hackbusch '98)

2. Grid → cluster tree ($T_I$) → block cluster tree ($T_{I \times J}$) + admissibility condition → admissible partitioning → H-matrix → H-matrix arithmetics.

[Figure: example of an H-matrix block partitioning with a number in each block.]

3. Let $I := I(\Omega)$, $t, s \in T_I$, $(t \times s) \in T_{I \times I}$.

Admissibility: $\max\{\mathrm{diam}(t), \mathrm{diam}(s)\} \le \eta \cdot \mathrm{dist}(t, s)$.

if (adm = true) then $M|_{t \times s}$ is a rank-k matrix block,

if (adm = false) then divide $M|_{t \times s}$ further or define it as a dense matrix block.
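The recursion behind the admissible partitioning can be sketched for points on a line. This is a hypothetical toy version: clusters are sorted coordinate arrays split in the middle, and the parameters eta and n_min are chosen for the example.

```python
import numpy as np

def diam(c):
    return c.max() - c.min()

def dist(t, s):
    """Distance between the intervals spanned by two 1D clusters."""
    return max(0.0, max(t.min(), s.min()) - min(t.max(), s.max()))

def admissible(t, s, eta=1.0):
    d = dist(t, s)
    return d > 0.0 and max(diam(t), diam(s)) <= eta * d

def partition(t, s, eta=1.0, n_min=4):
    """Return the leaves (t, s, kind) of the block cluster tree."""
    if admissible(t, s, eta):
        return [(t, s, "low-rank")]
    if len(t) <= n_min or len(s) <= n_min:
        return [(t, s, "dense")]
    t1, t2 = np.array_split(t, 2)
    s1, s2 = np.array_split(s, 2)
    return [blk for a in (t1, t2) for b in (s1, s2)
            for blk in partition(a, b, eta, n_min)]

x = np.linspace(0.0, 1.0, 64)
blocks = partition(x, x)
n_low_rank = sum(kind == "low-rank" for _, _, kind in blocks)
n_dense = sum(kind == "dense" for _, _, kind in blocks)
print(n_low_rank, n_dense)
```

The dense blocks cluster around the diagonal, where t and s touch or overlap, while blocks away from the diagonal become admissible after a few subdivisions.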

[Figure: clusters t and s at distance dist(t, s); the cluster tree I → I_1, I_2 → I_11, I_12, I_21, I_22 and the induced block partitioning of the H-matrix.]

Definition 0.1. $\mathcal{H}(T_{I \times J}, k) := \{ M \in \mathbb{R}^{I \times J} \mid \mathrm{rank}(M|_{t \times s}) \le k \text{ for all admissible leaves } t \times s \text{ of } T_{I \times J} \}$.

$n := \max(|I|, |J|, |K|)$.

Operation        Sequential complexity      Parallel complexity (R. Kriemann 2005)
building(M)      N = O(n log n)             N/p + O(|V(T) \ L(T)|)
storage(M)       N = O(k n log n)           N
Mx               N = O(k n log n)           N/p
αM' ⊕ βM''       N = O(k^2 n log n)         N/p
αM'M'' ⊕ βM      N = O(k^2 n log^2 n)       N/p + O(C_sp(T) |V(T)|)
M^-1             N = O(k^2 n log^2 n)       N/p + O(n n_min^2)
LU               N = O(k^2 n log^2 n)       N
H-LU             N = O(k^2 n log^2 n)       N/p + O(k^2 n log^2 n / n^(1/d))

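Application of H-matrices to HDD

Let $\omega = \omega_1 \cup \omega_2$, $\gamma_\omega = \partial\omega_1 \setminus \partial\omega$.

Suppose $\Psi^g_{\omega_1}, \Psi^g_{\omega_2} \to \Psi^g_\omega =: A$ and $\Psi^f_{\omega_1}, \Psi^f_{\omega_2} \to \Psi^f_\omega =: F$.

$$ \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} F_1 \\ F_2 \end{pmatrix} b. $$

Eliminate internal nodal points:

$$ \begin{pmatrix} A_{11} - A_{12} A_{22}^{-1} A_{21} & 0 \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} F_1 - A_{12} A_{22}^{-1} F_2 \\ F_2 \end{pmatrix} b. $$

$$ \Psi^g_\omega x_1 := (A_{11} - A_{12} A_{22}^{-1} A_{21})\, x_1 = (F_1 - A_{12} A_{22}^{-1} F_2)\, b = \Psi^f_\omega b, $$

$$ x_2 = A_{22}^{-1} F_2\, b - A_{22}^{-1} A_{21}\, x_1 =: \Phi^f_\omega b + \Phi^g_\omega x_1. $$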
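The block elimination can be verified with dense matrices. In the sketch below the block sizes and the random symmetric positive definite test matrix are assumptions made for illustration; in HDD these blocks are H-matrices, and x_1 is prescribed boundary data rather than the result of a solve. Here x_1 is solved for only to compare against a direct solve of the full system.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, nb = 4, 6, 5                  # sizes of x1, x2 and b (assumed)
G = rng.standard_normal((n1 + n2, n1 + n2))
A = G @ G.T + (n1 + n2) * np.eye(n1 + n2)   # SPD, so A22 is invertible
F = rng.standard_normal((n1 + n2, nb))
b = rng.standard_normal(nb)

A11, A12 = A[:n1, :n1], A[:n1, n1:]
A21, A22 = A[n1:, :n1], A[n1:, n1:]
F1, F2 = F[:n1], F[n1:]

A22_inv_A21 = np.linalg.solve(A22, A21)
A22_inv_F2 = np.linalg.solve(A22, F2)

Psi_g = A11 - A12 @ A22_inv_A21       # Schur complement acting on x1
Psi_f = F1 - A12 @ A22_inv_F2
Phi_f = A22_inv_F2                    # x2 = Phi_f b + Phi_g x1
Phi_g = -A22_inv_A21

x1 = np.linalg.solve(Psi_g, Psi_f @ b)
x2 = Phi_f @ b + Phi_g @ x1

x_ref = np.linalg.solve(A, F @ b)
print(np.max(np.abs(np.concatenate([x1, x2]) - x_ref)))
```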

[Figure: block structure of the matrix Ψ^g with the weak admissibility condition.]

[Figure: block structure of the matrix Ψ^f with the standard admissibility condition.]

Building $(\Psi^g_\omega)_H \in \mathbb{R}^{512 \times 512}$ from $(\Psi^g_{\omega_1})_H$ and $(\Psi^g_{\omega_2})_H \in \mathbb{R}^{384 \times 384}$.

[Figure: block structures of $(\Psi^g_{\omega_1})_H$ and $(\Psi^g_{\omega_2})_H$, of their restrictions $(\Psi^g_{\omega_1})_H|_{I \times I}$ and $(\Psi^g_{\omega_2})_H|_{I \times I}$, and of the result $(\Psi^g_\omega)_H \in \mathcal{H}(T_{I \times I}, k)$.]

Complexity and storage

Mapping   Storage                                  Complexity
Ψ^g       -                                        O(k^3 √(n_h n_H) log √(n_h n_H))
Ψ^f       -                                        O(k^3 √(n_h n_H) log^2 √(n_h n_H))
Φ^g       O(k √(n_h n_H))                          O(k^2 √(n_h n_H))
Φ^f       O(k^2 √(n_h n_H) log^2 √(n_h n_H))       O(k^3 √(n_h n_H) log √(n_h n_H))

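Prolongation of the right-hand side on the fine grid

$h \ll H$, $f_H \in V_H \subset V_h$ is given ⇒ build $f_h$.

The mappings $\Psi^f$, $\Phi^f$ can be compressed.

[Figure: matrix diagram relating $\Phi^{f_h}_\omega$, $\Phi^{f_H}_\omega$ and the prolongation $P^\omega_{h \leftarrow H}$.]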

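Truncation of the small scales:

$$ S(\Phi_\omega) = S(\Phi^g_\omega) + S(\Phi^f_\omega) = O\big(k^2 \sqrt{n_h n_H}\, \log \sqrt{n_h n_H}\big). $$

[Figure: the tree $T_{\mathcal{T}_h}$ over Ω, split at the scale H into the parts $T^{\ge H}_{\mathcal{T}_h}$ and $T^{< H}_{\mathcal{T}_h}$.]

Fig. 3 – Domain decomposition tree $T_{\mathcal{T}_h}$ and its parts.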

Numerical results

[Figure: (left) skin problem, (right) model of a cell, with a lipid layer and the coefficients α and β.]

[Khoromskij, Wittum 02]

α        ‖u_cg − u‖_2 / ‖u_cg‖_2    ‖u_cg − u‖_∞     ‖u_cg − u‖_A
1.0      6.6 * 10^-9                7.1 * 10^-10     2.3 * 10^-7
10^-1    2.0 * 10^-8                1.4 * 10^-8      2.0 * 10^-6
10^-2    6.6 * 10^-8                2.6 * 10^-7      1.7 * 10^-5
10^-3    7.4 * 10^-7                1.8 * 10^-5      4.2 * 10^-4
10^-4    4.2 * 10^-6                1.8 * 10^-3      1.4 * 10^-2
10^-5    7.0 * 10^-5                2.3 * 10^-1      9.0 * 10^-1

Tab. 1 – Dependence of absolute and relative errors on α. 129^2 dofs, ε_k = 10^-8, β = 1.0, residual ‖Au − f‖ = 10^-10, ‖A‖ = 1.22 * 10^5.

ε         ‖u_cg − u‖_2 / ‖u_cg‖_2    ‖u_cg − u‖_∞     ‖u_cg − u‖_A
10^-6     4.4 * 10^-1                6.67 * 10^2      1.1 * 10^3
10^-8     7.27 * 10^-5               2.3 * 10^-1      9.0 * 10^-1
10^-10    5.1 * 10^-7                1.0 * 10^-3      3.0 * 10^-3
10^-12    3.9 * 10^-9                1.2 * 10^-5      2.9 * 10^-5
10^-14    1.2 * 10^-11               6.6 * 10^-7      1.2 * 10^-7
10^-16    1.6 * 10^-12               1.1 * 10^-8      1.7 * 10^-8

Tab. 2 – Dependence of absolute and relative errors on ε_k. 129^2 dofs, α = 10^-5, β = 1.0, residual ‖Au − f‖ = 10^-10, ‖A‖_2 = 1.22 * 10^5.

ε is responsible for the H-matrix approximation accuracy: σ_k ≤ ε σ_1.
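One common reading of this criterion is an adaptive truncation: keep the singular values larger than ε σ_1 and drop the rest, so that the first discarded singular value is at most ε σ_1. The sketch below follows that reading; the test matrix and the function name are assumptions for illustration, and the exact rule used in the computations above is not stated in the slides.

```python
import numpy as np

def truncate(M, eps):
    """Truncated SVD keeping the singular values with sigma_j > eps * sigma_1."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    k = int(np.sum(s > eps * s[0]))
    return U[:, :k] * s[:k], Vt[:k, :].T, k

x = np.linspace(0.0, 1.0, 200)
y = np.linspace(2.0, 3.0, 150)
M = 1.0 / np.abs(x[:, None] - y[None, :])     # assumed admissible kernel block

sigma_1 = np.linalg.norm(M, 2)
for eps in (1e-2, 1e-6, 1e-10):
    A, B, k = truncate(M, eps)
    err = np.linalg.norm(M - A @ B.T, 2)      # equals the first discarded singular value
    print(eps, k, err / sigma_1)
```

The rank k grows as ε decreases, which matches the trade-off between accuracy and memory visible in the tables of this section.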

dofs     Φ^g, Φ^f, h (Kb)             Φ^g, Φ^f, H = 0.5 (Kb)      Φ^g, Φ^f, H = 0.125 (Kb)
33^2     2.45 * 10^2, 4 * 10^2        9.1 * 10, 1.7 * 10^2        2 * 10^2, 2.8 * 10^2
65^2     1.1 * 10^3, 2.4 * 10^3       2.9 * 10^2, 1.2 * 10^3      7.9 * 10^2, 1.8 * 10^3
129^2    5 * 10^3, 1.4 * 10^4         6.8 * 10^2, 8 * 10^3        2.6 * 10^3, 1.2 * 10^4
256^2    2.1 * 10^4, 7.86 * 10^4      1.4 * 10^3, 4.1 * 10^4      7.4 * 10^3, 6.9 * 10^4

Tab. 3 – Dependence of memory requirements for Φ^g and Φ^f on the number of dofs and the size of the interface, f = 4, n_min = 12, u = x^2 + y^2 and k = 7.

Storage

ε        LL^T (Mb)    HDD (Mb)    (A^-1)_H (Mb)
10^-3    13.3         19.7        51.0
10^-4    14.7         20.1        64.0
10^-5    16.0         20.4        75.2
10^-6    17.2         20.6        87.4

Tab. 4 – Dependence of memory requirements on ε, 129^2 dofs.

dofs     HDD     pre, LL^T, cg           Inv(A)    pre, LL^T
33^2     0.19    0.1 = 0.03+0.02+0.04    0.24      0.11 = 0.03+0.08
65^2     0.96    0.6 = 0.2+0.1+0.26      3.54      0.5 = 0.2+0.3
129^2    5.6     5 = 2.6+0.6+1.8         65.8      4.7 = 2.7+2.0
257^2    36.1    53 = 38.0+3.4+11.4      -         50 = 38.2+11.7
512^2    218     -                       -         -

Tab. 5 – Comparison of times for the skin problem with α(x, y) = 10^-5, ε = ε_cg = 10^-8.

Oscillatory coefficients

global k    ‖u_40 − u_k‖_2 / ‖u_40‖_2    ‖u_40 − u_k‖_∞
2           7                            7 * 10^-2
4           2 * 10^-2                    1.8 * 10^-3
6           5.4 * 10^-4                  4.5 * 10^-5
8           6.6 * 10^-5                  6.3 * 10^-6
10          7.6 * 10^-6                  9 * 10^-7

Tab. 6 – α(x, y) = 1 + 0.5 sin(50x) sin(50y).

ω      ‖u_40 − u_k‖_2 / ‖u_40‖_2    ‖u_40 − u_k‖_∞
10     1.65 * 10^-4                 1.76 * 10^-5
50     1.8 * 10^-4                  1.9 * 10^-5
450    7.7 * 10^-4                  10^-4

Tab. 7 – 257^2 dofs, f = 1, α(x, y) = 1 + 0.5 sin(ωx) sin(ωy).

[Figure: the domain Ω with coefficient regions α and β; axis ticks at 0.1, 0.2, 0.8 and 0.9 in both directions.]

Fig. 4 – Domain Ω = (0, 1)^2 with jumping coefficients α and β.

ε         ‖Au − f‖_2      cg ; HDD (sec)    ‖u_cg − u‖_2 / ‖u‖_2    ‖u_cg − u‖_∞
10^-4     2 * 10^-4       5.3 ; 8.9         6.7 * 10^-1             1.4
10^-6     4.8 * 10^-7     5.0 ; 10.1        1.8 * 10^-4             9.5 * 10^-4
10^-8     1.4 * 10^-8     5.7 ; 11.5        1.1 * 10^-6             1.48 * 10^-5
10^-10    1.45 * 10^-8    6.7 ; 12.3        5.3 * 10^-7             10^-5
10^-12    1.2 * 10^-8     7.4 ; 13.5        5.2 * 10^-7             10^-5

Tab. 8 – Dependence of the relative and absolute errors on ε; u is the HDD solution for the given ε, α = 10, β = 0.01, 129^2 dofs.

Properties of HDD:

1. HDD computes $u_h := B_h f_h + C_h g_h$ or $u_h := B_H f_H + C_h g_h$.

2. $B_h$, $B_H$ and $C_h$ have H-matrix format.

3. The complexities are $O(k^2 n_h \log^3 n_h)$ and $O(k^2 \sqrt{n_h n_H}\, \log^3 \sqrt{n_h n_H})$.

4. The storage requirements are $O(k n_h \log^2 n_h)$ and $O(k \sqrt{n_h n_H}\, \log^2 \sqrt{n_h n_H})$.

5. HDD computes functionals of the solution:

(a) Neumann data $\partial u_h / \partial n$ at the boundary,

(b) mean values $\int_\omega u_h\, dx$, $\omega \subset \Omega$, the solution at a point, the solution in a small subdomain ω,

(c) flux $\int_C \nabla u \cdot \vec{n}\, dx$, where C is a curve in Ω.

6. HDD for multiple right-hand sides and multiple Dirichlet data.

7. HDD can easily be parallelized.

8. Problems with repeated patterns.


Thanks for your attention !
