
Ae/AM/CE/ME 214a – Computational Solid Mechanics, Fall 2016, March 7, 2017, Prof. D. M. Kochmann, Caltech

Computational Solid Mechanics – Part I

(Ae/AM/CE/ME 214a)

Dennis M. Kochmann

Division of Engineering and Applied Science
California Institute of Technology

[Figure: schematic of a finite element code. Mesh: nodes = {1, (0,0)}, {2, (0.5,1.2)}, ...; connectivity = {1,2,13,12}, ..., {18,19,32,30}, ...; SpatialDimension: 2D; DegreesOfFreedom: 2 (u_x, u_y). Element Ω_e with local nodes 1,2,3,4 mapped to global nodes 18,19,32,30, element unknowns U_e and internal forces F_int,e. Quadrature rule with points ξ_k and weights W_k. Material model: P = P(∇u), C = C(∇u). Assembler: F_int,e → F_int. Solver: F_int(U^h) − F_ext = 0, with essential BCs (e.g., u_x = 0).]

Copyright © 2016 by Dennis M. Kochmann



These lecture notes are a concise collection of equations and comments. They are by no means a complete textbook or set of class notes that could replace lectures.

You are strongly encouraged to take your own notes during lectures and to use this set of notes as a look-up reference.



1 Introduction, Vector Spaces

We define a set Ω as a collection of points X ∈ Ω.

We further say Ω is a (proper) subset of all space if Ω ⊆ ℝ^d in d dimensions (proper if Ω ⊂ ℝ^d).

We usually take Ω to be an open set, i.e., Ω ∩ ∂Ω = ∅ with boundary ∂Ω.

Deformation and motion are described by a mapping

ϕ ∶ X ∈ Ω → ϕ(X) ∈ ℝ^d or ϕ ∶ Ω → ℝ^d, (1.1)

where Ω is the domain and ℝ^d the range of ϕ. The mapped (current) configuration of Ω is ϕ(Ω).

Every function f(x) ∶ R→ R is a mapping from R to R.

We call a mapping injective (or one-to-one) if for each x ∈ ϕ(Ω) there is one unique X ∈ Ω such that x = ϕ(X). In other words, no two points X ∈ Ω are mapped onto the same position x. A mapping ϕ ∶ Ω → V is surjective (or onto) if the entire set Ω is mapped onto the entire set V; i.e., for every x ∈ V there exists at least one X ∈ Ω such that x = ϕ(X). If a mapping is both injective and surjective (or one-to-one and onto), we say it is bijective. A bijective mapping is also called an isomorphism.

For example, ϕ ∶ Ω→ Rd is injective, whereas ϕ ∶ Ω→ ϕ(Ω) would be bijective.

For time-dependent problems, we have ϕ ∶ Ω × ℝ → ℝ^d and x = ϕ(X, t). This describes a family of configurations ϕ(Ω, t), from which we arbitrarily define a reference configuration Ω for which ϕ = id (the identity mapping).

A linear/vector space {Ω, +; ℝ, ⋅} is defined by the following identities. For any u, v, w ∈ Ω and α, β ∈ ℝ it holds that

(i) closure: α ⋅ u + β ⋅ v ∈ Ω
(ii) associativity w.r.t. +: (u + v) + w = u + (v + w)
(iii) null element: ∃ 0 ∈ Ω such that u + 0 = u
(iv) negative element: for all u ∈ Ω ∃ −u ∈ Ω such that u + (−u) = 0
(v) commutativity: u + v = v + u
(vi) associativity w.r.t. ⋅: (αβ) ⋅ u = α(β ⋅ u)
(vii) distributivity w.r.t. ℝ: (α + β) ⋅ u = α ⋅ u + β ⋅ u
(viii) distributivity w.r.t. Ω: α ⋅ (u + v) = α ⋅ u + α ⋅ v
(ix) identity: 1 ⋅ u = u

examples:

ℝ^d is a vector space. By contrast, Ω ⊂ ℝ^d is not a vector space since, in general, it violates, e.g., (i) closure and (iii) null element.

P_2 = {ax^2 + bx + c ∶ a, b, c ∈ ℝ} is the space of all second-order polynomial functions, which can be identified with the ordered triads (a, b, c) ∈ ℝ^3. More generally, P_k(Ω) is the space of all kth-order polynomial functions defined on Ω. P_k(Ω) is a linear space.

We call P_2 a linear subspace of P_k with k ≥ 2, and we write P_2 ⊆ P_k.



2 Continuum Mechanics

The governing equations are commonly partial differential equations. Key examples for us are:

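linear momentum balance, for finite deformations and for linearized kinematics (small strains):

finite deformations:
  $\mathrm{Div}\,P + R\,B = R\,A$
  with deformation mapping $\varphi(X,t) : \Omega \times \mathbb{R} \to \mathbb{R}^d$ and $x = \varphi(X,t)$
  and acceleration $A(X,t) = \frac{d^2}{dt^2}\varphi(X,t)$
  deformation gradient (kinematics): $F = \mathrm{Grad}\,\varphi$
  Constitutive laws determine, e.g., the first Piola-Kirchhoff stress tensor: $P = P(F)$
  Important constitutive relations: $P_{iJ} = \dfrac{\partial W}{\partial F_{iJ}}$, $C_{iJkL} = \dfrac{\partial P_{iJ}}{\partial F_{kL}}$

linearized kinematics:
  $\mathrm{div}\,\sigma + \rho\,b = \rho\,a$
  with displacement field $u(x,t) : \Omega \times \mathbb{R} \to \mathbb{R}^d$ and $x = X + u$ if $\|\nabla u\| \ll 1$
  and acceleration $a(x,t) = \ddot{u}(x,t)$
  infinitesimal strain tensor: $\varepsilon = \mathrm{sym}(\mathrm{grad}\,u)$
  Constitutive laws determine, e.g., the Cauchy stress tensor: $\sigma = \sigma(\varepsilon)$
  Important constitutive relations: $\sigma_{ij} = \dfrac{\partial W}{\partial \varepsilon_{ij}}$, $c_{ijkl} = \dfrac{\partial \sigma_{ij}}{\partial \varepsilon_{kl}}$

Goal is to find the deformation mapping ϕ(X, t) or displacement field u(x, t).

Note: quasistatics implies no inertial effects: R A ≈ 0 (common simplification at low rates).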

heat equation (conservation of energy):

\[ R\,C_V\,\dot{T} = -\mathrm{Div}\,Q + R\,S_h \quad \text{(and analogously in small strains)} \tag{2.1} \]

with a constitutive relation Q = Q(Grad T), e.g., Fourier's law: Q = −λ Grad T, so we obtain the heat equation $R\,C_V\,\dot{T} = \mathrm{Div}(\lambda\,\mathrm{Grad}\,T) + R\,S_h$. In case of constant, isotropic thermal conductivity λ = λ I, we arrive at the heat equation

\[ R\,C_V\,\dot{T} = \lambda\,\nabla^2 T + R\,S_h. \tag{2.2} \]

Goal is to find the temperature field T (X, t) ∶ Ω ×R→ R.

diffusion equation (conservation of mass):

\[ \dot{c} = \mathrm{Div}(D\,\mathrm{Grad}\,c) + S_c \tag{2.3} \]

Goal is to find the chemical concentration field c(X, t) ∶ Ω ×R→ R.

generalized conservation law:

\[ \dot{A} = -\mathrm{Div}\,J + S \quad\text{with}\quad J = J(\mathrm{Grad}\,A) \tag{2.4} \]

where A is a conserved field, J its flux, and S an internal source term.

In simple terms, a change in A is effected by internal sources and by flux.



An initial boundary value problem (IBVP) furnishes the above equations with appropriate boundary conditions (BCs) and initial conditions (ICs).

To this end, we subdivide the boundary ∂Ω of a body Ω into

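∂Ω_D ≡ Dirichlet boundary, prescribing the primary field (ϕ, T, etc.):

\[ \varphi(X,t) = \hat{x}(X,t) \ \text{on } \partial\Omega_D \quad\text{or}\quad c(X,t) = \hat{c}(X,t) \ \text{on } \partial\Omega_D. \tag{2.5} \]

∂Ω_N ≡ Neumann boundary, prescribing derivatives of the primary field (Grad ϕ, etc.):

\[ T(X,t) = P\,N(X,t) = \hat{T} \ \text{on } \partial\Omega_N \quad\text{or}\quad Q(X,t) = \hat{Q}(X,t) \ \text{on } \partial\Omega_N. \tag{2.6} \]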

and we must have

∂ΩD ∪ ∂ΩN = ∂Ω and ∂ΩD ∩ ∂ΩN = ∅. (2.7)

Initial conditions, e.g., T(X, 0) = T_0(X), or ϕ(X, 0) = x_0(X) and V(X, 0) = V_0(X) for all X ∈ Ω.

Number of required BCs/ICs depends on the order of a PDE, e.g.,

\[ \dot{c} = \mathrm{Div}(D\,\mathrm{Grad}\,c) + S_c \tag{2.8} \]

is first-order in time and hence requires one IC: c(X, 0) = c_0(X); it is second-order in space and hence requires BCs along all of ∂Ω (e.g., two conditions per x and y coordinates).

Classification of ODEs/PDEs:

all of the above ODEs/PDEs are linear in the primary fields and of (at most) second order:

\[ a\,u_{,xx} + b\,u_{,xy} + c\,u_{,yy} + d\,u_{,x} + e\,u_{,y} + f = 0. \tag{2.9} \]

The equation is

elliptic if b^2 − 4ac < 0, e.g., λ∇^2 T + R S_h = 0 (static heat equation), but also div σ + ρ b = 0 (quasistatic linear momentum balance).

Elliptic ODEs/PDEs generally produce smooth solutions.

hyperbolic if b^2 − 4ac > 0, e.g., k u_{,xx} = u_{,tt} (wave equation).

Hyperbolic ODEs/PDEs generally preserve existing discontinuities.

parabolic if b^2 − 4ac = 0, e.g., λ T_{,xx} = T_{,t} (heat/diffusion equation).

Parabolic ODEs/PDEs generally smooth out initial discontinuities.
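The classification by the discriminant b^2 − 4ac can be sketched in a few lines of Python. This is only an illustration; the function name `classify_pde` and the sample coefficients are assumptions made here, with the second independent variable playing the role of y (or of time t in the wave and heat examples).

```python
def classify_pde(a: float, b: float, c: float) -> str:
    """Classify a*u_xx + b*u_xy + c*u_yy + (lower-order terms) = 0 via b^2 - 4ac."""
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return "elliptic"
    if disc > 0.0:
        return "hyperbolic"
    return "parabolic"


# static heat equation lam*(T_xx + T_yy) + ... = 0: a = c = lam, b = 0
print(classify_pde(1.0, 0.0, 1.0))
# wave equation k*u_xx - u_tt = 0 (t takes the place of y): a = k, c = -1
print(classify_pde(2.0, 0.0, -1.0))
# heat/diffusion equation lam*T_xx - T_t = 0: only a is nonzero
print(classify_pde(1.0, 0.0, 0.0))
```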



3 Numerical Methods

To numerically solve such ODEs/PDEs, we generally have so-called direct and indirect methods.

Direct methods aim to solve the governing equations directly; for example, using finite differences (FD):

1D diffusion equation: $\dot{c} = D\,c_{,xx} + s_c$

Introduce a regular (∆x, ∆t)-grid with $c_i^\alpha = c(x_i, t^\alpha)$ and use Taylor expansions, e.g., in space:

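\[ c(x_{i+1}, t^\alpha) = c_{i+1}^\alpha = c_i^\alpha + \Delta x \left.\frac{\partial c}{\partial x}\right|_{x_i,t^\alpha} + \frac{(\Delta x)^2}{2} \left.\frac{\partial^2 c}{\partial x^2}\right|_{x_i,t^\alpha} + \frac{(\Delta x)^3}{3!} \left.\frac{\partial^3 c}{\partial x^3}\right|_{x_i,t^\alpha} + O(\Delta x^4) \tag{3.1} \]

\[ c(x_{i-1}, t^\alpha) = c_{i-1}^\alpha = c_i^\alpha - \Delta x \left.\frac{\partial c}{\partial x}\right|_{x_i,t^\alpha} + \frac{(\Delta x)^2}{2} \left.\frac{\partial^2 c}{\partial x^2}\right|_{x_i,t^\alpha} - \frac{(\Delta x)^3}{3!} \left.\frac{\partial^3 c}{\partial x^3}\right|_{x_i,t^\alpha} + O(\Delta x^4) \tag{3.2} \]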

Addition of the two equations gives:

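\[ c_{i+1}^\alpha + c_{i-1}^\alpha = 2c_i^\alpha + (\Delta x)^2 \left.\frac{\partial^2 c}{\partial x^2}\right|_{x_i,t^\alpha} + O(\Delta x^4) \quad\Rightarrow\quad \frac{\partial^2 c}{\partial x^2}(x_i, t^\alpha) = \frac{c_{i+1}^\alpha - 2c_i^\alpha + c_{i-1}^\alpha}{(\Delta x)^2} + O(\Delta x^2) \tag{3.3} \]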

This is the second-order central difference approximation.
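The second-order accuracy of the central difference can be checked numerically. The following is a minimal sketch, assuming the test function sin(x) and the evaluation point x = 0.7 (both chosen here purely for illustration); it estimates the convergence order from the errors at ∆x and ∆x/2.

```python
import math


def second_derivative_central(f, x, dx):
    """Central difference approximation of f''(x): (f(x+dx) - 2 f(x) + f(x-dx)) / dx^2."""
    return (f(x + dx) - 2.0 * f(x) + f(x - dx)) / (dx * dx)


def observed_order(f, d2f_exact, x, dx):
    """Estimate the convergence order from the errors at dx and dx/2."""
    e1 = abs(second_derivative_central(f, x, dx) - d2f_exact)
    e2 = abs(second_derivative_central(f, x, dx / 2.0) - d2f_exact)
    return math.log(e1 / e2, 2.0)


x0 = 0.7
order = observed_order(math.sin, -math.sin(x0), x0, 0.1)
print(f"observed order: {order:.3f}")  # expected to be close to 2
```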

Analogously, Taylor expansion in time and subtraction of the two equations gives:

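\[ c_i^{\alpha+1} - c_i^{\alpha-1} = 2\Delta t \left.\frac{\partial c}{\partial t}\right|_{x_i,t^\alpha} + O(\Delta t^3) \quad\Rightarrow\quad \frac{\partial c}{\partial t}(x_i, t^\alpha) = \frac{c_i^{\alpha+1} - c_i^{\alpha-1}}{2\Delta t} + O(\Delta t^2) \tag{3.4} \]

which is the central difference scheme for the first derivative (second-order accurate in ∆t). A simpler stencil is obtained from the forward Taylor expansion in time alone (the analog of (3.1)):

\[ c_i^{\alpha+1} - c_i^\alpha = \Delta t \left.\frac{\partial c}{\partial t}\right|_{x_i,t^\alpha} + O(\Delta t^2) \quad\Rightarrow\quad \frac{\partial c}{\partial t}(x_i, t^\alpha) = \frac{c_i^{\alpha+1} - c_i^\alpha}{\Delta t} + O(\Delta t) \tag{3.5} \]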

Altogether, the discretized governing equation becomes, e.g., the explicit FD scheme

\[ \frac{c_i^{\alpha+1} - c_i^\alpha}{\Delta t} = D\,\frac{c_{i+1}^\alpha - 2c_i^\alpha + c_{i-1}^\alpha}{(\Delta x)^2} + s_c(x_i, t^\alpha) + O(\Delta t, \Delta x^2) \tag{3.6} \]

which in the limit ∆t, ∆x → 0 is expected to converge towards the same solution as the governing equation (consistency of the discretized equation).

Numerical solution can be interpreted via stencils, which also reveal the required BCs/ICs.
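As an illustration of the explicit FD scheme, the following minimal sketch advances the 1D diffusion equation in time. Several choices are assumptions made here and not part of the notes: the function name `ftcs_diffusion`, fixed (Dirichlet) end values, the test problem c(x,0) = sin(πx) on (0,1), and the guard D∆t/∆x² ≤ 1/2, which is the standard stability limit of this explicit scheme.

```python
import math


def ftcs_diffusion(c0, D, dx, dt, n_steps, source=None):
    """Advance c_t = D c_xx + s with the explicit forward-time, central-space scheme.

    c0: list of initial nodal values; the two end values are held fixed
        (Dirichlet BCs, an assumption of this sketch).
    source: optional callable s(i, step) returning the source at node i.
    """
    r = D * dt / (dx * dx)
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dx^2 > 1/2")
    c = list(c0)
    for step in range(n_steps):
        new = c[:]  # boundary values are copied unchanged
        for i in range(1, len(c) - 1):
            s = source(i, step) if source else 0.0
            new[i] = c[i] + r * (c[i + 1] - 2.0 * c[i] + c[i - 1]) + dt * s
        c = new
    return c


# Example: c(x,0) = sin(pi x) on (0,1), c = 0 at both ends, no source;
# the exact solution is exp(-D pi^2 t) sin(pi x).
n = 51
dx = 1.0 / (n - 1)
D, dt, steps = 1.0, 1.0e-4, 500
c_init = [math.sin(math.pi * i * dx) for i in range(n)]
c_end = ftcs_diffusion(c_init, D, dx, dt, steps)
exact_mid = math.exp(-D * math.pi**2 * dt * steps)
print(c_end[n // 2], exact_mid)
```

The stencil makes the required data visible: each update needs the two neighbors at the previous time level (hence BCs at both ends) and the previous value at the same node (hence one IC).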



problems associated with direct methods (e.g.):

a regular grid is required (okay for most fluids problems but oftentimes problematic for complex solid geometries).

stability/efficiency issues (probably known from fundamental computational mechanics classes: CFL-condition, von Neumann stability analysis, etc.).

variables are only defined at grid points, hence the error is minimized only at grid points (no information about what happens between grid points; fields and errors undefined).

alternative:

By contrast, indirect methods do not solve the ODEs/PDEs directly but search for optimal approximations, e.g., c^h(x) ≈ c(x) for all x ∈ Ω.

questions to be addressed:

How do we choose c^h(x)?

What is “optimal”?

What trial functions to use?

We need a few more concepts of function spaces to address those questions. Note that in the following, we will formulate most concepts in 1D with analogous generalizations possible for higher dimensions unless specifically mentioned.



4 Function Spaces

Consider a function u(x) ∶ Ω→ R and Ω ⊂ R.

u is continuous at a point x if, given any scalar ε > 0, there is an r(ε) > 0 such that

∣u(y) − u(x)∣ < ε provided that ∣y − x∣ < r. (4.1)

A function u is continuous over Ω if it is continuous at all points x ∈ Ω.

u is of class C^k(Ω) with an integer k ≥ 0 if it is k times continuously differentiable over Ω (i.e., u possesses derivatives up to the kth order and these derivatives are continuous functions).

Examples:

Functions u(x) ∈ Pk with k ≥ 0 are generally C∞(R).

Consider a continuous, piecewise-linear function u ∶ Ω = (0,2) → ℝ. Function u is C^0(Ω) but not C^1(Ω).

The Heaviside function H(x) is said to be C^{−1}(ℝ) since its "zeroth derivative" (i.e., the function itself) is not continuous.

If there are no discontinuities such as cracks, shocks, etc. (or discontinuities in the BCs/ICs), we usually assume the solution fields are C^∞(Ω), so we may take derivatives; otherwise, derivatives exist almost everywhere (a.e.).

To evaluate the global errors of functions, we need norms.

Consider a linear space {U, +; ℝ, ⋅}. A mapping ⟨⋅, ⋅⟩ ∶ U × U → ℝ is called an inner product on U × U if for all u, v, w ∈ U and α ∈ ℝ:

(i) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩
(ii) ⟨u, v⟩ = ⟨v, u⟩
(iii) ⟨α ⋅ u, v⟩ = α ⟨u, v⟩
(iv) ⟨u, u⟩ ≥ 0 and ⟨u, u⟩ = 0 ⇔ u = 0

A linear space U endowed with an inner product is called an inner product space.

Examples:

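⟨u, v⟩ = u_i v_i = u ⋅ v defines an inner product for u, v ∈ ℝ^d.

The L2-inner product for functions u, v ∈ U with domain Ω:

\[ \langle u, v\rangle_{L^2(\Omega)} = \int_\Omega u(x)\,v(x)\,dx \quad\text{and often just}\quad \langle u, v\rangle = \langle u, v\rangle_{L^2(\Omega)}. \tag{4.2} \]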

Note that if ⟨u, v⟩ = 0 we say u and v are orthogonal.

Examples:

Legendre polynomials:

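\[ p_n(x) = \frac{1}{2^n\,n!}\,\frac{d^n}{dx^n}\left(x^2 - 1\right)^n \quad\text{so that}\quad p_0 = 1,\quad p_1 = x,\quad p_2 = \frac{1}{2}(3x^2 - 1),\ \ldots \tag{4.3} \]

orthogonality on Ω = (−1,1):

\[ \int_{-1}^{1} p_n(x)\,p_m(x)\,dx = \frac{2}{2n+1}\,\delta_{mn} \tag{4.4} \]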

trigonometric functions:

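\[ p_n(x) = \cos\left(\frac{\pi n x}{L}\right) \tag{4.5} \]

orthogonality on Ω = (−L, L):

\[ \int_{-L}^{L} p_n(x)\,p_m(x)\,dx = \begin{cases} 2L, & \text{if } m = n = 0 \\ L, & \text{if } m = n \neq 0 \\ 0, & \text{else} \end{cases} \tag{4.6} \]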
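The orthogonality of the Legendre polynomials with respect to the L2-inner product can be verified numerically. The sketch below makes two choices that are not from the notes: the polynomials are evaluated via Bonnet's recursion rather than the derivative formula, and the integral is approximated by a composite midpoint rule (function names are illustrative).

```python
def legendre(n, x):
    """Legendre polynomial p_n(x) via Bonnet's recursion
    (k+1) p_{k+1} = (2k+1) x p_k - k p_{k-1}, starting from p_0 = 1, p_1 = x."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def l2_inner(f, g, a, b, m=2000):
    """L2 inner product of f and g on (a, b) by the composite midpoint rule."""
    h = (b - a) / m
    return h * sum(f(a + (j + 0.5) * h) * g(a + (j + 0.5) * h) for j in range(m))


# Gram matrix of p_0, p_1, p_2 on (-1, 1): approximately diag(2, 2/3, 2/5)
for n in range(3):
    row = [l2_inner(lambda x: legendre(n, x), lambda x: legendre(k, x), -1.0, 1.0)
           for k in range(3)]
    print([round(v, 4) for v in row])
```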

Now we are in a position to define the distance between x_1 and x_2:

\[ d(x_1, x_2) = \sqrt{\langle x_1 - x_2,\ x_1 - x_2\rangle} \tag{4.7} \]

We need this concept not only for points in space but also to define the closeness or proximity of functions.

Consider a linear space {U, +; ℝ, ⋅}. A mapping ∥⋅∥ ∶ U → ℝ^+ is called a norm on U if for all u, v ∈ U and α ∈ ℝ:

(i) ∥u + v∥ ≤ ∥u∥ + ∥v∥ (triangle inequality)
(ii) ∥α ⋅ u∥ = ∣α∣ ∥u∥
(iii) ∥u∥ ≥ 0 and ∥u∥ = 0 ⇔ u = 0.

A linear space U endowed with a norm is called a normed linear space (NLS).

Examples of norms:

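Consider the d-dimensional Euclidean space, so x = (x_1, ..., x_d)^T. Then we define

– the 1-norm: $\|x\|_1 = \sum_{i=1}^{d} |x_i|$
– the 2-norm: $\|x\|_2 = \left(\sum_{i=1}^{d} |x_i|^2\right)^{1/2}$ (Euclidean norm)
– the n-norm: $\|x\|_n = \left(\sum_{i=1}^{d} |x_i|^n\right)^{1/n}$
– the ∞-norm: $\|x\|_\infty = \max_{1\le i\le d} |x_i|$

Now turning to functions, the Lp-norm of a function u ∶ Ω → ℝ:

\[ \|u\|_{L^p(\Omega)} = \left(\int_\Omega |u|^p\,dx\right)^{1/p} \tag{4.8} \]

The most common norm is the L2-norm:

\[ \|u\|_{L^2(\Omega)} = \langle u, u\rangle^{1/2}_{L^2(\Omega)} = \left(\int_\Omega u^2(x)\,dx\right)^{1/2}. \tag{4.9} \]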



Furthermore, notice that

\[ \|u\|_{L^\infty(\Omega)} = \operatorname*{ess\,sup}_{x\in\Omega} |u(x)|, \tag{4.10} \]

where we introduced the essential supremum

\[ \operatorname*{ess\,sup}_{x\in\Omega} |u(x)| = M \quad\text{with the smallest } M \text{ that satisfies } |u(x)| \le M \text{ for a.e. } x \in \Omega. \tag{4.11} \]

Now that we have norms, we can generalize our definition of the distance. If u_n, u ∈ U equipped with a norm ∥⋅∥ ∶ U → ℝ, then we define the distance as

\[ d(u_n, u) = \|u_n - u\|. \tag{4.12} \]

Now, we are in a position to define the convergence of a sequence of functions u_n to u in U: we say u_n → u ∈ U if for all ε > 0 there exists N(ε) such that d(u_n, u) < ε for all n > N.

Examples:

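Consider u_n ∈ U = P_2(Ω) with L2-norm and bounded Ω ⊂ ℝ:

\[ u_n(x) = \left(1 + \frac{1}{n}\right)x^2 \ \to\ u(x) = x^2 \quad\text{since}\quad d(u_n, u) = \|u_n - u\|_{L^2(\Omega)} = \frac{1}{n}\left(\int_\Omega x^4\,dx\right)^{1/2} \tag{4.13} \]

with u ∈ U = P_2(Ω). For example, for d(u_n, u) < ε we need $n > N = \left(\int_\Omega x^4\,dx\right)^{1/2}/\varepsilon$.

Power series:

\[ u(x) = \sum_{i=0}^{\infty} c_i\,x^i \quad\Rightarrow\quad u_n(x) = \sum_{i=0}^{n} c_i\,x^i \quad\text{such that } u_n \to u \text{ as } n \to \infty. \tag{4.14} \]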
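Convergence in the L2-norm can be observed numerically for the polynomial sequence u_n(x) = (1 + 1/n) x². The sketch below assumes Ω = (0,1) (so that the L2 distance to u(x) = x² evaluates to 1/(n√5)) and uses a composite midpoint rule; both are choices made here for illustration.

```python
import math


def l2_distance(u, v, a, b, m=4000):
    """d(u, v) = ||u - v||_{L2(a,b)} using the composite midpoint rule."""
    h = (b - a) / m
    total = sum((u(a + (j + 0.5) * h) - v(a + (j + 0.5) * h)) ** 2 for j in range(m))
    return math.sqrt(h * total)


def u_limit(x):
    return x * x


def u_n(n):
    """Return the n-th member of the sequence, u_n(x) = (1 + 1/n) x^2."""
    return lambda x: (1.0 + 1.0 / n) * x * x


for n in (1, 10, 100, 1000):
    d = l2_distance(u_n(n), u_limit, 0.0, 1.0)
    # numerical distance vs. the closed form (1/n) * (int_0^1 x^4 dx)^(1/2)
    print(n, d, 1.0 / (n * math.sqrt(5.0)))
```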

Given a point u in a normed linear space U, a neighborhood N_r(u) of radius r > 0 is defined as the set of points v ∈ U for which d(u, v) < r. Now, we can define sets properly:

A subset V ⊂ U is called open if, for each point u ∈ V, there exists a neighborhood N_r(u) which is fully contained in V. The complement of an open set is, by definition, a closed set. The closure V̄ of an open set V is the smallest closed set that contains V. In simple terms, a closed set is defined as a set which contains all its limit points. Therefore, note that (for continuous u and bounded Ω)

\[ \sup_{x\in\Omega} |u(x)| = \max_{x\in\overline{\Omega}} |u(x)|. \tag{4.15} \]

For example, (0,1) is an open set in ℝ. [0,1] is a closed set, and [0,1] is the closure of (0,1).

A linear space U is a complete space if every Cauchy sequence u_n in U converges to some u ∈ U. In simple terms, the space must contain all limit points.

A complete normed linear space is called a Banach space; i.e., {U, +; ℝ, ⋅} with a norm ∥⋅∥ and u_n → u ∈ U. A complete inner product space is called a Hilbert space.

Note that ∥⋅∥ = ⟨⋅, ⋅⟩^{1/2} defines a norm. Hence, every Hilbert space is also a Banach space (but not the other way around).



As an example, consider U = P_n (the space of all polynomial functions of order n ∈ ℕ). This is a linear space which we equip with a norm, e.g., the L2-norm. It is complete since (a_n, b_n, c_n, ...) → (a, b, c, ...) for a, b, c, ... ∈ ℝ. And an inner product is defined via ⟨u, v⟩ = ∫_Ω u v dx. With all these definitions, U is a Hilbert space.

We can use these norms to define function spaces, e.g., the L2-space of functions:

\[ L^2(\Omega) = \left\{ u : \Omega \to \mathbb{R} \ :\ \int_\Omega u^2\,dx < \infty \right\} \tag{4.16} \]

We say L2(Ω) contains all functions that are square-integrable on Ω.

Examples:

u ∶ Ω → ℝ with u ∈ P_k(Ω) and ess sup_{x∈Ω} ∣u(x)∣ < ∞. Then, u ∈ L2(Ω).

f ∶ ℝ → ℝ with f(x) = x^{−2} is not in L2(Ω) if 0 ∈ Ω.

Piecewise constant functions u (with ess sup_{x∈Ω} ∣u(x)∣ < ∞) are square-integrable and thus in L2.

Note that we can write alternatively

\[ L^2(\Omega) = \left\{ u : \Omega \to \mathbb{R} \ :\ \|u\|_{L^2(\Omega)} < \infty \right\}. \tag{4.17} \]



5 Approximation Theory

Motivation: in computational mechanics, we seek approximate solutions $u_h(x) = \sum_{a=1}^{N} u^a N^a(x)$, i.e., a linear combination of basis functions N^a(x) with amplitudes u^a ∈ ℝ.

Questions: How does u_h(x) converge to u(x), if at all? Can we find an error estimate ∥u_h − u∥? What is the rate of convergence (how fast does it converge, cf. the truncation error arguments for grid-based direct methods)?

Fundamental tools for estimating errors are the Poincaré inequalities:

(i) Dirichlet-Poincaré inequality:

\[ \int_0^h |v(x)|^2\,dx \le c_h \int_0^h [v'(x)]^2\,dx \quad\text{if } v(0) = v(h) = 0, \tag{5.1} \]

with a constant c_h > 0 that depends on the interval size h.

(ii) Neumann-Poincaré (or Poincaré-Wirtinger) inequality:

\[ \int_0^h |v(x) - \bar{v}|^2\,dx \le c_h \int_0^h [v'(x)]^2\,dx \quad\text{with}\quad \bar{v} = \frac{1}{h}\int_0^h v(x)\,dx \tag{5.2} \]

In 1D an optimal constant can be found: c_h = h^2/π^2.

(iii) extension:

\[ \int_0^h |v(x)|^2\,dx \le \frac{h^2}{\pi^2}\left[\int_0^h [v'(x)]^2\,dx + |v(x_0)|^2\right] \quad\text{with } x_0 \in [0, h]. \tag{5.3} \]

Now, let us use those inequalities to find error bounds. Suppose a general function u(x) is approximated by a piecewise linear approximation u_h(x). Let's first find a local error estimate.

Consider v(x) = u_h′(x) − u′(x) and note that by Rolle's theorem

\[ u_h'(x_0) - u'(x_0) = 0 \quad\text{for some } x_0 \in (0, h). \tag{5.4} \]

Next, use inequality (iii):

\[ \int_0^h |u_h'(x) - u'(x)|^2\,dx \le \frac{h^2}{\pi^2}\int_0^h |u_h''(x) - u''(x)|^2\,dx, \tag{5.5} \]

but since u_h(x) is piecewise linear, we have u_h′′(x) = 0, so that we arrive at the local error estimate

\[ \int_0^h |u_h'(x) - u'(x)|^2\,dx \le \frac{h^2}{\pi^2}\int_0^h |u''(x)|^2\,dx. \tag{5.6} \]

Now, let's seek a global error estimate by using

\[ \int_a^b (\cdot)\,dx = \sum_{i=0}^{n} \int_{x_i}^{x_{i+1}} (\cdot)\,dx \quad\text{with}\quad x_0 = a,\quad x_{n+1} = b,\quad x_{i+1} = x_i + h \tag{5.7} \]

so that

\[ \int_a^b |u_h'(x) - u'(x)|^2\,dx \le \frac{h^2}{\pi^2}\int_a^b |u''(x)|^2\,dx \tag{5.8} \]

Taking square roots, we see that for Ω = (a, b)

\[ \|u_h' - u'\|_{L^2(\Omega)} \le \frac{h}{\pi}\,\|u''\|_{L^2(\Omega)} \tag{5.9} \]

and hence that ∥u_h′ − u′∥_{L2(Ω)} → 0 as h → 0 linearly in h.

We want to write this a bit more concisely. Let us define the Sobolev semi-norm:

\[ |u|_{H^k(\Omega)} = \left[\int_\Omega |D^k u|^2\,dx\right]^{1/2} \quad\text{or for short}\quad |u|_k = |u|_{H^k(\Omega)} \tag{5.10} \]

where in 1D D^k u = u^{(k)}. A semi-norm in general must satisfy the following conditions:

(i) ∥u + v∥ ≤ ∥u∥ + ∥v∥ (like for a norm)

(ii) ∥α ⋅ u∥ = ∣α∣ ∥u∥ (like for a norm)

(iii) ∥u∥ ≥ 0 (a norm also requires ∥u∥ = 0 iff u = 0, not so for a semi-norm).

Examples in 1D:

from before:

\[ |u|_{H^1(a,b)} = \left[\int_a^b |u'(x)|^2\,dx\right]^{1/2} \tag{5.11} \]

analogously:

\[ |u|_{H^2(a,b)} = \left[\int_a^b |u''(x)|^2\,dx\right]^{1/2} \tag{5.12} \]

so that we can write (5.8) as

\[ |u_h - u|^2_{H^1(a,b)} \le \frac{h^2}{\pi^2}\,|u|^2_{H^2(a,b)} \quad\Rightarrow\quad |u_h - u|_{H^1(a,b)} \le \frac{h}{\pi}\,|u|_{H^2(a,b)} \tag{5.13} \]

Hence, the topology of convergence is bounded by the regularity of u. Convergence with h-refinement is linear.
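The linear convergence of the H1 semi-norm error under h-refinement, and the bound (h/π)|u|_{H2}, can be checked numerically. The following is a minimal sketch under assumptions made here: u(x) = sin(x) on (0, π), for which |u|²_{H2} = π/2, a nodal piecewise-linear interpolant on equal elements, and midpoint quadrature; the function name is illustrative.

```python
import math


def h1_seminorm_interp_error(u, du, a, b, n_el, m=200):
    """|u_h - u|_{H1(a,b)} for the piecewise-linear nodal interpolant u_h of u
    on n_el equal elements; midpoint quadrature with m points per element."""
    h = (b - a) / n_el
    total = 0.0
    for e in range(n_el):
        x0 = a + e * h
        slope = (u(x0 + h) - u(x0)) / h  # u_h' is constant on each element
        dq = h / m
        total += dq * sum((slope - du(x0 + (q + 0.5) * dq)) ** 2 for q in range(m))
    return math.sqrt(total)


u_h2 = math.sqrt(math.pi / 2.0)  # |sin|_{H2(0,pi)} = (int_0^pi sin^2 dx)^(1/2)
for n_el in (4, 8, 16, 32):
    h = math.pi / n_el
    err = h1_seminorm_interp_error(math.sin, math.cos, 0.0, math.pi, n_el)
    print(n_el, err, (h / math.pi) * u_h2)  # error vs. bound (h/pi) |u|_{H2}
```

Halving h roughly halves the error, and the computed error stays below the bound in each row.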

note the special case

\[ |u|^2_{H^0} = \int_\Omega u(x)^2\,dx = \|u\|^2_{L^2} \quad\text{(L2-norm)} \tag{5.14} \]



We can extend this to higher-order interpolation. For example, use a piecewise quadratic interpolation u_h. From Poincaré:

\[ \int_0^h |u_h' - u'|^2\,dx \le \frac{h^2}{\pi^2}\int_0^h |u_h'' - u''|^2\,dx \le \frac{h^4}{\pi^4}\int_0^h |u_h''' - u'''|^2\,dx = \frac{h^4}{\pi^4}\int_0^h |u'''|^2\,dx \tag{5.15} \]

Extension into a global error estimate with quadratic h-convergence:

\[ |u_h - u|_{H^1(a,b)} \le \frac{h^2}{\pi^2}\,|u|_{H^3(a,b)}. \tag{5.16} \]

For a general interpolation of order k:

\[ |u_h - u|_{H^1(a,b)} \le \frac{h^k}{\pi^k}\,|u|_{H^{k+1}(a,b)} \tag{5.17} \]

Why is the Sobolev semi-norm not a norm? Simply consider the example u(x) = c > 0. All higher derivatives vanish on ℝ, so that ∣u∣_{H^k(Ω)} = 0 for Ω ⊂ ℝ and k ≥ 1. However, that does not imply that u = 0 (in fact, it is not).

Let us introduce the Sobolev norm (notice the double norm bars)

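\[ \|u\|_{H^k(\Omega)} = \left(\sum_{m=0}^{k} |u|^2_{H^m(\Omega)}\right)^{1/2} \quad\text{or for short}\quad \|u\|_k = \|u\|_{H^k(\Omega)} \tag{5.18} \]

For example, in one dimension

\[ \|u\|^2_{H^1(\Omega)} = |u|_0^2 + |u|_1^2 = \int_\Omega u(x)^2\,dx + \int_\Omega [u'(x)]^2\,dx = \|u\|^2_{L^2(\Omega)} + |u|^2_{H^1(\Omega)} \tag{5.19} \]

Note that this also shows that, more generally,

\[ \|u\|^2_{H^k(\Omega)} \ge |u|^2_{H^k(\Omega)}. \tag{5.20} \]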

Let us derive a final global error estimate, one that involves proper norms, here for the example of a piecewise linear u_h. Start with Poincaré inequality (i):

\[ \|u_h - u\|^2_{L^2(a,b)} = \int_a^b |u_h - u|^2\,dx \le c_h \int_a^b |u_h' - u'|^2\,dx = c_h\,|u_h - u|^2_{H^1(a,b)} \tag{5.21} \]

and from (5.19):

\[ \|u_h - u\|^2_{H^1(\Omega)} = \|u_h - u\|^2_{L^2(\Omega)} + |u_h - u|^2_{H^1(\Omega)} \le (1 + c_h)\,|u_h - u|^2_{H^1(a,b)} \le c_h^*\,|u|^2_{H^2(a,b)} \tag{5.22} \]

which along with (5.13) gives

\[ \|u_h - u\|_{H^1(a,b)} \le c\,h\,|u|_{H^2(a,b)} \le c\,h\,\|u\|_{H^2(a,b)} \tag{5.23} \]



Summary and Extension of Norms:

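Lp-norm: $\|u\|_{L^p(\Omega)} = \left(\int_\Omega |u|^p\,dx\right)^{1/p}$

Sobolev semi-norm: $|u|_{H^k(\Omega)} = \left[\int_\Omega |D^k u|^2\,dx\right]^{1/2} = |u|_k$

Sobolev norm: $\|u\|_{H^k(\Omega)} = \left(\sum_{m=0}^{k} |u|^2_{H^m(\Omega)}\right)^{1/2} = \|u\|_k$

generalization: $|u|_{W^{k,p}(\Omega)} = \left[\int_\Omega |D^k u|^p\,dx\right]^{1/p} = |u|_{k,p}$

$\|u\|_{W^{k,p}(\Omega)} = \left(\sum_{m=0}^{k} |u|^p_{W^{m,p}(\Omega)}\right)^{1/p} = \|u\|_{k,p}$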

5.1 Sobolev spaces

The Sobolev norm is used to define a Sobolev space:

\[ H^k(\Omega) = \left\{ u : \Omega \to \mathbb{R} \ \text{such that}\ \|u\|_k < \infty \right\}, \tag{5.24} \]

which includes all functions whose derivatives up to kth order are square-integrable.

Examples:

Consider a piecewise linear function u ∈ C^0 defined on Ω = (0,2). Then u ∈ H^1(Ω) since the first derivative is piecewise constant and therefore square-integrable.

Consider the Heaviside step function H(x) ∈ C^{−1} defined on ℝ. Then, e.g., H ∈ H^0(Ω) but H ∉ H^1(Ω) with Ω = (−1,1), since the first derivative (the Dirac delta function) is not square-integrable over (−1,1).

Overall, note that the above examples imply that

\[ H^m(\Omega) \subset C^k(\Omega) \quad\text{with } m > k. \tag{5.25} \]

For example, if a function has a kth continuous derivative, then the (k + 1)th derivative is defined piecewise and therefore square-integrable.

5.2 Higher dimensions

To extend the above concepts to higher dimensions, we need multi-indices. A multi-index is an array of non-negative integers:

\[ \alpha = (\alpha_1, \ldots, \alpha_n) \in (\mathbb{Z}_0^+)^n \tag{5.26} \]

The degree of a multi-index is defined as

∣α∣ = α1 + . . . + αn. (5.27)



This can be used to define a monomial for x ∈ Rn:

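\[ x^\alpha = x_1^{\alpha_1} \cdot \ldots \cdot x_n^{\alpha_n} \tag{5.28} \]

For example, we can now extend our definition of polynomials to higher dimensions:

\[ p(x) \in P_k(\mathbb{R}^2) \quad\Rightarrow\quad p(x) = \sum_{\beta=0}^{k}\ \sum_{|\alpha|=\beta} a_\alpha\,x^\alpha \tag{5.29} \]

Specifically, the monomials above for x = (x, y) ∈ ℝ^2 are

\[ \begin{aligned} \text{for } |\alpha| = 0 &: \quad \{x^0 y^0\} = \{1\} \\ \text{for } |\alpha| = 1 &: \quad \{x^1 y^0,\ x^0 y^1\} = \{x,\ y\} \\ \text{for } |\alpha| = 2 &: \quad \{x^2 y^0,\ x^1 y^1,\ x^0 y^2\} = \{x^2,\ xy,\ y^2\} \quad \ldots \end{aligned} \tag{5.30} \]

so that

\[ p(x) = a_{(0,0)} + a_{(1,0)}\,x_1 + a_{(0,1)}\,x_2 + a_{(2,0)}\,x_1^2 + a_{(1,1)}\,x_1 x_2 + a_{(0,2)}\,x_2^2 + \ldots \tag{5.31} \]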

Note that this defines a complete polynomial of degree k.
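Multi-indices and the monomial basis of a complete polynomial are easy to enumerate programmatically. The sketch below uses `itertools.product` from the Python standard library; the function names are illustrative and not part of the notes.

```python
from itertools import product


def multi_indices(n_dim, degree):
    """All multi-indices alpha with n_dim non-negative entries and |alpha| = degree."""
    return [a for a in product(range(degree + 1), repeat=n_dim) if sum(a) == degree]


def monomial(alpha, x):
    """Evaluate x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    value = 1.0
    for xi, ai in zip(x, alpha):
        value *= xi ** ai
    return value


def complete_basis(n_dim, k):
    """Multi-indices of all monomials of a complete polynomial of degree k."""
    return [a for beta in range(k + 1) for a in multi_indices(n_dim, beta)]


print(multi_indices(2, 2))        # [(0, 2), (1, 1), (2, 0)]
print(len(complete_basis(2, 2)))  # 6 monomials: 1, x, y, x^2, xy, y^2
print(monomial((1, 1), (2.0, 3.0)))
```

In two dimensions the complete polynomial of degree k has (k+1)(k+2)/2 monomials, which the enumeration reproduces.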

Now we can use multi-indices to define partial derivatives via

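\[ D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdot \ldots \cdot \partial x_n^{\alpha_n}} \quad\text{and}\quad D^0 u = u \tag{5.32} \]

A common notation is

\[ \sum_{|\alpha|=\beta} D^\alpha u = \sum_{\substack{\alpha_1,\ldots,\alpha_n \\ \text{s.t. } |\alpha|=\beta}} \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdot \ldots \cdot \partial x_n^{\alpha_n}} \tag{5.33} \]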

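With the above derivatives, we may redefine the inner product

\[ \langle u, v\rangle_{H^m(\Omega)} = \int_\Omega \sum_{\beta=0}^{m}\ \sum_{|\alpha|=\beta} D^\alpha u\ D^\alpha v\ dx \tag{5.34} \]

and the Sobolev norm

\[ \|u\|_{H^m(\Omega)} = \langle u, u\rangle^{1/2}_{H^m(\Omega)} = \left[\sum_{\beta=0}^{m}\ \sum_{|\alpha|=\beta} \int_\Omega (D^\alpha u)^2\,dx\right]^{1/2} = \left[\sum_{\beta=0}^{m}\ \sum_{|\alpha|=\beta} \|D^\alpha u\|^2_{L^2(\Omega)}\right]^{1/2} \tag{5.35} \]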

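Let's look at some examples; e.g., consider Ω = ℝ^2 and m = 1. Then we have

\[ D^0 u = u \quad\text{and}\quad D^1 u = \left\{ \frac{\partial u}{\partial x_1},\ \frac{\partial u}{\partial x_2} \right\} \tag{5.36} \]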

so that

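\[ \langle u, v\rangle_{H^1(\mathbb{R}^2)} = \int_{\mathbb{R}^2} \left( u\,v + \frac{\partial u}{\partial x_1}\frac{\partial v}{\partial x_1} + \frac{\partial u}{\partial x_2}\frac{\partial v}{\partial x_2} \right) dx_1\,dx_2 \tag{5.37} \]

and

\[ \|u\|^2_{H^1(\mathbb{R}^2)} = \int_{\mathbb{R}^2} \left[ u^2 + \left(\frac{\partial u}{\partial x_1}\right)^2 + \left(\frac{\partial u}{\partial x_2}\right)^2 \right] dx_1\,dx_2. \tag{5.38} \]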

Altogether we can now properly define a Sobolev space in arbitrary dimensions:

\[ H^m(\Omega) = \left\{ u : \Omega \to \mathbb{R} \ :\ D^\alpha u \in L^2(\Omega)\ \ \forall\ |\alpha| \le m \right\} \tag{5.39} \]

This is the set of all functions whose derivatives up to mth order all exist and are square-integrable.

As an example, u ∈ H^1(Ω) implies that u and all its first partial derivatives must be square-integrable over Ω because (5.38) must be finite.

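Let us look at the example u(x) = ∣x∣ and take Ω = (−1,1). Then, we have u′(x) = sgn(x) = 2H(x) − 1 (a jump function, with H the Heaviside function) and u′′(x) = 2δ(x) (with δ the Dirac delta function). Therefore,

\[ \begin{aligned} &\int_{-1}^{1} u^2(x)\,dx < \infty &&\Rightarrow\quad u \in L^2(\Omega) = H^0(\Omega) \\ &\int_{-1}^{1} \left(\frac{\partial u}{\partial x}\right)^2 dx = \int_{-1}^{1} [2H(x) - 1]^2\,dx < \infty &&\Rightarrow\quad \frac{\partial u}{\partial x} \in L^2(\Omega) \ \text{and}\ u \in H^1(\Omega) \\ &\int_{-1}^{1} \left(\frac{\partial^2 u}{\partial x^2}\right)^2 dx = \int_{-1}^{1} 4\,\delta^2(x)\,dx = \infty &&\Rightarrow\quad \frac{\partial^2 u}{\partial x^2} \notin L^2(\Omega) \ \text{and}\ u \notin H^2(\Omega) \end{aligned} \tag{5.40} \]

Note that one usually indicates the highest order k that applies (since this is what matters for practical purposes), so here we thus conclude that u ∈ H^1(Ω).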

From the above, we also see that

\[ H^\infty \subset \ldots \subset H^2 \subset H^1 \subset H^0 = L^2. \tag{5.41} \]

Notice that even though polynomials u ∈ P_k(Ω) are generally in H^∞(Ω) for any bounded Ω ⊂ ℝ^d, they are not square-integrable over Ω = ℝ^d. Luckily, in practical problems we usually consider only finite bodies Ω. To more properly address this issue, let us introduce the support of a continuous function u defined on the open set Ω ⊂ ℝ^d as the closure in Ω of the set of all points where u(x) ≠ 0, i.e.,

\[ \operatorname{supp} u = \overline{\{ x \in \Omega \ :\ u(x) \neq 0 \}} \tag{5.42} \]

This means that u(x) = 0 for x ∈ Ω ∖ supp u. We may state, e.g., that functions u ∶ Ω → ℝ with a finite support Ω ⊂ ℝ^d and ess sup_Ω ∣u∣ < ∞ are square-integrable over Ω.

Finally, let us define by C_0^k(Ω) the set of all functions contained in C^k(Ω) whose support is a bounded subset of Ω. Also, notice that

\[ C_0^k(\Omega) \subset H_0^k(\Omega) \tag{5.43} \]

and

\[ C_0^\infty(\Omega) = \bigcap_{k \ge 0} C_0^k(\Omega). \tag{5.44} \]



6 Functionals

A functional is a special type of mapping which maps from a function space U to ℝ:

I ∶ u ∈ U → I[u] ∈ ℝ. (6.1)

An example is the energy of a mechanical system which depends on the displacement field u(x) and defines an energy I ∈ ℝ. Note that unlike a function, which is a mapping from ℝ^d → ℝ, a functional's domain is generally a function space U.

Energies are often defined via operators. Generally, we call A an operator if

A ∶ u ∈ U → A(u) ∈ V, (6.2)

where both U and V are function spaces.

Examples:

$f(x) = \sqrt{x_1^2 + x_2^2} = |x|$ is a function with $f : \mathbb{R}^2 \to \mathbb{R}_0^+$.

$I[u] = \int_0^1 u^2(x)\,dx = \|u\|^2_{L^2([0,1])}$ is a functional requiring $u \in U \subset H^0(0,1)$.

$A(u) = c\,\dfrac{du}{dx}$ is a (linear differential) operator requiring $u \in C^1$.

An operator A ∶ U → V is linear if for all u1, u2 ∈ U and α,β ∈ R

A(α ⋅ u1 + β ⋅ u2) = α ⋅A(u1) + β ⋅A(u2). (6.3)

For example, L is a linear operator in

au,xx + bu,x = c ⇔ L(u) = c with L(⋅) = a(⋅),xx + b(⋅),x. (6.4)

Operators (such as the inner product operator) can also act on more than one function. Consider, e.g., an operator B ∶ U × V → ℝ where U, V are Hilbert spaces. B is called a bilinear operator if for all u, u_1, u_2 ∈ U and v, v_1, v_2 ∈ V and α, β ∈ ℝ

(i) B(α ⋅ u_1 + β ⋅ u_2, v) = α ⋅ B(u_1, v) + β ⋅ B(u_2, v)
(ii) B(u, α ⋅ v_1 + β ⋅ v_2) = α ⋅ B(u, v_1) + β ⋅ B(u, v_2)

An example of a bilinear operator is the inner product ⟨⋅, ⋅⟩ ∶ U × U → ℝ for some Hilbert space U.

An operator A ∶ U → V is called symmetric if

⟨A(u), v⟩ = ⟨u, A(v)⟩ for all u, v ∈ U. (6.5)

Furthermore, the operator is positive if

⟨A(u), u⟩ ≥ 0 for all u ∈ U . (6.6)

An example of a symmetric operator is A(u) = M u with u ∈ ℝ^d and a symmetric matrix M ∈ ℝ^{d×d}, which is positive if M is positive-semidefinite.
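The symmetry and positivity conditions can be checked directly for a small matrix operator. The matrix and vectors below are illustrative values chosen here (M is symmetric and positive definite); plain Python lists are used to keep the sketch dependency-free.

```python
def matvec(M, u):
    """Apply the operator A(u) = M u."""
    return [sum(m_ij * u_j for m_ij, u_j in zip(row, u)) for row in M]


def inner(u, v):
    """Euclidean inner product <u, v> = u_i v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))


M = [[2.0, -1.0], [-1.0, 2.0]]  # symmetric, positive definite (illustrative choice)
u, v = [1.0, 3.0], [-2.0, 0.5]

lhs = inner(matvec(M, u), v)    # <A(u), v>
rhs = inner(u, matvec(M, v))    # <u, A(v)>
print(lhs, rhs)                 # equal, since M is symmetric
print(inner(matvec(M, u), u))   # non-negative, since M is positive-semidefinite
```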



7 Variational Calculus

7.1 Variations

Consider a functional I ∶ U → ℝ such as the potential energy. Analogous to the stationarity condition of classical optimization problems, a necessary condition for an extremum of I is that the first variation of I vanishes, i.e., δI[u] = 0 (this is the stationarity condition).

A variation δu is an arbitrary function that represents admissible changes of u. If Ω ⊂ ℝ^d is the domain of u ∈ U with boundary ∂Ω, we seek solutions

\[ u \in U = \left\{ u \in H^k(\Omega) \ :\ u = \hat{u} \ \text{on } \partial\Omega_D \right\} \tag{7.1} \]

then

\[ \delta u \in U_0 = \left\{ u \in H^k(\Omega) \ :\ u = 0 \ \text{on } \partial\Omega_D \right\}. \tag{7.2} \]

k can be determined from the specific form of I[u], as will be discussed later.

With this, we define the first variation of I as

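\[ \delta I[u] = \lim_{\varepsilon\to 0} \frac{I[u + \varepsilon\,\delta u] - I[u]}{\varepsilon} = \left.\frac{d}{d\varepsilon} I[u + \varepsilon\,\delta u]\right|_{\varepsilon\to 0} \tag{7.3} \]

and analogously higher-order variations via

\[ \delta^k I[u] = \delta\left(\delta^{k-1} I\right) \quad\text{for } k \ge 2 \tag{7.4} \]

Note that a Taylor expansion of a functional I can now be written as

\[ I[u + \delta u] = I[u] + \delta I[u] + \frac{1}{2!}\,\delta^2 I[u] + \frac{1}{3!}\,\delta^3 I[u] + \ldots \tag{7.5} \]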

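The following are helpful relations for u, v ∈ U, further I_i ∶ U → V ⊂ ℝ, and constants α_i ∈ ℝ:

$\delta(\alpha_1 I_1 + \alpha_2 I_2) = \alpha_1\,\delta I_1 + \alpha_2\,\delta I_2$

$\delta(I_1 I_2) = (\delta I_1)\,I_2 + I_1\,(\delta I_2)$

$\delta \dfrac{du}{dx} = \dfrac{d}{dx}\,\delta u$ (assuming u ∈ C^1)

$\delta \int_\Omega u\,dx = \int_\Omega \delta u\,dx$ (assuming Ω is independent of u)

$\delta I[u, v, \ldots] = \left.\dfrac{d}{d\varepsilon} I[u + \varepsilon\,\delta u,\ v + \varepsilon\,\delta v,\ \ldots]\right|_{\varepsilon\to 0}$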

Example:

Let us consider

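\[ I[u] = \|u\|^2_{L^2(0,1)} = \int_0^1 u^2\,dx \quad\text{so that we seek } u \in U = H^0(0,1). \tag{7.6} \]

The variations follow as

\[ \begin{aligned} \delta I &= \lim_{\varepsilon\to 0} \frac{d}{d\varepsilon} \int_0^1 (u + \varepsilon\,\delta u)^2\,dx = \lim_{\varepsilon\to 0} \int_0^1 2(u + \varepsilon\,\delta u)\,\delta u\,dx = 2\int_0^1 u\,\delta u\,dx \\ \delta^2 I &= \lim_{\varepsilon\to 0} \frac{d}{d\varepsilon}\,\delta I[u + \varepsilon\,\delta u] = \lim_{\varepsilon\to 0} \frac{d}{d\varepsilon} \int_0^1 2(u + \varepsilon\,\delta u)\,\delta u\,dx = 2\int_0^1 (\delta u)^2\,dx \\ \delta^k I &= 0 \quad\text{for all } k > 2. \end{aligned} \tag{7.7} \]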

Notice that

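\[ \begin{aligned} I[u + \delta u] &= \int_0^1 (u + \delta u)^2\,dx = \int_0^1 u^2\,dx + \int_0^1 2u\,\delta u\,dx + \int_0^1 (\delta u)^2\,dx \\ &= I[u] + \delta I[u] + \frac{1}{2}\,\delta^2 I[u]. \end{aligned} \tag{7.8} \]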
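The first variation of I[u] = ∫₀¹ u² dx can be cross-checked numerically by differentiating I[u + ε δu] with respect to ε. In the sketch below, the choices u(x) = x and δu(x) = x(1 − x), the midpoint quadrature, and the central difference in ε are all assumptions made here for illustration.

```python
def integrate(f, a=0.0, b=1.0, m=2000):
    """Composite midpoint rule on (a, b)."""
    h = (b - a) / m
    return h * sum(f(a + (j + 0.5) * h) for j in range(m))


def I(u):
    """I[u] = int_0^1 u^2 dx."""
    return integrate(lambda x: u(x) ** 2)


def u(x):       # illustrative choice of u
    return x


def du(x):      # illustrative variation
    return x * (1.0 - x)


def perturbed(s):
    """I[u + s*du] as a function of the scalar s."""
    return I(lambda x: u(x) + s * du(x))


eps = 1.0e-4
first_var_fd = (perturbed(eps) - perturbed(-eps)) / (2.0 * eps)  # d/d(eps) at eps = 0
first_var = integrate(lambda x: 2.0 * u(x) * du(x))              # 2 int u du dx
second_var = integrate(lambda x: 2.0 * du(x) ** 2)               # 2 int (du)^2 dx

print(first_var_fd, first_var)
# Taylor expansion: I[u + du] = I[u] + dI + (1/2) d2I (exact for this quadratic I)
print(perturbed(1.0), I(u) + first_var + 0.5 * second_var)
```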

Here comes the reason we look into variations:

Some classes of partial differential equations possess a so-called variational structure; i.e., their solutions u ∈ U can be interpreted as extremal points over U of a functional I[u].

7.2 Example: Static Heat Conduction

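As an introductory example, let us review the static heat conduction problem in d dimensions, defined by (N ∈ ℝ^d denoting the outward unit normal vector)

\[ \left[\ \begin{aligned} \lambda\,\Delta T + R\,S_h &= 0 && \text{in } \Omega, \\ T &= \hat{T} && \text{on } \partial\Omega_D, \\ Q = -\lambda\,\mathrm{Grad}\,T \cdot N &= \hat{Q} && \text{on } \partial\Omega_N, \end{aligned} \right. \tag{7.9} \]

and we seek solutions T ∈ C^2(Ω) ∩ C^0(Ω̄) that satisfy all of the above equations. Such solutions are called classical solutions.

As an alternative to solving the above equations, consider the total potential energy defined by the functional I ∶ U → ℝ with

\[ I[T] = \int_\Omega \left( \frac{\lambda}{2}\,\|\mathrm{Grad}\,T\|^2 - R\,S_h\,T \right) dV + \int_{\partial\Omega_N} \hat{Q}\,T\,dS. \tag{7.10} \]

The specific form shows that we need to seek solutions in the space

\[ U = \left\{ T \in H^1(\Omega) \ :\ T = \hat{T} \ \text{on } \partial\Omega_D \right\} \quad\text{and}\quad U_0 = \left\{ T \in H^1(\Omega) \ :\ T = 0 \ \text{on } \partial\Omega_D \right\}. \tag{7.11} \]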

Let us find extremal points T ∈ U that render I[T ] stationary.

First variation:

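\[ \delta I[T] = \int_\Omega \left( \frac{\lambda}{2}\,2\,T_{,i}\,\delta T_{,i} - R\,S_h\,\delta T \right) dV + \int_{\partial\Omega_N} \hat{Q}\,\delta T\,dS = 0 \quad\text{for all } \delta T \in U_0. \tag{7.12} \]

Application of the divergence theorem to the first term yields

\[ \int_{\partial\Omega} \lambda\,T_{,i}\,N_i\,\delta T\,dS - \int_\Omega \lambda\,T_{,ii}\,\delta T\,dV - \int_\Omega R\,S_h\,\delta T\,dV + \int_{\partial\Omega_N} \hat{Q}\,\delta T\,dS = 0 \quad\text{for all } \delta T \in U_0. \tag{7.13} \]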



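Rearranging terms and using the fact that δT = 0 on ∂Ω_D leads to

\[ -\int_\Omega (\lambda\,T_{,ii} + R\,S_h)\,\delta T\,dV + \int_{\partial\Omega_N} (\lambda\,T_{,i}\,N_i + \hat{Q})\,\delta T\,dS = 0 \quad\text{for all } \delta T \in U_0. \tag{7.14} \]

This must hold for all admissible variations δT ∈ U_0. Therefore, (7.14) is equivalent to stating

\[ \lambda\,\Delta T + R\,S_h = 0 \ \text{in } \Omega, \qquad -\lambda\,(\mathrm{Grad}\,T) \cdot N = \hat{Q} \ \text{on } \partial\Omega_N \qquad\text{and}\qquad T = \hat{T} \ \text{on } \partial\Omega_D. \tag{7.15} \]

Ergo, extremal points T ∈ U of (7.10) are guaranteed to satisfy the governing equations (7.9) and are thus classical solutions.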

To see if it is a maximizer or minimizer, let us compute the second variation

\[ \delta^2 I[T] = \int_\Omega \lambda\,\delta T_{,i}\,\delta T_{,i}\,dV = \int_\Omega \lambda\,\|\delta\,\mathrm{Grad}\,T\|^2\,dV \ge 0. \tag{7.16} \]

Hence, the extremum is a minimizer, assuming that λ > 0. Otherwise, note that λ < 0 leads to solutions being (unstable) energy maxima, which implies that λ > 0 is a (necessary and sufficient) stability condition.
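That the solution of the strong form minimizes the energy can be illustrated in 1D. The following minimal sketch makes several simplifying assumptions of its own: Ω = (0,1), homogeneous Dirichlet conditions at both ends (so there is no Neumann term), a constant source S standing in for R S_h, illustrative values for λ and S, and sinusoidal perturbations as admissible variations.

```python
import math

LAM, S = 2.0, 3.0  # illustrative conductivity and constant source (stand-in for R*S_h)


def energy(T, dT, m=2000):
    """I[T] = int_0^1 (LAM/2 * T'^2 - S*T) dx for T(0) = T(1) = 0 (no Neumann part)."""
    h = 1.0 / m
    xs = (h * (j + 0.5) for j in range(m))
    return h * sum(0.5 * LAM * dT(x) ** 2 - S * T(x) for x in xs)


def T_exact(x):
    """Solves LAM*T'' + S = 0 with T(0) = T(1) = 0."""
    return S * x * (1.0 - x) / (2.0 * LAM)


def dT_exact(x):
    return S * (1.0 - 2.0 * x) / (2.0 * LAM)


def perturbed(a):
    """T_exact + a*sin(pi x): an admissible competitor (the variation vanishes at both ends)."""
    return (lambda x: T_exact(x) + a * math.sin(math.pi * x),
            lambda x: dT_exact(x) + a * math.pi * math.cos(math.pi * x))


I_min = energy(T_exact, dT_exact)
print("I[T_exact] =", I_min, " analytical:", -S**2 / (24.0 * LAM))
for a in (-0.1, 0.05, 0.2):
    print(a, energy(*perturbed(a)) - I_min)  # positive: the energy increases
```

For this quadratic functional the energy increase equals half the second variation, here λ a² π²/4, which is positive for λ > 0.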

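Notice that (assuming that λ = const. and using the L2-inner product) we can rewrite the energy functional as

\[ \begin{aligned} I[T] &= \frac{1}{2}\,\langle \lambda\,\mathrm{Grad}\,T,\ \mathrm{Grad}\,T \rangle_\Omega - \langle R\,S_h,\ T \rangle_\Omega + \langle \hat{Q},\ T \rangle_{\partial\Omega_N} \\ &= \frac{1}{2}\,B(T, T) - L(T), \end{aligned} \tag{7.17} \]

i.e., the functional is defined by a bilinear form B and a linear form L:

\[ B(\cdot, \cdot) = \langle \lambda\,\mathrm{Grad}\,\cdot,\ \mathrm{Grad}\,\cdot \rangle_\Omega \quad\text{and}\quad L(\cdot) = \langle R\,S_h,\ \cdot \rangle_\Omega - \langle \hat{Q},\ \cdot \rangle_{\partial\Omega_N} \tag{7.18} \]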

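This is in fact a recipe for a more general class of variational problems: let us consider an energy functional of the general form

\[ \begin{aligned} I[u] &= \frac{1}{2}\,B(u, u) - L_\Omega(u) - L_{\partial\Omega}(u) \\ &= \frac{1}{2}\,\langle \lambda\,\mathrm{Grad}\,u,\ \mathrm{Grad}\,u \rangle_\Omega - \langle S,\ u \rangle_\Omega - \langle \hat{Q},\ u \rangle_{\partial\Omega_N} \end{aligned} \tag{7.19} \]

with u ∈ U being some (scalar- or vector-valued) mapping and S and Q̂ denoting, respectively, distributed body sources and surface fluxes/tractions. Now we have

\[ \begin{aligned} \delta I[u] &= B(u, \delta u) - L(\delta u) \\ &= -\int_\Omega \left[ \mathrm{Div}(\lambda\,\mathrm{Grad}\,u) + S \right] \delta u\,dV - \int_{\partial\Omega_N} \left[ \hat{Q} - \lambda\,(\mathrm{Grad}\,u)\,N \right] \delta u\,dS. \end{aligned} \tag{7.20} \]

Thus, the energy functional (7.19) is generally suitable for quasistatic problems of the type

\[ \left[\ \begin{aligned} \mathrm{Div}(\lambda\,\mathrm{Grad}\,u) + S &= 0 && \text{in } \Omega \\ u &= \hat{u} && \text{on } \partial\Omega_D \\ \lambda\,(\mathrm{Grad}\,u)\,N &= \hat{Q} && \text{on } \partial\Omega_N \end{aligned} \right. \tag{7.21} \]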



Note that (7.21) describes not only heat conduction (verify that it agrees with (7.9) for the simplified case λ = const.) but the general form also applies to electromagnetism, elasticity (to be discussed later), and various other fields. Notice that, while (7.19) required u ∈ H^1(Ω) (highest derivatives are of first order), evaluating (7.21) in general requires that u ∈ C^2(Ω) ∩ C^0(Ω̄) (second derivatives are required). We will get back to this point later.

Let us adopt the following notation found in various textbooks on finite elements: the first variation is usually abbreviated as an operator acting on both the unknown field u and its variation δu; i.e., we write G ∶ U × U_0 → V ⊂ ℝ with

\[ G(u, \delta u) = D_{\delta u} I[u] = \lim_{\varepsilon\to 0} \frac{d}{d\varepsilon}\,I[u + \varepsilon\,\delta u] \tag{7.22} \]

7.3 Uniqueness

One of the beauties of the above variational problem (7.21) is that a unique minimizer exists by the Lax-Milgram theorem. This is grounded in (assuming ∣Ω∣ < ∞ and u, v ∈ U with some Hilbert space U):

boundedness of the bilinear form:

\[ |B(u, v)| \le C\,\|u\|\,\|v\| \quad\text{for some } C > 0. \tag{7.23} \]

For a bilinear form B(u, v) = ⟨Grad u, Grad v⟩, this is satisfied by the Cauchy-Schwarz inequality (using L2-norms):

\[ |B(u, v)| \le C\,\|\mathrm{Grad}\,u\|_{L^2(\Omega)}\,\|\mathrm{Grad}\,v\|_{L^2(\Omega)} \le C\,\|u\|_{H^1(\Omega)}\,\|v\|_{H^1(\Omega)} \tag{7.24} \]

coercivity of the bilinear form (ellipticity):

\[ B(u, u) \ge c\,\|u\|^2 \quad\text{for some } c > 0. \tag{7.25} \]

Again, for a bilinear form B(u, v) = ⟨Grad u, Grad v⟩ this is satisfied by Poincaré's inequality:

\[ B(u, u) = \|\mathrm{Grad}\,u\|^2_{L^2(\Omega)} \ge c\,\|u\|^2_{L^2(\Omega)} \tag{7.26} \]

These two requirements imply the well-posedness of the variational problem and thus imply the existence of a unique solution (or, that the potential has a unique global minimizer). In simple terms, the two conditions ensure that the functional has sufficient growth properties (i.e., the bilinear form has upper and lower bounds).



8 The weak form

8.1 Classical and weak solutions

Consider a physical problem that is – as before – governed by the so-called strong form

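$$\left[\begin{aligned}
(\lambda\, u_{,i})_{,i} + S &= 0 && \text{in } \Omega \\
u &= \hat{u} && \text{on } \partial\Omega_D \\
\lambda\, u_{,i} N_i &= Q && \text{on } \partial\Omega_N
\end{aligned}\right. \qquad (8.1)$$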

where u ∈ C^2(Ω) ∩ C^0(Ω). As we showed previously, the solution u can alternatively be found by using a variational approach, viz.

$$u = \arg\min \{\, I[u] : u = \hat{u} \text{ on } \partial\Omega_D \,\} \qquad (8.2)$$

whose stationarity condition is

δI[u] = G(u, δu) = B(u, δu) −L(δu) = 0 for all δu ∈ U0(Ω). (8.3)

Therefore, we can reformulate that problem (without, in principle, knowing anything about variational calculus) as

G(u, v) = B(u, v) −L(v) = 0 for all v ∈ U0(Ω) (8.4)

This is called the weak form of the problem. Now we seek solutions u ∈ U(Ω) with

$$U = \{\, u \in H^1(\Omega) : u = \hat{u} \text{ on } \partial\Omega_D \,\}, \qquad (8.5)$$

that satisfy (8.4) for all v ∈ U0(Ω); such a solution is called a weak solution. There is one essential difference between the weak and strong form: solutions of the weak form are required to be in H^1(Ω), whereas the strong form required solutions to be in C^2(Ω). Thus, we have weakened/relaxed the conditions on the family of solutions, which is why the above is called the weak form.

Notice that, if v is interpreted as a virtual displacement field, then (8.4) is also referred to as the principle of virtual work.

Computationally, solving the weak form is usually preferable over the strong form. First, u ∈ H^1(Ω) is simpler to satisfy than u ∈ C^2(Ω) (e.g., piecewise linear interpolation is sufficient in the weak form but not in the strong form). Second, the weak form will be shown to boil down to solving a linear system of equations.

Let us show that we can also arrive at the weak form in an alternative fashion without the use of variational calculus. Let us take the first equation in (8.1), multiply it by some arbitrary test function v ∈ U0(Ω) that vanishes on ∂ΩD, and integrate over the entire domain. The result, which must still vanish due to (8.1), is

0 = −∫Ω[(λu,i),i + S] v dV, (8.6)

which must hold for all admissible v ∈ U0(Ω).



Using the divergence theorem and the fact that v = 0 on ∂ΩD reduces the above to

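$$\begin{aligned}
0 &= \int_\Omega \lambda\, u_{,i} v_{,i}\, dV - \int_\Omega S\, v\, dV - \int_{\partial\Omega_N} \lambda\, u_{,i} N_i\, v\, dS && \text{for all } v \in U_0(\Omega) \\
  &= \int_\Omega \lambda\, u_{,i} v_{,i}\, dV - \int_\Omega S\, v\, dV - \int_{\partial\Omega_N} Q\, v\, dS && \text{for all } v \in U_0(\Omega),
\end{aligned} \qquad (8.7)$$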

where we used the Neumann boundary condition to transform the last term. The last equation in (8.7) is identical to (8.4). In other words, we can find the weak form without the use of variational calculus and, moreover, even without the existence of an energy functional, by starting directly from the strong form. This is an important observation (even those problems that do not have a variational structure can thus be written in terms of a weak form).

8.2 Equivalence of strong and weak forms

We have now two equivalent variational principles:

Given a Hilbert space

$$U = \{\, u \in H^k(\Omega) : u = \hat{u} \text{ on } \partial\Omega_D \,\}, \qquad (8.8)$$

a functional I ∶ U → R and associated bilinear, continuous form B(⋅, ⋅) defined on U × U and a continuous linear form L(⋅) defined on U, we seek to

(A) find u ∈ U s.t. u = arg min I[u] (8.9)

(B) find u ∈ U s.t. B(u, v) = L(v) for all v ∈ U0 (8.10)

And we know that the two have a unique connection since δI = B(u, δu) − L(δu). Thus, we know that

(A)⇔ (B) (8.11)

with a unique solution by the Lax-Milgram theorem.

8.3 Approximate solutions

The idea of numerical approaches is to find an approximate solution: we replace the space U by a finite-dimensional subspace

Uh ⊂ U , (8.12)

in which we seek a solution uh, where h stands for the discretization size.

An n-dimensional space Uh is defined by a set of n basis functions N^1, . . . , N^n and the approximation

$$u^h(x) = \sum_{a=1}^{n} u^a N^a(x) \quad \text{and} \quad v^h(x) = \sum_{a=1}^{n} v^a N^a(x). \qquad (8.13)$$



Assume that the approximation space is chosen wisely, so the exact solution can be attained with infinite refinement; i.e., we assume that

$$\text{for all } u \in U \text{ there exists } u^h(v) \in U^h \text{ such that } \lim_{h \to 0} \|u^h(v) - u\| = 0. \qquad (8.14)$$

Then we can formulate the discrete problem

(C) Find uh ∈ Uh s.t. B(uh, vh) = L(vh) for all vh ∈ Uh0 (8.15)

For the linear problem, Céa's lemma states that if uh ∈ Uh is a solution, then (without proof here)

∥u − uh∥ ≤ Ch ∥u − v∥ for all v ∈ Uh0 . (8.16)

That is, the solution is optimal in the chosen approximate function space (also, one can show that Ch → 0 as h → 0, so approximate solutions can be expected to converge to the classical one).

Next, let us insert the approximations (8.13) into (8.15) to obtain:

$$B\Big(\sum_{a=1}^{n} u^a N^a,\ \sum_{b=1}^{n} v^b N^b\Big) = L\Big(\sum_{b=1}^{n} v^b N^b\Big) \quad \text{for all } v^b \qquad (8.17)$$

or, exploiting that B is bilinear and L is linear,

$$\sum_{b=1}^{n} v^b \Big[\sum_{a=1}^{n} u^a B(N^a, N^b) - L(N^b)\Big] = 0 \quad \text{for all } v^b. \qquad (8.18)$$

Since this must hold for all (admissible) vb, we conclude that

$$\sum_{a=1}^{n} u^a B(N^a, N^b) = L(N^b) \quad \text{for } b = 1, \dots, n. \qquad (8.19)$$

This is a linear system to be solved for ua (a = 1, . . . , n).

Let us define a vector of all unknown coefficients:

$$U^h = \{u^1, \dots, u^n\}^T. \qquad (8.20)$$

Further, we define a (symmetric) matrix K ∈ Rn×n and vector F ∈ Rn with components

$$K^{ab} = B(N^a, N^b), \qquad F^b = L(N^b). \qquad (8.21)$$

Then, the linear system reads

$$K U^h = F \quad \Leftrightarrow \quad K^{ba} U^h_a = F^b. \qquad (8.22)$$

Since we are using the same approximation space for uh and vh, this is the so-called Bubnov-Galerkin approximation. Alternatively, one can choose different function spaces for the approximations uh and vh, which leads to the so-called Petrov-Galerkin method. The latter gains importance when solving over/underconstrained problems since it allows one to control the number of equations by the choice of the dimension of the space of vh.
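To make the step from (8.19) to (8.22) concrete, the following minimal sketch assembles and solves K U^h = F for the 1D model problem λu'' + S = 0 on (0, 1) with u(0) = u(1) = 0. It assumes piecewise linear "hat" functions on a uniform mesh as the basis N^a (a choice not yet introduced at this point of the notes) and constant λ and S; the function names are illustrative only.

```python
def assemble_poisson_1d(n, lam=1.0, source=1.0):
    """Assemble K^{ab} = B(N^a, N^b) and F^b = L(N^b) for n interior hat functions on (0, 1)."""
    h = 1.0 / (n + 1)                      # uniform node spacing (assumption)
    K = [[0.0] * n for _ in range(n)]
    F = [source * h] * n                   # integral of S * N^b over the domain
    for a in range(n):
        K[a][a] = 2.0 * lam / h            # integral of lam * N^a' * N^a'
        if a + 1 < n:
            K[a][a + 1] = K[a + 1][a] = -lam / h   # neighbouring hats overlap on one cell
    return K, F, h


def solve_dense(K, F):
    """Gaussian elimination without pivoting (adequate for this small SPD matrix)."""
    n = len(F)
    A = [row[:] + [rhs] for row, rhs in zip(K, F)]
    for i in range(n):
        for r in range(i + 1, n):
            factor = A[r][i] / A[i][i]
            for c in range(i, n + 1):
                A[r][c] -= factor * A[i][c]
    U = [0.0] * n
    for i in reversed(range(n)):
        s = A[i][n] - sum(A[i][c] * U[c] for c in range(i + 1, n))
        U[i] = s / A[i][i]
    return U


n = 4
K, F, h = assemble_poisson_1d(n)
U = solve_dense(K, F)
nodes = [(a + 1) * h for a in range(n)]
exact = [0.5 * x * (1.0 - x) for x in nodes]   # solution of u'' + 1 = 0, u(0) = u(1) = 0
```

For this particular 1D problem the computed coefficients coincide with the exact solution at the nodes; this is a special property of the example and should not be expected in general.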



9 Vainberg’s theorem

The question arises whether or not a general form like (7.19) always exists for any set of PDEs/ODEs as governing equations. Vainberg’s theorem helps us answer this question. Consider a weak form

G[u, v] = 0 ∀ v ∈ U0(Ω). (9.1)

Let us see if G derives from a potential I via its first variation. That would imply that

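$$G(u, \delta u) = D_{\delta u} I[u] = \lim_{\varepsilon \to 0} \frac{\mathrm{d}}{\mathrm{d}\varepsilon}\, I[u + \varepsilon\,\delta u]. \qquad (9.2)$$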

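Now recall from calculus that for any twice continuously differentiable function f(x, y) we must have by Schwarz' theorem

$$\frac{\partial}{\partial y} \frac{\partial f}{\partial x} = \frac{\partial}{\partial x} \frac{\partial f}{\partial y}. \qquad (9.3)$$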

We can use the same strategy to formulate whether or not a weak form derives from a potential. Specifically, we can take one more variation and state that

$$D_{\delta u_2} G(u, \delta u_1) = D_{\delta u_1} G(u, \delta u_2) \quad \text{if and only if } I[u] \text{ exists} \qquad (9.4)$$

This is known as Vainberg’s theorem.

We can easily verify this for the general form given in (7.19):

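$$\begin{aligned}
G(u, \delta u_1) &= D_{\delta u_1} I[u] = \int_\Omega \left[\lambda\, \operatorname{Grad} u \cdot \operatorname{Grad} \delta u_1 - S\, \delta u_1\right] dx - \int_{\partial\Omega_N} Q\, \delta u_1\, dx \\
\Rightarrow \quad D_{\delta u_1} G(u, \delta u_2) &= \int_\Omega \lambda\, \operatorname{Grad} \delta u_2 \cdot \operatorname{Grad} \delta u_1\, dx = D_{\delta u_2} G(u, \delta u_1)
\end{aligned} \qquad (9.5)$$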

In simple terms (and not most generally), Vainberg’s theorem holds if the potential obeys symmetry. This in turn implies that the governing PDE contains derivatives of even order (such as linear momentum balance, which is second-order in both spatial and temporal derivatives, or the equilibrium equations of beams, which are fourth-order in space and second-order in time). If the PDEs are of odd order (such as, e.g., the time-dependent diffusion or heat equation), then no direct potential I exists; there are work-arounds using so-called effective potentials that will be discussed later in the context of internal variables.

Of course, knowing that a variational structure exists is beneficial, but it does not reveal anything about the actual solution u, which will be obtained by solving the above system of equations.



10 Energy norm

For many purposes it will be convenient to introduce the so-called energy norm

∣⋅∣E =√B(⋅, ⋅). (10.1)

For example, if in subsequent sections we find an approximate solution T^h for the temperature field, then the error can be quantified by

$$|T - T^h|_E = \sqrt{B(T - T^h, T - T^h)} = \sqrt{\int_\Omega \lambda\, \|\operatorname{Grad}(T - T^h)\|^2\, dV}. \qquad (10.2)$$

Notice that in this case ∣T − T^h∣E = 0 does not necessarily imply T − T^h = 0 so that, strictly speaking, the above energy norm is only a semi-norm. To turn it into a proper norm, we need to exclude solutions corresponding to rigid-body motion (i.e., solutions that imply uniform translations or rotations of the T-field but without producing gradients in T). If we assume that essential boundary conditions are chosen appropriately to suppress rigid body motion by seeking solutions

$$T \in U = \{\, T \in H^1 : T = \hat{T} \text{ on } \partial\Omega_D \,\}, \qquad (10.3)$$

then, for this space, ∥⋅∥E is indeed a proper norm.



11 The mechanical variational problem

11.1 Linearized kinematics

After all those precursors, let us analyze the mechanical variational problem and start with the simplest problem: quasistatics in linearized kinematics. Here, the strong form is

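$$\left[\begin{aligned}
\sigma_{ij,j} + \rho\, b_i &= 0 && \text{in } \Omega, \\
u_i &= \hat{u}_i && \text{on } \partial\Omega_D, \\
\sigma_{ij} n_j &= t_i && \text{on } \partial\Omega_N.
\end{aligned}\right. \qquad (11.1)$$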

The associated total potential energy is

$$I[u] = \int_\Omega W(\varepsilon)\, dV - \int_\Omega \rho\, b \cdot u\, dV - \int_{\partial\Omega_N} t \cdot u\, dS \qquad (11.2)$$

and we seek displacement field solutions

$$u = \arg\min \{\, I[u] : u = \hat{u} \text{ on } \partial\Omega_D \,\}. \qquad (11.3)$$

Compute the first variation:

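$$\begin{aligned}
\delta I[u] &= \int_\Omega \frac{\partial W}{\partial \varepsilon_{ij}}\, \operatorname{sym}(\delta u_{i,j})\, dV - \int_\Omega \rho\, b_i\, \delta u_i\, dV - \int_{\partial\Omega_N} t_i\, \delta u_i\, dS \\
&= \int_\Omega \sigma_{ij}\, \delta u_{i,j}\, dV - \int_\Omega \rho\, b_i\, \delta u_i\, dV - \int_{\partial\Omega_N} t_i\, \delta u_i\, dS = 0 \quad \forall\ \delta u \in U_0,
\end{aligned} \qquad (11.4)$$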

where we used σij = ∂W/∂εij and σij = σji (by angular momentum balance for simple bodies). Note that application of the divergence theorem shows the equivalence of the two forms since

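$$\delta I[u] = 0 = \int_{\partial\Omega_N} (\sigma_{ij} n_j - t_i)\, \delta u_i\, dS - \int_\Omega (\sigma_{ij,j} + \rho\, b_i)\, \delta u_i\, dV \quad \forall\ \delta u \in U_0. \qquad (11.5)$$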

We can use the first variation to define the weak form as

G(u,v) = A(u,v) −L(v) = 0 for all adm. v (11.6)

with

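$$A(u, v) = \int_\Omega \sigma_{ij}(\operatorname{sym}(\nabla u))\, v_{i,j}\, dV \quad \text{and} \quad L(v) = \int_\Omega \rho\, b_i v_i\, dV + \int_{\partial\Omega_N} t_i v_i\, dS. \qquad (11.7)$$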

Notice that A(⋅, ⋅) is generally not a bilinear operator, while L(⋅) is a linear operator.

Next, we introduce the discrete weak form A(uh, vh) − L(vh) = 0 with the Bubnov-Galerkin approximation

$$u^h(x) = \sum_{a=1}^{n} u^a N^a(x) \quad \text{and} \quad v^h(x) = \sum_{a=1}^{n} v^a N^a(x), \qquad (11.8)$$



so that we arrive at (in component form)

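$$\sum_{a=1}^{n} v^a_i \left[\int_\Omega \sigma_{ij}(\operatorname{sym}(\nabla u^h))\, N^a_{,j}\, dV - \int_\Omega \rho\, b_i N^a\, dV - \int_{\partial\Omega_N} t_i N^a\, dS\right] = 0 \quad \text{for all adm. } v^a \qquad (11.9)$$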

or

$$F_{\text{int}}(U^h) - F_{\text{ext}} = 0 \quad \text{with} \quad U^h = \{u^1, \dots, u^n\}^T \qquad (11.10)$$

and

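$$F^a_{\text{int},i} = \int_\Omega \sigma_{ij}(\nabla u^h)\, N^a_{,j}\, dV \quad \text{and} \quad F^a_{\text{ext},i} = \int_\Omega \rho\, b_i N^a\, dV + \int_{\partial\Omega_N} t_i N^a\, dS \qquad (11.11)$$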

For the special case of linear elasticity we have σij = Cijkluk,l so that the weak form reads

G(u,v) = B(u,v) −L(v) = 0 for all adm. v (11.12)

with

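$$B(u, v) = \int_\Omega v_{i,j}\, C_{ijkl}\, u_{k,l}\, dV \quad \text{and} \quad L(v) = \int_\Omega \rho\, b_i v_i\, dV + \int_{\partial\Omega_N} t_i v_i\, dS, \qquad (11.13)$$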

so B(⋅, ⋅) is indeed a bilinear form. Inserting the approximate fields, (11.11) becomes

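$$\begin{aligned}
F^a_{\text{int},i} &= \int_\Omega C_{ijkl}\, u^h_{k,l}\, N^a_{,j}\, dV = \sum_{b=1}^{n} \int_\Omega C_{ijkl}\, u^b_k N^b_{,l} N^a_{,j}\, dV = \sum_{b=1}^{n} u^b_k \int_\Omega C_{ijkl}\, N^a_{,j} N^b_{,l}\, dV \\
&= \sum_{b=1}^{n} K^{ab}_{ik} u^b_k \quad \text{with} \quad K^{ab}_{ik} = \int_\Omega C_{ijkl}\, N^a_{,j} N^b_{,l}\, dV \\
&\Rightarrow \quad F_{\text{int}} = K U^h \quad \Rightarrow \quad U^h = K^{-1} F_{\text{ext}} \quad \text{if } \det K \neq 0.
\end{aligned} \qquad (11.14)$$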

That is, we arrive at a linear problem to be solved for the unknown coefficients U^h = {u^1, . . . , u^n}.

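For computational purposes, notice that vectors U^h and (internal or external) F, e.g., in 3D are, respectively

$$U^h = \begin{pmatrix} u^1 \\ \vdots \\ u^n \end{pmatrix} = \begin{pmatrix} u^1_1 \\ u^1_2 \\ u^1_3 \\ \vdots \\ u^n_1 \\ u^n_2 \\ u^n_3 \end{pmatrix}, \qquad F = \begin{pmatrix} F^1 \\ \vdots \\ F^n \end{pmatrix} = \begin{pmatrix} F^1_1 \\ F^1_2 \\ F^1_3 \\ \vdots \\ F^n_1 \\ F^n_2 \\ F^n_3 \end{pmatrix}. \qquad (11.15)$$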

If we use 0-index notation like in C++ (i.e., we sum over a = 0, . . . , n − 1 instead of a = 1, . . . , n), then

uai is the (d ⋅ a + i)th component of vector Uh in d dimensions. (11.16)

Similarly, we apply the same rule to the rows and columns of matrix K, so that

Kabik is the component at (d ⋅ a + i, d ⋅ b + k) of matrix K in d dimensions. (11.17)
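The index rules (11.16) and (11.17) can be sketched in a few lines. The helper names `dof_index` and `scatter_block` below are hypothetical and only serve to illustrate how a nodal block K^{ab}_{ik} lands in the global matrix when 0-based indices are used.

```python
def dof_index(a, i, d):
    """Global position of component i of node a (both 0-based) in d dimensions, cf. (11.16)."""
    return d * a + i


def scatter_block(K, K_ab, a, b, d):
    """Add the d-by-d nodal block K^{ab}_{ik} into the global matrix K, cf. (11.17)."""
    for i in range(d):
        for k in range(d):
            K[dof_index(a, i, d)][dof_index(b, k, d)] += K_ab[i][k]


d, n = 3, 4                                   # 3D problem with four nodes (example values)
K = [[0.0] * (d * n) for _ in range(d * n)]   # dense global matrix, for illustration only
block = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
scatter_block(K, block, a=1, b=2, d=d)        # block occupies rows 3..5, columns 6..8
```

A dense list-of-lists matrix is used here purely for readability; practical codes would typically use a sparse storage format.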



There is a shortcut to computing the internal force vector via the Rayleigh-Ritz method. Recall that this implies we insert uh into the total potential energy and minimize with respect to the unknown coefficients, i.e., we must solve

$$\frac{\partial I[u^h]}{\partial u^a} = 0 \quad \forall\ a = 1, \dots, n. \qquad (11.18)$$

Note that

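$$\begin{aligned}
\frac{\partial I[u^h]}{\partial u^a_i} = 0 &= \frac{\partial}{\partial u^a_i} \left[\int_\Omega W(\varepsilon^h)\, dV - \int_\Omega \rho\, b \cdot u^h\, dV - \int_{\partial\Omega_N} t \cdot u^h\, dS\right] \\
&= \int_\Omega \frac{\partial W}{\partial \varepsilon_{kl}}(\varepsilon^h)\, \frac{\partial \varepsilon^h_{kl}}{\partial u^a_i}\, dV - \int_\Omega \rho\, b_k\, \frac{\partial}{\partial u^a_i} \sum_{b=1}^{n} u^b_k N^b\, dV - \int_{\partial\Omega_N} t_k\, \frac{\partial}{\partial u^a_i} \sum_{b=1}^{n} u^b_k N^b\, dS,
\end{aligned} \qquad (11.19)$$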

where

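$$\varepsilon^h_{kl} = \frac{1}{2}\left(u^h_{k,l} + u^h_{l,k}\right) = \sum_{b=1}^{n} \frac{1}{2}\left(u^b_k N^b_{,l} + u^b_l N^b_{,k}\right) \quad \Rightarrow \quad \frac{\partial \varepsilon^h_{kl}}{\partial u^a_i} = \frac{1}{2}\left(\delta_{ik} N^a_{,l} + N^a_{,k}\, \delta_{li}\right). \qquad (11.20)$$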

This is equivalent to

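$$\begin{aligned}
0 &= \int_\Omega \sigma_{kl}(\varepsilon^h)\, \frac{1}{2}\left(N^a_{,l}\, \delta_{ik} + N^a_{,k}\, \delta_{li}\right) dV - \int_\Omega \rho\, b_i N^a\, dV - \int_{\partial\Omega_N} t_i N^a\, dS \\
&= \int_\Omega \sigma_{il}(\varepsilon^h)\, N^a_{,l}\, dV - \int_\Omega \rho\, b_i N^a\, dV - \int_{\partial\Omega_N} t_i N^a\, dS.
\end{aligned} \qquad (11.21)$$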

By comparison, we see immediately that this yields F^a_int,i and F^a_ext,i directly. Thus, rather than resorting to variations, we can obtain the internal and external force vectors alternatively (and oftentimes much more simply) by the Rayleigh-Ritz approach, which inserts the approximation into the potential energy and then differentiates with respect to the unknown coefficients.

Also, notice that Fext is independent of the constitutive law and only depends on the applied body forces and surface tractions. That is, Fext is sufficiently general for arbitrary materials (in linearized kinematics), while the computation of Fint depends on the particular material model.
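The Rayleigh-Ritz statement ∂I/∂u^a = F_int − F_ext can be checked numerically. The sketch below assumes a 1D linear elastic bar discretized by 2-node elements of equal length h with axial stiffness EA and given nodal external forces; all names and numbers are illustrative assumptions. It compares a central finite-difference gradient of the discrete energy with the analytically assembled residual.

```python
def energy(U, EA, h, F_ext):
    """Discrete potential energy of a 1D bar with 2-node elements of length h."""
    strain_energy = sum(0.5 * EA / h * (U[e + 1] - U[e]) ** 2 for e in range(len(U) - 1))
    external_work = sum(f * u for f, u in zip(F_ext, U))
    return strain_energy - external_work


def residual(U, EA, h, F_ext):
    """F_int(U) - F_ext assembled element by element."""
    R = [-f for f in F_ext]
    for e in range(len(U) - 1):
        force = EA / h * (U[e + 1] - U[e])   # axial force carried by element e
        R[e] -= force
        R[e + 1] += force
    return R


def numerical_gradient(U, EA, h, F_ext, eps=1e-6):
    """Central finite differences of the energy with respect to each coefficient."""
    grad = []
    for a in range(len(U)):
        U_plus, U_minus = U[:], U[:]
        U_plus[a] += eps
        U_minus[a] -= eps
        grad.append((energy(U_plus, EA, h, F_ext) - energy(U_minus, EA, h, F_ext)) / (2 * eps))
    return grad


EA, h = 2.0, 0.5
U = [0.0, 0.1, 0.25, 0.2]          # arbitrary trial coefficients (not a solution)
F_ext = [0.0, 0.3, -0.1, 0.5]
R = residual(U, EA, h, F_ext)
G = numerical_gradient(U, EA, h, F_ext)
max_difference = max(abs(r - g) for r, g in zip(R, G))
```

The same finite-difference comparison is a common way to debug a newly implemented internal force routine.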

11.2 Finite kinematics

The variational problem in finite-deformation quasistatics is quite similar:

$$I[\varphi] = \int_\Omega W(F)\, dV - \int_\Omega R\, B \cdot \varphi\, dV - \int_{\partial\Omega_N} T \cdot \varphi\, dS \qquad (11.22)$$

and we seek solutions

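$$\varphi \in U = \{\, \varphi \in H^1(\Omega) : \varphi = \hat{\varphi} \text{ on } \partial\Omega_D \,\} \quad \text{such that} \quad \varphi = \arg\min I[\varphi]. \qquad (11.23)$$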

In all our problems, we will assume that the undeformed and deformed coordinate systems coincide so that we write for convenience ϕ = x = X + u and we thus formulate the above problem in terms of the displacement field u ∈ U, like in the linear elastic case (note that this is a “notational crime” that we adopt here for convenience).



We may then write F = I +Gradu and compute the first variation as

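$$\begin{aligned}
\delta I[u] &= \int_\Omega \frac{\partial W}{\partial F_{iJ}}\, \delta u_{i,J}\, dV - \int_\Omega R\, B_i\, \delta u_i\, dV - \int_{\partial\Omega_N} T_i\, \delta u_i\, dS \\
&= \int_\Omega P_{iJ}\, \delta u_{i,J}\, dV - \int_\Omega R\, B_i\, \delta u_i\, dV - \int_{\partial\Omega_N} T_i\, \delta u_i\, dS = 0,
\end{aligned} \qquad (11.24)$$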

where we used the first Piola-Kirchhoff stress tensor PiJ = ∂W /∂FiJ (which is not symmetric).

Even though the form looks similar to (11.4), recall that P(F) involves a generally nonlinear relation between P and the displacement gradient Grad u. Therefore, the finite-deformation variational problem does not involve a bilinear form (even if the material is elastic).

As before, we produce a discrete approximation, e.g., with the Bubnov-Galerkin approximation

$$u^h(X) = \sum_{a=1}^{n} u^a N^a(X) \quad \text{and} \quad v^h(X) = \sum_{a=1}^{n} v^a N^a(X), \qquad (11.25)$$

so that we again arrive at

Fint(Uh) −Fext = 0 (11.26)

where now, by comparison,

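$$F^a_{\text{int},i} = \int_\Omega P_{iJ}(\nabla u)\, N^a_{,J}\, dV \quad \text{and} \quad F^a_{\text{ext},i} = \int_\Omega R\, B_i N^a\, dV + \int_{\partial\Omega_N} T_i N^a\, dS \qquad (11.27)$$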

In a nutshell, the linearized and finite elastic variational problems result in the same system of equations (11.26). For the special case of linear elasticity, that system is linear. Otherwise the problem is nonlinear and requires an iterative solution method.
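As an illustration of such an iterative solution method, the following sketch applies Newton-Raphson iteration to a single-dof version of (11.26). The internal force law F_int(u) = k u + c u^3, its tangent, and all numbers are hypothetical choices made only for this example; the notes do not prescribe a particular solver at this point.

```python
def newton_solve(f_int, tangent, f_ext, u0=0.0, tol=1e-12, max_iter=50):
    """Solve f_int(u) - f_ext = 0 for a scalar unknown by Newton-Raphson iteration."""
    u = u0
    for iteration in range(max_iter):
        r = f_int(u) - f_ext          # residual of the current iterate
        if abs(r) < tol:
            return u, iteration
        u -= r / tangent(u)           # linearize about u and solve for the update
    raise RuntimeError("Newton iteration did not converge")


k, c = 1.0, 1.0                        # hypothetical stiffness parameters


def f_int(u):
    return k * u + c * u ** 3          # nonlinear internal force (assumed model)


def tangent(u):
    return k + 3.0 * c * u ** 2        # derivative of f_int with respect to u


u_solution, iterations = newton_solve(f_int, tangent, f_ext=2.0)
```

For a system with many dofs the scalar division becomes the solution of a linear system with the tangent matrix in every iteration.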

11.3 Thermal problem revisited

For completeness, let us revisit the quasistatic thermal problem in an analogous fashion: we seek temperature values T^h = {T^1, . . . , T^n} such that

Qint(T h) −Qext = 0 (11.28)

with

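$$Q^a_{\text{int}} = \int_\Omega Q_I(\nabla T)\, N^a_{,I}\, dV \quad \text{and} \quad Q^a_{\text{ext}} = \int_\Omega R\, S_h N^a\, dV + \int_{\partial\Omega_N} Q\, N^a\, dS \qquad (11.29)$$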

where QI = ∂W /∂T,I .

For linear heat conduction, Q = λ GradT , we obtain a linear system of equations since then

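$$Q^a_{\text{int}} = \sum_{b=1}^{n} T^b \int_\Omega \lambda\, N^a_{,I} N^b_{,I}\, dV = \sum_{b=1}^{n} K^{ab} T^b \quad \text{with} \quad K^{ab} = \int_\Omega \lambda\, N^a_{,I} N^b_{,I}\, dV. \qquad (11.30)$$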

(Notice that for convenience we defined the flux vector Q without the negative sign, which results in the analogous forms as in linear elasticity.)



12 Interpolation spaces

So far, we have assumed approximations of the type

$$u^h(x) = \sum_{a=1}^{n} u^a N^a(x), \qquad (12.1)$$

but we have not chosen particular spaces Uh for the interpolation or shape functions Na(x).

In general, there are two possible choices:

global shape functions that are defined such that ∣supp N^a∣ ∼ ∣Ω∣, e.g., polynomials N^a(x) = x^(a−1) or trigonometric polynomials N^a(x) = cos(π(a − 1)x).

local shape functions that are defined locally: ∣supp N^a∣ ≪ ∣Ω∣, e.g., piecewise linear shape functions.

For any set of shape functions, the following shape function properties must be satisfied:

(1) for any x ∈ Ω there is at least one a with 1 ≤ a ≤ n such that N^a(x) ≠ 0 (i.e., the whole domain must be covered; otherwise, there is no approximation at all at certain points)

(2) all N^a should allow us to satisfy the Dirichlet boundary conditions if required.

(3) linear independence of the shape functions:

$$\sum_{a=1}^{n} u^a N^a = 0 \quad \Leftrightarrow \quad u^a = 0 \text{ for all } a = 1, \dots, n. \qquad (12.2)$$

In other words, given any function uh ∈ Uh, there exists a unique set of parameters {u^1, . . . , u^n} such that

$$u^h = \sum_{a=1}^{n} u^a N^a. \qquad (12.3)$$

Then, functions {N^1, . . . , N^n} are a basis of Uh. Linear independence is important since it avoids ill-posed problems.

For example, take Uh = P2 and {N^1, N^2, N^3} = {1, x, x^2} so that uh = u^1 N^1 + u^2 N^2 + u^3 N^3. Hence, if, e.g., uh = a + bx + cx^2 then we immediately conclude that u^1 = a, u^2 = b, u^3 = c uniquely.

Otherwise, i.e., if there existed a set {α^1, . . . , α^n} ≠ 0 such that ∑_{a=1}^n α^a N^a = 0, then this set of parameters could be added on top of any solution {u^1, . . . , u^n} such that

$$u^h = \sum_{a=1}^{n} u^a N^a = \sum_{a=1}^{n} (u^a + \alpha^a) N^a, \qquad (12.4)$$

which means both {u^a} and {u^a + α^a} are solutions (hence the problem is not well-posed).

(4) The shape functions N^a must satisfy the differentiability/integrability requirements of the weak form (this depends on the problem to be solved and will be discussed later).

(5) The shape functions must possess “sufficient approximation power”. In other words, consider uh ∈ Uh ⊂ U: we should ensure that uh = ∑_{a=1}^n u^a N^a → u as n → ∞.



This is a crucial requirement. It tells us that for an approximation to converge, we must pick an approximate function space that gives the solution “a chance to converge”. For example, assume you aim to approximate a high-order polynomial u ∈ Pn (with n ≫ 1) by an approximation uh using shape functions {1, x, x^2, x^3, . . . , x^n}. This is expected to converge as n → ∞, because the coefficients of uh will approach the coefficients of u. But choosing shape functions {1, x, x^3, . . . , x^n} (notice the x^2-term is omitted) will never converge as n → ∞. Polynomials do satisfy this requirement by the following theorem:

Weierstrass approximation theorem: Given a continuous function f ∶ [a, b] ⊂ R → R and any scalar ε > 0, then there exists a polynomial

pn(x) ∈ P∞ such that ∣f(x) − pn(x)∣ < ε for all x ∈ [a, b]. (12.5)

This means every continuous function u can be approximated by a polynomial function to within any level of accuracy.

Therefore, {N^i} = {1, x, x^2, x^3, . . .}, i.e., the polynomials in R, is a suitable choice for the shape functions that satisfy the completeness property (and we have shown their linear independence).

Note that, as discussed above, one cannot omit any intermediate-order terms from the set

1, x, x2, x3, . . .. (12.6)

If one omits a term, e.g., takes {1, x^2, x^3, . . .}, then for a general uh ∈ Uh = Pn there is no set {u^1, . . . , u^n} such that uh = ∑_{i=1}^n u^i N^i.

As an extension, the Weierstrass approximation theorem also applies to trigonometric polynomials (cf. Fourier series).

completeness in higher dimensions:

A polynomial approximation in R^d is complete up to order q, if it contains independently all monomials x^α with ∣α∣ = α1 + . . . + αd ≤ q, i.e., using multi-indices we write

$$u^h = \sum_{\beta=0}^{q} \sum_{|\alpha| = \beta} c_\alpha x^\alpha. \qquad (12.7)$$

What does this mean in practice?

1D: {1, x, x^2, x^3, . . . , x^q} so that a polynomial of order q contains q + 1 monomials

2D: q = 0: {1}
    q = 1: {1, x_1, x_2}
    q = 2: {1, x_1, x_2, x_1^2, x_1 x_2, x_2^2}
    q = 3: {1, x_1, x_2, x_1^2, x_1 x_2, x_2^2, x_1^3, x_1^2 x_2, x_1 x_2^2, x_2^3}
    ...

The number of independent monomials in 2D is hence (q + 1)(q + 2)/2.
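The count (q + 1)(q + 2)/2 can be confirmed by simply enumerating the multi-indices of (12.7) in 2D. The function name below is illustrative; exponents are returned as pairs (α1, α2), ordered by total degree as in the list above.

```python
def monomial_exponents_2d(q):
    """All exponent pairs (alpha1, alpha2) with alpha1 + alpha2 <= q, ordered by total degree."""
    return [(beta - j, j) for beta in range(q + 1) for j in range(beta + 1)]


# Exponents of 1, x1, x2, x1^2, x1*x2, x2^2 for a complete quadratic basis.
quadratic = monomial_exponents_2d(2)

# Number of monomials for q = 0, ..., 5.
counts = [len(monomial_exponents_2d(q)) for q in range(6)]
```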



13 The Finite Element Method

motivation: define shape functions that are local and admit a simple way to enforce Dirichlet BCs.

idea: introduce a discretization Th that splits Ω into subdomains Ωe, the so-called elements, such that

$$\Omega_e \subset \Omega, \qquad \Omega = \bigcup_e \Omega_e, \qquad \partial\Omega \subseteq \bigcup_e \partial\Omega_e. \qquad (13.1)$$

Th is defined by the collection of nodes and elements and is called a mesh.

Mathematically (and computationally), a finite element is an object that has

(i) a FE subdomain Ωe.

(ii) a (linear) space of shape functions N i (restricted to Ωe, i.e. suppN i = Ωe).

(iii) a set of degrees of freedom (dofs), viz. the ua associated with those N i.

The Finite Element Method (FEM) defines continuous, piecewise-polynomial shape functions such that

$$N^i(x_j) = \delta_{ij} \quad \text{for all } i, j \in \{1, \dots, n\} \qquad (13.2)$$

This is the defining relation that determines the shape functions. Notice that if we evaluate the approximation uh(x) at one of the nodes xj, then

$$u^h(x_j) = \sum_{a=1}^{n} u^a N^a(x_j) = \sum_{a=1}^{n} u^a \delta_{aj} = u^j. \qquad (13.3)$$

That is, the coefficient u^j can now be identified as the value of approximate function uh at node j. This makes for a very beneficial physical interpretation of the (yet to be determined) shape function coefficients.

Let us check the requirements for shape functions:

(1) is automatically satisfied: if x ∈ Ω then x ∈ Ωe for some e, then there are N i(x) ≠ 0

(2) can be satisfied by fixing degrees of freedom of the boundary nodes (errors possible)

(3) Assume, by contradiction, that uh(x) = 0 for all x ∈ Ω while some u^a ≠ 0. Now, evaluate at a node xj:

$$0 = u^h(x_j) = \sum_{a=1}^{n} u^a N^a(x_j) = u^j \quad \Rightarrow \quad u^j = 0, \qquad (13.4)$$

which contradicts the assumption that some u^a ≠ 0. Thus, we have linear independence.

(4) Integrability/differentiability requirements depend on the variational problem to be solved and must be ensured. For example, for mechanics we have Uh, Vh ∈ H^1, i.e., first derivatives must be square-integrable. Note that this guarantees that displacements (0th derivatives) are continuous and thus compatible (no jumps in displacements).

(5) Completeness requires uh → u (and thus Uh → U) to within desirable accuracy. In the FE method, one enriches Uh by, e.g.,



h-refinement: refining the discretization Th while keeping the polynomial order fixed.

p-refinement: increasing the polynomial interpolation order within a fixed discretization Th.

hp-refinement: combination of the two above.

r-refinement: repositioning of nodes while keeping discretization/interpolation fixed.

A note on ensuring sufficient approximation power: consider the exact solution u(x) at a point x ∈ Ω so

$$u(x + h) = u(x) + h\, u'(x) + \frac{1}{2} h^2 u''(x) + \dots + \frac{1}{q!} h^q u^{(q)}(x) + O(h^{q+1}) \qquad (13.5)$$

Assume that Uh contains all polynomials complete up to degree q (i.e., uh ∈ Pq), then there exists

$$u^h \in U^h \quad \text{such that} \quad u(x) = u^h(x) + O(h^{q+1}). \qquad (13.6)$$

Let p be the highest derivative in the weak form, then

$$\frac{\mathrm{d}^p u}{\mathrm{d}x^p} = \frac{\mathrm{d}^p u^h}{\mathrm{d}x^p} + O(h^{q+1-p}). \qquad (13.7)$$

For the solution to converge as h → 0 we need q + 1 − p ≥ 1 so that we have at least order O(h). Thus we must ensure that

q ≥ p (13.8)



14 Finite element spaces: polynomial shape functions

Simplest of all choices: continuous, piecewise-polynomial interpolation functions.

Note that we need q ≥ 1 since p = 1 for the mechanical/thermal/electromagnetic variational problems (i.e., we need at least linear interpolation within elements).

14.1 One dimension

Simplest example: 2-node bar element

Interpolation with element dofs u^1_e, u^2_e so that

$$u^h_e(x) = N^1_e(x)\, u^1_e + N^2_e(x)\, u^2_e \qquad (14.1)$$

and we must have u^h_e(0) = u^1_e and u^h_e(∆x) = u^2_e.

This gives the element shape functions:

$$N^1_e(x) = 1 - \frac{x}{\Delta x}, \qquad N^2_e(x) = \frac{x}{\Delta x}. \qquad (14.2)$$

Note that the interpolation space uses 1, x which is complete up to q = 1.

Extension: Lagrangian interpolation of higher order

Interpolation up to degree q, i.e., 1, x, x2, . . . , xq so that

$$u^h_e(x) = \sum_{a=1}^{q+1} N^a_e(x)\, u^a_e = a_0 + a_1 x + a_2 x^2 + \dots + a_q x^q. \qquad (14.3)$$

In general, shape functions can be determined by solving the q + 1 equations

uhe(xi) = ui for all i = 1, . . . , q + 1 (14.4)

for the q + 1 coefficients ai (i = 0, . . . , q). Then, rearranging the resulting polynomial allows us to extract the shape functions N^a_e(x) by comparison of the coefficients of u^a_e.

Alternatively, we can solve

Nae (xi) = δai for all nodes i = 1, . . . , q + 1. (14.5)

The solution is quite intuitive:

$$N^a_e(x) = \frac{(x - x_1) \cdot \ldots \cdot (x - x_{a-1}) \cdot (x - x_{a+1}) \cdot \ldots \cdot (x - x_{q+1})}{(x_a - x_1) \cdot \ldots \cdot (x_a - x_{a-1}) \cdot (x_a - x_{a+1}) \cdot \ldots \cdot (x_a - x_{q+1})} \qquad (14.6)$$

One can readily verify that Nae (xi) = δai. These are called Lagrange polynomials.
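Formula (14.6) translates directly into a short routine. The sketch below uses 0-based node indices and an arbitrary list of node positions (here a 3-node element on [0, 1], chosen only as an example), and checks the Kronecker property as well as the partition of unity.

```python
def lagrange_shape(a, x, nodes):
    """Evaluate the Lagrange polynomial N^a_e(x) of (14.6); a is a 0-based node index."""
    value = 1.0
    for b, x_b in enumerate(nodes):
        if b != a:
            value *= (x - x_b) / (nodes[a] - x_b)
    return value


nodes = [0.0, 0.5, 1.0]                      # example: quadratic (q = 2) element

# N^a(x_i) should reproduce the Kronecker delta.
delta = [[lagrange_shape(a, x_i, nodes) for x_i in nodes] for a in range(len(nodes))]

# The shape functions should sum to one at any point inside the element.
partition = sum(lagrange_shape(a, 0.3, nodes) for a in range(len(nodes)))
```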



Alternative: hierarchical interpolation

We can also construct higher-order interpolations based on lower-order shape functions. For example, start with a 2-node bar:

$$N^1_e(x) = 1 - \frac{x}{\Delta x}, \qquad N^2_e(x) = \frac{x}{\Delta x}. \qquad (14.7)$$

Let us enrich the interpolation to reach q = 2:

$$u^h(x) = N^1_e(x)\, u^1_e + N^2_e(x)\, u^2_e + N^3_e(x)\, \alpha_e \qquad (14.8)$$

with

$$N^3_e(x) = a_0 + a_1 x + a_2 x^2. \qquad (14.9)$$

We need to find the coefficients ai. Note that we must have

$$N^3_e(0) = N^3_e(\Delta x) = 0 \quad \Rightarrow \quad N^3_e(x) = c\, \frac{x}{\Delta x} \left(1 - \frac{x}{\Delta x}\right) \qquad (14.10)$$

with some constant c ≠ 0.

Note that αe does not have to be continuous across elements and can hence be determined locally (i.e., given u^1_e and u^2_e, αe can be determined internally for each element, which allows for condensation of the αe-dof).

Example of higher order interpolation: 2-node beam element

Linear elastic Euler-Bernoulli beams are a most common structural element. From the variational form (or the strong form, EI w^(4)(x) = q(x) for statics, which is of 4th order in the deflection w(x)) we know that p = 2. Therefore, we must have q ≥ 2, and w ∈ H^2(Ω).

The simplest admissible interpolation is based on 1, x, x2, x3, so

$$w^h_e(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3. \qquad (14.11)$$

We need four dofs, so we pick two nodes and assign to each node a deflection w and angle θ = w′:

$$w^h_e(x) = \sum_{i=1}^{2} \left[N^{i1}_e(x)\, w^i_e + N^{i2}_e(x)\, \theta^i_e\right] \qquad (14.12)$$

and we must have

$$w^h_e(0) = w^1_e, \quad w^h_e(\Delta x) = w^2_e, \quad (w^h_e)'(0) = \theta^1_e, \quad (w^h_e)'(\Delta x) = \theta^2_e. \qquad (14.13)$$

The resulting shape functions are known as Hermitian polynomials (and can be found in textbooks and the notes).

Note that this is only one possible choice; we could also define alternative nodes and nodal values. However, the above choice ensures that both deflection and angle are continuous across elements.
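For reference, the sketch below writes out the standard cubic Hermite polynomials that satisfy the four conditions in (14.13), using the normalized coordinate s = x/∆x. The ordering of the returned list (w at node 1, θ at node 1, w at node 2, θ at node 2) and the function names are choices made for this example.

```python
def hermite_shape(x, dx):
    """Cubic Hermite shape functions [N^11, N^12, N^21, N^22] on an element of length dx."""
    s = x / dx
    return [1 - 3 * s**2 + 2 * s**3,      # multiplies w at node 1
            dx * (s - 2 * s**2 + s**3),   # multiplies theta at node 1
            3 * s**2 - 2 * s**3,          # multiplies w at node 2
            dx * (-s**2 + s**3)]          # multiplies theta at node 2


def hermite_shape_dx(x, dx):
    """First derivatives of the four shape functions with respect to x."""
    s = x / dx
    return [(-6 * s + 6 * s**2) / dx,
            1 - 4 * s + 3 * s**2,
            (6 * s - 6 * s**2) / dx,
            -2 * s + 3 * s**2]


dx = 2.0                                   # example element length
values_left, slopes_left = hermite_shape(0.0, dx), hermite_shape_dx(0.0, dx)
values_right, slopes_right = hermite_shape(dx, dx), hermite_shape_dx(dx, dx)
```

Evaluating at both nodes shows that each shape function picks out exactly one of the four nodal values, which is the beam analogue of the Kronecker property.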




Problem: in higher dimensions it will be cumbersome to define polynomial shape functions on the actual shape of the element, unless one uses regular structured meshes (e.g., grids). In general, all elements have different shapes and it is beneficial to define shape functions independent of the specific element shape.

To this end, we introduce a (bijective) isoparametric mapping φ from a reference domain (with reference coordinates ξ = {ξ, η, ζ}) onto the physical domain (with coordinates x = {x, y, z}) of an element e:

x = φ(ξ), i.e. in 3D: x = x(ξ, η, ζ), y = y(ξ, η, ζ), z = z(ξ, η, ζ). (14.14)

For simplicity, we reuse the interpolation concepts from before:

$$\begin{aligned}
\text{so far:} \quad u^h_e &= \sum_{i=1}^{n} N^i_e(x)\, u^i_e \\
\text{now:} \quad x &= \sum_{i=1}^{m} \tilde{N}^i_e(\xi)\, x^i_e,
\end{aligned} \qquad (14.15)$$

where we have three options for the mapping:

(i) isoparametric: n = m and N^i_e = Ñ^i_e (same interpolation)

(ii) subparametric: n > m (lower-order interpolation of positions)

(iii) superparametric: n < m (higher-order interpolation of positions)

The strategy is now to define N ie(ξ) in the reference configuration.

Example: 4-node bilinear quadrilateral (Q4)

By definition, Ωe = [−1, 1]^2 and node numbering starts in the bottom left corner and is counterclockwise.

Shape functions must satisfy N^i_e(ξj, ηj) = δij and can be obtained from the 2-node bar which has:

$$\text{2-node bar:} \quad N^1_e(\xi) = \frac{1}{2}(1 - \xi), \qquad N^2_e(\xi) = \frac{1}{2}(1 + \xi). \qquad (14.16)$$

It can easily be verified that N^1_e(−1) = 1, N^1_e(1) = 0, and N^2_e(−1) = 0, N^2_e(1) = 1.

The 2D element shape functions thus follow from combining the above in the ξ and η directions:

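$$\text{Q4 element:} \quad \begin{aligned}
N^1_e(\xi, \eta) &= N^1_e(\xi)\, N^1_e(\eta) = \frac{1}{4}(1 - \xi)(1 - \eta), \\
N^2_e(\xi, \eta) &= N^2_e(\xi)\, N^1_e(\eta) = \frac{1}{4}(1 + \xi)(1 - \eta), \\
N^3_e(\xi, \eta) &= N^2_e(\xi)\, N^2_e(\eta) = \frac{1}{4}(1 + \xi)(1 + \eta), \\
N^4_e(\xi, \eta) &= N^1_e(\xi)\, N^2_e(\eta) = \frac{1}{4}(1 - \xi)(1 + \eta).
\end{aligned} \qquad (14.17)$$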



One can easily verify that N^i_e(ξj, ηj) = δij and ∑_{i=1}^4 N^i_e(ξ, η) = 1 for all (ξ, η).

The isoparametric mapping in 2D now means

$$x = \sum_{i=1}^{4} N^i_e(\xi, \eta)\, x^i_e, \qquad y = \sum_{i=1}^{4} N^i_e(\xi, \eta)\, y^i_e. \qquad (14.18)$$

Notice that this implies straight edges (in the reference configuration) remain straight (in the actual mesh). For example, take the bottom edge (η = −1): the interpolation along this edge is x = ∑_{i=1}^4 N^i_e(ξ, −1) x^i_e and y = ∑_{i=1}^4 N^i_e(ξ, −1) y^i_e, which are both linear in ξ. Therefore, this element has straight edges in physical space.

Remarks:

completeness up to only q = 1 is given by {1, ξ, η, ξη}. This means we must be able to represent solutions

uh(x, y) = c0 + c1x + c2y (14.19)

exactly. Check:

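$$\begin{aligned}
u^h(x, y) &= \sum_{i=1}^{4} N^i_e u^i_e = \sum_{i=1}^{4} N^i_e\, u^h(x_i, y_i) = \sum_{i=1}^{4} N^i_e (c_0 + c_1 x_i + c_2 y_i) \\
&= \Big(\sum_{i=1}^{4} N^i_e\Big) c_0 + c_1 \sum_{i=1}^{4} N^i_e x_i + c_2 \sum_{i=1}^{4} N^i_e y_i = c_0 + c_1 x + c_2 y.
\end{aligned} \qquad (14.20)$$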

integrability: Note that this interpolation scheme ensures that uh is continuous across elements. To see this, notice that on any element edge only those two shape functions are non-zero whose nodes are on that edge while the others are zero.

There is one particular difficulty with isoparametric elements:

To compute force vectors, etc., we need shape function derivatives N^i_{,x} and N^i_{,y} but the shape functions were defined as N^i(ξ, η), so only N^i_{,ξ} and N^i_{,η} are known.

Let us use x = x(ξ, η) and y = y(ξ, η) so the chain rule gives

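$$\begin{pmatrix} u_{,\xi} \\ u_{,\eta} \end{pmatrix} = \begin{pmatrix} u_{,x}\, x_{,\xi} + u_{,y}\, y_{,\xi} \\ u_{,x}\, x_{,\eta} + u_{,y}\, y_{,\eta} \end{pmatrix} = \begin{pmatrix} x_{,\xi} & y_{,\xi} \\ x_{,\eta} & y_{,\eta} \end{pmatrix} \begin{pmatrix} u_{,x} \\ u_{,y} \end{pmatrix} = \mathbf{J} \begin{pmatrix} u_{,x} \\ u_{,y} \end{pmatrix} \qquad (14.21)$$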

with the Jacobian matrix

$$\mathbf{J} = \begin{pmatrix} x_{,\xi} & y_{,\xi} \\ x_{,\eta} & y_{,\eta} \end{pmatrix} \qquad (14.22)$$

so that

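$$J = \det \mathbf{J} = \frac{\partial x}{\partial \xi} \frac{\partial y}{\partial \eta} - \frac{\partial x}{\partial \eta} \frac{\partial y}{\partial \xi}. \qquad (14.23)$$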

Note that, like for the deformation mapping, for the isoparametric mapping to be invertible we need to have J > 0. This implies that elements cannot be distorted (inverted or non-convex).



Using our isoparametric mapping, we obtain

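$$\mathbf{J} = \begin{pmatrix} \sum_{i=1}^{4} N^i_{e,\xi}\, x^i_e & \sum_{i=1}^{4} N^i_{e,\xi}\, y^i_e \\ \sum_{i=1}^{4} N^i_{e,\eta}\, x^i_e & \sum_{i=1}^{4} N^i_{e,\eta}\, y^i_e \end{pmatrix}. \qquad (14.24)$$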

This solves the problem. As discussed, we need to have J > 0 so that we can invert J to arrive at (by the inverse function theorem)

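$$\begin{pmatrix} u_{,x} \\ u_{,y} \end{pmatrix} = \mathbf{J}^{-1} \begin{pmatrix} u_{,\xi} \\ u_{,\eta} \end{pmatrix} = \mathbf{J}^{-1} \begin{pmatrix} \sum_{i=1}^{4} N^i_{e,\xi}\, u^i_e \\ \sum_{i=1}^{4} N^i_{e,\eta}\, u^i_e \end{pmatrix} \quad \text{but also} \quad \begin{pmatrix} u_{,x} \\ u_{,y} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{4} N^i_{e,x}\, u^i_e \\ \sum_{i=1}^{4} N^i_{e,y}\, u^i_e \end{pmatrix}. \qquad (14.25)$$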

By equating these two and comparing the coefficients of uie we thus obtain

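$$\begin{pmatrix} N^i_{e,x} \\ N^i_{e,y} \end{pmatrix} = \mathbf{J}^{-1} \begin{pmatrix} N^i_{e,\xi} \\ N^i_{e,\eta} \end{pmatrix} \quad \text{and more generally:} \quad \nabla_x N^i_e = \mathbf{J}^{-1} \nabla_\xi N^i_e \qquad (14.26)$$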

This is generally applicable for any isoparametric mapping.
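The chain of relations (14.17), (14.24) and (14.26) can be sketched for the Q4 element as follows. The function names and the example node coordinates are assumptions made for illustration; nodes are numbered counterclockwise starting at the bottom left corner, as in the text.

```python
def q4_shape_ref_grad(xi, eta):
    """Reference derivatives (dN/dxi, dN/deta) of the four Q4 shape functions (14.17)."""
    return [(-0.25 * (1 - eta), -0.25 * (1 - xi)),
            ( 0.25 * (1 - eta), -0.25 * (1 + xi)),
            ( 0.25 * (1 + eta),  0.25 * (1 + xi)),
            (-0.25 * (1 + eta),  0.25 * (1 - xi))]


def q4_jacobian(xi, eta, coords):
    """Jacobian matrix (14.24); coords is a list of four (x, y) node positions."""
    dN = q4_shape_ref_grad(xi, eta)
    x_xi = sum(dN[i][0] * coords[i][0] for i in range(4))
    y_xi = sum(dN[i][0] * coords[i][1] for i in range(4))
    x_eta = sum(dN[i][1] * coords[i][0] for i in range(4))
    y_eta = sum(dN[i][1] * coords[i][1] for i in range(4))
    return [[x_xi, y_xi], [x_eta, y_eta]]


def q4_physical_grad(xi, eta, coords):
    """Physical derivatives (dN/dx, dN/dy) via (14.26), together with det J."""
    J = q4_jacobian(xi, eta, coords)
    det_J = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    if det_J <= 0.0:
        raise ValueError("non-positive Jacobian determinant: element is inverted or degenerate")
    inv = [[J[1][1] / det_J, -J[0][1] / det_J],
           [-J[1][0] / det_J, J[0][0] / det_J]]
    grads = [(inv[0][0] * g[0] + inv[0][1] * g[1],
              inv[1][0] * g[0] + inv[1][1] * g[1]) for g in q4_shape_ref_grad(xi, eta)]
    return grads, det_J


coords = [(0.0, 0.0), (2.0, 0.0), (2.5, 1.5), (0.2, 1.0)]      # example convex quadrilateral
nodal_u = [1.0 + 2.0 * x - 3.0 * y for x, y in coords]          # linear field u = 1 + 2x - 3y
grads, det_J = q4_physical_grad(0.3, -0.4, coords)
u_x = sum(g[0] * u for g, u in zip(grads, nodal_u))
u_y = sum(g[1] * u for g, u in zip(grads, nodal_u))
```

Because the element reproduces linear fields exactly, cf. (14.20), the interpolated gradient recovers the coefficients 2 and −3 at any point of the element.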

As a simple example, recall the 2-node bar element whose shape functions we computed as

$$N^1_e(x) = 1 - \frac{x}{\Delta x}, \qquad N^2_e(x) = \frac{x}{\Delta x}. \qquad (14.27)$$

For a reference bar element with nodes at ξ = ±1, the analogous shape functions in 1D read

$$N^1_e(\xi) = \frac{1 - \xi}{2}, \qquad N^2_e(\xi) = \frac{1 + \xi}{2}. \qquad (14.28)$$

Applying the above relations to the 1D problem gives

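$$J = \frac{\partial x}{\partial \xi} = \frac{\partial N^1_e}{\partial \xi}\, x^1_e + \frac{\partial N^2_e}{\partial \xi}\, x^2_e = \frac{x^2_e - x^1_e}{2} = \frac{\Delta x}{2}. \qquad (14.29)$$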

This confirms that indeed

$$N^i_{e,x} = J^{-1} N^i_{e,\xi} = \frac{2}{\Delta x}\, N^i_{e,\xi}. \qquad (14.30)$$

A useful relation to evaluate area integrals in the following is (e1,e2 being reference unit vectors)

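$$\mathrm{d}\mathbf{A} = \mathrm{d}\mathbf{x} \times \mathrm{d}\mathbf{y} = \left(\frac{\partial x}{\partial \xi}\, \mathrm{d}\xi\, e_1 + \frac{\partial x}{\partial \eta}\, \mathrm{d}\eta\, e_2\right) \times \left(\frac{\partial y}{\partial \xi}\, \mathrm{d}\xi\, e_1 + \frac{\partial y}{\partial \eta}\, \mathrm{d}\eta\, e_2\right) = J\, \mathrm{d}\xi\, \mathrm{d}\eta\; e_1 \times e_2 \qquad (14.31)$$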

so that

dA = ∣dA∣ = J dξ dη (14.32)



Extension to higher-order elements:

The 9-node quadratic quadrilateral (Q9) derives its shape functions from applying the 1D shape functions of the 3-node bar (using Lagrangian interpolation) to the 2D case. For example,

$$N^1(\xi, \eta) = \frac{\xi(1 - \xi)\, \eta(1 - \eta)}{4}, \qquad (14.33)$$

so that overall the interpolation includes monomials

1, ξ, η, ξη, ξ2, η2, ξ2η, ξη2, ξ2η2, (14.34)

which is complete up to order q = 2 (quadratic).

Since the above element includes more polynomial terms than required for quadratic interpolation, one can construct elements with fewer nodes, e.g., the 8-node quadratic quadrilateral (Q8), also known as the serendipity element.

Shape functions are constructed as follows:

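$$\begin{aligned}
N^5(\xi, \eta) &= \frac{(1 - \eta)(1 - \xi^2)}{2}, \\
N^6(\xi, \eta) &= \frac{(1 - \eta^2)(1 + \xi)}{2}, \\
N^1(\xi, \eta) &= \frac{(1 - \eta)(1 - \xi)}{4} - \frac{1}{2}\left(N^5 + N^8\right), \quad \text{etc.}
\end{aligned} \qquad (14.35)$$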

Extension to three dimensions:

The same procedure can be applied to 3D, resulting in the 8-node brick element. The reference coordinates (ξ, η, ζ) are linked to the physical coordinates (x, y, z) via the shape functions

$$N^1_e(\xi, \eta, \zeta) = \frac{1}{8}(1 - \xi)(1 - \eta)(1 - \zeta), \quad \dots \qquad (14.36)$$

again with

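$$\nabla_x N^i_e = \mathbf{J}^{-1} \nabla_\xi N^i_e \quad \text{and} \quad \mathrm{d}V = J\, \mathrm{d}\xi\, \mathrm{d}\eta\, \mathrm{d}\zeta \qquad (14.37)$$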



15 Numerical quadrature

The finite element method frequently requires computing integrals, e.g., for the internal/external force vectors. Since these cannot be integrated analytically in general, we need numerical integration schemes.

Consider the integral

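$$ I[u] = \int_a^b u(x)\,dx. \tag{15.1} $$

For convenience, let us introduce the shift

$$ \xi = 2\,\frac{x-a}{b-a} - 1 \quad\Rightarrow\quad \xi(x=a) = -1, \quad \xi(x=b) = 1, \quad d\xi = \frac{2\,dx}{b-a} \tag{15.2} $$

and $x = \frac{\xi+1}{2}(b-a) + a$, so that

$$ I[u] = \int_{-1}^{1} f(\xi)\,d\xi \quad\text{where}\quad f(\xi) = \frac{b-a}{2}\,u(x(\xi)). \tag{15.3} $$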

The goal is now to approximate the integral in (15.3) numerically.

15.1 Example: Riemann sums

Consider a partition with n + 1 nodes:

$$ P = \{[\xi_0,\xi_1],\ [\xi_1,\xi_2],\ \ldots,\ [\xi_{n-1},\xi_n]\} \quad\text{such that}\quad -1 = \xi_0 < \xi_1 < \xi_2 < \ldots < \xi_n = 1. \tag{15.4} $$

The Riemann sum makes use of the approximation

$$ I \approx S = \sum_{i=1}^{n} f(\xi^*_i)\,(\xi_i - \xi_{i-1}) \quad\text{with}\quad \xi_{i-1} \le \xi^*_i \le \xi_i. \tag{15.5} $$

Different choices of $\xi^*_i$:

(i) left Riemann sum: $\xi^*_i = \xi_{i-1}$

(ii) right Riemann sum: $\xi^*_i = \xi_i$

(iii) middle Riemann sum: $\xi^*_i = \frac{1}{2}(\xi_{i-1} + \xi_i)$

(iv) trapezoidal sum: average of left and right

(v) upper Riemann sum: $\xi^*_i$ s.t. $f(\xi^*_i) = \sup_{\xi\in[\xi_{i-1},\xi_i]} f(\xi)$

(vi) lower Riemann sum: $\xi^*_i$ s.t. $f(\xi^*_i) = \inf_{\xi\in[\xi_{i-1},\xi_i]} f(\xi)$

More refined formulations can be found in the so-called Newton-Cotes formulas (the trapezoidal rule is the Newton-Cotes formula of degree 1).
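The Riemann-type sums listed above can be compared numerically. The following is a minimal sketch, assuming a uniform partition of [−1, 1] and using the exponential function as an arbitrary smooth test integrand; the function name `riemann_sums` is a hypothetical helper, not part of the notes.

```python
import numpy as np

def riemann_sums(f, n):
    """Approximate the integral of f over [-1, 1] on a uniform partition with n intervals."""
    xi = np.linspace(-1.0, 1.0, n + 1)           # nodes xi_0 < xi_1 < ... < xi_n
    h = np.diff(xi)                              # interval lengths xi_i - xi_{i-1}
    left = np.sum(f(xi[:-1]) * h)                # xi*_i = xi_{i-1}
    right = np.sum(f(xi[1:]) * h)                # xi*_i = xi_i
    middle = np.sum(f(0.5 * (xi[:-1] + xi[1:])) * h)
    trapezoid = 0.5 * (left + right)             # average of left and right sums
    return {"left": left, "right": right, "middle": middle, "trapezoid": trapezoid}

# Assumed test case: f(xi) = exp(xi), whose exact integral over [-1, 1] is e - 1/e.
sums = riemann_sums(np.exp, 200)
exact = np.exp(1.0) - np.exp(-1.0)
for name, value in sums.items():
    print(f"{name:10s} error = {abs(value - exact):.3e}")
```

For a monotonically increasing integrand such as this one, the left and right sums bracket the exact value, while the middle and trapezoidal sums are markedly more accurate for the same number of function evaluations.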

15.2 Gauss quadrature

A much more efficient alternative is provided by quadrature rules (also called cubature rules in 3D), which are best suited for polynomial functions:

$$ I[u] = \int_{-1}^{1} f(\xi)\,d\xi \approx \sum_{i=0}^{n_{QP}-1} W_i\,f(\xi_i). \tag{15.6} $$

Now, we need to choose $n_{QP}$, the weights $W_0, \ldots, W_{n_{QP}-1}$, and the points $\xi_0, \ldots, \xi_{n_{QP}-1}$. The choice should depend on the function to be integrated (more specifically, on its smoothness). Note that most functions of interest will be of polynomial type.

We say a quadrature rule is exact of order q if it integrates exactly all polynomial functions $g \in P_q([-1,1])$. Gauss quadrature generally chooses $n_{QP}$ quadrature points and associated weights such that the quadrature rule is exact of order $q = 2n_{QP} - 1$.

15.2.1 Gauss-Legendre quadrature

Gauss-Legendre quadrature selects the nodes and weights such that the first $2n_{QP}$ moments are computed exactly:

$$ \mu_k = \int_{-1}^{1} \xi^k\,d\xi \overset{!}{=} \sum_{i=0}^{n_{QP}-1} W_i\,\xi_i^k, \qquad k = 0, 1, \ldots, 2n_{QP}-1. \tag{15.7} $$

These are $2n_{QP}$ equations for the $2n_{QP}$ free parameters $(W_i, \xi_i)$ for $i = 0, \ldots, n_{QP}-1$. The equations are generally nonlinear and thus hard to solve analytically.

Let us compute the Gauss-Legendre weights and points for the lowest few orders in 1D:

a single quadrature point, i.e., nQP = 1:

Two equations for two unknowns:

$$ \int_{-1}^{1} \xi^0\,d\xi = 2 = W_0\,\xi_0^0 = W_0 \quad\text{and}\quad \int_{-1}^{1} \xi^1\,d\xi = 0 = W_0\,\xi_0^1 = W_0\,\xi_0, \tag{15.8} $$

so that the first-order quadrature rule is given by

W0 = 2, ξ0 = 0 (15.9)

Since linear functions are integrated exactly, this quadrature rule is exact to order q = 1.

two quadrature points, i.e., nQP = 2:

In close analogy, we now have

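$$ \begin{aligned} \int_{-1}^{1} \xi^0\,d\xi &= 2 = W_0 + W_1, & \int_{-1}^{1} \xi^1\,d\xi &= 0 = W_0\,\xi_0 + W_1\,\xi_1, \\ \int_{-1}^{1} \xi^2\,d\xi &= \frac{2}{3} = W_0\,\xi_0^2 + W_1\,\xi_1^2, & \int_{-1}^{1} \xi^3\,d\xi &= 0 = W_0\,\xi_0^3 + W_1\,\xi_1^3. \end{aligned} \tag{15.10} $$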

A simple solution can be found for symmetric quadrature points ξ0 = −ξ1:

$$ W_0 = W_1 = 1, \qquad \xi_0 = -\frac{1}{\sqrt{3}}, \qquad \xi_1 = \frac{1}{\sqrt{3}}. \tag{15.11} $$

This quadrature rule is exact to order q = 3 (cubic polynomials are integrated exactly).

higher-order quadrature rules:

Quadrature weights and points for arbitrary order can be obtained in analogous fashion and, most importantly, can be found in numerous look-up tables (see notes and textbooks). However, there is a better, systematic way to compute Gauss-Legendre quadrature weights and points.

Note that the monomials $1, \xi, \xi^2, \xi^3, \ldots$, although complete, are not orthogonal basis functions. We can turn them into orthogonal polynomials $P_n(\xi)$ by, e.g., the Gram-Schmidt orthogonalization procedure. To this end, let us start with

P0(ξ) = 1 (15.12)

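and obtain the next basis function by starting with the linear monomial ξ and computing

$$ P_1(\xi) = \xi - \frac{\langle 1, \xi\rangle}{\langle 1, 1\rangle}\,1 = \xi, \tag{15.13} $$

where we used the inner product

$$ \langle u, v\rangle = \int_{-1}^{1} u(\xi)\,v(\xi)\,d\xi. \tag{15.14} $$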

Similarly, the next higher basis function is obtained by starting from ξ2, so that

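$$ P_2(\xi) = \xi^2 - \frac{\langle \xi, \xi^2\rangle}{\langle \xi, \xi\rangle}\,\xi - \frac{\langle 1, \xi^2\rangle}{\langle 1, 1\rangle}\,1 = \xi^2 - \frac{1}{3}. \tag{15.15} $$

Analogously, one finds

$$ P_3(\xi) = \xi^3 - \frac{3}{5}\,\xi. \tag{15.16} $$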

By continuing analogously, we create a countably infinite set of orthogonal basis functions $P_n(\xi)$ such that

$$ \int_{-1}^{1} P_n(\xi)\,P_m(\xi)\,d\xi = 0 \quad\text{if } n \ne m. \tag{15.17} $$

These polynomials are known as Legendre polynomials. Note that they are defined only up to a constant, so one can renormalize them, which is commonly done by enforcing that $P_n(1) = 1$ for all n. The result is the well-known Legendre polynomials, which can alternatively be defined via

$$ P_n(\xi) = \frac{1}{2^n\,n!}\,\frac{d^n}{d\xi^n}\left[(\xi^2 - 1)^n\right]. \tag{15.18} $$

These polynomials have another interesting feature, viz. by orthogonality with $P_0(\xi) = 1$ we know that

$$ \int_{-1}^{1} P_n(\xi)\,d\xi = \langle 1, P_n\rangle = \begin{cases} 2, & \text{if } n = 0, \\ 0, & \text{else.} \end{cases} \tag{15.19} $$

Pn(ξ) has exactly n roots in the interval [−1,1]. Also, for n ≠ 0 we know that

$$ P_n(\xi) = -P_n(-\xi) \ \text{ for odd } n, \qquad P_n(\xi) = P_n(-\xi) \ \text{ for even } n. \tag{15.20} $$

Moreover, Pn(0) = 0 for odd n.

With this new set of basis functions, we can define the Gauss-Legendre quadrature rule to enforce

$$ \int_{-1}^{1} P_k(\xi)\,d\xi \overset{!}{=} \sum_{i=0}^{n_{QP}-1} W_i\,P_k(\xi_i), \qquad k = 0, 1, \ldots, 2n_{QP}-1. \tag{15.21} $$

If nQP = 1, then the solution is simple because the above equations simplify to

W0 = 2 and 0 =W0P1(ξ0). (15.22)

Therefore, the weight is, as before, W0 = 2 and the quadrature point is the root of P1(ξ), viz. ξ0 = 0.

If nQP = 2, then the four equations to be solved are

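$$ \begin{aligned} W_0 + W_1 &= 2 & &\text{and} & 0 &= W_0\,P_1(\xi_0) + W_1\,P_1(\xi_1), \\ 0 &= W_0\,P_2(\xi_0) + W_1\,P_2(\xi_1) & &\text{and} & 0 &= W_0\,P_3(\xi_0) + W_1\,P_3(\xi_1). \end{aligned} \tag{15.23} $$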

By analogy, we choose the quadrature points to be the roots of P2(ξ), so that

$$ P_2(\xi_0) = P_2(\xi_1) = 0 \quad\Rightarrow\quad \xi_0 = \frac{1}{\sqrt{3}}, \qquad \xi_1 = -\frac{1}{\sqrt{3}}. \tag{15.24} $$

Using $P_n(\xi) = -P_n(-\xi)$ for odd n, the above equations reduce to

W0 =W1 = 1. (15.25)

Further, note that $P_n(0) = 0$ for odd n. Therefore, the same procedure can be continued as follows. For an arbitrary number of quadrature points, $n_{QP}$, the Gauss-Legendre quadrature points and associated weights are defined by

$$ P_{n_{QP}}(\xi_i) = 0, \qquad w_i = \frac{2}{(1-\xi_i^2)\,[P'_{n_{QP}}(\xi_i)]^2}, \qquad i = 0, \ldots, n_{QP}-1. \tag{15.26} $$

As a check, take, e.g., $n_{QP} = 1$ so that $P_1(\xi) = \xi$ with root $\xi_0 = 0$. The weight is computed as

$$ w_0 = \frac{2}{(1-\xi_0^2)\,[P'_1(\xi_0)]^2} = 2, \tag{15.27} $$

as determined above. Similarly, for $n_{QP} = 2$ we have $P_2(\xi) = \frac{1}{2}(3\xi^2 - 1)$ with the above roots of $\pm 1/\sqrt{3}$. The associated weights are computed as

$$ w_0 = \frac{2}{(1-\xi_0^2)\,[P'_2(\xi_0)]^2} = \frac{2}{(1-\xi_0^2)\,[3\xi_0]^2} = \frac{2}{\frac{2}{3}\cdot\frac{3^2}{3}} = 1 = w_1, \tag{15.28} $$

which agrees with our prior solution.
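The recipe (15.26) translates almost directly into code. The following is a minimal sketch, assuming NumPy's `numpy.polynomial.legendre` module for the Legendre polynomials (normalized such that $P_n(1) = 1$); the function name `gauss_legendre` is a hypothetical helper, and NumPy's own `leggauss` is used only as an independent cross-check.

```python
import numpy as np
from numpy.polynomial import legendre as L

def gauss_legendre(n_qp):
    """Quadrature points = roots of P_n; weights from w_i = 2 / ((1 - xi_i^2) P_n'(xi_i)^2)."""
    P = L.Legendre.basis(n_qp)              # Legendre polynomial P_n
    xi = np.sort(P.roots().real)            # the n roots lie inside (-1, 1)
    dP = P.deriv()                          # derivative P_n'
    w = 2.0 / ((1.0 - xi**2) * dP(xi)**2)
    return xi, w

# Cross-check against NumPy's built-in rule for a few orders.
for n in range(1, 6):
    xi, w = gauss_legendre(n)
    xi_ref, w_ref = L.leggauss(n)
    print(n, np.allclose(xi, xi_ref), np.allclose(w, w_ref))

# Two points integrate a cubic exactly: the integral of xi^3 + xi^2 over [-1, 1] is 2/3.
xi2, w2 = gauss_legendre(2)
I_cubic = np.sum(w2 * (xi2**3 + xi2**2))
```

In practice one would call a library routine or a look-up table directly; the sketch is only meant to show that roots and weights follow from the Legendre polynomials as stated.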

15.2.2 Other Gauss quadrature rules

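Note that, for general functions f, one can sometimes find a decomposition $f(\xi) = w(\xi)\,g(\xi)$, where $w(\cdot)$ is a known weighting function and $g(\xi)$ is (approximately) polynomial, so that a more suitable quadrature rule may be found via

$$ I[u] = \int_{-1}^{1} f(\xi)\,d\xi = \int_{-1}^{1} w(\xi)\,g(\xi)\,d\xi \approx \sum_{i=0}^{n_{QP}-1} W_i\,g(\xi_i), \tag{15.29} $$

where the quadrature weights $W_i$ account for the weighting function $w(\xi)$.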

Examples of such Gaussian quadrature rules include those of Gauss-Chebyshev type, which are obtained from a weighting function $w(\xi) = (1-\xi^2)^{-1/2}$, and the quadrature points are the roots of Chebyshev polynomials. Gauss-Hermite quadrature uses a weighting function $w(\xi) = \exp(-\xi^2)$ (and the integral is taken over the entire real axis). Gauss-Legendre quadrature is included as the special case $w(\xi) = 1$.

Another popular alternative (less so for FE, though) is Gauss-Lobatto quadrature, which includes the interval end points as quadrature points and is accurate for polynomials up to degree $2n_{QP} - 3$, viz.

$$ I[u] = \int_{-1}^{1} f(\xi)\,d\xi \approx \frac{2}{n_{QP}(n_{QP}-1)}\,[f(-1) + f(1)] + \sum_{i=2}^{n_{QP}-1} W_i\,f(\xi_i). \tag{15.30} $$


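Like the polynomial shape functions, the above quadrature rules can easily be extended to 2D and 3D, e.g.,

$$ \begin{aligned} \int_{-1}^{1}\int_{-1}^{1} f(\xi,\eta)\,d\xi\,d\eta &\approx \int_{-1}^{1} \left[\sum_{i=0}^{N-1} W_i\,f(\xi_i,\eta)\right] d\eta \approx \sum_{j=0}^{N-1} W_j \left[\sum_{i=0}^{N-1} W_i\,f(\xi_i,\eta_j)\right] \\ &= \sum_{k=0}^{n_{QP}-1} W^*_k\,f(\xi_k,\eta_k) \end{aligned} \tag{15.31} $$

with the combined weights $W^*_k = W_i W_j$ and points $(\xi_k,\eta_k) = (\xi_i,\eta_j)$ obtained from the individual quadrature rules in each direction. By symmetry we choose $N = \sqrt{n_{QP}}$ so that $N^2 = n_{QP}$.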

For example, consider the Q4 element. By reusing the 1D Gauss-Legendre weights and points, we now have:

first-order quadrature (q = 1), as in 1D, has only a single quadrature point ($n_{QP} = 1$):

$$ W_0 = 4 \quad\text{and}\quad (\xi_0,\eta_0) = (0,0), \tag{15.32} $$

where the weight is the product of the two 1D weights, $W_0 = 2 \cdot 2$. Bilinear functions (at most linear in ξ and η) are integrated exactly with this rule.

third-order quadrature (q = 3) now has four quadrature points ($n_{QP} = 2^2 = 4$):

$$ W_0 = W_1 = W_2 = W_3 = 1 \quad\text{and}\quad (\xi_i,\eta_i) = \left(\pm\frac{1}{\sqrt{3}},\ \pm\frac{1}{\sqrt{3}}\right). \tag{15.33} $$

Bicubic polynomial functions (at most cubic in ξ and η) are integrated exactly.
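The tensor-product construction of (15.31) can be sketched in a few lines. This is a minimal illustration, assuming NumPy's `leggauss` for the 1D rule; the helper names `gauss_2d` and `integrate_2d` are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def gauss_2d(n):
    """Tensor-product Gauss-Legendre rule on [-1, 1]^2 with n points per direction."""
    xi, w = leggauss(n)                               # 1D points and weights
    XI, ETA = np.meshgrid(xi, xi, indexing="ij")      # all pairs (xi_i, eta_j)
    W = np.outer(w, w)                                # combined weights W*_k = W_i W_j
    return np.column_stack([XI.ravel(), ETA.ravel()]), W.ravel()

def integrate_2d(f, n):
    """Approximate the integral of f(xi, eta) over the reference square."""
    pts, W = gauss_2d(n)
    return np.sum(W * f(pts[:, 0], pts[:, 1]))

# One point: the weights sum to the area of the reference square (= 4).
area = integrate_2d(lambda xi, eta: np.ones_like(xi), 1)

# 2 x 2 points: a bicubic test integrand is integrated exactly (the exact value is 4/9).
I_bicubic = integrate_2d(lambda xi, eta: xi**3 * eta**2 + xi**2 * eta**2, 2)
```

The same pattern extends to 3D by taking a triple product of the 1D weights and points.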

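Similarly, the brick element in 3D uses Gauss-Legendre quadrature as follows:

first-order quadrature (q = 1) still has a single quadrature point ($n_{QP} = 1$):

$$ W_0 = 8 \quad\text{and}\quad (\xi_0,\eta_0,\zeta_0) = (0,0,0), \tag{15.34} $$

with the weight being the product of the three 1D weights, $W_0 = 2 \cdot 2 \cdot 2$.

third-order quadrature (q = 3) now has eight quadrature points ($n_{QP} = 2^3 = 8$):

$$ W_i = 1 \quad\text{and}\quad (\xi_i,\eta_i,\zeta_i) = \left(\pm\frac{1}{\sqrt{3}},\ \pm\frac{1}{\sqrt{3}},\ \pm\frac{1}{\sqrt{3}}\right). \tag{15.35} $$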

15.4 Finite element implementation

The key integrals to be evaluated numerically are (e.g., in 2D)

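$$ F^a_{\text{int},i} = \int_{\Omega_e} \sigma_{ij}\,N^a_{,j}\,dV = \int_{-1}^{1}\int_{-1}^{1} \sigma_{ij}(\boldsymbol{\xi})\,N^a_{,j}(\boldsymbol{\xi})\,J(\boldsymbol{\xi})\,d\xi\,d\eta \approx \sum_{k=1}^{n_{QP}} W_k\,\sigma_{ij}(\boldsymbol{\xi}_k)\,N^a_{,j}(\boldsymbol{\xi}_k)\,J(\boldsymbol{\xi}_k) \tag{15.36} $$

and

$$ \begin{aligned} T^{ab}_{ik} = \int_{\Omega_e} C_{ijkl}\,N^a_{,j}\,N^b_{,l}\,dV &= \int_{-1}^{1}\int_{-1}^{1} C_{ijkl}(\boldsymbol{\xi})\,N^a_{,j}(\boldsymbol{\xi})\,N^b_{,l}(\boldsymbol{\xi})\,J(\boldsymbol{\xi})\,d\xi\,d\eta \\ &\approx \sum_{k=1}^{n_{QP}} W_k\,C_{ijkl}(\boldsymbol{\xi}_k)\,N^a_{,j}(\boldsymbol{\xi}_k)\,N^b_{,l}(\boldsymbol{\xi}_k)\,J(\boldsymbol{\xi}_k). \end{aligned} \tag{15.37} $$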

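Notice for implementation purposes that $\sigma_{ij} N^a_{,j}$ is a simple matrix-vector multiplication so that, using the isoparametric mapping of Section 14.2,

$$ \boldsymbol{F}^a_{\text{int}} \approx \sum_{k=1}^{n_{QP}} W_k\,J(\boldsymbol{\xi}_k)\,\boldsymbol{\sigma}(\boldsymbol{\xi}_k)\,\nabla_x N^a(\boldsymbol{\xi}_k) \quad\text{with}\quad \nabla_x N^a = \boldsymbol{J}^{-1}\,\nabla_\xi N^a. \tag{15.38} $$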
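To make the quadrature loop of (15.37) concrete, the following is a minimal sketch for a linear elastic Q4 element. Several choices here are assumptions for illustration only and are not prescribed by the notes: Voigt notation with a 3×3 elasticity matrix `C`, plane-strain isotropic elasticity, the DOF ordering $(u^1_x, u^1_y, u^2_x, \ldots)$, and the helper names `q4_shape_gradients` and `q4_stiffness`.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def q4_shape_gradients(xi, eta):
    """Derivatives of the bilinear Q4 shape functions w.r.t. (xi, eta); shape (4, 2)."""
    s = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]], dtype=float)  # nodal reference coords
    dN_dxi = 0.25 * s[:, 0] * (1.0 + s[:, 1] * eta)
    dN_deta = 0.25 * s[:, 1] * (1.0 + s[:, 0] * xi)
    return np.column_stack([dN_dxi, dN_deta])

def q4_stiffness(X, C, n=2):
    """Element tangent matrix (8 x 8) for nodal coordinates X (4 x 2) and a 3 x 3
    Voigt elasticity matrix C, using n x n Gauss-Legendre points."""
    pts, wts = leggauss(n)
    T = np.zeros((8, 8))
    for xi, wi in zip(pts, wts):
        for eta, wj in zip(pts, wts):
            dN_ref = q4_shape_gradients(xi, eta)
            Jmat = dN_ref.T @ X                       # J_ij = d x_j / d xi_i
            detJ = np.linalg.det(Jmat)
            dN = np.linalg.solve(Jmat, dN_ref.T).T    # grad_x N^a = J^{-1} grad_xi N^a
            B = np.zeros((3, 8))                      # strain-displacement matrix (Voigt)
            B[0, 0::2] = dN[:, 0]                     # eps_xx
            B[1, 1::2] = dN[:, 1]                     # eps_yy
            B[2, 0::2] = dN[:, 1]                     # gamma_xy = u_x,y + u_y,x
            B[2, 1::2] = dN[:, 0]
            T += wi * wj * detJ * (B.T @ C @ B)
    return T

# Assumed material: plane strain, E = 1, nu = 0.3; rectangular element with a = 2, b = 1.
E, nu = 1.0, 0.3
C = E / ((1 + nu) * (1 - 2 * nu)) * np.array(
    [[1 - nu, nu, 0.0], [nu, 1 - nu, 0.0], [0.0, 0.0, (1 - 2 * nu) / 2]])
X = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])
T_full = q4_stiffness(X, C, n=2)      # 2 x 2 points
T_reduced = q4_stiffness(X, C, n=1)   # single point
```

With 2×2 points the matrix of this rectangular element has only the three zero eigenvalues associated with rigid-body motion, whereas the single-point rule leaves two additional zero-energy modes.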

15.5 Quadrature error estimates

Using numerical quadrature to approximate an exact integral introduces a quadrature error, which is bounded as follows. For a function $f(\xi) \in C^{k+1}(\Omega)$ we have that (without proof here)

$$ \left| \int_{-1}^{1} f(\xi)\,d\xi - \sum_{i=0}^{n_{QP}-1} W_i\,f(\xi_i) \right| \le C\,h^{2n_{QP}} \max_{\xi\in[-1,1]} \left\| f^{(2n_{QP})}(\xi) \right\| \tag{15.39} $$

with a constant C > 0. This shows that, as can be expected, the quadrature error decreases with decreasing mesh size h and with increasing smoothness of function f. The rate of convergence under h-refinement also depends on the smoothness of the function.

The exact error depends on the chosen quadrature rule. For example, for Gauss-Legendre quadrature an error estimate is given by

$$ e = \frac{2^{2n_{QP}+1}\,(n_{QP}!)^4}{(2n_{QP}+1)\,[(2n_{QP})!]^3} \max_{\xi\in[-1,1]} \left\| f^{(2n_{QP})}(\xi) \right\|. \tag{15.40} $$

15.6 Which quadrature rule to use?

Stresses, shape function derivatives, and Jacobians are not necessarily smooth polynomials. Thus, rather than finding the exact integration order for each element and constitutive model, we introduce a minimum required integration order for a particular element type.

Our minimum requirement is that an undistorted elastic element is integrated exactly. Thus, we define full integration as the order needed to integrate an undistorted, homogeneous, linear elastic element exactly. An element is undistorted if element angles are preserved or, in other words, if J = const.

For example, the 4-node quadrilateral (Q4) is undistorted if the physical element has the shape of a rectangle (side lengths a and b), so that

$$ J = \frac{ab}{4} = \text{const.} \tag{15.41} $$

Then, for a linear elastic Q4 element we have

$$ F^a_{\text{int},i} = \int_{-1}^{1}\int_{-1}^{1} C_{ijkl}\,\varepsilon_{kl}(\boldsymbol{\xi})\,N^a_{,j}(\boldsymbol{\xi})\,\frac{ab}{4}\,d\xi\,d\eta = \frac{ab}{4}\,C_{ijkl} \int_{-1}^{1}\int_{-1}^{1} \varepsilon_{kl}(\boldsymbol{\xi})\,N^a_{,j}(\boldsymbol{\xi})\,d\xi\,d\eta, \tag{15.42} $$

where $\boldsymbol{\varepsilon} = \operatorname{sym}(\operatorname{grad}\boldsymbol{u}^h)$ is at most linear (since the interpolation of $\boldsymbol{u}^h$ is bilinear), and $\nabla_x N^a$ is also at most linear for the same reason. Overall, the integrand is at most a quadratic polynomial, so that we need integration order q ≥ 2.

Recall that in 1D we showed that $q = 2n_{QP} - 1$, so that full integration of the Q4 element in 2D requires $n_{QP} = 2 \times 2 = 4$ quadrature points.

Analogously, full integration of the quadratic 2D elements Q8/Q9 requires $n_{QP} = 3^2 = 9$ quadrature points. Full integration of the 8-node brick element requires $n_{QP} = 8$ quadrature points.

Note that full integration guarantees not only that the internal force vector of an undistorted, elastic element is integrated exactly. By reviewing the element energy and the element tangent matrix, we make the same observation (i.e., those are integrated exactly as well):

$$ \begin{aligned} I_e &= \int_{\Omega_e} W\,dV = \frac{ab}{4}\,\frac{1}{2}\,C_{ijkl} \int_{-1}^{1}\int_{-1}^{1} \varepsilon_{ij}(\boldsymbol{\xi})\,\varepsilon_{kl}(\boldsymbol{\xi})\,d\xi\,d\eta, \\ T^{ab}_{ik} &= \frac{ab}{4}\,C_{ijkl} \int_{-1}^{1}\int_{-1}^{1} N^a_{,j}(\boldsymbol{\xi})\,N^b_{,l}(\boldsymbol{\xi})\,d\xi\,d\eta. \end{aligned} \tag{15.43} $$

Using an integration rule less than full integration is called under-integration; the opposite is called over-integration. Which integration order to use depends very much on the element, material model, etc. Sometimes under-integration can be beneficial (e.g., to avoid locking). We will not discuss these techniques here further.

16 Simplicial elements

A simplex of order k is a k-dimensional polytope which is the convex hull of its k + 1 vertices.

In plain English, a simplex of order k is a convex body made up of k + 1 nodes:

in 1D: a 2-node bar, interpolation basis is 1, x

in 2D: a 3-node triangle (T3), interpolation basis is 1, x, y

in 3D: a 4-node tetrahedron (T4), interpolation basis is 1, x, y, z

Overall, this shows that interpolation is of degree q = 1 (linear) for all simplicial elements.

One usually uses special shape functions for simplices, based on barycentric coordinates.

16.0.1 Linear Triangle (T3)

In 2D, the 3-node triangular element uses the barycentric coordinates

$$ l^i_e(\boldsymbol{x}) = \frac{A^i_e(\boldsymbol{x})}{A_e}, \tag{16.1} $$

where $A_e = |\Omega_e|$ is the total triangle area, and $A^i_e$ is the sub-area opposite from node i, so that $\sum_{i=1}^{3} A^i_e = A_e$.

It is an easy check to see that $0 \le l^i_e \le 1$ and $\sum_{i=1}^{3} l^i_e(\boldsymbol{x}) = 1$ for all $\boldsymbol{x} \in \Omega_e$, so the $l^i_e$ qualify as shape functions for the T3 element.

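For convenience we use $r = l^1_e$ and $s = l^2_e$ as reference coordinates, so that the shape functions become

$$ N^1_e(r,s) = r, \qquad N^2_e(r,s) = s, \qquad N^3_e(r,s) = 1 - r - s. \tag{16.2} $$

The rest of the story is analogous to the isoparametric elements discussed before with (r, s) instead of (ξ, η):

$$ \boldsymbol{J} = \begin{pmatrix} x_{,r} & y_{,r} \\ x_{,s} & y_{,s} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{3} N^i_{e,r}\,x^i_e & \sum_{i=1}^{3} N^i_{e,r}\,y^i_e \\ \sum_{i=1}^{3} N^i_{e,s}\,x^i_e & \sum_{i=1}^{3} N^i_{e,s}\,y^i_e \end{pmatrix} = \begin{pmatrix} x^1_e - x^3_e & y^1_e - y^3_e \\ x^2_e - x^3_e & y^2_e - y^3_e \end{pmatrix} \tag{16.3} $$

and

$$ J = \det\boldsymbol{J} = (x^1_e - x^3_e)(y^2_e - y^3_e) - (x^2_e - x^3_e)(y^1_e - y^3_e) = 2A_e. \tag{16.4} $$

Notice that the matrix $\boldsymbol{J}$ and hence its determinant $J$ are constant and do not depend on (r, s).

Further, we have

$$ \begin{pmatrix} N^i_{e,x} \\ N^i_{e,y} \end{pmatrix} = \boldsymbol{J}^{-1} \begin{pmatrix} N^i_{e,r} \\ N^i_{e,s} \end{pmatrix}. \tag{16.5} $$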

From (16.2) we see that all shape function derivatives with respect to (r, s) are constant (−1, +1, or 0), so that we may conclude that

$$ N^i_{e,x} = \text{const.}, \qquad N^i_{e,y} = \text{const.} \qquad\text{and also}\qquad dA = J\,dr\,ds = 2A_e\,dr\,ds. \tag{16.6} $$

The constant shape function derivatives indicate that all strain components are constant within the element since $\boldsymbol{\varepsilon} = \operatorname{sym}(\nabla\boldsymbol{u})$. This is why the 3-node triangle element is also called Constant Strain Triangle or CST. This also has the important consequence that integration of force vectors or stiffness matrices can easily be performed exactly by a single quadrature point since the integrands are constant across the element.

In case of higher-order triangular elements, the following relation is helpful for evaluating integrals:

$$ \int_{\Omega_e} r^\alpha s^\beta\,dA = \frac{\alpha!\,\beta!}{(\alpha+\beta+2)!}\,2A_e. \tag{16.7} $$
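The constant Jacobian and constant shape function gradients of the T3 element are easy to verify numerically. The following is a minimal sketch, assuming counter-clockwise node ordering and an arbitrarily chosen triangle; the helper name `t3_gradients` is hypothetical.

```python
import numpy as np

def t3_gradients(X):
    """X: (3, 2) nodal coordinates. Returns the physical shape function gradients (3, 2)
    and the Jacobian determinant J for N1 = r, N2 = s, N3 = 1 - r - s."""
    dN_ref = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # rows: (N_,r, N_,s)
    Jmat = dN_ref.T @ X            # [[x1 - x3, y1 - y3], [x2 - x3, y2 - y3]]
    J = np.linalg.det(Jmat)
    dN = np.linalg.solve(Jmat, dN_ref.T).T   # grad_x N^i = J^{-1} grad_(r,s) N^i
    return dN, J

# Assumed example triangle (counter-clockwise), area A = 1.
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])
dN, J = t3_gradients(X)
area = 0.5 * abs((X[1, 0] - X[0, 0]) * (X[2, 1] - X[0, 1])
                 - (X[2, 0] - X[0, 0]) * (X[1, 1] - X[0, 1]))

# A linear field u(x) = g . x is reproduced exactly: its gradient is recovered from nodal values.
g = np.array([3.0, -2.0])
grad_u = dN.T @ (X @ g)
```

Because `dN` does not depend on (r, s), a single evaluation per element suffices, which is the computational counterpart of the constant-strain property.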

16.1 Extension to three dimensions:

The extension to three dimensions is straightforward and results in the 4-node tetrahedron (constant strain tetrahedron) with reference coordinates (r, s, t) and shape functions

$$ N^1_e(r,s,t) = r, \qquad N^2_e(r,s,t) = s, \qquad N^3_e(r,s,t) = t, \qquad N^4_e(r,s,t) = 1 - r - s - t. \tag{16.8} $$

Like in 2D, strains are constant in the linear tetrahedron, and $dV = 6V\,dr\,ds\,dt$.

Note the following important relation that is analogous to (16.7),

$$ \int_{\Omega_e} r^\alpha s^\beta t^\gamma\,dV = \frac{\alpha!\,\beta!\,\gamma!}{(\alpha+\beta+\gamma+3)!}\,6V. \tag{16.9} $$

16.2 Quadrature rules for simplicial elements:

As discussed above, simplicial elements (bar, triangle, tetrahedron) produce constant strains and thus constant stresses within elements. Hence, a single quadrature point at an arbitrary location inside the element is sufficient. Usually one chooses the point to be located at the element center, which gives

$$ W_0 = 1, \qquad r_0 = s_0 = \frac{1}{3} \ \text{(in 2D)} \qquad\text{and}\qquad r_0 = s_0 = t_0 = \frac{1}{4} \ \text{(in 3D)}. \tag{16.10} $$

If higher-order quadrature rules are required (e.g., for triangular and tetrahedral elements of higher interpolation order as discussed next), the same concepts as for Gauss-Legendre quadrature can be applied here, resulting in simplicial quadrature rules whose weights and quadrature point locations can be found in look-up tables.

16.3 Generalization

For a general simplicial element in d dimensions having n = d + 1 nodes, we have shape functions

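$$ N^1_e(r_1,\ldots,r_d) = r_1, \qquad N^2_e(r_1,\ldots,r_d) = r_2, \qquad \ldots \qquad N^n_e(r_1,\ldots,r_d) = 1 - \sum_{i=1}^{d} r_i, \tag{16.11} $$

where $\boldsymbol{\xi} = \{r_1,\ldots,r_d\}$ denote the d barycentric coordinates. Then, the Jacobian $\boldsymbol{J}$ is given by

$$ J_{ij} = \frac{\partial X_j}{\partial r_i} = \sum_{a=1}^{n} X^a_j\,\frac{\partial N^a_e}{\partial r_i} \qquad\text{or}\qquad \boldsymbol{J} = \sum_{a=1}^{n} \nabla_\xi N^a_e \otimes \boldsymbol{X}^a, \qquad J = \det\boldsymbol{J}, \tag{16.12} $$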

so that shape function derivatives in physical coordinates follow as

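$$ \begin{pmatrix} N^a_{e,X_1} \\ \vdots \\ N^a_{e,X_d} \end{pmatrix} = \boldsymbol{J}^{-1} \begin{pmatrix} N^a_{e,r_1} \\ \vdots \\ N^a_{e,r_d} \end{pmatrix} \qquad\text{or}\qquad \nabla_X N^a_e = \boldsymbol{J}^{-1}\,\nabla_\xi N^a_e. \tag{16.13} $$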

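Using numerical quadrature with weights $W_k$ and points $\boldsymbol{\xi}_k$ (in reference coordinates), the element energy is given by

$$ I_e \approx \sum_{k=1}^{n_{QP}} W_k\,W(\boldsymbol{F}(\boldsymbol{\xi}_k))\,J(\boldsymbol{\xi}_k)\,t_e. \tag{16.14} $$

Nodal forces are approximated by

$$ F^a_{\text{int},i} \approx \sum_{k=1}^{n_{QP}} W_k\,P_{iJ}(\boldsymbol{F}(\boldsymbol{\xi}_k))\,N^a_{,J}(\boldsymbol{\xi}_k)\,J(\boldsymbol{\xi}_k)\,t_e. \tag{16.15} $$

Finally, the element stiffness matrix components become

$$ T^{ab}_{il} \approx \sum_{k=1}^{n_{QP}} W_k\,C_{iJlL}(\boldsymbol{F}(\boldsymbol{\xi}_k))\,N^a_{,J}(\boldsymbol{\xi}_k)\,N^b_{,L}(\boldsymbol{\xi}_k)\,J(\boldsymbol{\xi}_k)\,t_e. \tag{16.16} $$

Here, $t_e$ is an element constant (e.g., the cross-sectional area A in 1D, the thickness t in 2D, and simply 1 in 3D).

The deformation gradient, $\boldsymbol{F} = \boldsymbol{I} + \nabla_X\boldsymbol{u}$, and strain tensor, $\boldsymbol{\varepsilon} = \operatorname{sym}(\nabla_X\boldsymbol{u})$, can be obtained directly from $\nabla_X\boldsymbol{u}$. Also, recall that

$$ \nabla_X\boldsymbol{u}(\boldsymbol{\xi}) = \sum_{a=1}^{n} \boldsymbol{u}^a_e \otimes \nabla_X N^a(\boldsymbol{\xi}). \tag{16.17} $$

Note that, in linearized kinematics, the above equations hold analogously with $P_{iJ}$ replaced by the Cauchy stress tensor, upper-case coordinates X by lower-case x, etc.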

16.4 Higher-order triangles and tetrahedra:

Of course, one can also define higher-order triangular and tetrahedral elements.

For example, the quadratic triangle (T6) element has six nodes so that the interpolation function space is $\{1, x, y, x^2, xy, y^2\}$ and therefore complete up to second order. Per convention, nodes 1-3 are the corners while 4-6 are at edge midpoints, and counting is counter-clockwise as before.

With the same reference coordinates (r, s) as for the T3 element, the shape functions can be found as

$$ \begin{aligned} N^1_e(r,s) &= r(2r-1), & N^2_e(r,s) &= s(2s-1), & N^3_e(r,s) &= t(2t-1), \\ N^4_e(r,s) &= 4rs, & N^5_e(r,s) &= 4st, & N^6_e(r,s) &= 4rt, \end{aligned} \tag{16.18} $$

where t = 1 − r − s.

Since shape function derivatives are linear, strains (and thus stresses) also vary linearly within the element, which is therefore known as the linear strain triangle (LST). Full integration requires order q = 2, which corresponds to three quadrature points. The quadratic interpolation implies that element edges of isoparametric elements can be curved.

Analogously, the quadratic tetrahedron (T10) has four corner nodes and six nodes at edge midpoints.

17 Assembly

So far, we have defined local element vectors and matrices. The solution of any problem requires the assembly of global vectors and matrices.

This is accomplished by the assembly operator, viz.

$$ \boldsymbol{F}_{\text{int}} = \mathop{\mathcal{A}}_{e=1}^{n_e} \boldsymbol{F}_{\text{int},e}, \qquad \boldsymbol{T}_{\text{int}} = \mathop{\mathcal{A}}_{e=1}^{n_e} \boldsymbol{T}_{\text{int},e}, \tag{17.1} $$

which loops over all $n_e$ elements e and adds their respective contributions to the global quantities.

This requires careful book-keeping to keep track of the correspondence between local and global node numbering.

Similarly, an inverse assignment operator extracts element quantities from a global vector:

$$ \boldsymbol{U}^h_e = \mathcal{A}_e^{-1}(\boldsymbol{U}^h). \tag{17.2} $$
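The book-keeping behind the assembly operator can be sketched as a scatter-add over a connectivity table. The example below is a minimal illustration under simplifying assumptions: one degree of freedom per node, dense global storage, and a chain of three identical 2-node bars with unit stiffness. The names `assemble` and `gather` are hypothetical.

```python
import numpy as np

def assemble(n_nodes, connectivity, element_matrices, element_vectors):
    """Add element contributions into the global matrix and vector (one DOF per node)."""
    T = np.zeros((n_nodes, n_nodes))
    F = np.zeros(n_nodes)
    for nodes, T_e, F_e in zip(connectivity, element_matrices, element_vectors):
        idx = np.asarray(nodes)          # local node a -> global node idx[a]
        T[np.ix_(idx, idx)] += T_e       # scatter-add the element matrix
        F[idx] += F_e                    # scatter-add the element vector
    return T, F

def gather(U, nodes):
    """Inverse assignment: extract the element DOFs from the global vector."""
    return U[np.asarray(nodes)]

# Assumed example: three bars in a row connecting global nodes 0-1, 1-2, 2-3.
connectivity = [(0, 1), (1, 2), (2, 3)]
T_bar = np.array([[1.0, -1.0], [-1.0, 1.0]])
F_bar = np.array([0.5, 0.5])
T_global, F_global = assemble(4, connectivity, [T_bar] * 3, [F_bar] * 3)

U = np.array([0.0, 0.1, 0.3, 0.6])
U_e = gather(U, connectivity[1])         # DOFs of the middle element
```

Production codes typically store the global matrix in a sparse format and map (node, direction) pairs to DOF indices, but the loop structure is the same.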

18 Iterative solvers

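In the linear elastic or thermal problem, the solution is simple to obtain from a linear system of equations, as discussed before. In case of nonlinear problems, an iterative solution method is required. Here, we discuss a few common examples.

The problem to be solved has the general form

$$ \boldsymbol{f}(\boldsymbol{U}^h) = \boldsymbol{F}_{\text{int}}(\boldsymbol{U}^h) - \boldsymbol{F}_{\text{ext}} = \boldsymbol{0}. \tag{18.1} $$

All iterative solvers start with an initial guess $\boldsymbol{U}^h_0$, which is then corrected in a multitude of ways to find

$$ \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \Delta\boldsymbol{U}^h_n. \tag{18.2} $$

The iterative scheme converges if $\Delta\boldsymbol{U}^h_n \to \boldsymbol{0}$ as $n \to \infty$, or equivalently $\boldsymbol{f}(\boldsymbol{U}^h_n) \to \boldsymbol{0}$ as $n \to \infty$.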

18.1 Newton-Raphson (NR) method

The Newton-Raphson method (introduced by Newton in 1669 and generalized by Raphson in 1690) starts with a Taylor expansion:

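$$ \begin{aligned} \boldsymbol{0} = \boldsymbol{f}(\boldsymbol{U}^h_{n+1}) &= \boldsymbol{f}(\boldsymbol{U}^h_n + \Delta\boldsymbol{U}^h_n) \\ &= \boldsymbol{f}(\boldsymbol{U}^h_n) + \frac{\partial\boldsymbol{f}}{\partial\boldsymbol{U}}(\boldsymbol{U}^h_n)\,\Delta\boldsymbol{U}^h_n + \mathcal{O}(\|\Delta\boldsymbol{U}^h_n\|^2). \end{aligned} \tag{18.3} $$

If we neglect higher-order terms, then the above can be solved for the increment

$$ \Delta\boldsymbol{U}^h_n = -\left[\boldsymbol{T}(\boldsymbol{U}^h_n)\right]^{-1} \boldsymbol{f}(\boldsymbol{U}^h_n) \qquad\text{with}\qquad \boldsymbol{T}(\boldsymbol{U}^h_n) = \frac{\partial\boldsymbol{f}}{\partial\boldsymbol{U}}(\boldsymbol{U}^h_n) \tag{18.4} $$

being the tangent matrix.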

For the mechanical problem, this gives (e.g., in finite deformations)

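$$ \begin{aligned} T^{ab}_{ik}(\boldsymbol{U}^h_n) = \frac{\partial F^a_i}{\partial U^b_k}(\boldsymbol{U}^h_n) &= \frac{\partial}{\partial U^b_k}\left(\int_\Omega P_{iJ}\,N^a_{,J}\,dV - F^a_{\text{ext},i}\right) \\ &= \int_\Omega \frac{\partial P_{iJ}}{\partial F_{kL}}\,N^b_{,L}\,N^a_{,J}\,dV - \frac{\partial F^a_{\text{ext},i}}{\partial U^b_k} \\ &= \int_\Omega C_{iJkL}\,N^b_{,L}\,N^a_{,J}\,dV - \frac{\partial F^a_{\text{ext},i}}{\partial U^b_k}, \end{aligned} \tag{18.5} $$

where $C_{iJkL}$ is the incremental stiffness tensor (in linearized kinematics, the expression is the same with $C_{ijkl}$ the linearized stiffness tensor).

Note that the solver requires that $\det\boldsymbol{T} \ne 0$, which is guaranteed in linear elasticity if no rigid body mode exists (i.e., the linearized system has no zero-energy mode so that $\boldsymbol{U}^h \cdot \boldsymbol{T}\boldsymbol{U}^h \ne 0$ for all admissible $\boldsymbol{U}^h$). The Newton-Raphson solver displays quadratic convergence.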

Note that, if the problem is linear as in linear elasticity, the solver converges in one step since

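$$ \begin{aligned} \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \Delta\boldsymbol{U}^h_n &= \boldsymbol{U}^h_n - \boldsymbol{K}^{-1}\left[\boldsymbol{F}_{\text{int}} - \boldsymbol{F}_{\text{ext}}\right] \\ &= \boldsymbol{U}^h_n - \boldsymbol{K}^{-1}\left[\boldsymbol{K}\boldsymbol{U}^h_n - \boldsymbol{F}_{\text{ext}}\right] \\ &= \boldsymbol{U}^h_n - \boldsymbol{U}^h_n + \boldsymbol{K}^{-1}\boldsymbol{F}_{\text{ext}} \\ &= \boldsymbol{K}^{-1}\boldsymbol{F}_{\text{ext}}. \end{aligned} \tag{18.6} $$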
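The Newton-Raphson iteration (18.4) and the one-step convergence for linear problems (18.6) can be illustrated on a small system. The following is a minimal sketch; the two-DOF residual with a cubic term is an assumed toy model (not a finite element discretization from these notes), and `newton_raphson` is a hypothetical helper name.

```python
import numpy as np

def newton_raphson(f, T, U0, tol=1e-12, max_iter=50):
    """Solve f(U) = 0 by Newton-Raphson; returns the solution and the residual history."""
    U = np.array(U0, dtype=float)
    history = []
    for _ in range(max_iter):
        r = f(U)
        history.append(np.linalg.norm(r))
        if history[-1] < tol:
            break
        U = U + np.linalg.solve(T(U), -r)    # Delta U = -T^{-1} f, via a linear solve
    return U, history

K = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ext = np.array([1.0, 0.5])

# Assumed nonlinear toy problem: F_int(U) = K U + 0.1 U^3 (component-wise cube).
f_nl = lambda U: K @ U + 0.1 * U**3 - F_ext
T_nl = lambda U: K + np.diag(0.3 * U**2)
U_nl, hist_nl = newton_raphson(f_nl, T_nl, np.zeros(2))

# Linear problem: F_int(U) = K U, started from an arbitrary guess.
f_lin = lambda U: K @ U - F_ext
T_lin = lambda U: K
U_lin, hist_lin = newton_raphson(f_lin, T_lin, np.array([5.0, -3.0]))

print("nonlinear residuals:", hist_nl)
print("linear residuals:   ", hist_lin)
```

The residual history of the nonlinear problem shows the rapid (quadratic) decay near the solution, while the linear problem is solved after a single update regardless of the initial guess.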

18.2 Damped Newton-Raphson (dNR) method

A slight modification of the Newton-Raphson method, the damped Newton-Raphson method is beneficial, e.g., when the NR method tends to overshoot (e.g., in case of oscillatory energy landscapes or multiple minima such as in finite-deformation elasticity).

The iterative scheme is identical to the classical NR method except that

$$ \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \alpha\,\Delta\boldsymbol{U}^h_n \qquad\text{with}\qquad \alpha \in (0,1). \tag{18.7} $$

The damping parameter α can be chosen constant or adjusted based on convergence.
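The effect of damping can be seen on a classical scalar example. The sketch below uses f(x) = arctan(x) as an assumed stand-in (it is not a mechanics problem from these notes): the undamped iteration overshoots and diverges from the starting point x = 2, whereas a constant damping parameter α = 0.5 converges to the root x = 0. The helper name `newton_scalar` is hypothetical.

```python
import math

def newton_scalar(f, df, x0, alpha=1.0, n_steps=6):
    """Scalar (damped) Newton-Raphson: x <- x - alpha * f(x) / f'(x)."""
    x = x0
    for _ in range(n_steps):
        x = x - alpha * f(x) / df(x)
    return x

f = math.atan
df = lambda x: 1.0 / (1.0 + x * x)

x_plain = newton_scalar(f, df, 2.0, alpha=1.0, n_steps=6)     # overshoots further each step
x_damped = newton_scalar(f, df, 2.0, alpha=0.5, n_steps=60)   # approaches the root x = 0
print(x_plain, x_damped)
```

Note the price of a constant α < 1: near the solution the convergence is only linear, which is why α is often increased towards 1 once the iteration behaves well.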

18.3 Quasi-Newton (QN) method

The Quasi-Newton method is the same as the classical NR method with the exception that one does not use the actual tangent matrix T for computational simplicity or efficiency.

Motivation is thus to avoid the computation of $\boldsymbol{T}(\boldsymbol{U}^h)$ and its inversion at each iteration step. Instead one uses a matrix $\boldsymbol{B}_n$ and updates its inverse directly.

The general algorithm is as follows:

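(1) start with an initial guess $\boldsymbol{U}^h_0$ and $\boldsymbol{B}_0 = \boldsymbol{T}(\boldsymbol{U}^h_0)$

(2) compute $\Delta\boldsymbol{U}^h_n = -\boldsymbol{B}_n^{-1}\boldsymbol{f}(\boldsymbol{U}^h_n)$ and $\boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \Delta\boldsymbol{U}^h_n$ and

$$ \boldsymbol{B}_{n+1}^{-1} = \boldsymbol{B}_n^{-1} - \frac{(\boldsymbol{B}_n^{-1}\boldsymbol{z}_n - \Delta\boldsymbol{U}^h_n) \otimes \Delta\boldsymbol{U}^h_n\,\boldsymbol{B}_n^{-1}}{\Delta\boldsymbol{U}^h_n \cdot \boldsymbol{B}_n^{-1}\boldsymbol{z}_n} \qquad\text{with}\qquad \boldsymbol{z}_n = \boldsymbol{f}(\boldsymbol{U}^h_{n+1}) - \boldsymbol{f}(\boldsymbol{U}^h_n). \tag{18.8} $$

We omit the full derivation of the update for $\boldsymbol{B}_{n+1}$ here for brevity. The idea is that $\boldsymbol{B}_{n+1}^{-1}$ and $\boldsymbol{B}_n^{-1}$ are approximately rank-one-connected using the Sherman-Morrison formula. The added benefit is that not only does T not have to be recomputed exactly, but the inversion or linear solve can also be skipped since the updated inverse is computed explicitly.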
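The update (18.8) can be sketched as follows. This is a minimal illustration on an assumed two-DOF toy residual with a mild cubic nonlinearity (not a model from these notes); the function name `quasi_newton` is hypothetical, and the dense explicit inverse is used only because the system is tiny.

```python
import numpy as np

def quasi_newton(f, B0, U0, tol=1e-10, max_iter=100):
    """Quasi-Newton iteration with a rank-one update of the inverse matrix B^{-1}."""
    U = np.array(U0, dtype=float)
    B_inv = np.linalg.inv(B0)            # only the initial matrix is inverted
    r = f(U)
    for n in range(max_iter):
        if np.linalg.norm(r) < tol:
            return U, n
        dU = -B_inv @ r                  # Delta U_n = -B_n^{-1} f(U_n)
        U = U + dU
        r_new = f(U)
        z = r_new - r                    # z_n = f(U_{n+1}) - f(U_n)
        Bz = B_inv @ z
        # B_{n+1}^{-1} = B_n^{-1} - (B_n^{-1} z - dU) (x) (dU B_n^{-1}) / (dU . B_n^{-1} z)
        B_inv = B_inv - np.outer(Bz - dU, dU @ B_inv) / (dU @ Bz)
        r = r_new
    return U, max_iter

# Assumed toy problem: f(U) = K U + 0.1 U^3 - F_ext, with B_0 = T(U_0 = 0) = K.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ext = np.array([1.0, 0.5])
f = lambda U: K @ U + 0.1 * U**3 - F_ext
U_qn, n_iter = quasi_newton(f, K, np.zeros(2))
print(U_qn, n_iter)
```

After each update the new inverse satisfies the secant condition $\boldsymbol{B}_{n+1}^{-1}\boldsymbol{z}_n = \Delta\boldsymbol{U}^h_n$, which can be verified by inserting (18.8).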

18.4 Line search method

The line search method can be used as an improvement for other nonlinear iterative solvers. Similar to the Quasi-Newton schemes, updates are made according to

$$ \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \beta\,\Delta\boldsymbol{U}^h_n, \tag{18.9} $$

where now β is not a constant but chosen such that $\boldsymbol{f}(\boldsymbol{U}^h_{n+1}) = \boldsymbol{0}$. For example, we can find β from solving

$$ \Delta\boldsymbol{U}^h_n \cdot \boldsymbol{f}(\boldsymbol{U}^h_n + \beta\,\Delta\boldsymbol{U}^h_n) = 0. \tag{18.10} $$

This is generally a nonlinear but scalar problem that can be solved by bisection, regula falsi, secant, and other methods.

Notice that (18.10) is in fact the stationarity condition of the minimization problem

$$ \beta = \arg\inf \left\| \boldsymbol{f}(\boldsymbol{U}^h_n + \beta\,\Delta\boldsymbol{U}^h_n) \right\|^2, \tag{18.11} $$

which is the motivation for the nonlinear least-squares method described below.
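The scalar problem (18.10) is easily solved by bisection. The sketch below is a minimal illustration under stated assumptions: a toy two-DOF residual, a steepest-descent direction ΔU = −f(U) as the search direction, and a bracket that is expanded by doubling; the helper name `line_search` is hypothetical.

```python
import numpy as np

def line_search(f, U, dU, beta_max=1.0, tol=1e-10, max_iter=200):
    """Find beta such that dU . f(U + beta dU) = 0 by bisection."""
    g = lambda beta: dU @ f(U + beta * dU)
    lo, hi = 0.0, beta_max
    if g(lo) >= 0.0:
        return 0.0                       # dU is not a descent direction; take no step
    for _ in range(60):                  # expand the bracket until g changes sign
        if g(hi) >= 0.0:
            break
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Assumed toy problem: f(U) = K U + 0.1 U^3 - F_ext, searched along dU = -f(U).
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ext = np.array([1.0, 0.5])
f = lambda U: K @ U + 0.1 * U**3 - F_ext
U = np.zeros(2)
dU = -f(U)
beta = line_search(f, U, dU)
print(beta, dU @ f(U + beta * dU))
```

Each evaluation of g(β) costs one residual assembly, so in practice the search is usually terminated with a loose tolerance.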

18.5 Gradient flow method

Although not with a proper physical meaning, the gradient flow method (also known as gradient descent) has become popular as an iterative solver for quasistatic problems.

The idea is to replace the equation

0 = f(Uhn+1) (18.12)

by a dynamic evolution equation:

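$$ \boldsymbol{C}\,\dot{\boldsymbol{U}}^h_{n+1/2} = -\boldsymbol{f}(\boldsymbol{U}^h_n) \qquad\text{and}\qquad \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \Delta t\,\dot{\boldsymbol{U}}^h_{n+1/2} \tag{18.13} $$

with, e.g., $\boldsymbol{C} = c\boldsymbol{I}$ with c > 0. It is obvious that as $\boldsymbol{f} \to \boldsymbol{0}$ we have $\dot{\boldsymbol{U}}^h_{n+1/2} \to \boldsymbol{0}$ and thus the method converges. Although there is no guarantee to reach an extremum, the method is popular because it does not require a tangent matrix and is quite robust.

For example, using a simple first-order (Euler-type) finite-difference discretization of the time derivative and $\boldsymbol{C} = c\boldsymbol{I}$, we obtain (with the time step Δt absorbed into c)

$$ \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n - \frac{1}{c}\,\boldsymbol{f}(\boldsymbol{U}^h_n). \tag{18.14} $$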
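The update (18.14) requires only residual evaluations. The following is a minimal sketch on an assumed two-DOF toy residual; the value c = 4 is an assumption chosen larger than half the largest tangent eigenvalue of this particular problem so that the explicit iteration is stable, and `gradient_flow` is a hypothetical helper name.

```python
import numpy as np

def gradient_flow(f, U0, c, tol=1e-10, max_iter=10000):
    """Iterate U <- U - f(U) / c until the residual norm drops below tol."""
    U = np.array(U0, dtype=float)
    for n in range(max_iter):
        r = f(U)
        if np.linalg.norm(r) < tol:
            return U, n
        U = U - r / c                    # no tangent matrix and no linear solve needed
    return U, max_iter

# Assumed toy problem: f(U) = K U + 0.1 U^3 - F_ext.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
F_ext = np.array([1.0, 0.5])
f = lambda U: K @ U + 0.1 * U**3 - F_ext
U_gf, n_iter = gradient_flow(f, np.zeros(2), c=4.0)
print(U_gf, n_iter)
```

Compared with Newton-Raphson on the same problem, many more (but much cheaper) iterations are needed, and too small a value of c makes the iteration unstable.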

18.6 Nonlinear Least Squares

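The family of methods based on nonlinear least squares aims to minimize

$$ r(\boldsymbol{U}^h) = \|\boldsymbol{f}(\boldsymbol{U}^h)\|^2 = \boldsymbol{f}(\boldsymbol{U}^h) \cdot \boldsymbol{f}(\boldsymbol{U}^h) \qquad\text{so we solve}\qquad \frac{\partial r}{\partial\boldsymbol{U}^h} = \boldsymbol{0}. \tag{18.15} $$

This approach is helpful, e.g., in case of over-constrained systems. Application of Newton-Raphson to this nonlinear system of equations leads to

$$ \begin{aligned} \Delta\boldsymbol{U}^h &= -\left[\frac{\partial}{\partial\boldsymbol{U}^h}\frac{\partial r}{\partial\boldsymbol{U}^h}\right]^{-1}_{\boldsymbol{U}^h_n} \frac{\partial r}{\partial\boldsymbol{U}^h}(\boldsymbol{U}^h_n) \\ &= -\left[\boldsymbol{T}^{\mathrm{T}}(\boldsymbol{U}^h_n)\,\frac{\partial\boldsymbol{f}}{\partial\boldsymbol{U}^h}(\boldsymbol{U}^h_n) + \frac{\partial\boldsymbol{T}^{\mathrm{T}}}{\partial\boldsymbol{U}^h}(\boldsymbol{U}^h_n)\,\boldsymbol{f}(\boldsymbol{U}^h_n)\right]^{-1} \boldsymbol{T}^{\mathrm{T}}(\boldsymbol{U}^h_n)\,\boldsymbol{f}(\boldsymbol{U}^h_n). \end{aligned} \tag{18.16} $$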

If updates are small, we can neglect the second term in brackets (which requires higher derivatives than what is commonly computed in FEM), which gives

$$ \Delta\boldsymbol{U}^h = -\left[\boldsymbol{T}^{\mathrm{T}}(\boldsymbol{U}^h_n)\,\boldsymbol{T}(\boldsymbol{U}^h_n)\right]^{-1} \boldsymbol{T}^{\mathrm{T}}(\boldsymbol{U}^h_n)\,\boldsymbol{f}(\boldsymbol{U}^h_n). \tag{18.17} $$

This is known as the Gauss-Newton method. Note that this reduces to Newton-Raphson for standard problems (i.e., those with as many equations as unknowns). However, Gauss-Newton can also be applied to overdetermined systems (i.e., more equations than unknowns).
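The Gauss-Newton update (18.17) is sketched below for an overdetermined system. The three equations in two unknowns are an assumed toy example chosen to be consistent, with the common solution U = (1, 1); the helper name `gauss_newton` is hypothetical.

```python
import numpy as np

def gauss_newton(f, T, U0, tol=1e-12, max_iter=50):
    """Gauss-Newton: Delta U = -(T^T T)^{-1} T^T f, with T = df/dU possibly non-square."""
    U = np.array(U0, dtype=float)
    for n in range(max_iter):
        Tn, fn = T(U), f(U)
        dU = -np.linalg.solve(Tn.T @ Tn, Tn.T @ fn)   # normal equations
        U = U + dU
        if np.linalg.norm(dU) < tol:
            return U, n + 1
    return U, max_iter

# Assumed overdetermined toy system: 3 equations, 2 unknowns, consistent at U = (1, 1).
f = lambda U: np.array([U[0] + U[1] - 2.0,
                        U[0] - U[1],
                        U[0]**2 + U[1]**2 - 2.0])
T = lambda U: np.array([[1.0, 1.0],
                        [1.0, -1.0],
                        [2.0 * U[0], 2.0 * U[1]]])
U_gn, n_iter = gauss_newton(f, T, [2.0, 0.5])
print(U_gn, n_iter)
```

For badly conditioned T, solving the least-squares problem directly (e.g., with `numpy.linalg.lstsq`) is generally preferable to forming the normal equations explicitly.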

18.7 Conjugate Gradient (CG) method

The conjugate gradient method follows the idea of iterating into the direction of steepest descent in order to minimize the total potential energy (as a variation, it can also be applied to the nonlinear least squares problem).

Here, the update is

$$ \boldsymbol{U}^h_{n+1} = \boldsymbol{U}^h_n + \alpha_n\,\boldsymbol{S}_n, \tag{18.18} $$

where both the direction $\boldsymbol{S}_n$ and increment $\alpha_n$ are determined in an optimal way as follows.

The conjugate direction is updated according to

$$ \boldsymbol{S}_n = -\boldsymbol{f}(\boldsymbol{U}^h_n) + \beta_n\,\boldsymbol{S}_{n-1} \tag{18.19} $$

with $\beta_n$ computed from the current solution $\boldsymbol{U}^h_n$ and the previous solution $\boldsymbol{U}^h_{n-1}$ according to one of several options (Polak-Ribiere, Fletcher-Reeves, etc.)

Then, the scalar increment $\alpha_n$ is obtained from a line search to find

$$ \alpha_n = \arg\min\ r\left(\boldsymbol{U}^h_n + \alpha\,\boldsymbol{S}_n\right). \tag{18.20} $$

A benefit of the conjugate gradient technique is that, as for gradient flow, no tangent matrix is required.

A variation of this scheme, originally developed for atomistics but also applicable to the FE method (and oftentimes converging faster than CG), is the so-called Fast Inertial Relaxation Engine (FIRE).

19 Boundary conditions

19.1 Neumann boundary conditions

Recall that the nonlinear system of equations to solve,

Fint(Uh) −Fext = 0, (19.1)

requires computation of the external force vector. The latter includes both body forces and surface tractions.

Now that we have introduced isoparametric mappings and numerical quadrature rules, we can apply those to the external force terms.

Body forces ρb produce the external force on node a in direction i, e.g., in 2D

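$$ F^a_{\text{ext},i} = \int_{\Omega_e} \rho b_i\,N^a\,dV = \int_{-1}^{1}\int_{-1}^{1} \rho b_i(\boldsymbol{\xi})\,N^a(\boldsymbol{\xi})\,J(\boldsymbol{\xi})\,d\xi\,d\eta \approx \sum_{k=1}^{n_{QP}} W_k\,\rho b_i(\boldsymbol{\xi}_k)\,N^a(\boldsymbol{\xi}_k)\,J(\boldsymbol{\xi}_k). \tag{19.2} $$

Surface tractions t result in an external force on node a in direction i, again in 2D,

$$ F^a_{\text{ext},i} = \int_{\partial\Omega_{N,e}} t_i\,N^a\,dS = \int_{-1}^{1} t_i(\xi)\,N^a(\xi)\,J(\xi)\,d\xi \approx \sum_{k=1}^{n_{QP}} W_k\,t_i(\xi_k)\,N^a(\xi_k)\,J(\xi_k). \tag{19.3} $$

Note that the surface traction term integrates over the boundary of the element, so in d dimensions we can use a quadrature rule for d − 1 dimensions (e.g., for a 2D element we use 1D quadrature on the element edges).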

Implementation:

The variational formulation allows us to implement external force elements just like regular finite elements: we have defined their energy,

$$ I_e = -\int_{\Omega_e} \rho\boldsymbol{b}\cdot\boldsymbol{u}\,dV - \int_{\partial\Omega_e} \boldsymbol{t}\cdot\boldsymbol{u}\,dS. \tag{19.4} $$

The corresponding force vectors $\boldsymbol{F}_{\text{ext}}$ are given above, and the tangent matrices are obtained from

$$ \boldsymbol{T} = \frac{\partial\boldsymbol{F}_{\text{ext}}}{\partial\boldsymbol{U}^h_e}. \tag{19.5} $$

Notice that if t and ρb are independent of displacements (as in most cases), we have T = 0. An exception is, e.g., the application of pressure to the boundary in finite deformations (since the resulting force depends on the surface area which, in turn, depends on the sought deformation).

19.2 Examples of external forces

Constant force:

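Consider a constant force P applied to a particular node i at deformed position $\boldsymbol{x}_i$. The element energy of this external force vector is

$$ I_e = -\boldsymbol{P}\cdot\boldsymbol{u}_i, \tag{19.6} $$

so that the resulting force vector is

$$ \boldsymbol{F}^a = \frac{\partial I_e}{\partial\boldsymbol{u}^a} = -\boldsymbol{P}\,\delta_{ia}, \tag{19.7} $$

i.e., an external force is applied only to node i. The stiffness matrix vanishes since

$$ \boldsymbol{T}^{ab} = \frac{\partial\boldsymbol{F}^a}{\partial\boldsymbol{u}^b} = \boldsymbol{0}. \tag{19.8} $$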

Linear spring:

Next, consider a linear elastic spring (stiffness k) attached to a node i (deformed position $\boldsymbol{x}_i$, undeformed position $\boldsymbol{X}_i$) and anchored at a constant position $\boldsymbol{x}_0 = \boldsymbol{X}_0$. The element energy in this case is simply the spring energy

$$ I_e = \frac{k}{2}\left(\|\boldsymbol{x}_i - \boldsymbol{x}_0\| - \|\boldsymbol{X}_i - \boldsymbol{X}_0\|\right)^2, \tag{19.9} $$

and force vector and stiffness matrix follow by differentiation.

Indenter:

When simulating indentation tests, it is oftentimes convenient to apply the indenter forces via an external potential rather than by modeling contact. Consider a spherical (in 3D) or circular (in 2D) indenter of radius R whose center is located at the constant, known point $\boldsymbol{x}_0$. Here, one may use potentials of the type

$$ I_e = C\left(\|\boldsymbol{x}_0 - \boldsymbol{x}_i\| - R\right)^n \tag{19.10} $$

with a force constant C > 0 and integer exponent n ≥ 2 (a common choice is n = 3). Again, forces and stiffness matrix follow by differentiation.

Pressure in linearized kinematics:

Applying a constant pressure p to a surface can be accomplished rather easily in linearized kinematics via (19.3). Specifically, we have

$$ I_e = -\int_{\partial\Omega_e} \boldsymbol{t}\cdot\boldsymbol{u}^h\,dS = -\int_{\partial\Omega_e} (-p)\,\boldsymbol{n}\cdot\sum_{a=1}^{n} \boldsymbol{u}^a_e N^a\,dS = \sum_{a=1}^{n} \int_{\partial\Omega_e} p\,\boldsymbol{n}\cdot\boldsymbol{u}^a_e N^a\,dS. \tag{19.11} $$

Integration can be carried out using numerical quadrature on the element boundary (the element boundary normal n can be computed from the nodal locations). Again, forces and stiffness matrix follow by differentiation (forces are constant, and the stiffness matrix vanishes).

Pressure in finite kinematics:

Applying a constant pressure p to a surface in finite kinematics is more complicated as the element boundary undergoes finite changes during deformation, resulting in nodal forces that depend on deformation. Here, we start with the work done by the pressure (which we assume constant), viz.

$$ \begin{aligned} I_e = -p\,v = -\int_{\varphi(\Omega_e)} p\,dv &= -\int_{\varphi(\Omega_e)} p\,\frac{1}{d}\,x_{i,i}\,dv \\ &= -\frac{p}{d} \int_{\partial\varphi(\Omega_e)} \varphi_i\,n_i\,ds = -\frac{p}{d} \int_{\partial\Omega_e} \varphi_i\,J F^{-1}_{Ji} N_J\,dS, \end{aligned} \tag{19.12} $$

where we used that $x_{i,i} = \delta_{ii} = d$ in d dimensions as well as the Piola transform $n_i\,ds = J F^{-1}_{Ji} N_J\,dS$.

Thus the numerical integration uses

$$ I_e = -\frac{p}{d} \int_{\partial\Omega_e} \boldsymbol{\varphi}^h \cdot J\,\boldsymbol{F}^{-\mathrm{T}}\boldsymbol{N}\,dS, \tag{19.13} $$

which can be evaluated by numerical quadrature. As before, forces and stiffness matrix follow by differentiation but are non-zero and rather complex.

19.3 Dirichlet boundary conditions

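Essential boundary conditions require us to replace individual equations in the nonlinear system by $U^a = \hat{U}^a$, where the hat denotes a prescribed value. This is computationally accomplished usually in one of two ways.

Substitution:

First, brute-force substitution is a simple method: one replaces the respective equation for $\Delta u^a_i$ in the linearized system $\boldsymbol{T}\,\Delta\boldsymbol{U}^h = \boldsymbol{F}$ by $\Delta u^a_i = \Delta\hat{u}^a_i$; e.g.,

$$ \begin{pmatrix} T_{11} & T_{12} & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & \cdot \\ 0 & 0 & 1 & 0 & 0 \\ \cdot & \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot & T_{55} \end{pmatrix} \begin{pmatrix} \Delta u_1 \\ \Delta u_2 \\ \Delta u_3 \\ \Delta u_4 \\ \Delta u_5 \end{pmatrix} = \begin{pmatrix} F_1 \\ F_2 \\ \Delta\hat{u}_3 \\ F_4 \\ F_5 \end{pmatrix} \tag{19.14} $$

The same method can also impose other types of boundary conditions, e.g., periodic boundary conditions of the type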

u+ = u− (19.15)

for some opposite nodes (+,−). Also, constraints of the general type

f(ui,uj , . . .) = 0 (19.16)

can be implemented in a similar fashion (e.g., rigid links between nodes).

Condensation:

The above substitution method is simple to implement. However, it is quite expensive since the number of equations remains the same when imposing essential boundary conditions. The condensation method removes from the linear system to be solved those equations imposing essential boundary conditions.

Let us rewrite the linearized system by moving the third column to the right-hand side:

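$$ \begin{pmatrix} T_{11} & T_{12} & 0 & T_{14} & \cdot \\ \cdot & \cdot & 0 & \cdot & \cdot \\ \cdot & \cdot & 0 & \cdot & \cdot \\ \cdot & \cdot & 0 & \cdot & \cdot \\ \cdot & \cdot & 0 & \cdot & T_{55} \end{pmatrix} \begin{pmatrix} \Delta u_1 \\ \Delta u_2 \\ \Delta u_3 \\ \Delta u_4 \\ \Delta u_5 \end{pmatrix} = \begin{pmatrix} F_1 \\ F_2 \\ F_3 \\ F_4 \\ F_5 \end{pmatrix} - \begin{pmatrix} T_{13} \\ T_{23} \\ T_{33} \\ T_{43} \\ T_{53} \end{pmatrix} \Delta\hat{u}_3 \tag{19.17} $$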

so that we can eliminate the third row and column from the system:

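$$ \begin{pmatrix} T_{11} & T_{12} & T_{14} & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & \cdot \\ \cdot & \cdot & \cdot & T_{55} \end{pmatrix} \begin{pmatrix} \Delta u_1 \\ \Delta u_2 \\ \Delta u_4 \\ \Delta u_5 \end{pmatrix} = \begin{pmatrix} F_1 \\ F_2 \\ F_4 \\ F_5 \end{pmatrix} - \begin{pmatrix} T_{13} \\ T_{23} \\ T_{43} \\ T_{53} \end{pmatrix} \Delta\hat{u}_3, \tag{19.18} $$

and solve for the remaining unknowns along with $\Delta u_3 = \Delta\hat{u}_3$.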

The clear advantage of this method is the reduction in size of the system to be solved. The disadvantage is that it is computationally more involved (can be even more expensive for a small number of essential boundary conditions applied to large systems).
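Both approaches can be compared on a small linear system. The sketch below is a minimal illustration under stated assumptions: a chain of four unit-stiffness springs (five DOFs, singular without constraints), a unit force at the last node, and a prescribed value of 0.1 for the third DOF (index 2); the function names are hypothetical.

```python
import numpy as np

def solve_substitution(T, F, dof, value):
    """Replace the equation of the constrained DOF by the identity row."""
    T_s, F_s = T.copy(), F.copy()
    T_s[dof, :] = 0.0
    T_s[dof, dof] = 1.0
    F_s[dof] = value
    return np.linalg.solve(T_s, F_s)

def solve_condensation(T, F, dof, value):
    """Move the known column to the right-hand side and drop the constrained row and column."""
    free = np.array([i for i in range(len(F)) if i != dof])
    rhs = F[free] - T[free, dof] * value
    dU = np.empty_like(F)
    dU[free] = np.linalg.solve(T[np.ix_(free, free)], rhs)
    dU[dof] = value
    return dU

# Assumed example system: tridiagonal spring-chain matrix with a force on the last DOF.
T = (np.diag([1.0, 2.0, 2.0, 2.0, 1.0])
     - np.diag(np.ones(4), 1) - np.diag(np.ones(4), -1))
F = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

dU_sub = solve_substitution(T, F, dof=2, value=0.1)
dU_con = solve_condensation(T, F, dof=2, value=0.1)
print(dU_sub, dU_con)
```

Note that substitution as written destroys the symmetry of T; a common variant (not shown) also moves the constrained column to the right-hand side to restore it.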

19.4 Rigid body motion

Consider the composite deformation mapping $x = \varphi^*(\varphi(X))$ with $\varphi(X)$ being an admissible deformation mapping and $\varphi^*(x) = Rx + c$ denoting rigid body motion, i.e., $R \in SO(d)$ and $c \in \mathbb{R}^d$ is a constant vector. Note that the combined deformation gradient is given by $F^* = RF$, where $F = \operatorname{Grad}\varphi$. Recall that the weak form reads

$$
G(u, v) = \int_\Omega P_{iJ}(F)\, v_{i,J}\, dV - \int_\Omega R B_i v_i \, dV - \int_{\partial\Omega_N} \hat{T}_i v_i \, dS = 0. \tag{19.19}
$$

Insertion of the composite mapping yields

$$
G(u^*, v) = \int_\Omega P_{iJ}(RF)\, v_{i,J}\, dV - \int_\Omega R B_i v_i \, dV - \int_{\partial\Omega_N} \hat{T}_i v_i \, dS = 0. \tag{19.20}
$$

However, note that material frame indifference requires that $P = P(C)$ and $C^* = (F^*)^{\mathrm{T}} F^* = F^{\mathrm{T}} F = C$, so that the rotation has no effect on the weak form (neither does the translation $c$). Therefore, rigid body motion can be superimposed onto any admissible solution and must be suppressed by appropriate essential boundary conditions to ensure uniqueness of solutions.

For the linear elastic case, the tangent matrix $T$ therefore has as many zero eigenvalues as it has rigid-body modes, or zero-energy modes $U^*$ such that

$$
U^* \cdot T U^* = 0. \tag{19.21}
$$

This implies that $T$ has zero eigenvalues and is thus not invertible.

The remedy is to suppress rigid body modes via appropriate essential boundary conditions; specifically, in $d$ dimensions there are $d$ translational and $d(d-1)/2$ rotational rigid-body modes, so we need at least $d(d+1)/2$ suitably chosen essential boundary conditions (i.e., 3 in 2D and 6 in 3D).
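The zero-energy property (19.21) is easy to check on a small example. The sketch below assumes a hypothetical free-free chain of two unit springs in 1D (three nodes, no essential boundary conditions); the variable names are illustrative only.

```python
def matvec(A, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


# Stiffness of two unit springs in series, without any essential BCs.
T = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]

U_rigid = [1.0, 1.0, 1.0]    # rigid translation of all nodes
U_stretch = [0.0, 1.0, 2.0]  # uniform stretching

energy_rigid = dot(U_rigid, matvec(T, U_rigid))        # U* . T U* vanishes
energy_stretch = dot(U_stretch, matvec(T, U_stretch))  # positive

# Fixing node 0 removes the first row and column; the reduced matrix is regular.
T_red = [row[1:] for row in T[1:]]
det_red = T_red[0][0] * T_red[1][1] - T_red[0][1] * T_red[1][0]
print(energy_rigid, energy_stretch, det_red)
```

Since $T U^* = 0$ for the translation, $T$ is singular until one essential boundary condition removes the single rigid-body mode of this 1D example.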



20 Internal variables and inelasticity

20.1 Inelastic material models

Inelastic material models describe a variety of phenomena, e.g.,

viscoelasticity, i.e., time- and rate-dependent reversible behavior (the stress–strain relation depends on the loading rate; stresses and strains evolve over time); internal variables, e.g., for viscoelasticity with $n$ Maxwell elements, are the inelastic strain contributions $\boldsymbol{z} = \{\boldsymbol{e}_p^1, \ldots, \boldsymbol{e}_p^n\}$.

plasticity, i.e., history-dependent irreversible behavior (the stress–strain relation depends on the loading history); internal variables are usually the plastic strains, accumulated plastic strains, and possibly further history variables: $\boldsymbol{z} = \{\boldsymbol{e}_p, \epsilon_p, \ldots\}$.

viscoplasticity, i.e., history- and time-dependent irreversible behavior; internal variables are similar to the case of plasticity above.

damage, i.e., irreversible degradation of the elastic stiffness with loading; the internal variable is a damage parameter, e.g., a scalar measure $z = d$.

ferroelectricity, i.e., irreversible electro-mechanical coupling; the internal variable can be, e.g., the polarization field $\boldsymbol{z} = \boldsymbol{p}$.

All these phenomena can be described by the same underlying principles.

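The general description of an inelastic (variational) material model starts with a strain energy density

$$
W = W(\boldsymbol{\varepsilon}, \boldsymbol{z}), \tag{20.1}
$$

where $\boldsymbol{z}$ denotes a collection of internal variables. While the stress tensor and linear momentum balance remain untouched, i.e., (in linearized kinematics)

$$
\boldsymbol{\sigma} = \frac{\partial W}{\partial \boldsymbol{\varepsilon}} \quad\text{and}\quad \operatorname{div}\boldsymbol{\sigma} + \rho\boldsymbol{b} = \boldsymbol{0}, \tag{20.2}
$$

the internal variables evolve according to a kinetic law

$$
\frac{\partial W}{\partial \boldsymbol{z}} + \frac{\partial \phi^*}{\partial \dot{\boldsymbol{z}}} \ni \boldsymbol{0}, \tag{20.3}
$$

where $\phi^*$ denotes the dual (dissipation) potential and we assume that such a potential exists. The differential inclusion is replaced by an equality in case of rate-dependent models. (20.3) can alternatively be cast into an effective variational problem:

$$
\dot{\boldsymbol{z}} = \arg\inf \left\{ \dot{W} + \phi^* \right\}. \tag{20.4}
$$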

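Let us introduce a discretization in time: $t^\alpha = \alpha\,\Delta t$, where we assume constant time steps $\Delta t = t^{\alpha+1} - t^\alpha$ and, for conciseness, we write $\Delta(\cdot) = (\cdot)^{\alpha+1} - (\cdot)^\alpha$, where $(\cdot)^\alpha$ denotes a quantity at time $t^\alpha$. Using simple backward-Euler rules then gives $\dot{W} = (W^{\alpha+1} - W^\alpha)/\Delta t$ and $\dot{\boldsymbol{z}} = (\boldsymbol{z}^{\alpha+1} - \boldsymbol{z}^\alpha)/\Delta t = \Delta\boldsymbol{z}/\Delta t$. Thus,

$$
\frac{\boldsymbol{z}^{\alpha+1} - \boldsymbol{z}^\alpha}{\Delta t} = \arg\inf \left\{ \frac{W^{\alpha+1} - W^\alpha}{\Delta t} + \phi^*\!\left( \frac{\boldsymbol{z}^{\alpha+1} - \boldsymbol{z}^\alpha}{\Delta t} \right) \right\}. \tag{20.5}
$$

Multiplication by $\Delta t$ and omitting $W^\alpha$ (since it does not depend on $\boldsymbol{z}^{\alpha+1}$) leads to

$$
\boldsymbol{z}^{\alpha+1} = \arg\inf \left\{ W^{\alpha+1}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}) + \Delta t\, \phi^*\!\left( \frac{\boldsymbol{z}^{\alpha+1} - \boldsymbol{z}^\alpha}{\Delta t} \right) \right\}, \tag{20.6}
$$

where the right-hand side defines an effective incremental potential:

$$
\mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}) = W^{\alpha+1}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}) + \Delta t\, \phi^*\!\left( \frac{\boldsymbol{z}^{\alpha+1} - \boldsymbol{z}^\alpha}{\Delta t} \right). \tag{20.7}
$$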

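Notice that $\boldsymbol{\sigma}^{\alpha+1} = \partial W/\partial\boldsymbol{\varepsilon}^{\alpha+1}$ so that the effective potential can be used to replace the classical strain energy density in the total potential energy:

$$
I_{\boldsymbol{z}^\alpha}[\boldsymbol{u}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}] = \int_\Omega \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1})\, dV - \int_\Omega \rho\boldsymbol{b}\cdot\boldsymbol{u}^{\alpha+1}\, dV - \int_{\partial\Omega_N} \hat{\boldsymbol{t}}\cdot\boldsymbol{u}^{\alpha+1}\, dS. \tag{20.8}
$$

By the subscripts $\boldsymbol{z}^\alpha$ we denote that those potentials do depend on $\boldsymbol{z}^\alpha$ (i.e., the internal variables at the previous time step) but that those fields are known when evaluating the respective quantities.

The solution can now be found from

$$
\{\boldsymbol{u}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}\} = \arg\inf I_{\boldsymbol{z}^\alpha}[\boldsymbol{u}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}]. \tag{20.9}
$$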

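We can exploit that only the internal energy term depends on the internal variables and further assume that the energy density is local in the internal variables (this does not apply, e.g., for phase field models whose energy involves gradients of the internal variables). Then,

$$
\begin{aligned}
\inf_{\boldsymbol{u}^{\alpha+1}} \inf_{\boldsymbol{z}^{\alpha+1}} \int_\Omega \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1})\, dV - \ldots
&= \inf_{\boldsymbol{u}^{\alpha+1}} \int_\Omega \inf_{\boldsymbol{z}^{\alpha+1}} \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1})\, dV - \ldots \\
&= \inf_{\boldsymbol{u}^{\alpha+1}} \int_\Omega W^*_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1})\, dV - \ldots,
\end{aligned}
\tag{20.10}
$$

where

$$
W^*_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}) = \inf_{\boldsymbol{z}^{\alpha+1}} \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}^{\alpha+1}) = \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}_*^{\alpha+1}), \qquad \boldsymbol{z}_*^{\alpha+1} = \arg\inf \mathcal{F}_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \cdot) \tag{20.11}
$$

is often referred to as the condensed energy density (the internal variables have been "condensed out"). Notice that the omitted terms $(\ldots)$ do not depend on the internal variables.

This finally leads to the incremental variational problem

$$
I_{\boldsymbol{z}^\alpha}[\boldsymbol{u}^{\alpha+1}] = \int_\Omega W^*_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1})\, dV - \int_\Omega \rho\boldsymbol{b}\cdot\boldsymbol{u}^{\alpha+1}\, dV - \int_{\partial\Omega_N} \hat{\boldsymbol{t}}\cdot\boldsymbol{u}^{\alpha+1}\, dS, \tag{20.12}
$$

which has the same structure as before. This concept of introducing internal variables into the variational framework is also known as variational constitutive updates and goes back to Ortiz and Stainier (2000).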



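Note that, for numerical implementation purposes, evaluation of $W^*_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1})$ always requires us to compute the updated internal variables $\boldsymbol{z}_*^{\alpha+1}$ based on $\boldsymbol{\varepsilon}^{\alpha+1}$ before the energy, stresses, or incremental stiffness matrix can be evaluated.

The stresses are now

$$
\begin{aligned}
\boldsymbol{\sigma}^{\alpha+1} &= \frac{d}{d\boldsymbol{\varepsilon}^{\alpha+1}} W^*_{\boldsymbol{z}^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}) \\
&= \frac{\partial \mathcal{F}}{\partial \boldsymbol{\varepsilon}^{\alpha+1}}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}_*^{\alpha+1}) + \frac{\partial \mathcal{F}}{\partial \boldsymbol{z}^{\alpha+1}}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}_*^{\alpha+1}) \cdot \frac{\partial \boldsymbol{z}_*^{\alpha+1}}{\partial \boldsymbol{\varepsilon}^{\alpha+1}} \\
&= \frac{\partial W}{\partial \boldsymbol{\varepsilon}^{\alpha+1}}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{z}_*^{\alpha+1}),
\end{aligned}
\tag{20.13}
$$

where the second term vanishes because $\boldsymbol{z}_*^{\alpha+1}$ renders $\mathcal{F}$ stationary by definition (and the dissipation potential does not depend on $\boldsymbol{\varepsilon}^{\alpha+1}$).

The same trick unfortunately does not apply when computing the incremental stiffness matrix,

$$
C_{ijkl}^{\alpha+1} = \frac{d\sigma_{ij}^{\alpha+1}}{d\varepsilon_{kl}^{\alpha+1}} = \frac{\partial \sigma_{ij}^{\alpha+1}}{\partial \varepsilon_{kl}^{\alpha+1}} + \frac{\partial \sigma_{ij}^{\alpha+1}}{\partial \boldsymbol{z}^{\alpha+1}} \cdot \frac{\partial \boldsymbol{z}_*^{\alpha+1}}{\partial \varepsilon_{kl}^{\alpha+1}}, \tag{20.14}
$$

where the second term does not vanish in general. It requires calculating the sensitivity of the internal variables with respect to the strain tensor components.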

20.2 Example: viscoelasticity, (visco)plasticity

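Viscoelasticity and (visco)plasticity all start with the same fundamental structure (here presented in linearized kinematics). The total strain tensor decomposes additively into elastic and inelastic (or plastic) contributions:

$$
\boldsymbol{\varepsilon} = \boldsymbol{\varepsilon}_e + \boldsymbol{\varepsilon}_p, \tag{20.15}
$$

where $\boldsymbol{\varepsilon}_p$ belongs to the set of internal variables. (Note that in finite deformations, the decomposition is multiplicative: $F = F_e F_p$.)

In case of history dependence, one introduces additional internal history variables. For example, for von Mises plasticity, the accumulated plastic strain $\epsilon_p$ (a scalar) captures the history of plastic strains $\boldsymbol{\varepsilon}_p$ (a tensor) through the coupling

$$
\dot{\epsilon}_p = \left\| \dot{\boldsymbol{\varepsilon}}_p \right\|_{\mathrm{vM}} = \sqrt{\tfrac{2}{3}\, \dot{\boldsymbol{\varepsilon}}_p \cdot \dot{\boldsymbol{\varepsilon}}_p}. \tag{20.16}
$$

With internal variables $\boldsymbol{z} = \{\boldsymbol{\varepsilon}_p, \epsilon_p\}$, the Helmholtz energy density decomposes into elastic and plastic energy:

$$
W(\boldsymbol{\varepsilon}, \boldsymbol{\varepsilon}_p, \epsilon_p) = W_{\mathrm{el}}(\boldsymbol{\varepsilon} - \boldsymbol{\varepsilon}_p) + W_{\mathrm{pl}}(\epsilon_p). \tag{20.17}
$$

In case of reversible behavior (viscoelasticity), we have no plastic energy storage, i.e., $W_{\mathrm{pl}} = 0$.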

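The dual dissipation potential can be defined, e.g., by the general power-law structure

$$
\phi^*(\dot{\epsilon}_p) = \sigma_0 \left| \dot{\epsilon}_p \right| + \frac{\tau_0\, \dot{\varepsilon}_0}{m+1} \left( \frac{\dot{\epsilon}_p}{\dot{\varepsilon}_0} \right)^{m+1} \tag{20.18}
$$

with positive constants $\sigma_0$ (initial yield stress), $\tau_0$ (hardening rate), $\dot{\varepsilon}_0$ (reference rate), and $m$ (rate sensitivity). In case of viscoelasticity we choose $\sigma_0 = 0$. By contrast, for rate-independent plasticity one chooses $\tau_0 = 0$.

The differential inclusion in (20.3) is required because of the first term, $\sigma_0 |\dot{\epsilon}_p|$, whose derivative is not defined at the origin (for $\dot{\epsilon}_p = 0$). Here, a subdifferential is required:

$$
\frac{\partial}{\partial \dot{\epsilon}_p}\, \sigma_0 \left| \dot{\epsilon}_p \right| =
\begin{cases}
\sigma_0, & \text{if } \dot{\epsilon}_p > 0, \\
-\sigma_0, & \text{if } \dot{\epsilon}_p < 0, \\
[-\sigma_0, \sigma_0], & \text{if } \dot{\epsilon}_p = 0.
\end{cases}
\tag{20.19}
$$

In the first two cases (i.e., for plastic flow), the kinetic law becomes an equality; in the third case (i.e., for elastic loading such that $|\sigma| < \sigma_0$), the differential inclusion is required.

The specific forms of $W_{\mathrm{el}}$, $W_{\mathrm{pl}}$, and $\phi^*$ depend on the particular material model.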

20.3 Example: viscoplasticity

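Since plastic/viscous deformation is observed to be isochoric, one commonly assumes

$$
\operatorname{tr}\boldsymbol{\varepsilon}_p = 0 \quad\text{so that}\quad \boldsymbol{\varepsilon}_p = \boldsymbol{e}_p. \tag{20.20}
$$

Similarly, only the deviatoric stress tensor $\boldsymbol{s}$ should cause plastic deformation. Here and in the following, we denote the deviatoric tensors by

$$
\operatorname{dev}(\cdot) = (\cdot) - \tfrac{1}{3} \operatorname{tr}(\cdot)\, \boldsymbol{I}, \quad\text{so that}\quad \boldsymbol{e} = \operatorname{dev}\boldsymbol{\varepsilon}, \quad \boldsymbol{s} = \operatorname{dev}\boldsymbol{\sigma}. \tag{20.21}
$$

Using the above definitions of energy density and dual potential, we obtain

$$
\mathcal{F}_{\boldsymbol{e}_p^\alpha, \epsilon_p^\alpha}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{e}_p^{\alpha+1}, \epsilon_p^{\alpha+1}) = W^{\alpha+1}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{e}_p^{\alpha+1}, \epsilon_p^{\alpha+1}) + \Delta t\, \phi^*\!\left( \frac{\epsilon_p^{\alpha+1} - \epsilon_p^\alpha}{\Delta t} \right). \tag{20.22}
$$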

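Minimization with respect to $\boldsymbol{e}_p^{\alpha+1}$ gives

$$
\begin{aligned}
\frac{\partial \mathcal{F}}{\partial \boldsymbol{e}_p^{\alpha+1}}
&= \frac{\partial W_{\mathrm{el}}^{\alpha+1}}{\partial \boldsymbol{e}_p^{\alpha+1}} + \frac{\partial W_{\mathrm{pl}}^{\alpha+1}}{\partial \epsilon_p^{\alpha+1}} \frac{\partial \epsilon_p^{\alpha+1}}{\partial \boldsymbol{e}_p^{\alpha+1}} + \Delta t\, \frac{\partial \phi^*}{\partial \dot{\epsilon}_p^{\alpha+1}} \frac{1}{\Delta t} \frac{\partial \epsilon_p^{\alpha+1}}{\partial \boldsymbol{e}_p^{\alpha+1}} \\
&= -\operatorname{dev}\boldsymbol{\sigma}^{\alpha+1} + \left[ \tau(\epsilon_p^{\alpha+1}) + \tau^*(\dot{\epsilon}_p^{\alpha+1}) \right] \frac{\partial \epsilon_p^{\alpha+1}}{\partial \boldsymbol{e}_p^{\alpha+1}} \ni \boldsymbol{0}
\end{aligned}
\tag{20.23}
$$

with back-stresses

$$
\tau(\epsilon_p^{\alpha+1}) = \frac{\partial W_{\mathrm{pl}}^{\alpha+1}}{\partial \epsilon_p^{\alpha+1}}, \qquad \tau^*(\dot{\epsilon}_p^{\alpha+1}) = \frac{\partial \phi^*}{\partial \dot{\epsilon}_p^{\alpha+1}}, \qquad \dot{\epsilon}_p^{\alpha+1} = \frac{\Delta\epsilon_p^{\alpha+1}}{\Delta t}. \tag{20.24}
$$

Further, note that

$$
\frac{\partial \epsilon_p^{\alpha+1}}{\partial \boldsymbol{e}_p^{\alpha+1}} = \frac{\partial \Delta\epsilon_p^{\alpha+1}}{\partial \Delta\boldsymbol{e}_p^{\alpha+1}} = \frac{\partial}{\partial \Delta\boldsymbol{e}_p^{\alpha+1}} \sqrt{\tfrac{2}{3}\, \Delta\boldsymbol{e}_p^{\alpha+1} \cdot \Delta\boldsymbol{e}_p^{\alpha+1}} = \frac{2\, \Delta\boldsymbol{e}_p^{\alpha+1}}{3\, \Delta\epsilon_p^{\alpha+1}}. \tag{20.25}
$$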



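Altogether, this results in

$$
-\boldsymbol{s}^{\alpha+1} + \left[ \tau(\epsilon_p^{\alpha+1}) + \tau^*(\dot{\epsilon}_p^{\alpha+1}) \right] \frac{2\, \Delta\boldsymbol{e}_p^{\alpha+1}}{3\, \Delta\epsilon_p^{\alpha+1}} \ni \boldsymbol{0} \tag{20.26}
$$

or

$$
\boldsymbol{s}^{\alpha+1} \in \left[ \tau(\epsilon_p^{\alpha+1}) + \tau^*(\dot{\epsilon}_p^{\alpha+1}) \right] \frac{2\, \Delta\boldsymbol{e}_p^{\alpha+1}}{3\, \Delta\epsilon_p^{\alpha+1}}. \tag{20.27}
$$

Let us consider the elastically isotropic case (i.e., $\boldsymbol{\sigma} = \kappa (\operatorname{tr}\boldsymbol{\varepsilon})\boldsymbol{I} + 2\mu\, \boldsymbol{e}_e$ with the elastic deviatoric strain $\boldsymbol{e}_e = \boldsymbol{e} - \boldsymbol{e}_p$). Assume that $\Delta\epsilon_p^{\alpha+1} > 0$, so we have

$$
2\mu \left( \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{\alpha+1} \right) = \left[ \tau(\epsilon_p^{\alpha+1}) + \sigma_0 + \tau_0 \left( \frac{\Delta\epsilon_p^{\alpha+1}}{\dot{\varepsilon}_0\, \Delta t} \right)^{m} \right] \frac{2\, \Delta\boldsymbol{e}_p^{\alpha+1}}{3\, \Delta\epsilon_p^{\alpha+1}}, \tag{20.28}
$$

and we introduce an elastic predictor (i.e., the elastic strain if the plastic strain remained unaltered)

$$
\boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} = \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{\alpha} \quad\text{such that}\quad \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{\alpha+1} = \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} - \Delta\boldsymbol{e}_p^{\alpha+1}. \tag{20.29}
$$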

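That gives

$$
\begin{aligned}
2\mu\, \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1}
&= 2\mu\, \Delta\boldsymbol{e}_p^{\alpha+1} + \frac{2}{3\, \Delta\epsilon_p^{\alpha+1}} \left[ \tau(\epsilon_p^{\alpha+1}) + \sigma_0 + \tau_0 \left( \frac{\Delta\epsilon_p^{\alpha+1}}{\dot{\varepsilon}_0\, \Delta t} \right)^{m} \right] \Delta\boldsymbol{e}_p^{\alpha+1} \\
&= \frac{2}{3\, \Delta\epsilon_p^{\alpha+1}} \left[ 3\mu\, \Delta\epsilon_p^{\alpha+1} + \tau(\epsilon_p^{\alpha+1}) + \sigma_0 + \tau_0 \left( \frac{\Delta\epsilon_p^{\alpha+1}}{\dot{\varepsilon}_0\, \Delta t} \right)^{m} \right] \Delta\boldsymbol{e}_p^{\alpha+1}
\end{aligned}
\tag{20.30}
$$

or

$$
\Delta\boldsymbol{e}_p^{\alpha+1} = \frac{2\mu\, \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1}}{2\mu + \dfrac{2}{3\, \Delta\epsilon_p^{\alpha+1}} \left( \tau(\epsilon_p^{\alpha+1}) + \sigma_0 + \tau_0 \left( \dfrac{\Delta\epsilon_p^{\alpha+1}}{\dot{\varepsilon}_0\, \Delta t} \right)^{m} \right)}. \tag{20.31}
$$

Now, let us use that

$$
\Delta\epsilon_p^{\alpha+1} = \sqrt{\tfrac{2}{3}\, \Delta\boldsymbol{e}_p^{\alpha+1} \cdot \Delta\boldsymbol{e}_p^{\alpha+1}} = \frac{2\mu \left\| \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} \right\|_{\mathrm{vM}}}{2\mu + \dfrac{2}{3\, \Delta\epsilon_p^{\alpha+1}} \left( \tau(\epsilon_p^{\alpha+1}) + \sigma_0 + \tau_0 \left( \dfrac{\Delta\epsilon_p^{\alpha+1}}{\dot{\varepsilon}_0\, \Delta t} \right)^{m} \right)}. \tag{20.32}
$$

This is a scalar equation to be solved for the increment $\Delta\epsilon_p^{\alpha+1}$, which is then inserted into (20.31) in order to determine $\Delta\boldsymbol{e}_p^{\alpha+1}$. This completes the calculation of the updated internal variables.

Notice that the above equation is equivalent to saying (introducing the von Mises stress $\sigma_{\mathrm{vM}}$)

$$
\sigma_{\mathrm{vM}} = \sqrt{\tfrac{3}{2}\, \boldsymbol{s}^{\alpha+1} \cdot \boldsymbol{s}^{\alpha+1}} = \tau(\epsilon_p^{\alpha+1}) + \tau^*(\dot{\epsilon}_p^{\alpha+1}). \tag{20.33}
$$

Analogous relations can be obtained for $\Delta\epsilon_p^{\alpha+1} < 0$.

Note that rate-independent plasticity (including the special case of von Mises plasticity) assumes that $\tau_0 = 0$; i.e., the dissipation potential only provides the initial yield threshold $\sigma_0$.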
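Multiplying (20.32) by its denominator and by 3/2 gives the equivalent scalar residual $3\mu\,\Delta\epsilon_p + \tau + \sigma_0 + \tau_0 (\Delta\epsilon_p/(\dot\varepsilon_0 \Delta t))^m - 3\mu \|e_{\mathrm{pre}}\|_{\mathrm{vM}} = 0$, which is monotone in $\Delta\epsilon_p$ and can be solved by bisection. The following Python sketch illustrates this under stated assumptions: the function name, the linear hardening law $\tau(\epsilon_p) = H \epsilon_p$, and all parameter values are hypothetical and not taken from the notes.

```python
def plastic_increment(e_pre_vm, eps_p_old, mu, sigma0, tau0, rate0, m, dt,
                      hardening=0.0):
    """Solve the scalar equation (20.32) for the accumulated plastic strain increment.

    e_pre_vm  : von Mises norm of the elastic predictor strain
    hardening : modulus H of an ASSUMED linear hardening law tau = H * eps_p
    """
    def residual(d):
        tau = hardening * (eps_p_old + d)
        return (3.0 * mu * d + tau + sigma0
                + tau0 * (d / (rate0 * dt)) ** m - 3.0 * mu * e_pre_vm)

    if residual(0.0) >= 0.0:
        return 0.0  # predictor stays below the yield threshold: elastic step
    lo, hi = 0.0, e_pre_vm  # residual(lo) < 0 <= residual(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Rate-independent, perfectly plastic example (tau0 = 0, H = 0).
d_perfect = plastic_increment(0.5, 0.0, mu=1.0, sigma0=0.3, tau0=0.0,
                              rate0=1.0, m=1.0, dt=0.1)
# Rate-dependent example with m = 1.
d_visco = plastic_increment(0.5, 0.0, mu=1.0, sigma0=0.3, tau0=0.2,
                            rate0=1.0, m=1.0, dt=0.1)
# By (20.31) and (20.32), the tensor increment is the predictor scaled by d / e_pre_vm.
scale_visco = d_visco / 0.5
print(d_perfect, d_visco, scale_visco)
```

Bisection is chosen here only for robustness; a Newton iteration on the same residual would be the usual alternative.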



20.4 Example: linear viscoelasticity

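In the simplest viscoelastic case, we take $W_{\mathrm{pl}} = 0$ and thus $\tau(\epsilon_p^{\alpha+1}) = 0$, i.e., the material has no "memory". Also, the yield threshold is removed by choosing $\sigma_0 = 0$, and the accumulated plastic strain $\epsilon_p$ is neither of interest nor needed. The dissipation is reformulated as

$$
\phi^*(\dot{\boldsymbol{e}}_p) = \frac{\eta}{2} \left\| \dot{\boldsymbol{e}}_p \right\|^2, \tag{20.34}
$$

which agrees with the plastic case above with velocity-proportional damping ($m = 1$), viscosity $\eta = \tau_0/\dot{\varepsilon}_0$, and the von Mises norm replaced by the classical vector norm.

In this case, (20.28) reduces to

$$
2\mu \left( \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} - \Delta\boldsymbol{e}_p \right) = \tau_0\, \frac{\Delta\boldsymbol{e}_p}{\dot{\varepsilon}_0\, \Delta t} = \eta\, \frac{\Delta\boldsymbol{e}_p}{\Delta t}, \tag{20.35}
$$

which can be rearranged to

$$
2\mu\, \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} = \eta\, \frac{\Delta\boldsymbol{e}_p}{\Delta t} + 2\mu\, \Delta\boldsymbol{e}_p \tag{20.36}
$$

giving

$$
2\, \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} = \left( \frac{\eta/\mu}{\Delta t} + 2 \right) \Delta\boldsymbol{e}_p. \tag{20.37}
$$

If we define the relaxation time $\tau = \eta/\mu$, then

$$
\Delta\boldsymbol{e}_p = \frac{2}{2 + \tau/\Delta t}\, \boldsymbol{e}_{\mathrm{pre}}^{\alpha+1} \tag{20.38}
$$

or

$$
\boldsymbol{e}_p^{\alpha+1} = \boldsymbol{e}_p^{\alpha} + \frac{2}{2 + \tau/\Delta t} \left( \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{\alpha} \right). \tag{20.39}
$$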

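This can be extended to the generalized Maxwell model. Let us assume isotropic elasticity with shear and bulk modulus $\mu_\infty$ and $\kappa_\infty$, respectively, while the $n$ viscoelastic branches are characterized by shear stiffnesses $\mu_i$ and viscosities $\eta_i$ (for $i = 1, \ldots, n$). The effective incremental energy density now becomes

$$
\begin{aligned}
\mathcal{F}_{\boldsymbol{e}_p^{i,\alpha}}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{e}_p^{1,\alpha+1}, \ldots, \boldsymbol{e}_p^{n,\alpha+1})
&= \frac{\kappa_\infty}{2} (\operatorname{tr}\boldsymbol{\varepsilon}^{\alpha+1})^2 + \mu_\infty\, \boldsymbol{e}^{\alpha+1} \cdot \boldsymbol{e}^{\alpha+1} + \sum_{i=1}^{n} \mu_i \left\| \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{i,\alpha+1} \right\|^2 \\
&\quad + \sum_{i=1}^{n} \frac{\eta_i}{2\Delta t} \left\| \boldsymbol{e}_p^{i,\alpha+1} - \boldsymbol{e}_p^{i,\alpha} \right\|^2,
\end{aligned}
\tag{20.40}
$$

whose minimization with respect to the new internal variables yields

$$
\boldsymbol{e}_p^{i,\alpha+1} = \boldsymbol{e}_p^{i,\alpha} + \frac{2}{2 + \tau_i/\Delta t} \left( \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{i,\alpha} \right) \tag{20.41}
$$

with relaxation times $\tau_i = \eta_i/\mu_i$ (no summation over $i$). Note that this agrees with (20.39) for a single Maxwell element.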



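Insertion and differentiation lead to the stress tensor

$$
\boldsymbol{\sigma}^{\alpha+1}(\boldsymbol{\varepsilon}^{\alpha+1}, \boldsymbol{e}_p^{1,\alpha}, \ldots, \boldsymbol{e}_p^{n,\alpha}) = \kappa_\infty (\operatorname{tr}\boldsymbol{\varepsilon}^{\alpha+1})\, \boldsymbol{I} + 2\mu_\infty\, \boldsymbol{e}^{\alpha+1} + \sum_{i=1}^{n} \frac{2\mu_i\, \tau_i/\Delta t}{2 + \tau_i/\Delta t} \left( \boldsymbol{e}^{\alpha+1} - \boldsymbol{e}_p^{i,\alpha} \right). \tag{20.42}
$$

Similarly, the consistent incremental tangent matrix can be computed by differentiating $\boldsymbol{\sigma}^{\alpha+1}$:

$$
\begin{aligned}
C_{ijkl}^{\alpha+1} = \frac{\partial \sigma_{ij}^{\alpha+1}}{\partial \varepsilon_{kl}^{\alpha+1}}
&= \left[ \kappa_\infty - \frac{2}{3} \left( \mu_\infty + \sum_{i=1}^{n} \frac{\mu_i\, \tau_i/\Delta t}{2 + \tau_i/\Delta t} \right) \right] \delta_{ij}\delta_{kl} \\
&\quad + \left( \mu_\infty + \sum_{i=1}^{n} \frac{\mu_i\, \tau_i/\Delta t}{2 + \tau_i/\Delta t} \right) \left( \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk} \right).
\end{aligned}
\tag{20.43}
$$

Finally, the condensed incremental energy density $W^*$ can be computed analytically by inserting (20.41) into (20.40).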
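The update (20.41) and the deviatoric part of the stress (20.42) can be illustrated for a single deviatoric strain component (e.g., one shear component). This is a minimal Python sketch; the function name `maxwell_step`, the two-branch parameter set, and the relaxation test are illustrative assumptions.

```python
def maxwell_step(e_new, ep_old, mu_inf, mu, eta, dt):
    """One time step of the generalized Maxwell model for ONE deviatoric component.

    Returns the deviatoric stress component and the updated internal variables.
    """
    stress = 2.0 * mu_inf * e_new
    ep_new = []
    for mu_i, eta_i, ep_i in zip(mu, eta, ep_old):
        tau_i = eta_i / mu_i                                       # relaxation time
        ep_i_new = ep_i + 2.0 / (2.0 + tau_i / dt) * (e_new - ep_i)  # update (20.41)
        ep_new.append(ep_i_new)
        stress += 2.0 * mu_i * (e_new - ep_i_new)                  # branch stress
    return stress, ep_new


# Hypothetical material with two viscoelastic branches.
mu_inf, mu, eta, dt = 1.0, [2.0, 0.5], [1.0, 5.0], 0.1
strain = 0.01

# First step of a relaxation test: the strain jumps to `strain` and is then held.
s_first, ep = maxwell_step(strain, [0.0, 0.0], mu_inf, mu, eta, dt)

# Closed-form stress of (20.42) for the same step, for comparison.
s_formula = 2.0 * mu_inf * strain + sum(
    2.0 * m_i * (h_i / m_i / dt) / (2.0 + h_i / m_i / dt) * strain
    for m_i, h_i in zip(mu, eta))

# Hold the strain: the branch stresses relax and the stress tends to 2 mu_inf e.
s_relaxed = s_first
for _ in range(2000):
    s_relaxed, ep = maxwell_step(strain, ep, mu_inf, mu, eta, dt)
print(s_first, s_formula, s_relaxed)
```

The branch stress computed from the updated internal variable coincides with the closed-form sum in (20.42), which is why no separate stress formula is needed in the loop.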



21 Dynamics

21.1 Variational setting

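The mechanical problems so far have been limited to quasistatic conditions. Let us consider the extension to dynamic problems where inertial effects matter and the strong form reads (for simplicity stated in linearized kinematics; the finite-deformation setting is analogous)

$$
\left[\;
\begin{aligned}
\sigma_{ij,j} + \rho\, b_i &= \rho\, a_i && \text{in } \Omega, \\
u_i(\boldsymbol{x}, t) &= \hat{u}_i(\boldsymbol{x}, t) && \text{on } \partial\Omega_D, \\
\sigma_{ij} n_j(\boldsymbol{x}, t) &= \hat{t}_i(\boldsymbol{x}, t) && \text{on } \partial\Omega_N.
\end{aligned}
\right.
\tag{21.1}
$$

Now, we have $\boldsymbol{u}: \Omega \times \mathbb{R} \to \mathbb{R}^d$ with sufficient differentiability in both space and time.

The variational approach here makes use of the so-called action principle which uses the action

$$
A[\boldsymbol{u}] = \int_{t_1}^{t_2} L[\boldsymbol{u}]\, dt \quad\text{with}\quad L[\boldsymbol{u}] = T[\boldsymbol{u}] - I[\boldsymbol{u}], \tag{21.2}
$$

where $I$ is the potential energy functional from before and $T$ the kinetic energy functional:

$$
T[\boldsymbol{u}] = \int_\Omega \frac{\rho}{2} \left| \dot{\boldsymbol{u}} \right|^2 dV. \tag{21.3}
$$

The action principle states that the solution $\boldsymbol{u}(\boldsymbol{x}, t)$ renders $A$ stationary with $\boldsymbol{u}(\boldsymbol{x}, t_1) = \boldsymbol{u}_1(\boldsymbol{x})$ and $\boldsymbol{u}(\boldsymbol{x}, t_2) = \boldsymbol{u}_2(\boldsymbol{x})$.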

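For variational material models (with $W$ replaced by $W^*$ for inelasticity), we have

$$
A[\boldsymbol{u}] = \int_{t_1}^{t_2} \left[ \int_\Omega \left( \frac{\rho}{2} \left| \dot{\boldsymbol{u}} \right|^2 - W(\boldsymbol{\varepsilon}) \right) dV + \int_\Omega \rho\boldsymbol{b}\cdot\boldsymbol{u}\, dV + \int_{\partial\Omega_N} \hat{\boldsymbol{t}}\cdot\boldsymbol{u}\, dS \right] dt. \tag{21.4}
$$

Taking the first variation (with the divergence theorem and the same assumptions from before):

$$
\delta A[\boldsymbol{u}] = 0 = \int_{t_1}^{t_2} \left[ \int_\Omega \left( \rho\, \dot{u}_i\, \delta\dot{u}_i - \sigma_{ij}\, \delta u_{i,j} \right) dV + \int_\Omega \rho\, b_i\, \delta u_i\, dV + \int_{\partial\Omega_N} \hat{t}_i\, \delta u_i\, dS \right] dt. \tag{21.5}
$$

The weak form is thus obtained as

$$
G(\boldsymbol{u}, \boldsymbol{v}) = \int_{t_1}^{t_2} \left[ \int_\Omega \left( \rho\, \dot{u}_i\, \dot{v}_i - \sigma_{ij}\, v_{i,j} \right) dV + \int_\Omega \rho\, b_i\, v_i\, dV + \int_{\partial\Omega_N} \hat{t}_i\, v_i\, dS \right] dt = 0 \tag{21.6}
$$

with

$$
\boldsymbol{v} \in \left\{ \boldsymbol{v} \in H^1(\Omega) : \boldsymbol{v} = \boldsymbol{0} \text{ on } \partial\Omega_D \text{ and at } t = t_1 \text{ or } t = t_2 \right\}. \tag{21.7}
$$

To avoid the time derivative in the variation, let us integrate by parts in time (the "boundary term" vanishes since $\boldsymbol{v} = \boldsymbol{0}$ at $t = t_1$ and $t = t_2$):

$$
G(\boldsymbol{u}, \boldsymbol{v}) = -\int_{t_1}^{t_2} \left[ \int_\Omega \left( \rho\, \ddot{u}_i\, v_i + \sigma_{ij}\, v_{i,j} \right) dV - \int_\Omega \rho\, b_i\, v_i\, dV - \int_{\partial\Omega_N} \hat{t}_i\, v_i\, dS \right] dt = 0. \tag{21.8}
$$

Note that, without the first term, we recover the elastostatic formulation.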

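Since in the dynamic problem the displacement field depends on time, we introduce a semi-discretization, i.e., we discretize the solution in space but not in time:

$$
\boldsymbol{u}^h(\boldsymbol{x}, t) = \sum_{a=1}^{n} \boldsymbol{u}^a(t)\, N^a(\boldsymbol{x}) \quad\text{and}\quad \boldsymbol{v}^h(\boldsymbol{x}, t) = \sum_{a=1}^{n} \boldsymbol{v}^a(t)\, N^a(\boldsymbol{x}), \tag{21.9}
$$

so that

$$
\dot{\boldsymbol{u}}^h(\boldsymbol{x}, t) = \sum_{a=1}^{n} \dot{\boldsymbol{u}}^a(t)\, N^a(\boldsymbol{x}) \quad\text{and}\quad \ddot{\boldsymbol{u}}^h(\boldsymbol{x}, t) = \sum_{a=1}^{n} \ddot{\boldsymbol{u}}^a(t)\, N^a(\boldsymbol{x}). \tag{21.10}
$$

Insertion into the weak form results in Galerkin's discrete weak form:

$$
G(\boldsymbol{u}^h, \boldsymbol{v}^h) = -\int_{t_1}^{t_2} \sum_{b=1}^{n} v_i^b \left[ \sum_{a=1}^{n} \ddot{u}_i^a \int_\Omega \rho\, N^a N^b\, dV + \int_\Omega \sigma_{ij}\, N^b_{,j}\, dV - \int_\Omega \rho\, b_i\, N^b\, dV - \int_{\partial\Omega_N} \hat{t}_i\, N^b\, dS \right] dt = 0 \tag{21.11}
$$

for all $\boldsymbol{v}^b(t)$ histories that vanish at $t_1$ and $t_2$.

Analogously to before, we now write

$$
U^h(t) = \{ \boldsymbol{u}^1(t), \ldots, \boldsymbol{u}^n(t) \}, \tag{21.12}
$$

so that solving (21.11) for all $\boldsymbol{v}^b(t)$ is equivalent to solving

$$
M \ddot{U}^h + F_{\mathrm{int}}(U^h) - F_{\mathrm{ext}}(t) = 0 \tag{21.13}
$$

with

$$
M_{ij}^{ab} = \delta_{ij} \int_\Omega \rho\, N^a N^b\, dV, \qquad F_{\mathrm{int},i}^{b} = \int_\Omega \sigma_{ij}\, N^b_{,j}\, dV, \qquad F_{\mathrm{ext},i}^{b} = \int_\Omega \rho\, b_i\, N^b\, dV + \int_{\partial\Omega_N} \hat{t}_i\, N^b\, dS. \tag{21.14}
$$

Matrix $M$ is called the consistent mass matrix.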

Examples:

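The consistent mass matrix for a two-node bar element is computed from shape functions

$$
N^1(\xi) = \frac{1 - \xi}{2}, \qquad N^2(\xi) = \frac{1 + \xi}{2}. \tag{21.15}
$$

Specifically, we have (with $m = \rho A L_e$)

$$
M^{ab} = \int_\Omega \rho\, N^a N^b\, dV = \int_{-1}^{1} \rho\, N^a N^b\, A\, \frac{L_e}{2}\, d\xi \quad\Rightarrow\quad M_{\mathrm{1D}} = \frac{m}{6} \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \tag{21.16}
$$

Note that this is the consistent mass matrix for 1D motion. If each node has two degrees of freedom $(u_1, u_2)$ in the plane, then each pair of dofs is linked by the above mass matrix, so that the total consistent mass matrix becomes

$$
M_{\mathrm{2D}} = \frac{m}{6} \begin{pmatrix} 2 & 0 & 1 & 0 \\ 0 & 2 & 0 & 1 \\ 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 2 \end{pmatrix}. \tag{21.17}
$$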
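The integral in (21.16) can be checked numerically. The Python sketch below evaluates it with a two-point Gauss rule, which is exact for the quadratic integrand, and then expands the result to two dofs per node as in (21.17). The function names and the parameter values (chosen such that $m = 6$) are illustrative assumptions.

```python
import math


def bar_mass_matrix(rho, area, length):
    """Consistent mass matrix of a two-node bar via 2-point Gauss quadrature."""
    shape = [lambda xi: 0.5 * (1.0 - xi), lambda xi: 0.5 * (1.0 + xi)]
    gauss = [(-1.0 / math.sqrt(3.0), 1.0), (1.0 / math.sqrt(3.0), 1.0)]
    jac = 0.5 * length  # dx = (L_e / 2) dxi
    return [[sum(w * rho * area * Na(xi) * Nb(xi) * jac for xi, w in gauss)
             for Nb in shape] for Na in shape]


def expand_to_2d(M1d):
    """Apply the 1D mass matrix to each of the two dofs per node independently."""
    n = len(M1d)
    M = [[0.0] * (2 * n) for _ in range(2 * n)]
    for a in range(n):
        for b in range(n):
            for i in range(2):
                M[2 * a + i][2 * b + i] = M1d[a][b]
    return M


M1d = bar_mass_matrix(rho=2.0, area=3.0, length=1.0)  # total mass m = 6
M2d = expand_to_2d(M1d)
print(M1d)
print(M2d)
```

With $m = 6$, the prefactor $m/6$ equals one, so the printed matrices should reproduce the integer patterns of (21.16) and (21.17) up to round-off.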



Similarly, the consistent mass matrix of the CST is computed by integration of the shape functions, resulting for 1D motion in

$$
M_{\mathrm{CST}} = \frac{m}{12} \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}, \tag{21.18}
$$

and, as before, the corresponding mass matrix for 2D motion is obtained by applying the above matrix for each dof independently.

In summary, the dynamic problem is quite analogous to the quasistatic one. The key difference is the first term in (21.13), which requires a strategy to obtain numerical solutions that are time-dependent. Note that the above formulation in linearized kinematics can easily be adapted for finite deformations (the final matrix equations are the same with internal/external force vectors replaced by the finite-deformation counterparts).

We note that various references call this dynamic variational principle the "principle of least/minimum action", which is in fact not correct since the solution need not be a minimizer of $A$ (it is merely guaranteed to be a stationary point).

Finally, note that in structural dynamics, one often includes velocity-proportional damping through a damping matrix $C$ such that (21.13) turns into

$$
M \ddot{U}^h + C \dot{U}^h + F_{\mathrm{int}}(U^h) - F_{\mathrm{ext}}(t) = 0 \tag{21.19}
$$

with, oftentimes, mass- and stiffness-proportional damping via

$$
C = \alpha M + \beta K, \qquad \alpha, \beta \in \mathbb{R}^+. \tag{21.20}
$$

The choice of $\alpha > 0$ controls low-frequency vibration attenuation, while $\beta > 0$ suppresses high-frequency vibrations.
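The frequency dependence can be made explicit. For a linear system with damping of the form (21.20), a mode with eigenfrequency $\omega$ has the damping ratio $\xi(\omega) = \alpha/(2\omega) + \beta\omega/2$; this is a standard result of linear modal analysis that is not derived in these notes. The Python sketch below evaluates it and fits $\alpha, \beta$ to a target ratio at two frequencies; function names and numbers are illustrative assumptions.

```python
def rayleigh_damping_ratio(omega, alpha, beta):
    """Modal damping ratio for C = alpha*M + beta*K (linear system assumed)."""
    return alpha / (2.0 * omega) + 0.5 * beta * omega


def rayleigh_coefficients(omega1, omega2, xi):
    """Coefficients that give the same damping ratio xi at omega1 and omega2."""
    alpha = 2.0 * xi * omega1 * omega2 / (omega1 + omega2)
    beta = 2.0 * xi / (omega1 + omega2)
    return alpha, beta


alpha, beta = rayleigh_coefficients(1.0, 10.0, 0.05)
xi_low = rayleigh_damping_ratio(0.1, alpha, beta)     # dominated by alpha / (2 omega)
xi_mid = rayleigh_damping_ratio(1.0, alpha, beta)     # equals the target 0.05
xi_high = rayleigh_damping_ratio(100.0, alpha, beta)  # dominated by beta * omega / 2
print(alpha, beta, xi_low, xi_mid, xi_high)
```

The mass-proportional term grows as $\omega \to 0$ and the stiffness-proportional term grows with $\omega$, which matches the statement above.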

21.2 Time-dependent solutions

The time-dependent solution for the nodal variables $U^h(t)$ is obtained either in a discrete fashion (e.g., by using finite-difference approximations in time) or in a continuous manner (e.g., by modal decomposition).

Explicit time integration:

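We discretize the solution in time, e.g., we assume constant time increments $\Delta t > 0$ and write

$$
\boldsymbol{u}^a(t^\alpha) = \boldsymbol{u}^{a,\alpha}, \qquad \Delta t = t^{\alpha+1} - t^\alpha. \tag{21.21}
$$

By using central-difference approximations, we obtain

$$
\dot{U}^h(t^\alpha) = \frac{U^{h,\alpha+1} - U^{h,\alpha-1}}{2\Delta t}, \qquad \ddot{U}^h(t^\alpha) = \frac{U^{h,\alpha+1} - 2U^{h,\alpha} + U^{h,\alpha-1}}{(\Delta t)^2}. \tag{21.22}
$$

Insertion into (21.19) leads to

$$
M\, \frac{U^{h,\alpha+1} - 2U^{h,\alpha} + U^{h,\alpha-1}}{(\Delta t)^2} + C\, \frac{U^{h,\alpha+1} - U^{h,\alpha-1}}{2\Delta t} + F_{\mathrm{int}}(U^{h,\alpha}) - F_{\mathrm{ext}}(t^\alpha) = 0, \tag{21.23}
$$

which can be reorganized into

$$
\left[ \frac{M}{(\Delta t)^2} + \frac{C}{2\Delta t} \right] U^{h,\alpha+1} = \frac{2M}{(\Delta t)^2}\, U^{h,\alpha} + \left[ \frac{C}{2\Delta t} - \frac{M}{(\Delta t)^2} \right] U^{h,\alpha-1} - F_{\mathrm{int}}(U^{h,\alpha}) + F_{\mathrm{ext}}(t^\alpha). \tag{21.24}
$$

This is an update rule for $U^{h,\alpha+1}$, using explicit time integration.

Note that stability limits the time step of the explicit scheme. Specifically, we must ensure that

$$
\Delta t \le \Delta t_{\mathrm{cr}} = \frac{2}{\omega_{\max}}, \tag{21.25}
$$

with $\omega_{\max}$ being the highest eigenfrequency.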
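The stability limit (21.25) can be observed on a single undamped oscillator, $\ddot{u} + \omega^2 u = 0$, for which (21.24) reduces to a scalar recursion. The following Python sketch is an illustration only: the function name, the unit mass, and the start-up value at $t^{-1}$ (from a backward Taylor expansion) are assumptions.

```python
import math


def central_difference(omega, dt, n_steps, u0=1.0, v0=0.0):
    """Explicit central-difference integration of u'' + omega^2 u = 0, cf. (21.24)."""
    a0 = -omega ** 2 * u0
    u_prev = u0 - dt * v0 + 0.5 * dt ** 2 * a0  # fictitious value at t^{-1}
    u = u0
    history = [u0]
    for _ in range(n_steps):
        # (21.24) with M = 1, C = 0, F_int = omega^2 u, F_ext = 0
        u_next = 2.0 * u - u_prev - dt ** 2 * omega ** 2 * u
        u_prev, u = u, u_next
        history.append(u)
    return history


omega = 2.0 * math.pi
dt_cr = 2.0 / omega  # critical time step (21.25)

stable = central_difference(omega, 0.1 * dt_cr, 200)
unstable = central_difference(omega, 1.05 * dt_cr, 200)
print(max(abs(u) for u in stable), max(abs(u) for u in unstable))
```

Below the critical step the amplitude stays bounded by its initial value; only 5% above it, the solution grows exponentially.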

Implicit time integration:

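Next, implicit time integration uses the same discretization in time but requires solving a (non)linear system of equations for $U^{\alpha+1}$.

The most popular scheme for mechanical problems is the so-called Newmark-β method, which is a combination of the linear acceleration and average acceleration schemes. Specifically, one assumes

$$
\begin{aligned}
U^{\alpha+1} &= U^{\alpha} + \Delta t\, \dot{U}^{\alpha} + \frac{(\Delta t)^2}{2} \left[ 2\beta\, \ddot{U}^{\alpha+1} + (1 - 2\beta)\, \ddot{U}^{\alpha} \right], \\
\dot{U}^{\alpha+1} &= \dot{U}^{\alpha} + \Delta t \left[ \gamma\, \ddot{U}^{\alpha+1} + (1 - \gamma)\, \ddot{U}^{\alpha} \right]
\end{aligned}
\tag{21.26}
$$

with parameters $\beta$ and $\gamma$, often chosen as

$\beta = \tfrac{1}{4}$, $\gamma = \tfrac{1}{2}$ in the average acceleration scheme,

$\beta = \tfrac{1}{6}$, $\gamma = \tfrac{1}{2}$ in the linear acceleration scheme,

$\beta = 0$, $\gamma = \tfrac{1}{2}$ returns to the explicit central-difference scheme discussed above.

Most popular is the average acceleration scheme, which is unconditionally stable for arbitrary $\Delta t > 0$.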
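For a linear, undamped single-dof system $m\ddot{u} + ku = 0$, the Newmark assumptions (21.26) combined with the equation of motion at $t^{\alpha+1}$ give a scalar equation for the new acceleration. The Python sketch below illustrates this; the function name and parameter values are assumptions, and the check relies on the known property that the average acceleration scheme conserves the energy of a linear undamped system.

```python
def newmark_sdof(m, k, u0, v0, dt, n_steps, beta=0.25, gamma=0.5):
    """Newmark-beta integration of m u'' + k u = 0 using (21.26)."""
    u, v = u0, v0
    a = -k * u / m
    for _ in range(n_steps):
        # predictors: the parts of (21.26) that only involve known quantities
        u_pred = u + dt * v + 0.5 * dt ** 2 * (1.0 - 2.0 * beta) * a
        v_pred = v + dt * (1.0 - gamma) * a
        # equation of motion at the new time: m a_new + k (u_pred + beta dt^2 a_new) = 0
        a_new = -k * u_pred / (m + beta * dt ** 2 * k)
        u = u_pred + beta * dt ** 2 * a_new
        v = v_pred + gamma * dt * a_new
        a = a_new
    return u, v


# omega = 10, so dt = 1.0 is five times the explicit limit 2 / omega.
u_end, v_end = newmark_sdof(m=1.0, k=100.0, u0=1.0, v0=0.0, dt=1.0, n_steps=1000)
energy_start = 0.5 * 100.0 * 1.0 ** 2
energy_end = 0.5 * 1.0 * v_end ** 2 + 0.5 * 100.0 * u_end ** 2
print(energy_start, energy_end)
```

Stability at such a large step does not imply accuracy: the energy is preserved, but the phase of the oscillation is badly resolved.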

Modal decomposition:

The starting point for modal decomposition of an elastic system is the Fourier representation

$$
U^h(t) = \sum_{i=1}^{n} z_i(t)\, \hat{U}_i^h, \tag{21.27}
$$

where $\hat{U}_1^h, \ldots, \hat{U}_n^h$ are the $n$ eigenmodes of the system. To identify the eigenmodes and eigenfrequencies, we assume harmonic wave motion, i.e.,

$$
U^h(t) = \hat{U}^h \exp(\mathrm{i}\omega t) \tag{21.28}
$$

and the absence of external forces, i.e., $F_{\mathrm{ext}}(t) = 0$. Next, the equations of motion are linearized for small displacements around the equilibrium ground state (with stiffness matrix $K$, i.e., the tangent matrix $T$ evaluated at the ground state), which results in

$$
\left( -\omega_i^2 M + K \right) \hat{U}_i^h = 0 \quad\text{with}\quad \omega_1 \le \ldots \le \omega_n. \tag{21.29}
$$

This is an eigenvalue problem for eigenfrequencies $\omega_i$ and associated eigenmodes $\hat{U}_i^h$. Note that the number of eigenfrequencies and eigenmodes equals the total number of degrees of freedom in the system. Every zero-energy mode contributes a zero eigenfrequency.

We normalize the eigenvectors such that (no summation over $i$)

$$
\hat{U}_{(i)}^h \cdot M\, \hat{U}_{(i)}^h = 1 \quad\text{for all } i = 1, \ldots, n, \tag{21.30}
$$

so that pre-multiplying (21.29) by $\hat{U}_i^h$ results in the Rayleigh quotient

$$
\omega_i^2 = \frac{\hat{U}_{(i)}^h \cdot K\, \hat{U}_{(i)}^h}{\hat{U}_{(i)}^h \cdot M\, \hat{U}_{(i)}^h} = \hat{U}_{(i)}^h \cdot K\, \hat{U}_{(i)}^h. \tag{21.31}
$$

Also, since the eigenvectors are orthogonal, we now have

$$
\hat{U}_i^h \cdot M\, \hat{U}_j^h = \delta_{ij}, \qquad \hat{U}_i^h \cdot K\, \hat{U}_j^h = 0 \quad\text{if } i \ne j. \tag{21.32}
$$

If we now substitute (21.27) into the linearized equations of motion (with external forces), $M\ddot{U}^h + K U^h = F_{\mathrm{ext}}$, and pre-multiply the system of equations by $\hat{U}_i^h$, then the equations decouple into

$$
\ddot{z}_{(i)} + \omega_{(i)}^2\, z_{(i)} = \hat{U}_i^h \cdot F_{\mathrm{ext}}(t), \tag{21.33}
$$

where the right-hand side is known for known body force and traction histories.
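For a system with two degrees of freedom, the eigenvalue problem (21.29) can be solved in closed form, which makes the normalization (21.30) and the orthogonality relations (21.32) easy to verify. The Python sketch below assumes a diagonal mass matrix and a symmetric stiffness matrix with nonzero coupling; the function names and the example matrices are hypothetical.

```python
import math


def modes_2dof(m1, m2, K):
    """Eigenfrequencies and mass-normalized modes for M = diag(m1, m2).

    Assumes a symmetric K with K[0][1] != 0.
    """
    k11, k12, k22 = K[0][0], K[0][1], K[1][1]
    # det(K - lam * M) = 0 is a quadratic in lam = omega^2
    a = m1 * m2
    b = -(k11 * m2 + k22 * m1)
    c = k11 * k22 - k12 ** 2
    disc = math.sqrt(b * b - 4.0 * a * c)
    lams = [(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)]
    modes = []
    for lam in lams:
        x, y = -k12, k11 - lam * m1                 # satisfies the first row of (21.29)
        norm = math.sqrt(m1 * x * x + m2 * y * y)   # mass normalization (21.30)
        modes.append([x / norm, y / norm])
    return [math.sqrt(max(lam, 0.0)) for lam in lams], modes


def bilinear(u, A, v):
    """Evaluate u . A v for nested lists."""
    return sum(u[i] * A[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


M = [[1.0, 0.0], [0.0, 1.0]]
K = [[2.0, -1.0], [-1.0, 2.0]]
omegas, modes = modes_2dof(1.0, 1.0, K)

# Modal load of (21.33) for a hypothetical force acting on the first dof only.
F_ext = [1.0, 0.0]
modal_forces = [sum(Ui * Fi for Ui, Fi in zip(mode, F_ext)) for mode in modes]
print(omegas, modes, modal_forces)
```

For larger systems the same steps apply, with the closed-form quadratic replaced by a numerical generalized eigenvalue solver.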

For many practical problems, only a limited number of modes are important (i.e., taking the first $m < n$ modes only). Therefore, numerical efficiency can be gained by truncating the Fourier sum, which is known as order reduction:

$$
U^h(t) \approx \sum_{i=1}^{m} z_i(t)\, \hat{U}_i^h, \qquad m < n. \tag{21.34}
$$



22 Updated Lagrangian framework and particle methods

So far, we have always defined shape functions in a reference configuration and computed all energies, forces, stiffness and mass matrices against the reference mesh. Mesh adaptivity (not discussed here) can also be performed in the same framework, viz. by adaptively refining or coarsening the reference mesh. While this is suitable for most applications, it becomes problematic, e.g., when very large deformation is involved, so that local neighborhoods change significantly. Another scenario where the reference mesh causes problems is fluid-structure interaction.

To this end, one can change from the Lagrangian description used so far to an updated-Lagrangian description. The central idea is to update the reference configuration repeatedly (e.g., after each converged solution), so that elements are defined in the last-converged configuration rather than in the original reference configuration. The clear benefit is that the original reference mesh loses its meaning so that large deformations and significant local neighborhood changes can be accounted for. Drawbacks are that (i) a reformulation of isoparametric elements may be required and (ii) element connectivities may have to be updated frequently.

As an alternative, particle methods have become popular which replace classical elements by a particle-like formulation, starting again with

$$
\boldsymbol{\varphi}^h(\boldsymbol{X}) = \sum_a \boldsymbol{\varphi}^a N^a(\boldsymbol{X}). \tag{22.1}
$$

Shape functions $N^a$ are no longer defined in an element-like fashion, i.e., $\operatorname{supp} N^a$ is no longer limited to adjacent elements (there simply are no elements). Integrals over the body $\Omega$ are approximated by quadrature, e.g.,

$$
F_{\mathrm{int},i}^{a} = \int_\Omega P_{iJ}\, N^a_{,J}\, dV \approx \sum_{q=1}^{n_{QP}} W_q\, P_{iJ}(\boldsymbol{X}_q)\, N^a_{,J}(\boldsymbol{X}_q). \tag{22.2}
$$

Now, degrees of freedom are defined at nodes located at $\boldsymbol{X}^a$ ($a = 1, \ldots, n$), while summations are carried out over material point locations $\boldsymbol{X}_q$ ($q = 1, \ldots, n_{QP}$). The challenge is now to define these two sets of "particles".

Also, since shape functions have local support, the above sum for each node $a$ must be carried out only over those material points $q$ whose positions $\boldsymbol{X}_q$ fall into the support of $N^a$, i.e., $N^a(\boldsymbol{X}_q) \ne 0$. This requires frequent neighborhood updates.

Example: local maximum-entropy shape functions

The width $\mathcal{U}^a$ of a shape function $a$ can be defined by an integral over the entire (convex) domain $\Omega$ as

$$
\mathcal{U}^a[N^a] = \int_\Omega N^a(\boldsymbol{X}) \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2 dV, \tag{22.3}
$$

so that a measure for the total width of shape functions within a domain $\Omega$ can be defined by

$$
\mathcal{U}[N] = \sum_a \mathcal{U}^a[N^a] = \int_\Omega \sum_a N^a(\boldsymbol{X}) \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2 dV. \tag{22.4}
$$

Maximum locality (i.e., minimum shape function support) now requires minimizing the functional $\mathcal{U}$ with respect to all shape functions $N^a(\boldsymbol{X})$, summarized as $N = (N^1, N^2, \ldots, N^n)$ for a total of $n$ nodes.

Shape functions $N^a(\boldsymbol{X})$, where $a$ denotes the corresponding node, are known to satisfy $0 \le N^a(\boldsymbol{X}) \le 1$ and $\sum_a N^a(\boldsymbol{X}) = 1$. Therefore, shape functions can also be interpreted as probability distributions whose entropy density $\mathcal{H}$ in an information-theoretical sense at point $\boldsymbol{X}$ is given by

$$
\mathcal{H}[N](\boldsymbol{X}) = -\sum_a N^a(\boldsymbol{X}) \ln N^a(\boldsymbol{X}), \tag{22.5}
$$

so that the total entropy of domain $\Omega$ is

$$
\mathcal{H}[N] = -\int_\Omega \sum_a N^a(\boldsymbol{X}) \ln N^a(\boldsymbol{X})\, dV. \tag{22.6}
$$

Achieving maximum entropy requires maximizing the functional $\mathcal{H}$ with respect to all shape functions $N^a(\boldsymbol{X})$. Maximum entropy implies minimum bias of the chosen shape functions with respect to the chosen positions of nodes in the interpolation scheme.

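Maximizing entropy while minimizing shape function support can be achieved as follows. Introducing a balance parameter $\beta > 0$, one writes

$$
\mathcal{F}[N] = \beta\, \mathcal{U}[N] - \mathcal{H}[N] = \int_\Omega \sum_a \left[ \beta N^a(\boldsymbol{X}) \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2 + N^a(\boldsymbol{X}) \ln N^a(\boldsymbol{X}) \right] dV \tag{22.7}
$$

and the optimal set of local maximum-entropy shape functions is found as

$$
N = \arg\min \mathcal{F} \tag{22.8}
$$

with a number of constraints to be discussed below. Observe that minimization in (22.7) can be performed pointwise so we can write

$$
N(\boldsymbol{X}) = \arg\min \left[ \beta\, u[N](\boldsymbol{X}) - h[N](\boldsymbol{X}) \;\middle|\; \text{constr.} \right] \tag{22.9}
$$

with densities

$$
u[N](\boldsymbol{X}) = \sum_a N^a(\boldsymbol{X}) \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2, \qquad h[N](\boldsymbol{X}) = -\sum_a N^a(\boldsymbol{X}) \ln N^a(\boldsymbol{X}). \tag{22.10}
$$

The constraints are given by

$$
N^a(\boldsymbol{X}) \ge 0 \quad \forall\, a = 1, \ldots, n, \quad \forall\, \boldsymbol{X} \in \Omega, \tag{22.11a}
$$

$$
\sum_a N^a(\boldsymbol{X}) = 1 \quad \forall\, \boldsymbol{X} \in \Omega, \tag{22.11b}
$$

$$
\sum_a N^a(\boldsymbol{X})\, \boldsymbol{X}^a = \boldsymbol{X} \quad \forall\, \boldsymbol{X} \in \Omega. \tag{22.11c}
$$

The first constraint ensures non-negative shape functions (negative values of shape functions define a different class of higher-order maximum-entropy interpolation schemes). The second constraint enforces zeroth-order consistency and ensures the correct interpolation of constant functions. The third constraint enforces first-order consistency and guarantees the correct interpolation of affine functions (these together ensure consistency of the interpolation scheme, i.e., convergence of solutions with increasing mesh h-refinement).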



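The solution to the above minimization problem was found as

$$
N^a(\boldsymbol{X}) = \frac{1}{Z(\boldsymbol{X}, \boldsymbol{\lambda}^*(\boldsymbol{X}))} \exp\left[ -\beta \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2 + \boldsymbol{\lambda}^*(\boldsymbol{X}) \cdot (\boldsymbol{X} - \boldsymbol{X}^a) \right], \tag{22.12}
$$

where we have defined the partition function $Z: \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ as

$$
Z(\boldsymbol{X}, \boldsymbol{\lambda}) = \sum_a \exp\left[ -\beta \left\| \boldsymbol{X} - \boldsymbol{X}^a \right\|^2 + \boldsymbol{\lambda} \cdot (\boldsymbol{X} - \boldsymbol{X}^a) \right]
$$

and

$$
\boldsymbol{\lambda}^*(\boldsymbol{X}) = \arg\min_{\boldsymbol{\lambda} \in \mathbb{R}^d} \ln Z(\boldsymbol{X}, \boldsymbol{\lambda}). \tag{22.13}
$$

Using the above local maximum-entropy shape functions in an explicit dynamic setting can be carried out using the so-called Optimal Transportation Meshfree (OTM) scheme.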
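In one dimension, (22.12) and (22.13) can be evaluated with a few lines of code. The Python sketch below minimizes $\ln Z$ over the scalar multiplier $\lambda$ by Newton's method; its gradient is $\sum_a N^a (X - X^a)$, so the minimizer automatically enforces the first-order consistency condition (22.11c). The function name, node set, and choice $\beta = 1.8/h^2$ are illustrative assumptions.

```python
import math


def maxent_shape_functions(x, nodes, beta, tol=1e-12, max_iter=50):
    """Local max-ent shape functions at point x in 1D, cf. (22.12) and (22.13)."""
    lam = 0.0
    shape = []
    for _ in range(max_iter):
        weights = [math.exp(-beta * (x - xa) ** 2 + lam * (x - xa)) for xa in nodes]
        Z = sum(weights)                     # partition function
        shape = [w / Z for w in weights]     # candidate shape functions
        # first and second derivative of ln Z with respect to lam
        grad = sum(Na * (x - xa) for Na, xa in zip(shape, nodes))
        hess = sum(Na * (x - xa) ** 2 for Na, xa in zip(shape, nodes)) - grad ** 2
        if abs(grad) < tol:
            break
        lam -= grad / hess                   # Newton step on the convex function ln Z
    return shape


nodes = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical node positions, spacing h = 0.25
beta = 1.8 / 0.25 ** 2
x = 0.3
N = maxent_shape_functions(x, nodes, beta)

partition_of_unity = sum(N)
first_moment = sum(Na * xa for Na, xa in zip(N, nodes))
print(N, partition_of_unity, first_moment)
```

Larger values of $\beta$ produce more localized shape functions, while $\beta \to 0$ spreads them over all nodes; the sketch is only intended for points inside the convex hull of the nodes.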



23 Error estimates and adaptivity

Solving systems of PDEs by the finite element method introduces numerous sources of errors that one should be aware of:

(i) The discretization error (also known as the first fundamental error) arises from discretizing the domain into elements of finite size $h$. As a result, the body $\Omega$ is not represented correctly and the model (e.g., the outer boundary) may not match the true boundary $\partial\Omega$ (e.g., think of approximating a circular domain $\Omega$ by CST or Q4 elements). This error can be reduced by mesh refinement (and we discussed r-refinement, h-refinement, p-refinement, and hp-refinement).

(ii) The numerical integration error results from the application of numerical quadrature:

$$
\int_{\Omega_e} f(\xi)\, d\xi \cong \sum_{k=1}^{n_{QP}} W_k\, f(\xi_k). \tag{23.1}
$$

We discussed that for $f \in C^{k+1}(\Omega)$ (the extension to higher dimensions is analogous)

$$
\left| \int_{-1}^{1} f(\xi)\, d\xi - \sum_{q=1}^{n_{QP}} W_q\, f(\xi_q) \right| \le C \left\| \Omega \right\| h^{k+1} \max_{\substack{\xi \in [-1,1] \\ |\alpha| = k+1}} \left\| D^\alpha f(\xi) \right\|. \tag{23.2}
$$

Hence, the numerical integration error depends on the smoothness of the integrand and calls for a proper choice of the integration order.

(iii) The solution error stems from numerically solving linear systems $T U^h = F$. In general, the accuracy of the solution depends on the condition number of the matrix,

$$
\kappa = \left\| T \right\| \cdot \left\| T^{-1} \right\| = \left| \frac{\lambda_{\max}}{\lambda_{\min}} \right| \tag{23.3}
$$

with $\lambda_{\max}$ ($\lambda_{\min}$) being the largest (smallest) eigenvalue of $T$. The higher the condition number, the larger the numerical error.
A practical consequence is the guideline to choose the units of model parameters wisely (such as material constants, domain size features, etc.). For example, when performing a linear elastic simulation, it is advisable to normalize elastic constants (e.g., by 100 GPa) rather than assigning, e.g., $E = 210 \cdot 10^9$; instead, use $E = 2.1$ and know that your results will be in units of 100 GPa.

(iv) As discussed before, an approximation error is introduced by approximating the functional space $\mathcal{U}$ (in which to find the solution $u(x)$) by a finite-dimensional subspace $\mathcal{U}^h \subset \mathcal{U}$.
We showed that for an interpolation of order $k$ and $u \in H^{k+1}(\Omega)$:

$$
\left| u^h - u \right|_{H^1(\Omega)} \le \frac{h^k}{\pi^k} \left| u \right|_{H^{k+1}(\Omega)} \quad\text{and}\quad \left\| u^h - u \right\|_{H^1(\Omega)} \le c\, h^k \left| u \right|_{H^{k+1}(\Omega)}. \tag{23.4}
$$

Thus the error is again bounded by the smoothness of the function to be interpolated, and it is expected to decrease with decreasing element size (as $h \to 0$), the faster the higher the interpolation order.
Note that special caution is required if stress concentrations of any kind are to be represented (e.g., imagine a linear elastic fracture problem and the issues arising from using linear elements to capture the $1/r^n$-type stress concentration near the crack tip).

(v) A truncation error is made by every computer when storing and operating on numeric values with only a finite number of digits (e.g., floats, doubles, etc.). This is unavoidable and one should be aware of what this error is (especially when choosing, e.g., solver tolerances). Choosing a solver tolerance in itself produces truncation error because we content ourselves with a solution $U^h$ that satisfies $F_{\mathrm{int}}(U^h) - F_{\mathrm{ext}} = \text{tol.}$ (instead of being zero).

(vi) Finally, no simulation is free of modeling error, which refers to the large collection of errors made by the selection of the analytical model to be solved (before starting any numerical approximation). For example, we make choices for an appropriate material model, choose material parameters, boundary conditions, and geometric simplifications including reductions to lower dimensions (e.g., plane strain or plane stress instead of a 3D simulation).

The sum of all of the above error sources makes up the numerical error inherent in every simulation.



A Numerical Implementation

The theoretical derivations are implemented within our in-house C++ finite element code, whose general structure is explained in the following. Details should be extracted from the source files. The code structure is schematically shown in Fig. 1.

material model:

The material model computes

⋅ $W = W(\nabla u)$
⋅ $P_{iJ} = P_{iJ}(\nabla u)$ or $\sigma_{ij} = \sigma_{ij}(\nabla u)$
⋅ $C_{iJkL} = C_{iJkL}(\nabla u)$ or $C_{ijkl}(\nabla u)$

Depending on the (finite/linearized) model, the strain is

$F = I + \nabla u$ or $\varepsilon = \tfrac{1}{2}\left( \nabla u + \nabla u^{\mathrm{T}} \right)$

In addition, internal variables must be updated:

$z^{\alpha+1} = \arg\inf \mathcal{F}_{z^\alpha}(\nabla u, t)$

The MaterialModel class implements:

⋅ computeEnergy($\nabla u$, $z^\alpha$, $t$) → $W^*$
⋅ computeStresses($\nabla u$, $z^\alpha$, $t$) → $P_{iJ}$ or $\sigma_{ij}$
⋅ computeTangentMatrix($\nabla u$, $z^\alpha$, $t$) → $C_{iJkL}$

The respective strain tensor is provided by

⋅ computeStrain($\nabla u$) → $F_{iJ}$ or $\varepsilon_{ij}$

which can, of course, also be called within the element to conveniently compute stresses, etc.

Internal variables are updated by

⋅ updateInternalVariables($\nabla u$, $z^\alpha$, $t$) → $z^{\alpha+1}$ (if no update, simply return $z^\alpha$)

element:

The element computes

⋅ $I_e \cong \sum_{k=1}^{n_{QP}} W_k\, W(\nabla u(\xi_k))\, J(\xi_k)$
⋅ $(F_{\mathrm{int},e})_i^a \cong \sum_{k=1}^{n_{QP}} W_k\, P_{iJ}(\xi_k)\, N^a_{,J}(\xi_k)\, J(\xi_k)$
⋅ $(T_{\mathrm{int},e})_{ik}^{ab} \cong \sum_{k=1}^{n_{QP}} W_k\, C_{iJkL}(\xi_k)\, N^a_{,J}(\xi_k)\, N^b_{,L}(\xi_k)\, J(\xi_k)$

The Element class implements:

⋅ computeEnergy($U_e^h$, $t$) → $I_e$, where $U_e^h = \{ u_e^1, \ldots, u_e^n \}$
⋅ computeForces($U_e^h$, $t$) → $\{ F_{\mathrm{int},e}^1, \ldots, F_{\mathrm{int},e}^n \}$
⋅ computeStiffnessMatrix($U_e^h$, $t$) → $(T_{\mathrm{int},e})_{ik}^{ab}$

Note that the element has a MaterialModel, which is used to compute $W$, $P_{iJ}$, and $C_{iJkL}$ from $\nabla u$, $z^\alpha$, and time $t$.

The element stores and updates $z$:

⋅ updateInternalVariables($U_e^h$, $t$): update $z^\alpha \leftarrow z^{\alpha+1}$ by calling the MaterialModel.



assembler:

The assembly procedure calculates

⋅ I(Uh) =neAe=1Ie(Uh

e )

⋅ Fint(Uh) =neAe=1Fint,e(Uh

e )

⋅ Tint(Uh) =neAe=1Tint,e(Uh

e )

These are the global quantities derived from localelement quantities.

The Assembler class implements:

⋅ assembleEnergy(Uh, t) → I

where Uh = u1, . . . ,un

⋅ assembleForces(Uh, t) → Fint −Fext

⋅ assembleStiffnessMatrix(Uh, t) → Tint

The assembler calls all elements to request theircontributions, then assembles those into the globalmatrices.

The assembler also implements

⋅ updateInternalVariables(Uh, t)

which asks each element to update its internalvariables (usually called at the end of a con-verged solver step).

Finally, note that material models and elements use Voigt notation for all stiffness matrices.
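The division of labor between material model, element, and assembler can be illustrated by a minimal sketch for a 1D linear elastic bar. This is not the course code: the class names, the snake_case method names, the single-point quadrature, and all numerical values are assumptions chosen for illustration only.

```python
import numpy as np

class LinearElastic1D:
    """Material model (assumed): W = E/2 * (du/dX)^2, no internal variables."""
    def __init__(self, E):
        self.E = E
    def compute_energy(self, grad_u):
        return 0.5 * self.E * grad_u**2
    def compute_stress(self, grad_u):
        return self.E * grad_u
    def compute_tangent(self, grad_u):
        return self.E

class BarElement:
    """Two-node bar element with one quadrature point (xi = 0, weight 2)."""
    def __init__(self, X_nodes, material, area=1.0):
        length = X_nodes[1] - X_nodes[0]
        self.material = material
        self.dN = np.array([-1.0, 1.0]) / length  # shape function derivatives N^a_{,X}
        self.weight = 2.0                          # quadrature weight W_k
        self.jac = 0.5 * length * area             # Jacobian J(xi_k) times cross-section
    def compute_forces(self, U_e):
        P = self.material.compute_stress(self.dN @ U_e)
        return self.weight * P * self.dN * self.jac
    def compute_stiffness(self, U_e):
        C = self.material.compute_tangent(self.dN @ U_e)
        return self.weight * C * np.outer(self.dN, self.dN) * self.jac

class Assembler:
    """Loops over all elements and adds their contributions to global arrays."""
    def __init__(self, elements, connectivity, n_dofs):
        self.elements, self.connectivity, self.n_dofs = elements, connectivity, n_dofs
    def assemble_forces(self, U, F_ext):
        F = -np.asarray(F_ext, dtype=float)        # returns F_int - F_ext
        for element, conn in zip(self.elements, self.connectivity):
            F[conn] += element.compute_forces(U[conn])
        return F
    def assemble_stiffness(self, U):
        T = np.zeros((self.n_dofs, self.n_dofs))
        for element, conn in zip(self.elements, self.connectivity):
            T[np.ix_(conn, conn)] += element.compute_stiffness(U[conn])
        return T

# Mesh: three nodes, two elements; node 0 is fixed, node 2 carries a tip force.
nodes = np.array([0.0, 0.5, 1.0])
connectivity = [[0, 1], [1, 2]]
material = LinearElastic1D(E=200.0)
elements = [BarElement(nodes[c], material) for c in connectivity]
assembler = Assembler(elements, connectivity, n_dofs=3)

# Solver: one Newton step on F_int(U) - F_ext = 0 (exact for a linear problem).
F_ext = np.array([0.0, 0.0, 10.0])
U = np.zeros(3)
free = [1, 2]
residual = assembler.assemble_forces(U, F_ext)
T = assembler.assemble_stiffness(U)
U[free] -= np.linalg.solve(T[np.ix_(free, free)], residual[free])
```

The solver only ever talks to the assembler, the assembler only to the elements, and each element only to its material model, which mirrors the layering described above.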

[Figure: schematic of a mesh (nodes, connectivity) containing an element Ωe with local nodes 1–4 and global nodes 18, 19, 32, 30, together with its quadrature rule, quadrature points and material model, the assembler, the solver, and essential BCs.]

Figure 1: Illustration of the overall code structure.


Ae/AM/CE/ME 214b – Computational Solid Mechanics, Winter 2017 (March 7, 2017)

Prof. D. M. Kochmann, Caltech

Computational Solid Mechanics – Part II

(Ae/AM/CE/ME 214b)

Dennis M. Kochmann

Division of Engineering and Applied Science, California Institute of Technology


Copyright © 2016 by Dennis M. Kochmann



23 Microstructure & Unit Cells

We will consider problems in which the macroscale response of a body depends on features on the microscale of the material the body is made of. Here and in the following, macro and micro do not refer to particular length scales but solely to the larger and smaller scale of concern.

A key assumption in most models is that the material is statistically homogeneous, i.e., the statistical properties of the microstructure do not change from point to point of the material at the macroscale. Statistical properties include, e.g., averages such as volume averages but also higher-order statistical data such as two-point correlation or n-point correlation data. If a material is not statistically homogeneous, it is called statistically inhomogeneous (e.g., consider a composite with a concentration gradient).

When discussing microstructures, we often consider some notion of a unit cell (UC), and we will need average quantities computed by averaging over the volume V = ∣Ω∣ of a UC Ω. Thus, we define

\[
\langle\cdot\rangle_\Omega = \frac{1}{V}\int_\Omega (\cdot)\,dV. \tag{23.1}
\]

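Notice that we define the average with respect to the undeformed configuration; this average may not agree with its deformed counterpart since volumes change with deformation. To see this, observe that, with J = det F and deformed volume v = ∣ϕ(Ω)∣,

\[
\langle\cdot\rangle_{\varphi(\Omega)} = \frac{1}{v}\int_{\varphi(\Omega)} (\cdot)\,dv
= \frac{\int_{\varphi(\Omega)} (\cdot)\,dv}{\int_{\varphi(\Omega)} dv}
= \frac{\int_\Omega (\cdot)\circ\varphi\; J\,dV}{\int_\Omega J\,dV}
= \frac{\langle (\cdot)\circ\varphi\; J\rangle_\Omega}{\langle J\rangle_\Omega}
\neq \langle\cdot\rangle_\Omega \tag{23.2}
\]

in general (unless the volume does not change). In the following, we will always average with respect to the undeformed configuration and thus omit the subscript Ω (in linearized kinematics there is no difference anyway).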
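The difference between the two averages in (23.2) can be seen in a small numerical sketch. The unit cell is assumed to consist of two sub-volumes of equal reference size; the field values and volume ratios J are made up for illustration.

```python
import numpy as np

def reference_average(f, dV):
    """<f>_Omega: volume average over the undeformed configuration, cf. (23.1)."""
    return np.sum(f * dV) / np.sum(dV)

def deformed_average(f, J, dV):
    """<f>_phi(Omega) = <f J>_Omega / <J>_Omega, cf. (23.2)."""
    return np.sum(f * J * dV) / np.sum(J * dV)

f = np.array([1.0, 3.0])     # field value in each sub-volume (assumed)
J = np.array([1.0, 2.0])     # local volume ratio det F in each sub-volume (assumed)
dV = np.array([0.5, 0.5])    # reference volumes

avg_ref = reference_average(f, dV)      # 2.0
avg_def = deformed_average(f, J, dV)    # 7/3: the expanded sub-volume gains weight
```

With J = 1 in both sub-volumes the two functions return the same value, which is the volume-preserving special case mentioned above.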

23.1 Ensembles and Representative Volume Elements

Microstructures often display either randomness or periodicity, or combinations thereof.

A periodic microstructure admits the identification of a unit cell, i.e., the simplest (smallest) repeating substructure so that the complete microstructure results from periodic repetition of the unit cell. The size of the unit cell matches the length scale of periodicity. For periodic systems, the choice of the unit cell is never unique: its size is fixed, but its exact location within the microstructure is not, since the system displays translational invariance.

If we generate random microstructures (e.g., by randomly arranging spherical particles of a given size and volume fraction in a matrix material), then each random microstructure creation results in a different configuration or realization. A set of multiple realizations is called an ensemble of realizations. Assume that each configuration produces a response which generally differs between realizations (e.g., some averaged quantity).



The ensemble average of a response over a set of N realizations is defined by

\[
\langle\!\langle\cdot\rangle\!\rangle_N = \frac{1}{N}\sum_{i=1}^{N} (\cdot)_i. \tag{23.3}
\]

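For an increasing number of realizations, the law of large numbers from probability theory tells us that this sequence converges for a truly random generation process (with finite mean), i.e.,

\[
\lim_{N\to\infty} \langle\!\langle\cdot\rangle\!\rangle_N = \langle\!\langle\cdot\rangle\!\rangle_\infty. \tag{23.4}
\]

Also, by the central limit theorem, the ensemble averages ⟪⋅⟫N are approximately Gaussian-distributed about this limit for large N (assuming independent realizations with finite variance).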

An alternative strategy for reaching a statistical limit is sample enlargement, i.e., increasing the size of the unit cell while keeping microstructural feature sizes constant (e.g., for a constant particle size and volume fraction, increase the unit cell size). With enlargement, the response of the unit cell typically also converges. Let us denote by L the unit cell size and by l a characteristic microstructural size (e.g., particle diameter). Then we can define another limit by

\[
\langle\cdot\rangle_\infty = \lim_{L/l\to\infty} \langle\cdot\rangle_{\Omega(L)}. \tag{23.5}
\]

Note that there is no reason for the unit cell to be cuboidal; it can in principle have any shape, and the shape loses significance as the sample size increases, i.e., in the limit L/l → ∞.

Note that the convergence by sample enlargement vs. ensemble averaging at fixed size is not equally uniform (ensemble averaging typically leads to smoother convergence). However, we should have that

⟨⋅⟩∞ = ⟪ ⟨⋅⟩∞ ⟫∞. (23.6)

Of course, we would ideally want an infinitely large sample, for which averaged quantities do not differ anymore between realizations. Such a sample is called statistically representative, and a single analysis would be sufficient with such a sample to reveal the effective behavior of the microstructure. This is generally called a representative volume element (RVE). For all practical purposes, we cannot work with infinite RVE sizes but will choose a finite RVE size whose response is as close to the infinite limit as needed. For example, one can find the required number of realizations N or the size L of an RVE by checking for a specific quantity if the relative change satisfies

\[
\frac{\left|\langle\!\langle\cdot\rangle\!\rangle_{N+1} - \langle\!\langle\cdot\rangle\!\rangle_N\right|}{\left|\langle\!\langle\cdot\rangle\!\rangle_N\right|} \le \varepsilon,
\qquad
\frac{\left|\langle\cdot\rangle_{i+1} - \langle\cdot\rangle_i\right|}{\left|\langle\cdot\rangle_i\right|} \le \varepsilon, \tag{23.7}
\]

where ⟨⋅⟩i = ⟨⋅⟩Ω(Li) denotes a sequence of increasing sample sizes with limi→∞ Li = ∞, and ε > 0 is a tolerance. That is, one finds a sample size that is sufficiently large for accuracy, so that a single calculation based on the identified RVE is sufficient to extract the effective material behavior.
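A minimal sketch of the ensemble criterion in (23.7) is given below. The function name and the sample values are hypothetical; the relative change is measured with absolute values, and a single small increment is of course only a heuristic indicator of convergence.

```python
import numpy as np

def realizations_needed(samples, tol):
    """Smallest N with |avg_{N+1} - avg_N| / |avg_N| <= tol, or None if never reached."""
    running = np.cumsum(samples) / np.arange(1, len(samples) + 1)  # ensemble averages
    for N in range(1, len(samples)):
        # running[N - 1] is the average over N realizations, running[N] over N + 1
        rel_change = abs(running[N] - running[N - 1]) / abs(running[N - 1])
        if rel_change <= tol:
            return N
    return None

# Hypothetical responses (e.g., an effective modulus) of five realizations.
samples = [10.0, 12.0, 11.0, 11.0, 11.0]
N_required = realizations_needed(samples, tol=0.05)
```

The same loop applies to sample enlargement if `samples` is replaced by the sequence of unit cell averages for increasing sizes Li and the running average is dropped.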

If the microstructure is periodic, then a single unit cell is sufficient in most cases to serve as the RVE (exceptions include, e.g., bifurcations or instabilities of any kind).

In the following, we will assume that a suitable RVE for computational purposes has been identified and is used for subsequent calculations and analysis. For mechanical problems, the key fields to be averaged are stresses and strains, for which we can derive specific averaging theorems.



24 Averaging Theorems

For mechanical problems, we will have to compute average mechanical quantities from RVEs. Therefore, the following averaging theorems will be most helpful.

24.1 Averaging Theorems in Linearized Kinematics

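Consider an RVE Ω with volume V = ∣Ω∣. The average infinitesimal strain tensor is computed as

\[
\langle\varepsilon_{ij}\rangle = \frac{1}{V}\int_\Omega \varepsilon_{ij}\,dV
= \frac{1}{2V}\int_\Omega (u_{i,j} + u_{j,i})\,dV
= \frac{1}{2V}\int_{\partial\Omega} (u_i n_j + u_j n_i)\,dS \tag{24.1}
\]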

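or, symbolically,

\[
\langle\boldsymbol{\varepsilon}\rangle = \frac{1}{V}\int_{\partial\Omega} \operatorname{sym}(\mathbf{u}\otimes\mathbf{n})\,dS
\qquad\text{or shorter}\qquad
\langle\boldsymbol{\varepsilon}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{u}\odot\mathbf{n}\,dS, \tag{24.2}
\]

where we adopted the mathematical notation

\[
\mathbf{a}\odot\mathbf{b} = \operatorname{sym}(\mathbf{a}\otimes\mathbf{b}) = \tfrac{1}{2}\left(\mathbf{a}\otimes\mathbf{b} + \mathbf{b}\otimes\mathbf{a}\right). \tag{24.3}
\]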

Note that in case of discontinuous displacement fields, the boundary integral must account for displacement jumps across interfaces. If no such discontinuities exist, we say the composite phases are perfectly bonded (this also implies that tractions are continuous across interfaces). Otherwise, we have

\[
\langle\boldsymbol{\varepsilon}\rangle = \frac{1}{V}\int_{\partial\Omega} \operatorname{sym}(\mathbf{u}\otimes\mathbf{n})\,dS
+ \frac{1}{V}\int_{C} \operatorname{sym}([[\mathbf{u}]]\otimes\mathbf{n})\,dS, \tag{24.4}
\]

where C is an interior interface of discontinuity, [[⋅]] = (⋅)+ − (⋅)− denotes the jump across C, and n is chosen to point from the + to the − side of the interface.
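For a continuous displacement field, the average strain theorem (24.2) can be checked numerically. The sketch below assumes a unit square RVE (V = 1) and a made-up smooth displacement field; the boundary integral is evaluated edge by edge with Gauss-Legendre quadrature and compared with the volume average computed by hand.

```python
import numpy as np

def average_strain_from_boundary(u, n_gauss=4):
    """(1/V) * integral of sym(u (x) n) over the boundary of the unit square, V = 1."""
    xi, w = np.polynomial.legendre.leggauss(n_gauss)
    s, w = 0.5 * (xi + 1.0), 0.5 * w          # map the rule from [-1, 1] to [0, 1]
    # each edge: (parametrization t -> point on the edge, outward unit normal)
    edges = [
        (lambda t: (t, 0.0), np.array([0.0, -1.0])),   # bottom
        (lambda t: (1.0, t), np.array([1.0, 0.0])),    # right
        (lambda t: (t, 1.0), np.array([0.0, 1.0])),    # top
        (lambda t: (0.0, t), np.array([-1.0, 0.0])),   # left
    ]
    eps = np.zeros((2, 2))
    for point, normal in edges:
        for sk, wk in zip(s, w):
            un = np.outer(u(*point(sk)), normal)
            eps += wk * 0.5 * (un + un.T)
    return eps

# Assumed displacement field u = (x^2 y, x y); its gradient is [[2xy, x^2], [y, x]],
# so the volume average of the symmetric gradient is [[1/2, 5/12], [5/12, 1/2]].
u = lambda x, y: np.array([x**2 * y, x * y])
eps_avg = average_strain_from_boundary(u)
eps_expected = np.array([[0.5, 5.0 / 12.0], [5.0 / 12.0, 0.5]])
```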

Analogously, the average stress tensor is derived by exploiting the relation

σij = σikδjk = σikxj,k, (24.5)

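so that (by using linear momentum balance and tractions ti = σik nk)

\[
\begin{aligned}
\langle\sigma_{ij}\rangle &= \frac{1}{V}\int_\Omega \sigma_{ik}x_{j,k}\,dV
= \frac{1}{V}\left[\int_{\partial\Omega} \sigma_{ik}x_j n_k\,dS - \int_\Omega \sigma_{ik,k}x_j\,dV\right] \\
&= \frac{1}{V}\left[\int_{\partial\Omega} t_i x_j\,dS - \int_\Omega \rho(a_i - b_i)x_j\,dV\right].
\end{aligned} \tag{24.6}
\]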

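This is equivalent to saying

\[
\langle\boldsymbol{\sigma}\rangle = \frac{1}{V}\left[\int_{\partial\Omega} \mathbf{t}\otimes\mathbf{x}\,dS - \int_\Omega \rho(\mathbf{a} - \mathbf{b})\otimes\mathbf{x}\,dV\right]. \tag{24.7}
\]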

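In case of quasistatic processes and in the absence of body forces we thus have

\[
\langle\boldsymbol{\sigma}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{t}\otimes\mathbf{x}\,dS = \frac{1}{V}\int_{\partial\Omega} \mathbf{x}\otimes\mathbf{t}\,dS, \tag{24.8}
\]

where the latter form results from the fact that σ is by definition symmetric (note that [[t]] = 0 in equilibrium in case of internal interfaces).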



24.2 Averaging Theorems in Finite Kinematics

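Analogously, in finite deformations the average deformation gradient tensor in an RVE Ω is

\[
\langle F_{iJ}\rangle = \frac{1}{V}\int_\Omega \varphi_{i,J}\,dV = \frac{1}{V}\int_{\partial\Omega} \varphi_i N_J\,dS \tag{24.9}
\]

or

\[
\langle\mathbf{F}\rangle = \frac{1}{V}\int_{\partial\Omega} \boldsymbol{\varphi}\otimes\mathbf{N}\,dS. \tag{24.10}
\]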

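Analogous to the linearized setting, the average first Piola-Kirchhoff stress tensor is derived by exploiting the relation

\[
P_{iJ} = P_{iK}\delta_{JK} = P_{iK}X_{J,K}, \tag{24.11}
\]

so that (by using linear momentum balance and tractions Ti = PiK NK)

\[
\begin{aligned}
\langle P_{iJ}\rangle &= \frac{1}{V}\int_\Omega P_{iK}X_{J,K}\,dV
= \frac{1}{V}\left[\int_{\partial\Omega} P_{iK}X_J N_K\,dS - \int_\Omega P_{iK,K}X_J\,dV\right] \\
&= \frac{1}{V}\left[\int_{\partial\Omega} T_i X_J\,dS - \int_\Omega R(A_i - B_i)X_J\,dV\right].
\end{aligned} \tag{24.12}
\]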

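Using symbolic notation, the above is equivalent to

\[
\langle\mathbf{P}\rangle = \frac{1}{V}\left[\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS - \int_\Omega R(\mathbf{A} - \mathbf{B})\otimes\mathbf{X}\,dV\right]. \tag{24.13}
\]

Under quasistatic conditions and in the absence of body forces, we thus have

\[
\langle\mathbf{P}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS. \tag{24.14}
\]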



25 Numerical Evaluation of RVE Averages

25.1 Average Deformation Gradient

When the above averaging theorems are used in the FE context, it is convenient to find their approximate analogues within the discrete FE representation. For example, let us begin with the average deformation gradient in an RVE, for which we showed that

\[
\langle\mathbf{F}\rangle = \frac{1}{V}\int_{\partial\Omega} \boldsymbol{\varphi}\otimes\mathbf{N}\,dS. \tag{25.1}
\]

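The approximate numerical counterpart reads

\[
\begin{aligned}
\langle\mathbf{F}^h\rangle &= \frac{1}{V}\sum_e \int_{\partial\Omega'_e} \sum_{a=1}^{n} \boldsymbol{\varphi}^a_e N^a(\mathbf{X})\otimes\mathbf{N}\,dS \\
&= \frac{1}{V}\sum_e \sum_{a=1}^{n} \boldsymbol{\varphi}^a_e \otimes \int_{\partial\Omega'_e} N^a(\mathbf{X})\,\mathbf{N}\,dS,
\end{aligned} \tag{25.2}
\]

where ∂Ω′e ⊂ ∂Ω denotes the external part of the boundary associated with element e, N^a is the shape function of node a, and N is the outward unit normal. Consider, e.g., elements that interpolate linearly on the boundary (e.g., simplicial elements or (bi/tri)linear elements) and an RVE geometry with planar faces. In this case N = Ne = const. across elements on the same face, so that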

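\[
\langle\mathbf{F}^h\rangle = \frac{1}{V}\sum_e \sum_{a=1}^{n} \boldsymbol{\varphi}^a_e \otimes \mathbf{N}_e \int_{\partial\Omega'_e} N^a(\mathbf{X})\,dS
= \frac{1}{V}\sum_{a=1}^{n} \boldsymbol{\varphi}^a \otimes \frac{1}{d}\sum_e \mathbf{N}^a_e A_e, \tag{25.3}
\]

where N^a_e denotes the outward unit normal at node a as seen from element e (for corner or edge nodes, two or three such normals exist per node, respectively, and all must be accounted for within all adjacent elements). The factor 1/d in d dimensions stems from the integration of the shape function on the boundary (it applies to boundary facets with d nodes, i.e., two-node edges in 2D and three-node triangular faces in 3D). Ae denotes the element external boundary area (in 3D) or external edge length (in 2D). If we introduce an effective nodal normal \( \bar{\mathbf{N}}^a \), then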

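\[
\langle\mathbf{F}^h\rangle = \frac{1}{V}\sum_{a=1}^{n} \boldsymbol{\varphi}^a \otimes \bar{\mathbf{N}}^a
\qquad\text{with}\qquad
\bar{\mathbf{N}}^a = \frac{1}{d}\sum_e \mathbf{N}^a_e A_e. \tag{25.4}
\]

Thus, (25.1) has been transformed into a discrete sum over the deformed nodal positions of all surface nodes, which can easily be evaluated for a given RVE and state of deformation.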
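The discrete sum (25.4) can be sketched in a few lines for a 2D RVE whose boundary is interpolated linearly (d = 2). The function name and the input convention (boundary nodes listed counterclockwise as a closed polygon) are assumptions made for this illustration.

```python
import numpy as np

def average_deformation_gradient(X, phi):
    """Evaluate (25.4) in 2D.

    X, phi: (n, 2) arrays with reference and deformed positions of the boundary
    nodes, ordered counterclockwise along the RVE boundary.
    """
    X_next = np.roll(X, -1, axis=0)
    tangent = X_next - X                               # edge a runs from node a to a+1
    # outward normal times edge length, N_e * A_e (tangent rotated by -90 degrees)
    normal_times_length = np.column_stack([tangent[:, 1], -tangent[:, 0]])
    # effective nodal normal: half of each of the two adjacent edges (factor 1/d)
    N_bar = 0.5 * (normal_times_length + np.roll(normal_times_length, 1, axis=0))
    # reference area V of the polygon (shoelace formula)
    V = 0.5 * np.sum(X[:, 0] * X_next[:, 1] - X_next[:, 0] * X[:, 1])
    return np.einsum("ai,aJ->iJ", phi, N_bar) / V

# Rectangular 2 x 1 RVE with six boundary nodes and an affine test deformation.
X = np.array([[0, 0], [1, 0], [2, 0], [2, 1], [1, 1], [0, 1]], dtype=float)
F0 = np.array([[1.1, 0.2], [0.0, 0.9]])
phi = X @ F0.T                                         # phi^a = F0 X^a
F_avg = average_deformation_gradient(X, phi)
```

For the affine test deformation the sum reproduces F0, as expected from the continuous theorem; interior nodes never enter the evaluation.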

25.2 Average Stress Tensor

Analogous to the previous section, we can derive a discrete version of the averaging theorem for the stress tensor, starting with

\[
\langle\mathbf{P}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS. \tag{25.5}
\]

Recall that surface tractions were related to external forces applied to nodes on the surface via

\[
\mathbf{F}^a = \int_{\partial\Omega} \mathbf{T}\,N^a\,dS. \tag{25.6}
\]



Also, for isoparametric elements we may write for each element (including those on the surface)

\[
\mathbf{X} = \sum_{a=1}^{n} \mathbf{X}^a_e N^a_e(\mathbf{X}). \tag{25.7}
\]

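Applying both to (25.5) yields

\[
\langle\mathbf{P}^h\rangle = \frac{1}{V}\sum_e \int_{\partial\Omega'_e} \mathbf{T}\otimes\sum_{a=1}^{n} \mathbf{X}^a_e N^a_e(\mathbf{X})\,dS
= \frac{1}{V}\sum_{a=1}^{n}\sum_e \int_{\partial\Omega'_e} \mathbf{T}\,N^a_e\,dS \otimes \mathbf{X}^a. \tag{25.8}
\]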

If we now introduce the total force F^a applied to each boundary node, then

\[
\langle\mathbf{P}^h\rangle = \frac{1}{V}\sum_a \mathbf{F}^a\otimes\mathbf{X}^a
\qquad\text{with}\qquad
\mathbf{F}^a = \sum_e \int_{\partial\Omega'_e} \mathbf{T}\,N^a_e\,dS. \tag{25.9}
\]

Note that the nodal force F^a contains contributions from all adjacent boundary edges/surfaces (therefore the sum over all elements). Again, we have transformed the continuous averaging theorem, (25.5), into a discretized version to be evaluated in FE simulations.
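As a sketch of (25.9), the following snippet evaluates the average stress from nodal forces on a unit square RVE with four corner nodes. The nodal forces are not taken from an FE solve; they are assumed to be the consistent nodal forces of a uniform traction T = P0 N with linear boundary interpolation, which makes the expected result ⟨P⟩ = P0 easy to verify. All names and values are hypothetical.

```python
import numpy as np

def average_first_pk_stress(X, F_nodal, V):
    """Evaluate (25.9): <P^h> = (1/V) * sum_a F^a (x) X^a over boundary nodes."""
    return np.einsum("ai,aJ->iJ", F_nodal, X) / V

# Unit square, corner nodes in counterclockwise order.
X = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
# Effective nodal normals: each corner receives half of its two adjacent unit edges.
N_bar = 0.5 * np.array([[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]])

P0 = np.array([[5.0, 1.0], [2.0, -3.0]])       # assumed uniform stress
F_nodal = N_bar @ P0.T                         # F^a = P0 N_bar^a
P_avg = average_first_pk_stress(X, F_nodal, V=1.0)
```

Because the nodal forces sum to zero, the result does not depend on the choice of origin for the nodal positions.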

25.3 Linearized Kinematics

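The analogous relations can be derived for linearized kinematics. Without derivation, we mention here that the averaging theorem for the infinitesimal strain tensor,

\[
\langle\boldsymbol{\varepsilon}\rangle = \frac{1}{V}\int_{\partial\Omega} \operatorname{sym}(\mathbf{u}\otimes\mathbf{n})\,dS, \tag{25.10}
\]

can be treated analogously to (25.4), which leads to

\[
\langle\boldsymbol{\varepsilon}^h\rangle = \frac{1}{V}\sum_{a=1}^{n} \operatorname{sym}(\mathbf{u}^a\otimes\bar{\mathbf{n}}^a)
\qquad\text{with}\qquad
\bar{\mathbf{n}}^a = \frac{1}{d}\sum_e \mathbf{n}^a_e A_e. \tag{25.11}
\]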

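Similarly, starting with

\[
\langle\boldsymbol{\sigma}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{t}\otimes\mathbf{x}\,dS, \tag{25.12}
\]

by analogy to (25.9) we arrive at the discrete version of the average stress theorem

\[
\langle\boldsymbol{\sigma}^h\rangle = \frac{1}{V}\sum_a \mathbf{F}^a\otimes\mathbf{x}^a
\qquad\text{with}\qquad
\mathbf{F}^a = \sum_e \int_{\partial\Omega'_e} \mathbf{t}\,N^a_e\,dS. \tag{25.13}
\]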



26 Homogenization Problem

Assuming a statistically homogeneous microstructure, the mechanical homogenization problem can be summarized as follows. We formulate the problem in finite kinematics and can easily deduce the linearized version later if needed.

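Let us first consider a heterogeneous body with microstructure so the constitutive relation changes from point to point, i.e., P = P(X, F), as does the mass density R(X). Therefore, the heterogeneous problem reads

\[
\left[\;
\begin{aligned}
&\text{find } \boldsymbol{\varphi}(\mathbf{X}, t) \text{ s.t.} \\
&\operatorname{div}\mathbf{P}(\mathbf{X}, \mathbf{F}(\mathbf{X})) + R(\mathbf{X})\mathbf{B}(\mathbf{X}) = R(\mathbf{X})\mathbf{A}(\mathbf{X}) && \text{in } \Omega, \\
&\boldsymbol{\varphi}(\mathbf{X}) = \hat{\boldsymbol{\varphi}}(\mathbf{X}) && \text{on } \partial\Omega_D, \\
&\mathbf{T}(\mathbf{X}) = \mathbf{P}(\mathbf{X})\mathbf{N}(\mathbf{X}) = \hat{\mathbf{T}}(\mathbf{X}) && \text{on } \partial\Omega_N, \\
&\mathbf{P} = \mathbf{P}(\mathbf{X}, \mathbf{F}).
\end{aligned}
\right. \tag{26.1}
\]

Here, hats denote prescribed boundary data.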

Because of the small-scale variations in material properties, stress and strain fields are highly oscillatory, so the solution is challenging to find.

Assume that the fields of interest (e.g., stresses and strains) on the microscale vary over a characteristic length scale l and the macroscale features (e.g., dimensions of body Ω, variation of the prescribed boundary tractions and deformations, and body forces B) are of characteristic length scale L. If

\[
\varepsilon = \frac{l}{L} \ll 1, \tag{26.2}
\]

then we may assume what is known as a separation of scales.

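If a separation of scales applies, we may take the limit ε → 0. For a statistically homogeneous microstructure, this results in an effective material with constant, homogeneous material properties such as constant mass density R(X) = R∗ and a constant constitutive response P(X, F) = P∗(F). We generally label all effective quantities by an asterisk (⋅)∗. This defines the homogenized problem:

\[
\left[\;
\begin{aligned}
&\text{find } \boldsymbol{\varphi}^*(\mathbf{X}, t) \text{ s.t.} \\
&\operatorname{div}\mathbf{P}^*(\mathbf{F}^*(\mathbf{X})) + R^*\mathbf{B}(\mathbf{X}) = R^*\mathbf{A}^*(\mathbf{X}) && \text{in } \Omega, \\
&\boldsymbol{\varphi}^*(\mathbf{X}) = \hat{\boldsymbol{\varphi}}(\mathbf{X}) && \text{on } \partial\Omega_D, \\
&\mathbf{T}^*(\mathbf{X}) = \mathbf{P}^*(\mathbf{X})\mathbf{N}(\mathbf{X}) = \hat{\mathbf{T}}(\mathbf{X}) && \text{on } \partial\Omega_N, \\
&\mathbf{P}^* = \mathbf{P}^*(\mathbf{F}^*).
\end{aligned}
\right. \tag{26.3}
\]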

The solution ϕ∗ is smooth compared to the solution ϕ of the heterogeneous problem. Note that we assume a local form of the constitutive relations, which may not be possible in general due to, e.g., discontinuous stress or strain fields at material interfaces at the micro-level. However, when using the FE method, we solve the problem with relaxed continuity requirements anyway, and we assume that the possibly discontinuous micro-fields translate into smooth macro-fields.

Note that one can extend the above homogenized problem to statistically inhomogeneous microstructures, in which case we have an inhomogeneous effective material with P∗ = P∗(X, F∗), but the variation in X is of O(L) and not O(l).



Solving the above homogenized problem can be accomplished by the traditional FE techniques established previously in this course. However, in order to solve the homogenized problem, we first need to find the effective density R∗ as well as the effective constitutive relation P∗ = P∗(F∗). We generally identify the effective response from an RVE that was deemed sufficient to find (an approximation of) the limit ⟨⋅⟩∞ by a single test, independent of the particular microstructural realization. Precisely that is the objective of computational homogenization, to be discussed in the following.



27 Micro-to-Macro Transition

27.1 Effective Material Properties

By conservation of mass, we can find the effective mass density via

R∗V = ∫ΩR(X)dV ⇒ R∗ = ⟨R⟩ (27.1)

Analogously, we define the effective stress and strain tensors as the RVE-averaged quantities, viz.

P ∗ = ⟨P ⟩ and F ∗ = ⟨F ⟩ (27.2)

In order to find an effective constitutive relation we must relate these two averages. That is, we must apply some boundary conditions to the RVE to induce stress and strain fields. For given BCs, one can solve the equilibrium equations within the RVE and use the above averaging theorems to identify the average quantities. Then, one defines the effective constitutive relation as the best functional form that relates P∗ and F∗.

It is important to note that this choice of relating P∗ to F∗ is not unique. One could alternatively use other stress or strain measures, which results in different effective relations because generally, e.g.,

⟨P ⟩ = ⟨FS⟩ ≠ ⟨F ⟩ ⟨S⟩. (27.3)

In the following, we stick to our choice of P∗ and F∗ as stress and strain measures in order to extract the effective response. In linearized kinematics this is, of course, not an issue and one can uniquely use ε∗ and σ∗.
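The non-uniqueness expressed by (27.3) is easy to reproduce with a two-phase toy example. The sketch below assumes two phases of equal volume fraction with piecewise-constant F and S; all numbers are invented for illustration and do not represent an equilibrated RVE solution.

```python
import numpy as np

# Piecewise-constant fields in two phases of equal volume fraction (assumed values).
F_phases = np.array([np.diag([1.1, 1.0]), np.diag([1.3, 1.0])])    # deformation gradients
S_phases = np.array([np.diag([10.0, 0.0]), np.diag([30.0, 0.0])])  # second PK stresses
weights = np.array([0.5, 0.5])

def avg(A):
    """Volume average over the phases."""
    return np.einsum("p,pij->ij", weights, A)

P_phases = F_phases @ S_phases                   # P = F S, phase by phase
avg_P = avg(P_phases)                            # <P> = <F S>
product = avg(F_phases) @ avg(S_phases)          # <F><S>
# 11-components: <F S> = (1.1*10 + 1.3*30)/2 = 25, whereas <F><S> = 1.2*20 = 24.
```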

In order to avoid the influence of body forces (which is legitimate if body forces vary on the macroscale), we will solve the RVE problem in the absence of body forces. Also, we will assume that the effective constitutive behavior is an intrinsic material property that is not affected by the boundary value problem to be solved. This implies that, if dynamic behavior is present, the wavelength λ of any dynamic behavior is on the order of the macroscale, λ ∼ L (and not λ ∼ l). This allows us to include inertial effects on the macroscale, so that the RVE problem to be solved does not include inertial effects. Overall, this implies that the homogenization problem to be solved is the quasistatic one,

divP (X,F ) = 0 in Ω or divσ(x,ε) = 0 in Ω (27.4)

In order to identify suitable BCs to be applied to the RVE, we need the following concepts.

27.2 The Hill-Mandel Condition

Instead of solving the strong form, computational techniques usually prefer to solve the weak form, which derives from the first variation of the associated variational problem, as discussed in Part I. Since we seek an effective macroscale formulation that is equivalent to the full-resolution microscale problem, we postulate that the averaged first variation on the microscale,

\[
\frac{\delta I}{V} = \frac{1}{V}\int_\Omega \delta W\,dV = \frac{1}{V}\int_\Omega \frac{\partial W}{\partial\mathbf{F}}\cdot\delta\mathbf{F}(\mathbf{X})\,dV = \langle\mathbf{P}(\mathbf{X})\cdot\delta\mathbf{F}(\mathbf{X})\rangle, \tag{27.5}
\]



equals the effective pointwise variation on the macroscale (per unit volume), which is

\[
\delta W^* = \mathbf{P}^*\cdot\delta\mathbf{F}^*. \tag{27.6}
\]

Equating the two yields the so-called Hill-Mandel condition for micro-macro equivalence:

P ∗ ⋅ δF ∗ = ⟨P (X) ⋅ δF (X)⟩ (27.7)

The above can also be interpreted as the equivalence of micro- and macro-power by writing alternatively

\[
\mathbf{P}^*\cdot\dot{\mathbf{F}}^* = \langle\mathbf{P}(\mathbf{X})\cdot\dot{\mathbf{F}}(\mathbf{X})\rangle. \tag{27.8}
\]

For simplicity of notation, we will omit the variation in the following. Thus, we write in finite and infinitesimal kinematics, respectively,

P ∗ ⋅F ∗ = ⟨P (X) ⋅F (X)⟩ and σ∗ ⋅ ε∗ = ⟨σ(x) ⋅ ε(x)⟩. (27.9)

Note that in small strains, the relation admits a clear energetic interpretation as it relates the microscale and macroscale strain energy densities (for finite deformations, the equivalence must be interpreted with regard to the weak form, the first variation, or virtual work).

We now must find BCs that ensure the above energetic equivalence. There are three possible types of BCs that do exactly that:

(i) affine displacement BCs which enforce Dirichlet boundary conditions

x = F0X on ∂Ω. (27.10)

(ii) uniform traction BCs which enforce Neumann boundary conditions

T = P0N on ∂Ω. (27.11)

(iii) periodic BCs which enforce periodic displacements and anti-periodic tractions on opposite boundaries,

x+ = x− + F0(X+ − X−) and T+ = −T− on ∂Ω. (27.12)

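In order to verify that those boundary conditions satisfy the Hill-Mandel condition, let us write the equivalence criterion by expanding stresses and strains into averages plus fluctuation fields (denoted by a tilde):

\[
\mathbf{P} = \langle\mathbf{P}\rangle + \tilde{\mathbf{P}}, \qquad \mathbf{F} = \langle\mathbf{F}\rangle + \tilde{\mathbf{F}}. \tag{27.13}
\]

Taking averages of these two forms indicates that

\[
\langle\tilde{\mathbf{P}}\rangle = \mathbf{0}, \qquad \langle\tilde{\mathbf{F}}\rangle = \mathbf{0}, \tag{27.14}
\]

i.e., the fluctuation fields must have zero averages (this is what defines a fluctuation field).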

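This leads to

\[
\begin{aligned}
\langle\mathbf{P}\cdot\mathbf{F}\rangle &= \langle(\langle\mathbf{P}\rangle + \tilde{\mathbf{P}})\cdot(\langle\mathbf{F}\rangle + \tilde{\mathbf{F}})\rangle \\
&= \langle\langle\mathbf{P}\rangle\cdot\langle\mathbf{F}\rangle\rangle + \langle\tilde{\mathbf{P}}\cdot\langle\mathbf{F}\rangle\rangle + \langle\langle\mathbf{P}\rangle\cdot\tilde{\mathbf{F}}\rangle + \langle\tilde{\mathbf{P}}\cdot\tilde{\mathbf{F}}\rangle \\
&= \langle\mathbf{P}\rangle\cdot\langle\mathbf{F}\rangle + \langle\tilde{\mathbf{P}}\rangle\cdot\langle\mathbf{F}\rangle + \langle\mathbf{P}\rangle\cdot\langle\tilde{\mathbf{F}}\rangle + \langle\tilde{\mathbf{P}}\cdot\tilde{\mathbf{F}}\rangle \\
&= \langle\mathbf{P}\rangle\cdot\langle\mathbf{F}\rangle + \langle\tilde{\mathbf{P}}\cdot\tilde{\mathbf{F}}\rangle.
\end{aligned} \tag{27.15}
\]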
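The split (27.15) is a purely algebraic identity and holds for any pair of fields, which a short numerical sketch confirms. The fields below are random, piecewise-constant samples on equal sub-volumes (an assumption for illustration, not an RVE solution), so the fluctuation term is generally nonzero here; only suitable boundary conditions on an equilibrated RVE make it vanish.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells = 1000                                    # equal-volume sub-cells (assumed)
P = rng.normal(size=(n_cells, 2, 2))              # stand-in stress samples
F = np.eye(2) + 0.1 * rng.normal(size=(n_cells, 2, 2))

def avg(A):
    """Volume average over equal sub-volumes."""
    return A.mean(axis=0)

def ddot(A, B):
    """Double contraction A . B = A_iJ B_iJ (works on stacks of tensors)."""
    return np.sum(A * B, axis=(-2, -1))

P_fluc, F_fluc = P - avg(P), F - avg(F)           # fluctuation fields, (27.13)
lhs = ddot(P, F).mean()                           # <P . F>
rhs = ddot(avg(P), avg(F)) + ddot(P_fluc, F_fluc).mean()
```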



Hill-Mandel requires that ⟨P ⋅ F⟩ = ⟨P⟩ ⋅ ⟨F⟩, which implies that the boundary conditions must guarantee that the fluctuations satisfy

\[
\langle\tilde{\mathbf{P}}\cdot\tilde{\mathbf{F}}\rangle = 0. \tag{27.16}
\]

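The following relation, obtained from the divergence theorem, will be helpful in analyzing the different boundary condition cases:

\[
\begin{aligned}
\langle\mathbf{P}\cdot\mathbf{F}\rangle &= \frac{1}{V}\int_\Omega P_{iJ}\varphi_{i,J}\,dV
= \frac{1}{V}\left[\int_{\partial\Omega} P_{iJ}N_J\varphi_i\,dS - \int_\Omega P_{iJ,J}\varphi_i\,dV\right] \\
&= \frac{1}{V}\left[\int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS - \int_\Omega \operatorname{Div}\mathbf{P}\cdot\boldsymbol{\varphi}\,dV\right] \\
&= \frac{1}{V}\left[\int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS - \int_\Omega R(\mathbf{A} - \mathbf{B})\cdot\boldsymbol{\varphi}\,dV\right].
\end{aligned} \tag{27.17}
\]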

Again, we are assuming that phases are perfectly bonded; otherwise, discontinuities must be accounted for by additional interface integrals. Also, as discussed above, the RVE problem is solved without body forces and inertial effects, so that

\[
\langle\mathbf{P}\cdot\mathbf{F}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS. \tag{27.18}
\]



28 RVE Boundary Conditions

Next, we consider the three types of BCs listed above and verify that they satisfy the Hill-Mandel equivalence condition.

28.1 Uniform Traction BCs

Let us first consider uniform tractions applied to the entire boundary of the RVE, i.e.,

T = P0N on ∂Ω, (28.1)

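where P0 is a yet unknown constant stress tensor. Using the average stress theorem for quasistatic problems and in the absence of body forces and discontinuities gives

\[
\langle\mathbf{P}\rangle = \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS
= \frac{1}{V}\int_{\partial\Omega} \mathbf{P}_0\mathbf{N}\otimes\mathbf{X}\,dS
= \frac{\mathbf{P}_0}{V}\int_{\partial\Omega} \mathbf{N}\otimes\mathbf{X}\,dS
= \frac{\mathbf{P}_0}{V}\int_\Omega \operatorname{Grad}\mathbf{X}\,dV = \mathbf{P}_0, \tag{28.2}
\]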

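where we used that

\[
\int_\Omega X_{I,J}\,dV = \delta_{IJ}\int_\Omega dV \qquad\text{so that}\qquad \int_\Omega \operatorname{Grad}\mathbf{X}\,dV = V\,\mathbf{I}. \tag{28.3}
\]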

This indicates that P0 is nothing but the average stress inside the RVE, which offers a convenient way to impose average RVE stresses via uniform traction BCs.

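Now, let us turn to the Hill-Mandel condition and observe that, using (27.17) in its quasistatic form without body forces,

\[
\begin{aligned}
\langle\mathbf{P}\cdot\mathbf{F}\rangle &= \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS
= \frac{1}{V}\int_{\partial\Omega} \mathbf{P}_0\mathbf{N}\cdot\boldsymbol{\varphi}\,dS
= \frac{1}{V}\int_{\partial\Omega} P^0_{iJ}N_J\varphi_i\,dS \\
&= \frac{P^0_{iJ}}{V}\int_{\partial\Omega} N_J\varphi_i\,dS
= \mathbf{P}_0\cdot\frac{1}{V}\int_{\partial\Omega} \boldsymbol{\varphi}\otimes\mathbf{N}\,dS
= \mathbf{P}_0\cdot\langle\mathbf{F}\rangle,
\end{aligned} \tag{28.4}
\]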

where we used identity (24.10) to arrive at the last expression. This implies that uniform traction BCs with P0 = ⟨P⟩ guarantee that

⟨P ⋅F ⟩ = P0 ⋅ ⟨F ⟩ = ⟨P ⟩ ⋅ ⟨F ⟩, (28.5)

which confirms that the Hill-Mandel condition is satisfied.

28.2 Affine Displacement BCs

Next, consider the case of affine displacements imposed across the entire RVE boundary, i.e.,

ϕ = F0X on ∂Ω (28.6)

with some yet unknown constant tensor F0. Using the average strain theorem (24.10) shows that

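\[
\langle\mathbf{F}\rangle = \frac{1}{V}\int_{\partial\Omega} \boldsymbol{\varphi}\otimes\mathbf{N}\,dS
= \frac{1}{V}\int_{\partial\Omega} \mathbf{F}_0\mathbf{X}\otimes\mathbf{N}\,dS
= \frac{\mathbf{F}_0}{V}\int_{\partial\Omega} \mathbf{X}\otimes\mathbf{N}\,dS = \mathbf{F}_0. \tag{28.7}
\]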



In analogy to the above uniform traction case, this result indicates that F0 is the average deformation gradient inside the RVE, which offers a convenient way to impose average RVE deformation gradients via affine displacement BCs.

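Hill-Mandel can be verified by

\[
\begin{aligned}
\langle\mathbf{P}\cdot\mathbf{F}\rangle &= \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS
= \frac{1}{V}\int_{\partial\Omega} \mathbf{T}\cdot\mathbf{F}_0\mathbf{X}\,dS
= \frac{1}{V}\int_{\partial\Omega} T_i F^0_{iJ}X_J\,dS \\
&= F^0_{iJ}\,\frac{1}{V}\int_{\partial\Omega} T_i X_J\,dS
= \mathbf{F}_0\cdot\frac{1}{V}\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS
= \mathbf{F}_0\cdot\langle\mathbf{P}\rangle,
\end{aligned} \tag{28.8}
\]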

where the last step resulted from the average stress theorem (24.13) (for quasistatic problems and in the absence of body forces and discontinuities).

Therefore, affine displacement BCs with F0 = ⟨F ⟩ guarantee that

⟨P ⋅F ⟩ = F0 ⋅ ⟨P ⟩ = ⟨F ⟩ ⋅ ⟨P ⟩, (28.9)

which confirms that the Hill-Mandel condition is satisfied.

28.3 Periodic BCs

So far we have seen that uniform traction BCs offer a convenient way to impose average RVE stresses, whereas affine displacement BCs can conveniently be used to enforce average RVE deformation gradients. Both types of BCs have shortcomings when considering the physics of the problem. Uniform tractions result in a response that is generally too compliant (they produce a lower bound), while affine displacements result in a response that is too stiff (they produce an upper bound, to be discussed later). Also, uniform traction BCs do not impose any deformation on the RVE, so that the displacements on the RVE boundary will be neither matching nor linearly distributed; by contrast, affine displacement boundary conditions result in a “tile-able” deformation, but the tractions on opposite RVE boundaries are not matching and tractions are not uniform.

As an alternative, periodic BCs offer a compromise. We decompose the boundary into two parts ∂Ω+ and ∂Ω− so that ∂Ω = ∂Ω+ ∪ ∂Ω−. Each point X+ on ∂Ω+ is linked to a unique point X− on ∂Ω−, and the outward normal vectors at those points satisfy N− = −N+. Periodic BCs are then defined by

ϕ+ −ϕ− = F0(X+ −X−) and T + = −T − on ∂Ω. (28.10)

That is, displacements are forced to be periodic and tractions are forced to be anti-periodic. By comparison, affine displacement BCs only enforce periodic displacements, whereas uniform traction BCs only enforce anti-periodic tractions. Notice that for practical purposes we may also write u+ − u− = (F0 − I)(X+ − X−).

Periodic BCs ensure that, if the RVE is in equilibrium with BCs (28.10), the “tiled” solution of a periodic microstructure is also in equilibrium (since tractions are anti-periodic) and geometrically matching (since displacements are periodic); this also implies that stress and strain fields are continuous and periodic. This reveals that periodic BCs are ideally suited to simulate bodies with periodic microstructure whose unit cell coincides with the RVE. In practice, one often uses periodic BCs also for problems whose microstructure is not periodic.
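To make the displacement part of (28.10) concrete, the sketch below pairs opposite boundary nodes of an axis-aligned rectangular RVE and evaluates the residual u+ − u− − (F0 − I)(X+ − X−) for each pair. The function names, the brute-force pairing by coordinate matching, and the test field (an affine part plus a periodic fluctuation) are assumptions for illustration; corner nodes appear in more than one pair, so the constraint set produced here is redundant.

```python
import numpy as np

def periodic_pairs(X, tol=1e-9):
    """Index pairs (plus, minus) of matching nodes on right/left and top/bottom edges."""
    (x_min, y_min), (x_max, y_max) = X.min(axis=0), X.max(axis=0)
    pairs = []
    for minus, Xm in enumerate(X):
        for plus, Xp in enumerate(X):
            left_right = (abs(Xm[0] - x_min) < tol and abs(Xp[0] - x_max) < tol
                          and abs(Xm[1] - Xp[1]) < tol)
            bottom_top = (abs(Xm[1] - y_min) < tol and abs(Xp[1] - y_max) < tol
                          and abs(Xm[0] - Xp[0]) < tol)
            if left_right or bottom_top:
                pairs.append((plus, minus))
    return pairs

def periodicity_residuals(X, u, F0, pairs):
    """Residual of u+ - u- = (F0 - I)(X+ - X-) for every node pair."""
    H = F0 - np.eye(2)
    return np.array([u[p] - u[m] - H @ (X[p] - X[m]) for p, m in pairs])

# 3 x 3 grid of nodes on the unit square.
grid = np.linspace(0.0, 1.0, 3)
X = np.array([[x, y] for y in grid for x in grid])
F0 = np.array([[1.05, 0.10], [0.00, 0.98]])

# Affine displacement plus a fluctuation that is periodic over the unit cell.
fluctuation = 0.01 * np.cos(2 * np.pi * X[:, :1]) * np.cos(2 * np.pi * X[:, 1:])
u = X @ (F0 - np.eye(2)).T + fluctuation * np.array([1.0, -1.0])

pairs = periodic_pairs(X)
residuals = periodicity_residuals(X, u, F0, pairs)
```

A field with a non-periodic fluctuation would leave nonzero residuals, which is exactly what a constraint implementation has to drive to zero.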



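In order to identify F0, let us use the average strain theorem (24.10) and make use of the boundary decomposition and N− = −N+:

\[
\begin{aligned}
V\langle\mathbf{F}\rangle &= \int_{\partial\Omega} \boldsymbol{\varphi}\otimes\mathbf{N}\,dS
= \int_{\partial\Omega^+} \boldsymbol{\varphi}^+\otimes\mathbf{N}^+\,dS + \int_{\partial\Omega^-} \boldsymbol{\varphi}^-\otimes\mathbf{N}^-\,dS \\
&= \int_{\partial\Omega^+} \left[\boldsymbol{\varphi}^- + \mathbf{F}_0(\mathbf{X}^+ - \mathbf{X}^-)\right]\otimes\mathbf{N}^+\,dS + \int_{\partial\Omega^-} \boldsymbol{\varphi}^-\otimes\mathbf{N}^-\,dS \\
&= \int_{\partial\Omega^+} \left[\boldsymbol{\varphi}^- + \mathbf{F}_0(\mathbf{X}^+ - \mathbf{X}^-)\right]\otimes\mathbf{N}^+\,dS - \int_{\partial\Omega^+} \boldsymbol{\varphi}^-\otimes\mathbf{N}^+\,dS \\
&= \int_{\partial\Omega^+} \mathbf{F}_0(\mathbf{X}^+ - \mathbf{X}^-)\otimes\mathbf{N}^+\,dS \\
&= \int_{\partial\Omega^+} \mathbf{F}_0\mathbf{X}^+\otimes\mathbf{N}^+\,dS - \int_{\partial\Omega^+} \mathbf{F}_0\mathbf{X}^-\otimes\mathbf{N}^+\,dS \\
&= \int_{\partial\Omega^+} \mathbf{F}_0\mathbf{X}^+\otimes\mathbf{N}^+\,dS + \int_{\partial\Omega^-} \mathbf{F}_0\mathbf{X}^-\otimes\mathbf{N}^-\,dS \\
&= \int_{\partial\Omega} \mathbf{F}_0\mathbf{X}\otimes\mathbf{N}\,dS = \mathbf{F}_0\int_{\partial\Omega} \mathbf{X}\otimes\mathbf{N}\,dS = V\,\mathbf{F}_0.
\end{aligned} \tag{28.11}
\]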

This proves that ⟨F⟩ = F0, so that we have identified another way to impose average deformation gradients.

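Next, we will verify that the periodic BCs (28.10) satisfy the Hill-Mandel condition:

\[
\begin{aligned}
V\langle\mathbf{P}\cdot\mathbf{F}\rangle &= \int_{\partial\Omega} \mathbf{T}\cdot\boldsymbol{\varphi}\,dS
= \int_{\partial\Omega^+} \mathbf{T}^+\cdot\boldsymbol{\varphi}^+\,dS + \int_{\partial\Omega^-} \mathbf{T}^-\cdot\boldsymbol{\varphi}^-\,dS \\
&= \int_{\partial\Omega^+} \mathbf{T}^+\cdot\left[\boldsymbol{\varphi}^- + \mathbf{F}_0(\mathbf{X}^+ - \mathbf{X}^-)\right]dS - \int_{\partial\Omega^+} \mathbf{T}^+\cdot\boldsymbol{\varphi}^-\,dS \\
&= \int_{\partial\Omega^+} \mathbf{T}^+\cdot\mathbf{F}_0(\mathbf{X}^+ - \mathbf{X}^-)\,dS \\
&= \int_{\partial\Omega^+} \mathbf{T}^+\cdot\mathbf{F}_0\mathbf{X}^+\,dS - \int_{\partial\Omega^+} \mathbf{T}^+\cdot\mathbf{F}_0\mathbf{X}^-\,dS \\
&= \int_{\partial\Omega^+} \mathbf{T}^+\cdot\mathbf{F}_0\mathbf{X}^+\,dS + \int_{\partial\Omega^-} \mathbf{T}^-\cdot\mathbf{F}_0\mathbf{X}^-\,dS \\
&= \int_{\partial\Omega} \mathbf{T}\cdot\mathbf{F}_0\mathbf{X}\,dS = \int_{\partial\Omega} T_i F^0_{iJ}X_J\,dS = F^0_{iJ}\int_{\partial\Omega} T_i X_J\,dS \\
&= \mathbf{F}_0\cdot\int_{\partial\Omega} \mathbf{T}\otimes\mathbf{X}\,dS = \mathbf{F}_0\cdot V\langle\mathbf{P}\rangle,
\end{aligned} \tag{28.12}
\]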

where we used the average stress theorem (24.13) to arrive at the final form (for quasistatic problems and in the absence of body forces and discontinuities). Therefore, since F0 = ⟨F⟩ (as shown above), we conclude that

⟨P ⋅F ⟩ = F0 ⋅ ⟨P ⟩ = ⟨F ⟩ ⋅ ⟨P ⟩, (28.13)

which proves the Hill-Mandel equivalence condition for periodic boundary conditions.

28.4 Enforcing RVE Averages

So far we have only discussed different types of boundary conditions but not under what circumstances they should be used and, especially, how to numerically enforce them. We generally differentiate three types of RVE formulations:



(i) average strain-driven BCs enforce ⟨F⟩ = F∗. As discussed above, it is natural to use either affine displacement BCs or periodic BCs for this case.

(ii) average stress-driven BCs enforce ⟨P⟩ = P∗. As discussed above, it is natural to use uniform traction BCs for this case.

(iii) mixed stress/strain-driven BCs enforce some components of ⟨F⟩ = F∗ and some components of ⟨P⟩ = P∗ (i.e., each component enforces either a stress or a strain).

Note that the above examples are not exclusive; e.g., one can also use periodic BCs to enforce average stresses, although the formulation is not as simple as for the cases mentioned above.

Numerical Aspects:

The three most popular choices in computational homogenization are the following:

From an implementation viewpoint, strain-driven affine displacement BCs are simplest to enforce since all they require is to enforce (28.6) with known average F0 = ⟨F⟩ everywhere on the RVE boundary ∂Ω.

Stress-driven uniform traction BCs are, in principle, simple to enforce since they require the application of tractions (28.1) with known P0 = ⟨P⟩ everywhere on the RVE boundary ∂Ω. The devil is in the detail, though, since any numerical implementation requires the suppression of rigid body motion, and fixing individual nodes on ∂Ω leads to uncontrollable reaction forces that may break the applied tractions. Therefore, special caution is required.

Strain-driven periodic BCs can be implemented in a number of ways. The next section will discuss a particular numerical formulation to implement those in an elegant, practical fashion. Alternatives are, e.g., penalty methods, which add to the total potential energy to be minimized a constraint potential of the form

\[
I_C[\mathbf{u}] = \frac{1}{2\varepsilon}\int_{\partial\Omega^+} \left\| (\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-) \right\|^2 dS, \tag{28.14}
\]

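where 0 < ε ≪ 1. Taking variations yields

\[
\begin{aligned}
\delta I_C[\mathbf{u}] = &-\frac{1}{\varepsilon}\int_{\partial\Omega^+} \left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right]\cdot\delta\mathbf{u}^+\,dS \\
&+ \frac{1}{\varepsilon}\int_{\partial\Omega^-} \left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right]\cdot\delta\mathbf{u}^-\,dS = 0.
\end{aligned} \tag{28.15}
\]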

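Therefore, if one considers the total potential energy

\[
I[\mathbf{u}] = \int_\Omega W(\mathbf{F})\,dV + I_C[\mathbf{u}], \tag{28.16}
\]

then the first variation becomes

\[
\begin{aligned}
\delta I[\mathbf{u}] = 0 = \int_\Omega \mathbf{P}\cdot\delta\mathbf{F}\,dV
&- \frac{1}{\varepsilon}\int_{\partial\Omega^+} \left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right]\cdot\delta\mathbf{u}^+\,dS \\
&+ \frac{1}{\varepsilon}\int_{\partial\Omega^-} \left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right]\cdot\delta\mathbf{u}^-\,dS.
\end{aligned} \tag{28.17}
\]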



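Note that we can identify those two terms arising from the constraint potential as tractions applied on the boundary, viz.

\[
\begin{aligned}
\mathbf{T}^+ &= \frac{1}{\varepsilon}\left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right] && \text{on } \partial\Omega^+, \\
\mathbf{T}^- &= -\frac{1}{\varepsilon}\left[(\mathbf{F}_0 - \mathbf{I})(\mathbf{X}^+ - \mathbf{X}^-) - (\mathbf{u}^+ - \mathbf{u}^-)\right] && \text{on } \partial\Omega^-.
\end{aligned} \tag{28.18}
\]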

This shows that tractions are indeed anti-periodic, and the parameter ε controls the enforcement of periodicity. The drawback of this method is that for finite ε > 0, the solution is only approximately periodic (and for ε ≪ 1 conditioning issues can affect the numerical solution process).
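Both drawbacks can be seen in a toy problem with two degrees of freedom. The sketch below is an assumed stand-in for the RVE: u+ and u− are each tied to the ground by a spring of stiffness k, and the penalty term 1/(2ε)(g − (u+ − u−))² plays the role of (28.14), with the scalar gap g standing in for (F0 − I)(X+ − X−). Stationarity of the total energy gives a 2×2 linear system.

```python
import numpy as np

def solve_penalty(k, gap, eps):
    """Minimize k/2 (u+^2 + u-^2) + 1/(2 eps) (gap - (u+ - u-))^2."""
    K = np.array([[k + 1.0 / eps, -1.0 / eps],
                  [-1.0 / eps, k + 1.0 / eps]])
    rhs = np.array([gap / eps, -gap / eps])
    u_plus, u_minus = np.linalg.solve(K, rhs)
    return u_plus, u_minus, np.linalg.cond(K)

results = {eps: solve_penalty(k=2.0, gap=1.0, eps=eps) for eps in (1e-1, 1e-2, 1e-3)}
# constraint violation gap - (u+ - u-) for each penalty parameter
violations = {eps: 1.0 - (up - um) for eps, (up, um, _) in results.items()}
# The penalty "tractions" are T+ = violation / eps and T- = -T+, cf. (28.18).
```

For this toy problem the violation equals ε/(1 + ε) and the condition number equals 1 + 1/ε, so tightening the constraint by a factor of ten worsens the conditioning by roughly the same factor.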

Similar constraint potentials can be introduced to enforce, e.g., average strain-driven uniform traction BCs but will not be discussed here.



29 Numerical Aspects of Periodic Boundary Conditions

There are many different ways of implementing periodic BCs numerically. This section will outline one possible, practical solution that avoids penalty constraints and instead enforces the periodicity directly by modifying the linear systems to be solved.

To begin, let us review linear and more general constraints within the FE context. We showed that the solution to quasistatic BVPs is obtained by solving a (non)linear system

f(Uh) = Fint(Uh) −Fext = 0, (29.1)

which includes one equation for each node. Any constraint can be formulated as

fc(Uh) = 0. (29.2)

In our finite element code, we implement the constraint by replacing the equation for one dof by fc = 0. The equation to be replaced must involve the node (or one of the nodes) to be constrained.

The simplest case is that of essential BCs. Suppose that dof i of node a is to be set to δ; this is equivalent to

fc(Uh) = uai − δ = 0. (29.3)

More generally, we write any linear constraint as

\[
f_c(U^h) = \sum_{a=1}^{n}\sum_{i=1}^{d} \alpha^a_i u^a_i - \delta = 0. \tag{29.4}
\]
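For a linear problem, replacing an equation by a linear constraint of the form (29.4) amounts to overwriting one row of the system matrix and the matching right-hand-side entry. The sketch below is an assumed illustration with a three-node spring chain; the function name and all values are hypothetical. Note that row replacement makes the matrix nonsymmetric and discards the equilibrium equation of the replaced dof (its reaction force is no longer part of the system).

```python
import numpy as np

def apply_linear_constraint(K, f, row, alpha, delta):
    """Replace equation `row` of K u = f by the constraint alpha . u = delta."""
    K, f = K.copy(), f.copy()
    K[row, :] = alpha
    f[row] = delta
    return K, f

# Chain of two identical springs (stiffness 400), no external forces.
K = 400.0 * np.array([[1.0, -1.0, 0.0],
                      [-1.0, 2.0, -1.0],
                      [0.0, -1.0, 1.0]])
f = np.zeros(3)

# Essential BCs as in (29.3): u_0 = 0 and u_2 = 0.1.
K_c, f_c = apply_linear_constraint(K, f, row=0, alpha=[1.0, 0.0, 0.0], delta=0.0)
K_c, f_c = apply_linear_constraint(K_c, f_c, row=2, alpha=[0.0, 0.0, 1.0], delta=0.1)
u = np.linalg.solve(K_c, f_c)
```

A periodic-type constraint such as u_2 − u_0 = δ is handled by the same function with alpha = [−1, 0, 1].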

Now, consider periodic boundary conditions applied to a pair of (±)-nodes, which must satisfy the following d constraints:

fc(Uh) = u+ − u− − (F∗ − I)(X+ − X−) = 0. (29.5)

Let us consider a general four-sided RVE in 2D; the generalization to 3D is straightforward but more cumbersome to formulate. Let the corners be denoted as nodes 1 through 4 (starting in the bottom left corner), and let the two lattice vectors be defined as

L21 =X2 −X1 =X3 −X4, L41 =X4 −X1 =X3 −X2. (29.6)

To prevent rigid body motion, we fix the bottom left node, i.e.,

ϕ1 =X1 or u1 = 0. (29.7)

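As a consequence, the displacements at the remaining corner nodes must satisfy

\[
\begin{aligned}
\mathbf{u}_2 &= \mathbf{u}_1 + (\mathbf{F}^* - \mathbf{I})(\mathbf{X}_2 - \mathbf{X}_1) = (\mathbf{F}^* - \mathbf{I})\mathbf{L}_{21}, \\
\mathbf{u}_4 &= \mathbf{u}_1 + (\mathbf{F}^* - \mathbf{I})(\mathbf{X}_4 - \mathbf{X}_1) = (\mathbf{F}^* - \mathbf{I})\mathbf{L}_{41}, \\
\mathbf{u}_3 &= \mathbf{u}_1 + (\mathbf{F}^* - \mathbf{I})(\mathbf{X}_3 - \mathbf{X}_1) = (\mathbf{F}^* - \mathbf{I})(\mathbf{L}_{21} + \mathbf{L}_{41}) = \mathbf{u}_2 + \mathbf{u}_4.
\end{aligned} \tag{29.8}
\]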



29.1 Enforcing Average Deformation Gradients:

Let us consider strain-driven periodic BCs. Consider a pair of (±)-points located on the left and right edges. We have by the periodicity of the unit cell

u+ = u− + (F ∗ − I)(X+ −X−) = u− + (F ∗ − I)L21 = u− +u2. (29.9)

Analogously, a node pair on the top and bottom edges is constrained by

u+ = u− + (F ∗ − I)(X+ −X−) = u− + (F ∗ − I)L41 = u− +u4. (29.10)

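This implies that the displacements of all boundary nodes can be traced back to u2 and u4. In summary, the constraints to be imposed are

\[
\begin{aligned}
f_1(U^h) &= \mathbf{u}_1 = \mathbf{0}, \\
f_3(U^h) &= \mathbf{u}_3 - \mathbf{u}_2 - \mathbf{u}_4 = \mathbf{0}, \\
f_+(U^h) &= \mathbf{u}^+ - \mathbf{u}^- - \mathbf{u}_2 = \mathbf{0} && \text{for each left/right pair}, \\
f_+(U^h) &= \mathbf{u}^+ - \mathbf{u}^- - \mathbf{u}_4 = \mathbf{0} && \text{for each top/bottom pair}
\end{aligned} \tag{29.11}
\]

along with

\[
\begin{aligned}
\mathbf{u}_2 &= (\mathbf{F}^* - \mathbf{I})\mathbf{L}_{21}, \\
\mathbf{u}_4 &= (\mathbf{F}^* - \mathbf{I})\mathbf{L}_{41}.
\end{aligned} \tag{29.12}
\]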

If the average F∗ to be enforced is known, all we need to do is implement constraints (29.11) and (29.12). Notice that we formulated (29.11) free of the average deformation gradient, which will become very helpful in a moment.
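The corner displacements prescribed by (29.8) and (29.12) follow directly from F∗ and the lattice vectors. A minimal sketch, with a hypothetical function name and made-up values for F∗ and the corner positions:

```python
import numpy as np

def corner_displacements(F_star, X1, X2, X4):
    """Displacements of corners 2, 4 and 3 for u_1 = 0, cf. (29.8) and (29.12)."""
    H = F_star - np.eye(2)            # displacement gradient F* - I
    u2 = H @ (X2 - X1)                # (F* - I) L21
    u4 = H @ (X4 - X1)                # (F* - I) L41
    return u2, u4, u2 + u4            # u_3 = u_2 + u_4

F_star = np.array([[1.10, 0.05], [0.00, 0.90]])
X1, X2, X4 = np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 1.0])
u2, u4, u3 = corner_displacements(F_star, X1, X2, X4)
```

Only u2 and u4 carry information about F∗; all remaining boundary constraints in (29.11) are expressed relative to these two vectors.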

29.2 Enforcing average stress components:

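If average stresses P∗ are to be imposed in conjunction with periodic BCs, we must impose tractions on the boundary ∂Ω of the RVE such that T = P∗N, as shown in Section 28.1. Thus, we have

$$
\begin{aligned}
I[\varphi] &= \int_\Omega W(F)\, dV - \int_{\partial\Omega} T \cdot \varphi \, dS \\
&= \int_\Omega W(F)\, dV - \int_{\partial\Omega} P^* N \cdot \varphi \, dS \\
&= \int_\Omega W(F)\, dV - P^* \cdot \int_{\partial\Omega} \varphi \otimes N \, dS,
\end{aligned}
\tag{29.13}
$$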

which by the averaging theorem (24.10) is equivalent to

I[ϕ] = ∫ΩW (F )dV − V P ∗ ⋅ ⟨F ⟩. (29.14)

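The boundary integral can be decomposed into bottom (b), top (t), left (l) and right (r) edges in 2D:

$$
\begin{aligned}
\int_{\partial\Omega} \varphi \otimes N \, dS
&= V I + \int_{\partial\Omega_b} u_b \otimes N_b \, dS + \int_{\partial\Omega_t} u_t \otimes N_t \, dS + \int_{\partial\Omega_l} u_l \otimes N_l \, dS + \int_{\partial\Omega_r} u_r \otimes N_r \, dS \\
&= V I - \int_{\partial\Omega_t} u_b \otimes N_t \, dS + \int_{\partial\Omega_t} (u_b + u_4) \otimes N_t \, dS \\
&\qquad\;\; - \int_{\partial\Omega_r} u_l \otimes N_r \, dS + \int_{\partial\Omega_r} (u_l + u_2) \otimes N_r \, dS \\
&= V I + u_4 \otimes \int_{\partial\Omega_t} N_t \, dS + u_2 \otimes \int_{\partial\Omega_r} N_r \, dS,
\end{aligned}
\tag{29.15}
$$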

where we used that

$$
\int_{\partial\Omega} X \otimes N \, dS = \int_\Omega \operatorname{Grad} X \, dV = V I. \tag{29.16}
$$

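Therefore,

$$
\begin{aligned}
P^* \cdot \int_{\partial\Omega} \varphi \otimes N \, dS
&= P^* \cdot \left[ V I + u_4 \otimes \int_{\partial\Omega_t} N_t \, dS + u_2 \otimes \int_{\partial\Omega_r} N_r \, dS \right] \\
&= V P^* \cdot I + u_4 \cdot P^* \int_{\partial\Omega_t} N_t \, dS + u_2 \cdot P^* \int_{\partial\Omega_r} N_r \, dS.
\end{aligned}
\tag{29.17}
$$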

In summary, the total potential energy becomes (dropping the term independent of displacements)

$$
I[\varphi] = \int_\Omega W(F)\, dV - u_4 \cdot P^* \int_{\partial\Omega_t} N_t \, dS - u_2 \cdot P^* \int_{\partial\Omega_r} N_r \, dS. \tag{29.18}
$$

If the average stress tensor P∗ is known and to be imposed, then the above potential energy can be reinterpreted as a BVP that applies constant external forces

$$
F_4 = P^* \int_{\partial\Omega_t} N_t \, dS, \qquad F_2 = P^* \int_{\partial\Omega_r} N_r \, dS \tag{29.19}
$$

to nodes 4 and 2, respectively. Thus, we can enforce an average stress tensor P∗ by imposing constraints (29.11) along with applying external forces (29.19). That is, nodes 2 and 4 are free to displace while external forces are being applied to those two nodes.
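The forces (29.19) are easy to evaluate for an RVE with straight edges, since the integral of the outward normal over a straight edge equals the edge vector rotated by 90 degrees. The following sketch assumes a parallelogram RVE of unit thickness whose corners are numbered counterclockwise starting at the bottom left; the function name and the numerical values are hypothetical.

```python
import numpy as np

R_CCW = np.array([[0.0, -1.0], [1.0, 0.0]])   # rotation by +90 degrees

def corner_forces(X, P_star):
    """Forces F2 and F4 of (29.19) for corner positions X (4x2 array)."""
    L21 = X[1] - X[0]
    L41 = X[3] - X[0]
    n_top = R_CCW @ L21       # integral of the outward normal over the top edge
    n_right = -R_CCW @ L41    # integral of the outward normal over the right edge
    F4 = P_star @ n_top
    F2 = P_star @ n_right
    return F2, F4

# rectangular RVE of size 2 x 1 and an assumed average stress
X = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.0]])
P_star = np.array([[3.0, 1.0], [1.0, 2.0]])
F2, F4 = corner_forces(X, P_star)
```

For this rectangle the integrated normals are (0, 2) on the top edge and (1, 0) on the right edge, so that F4 = (2, 4) and F2 = (3, 1).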

29.3 Enforcing mixed averages:

Next, assume that some of the components of F∗ and some of P∗ are known. For example, consider a uniaxial extension test enforcing a stretch λ in the X1-direction, while leaving the transverse X2-direction stress-free; this means

$$
F^* = \begin{pmatrix} \lambda & F_{12} \\ F_{21} & F_{22} \end{pmatrix}, \qquad
P^* = \begin{pmatrix} P_{11} & 0 \\ 0 & 0 \end{pmatrix}, \tag{29.20}
$$

where the four unknowns are F12, F21, F22, P11.

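Enforcing such mixed averages is challenging in general. However, if the RVE is rectangular, then note that (29.19) reduces to

$$
F_2 = P^* N_1 L_1 = L_1 \begin{pmatrix} P^*_{11} \\ P^*_{12} \end{pmatrix}, \qquad
F_4 = P^* N_2 L_2 = L_2 \begin{pmatrix} P^*_{21} \\ P^*_{22} \end{pmatrix}, \tag{29.21}
$$

while (29.12) becomes

$$
u_2 = (F^* - I) N_1 L_1 = \begin{pmatrix} F^*_{11} - 1 \\ F^*_{12} \end{pmatrix} L_1, \qquad
u_4 = (F^* - I) N_2 L_2 = \begin{pmatrix} F^*_{21} \\ F^*_{22} - 1 \end{pmatrix} L_2. \tag{29.22}
$$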

Notice that the equations decouple, so that we may enforce individual components of P∗ and F∗ simultaneously by enforcing selected components from those two sets of equations. That is, we need to impose periodic BCs (29.11) complemented by whichever components of (29.22) apply.

For example, for the above uniaxial extension test we would enforce (29.11) along with

$$
(u_2)_1 = (\lambda - 1) L_1, \qquad F_2 = F_4 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{29.23}
$$

and the second component (u_2)_2 of u2 as well as u4 are to be determined.

29.4 Three Dimensions

Periodic BCs can be enforced analogously in 3D where instead of u2 and u4 we must formulate BCs by working with three corner nodes to enforce average stresses and strains. All relations for enforcing average deformation gradients and stress tensors apply analogously.

29.5 Linearized Kinematics

The above concepts apply equally in linearized kinematics. As a key difference, periodicity involves the infinitesimal strain tensor ε instead of the deformation gradient F. That is, the corresponding linearized expressions are obtained by replacing (F − I) by ε (as well as their averages and effective values) and using normal vectors ni, positions xi, etc. in the linearized framework.



30 The FE2 Method

Since the above computational homogenization procedure provided a methodology to compute effective stress-strain relations P∗ = P∗(F∗), or analogously σ∗ = σ∗(ε∗), it has become convenient to use such effective constitutive laws within macroscale FE calculations. That is, the constitutive response at each quadrature point is computed on-the-fly by solving the BVP at the RVE-level. As this approach couples two FE BVPs (at the macro- and microscales), it is known as the FE2 method.

One additional complication arises here since solving the macro-problem iteratively may require the tangent matrix of the material model to be used at the quadrature point-level, i.e., we need

$$
C^* = \frac{\partial \sigma^*}{\partial \varepsilon^*} \qquad \text{or} \qquad C^* = \frac{\partial P^*}{\partial F^*}. \tag{30.1}
$$

One possibility is to compute C∗ as the numerical tangent, which is prone to numerical errors and possibly slow, but generally easy to implement and (almost) always applicable.
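A numerical tangent can be sketched with central differences as follows. In an FE2 setting, the function `stress` would wrap a full RVE solve for a given F∗; here a closed-form compressible neo-Hookean law serves as a hypothetical stand-in, and the step size h is an assumed value that would have to be tuned in practice.

```python
import numpy as np

def numerical_tangent(stress, F, h=1e-6):
    """Central-difference approximation of C_iJkL = dP_iJ / dF_kL."""
    d = F.shape[0]
    C = np.zeros((d, d, d, d))
    for k in range(d):
        for l in range(d):
            dF = np.zeros((d, d))
            dF[k, l] = h                      # perturb one component of F
            C[:, :, k, l] = (stress(F + dF) - stress(F - dF)) / (2.0 * h)
    return C

def neo_hooke(F, mu=1.0, lam=2.0):
    """Stand-in stress response P(F); not the RVE problem of the notes."""
    J = np.linalg.det(F)
    F_invT = np.linalg.inv(F).T
    return mu * (F - F_invT) + lam * np.log(J) * F_invT

F = np.array([[1.1, 0.05], [0.0, 0.95]])
C = numerical_tangent(neo_hooke, F)
```

Each tangent requires 2d² stress evaluations (i.e., RVE solves), which is the reason this approach can be slow.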

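As an alternative, a general strategy is to decompose all nodes into interior and boundary nodes, so that displacements and corresponding nodal forces and tangent matrix are written as

$$
U^h = \begin{pmatrix} U_i \\ U_b \end{pmatrix}, \qquad
F_{\text{ext}} = \begin{pmatrix} F_i \\ F_b \end{pmatrix}, \qquad
T = \frac{\partial F_{\text{int}}}{\partial U^h} = \begin{pmatrix} K_{ii} & K_{ib} \\ K_{bi} & K_{bb} \end{pmatrix}. \tag{30.2}
$$

Since no interior forces are being applied, we have F_i = 0. Also, the tangent matrix must be symmetric, so that K_ib = K_bi^T.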

30.1 Affine BCs

Consider affine BCs applied to the RVE, so that the boundary displacements can be formulated as

Ub =DF ∗, (30.3)

where D is a matrix that depends on the boundary node locations and implements ϕ = F∗X. The governing equations are thus summarized as

$$
\left[\;
\begin{aligned}
F_i(U^h) &= 0, \\
F_b(U^h) - \Xi &= 0, \\
U_b - D F^* &= 0
\end{aligned}
\right.
\tag{30.4}
$$

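with unknown boundary forces Ξ. Linearization of the system with respect to U^h, Ξ about an equilibrium solution U^h_0, Ξ_0 leads to

$$
\left[\;
\begin{aligned}
F_i(U^h_0) + K_{ii}(U^h_0)\,\Delta U_i + K_{ib}(U^h_0)\,\Delta U_b &= 0, \\
F_b(U^h_0) - \Xi_0 + K_{bi}(U^h_0)\,\Delta U_i + K_{bb}(U^h_0)\,\Delta U_b - \Delta\Xi &= 0, \\
U_{b,0} - D F^* + \Delta U_b - D\,\Delta F^* &= 0.
\end{aligned}
\right.
\tag{30.5}
$$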

For a given solution step, we seek only the interior dofs, while Ub and F∗ are fixed such that Ub = DF∗. Therefore, perturbations from equilibrium must satisfy

∆Ub −D∆F ∗ = 0. (30.6)



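Moreover, in equilibrium we know that F_i(U^h_0) = 0 and F_b(U^h_0) = Ξ_0, so that

$$
\Delta U_i = -K_{ii}^{-1}(U^h_0)\, K_{ib}(U^h_0)\, \Delta U_b \tag{30.7}
$$

and

$$
K_{bi}(U^h_0)\,\Delta U_i + K_{bb}(U^h_0)\,\Delta U_b - \Delta\Xi = 0 \tag{30.8}
$$

or, altogether,

$$
\begin{aligned}
\Delta\Xi &= \left[ K_{bb}(U^h_0) - K_{bi}(U^h_0)\, K_{ii}^{-1}(U^h_0)\, K_{ib}(U^h_0) \right] \Delta U_b \\
&= \left[ K_{bb}(U^h_0) - K_{bi}(U^h_0)\, K_{ii}^{-1}(U^h_0)\, K_{ib}(U^h_0) \right] D\, \Delta F^*.
\end{aligned}
\tag{30.9}
$$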

Finally, recall that the average stresses were obtained from the boundary forces via

$$
P^* = \frac{1}{V} \sum_a F^a \otimes X^a = \frac{1}{V} D^T F_b, \tag{30.10}
$$

where we reused the transformation matrix D introduced above (which depends on reference nodal locations on the boundary). Linearization results in

$$
P^* + C^* \Delta F^* = \frac{1}{V} \left[ D^T F_b(U^h) + D^T \Delta\Xi \right], \tag{30.11}
$$

so that in equilibrium the perturbations are linked via

$$
C^* \Delta F^* = \frac{1}{V} D^T \Delta\Xi. \tag{30.12}
$$

Insertion of (30.9) finally admits the conclusion

$$
C^* = \frac{1}{V} D^T \left[ K_{bb}(U^h_0) - K_{bi}(U^h_0)\, K_{ii}^{-1}(U^h_0)\, K_{ib}(U^h_0) \right] D. \tag{30.13}
$$

That is, the consistent tangent matrix can be obtained from the element tangent matrices once one load step has been equilibrated. Note that because Kii is constructed from only the inner nodes, it is invertible by definition.
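The condensation (30.13) amounts to a Schur complement of the assembled tangent matrix. The sketch below applies it to a hypothetical 1D bar of unit cross-section with four linear elements in linearized kinematics, where the affine BCs read U_b = D ε∗ with D containing the boundary node positions. Function name, mesh and moduli are assumptions for illustration.

```python
import numpy as np

def condensed_tangent(K, interior, boundary, D, V):
    """Effective tangent D^T [K_bb - K_bi K_ii^{-1} K_ib] D / V, cf. (30.13)."""
    K_ii = K[np.ix_(interior, interior)]
    K_ib = K[np.ix_(interior, boundary)]
    K_bi = K[np.ix_(boundary, interior)]
    K_bb = K[np.ix_(boundary, boundary)]
    S = K_bb - K_bi @ np.linalg.solve(K_ii, K_ib)   # Schur complement
    return D.T @ S @ D / V

E = np.array([2.0, 1.0, 1.0, 2.0])       # element moduli (assumed)
le = np.full(4, 0.25)                     # element lengths
X = np.concatenate(([0.0], np.cumsum(le)))
K = np.zeros((5, 5))
for e in range(4):                        # assemble the global stiffness matrix
    k = E[e] / le[e]
    K[e:e + 2, e:e + 2] += k * np.array([[1.0, -1.0], [-1.0, 1.0]])

boundary = [0, 4]
interior = [1, 2, 3]
D = X[boundary].reshape(-1, 1)            # u_b = D * (average strain) in 1D
C_eff = condensed_tangent(K, interior, boundary, D, V=X[-1] - X[0])
```

For this series arrangement the result is the harmonic average of the moduli, C∗ = 4/3, as expected for a bar under uniaxial loading.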

Similar procedures can be applied to uniform traction BCs and periodic BCs. For uniform traction BCs, the procedure is analogous but may require iterations. For periodic BCs, the procedure is more involved but in principle analogous. Details will be skipped here and can be found, e.g., in (Miehe and Koch, 2002).



31 Spectral Techniques

31.1 Reciprocal Bravais Lattices

Consider an RVE Ω discretized by a regular grid of n nodes located at lattice sites X = {X_1, . . . , X_n} defined by a Bravais basis B = {A_1, . . . , A_d}, such that every grid point X in the RVE can be decomposed into a linear combination of the Bravais vectors:

$$
X = \sum_{i=1}^{d} c_i A_i, \qquad c_i \in \mathbb{Z} \ \text{ s.t. } \ X \in \Omega. \tag{31.1}
$$

The infinite point set spanned by B is called a Bravais lattice.

By assuming periodicity, we can tessellate R^d by periodically repeating the RVE. Thus, we have a second set of Bravais lattice vectors L = {L_1, . . . , L_d} that define the corners of all RVEs in R^d, and we choose the vectors such that

$$
\boldsymbol{L}_i = L_i \boldsymbol{A}_i \ \text{ (no sum)} \qquad \text{and} \qquad (L_1 - 1) \cdot \ldots \cdot (L_d - 1) = n, \tag{31.2}
$$

where the scalar L_i defines the number of grid points within the RVE in the A_i-direction.

Periodicity implies that all field quantities u must also be periodic across RVEs, i.e., we must ensure

$$
u(X) = u\Big(X + \sum_{i=1}^{d} c_i L_i\Big) \qquad \forall \ c_i \in \mathbb{Z}. \tag{31.3}
$$

To this end, note that we express the field in its discrete Fourier representation, where we interpolate the field u by using the inverse Fourier transform, viz. we write

$$
u(X) = \sum_{K \in \mathcal{T}} \hat{u}(K) \exp(-i h K \cdot X), \qquad h = \frac{2\pi}{n}. \tag{31.4}
$$

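We introduced the variable h for convenience and for simple redefinitions of the transform. The choice of vectors K must ensure (31.3), which implies that

$$
\begin{aligned}
\exp(-i h K \cdot X) &= \exp\Big[-i h K \cdot \Big(X + \sum_{i=1}^{d} c_i L_i\Big)\Big] && \forall \ c_i \in \mathbb{Z} \\
&= \exp(-i h K \cdot X) \cdot \exp\Big(-i h K \cdot \sum_{i=1}^{d} c_i L_i\Big) && \forall \ c_i \in \mathbb{Z}.
\end{aligned}
\tag{31.5}
$$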

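This in turn requires that

$$
h K \cdot \sum_{i=1}^{d} c_i L_i = \frac{2\pi}{n} \sum_{i=1}^{d} c_i K \cdot L_i \in 2\pi\mathbb{Z}
\quad \Leftrightarrow \quad
\frac{1}{n} \sum_{i=1}^{d} c_i K \cdot L_i \in \mathbb{Z} \qquad \forall \ c_i \in \mathbb{Z}. \tag{31.6}
$$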

Therefore (choose, e.g., all c_i = 0 except one c_j = 1), we conclude that each K-vector must satisfy

$$
\frac{1}{n} K \cdot L_i \in \mathbb{Z} \qquad \forall \ i = 1, \ldots, d, \tag{31.7}
$$

which defines a reciprocal lattice of grid points K, which is also a Bravais lattice. To realize this, consider, e.g., the 2D case and draw rays perpendicular to each L_i at distances nZ/L_i; this results in a new periodic lattice with some basis K = {Z_1, . . . , Z_d}, so we may write K = ∑_{i=1}^{d} k_i Z_i with k_i ∈ Z. We notice that the new basis must be related to the original basis via

Zi ⋅Lj = nδij (31.8)

Note that the conventional definition of reciprocal lattices uses n = 1.

As an important feature, the reciprocal lattice of the reciprocal lattice is the original lattice itself, which is known as the Pontryagin duality of the associated vector spaces.

Examples:

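Consider a 2D lattice with Bravais basis L = {L_1, L_2}. The reciprocal lattice has Bravais basis vectors (with R ∈ SO(2) denoting a rotation matrix by 90°)

$$
Z_1 = n \frac{R L_2}{L_1 \cdot R L_2}, \qquad Z_2 = n \frac{R L_1}{L_2 \cdot R L_1}. \tag{31.9}
$$

We can easily verify the reciprocity relation (31.8) by noticing that R a ⋅ a = 0 for any vector a, so that

$$
\begin{aligned}
Z_1 \cdot L_1 &= n \frac{R L_2 \cdot L_1}{L_1 \cdot R L_2} = n, &
Z_2 \cdot L_1 &= n \frac{R L_1 \cdot L_1}{L_2 \cdot R L_1} = 0, \\
Z_1 \cdot L_2 &= n \frac{R L_2 \cdot L_2}{L_1 \cdot R L_2} = 0, &
Z_2 \cdot L_2 &= n \frac{R L_1 \cdot L_2}{L_2 \cdot R L_1} = n,
\end{aligned}
\tag{31.10}
$$

so that indeed Z_i ⋅ L_j = nδ_ij.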

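Similarly, a 3D Bravais basis L = {L_1, L_2, L_3} has reciprocal Bravais basis vectors

$$
Z_1 = n \frac{L_2 \times L_3}{L_1 \cdot (L_2 \times L_3)}, \qquad
Z_2 = n \frac{L_3 \times L_1}{L_1 \cdot (L_2 \times L_3)}, \qquad
Z_3 = n \frac{L_1 \times L_2}{L_1 \cdot (L_2 \times L_3)}, \tag{31.11}
$$

which satisfy the reciprocity relation (31.8) since Z_i ⋅ L_j = nδ_ij.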
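The reciprocity relation (31.8) can also be checked numerically. The sketch below computes the reciprocal basis in any dimension from a matrix inverse (since (31.8) states that the matrix of Z-vectors times the transposed matrix of L-vectors equals n times the identity) and compares it with the cross-product formula (31.11). The lattice vectors and the value of n are arbitrary, assumed values.

```python
import numpy as np

def reciprocal_basis(L, n=1.0):
    """Rows of L are L_1..L_d; returns rows Z_1..Z_d with Z_i . L_j = n delta_ij."""
    return n * np.linalg.inv(L).T

# hypothetical (non-orthogonal) 3D lattice vectors, stored row-wise
L = np.array([[2.0, 0.0, 0.0],
              [0.5, 1.5, 0.0],
              [0.0, 0.3, 1.0]])
n = 8.0
Z = reciprocal_basis(L, n)

# cross-product formula (31.11) for comparison
vol = L[0] @ np.cross(L[1], L[2])
Z_cross = n * np.array([np.cross(L[1], L[2]),
                        np.cross(L[2], L[0]),
                        np.cross(L[0], L[1])]) / vol
```

The matrix-inverse form has the advantage that it applies unchanged in 2D, where no cross product is available.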

31.2 Fourier Spectral Solution Techniques

For a field variable u, we use the discrete Fourier transform F whose inversion is defined as

$$
u(X) = \mathcal{F}^{-1}(\hat{u}) = \sum_{K \in \mathcal{T}} \hat{u}(K) \exp(-i h K \cdot X), \qquad h = \frac{2\pi}{n}. \tag{31.12}
$$

The complete discretized information of u is thus contained either in its values at the grid points in real space, u = {u(x_1), . . . , u(x_n)}, or in Fourier space, û = {û(k_1), . . . , û(k_n)}. By using the above transform (and its inverse), both sets of data contain equivalent information.

All FE equations (mechanical, thermal, electromagnetic, etc.) involve computing derivatives of the primary fields, e.g., to obtain strains from displacements, heat flux from temperature, etc.

The fundamental idea of Fourier spectral techniques is to not introduce shape functions but instead

use Fourier transforms to compute derivatives and

solve the governing equations in Fourier space.

This not only ensures higher accuracy in computing derivatives but also provides a simple way to enforce average field quantities, as we will show in the following.

First, notice that out of all Fourier coefficients û(K), one is special, as the integral over the RVE reveals:

$$
\int_\Omega u(X)\, dV = \sum_{K \in \mathcal{T}} \hat{u}(K) \int_\Omega \exp(-i h K \cdot X)\, dV = V \hat{u}(0). \tag{31.13}
$$

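Here, we used that the integrals vanish unless K = 0 because

$$
\begin{aligned}
\int_\Omega \exp(-i h K \cdot X)\, dV
&= \int_0^{L_1}\!\!\int_0^{L_2}\!\!\int_0^{L_3} \exp\Big(-i \frac{2\pi}{n} \sum_{i=1}^{d} k_i Z_i \cdot X\Big)\, dX_1\, dX_2\, dX_3 \\
&= \int_0^{1}\!\!\int_0^{1}\!\!\int_0^{1} \exp\Big(-i \frac{2\pi}{n} \sum_{i=1}^{d} k_i Z_i \cdot \sum_{a=1}^{d} \xi_a L_a\Big)\, L_1 d\xi_1\, L_2 d\xi_2\, L_3 d\xi_3 \\
&= V \int_0^{1}\!\!\int_0^{1}\!\!\int_0^{1} \exp\Big(-i 2\pi \sum_{i=1}^{d} k_i \xi_i\Big)\, d\xi_1\, d\xi_2\, d\xi_3 \\
&= V \prod_{i=1}^{d} \int_0^{1} \exp(-i 2\pi k_i \xi_i)\, d\xi_i \qquad \text{where } k_i \in \mathbb{Z}, \\
&= V \begin{cases} 1, & \text{if } k_1 = \ldots = k_d = 0, \text{ i.e., } K = 0, \\ 0, & \text{else.} \end{cases}
\end{aligned}
\tag{31.14}
$$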

The latter identity can easily be verified by expanding the exponential into cosines and sines using Euler's identity. Overall, we have thus shown the useful identity

$$
\langle u \rangle = \hat{u}(0). \tag{31.15}
$$

Next, notice that

$$
\frac{\partial}{\partial X} u(X) = -\sum_{K \in \mathcal{T}} i h K\, \hat{u}(K) \exp(-i h K \cdot X), \tag{31.16}
$$

so that

$$
\mathcal{F}(u_{,J}) = -i h K_J\, \mathcal{F}(u). \tag{31.17}
$$

The strategy is now to transform the governing ODEs/PDEs into Fourier space and solve the resulting equations in Fourier space, where possible.
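Both identities, (31.15) and (31.17), can be tried out with a standard FFT library. The following 1D sketch uses NumPy; note as an assumption of this example that NumPy's inverse transform uses exp(+iκx), i.e., the opposite sign convention of (31.4), so that differentiation corresponds to multiplication by +iκ rather than −ihK. The test function and grid size are arbitrary.

```python
import numpy as np

n = 64                                   # number of grid points
L = 1.0                                  # period of the domain
x = np.arange(n) * L / n
u = 0.3 + np.sin(2 * np.pi * x) + 0.5 * np.cos(6 * np.pi * x)

u_hat = np.fft.fft(u)                                 # unnormalized coefficients
kappa = 2 * np.pi * np.fft.fftfreq(n, d=L / n)        # angular wave numbers

# spectral derivative: multiply by i*kappa (NumPy sign convention), transform back
du = np.fft.ifft(1j * kappa * u_hat).real
du_exact = 2 * np.pi * np.cos(2 * np.pi * x) - 3 * np.pi * np.sin(6 * np.pi * x)

# RVE average from the zeroth Fourier coefficient, cf. (31.15)
mean_from_fourier = u_hat[0].real / n
```

For this band-limited function the spectral derivative agrees with the analytical one up to round-off, and the zeroth coefficient returns the average 0.3.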

31.3 Example: Heat Conduction

Consider a thermal problem that involves equilibrating an RVE by solving the static heat equation with a known heat source distribution S(X) = R(X)Sh(X) and with periodic boundary conditions, while enforcing an average temperature ⟨T⟩.

Recall the static heat equation with a constant conductivity λ,

$$
\lambda T_{,II} + S = 0 \qquad \forall \ X \in \Omega, \tag{31.18}
$$

which in Fourier space becomes

$$
\lambda (-i h K_I)(-i h K_I)\, \mathcal{F}(T) + \mathcal{F}(S) = 0 \tag{31.19}
$$

or

$$
\lambda \sum_{K \in \mathcal{T}} (-i h K_I)(-i h K_I)\, \hat{T}(K) \exp(-i h K \cdot X) + \sum_{K \in \mathcal{T}} \hat{S}(K) \exp(-i h K \cdot X) = 0. \tag{31.20}
$$

For this to hold at all X ∈ Ω, we conclude that

$$
-\lambda h^2 K_I K_I\, \hat{T}(K) + \hat{S}(K) = 0 \qquad \forall \ K \in \mathcal{T}, \tag{31.21}
$$

which can be solved, unless K = 0, to yield

$$
\hat{T}(K) = \frac{\hat{S}(K)}{\lambda h^2 \|K\|^2} \qquad \forall \ K \in \mathcal{T}, \ K \neq 0. \tag{31.22}
$$

Therefore, we can solve the problem by utilizing Fourier transforms as follows:

(i) transform the known source term S(X) into Fourier space to obtain Ŝ(K) for all K ∈ T,

(ii) solve for T̂(K) in Fourier space:

$$
\hat{T}(K) = \begin{cases} \langle T \rangle & \text{if } K = 0, \\[4pt] \dfrac{\hat{S}(K)}{h^2 \lambda \|K\|^2} & \text{else.} \end{cases} \tag{31.23}
$$

(iii) apply the inverse transform to obtain T(X) from T̂(K) for all grid points X_i.
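Steps (i) through (iii) can be sketched in a few lines for a 1D periodic domain. The example below is an illustration under assumptions: it uses NumPy's FFT (whose sign convention differs from (31.4), which is irrelevant here because only the squared wave number enters), the function name is made up, and the source is chosen with zero average since (31.21) at K = 0 requires Ŝ(0) = 0.

```python
import numpy as np

def solve_heat_constant(S, lam, T_mean, L=1.0):
    """Solve lam*T'' + S = 0 on a periodic 1D grid with prescribed average T_mean."""
    n = S.size
    kappa = 2 * np.pi * np.fft.fftfreq(n, d=L / n)    # angular wave numbers
    S_hat = np.fft.fft(S)                              # step (i)
    T_hat = np.zeros(n, dtype=complex)
    nz = kappa != 0
    T_hat[nz] = S_hat[nz] / (lam * kappa[nz] ** 2)     # step (ii), K != 0
    T_hat[0] = n * T_mean                              # step (ii), K = 0
    return np.fft.ifft(T_hat).real                     # step (iii)

n = 32
x = np.arange(n) / n
lam = 2.0
S = lam * (2 * np.pi) ** 2 * np.sin(2 * np.pi * x)    # assumed zero-mean source
T = solve_heat_constant(S, lam, T_mean=5.0)
```

For this source the exact solution is T = 5 + sin(2πx), which the sketch reproduces up to round-off without assembling or solving any linear system.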

The advantages of this method are obvious:

No approximating shape functions must be introduced (and approximated derivatives are of significantly higher order).

No iterative solver nor linear-system solver is required; the only numerical steps required are the (standard and inverse) Fourier transforms, which are numerically inexpensive.

No global stiffness matrix is required for the solution, which is a big memory advantage.

RVE-averages of the primary fields (and their derivatives) can be trivially enforced.

Next, consider the same problem with non-constant conductivity λ(X), so that

$$
(\lambda T_{,I})_{,I} + S = 0 \qquad \forall \ X \in \Omega \tag{31.24}
$$

with heat flux

$$
Q_I(X) = -\lambda(X)\, T_{,I}(X) \quad \Rightarrow \quad -Q_{I,I} + S = 0. \tag{31.25}
$$

Notice that Fourier transforming the above equation would turn the first term into a convolution that cannot be treated in the same simple fashion as above. To this end, we introduce a constant reference conductivity λ0 and define the perturbation P such that

$$
Q_I = -(\lambda_0 T_{,I} - P_I) \qquad \text{or} \qquad P_I = \lambda_0 T_{,I} + Q_I. \tag{31.26}
$$

Insertion into (31.25) yields

$$
(\lambda_0 T_{,I} - P_I)_{,I} + S = 0 \quad \Leftrightarrow \quad \lambda_0 T_{,II} - P_{I,I} + S = 0, \tag{31.27}
$$

which now can be Fourier-transformed without convolution:

$$
-\lambda_0 h^2 K_I K_I\, \hat{T}(K) + i h K_I \hat{P}_I(K) + \hat{S}(K) = 0 \qquad \forall \ K \in \mathcal{T}. \tag{31.28}
$$

Unless K = 0, we now obtain the solution

$$
\hat{T}(K) = \frac{\hat{S}(K) + i h K \cdot \hat{P}(K)}{\lambda_0 h^2 \|K\|^2} \qquad \forall \ K \in \mathcal{T}, \ K \neq 0. \tag{31.29}
$$

This form looks analogous to (31.22). Notice, however, that P = P(Grad T), so that the term K ⋅ P̂(K) depends on temperature T. Therefore, one cannot expect a single evaluation of the above equation to yield the correct solution. Yet, it provides the basis for an iterative solution scheme:

1. Choose a conductivity λ0 > 0.

2. Pick an initial guess for T(X) at all grid points X_i (in physical space).

3. Compute Q(Grad T) and P at all grid points (in physical space).

4. Transform P(X) to Fourier space to obtain P̂(K) at all K-points.

5. Solve for all T̂(K) via (31.29) for a given average ⟨T⟩ or ⟨Grad T⟩ (same treatment as before).

6. Transform T̂(K) back into physical space to obtain T(X) at all grid points.

7. Return to step 3 until the solution has converged for all grid points.
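The iterative scheme above can be sketched in 1D as follows. Several choices in this example are assumptions rather than part of the scheme: the reference conductivity is taken as the mean of the extreme values of λ(X), the gradient of T is evaluated spectrally with the Nyquist mode dropped, NumPy's sign convention is used (so derivatives correspond to +iκ), and convergence is measured by the maximum change of T between iterations.

```python
import numpy as np

def solve_heat_variable(lam, S, T_mean, L=1.0, tol=1e-12, max_iter=500):
    """Fixed-point iteration for (lam*T')' + S = 0 on a periodic 1D grid."""
    n = lam.size
    kappa = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    kappa_d = kappa.copy()
    if n % 2 == 0:
        kappa_d[n // 2] = 0.0                 # drop Nyquist mode in first derivatives
    lam0 = 0.5 * (lam.min() + lam.max())      # step 1 (assumed choice)
    S_hat = np.fft.fft(S)
    nz = kappa != 0
    T = np.full(n, float(T_mean))             # step 2: initial guess
    for it in range(max_iter):
        grad_T = np.fft.ifft(1j * kappa_d * np.fft.fft(T)).real
        P = (lam0 - lam) * grad_T             # step 3: P = lam0*T' + Q, Q = -lam*T'
        P_hat = np.fft.fft(P)                 # step 4
        T_hat = np.zeros(n, dtype=complex)    # step 5: (31.29) in NumPy's convention
        T_hat[nz] = (S_hat[nz] - 1j * kappa_d[nz] * P_hat[nz]) / (lam0 * kappa[nz] ** 2)
        T_hat[0] = n * T_mean
        T_new = np.fft.ifft(T_hat).real       # step 6
        if np.max(np.abs(T_new - T)) < tol:   # step 7
            return T_new, it + 1
        T = T_new
    return T, max_iter

n = 64
x = np.arange(n) / n
lam = 1.0 + 0.5 * np.cos(2 * np.pi * x)
# source manufactured such that T = sin(2*pi*x) is the exact solution
S = 4 * np.pi ** 2 * np.sin(2 * np.pi * x) + 2 * np.pi ** 2 * np.sin(4 * np.pi * x)
T, iters = solve_heat_variable(lam, S, T_mean=0.0)
```

With this choice of λ0, the error contracts by roughly a factor |λ0 − λ|/λ0 ≤ 1/2 per iteration in this example, so the loop terminates after a few dozen iterations.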

31.4 Mechanical Problem using Fourier Spectral Techniques

The quasistatic mechanical problem is, in principle, similar to the heat conduction scenario discussed above. The complication is, though, that the stress–strain relation is in general nonlinear, so that the ODE to be solved is not necessarily linear. Let us first investigate the finite-kinematics setting and then turn to linearized kinematics.

31.4.1 Finite Kinematics

Consider a quasistatic purely mechanical problem with linear momentum balance

$$
\operatorname{Div} P = 0. \tag{31.30}
$$

Since P = P(F) is a generally nonlinear stress–strain relation, transforming the above ODE into Fourier space does not present a solution. Therefore, let us introduce a perturbation stress tensor τ(X) such that

$$
P(X) = \mathbb{C}^0 F(X) - \tau(X) \tag{31.31}
$$

with some constant fourth-order modulus tensor C0 (e.g., the RVE average ⟨C⟩).

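Inserting the strain–displacement relation

$$
F(X) = \operatorname{Grad} \varphi(X) \tag{31.32}
$$

into (31.30) gives

$$
\left[ \mathbb{C}^0_{iJkL}\, \varphi_{k,L}(X) - \tau_{iJ}(X) \right]_{,J} = 0 \tag{31.33}
$$

or

$$
\mathbb{C}^0_{iJkL}\, \varphi_{k,LJ}(X) = \tau_{iJ,J}(X). \tag{31.34}
$$

This is a linear equation in the deformation mapping whose (inverse) Fourier transform here gives

$$
\varphi(X) = \sum_{K \in \mathcal{T}} \hat{\varphi}(K) \exp(-i h K \cdot X), \tag{31.35}
$$

and analogously we can transform τ(X). This turns (31.34) into

$$
-h^2\, \mathbb{C}^0_{iJkL}\, \hat{\varphi}_k(K) K_L K_J = -i h\, \hat{\tau}_{iJ}(K) K_J. \tag{31.36}
$$

Let us introduce the acoustic tensor

$$
A_{ik}(K) = \mathbb{C}^0_{iJkL} K_J K_L \tag{31.37}
$$

to transform (31.36) into

$$
-h A_{ik}(K)\, \hat{\varphi}_k(K) = -i\, \hat{\tau}_{iJ}(K) K_J, \tag{31.38}
$$

which can be solved for the deformation mapping in Fourier space:

$$
\hat{\varphi}_k(K) = \frac{i}{h} A^{-1}_{ik}(K)\, \hat{\tau}_{iJ}(K) K_J. \tag{31.39}
$$

Note that A(K) is invertible only if K ≠ 0 (and C0 is strongly elliptic).

Next, let us transform (31.32) into Fourier space, which gives

$$
\hat{F}_{kL}(K) = -i h\, \hat{\varphi}_k(K) K_L, \tag{31.40}
$$

and insertion of (31.39) finally yields

$$
\hat{F}_{kL}(K) = \begin{cases} \langle F \rangle & \text{if } K = 0, \\[4pt] A^{-1}_{ik}(K)\, \hat{\tau}_{iJ}(K) K_J K_L & \text{else,} \end{cases} \tag{31.41}
$$

where we exploited the fact that ⟨F⟩ = F̂(0).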

It is important to recall that τ depends on F, so that solving (31.41) for F can only be part of an iterative solution scheme. For example, (31.41) can be solved by fixed-point iteration.

We note that even though (31.41) is formulated in terms of the deformation gradient (and the deformation mapping is not even required), the solution is guaranteed to be compatible, as the following shows. Compatibility requires

$$
\operatorname{Curl} F = 0 \quad \Leftrightarrow \quad \epsilon_{KLI} F_{kL,K} = 0, \tag{31.42}
$$

or, in Fourier space,

$$
-i h\, \epsilon_{KLI} \hat{F}_{kL} K_K = 0. \tag{31.43}
$$

Insertion of (31.41) shows that

$$
\begin{aligned}
-i h\, \epsilon_{KLI} \hat{F}_{kL} K_K
&= -i h\, \epsilon_{KLI} A^{-1}_{ik}(K)\, \hat{\tau}_{iJ}(K) K_J K_K K_L \\
&= -i h\, \frac{1}{2} (\epsilon_{KLI} + \epsilon_{LKI})\, A^{-1}_{ik}(K)\, \hat{\tau}_{iJ}(K) K_J K_K K_L = 0.
\end{aligned}
\tag{31.44}
$$

Thus, the deformation gradient field resulting from (31.41) is compatible by construction.

The solution algorithm is now as follows:

1. Choose a stiffness tensor C0.

2. Pick an initial guess for F(X) at all grid points X_i (in physical space).

3. Compute P(X) and τ(X) at all grid points (in physical space).

4. Transform τ(X) into Fourier space to obtain τ̂(K) at all K-points.

5. Solve for all F̂(K) via (31.41) for a given average ⟨F⟩.

6. Transform F̂(K) back into physical space to obtain F(X) at all grid points.

7. Return to step 3 until the solution has converged for all grid points.

8. After convergence is reached, (31.39) can be used for postprocessing the deformation mapping.

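Note that we can conveniently use the Fourier transform to also compute other RVE-averages such as, e.g., the average stress tensor needed for computational homogenization since

$$
\langle P(X) \rangle = \hat{P}(0) = \frac{1}{V} \int_\Omega P(X)\, dV = \frac{1}{n} \sum_{i=1}^{n} P(X_i). \tag{31.45}
$$

Note that the effective incremental stiffness tensor is

$$
\begin{aligned}
\mathbb{C}^* = \frac{\partial P^*}{\partial F^*}
&= \frac{1}{n} \frac{\partial}{\partial F^*} \sum_{i=1}^{n} P(F(X_i))
= \frac{1}{n} \sum_{i=1}^{n} \frac{\partial P}{\partial F}(X_i) \cdot \frac{\partial F(X_i)}{\partial F^*} \\
&= \frac{1}{n} \sum_{i=1}^{n} \mathbb{C}(X_i) \cdot \frac{\partial}{\partial F^*} \Big[ \hat{F}(0) + \sum_{K \neq 0} \hat{F}(K) \exp(-i h K \cdot X_i) \Big] \\
&= \frac{1}{n} \sum_{i=1}^{n} \mathbb{C}(X_i) \cdot \Big[ \mathbb{I} + \sum_{K \neq 0} \exp(-i h K \cdot X_i)\, \frac{\partial \hat{F}(K)}{\partial F^*} \Big] \\
&= \frac{1}{n} \sum_{i=1}^{n} \mathbb{C}(X_i) \cdot \Big[ \mathbb{I} + \sum_{K \neq 0} \exp(-i h K \cdot X_i)\, \frac{\partial}{\partial F^*} A^{-T}(K)\, \hat{\tau}(K)\, K \otimes K \Big] \\
&= \frac{1}{n} \sum_{i=1}^{n} \mathbb{C}(X_i) \cdot \Big[ \mathbb{I} + \sum_{K \neq 0} \exp(-i h K \cdot X_i)\, A^{-T}(K)\, \frac{\partial \hat{\tau}(K)}{\partial F^*}\, K \otimes K \Big],
\end{aligned}
\tag{31.46}
$$

which unfortunately cannot be evaluated in closed form.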

31.4.2 Linearized Kinematics

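In linearized kinematics, the methodology is completely analogous and starts with introducing a perturbation stress tensor:

$$
\tau(x) = \mathbb{C}^0 \varepsilon(x) - \sigma(x). \tag{31.47}
$$

Applying the same steps now aims at solving for the displacement field. In Fourier space we obtain the analogous relation to (31.39), viz.

$$
\hat{u}_k(k) = \frac{i}{h} A^{-1}_{ik}(k)\, \hat{\tau}_{ij}(k)\, k_j. \tag{31.48}
$$

Here, the strain–displacement relation is ε = sym(grad u), which becomes in Fourier space

$$
\hat{\varepsilon}_{kl}(k) = -\frac{i h}{2} \left[ \hat{u}_k k_l + \hat{u}_l k_k \right]. \tag{31.49}
$$

Insertion of (31.48) yields the analogous form of (31.41), viz.

$$
\hat{\varepsilon}_{kl}(k) = \begin{cases} \langle \varepsilon \rangle & \text{if } k = 0, \\[4pt] \dfrac{1}{2} \left[ A^{-1}_{ik}(k)\, k_l + A^{-1}_{il}(k)\, k_k \right] \hat{\tau}_{ij}(k)\, k_j & \text{else.} \end{cases} \tag{31.50}
$$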

The displacement field in Fourier space finally follows from (31.48).

The numerical algorithm is analogous to the finite-deformation formulation.
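To make the algorithm concrete, consider the simplest possible setting: a 1D periodic bar with pointwise linear elastic response σ = E(x)ε. In 1D the acoustic tensor is A = E0 k², so that (31.50) reduces to ε̂(k) = τ̂(k)/E0 for k ≠ 0. The following sketch runs the fixed-point iteration for a three-section bar; the reference modulus (mean of the extreme moduli), the tolerance and the grid are assumptions of this example.

```python
import numpy as np

def solve_bar(E, eps_mean, tol=1e-12, max_iter=200):
    """Fixed-point iteration of (31.50) for a 1D periodic bar with moduli E(x_i)."""
    E0 = 0.5 * (E.min() + E.max())          # reference modulus (assumed choice)
    eps = np.full(E.size, float(eps_mean))  # initial guess: uniform strain
    for it in range(max_iter):
        sigma = E * eps                     # constitutive law at the grid points
        tau = E0 * eps - sigma              # perturbation stress (31.47)
        eps_hat = np.fft.fft(tau) / E0      # (31.50) for k != 0 in 1D
        eps_hat[0] = E.size * eps_mean      # k = 0: enforce the average strain
        eps_new = np.fft.ifft(eps_hat).real
        if np.max(np.abs(eps_new - eps)) < tol:
            return eps_new, it + 1
        eps = eps_new
    return eps, max_iter

n = 64
x = np.arange(n) / n
E = np.where((x >= 0.25) & (x < 0.75), 1.0, 2.0)   # stiff outer, soft inner section
eps, iters = solve_bar(E, eps_mean=0.1)
sigma = E * eps
```

At convergence the stress is uniform along the bar, as required by equilibrium in 1D, and the strains are 2/15 in the soft section and 1/15 in the stiff sections. Note that no derivative is evaluated in this 1D strain-based iteration, so the grid point values are free of oscillations.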

It is important to realize that, even though the stress–strain relation in linear elasticity is pointwise linear (σij = Cijklεkl), the Fourier transform can only result in a linear system – like in the heat conduction problem above – if C is constant (which is a rather uninteresting case). Otherwise, we have

$$
\sigma_{ij}(x) = \mathbb{C}_{ijkl}(x)\, \varepsilon_{kl}(x), \tag{31.51}
$$

whose Fourier transform involves a convolution, so that the equations in Fourier space do not decouple (which is why the perturbation stress tensor was introduced).


31.5 Fourier-Related Stability Issues

All of the above Fourier spectral formulations used a discrete grid with a finite set T of grid points to approximate a function f(x) by

$$
f^h(x) = \sum_{k \in \mathcal{T}} \hat{f}(k) \exp(-i h k \cdot x), \tag{31.52}
$$

where T denotes the finite set of grid points in spectral space, while T∞ is the countably infinite set of the corresponding exact Fourier representation, and we write T∗ = T∞ ∖ T. Fourier coefficients have pointwise convergence; however, the error of a truncated Fourier series, due to the high-frequency terms, depends on the smoothness of the function.

To see this, note that we can bound the truncation error at any point x ∈ Ω by

$$
\begin{aligned}
|f^h - f| &= \Big| \sum_{k \in \mathcal{T}^*} \hat{f}(k) \exp(-i h k \cdot x) \Big|
= \Big| \sum_{k \in \mathcal{T}^*} a(k) \sin(h k \cdot x) + b(k) \cos(h k \cdot x) \Big| \\
&\leq \sum_{k \in \mathcal{T}^*} |a(k) + b(k)| \leq \sum_{k \in \mathcal{T}^*} |a(k)| + |b(k)|,
\end{aligned}
\tag{31.53}
$$

where we used the triangle inequality for the last step, and we defined a = −i f̂, b = f̂.

By applying Poincaré's inequality to the transform and using Parseval's identity, we obtain

$$
\begin{aligned}
\sum_{k \in \mathcal{T}^*} |a(k)| + |b(k)|
&\leq \sum_{k \in \mathcal{T}^*} \frac{1}{|k|} \left( |a'(k)| + |b'(k)| \right) \\
&\leq \sqrt{2} \Big[ \sum_{k \in \mathcal{T}^*} |a'(k)|^2 + |b'(k)|^2 \Big]^{1/2} \\
&\leq \frac{\sqrt{2}}{\pi N} \|f'\|^{1/2}_{L^2(\Omega)}.
\end{aligned}
\tag{31.54}
$$

Thus, we have derived the bound

$$
\sup_{x \in \Omega} |f^h - f| \leq \frac{\sqrt{2}}{\pi N} \|f'\|^{1/2}_{L^2(\Omega)}, \tag{31.55}
$$

which depends on the smoothness of the function f. Most importantly, the decay rate of the Fourier coefficients results in a non-uniform convergence. The consequence is a problem known as ringing artifacts in case of sharp contrasts in local properties (such as jumps in elastic constants, electric permittivity, or crystal orientation) due to high-frequency oscillations in the solution near such discontinuities.

One can reduce such artifacts by ensuring that the derivative in a domain is bounded, which can be accomplished, e.g., by approximating the differential operator by a discrete, finite-difference operator before the Fourier transform. While this ensures asymptotic consistency, the spectral accuracy of the method is reduced to that of the finite-difference operator.

This works as follows. Instead of applying the Fourier transform to a spatial derivative of f directly, viz.

$$
\mathcal{F}\Big( \frac{\partial f}{\partial x_i} \Big) = -i h k_i\, \mathcal{F}(f), \tag{31.56}
$$

we first apply a central-difference approximation to the partial derivatives such that for a grid spacing |∆x_i| ≪ 1 in the coordinate direction x_i with Cartesian unit vector e_i (no summation over i implied)

$$
\frac{\partial f}{\partial x_i}(x) = \frac{f(x + \Delta x_i e_i) - f(x - \Delta x_i e_i)}{2 \Delta x_i} + O(\Delta x_i^2). \tag{31.57}
$$

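Neglecting all higher-order terms and applying the discrete inverse Fourier transform yields

$$
\frac{\partial f}{\partial x_i} = \frac{\partial}{\partial x_i} \sum_{k \in \mathcal{T}} \hat{f}(k) \exp(-i h k \cdot x)
\approx \sum_{k \in \mathcal{T}} \hat{f}(k)\, \frac{\exp[-i h k \cdot (x + \Delta x_i e_i)] - \exp[-i h k \cdot (x - \Delta x_i e_i)]}{2 \Delta x_i}. \tag{31.58}
$$

Note that

$$
\exp(-i h k \cdot (x \pm \Delta x)) = \exp(-i h k \cdot x) \exp(\mp i h k \cdot \Delta x), \tag{31.59}
$$

as well as Euler's identity

$$
\frac{\exp(i h k \cdot \Delta x) - \exp(-i h k \cdot \Delta x)}{2i} = \sin(h k \cdot \Delta x), \tag{31.60}
$$

which transforms the above into (no summation over i implied)

$$
\frac{\partial f}{\partial x_i}(x) \approx -\sum_{k \in \mathcal{T}} \hat{f}(k) \exp(-i h k \cdot x)\, \frac{i \sin(h k_i \Delta x_i)}{\Delta x_i}. \tag{31.61}
$$

For a regular grid with equal spacings ∆x_i = ∆x, we thus obtain

$$
\frac{\partial f}{\partial x_i} \approx -\sum_{k \in \mathcal{T}} \frac{i \sin(h k_i \Delta x)}{\Delta x}\, \hat{f}(k) \exp(-i h k \cdot x). \tag{31.62}
$$

Finally, taking the Fourier transform leads to an approximate form of (31.56), viz.

$$
\mathcal{F}\Big( \frac{\partial f}{\partial x_i} \Big) \approx -i\, \frac{\sin(h k_i \Delta x)}{\Delta x}\, \mathcal{F}(f), \tag{31.63}
$$

where the fractional term is related to the so-called Lanczos-σ factor.

In the limit of infinitely fine grids, notice that the above approximation converges to the exact relation since

$$
\lim_{\Delta x \to 0} -i\, \frac{\sin(h k_i \Delta x)}{\Delta x} = -i h k_i. \tag{31.64}
$$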

The truncation error of the central difference scheme creates numerical errors. These can be reduced by starting with higher-order finite difference stencils in (31.57). For example, consider the fourth-order central difference approximation

$$
\frac{\partial f}{\partial x_i}(x) = \frac{-f(x + 2\Delta x\, e_i) + 8 f(x + \Delta x\, e_i) - 8 f(x - \Delta x\, e_i) + f(x - 2\Delta x\, e_i)}{12 \Delta x} + O(\Delta x^4). \tag{31.65}
$$

Proceeding with the same approach outlined above for the second-order accurate scheme, we arrive at

$$
\mathcal{F}\Big( \frac{\partial f}{\partial x_i} \Big) \approx -i \left[ \frac{8 \sin(h k_i \Delta x)}{6 \Delta x} - \frac{\sin(2 h k_i \Delta x)}{6 \Delta x} \right] \mathcal{F}(f). \tag{31.66}
$$

Again, we recover the exact solution as the grid spacing tends to zero because

$$
\lim_{\Delta x \to 0} \left[ \frac{8 \sin(h k_i \Delta x)}{6 \Delta x} - \frac{\sin(2 h k_i \Delta x)}{6 \Delta x} \right] = h k_i. \tag{31.67}
$$

The same procedure can be applied to finite-difference approximations of arbitrary order to result in increased accuracy.
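The modified wave numbers in (31.63) and (31.66) are simple closed-form expressions, and their consistency orders can be checked numerically. In the sketch below, `kappa` stands for the product h k_i, the function name is made up, and the chosen wave number and grid sizes are arbitrary.

```python
import numpy as np

def modified_wavenumber(kappa, dx, order=2):
    """Finite-difference replacement for kappa = h*k_i, cf. (31.63) and (31.66)."""
    if order == 2:
        return np.sin(kappa * dx) / dx
    if order == 4:
        return (8.0 * np.sin(kappa * dx) - np.sin(2.0 * kappa * dx)) / (6.0 * dx)
    raise ValueError("only orders 2 and 4 are implemented in this sketch")

kappa = 2 * np.pi * 3                      # wave number of sin(2*pi*3*x) on [0, 1)
errors = {}
for n in (32, 64):
    dx = 1.0 / n
    errors[n] = (abs(modified_wavenumber(kappa, dx, 2) - kappa),
                 abs(modified_wavenumber(kappa, dx, 4) - kappa))

ratio2 = errors[32][0] / errors[64][0]     # close to 4 for the second-order stencil
ratio4 = errors[32][1] / errors[64][1]     # close to 16 for the fourth-order stencil
```

Halving the grid spacing reduces the deviation from the exact wave number by roughly a factor of 4 and 16, respectively, in line with the O(∆x²) and O(∆x⁴) truncation errors of (31.57) and (31.65).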

Examples: benchmark tests

The effectiveness of the discrete derivative approximation is demonstrated in Fig. 2, which illustrates the numerically-computed derivatives of two 1D examples, viz. a Heaviside step function and a non-periodic half-sine wave. In both cases, we compare the exact (discontinuous) solution, the solution based on the standard Fourier transform, and the modified Fourier transform described above. Although they do not disappear entirely, high-frequency artifacts are considerably reduced by the finite-difference approximation as compared to the classical Fourier transform. Besides numerical errors in the solution scheme, the oscillations in the results obtained from the classical and modified Fourier transforms also stem from the Fourier interpolation after solving for function values at grid points (cf. the discrete solution at grid points also shown in Figs. 2 and 3).

To illustrate the effectiveness of higher-order corrections, we present in Fig. 4 the convergence order of the Euclidean error norm for corrections of varying order applied to the smooth function

$$
f(x) = \sum_{n=1}^{9} \sin(2 n \pi x). \tag{31.68}
$$

[Figure 2, panels (a) and (b): function f versus normalized length x/L; legend entries: modified Fourier transform, analytical derivative, 1st order correction, 4th order correction.]

Figure 2: Illustration of the effectiveness of the first- and fourth-order finite-difference correction on the spectral derivative (using FFT) of the double step function f(x) = δ(x − 1/4 + 2^{−10}) + δ(x − 3/4 − 2^{−10}) periodically continued with period x ∈ [0,1). Shown are solutions obtained from the classical Fourier spectral method, the modified Fourier transform (first-order and fourth-order correction), and the exact analytical solution.


[Figure 3, panels (a) and (b): function g versus normalized length x/L; legend entries: modified Fourier transform, analytical derivative, 1st order correction, 4th order correction.]

Figure 3: Illustration of the effectiveness of the first- and fourth-order finite-difference correction on the spectral derivative (using FFT) of the half-sine function f(x) = π cos(πx) periodically continued with period x ∈ [0,1). Shown are solutions obtained from the classical Fourier spectral method, the modified Fourier transform (first-order and fourth-order correction), and the exact analytical solution.

[Figure 4, panels (a) and (b): (a) log2(E) versus log2(n); (b) convergence order versus log2(n); legend entries: standard Fourier transform, 1st, 2nd, 3rd and 4th order correction.]

Figure 4: (a) Mesh convergence indicating the loss of accuracy of the spectral method; shown is the error E = ||u^h − u||_{L2} vs. number of grid points n. (b) Comparison of the convergence order of the Fourier spectral scheme for different orders of the finite-difference correction.

As can be expected for smooth functions, the spectral accuracy is degraded when using the modified Fourier transform with finite-difference approximation (see Fig. 4a). Using a higher-order correction is shown to be advantageous, as the error decreases significantly while resulting in lower convergence order (see Fig. 4b). Thus, the correction order can be chosen to control the competition between accuracy and convergence rate.

Next, we demonstrate the asymptotic consistency of the modified spectral scheme for a simple elasticity problem. Consider a composite bar of length L having piecewise-constant cross-sections and Young's moduli, as shown in Fig. 5, and subjected to uniaxial tension such that the average strain is 0.1. The outer two sections have Young's modulus 2E and lengths 0.25L, while the inner section has Young's modulus E and length 0.5L. The resulting axial strain distribution in the bar as obtained from the classical and modified Fourier spectral scheme with first- and fourth-order corrections is plotted in Fig. 5, demonstrating the effectiveness of the modified scheme in suppressing Gibbs effects.

[Figure 5: axial strain ε_x versus normalized position x/L; legend entries: analytical solution, standard spectral method, 4th order correction, 1st order correction.]

Figure 5: Distribution of the axial strain in a composite bar obtained from solving linear momentum balance by the iterative spectral method with first-order, fourth-order and without correction, compared to the exact piecewise-constant solution. Dots denote the converged grid point values, whereas lines follow from Fourier interpolation.

As a representative 2D example of a high-contrast RVE, we compute the response of a linear elastic composite made of a circular inclusion (radius 0.25L; normalized Lamé moduli λ = 1, µ = 1) embedded in a matrix (normalized Lamé moduli λ = 0.6, µ = 0.6, outer side length L). Plotted in Fig. 6 is the normal stress distribution within the RVE for a biaxial tension test (for simplicity, shown is the affine interpolation of the 256 × 256 grid point values, i.e., the shown solution is not affected by errors in the Fourier interpolation). The classical Fourier spectral method results in strong artifacts near the interface between inclusion and matrix (Fig. 6a), which are reduced significantly by using the first- and fourth-order corrections (see Fig. 6b and c).

[Figure 6, panels (a), (b) and (c): stress σ11/µ_mat plotted over position x/L and position y/L.]

Figure 6: Uniaxial stress distribution in an elastic square-RVE with a circular inclusion using (a) the standard iterative spectral method and the modified spectral method with (b) fourth-order and (c) first-order correction. The first-order correction over-smooths, the fourth-order correction still produces small oscillations, but note that no Fourier interpolation has been performed here (shown are the grid point values).


It is important to point out that several previous approaches circumvented ringing artifacts near high-contrast interfaces by aborting the iterative solution scheme before oscillations appeared in the solution. Unlike the corrections presented above, that approach guarantees neither convergence nor bounded errors and, especially, is not suitable for a staggered, time-stepping solution procedure such as that used for the dissipative processes in ferroelectrics. The numerical error of each truncated iteration can propagate and escalate over time, which is why the method presented here is superior in such scenarios.



32 Bloch Wave Analysis

So far, we have dealt with quasistatic homogenization problems. Bloch wave analysis deals with the propagation of linear waves in periodic media and originated from harmonic atomic crystals. For a given RVE, we study linearized waves of the form

$$
u(x, t) = \hat{u}(x) \exp(-i \omega t) \tag{32.1}
$$

with propagation frequency ω ∈ R. For a periodic microstructure with RVE Ω, a Bloch wave is defined by

$$
\hat{u}(x) = \hat{u}_0(x_0) \exp[-i k \cdot (x - x_0)], \tag{32.2}
$$

where k denotes the wave vector. The point x0 is located within the reference RVE and is linked to any point x by periodicity, i.e.,

$$
x = x_0 + \sum_{a=1}^{d} c_a L_a \qquad \text{with } x_0 \in \Omega. \tag{32.3}
$$

We now aim to solve a linearized wave propagation problem, for which the governing equations are linearized about the current solution:

$$ M\ddot{U} + KU = F. \tag{32.4} $$

Application of

U(x, t) = U(x) exp(−iωt) (32.5)

yields

(K − ω2M)U = F . (32.6)

By exploiting the Bloch wave form, the RVE problem now becomes

(K − ω2M)U0 = F0 (32.7)

with

$$ U_0^{+} = U_0^{-}\exp\left[-ik\cdot(x^{+}-x^{-})\right] \quad\text{and}\quad F_0^{+} = -F_0^{-}\exp\left[ik\cdot(x^{+}-x^{-})\right], \tag{32.8} $$

which can be solved in multiple ways. Note that because of inertial contributions of the moving wave, forces are no longer the same on opposite surfaces.

Note that we do not have to check all wave vectors. To notice this, recall that

exp [i(k ⋅x + 2πn)] = exp [ik ⋅x] ∀ n ∈ Z. (32.9)

This implies that we can restrict the choice of k to vectors lying in the First Brillouin Zone, i.e., the primitive cell in reciprocal space. By exploiting symmetry, one can often reduce the space of k further to the so-called Irreducible Brillouin Zone.
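As a minimal illustration of the periodicity argument, consider a 1D mass-spring chain whose RVE contains a single mass (an assumed toy model, not taken from the notes). With F = 0, the Bloch-reduced problem (K − ω²M)U0 = 0 becomes a scalar equation, and the resulting frequency repeats with period 2π/a in k. The names `chain_frequency` and `fold_into_first_bz` and all parameter values are hypothetical.

```python
import cmath
import math


def chain_frequency(k, stiffness=1.0, mass=1.0, spacing=1.0):
    """Frequency of a Bloch wave with wave number k in a 1D mass-spring chain.

    The unit cell holds one mass; its two neighbors follow from the Bloch
    condition u(x0 + c*a) = u0 * exp(-i*k*c*a), so (K - w^2 M) u0 = 0
    reduces to a single scalar equation.
    """
    phase = cmath.exp(-1j * k * spacing)
    # Bloch-reduced stiffness: K * (2 - e^{-ika} - e^{+ika}) = 2K (1 - cos(ka))
    k_eff = stiffness * (2.0 - phase - phase.conjugate())
    return math.sqrt(max(k_eff.real, 0.0) / mass)


def fold_into_first_bz(k, spacing=1.0):
    """Map k to the equivalent wave number in [-pi/a, pi/a)."""
    g = 2.0 * math.pi / spacing  # reciprocal lattice vector in 1D
    return (k + 0.5 * g) % g - 0.5 * g
```

Evaluating `chain_frequency` at k and at k + 2π/a gives the same frequency, so sampling k in the interval returned by `fold_into_first_bz` is sufficient for this toy chain.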



33 Analytical Techniques: Cauchy-Born Kinematics

The above multiscale techniques involved solving a BVP on the microscale to extract the effective material behavior on the macroscale. Under certain conditions, one may want to formulate a simpler microscale problem that does not necessarily require solving a microscale BVP but admits analytical solutions for average microscale quantities.

As before, we assume a clear spatial separation of scales between micro- and macroscale features. Assume the macroscale deformation mapping ϕ ∶ Ω → Rd is known, then we may expand the deformation within an infinitesimal neighborhood of a point X0 as

$$ \begin{aligned} \varphi_i(X) &= \varphi_i(X_0) + \frac{\partial\varphi_i}{\partial X_J}(X_0)\,dX_J + \frac{1}{2}\frac{\partial^2\varphi_i}{\partial X_J\,\partial X_K}(X_0)\,dX_J\,dX_K + \text{h.o.t.} \\ &= \varphi_i(X_0) + F_{iJ}(X_0)\,dX_J + \frac{1}{2}F_{iJ,K}(X_0)\,dX_J\,dX_K + \text{h.o.t.} \end{aligned} \tag{33.1} $$

with deformation gradient F, second gradient Grad F, and higher-order terms continued analogously. Hence, if the deformation at a point (characterized by ϕ(X0), F(X0), Grad F(X0), etc.) is known, the deformation in its infinitesimal neighborhood is obtained from the above Taylor expansion. Dropping higher-order terms results in an approximation.

Now, consider a microstructure that admits a separation of scales from the macroscale. In this case, one may regard (33.1) as an approximation of the deformation across the RVE, so that infinitesimal distances dX are replaced by finite but small microscale distances ∆X = X − X0:

$$ \varphi_i(X) \approx \varphi_i(X_0) + F_{iJ}(X_0)\,\Delta X_J + \frac{1}{2}F_{iJ,K}(X_0)\,\Delta X_J\,\Delta X_K + \text{h.o.t.} \tag{33.2} $$

The simplest approximation is obtained by dropping all but the linear term, which results in the so-called (historically not quite correct) local Cauchy-Born rule

ϕi(X) = ϕi(X0) + FiJ(X0)∆XJ , ∆X =X −X0, (33.3)

where X0 is some reference point on the RVE level (e.g., the center of mass of the RVE). Obviously, (33.3) prescribes an affine deformation with constant deformation gradient F∗ = F(X0) across the entire RVE (not only on the boundary as in the affine-displacement case discussed before). Note that for quasistatic problems the constant term ϕi(X0) constitutes rigid body motion of the RVE, so that omitting this term (or, equivalently, arbitrarily choosing X0) does not affect the deformation of the RVE.
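The local Cauchy-Born rule (33.3) amounts to one affine map per point. The following is a minimal sketch in plain Python; the function name `cauchy_born_local`, the 2D simple-shear deformation gradient, and the unit-square RVE are assumptions chosen only for illustration.

```python
def cauchy_born_local(F, X, X0, phi0):
    """Deformed position phi(X) = phi(X0) + F (X - X0), cf. (33.3)."""
    d = len(X)
    dX = [X[J] - X0[J] for J in range(d)]
    return [phi0[i] + sum(F[i][J] * dX[J] for J in range(d)) for i in range(d)]


# Simple shear applied to the corners of a unit-square RVE centered at the origin
F = [[1.0, 0.2],
     [0.0, 1.0]]
corners = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
deformed = [cauchy_born_local(F, X, X0=(0.0, 0.0), phi0=(0.0, 0.0)) for X in corners]
```

Changing `phi0` only translates all deformed points rigidly, which is the reason the constant term can be dropped in quasistatic problems.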

Assume now that the material in the RVE has an energy density W(F, X), then the above approximation results in the average energy density of the RVE being (for simplicity choosing X0 = 0)

$$ W^*(F^*) = \frac{1}{V}\int_\Omega W(F^*, X)\,dV = \langle W(F^*, X)\rangle. \tag{33.4} $$

The average first Piola-Kirchhoff stress tensor can be obtained in closed form as

$$ P^* = \frac{\partial W^*}{\partial F^*}(F^*) = \left\langle \frac{\partial W}{\partial F}(F^*, X)\right\rangle = \langle P(F^*, X)\rangle. \tag{33.5} $$



Analogously, the incremental stiffness tensor C∗ follows from averaging.

The so-called extended Cauchy-Born rule (also known as nonlocal Cauchy-Born rule) retains one more term in the above expansion, resulting in

$$ \varphi_i(X) = \varphi_i(X_0) + F_{iJ}(X_0)\,\Delta X_J + \frac{1}{2}F_{iJ,K}(X_0)\,\Delta X_J\,\Delta X_K, \qquad \Delta X = X - X_0. \tag{33.6} $$

Here, the RVE deformation is no longer affine but follows the linearly-varying deformation gradient

FiJ = FiJ(X0) + FiJ,K(X0)∆XK , (33.7)

where we exploited that FiJ,K = ϕi,JK = ϕi,KJ = FiK,J .
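For completeness, the step from (33.6) to (33.7) can be written out as follows. This is only a worked differentiation of (33.6) with respect to X_L, using the stated symmetry of the second gradient; no new assumptions enter.

```latex
\begin{aligned}
\frac{\partial\varphi_i}{\partial X_L}
&= F_{iJ}(X_0)\,\delta_{JL}
 + \tfrac{1}{2}F_{iJ,K}(X_0)\left(\delta_{JL}\,\Delta X_K + \Delta X_J\,\delta_{KL}\right) \\
&= F_{iL}(X_0)
 + \tfrac{1}{2}\left(F_{iL,K}(X_0)\,\Delta X_K + F_{iJ,L}(X_0)\,\Delta X_J\right) \\
&= F_{iL}(X_0) + F_{iL,K}(X_0)\,\Delta X_K ,
\qquad\text{since } F_{iJ,L} = F_{iL,J}.
\end{aligned}
```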

Again, if the RVE’s material has an energy density W (F ,X), then the average is

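$$ W^*(F^*, \operatorname{Grad}F^*) = \frac{1}{V}\int_\Omega W(F^* + \operatorname{Grad}F^*\,X,\, X)\,dV = \langle W(F^* + \operatorname{Grad}F^*\,X,\, X)\rangle. \tag{33.8} $$

The result is an effective nonlocal model whose energy density depends on both F∗ and Grad F∗ on the macroscale. The resulting first Piola-Kirchhoff stress tensor is by definition

$$ P^*_{iJ} = \frac{\partial W^*}{\partial F^*_{iJ}}(F^*) = \langle P_{iJ}(F^* + \operatorname{Grad}F^*\,X,\, X)\rangle, \tag{33.9} $$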

and there is a third-order couple stress tensor

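$$ \begin{aligned} Y^*_{iJK} = \frac{\partial W^*}{\partial F^*_{iJ,K}}(F^*) &= \frac{1}{V}\int_\Omega \frac{\partial W}{\partial F_{kL}}(F^* + \operatorname{Grad}F^*\,X,\, X)\,\frac{\partial\left(F^*_{kL} + F^*_{kL,M}X_M\right)}{\partial F^*_{iJ,K}}\,dV \\ &= \langle P_{iJ}(F^* + \operatorname{Grad}F^*\,X,\, X)\,X_K\rangle. \end{aligned} \tag{33.10} $$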

Note that now the macroscale variational problem must be revised as the energy density is no longer local. This results in added complexity and requires reformulation in particular of boundary conditions and variations. For example, consider a total potential energy (dropping the asterisks for convenience) in its most general form

I[ϕ] = ∫ΩW (ϕ,F ,GradF ,Grad GradF , . . .) dV (33.11)

whose first variation is

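$$ \delta I[\varphi] = \int_\Omega\left(\frac{\partial W}{\partial\varphi_i}\,\delta\varphi_i + \frac{\partial W}{\partial F_{iJ}}\,\delta\varphi_{i,J} + \frac{\partial W}{\partial F_{iJ,K}}\,\delta\varphi_{i,JK} + \ldots\right)dV. \tag{33.12} $$

Using our usual Bubnov-Galerkin discretization scheme, the internal force vector becomes

$$ F^a_{\text{int},i} = \int_\Omega\left(\frac{\partial W}{\partial\varphi_i}\,N^a + P_{iJ}\,N^a_{,J} + Y_{iJK}\,N^a_{,JK} + \ldots\right)dV. \tag{33.13} $$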

This requires higher-order interpolation. E.g., using the above formulation with W = W(F, Grad F), we need at least quadratic interpolation, since the second derivatives N^a_{,JK} of the shape functions must not vanish identically. We will not discuss this problem here in detail (see, e.g., the rich literature on gradient elasticity theories).

The above formulation can be extended as follows.



First, it can be applied to discrete RVE descriptions (e.g., in case of atomic crystals or structural trusses). In this case, (33.2) is applied to the position of each discrete point in the RVE.

Second, consider a discrete RVE in which only some points are constrained by the Cauchy-Born rule while others are free to displace (e.g., in complex atomic crystals in which the RVE corners displace in an affine manner while interior atoms show relative motion, so-called shuffles, with respect to the corners). Here, the procedure is analogous to that of FE2: the constrained nodes are deformed according to the macroscale deformation while the displacements of the remaining nodes are determined from solving the microscale equilibrium equations.



34 Atomistics: Molecular Statics

34.1 A Cartoon Introduction to Quantum Mechanics

At the level of elementary particles, matter is commonly described as waves rather than particles.

Consider a vibrating rod or string whose 1D equation of motion is

$$ \frac{\partial^2}{\partial t^2}u(x,t) = c^2\,\frac{\partial^2}{\partial x^2}u(x,t) \tag{34.1} $$

with wave speed $c = \sqrt{E/m}$. This is the approach of classical mechanics. At the quantum level, the governing equations can be considered as follows.

Note that using the form $u(x,t) = \hat{u}\exp[2\pi i(kx - \nu t)]$ we may conclude that

$$ \frac{\partial^2 u}{\partial x^2} = -4\pi^2 k^2 u, \tag{34.2} $$

and

$$ \frac{\partial u}{\partial t} = -i\,2\pi\nu\,u. \tag{34.3} $$

Next, consider the momentum and energy of a free particle, which behaves like a wave, unlike in classical mechanics.

De Broglie postulated that wavelength and momentum are related by

$$ \lambda = \frac{h}{p} \quad\text{and}\quad \lambda = \frac{1}{k} \quad\Rightarrow\quad p = kh. \tag{34.4} $$

Furthermore, the Einstein-Planck relation postulates (with Planck’s constant h) that

E = hν. (34.5)

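Now, notice that multiplication by ih transforms (34.3) into

$$ ih\,\frac{\partial u}{\partial t} = h\,2\pi\nu\,u = 2\pi(h\nu)\,u = 2\pi E\,u \quad\Leftrightarrow\quad i\hbar\,\frac{\partial u}{\partial t} = E\,u, \tag{34.6} $$

where we introduced for convenience

$$ \hbar = \frac{h}{2\pi}. \tag{34.7} $$

Similarly, (34.2) is multiplied by $\hbar^2/2m$ to result in

$$ \frac{\hbar^2}{2m}\frac{\partial^2 u}{\partial x^2} = -\frac{h^2}{2m\,4\pi^2}\,4\pi^2 k^2\,u = -\frac{(hk)^2}{2m}\,u = -\frac{p^2}{2m}\,u. \tag{34.8} $$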

Finally, for a single particle in an external potential V, the total energy is

$$ E = \frac{p^2}{2m} + V(x). \tag{34.9} $$



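In summary, we now have

$$ \frac{\hbar^2}{2m}\frac{\partial^2 u}{\partial x^2} = -\frac{p^2}{2m}\,u \quad\text{and}\quad i\hbar\,\frac{\partial u}{\partial t} = E\,u = \left(\frac{p^2}{2m} + V(x)\right)u. \tag{34.10} $$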

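Simple rearrangement and combination of the two equations finally yields Schrödinger's equation

$$ \left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x)\right]\Psi(x,t) = i\hbar\,\frac{\partial}{\partial t}\Psi(x,t) \quad\text{or}\quad H\Psi = i\hbar\,\frac{\partial\Psi}{\partial t}, \tag{34.11} $$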

where we replaced the displacement u by a general wave form Ψ. We also introduced the abbreviation

$$ H = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x). \tag{34.12} $$

Note that, like the wave equation, the solution to Schrödinger's equation can be written in a separable form

$$ \Psi(x,t) = \sum_{n=1}^{\infty}\Psi_n(x)\exp(-iE_n t/\hbar), \tag{34.13} $$

whose insertion into Schrödinger's equation yields an eigenvalue problem

HΨn = EnΨn (34.14)

with eigenvalues En (the energy levels) and associated (standing) wave forms Ψn.

Recall that the interaction potential between two charged particles at a distance r having charges q1 and q2 is given by the Coulomb potential

$$ V(r) = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} \tag{34.15} $$

with ε0 denoting the electric permittivity of free space.

For a general system of multiple charged particles (electrons with charge −e and nuclei made up of Z protons with charges +e), the potential energy can thus be written as

$$ V = \underbrace{\sum_{\alpha\neq\beta}\frac{(Z_\alpha e)(Z_\beta e)}{4\pi\varepsilon_0\,|r_{\alpha\beta}|}}_{\text{nuclei interactions}} + \underbrace{\sum_{i\neq j}\frac{e^2}{4\pi\varepsilon_0\,|r_{ij}|}}_{\text{electron interact.}} - \underbrace{\sum_{i,\alpha}\frac{(Z_\alpha e)\,e}{4\pi\varepsilon_0\,|r_{\alpha i}|}}_{\text{electron/nuclei interact.}} \tag{34.16} $$

where $Z_\alpha$ denotes the charge (number of protons) in the respective nucleus.

Most important for the development of the concepts of atomistics is the Born-Oppenheimer approximation: electrons are assumed to be moving orders of magnitude faster than the heavy nuclei. Therefore, one can solve for the wave function of electrons while assuming the nuclei remain static at known positions.



Example: Hydrogen atom

Hydrogen is the simplest atom possible, having only one electron and one proton. Consequently, the Hamiltonian in Schrödinger's equation for the electron (assuming the proton is fixed at x = 0 by the Born-Oppenheimer approximation) simplifies dramatically to

$$ H = -\frac{\hbar^2}{2m_e}\nabla^2 - \frac{e^2}{4\pi\varepsilon_0\,\|x\|} \tag{34.17} $$

and one can solve for eigenforms by solving the eigenvalue problem

$$ \left(-\frac{\hbar^2}{2m_e}\nabla^2 - \frac{e^2}{4\pi\varepsilon_0\,\|x\|}\right)\Psi_n(x) = E_n\Psi_n(x). \tag{34.18} $$

In fact, the solution can be obtained analytically in spherical coordinates:

$$ E_n = -\frac{m_e e^4}{8\varepsilon_0^2 h^2 n^2}, \qquad \Psi_{(n,l,m)}(r,\theta,\varphi) = c_{n,l,m}\,R_{n,l}(r)\,P_l^m(\cos\theta)\exp(im\varphi), \tag{34.19} $$

with (associated) Legendre polynomials $P_l^m$, radial functions $R_{n,l}$ based on Laguerre polynomials, and known, closed-form constants $c_{n,l,m}$ (omitted here for conciseness).

For every wave function Ψ, one can define the spatial probability density ∣Ψ∣2. Plotting the probability density for the various energy levels results in the well-known orbitals.

For example, for a simple 1D system with a single particle in a box of length L, the solution is

$$ \Psi_n(x) = \sqrt{\frac{2}{L}}\,\sin(n\pi x/L). \tag{34.20} $$
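As a quick numerical sanity check (a sketch, not part of the original derivation), one can confirm that the wave forms (34.20) are normalized, so that ∣Ψn∣2 integrates to one over the box, and that different modes are mutually orthogonal. The helper names `psi` and `overlap` and the midpoint quadrature are assumptions made for this illustration.

```python
import math


def psi(n, x, L=1.0):
    """Standing wave form of a particle in a 1D box of length L, cf. (34.20)."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * x / L)


def overlap(n, m, L=1.0, num=2000):
    """Midpoint-rule approximation of the integral of psi_n * psi_m over [0, L]."""
    h = L / num
    return h * sum(psi(n, (i + 0.5) * h, L) * psi(m, (i + 0.5) * h, L)
                   for i in range(num))
```

Here `overlap(n, n)` approximates the total probability of finding the particle in the box, while `overlap(n, m)` with n ≠ m approximates the inner product of two distinct eigenforms.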

Next, let us consider more complex atomic systems composed of electrons at positions $q^e = \{q^e_1, \ldots, q^e_n\}$ and nuclei at positions $q^n = \{q^n_1, \ldots, q^n_n\}$.

Let us consider the time-independent Schrödinger equation for this scenario, i.e.,

HΨ(qe,qn) = EΨ(qe,qn). (34.21)

Using the Born-Oppenheimer approximation, we separate the (fast) electronic from the (slow) nuclei contributions by assuming a solution

Ψ(qe,qn) = Ψel(qe;qn)Ψnuc(qn). (34.22)

This leads to two decoupled problems, viz. on the one hand

Hel Ψel(qe;qn) = V (qn)Ψel(qe;qn), (34.23)

with

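$$ H_{\text{el}} = \sum_{i}^{N_e}\frac{|p^e_i|^2}{2m_e} + \sum_{\alpha\neq\beta}\frac{(Z_\alpha e)(Z_\beta e)}{4\pi\varepsilon_0\,|q^n_\alpha - q^n_\beta|} + \sum_{i\neq j}\frac{e^2}{4\pi\varepsilon_0\,|q^e_i - q^e_j|} - \sum_{i,\alpha}\frac{(Z_\alpha e)\,e}{4\pi\varepsilon_0\,|q^n_\alpha - q^e_i|}. \tag{34.24} $$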



This implies that for given nuclei positions $q^n = \{q^n_1, \ldots, q^n_n\}$, one solves for the electron eigenstates and the associated energy $V(q^n)$, which is a function of the nuclei positions only.

On the other hand, one then finds the solutions for the nuclei positions from

Hnuc Ψnuc(qn) = EΨnuc(qn), (34.25)

with

$$ H_{\text{nuc}} = \sum_{i}^{N}\frac{|p^n_i|^2}{2m_i} + V(q^n), \tag{34.26} $$

which, of course, can also be replaced by the time-dependent form (which is generally the case). Note that, at this point, all electron contributions have been condensed out; i.e., if V(q) is known, we can solve the above problem for the nuclei positions only. This is the approach taken by atomistics, to be discussed in the following.

34.2 Hamiltonian Description

In classical mechanics, an atomistic ensemble containing N atoms is uniquely described by how their nuclei positions

$$ q(t) = \{q_1(t), \ldots, q_N(t)\} \tag{34.27} $$

and the derived nuclei momenta

$$ p(t) = \{p_1(t), \ldots, p_N(t)\} \tag{34.28} $$

with $p_i = m_i\dot{q}_i$ evolve with time t. Here, $m_i$ represents the mass of atom i, and dots denote material time derivatives. The ensemble's total Hamiltonian is given by

$$ H(q,p) = \sum_{i=1}^{N}\frac{|p_i|^2}{2m_i} + V(q), \tag{34.29} $$

where the first term accounts for the kinetic energy and the second term represents the potential energy of the ensemble with V denoting a suitable atomic interaction potential.

Compared to the electron wave functions discussed above, atomic nuclei are moving relatively slowly, which is why one assumes that the time evolution of an atomistic system is governed by Hamilton's equations,

$$ \dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}. \tag{34.30} $$

The second equation yields Newton's equations of motion for all atoms i = 1, . . . , N:

$$ m_i\ddot{q}_i = f_i(q) = -\frac{\partial V}{\partial q_i}(q), \tag{34.31} $$

where $f_i(q)$ represents the total (net) force acting on atom i. For quasistatics, these equations reduce to

$$ f_i(q) = -\frac{\partial V}{\partial q_i}(q) = 0, \tag{34.32} $$



and they aim to minimize the total potential energy.

More generally, classical Hamiltonian mechanics is considered appropriate if the de Broglie wavelength is much smaller than typical atomic spacings a, i.e., if

$$ \lambda = \frac{h}{p} \ll a. \tag{34.33} $$
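To get a feeling for the magnitudes in (34.33), the sketch below estimates the de Broglie wavelength of a copper atom at room temperature. It assumes a thermal momentum obtained from p²/(2m) = (3/2) k_B T, which is an assumption of this illustration and not stated in the notes; the copper atomic mass (63.546 u) and the approximate fcc lattice constant (3.6 Å) are standard reference values.

```python
import math

H_PLANCK = 6.62607015e-34            # Planck's constant [J s]
K_BOLTZMANN = 1.380649e-23           # Boltzmann's constant [J/K]
ATOMIC_MASS_UNIT = 1.66053906660e-27  # atomic mass unit [kg]


def thermal_de_broglie_wavelength(mass_u, temperature):
    """lambda = h / p with p estimated from p^2 / (2 m) = 3/2 k_B T (assumption)."""
    mass = mass_u * ATOMIC_MASS_UNIT
    momentum = math.sqrt(3.0 * mass * K_BOLTZMANN * temperature)
    return H_PLANCK / momentum


lam = thermal_de_broglie_wavelength(63.546, 300.0)  # copper atom at 300 K
a = 3.6e-10                                         # approx. Cu lattice constant [m]
```

Under these assumptions `lam` is roughly 0.2 Å, i.e., more than an order of magnitude below `a`, which is consistent with treating the nuclei classically.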

In most materials (including metals, ceramics and organic materials), the interaction potential allows for an additive decomposition, i.e.

$$ V(q) = \sum_{i=1}^{N}V_i(q) \tag{34.34} $$

with $V_i$ being the energy of atom i. Some authors in the scientific literature have preferred to distinguish between external and internal forces by introducing external forces $f_{i,\text{ext}}$ on all atoms i = 1, . . . , N so that

$$ f_i(q) = f_{i,\text{ext}}(q_i) - \frac{\partial V}{\partial q_i}(q). \tag{34.35} $$

Ideally (and in any reasonable physical system), such external forces are conservative and derive from an external potential Vext(q) and we may combine internal and external forces into a single potential, which is tacitly assumed in the following. Examples of conservative external forces include gravitation, long-range Coulombic interactions, or multi-body interactions such as during contact. For computational convenience, the latter is oftentimes realized by introducing artificial potentials (e.g., analogous to the spherical indenter potential we introduced in 214a).



35 Interatomic Potentials

As we have seen above, the constitutive description of atomistic ensembles is provided by interatomic potentials. An interatomic potential describes the effective binding energy of an atom embedded into an ensemble of atoms whose nuclei are located at $q = \{q_1, \ldots, q_N\}$:

V = V (q). (35.1)

In the following, we only consider potentials that depend on atomic (nuclei) positions (which can be generalized to potentials that also depend on atomic momenta; this plays a role, e.g., in finite-temperature QC). The potential accounts for both repulsive and attractive forces.

In principle, every potential (written for an ensemble of N atoms) can be expanded as

$$ V(q) = \sum_{i=1}^{N}V_1(q_i) + \frac{1}{2}\sum_{i\neq j}V_2(q_i,q_j) + \frac{1}{3!}\sum_{i\neq j,\,i\neq k}V_3(q_i,q_j,q_k) + \ldots, \tag{35.2} $$

where one usually avoids summation over identical indices (i.e., when writing i ≠ j we imply summation over i = 1, . . . , N and j = 1, . . . , N with the exclusion of the case where i = j).

The term involving V1 can usually be neglected, unless atoms are in an external field (e.g., ions in an electric field). Otherwise, the potential should be invariant to rigid-body motion so V1 ≡ 0. Retaining only V2 and neglecting higher-order dependencies results in pair potentials (discussed next), retaining V2 and V3 results in three-body potentials (discussed in the following).

35.1 Pair Potentials

The specific class of two-body potentials or pair potentials assumes the form

$$ V(q) = \sum_{i\neq j}f(q_i,q_j) = \sum_{i\neq j}f(r_{ij}), \tag{35.3} $$

where we can conclude from invariance under rigid body motion that the potential only depends on relative distances between pairs of atoms (lowest-order correlation).

Example I: Coulombic crystals

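As a simple example, consider ions with charges $q = \{q_1, \ldots, q_N\}$ located at positions $\mathbf{q}_1, \ldots, \mathbf{q}_N$ and interacting through Coulombic interaction:

$$ V(\mathbf{q}) = \frac{1}{4\pi\varepsilon_0}\sum_{i\neq j}\frac{q_i q_j}{r_{ij}}, \quad\text{with}\quad r_{ij} = \|\mathbf{q}_i - \mathbf{q}_j\|. \tag{35.4} $$

The force between two isolated atoms (N = 2) is thus obtained as

$$ \mathbf{f}_1 = -\mathbf{f}_2 = -\frac{\partial V}{\partial\mathbf{q}_1} = -\frac{\partial}{\partial\mathbf{q}_1}\,\frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r} = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r^2}\,\hat{\mathbf{r}} \quad\text{with}\quad \hat{\mathbf{r}} = \frac{\mathbf{q}_1 - \mathbf{q}_2}{\|\mathbf{q}_1 - \mathbf{q}_2\|}. \tag{35.5} $$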



Consequently, equally-signed charges with sign(q1) = sign(q2) repel one another and they produce a positive binding energy V > 0. In contrast, charges of opposite signs, i.e., sign(q1) = − sign(q2), attract each other, resulting in a negative binding energy V < 0. The force on an atom i in an ensemble of atoms is obtained by superposition:

$$ \mathbf{f}_i = -\frac{\partial V}{\partial\mathbf{q}_i} = \frac{1}{4\pi\varepsilon_0}\sum_{j\neq i}\frac{q_i q_j}{r_{ij}^2}\,\hat{\mathbf{r}}_{ij}. \tag{35.6} $$

Coulombic interactions are long-range because the 1/r-dependence decays slowly with distance. Consider an infinite 1D chain of ions of opposite charges q1 = −q2 = q. This is the prototype of an ionic crystal. Assume that the chain has equal interatomic spacings a. The potential energy of every atom in the chain then becomes

$$ V = \frac{2}{4\pi\varepsilon_0}\sum_{i=1}^{\infty}\left[\frac{q^2}{2ia} - \frac{q^2}{(2i-1)a}\right] = -\frac{2q^2}{4\pi\varepsilon_0 a}\left(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots\right), \tag{35.7} $$

where we exploited symmetry (the factor of 2 results from symmetry of neighbors). Using the Taylor expansion

$$ \ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots \tag{35.8} $$

evaluated at x = 1 allows us to write the potential as

$$ V = -\frac{q^2}{2\pi\varepsilon_0 a}\ln 2. \tag{35.9} $$

We notice that this ionic crystal does not have an equilibrium spacing a0 since the minimum V is attained for a → 0 (which could have been expected since the 1/r-scaling attracts nearest neighbors more strongly than next-to-nearest neighbors repel one another).
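The alternating series in (35.7) converges only slowly, which reflects the long-range nature of the Coulombic interaction. A small sketch (the function name `chain_energy_factor` is hypothetical) compares partial sums with ln 2:

```python
import math


def chain_energy_factor(num_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... as it appears in (35.7)."""
    return sum((-1) ** (i + 1) / i for i in range(1, num_terms + 1))


# Error of the partial sums relative to the limit ln 2 used in (35.9)
errors = {n: abs(chain_energy_factor(n) - math.log(2.0)) for n in (10, 1000, 200000)}
```

The error decays roughly like 1/(2n), so many neighbor shells are needed for an accurate energy. The energy per atom then follows from multiplying the converged sum by −2q²/(4πε0 a).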

Example II: Lennard-Jones potential

Next, consider one of the simplest two-body interatomic potentials with an equilibrium distance, the Lennard-Jones potential (named after Sir John Lennard-Jones) which has the general form

$$ V(r) = -\frac{A}{r^n} + \frac{B}{r^m} \tag{35.10} $$

with integer exponents n, m > 0 and real-valued strength coefficients A, B > 0. A represents the strength of the attractive contribution, whereas B dominates the repulsive contribution. For an equilibrium to exist, atoms should attract over large distances but repel over short distances; this implies that we must have m > n.

Note that two atoms at a distance r now have the binding energy (35.10), which has an extremum if

$$ 0 = \frac{\partial V}{\partial r} = n\frac{A}{r^{n+1}} - m\frac{B}{r^{m+1}} = \frac{1}{r^{n+1}}\left(nA - \frac{mB}{r^{m-n}}\right) \quad\Leftrightarrow\quad r_0 = \left(\frac{mB}{nA}\right)^{1/(m-n)}. \tag{35.11} $$




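The second derivative at this extremum is

$$ \begin{aligned} \left.\frac{\partial^2 V}{\partial r^2}\right|_{r=r_0} &= -n(n+1)\frac{A}{r_0^{n+2}} + m(m+1)\frac{B}{r_0^{m+2}} = \frac{1}{r_0^{n+2}}\left[m(m+1)\frac{B}{r_0^{m-n}} - n(n+1)A\right] \\ &= \frac{1}{r_0^{n+2}}\left[m(m+1)B\,\frac{nA}{mB} - n(n+1)A\right] = \frac{nA}{r_0^{n+2}}\left[(m+1) - (n+1)\right] = \frac{n(m-n)A}{r_0^{n+2}}. \end{aligned} \tag{35.12} $$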

Consequently, the potential has a minimum at r = r0 if m > n (and A, B > 0), as expected. V(r0) is the maximum binding energy.
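The closed-form results (35.11) and (35.12) are easy to cross-check numerically. The following sketch uses the 6-12 exponents with A = B = 1 as arbitrary illustrative values; the function names are hypothetical.

```python
def lj(r, A=1.0, B=1.0, n=6, m=12):
    """Lennard-Jones potential V(r) = -A / r^n + B / r^m, cf. (35.10)."""
    return -A / r**n + B / r**m


def lj_r0(A=1.0, B=1.0, n=6, m=12):
    """Equilibrium distance of two isolated atoms, cf. (35.11)."""
    return (m * B / (n * A)) ** (1.0 / (m - n))


def lj_curvature_at_r0(A=1.0, B=1.0, n=6, m=12):
    """Second derivative of V at r0, cf. (35.12)."""
    return n * (m - n) * A / lj_r0(A, B, n, m) ** (n + 2)


r0 = lj_r0()
h = 1e-4
# Central finite-difference estimate of V''(r0) for comparison with (35.12)
curvature_fd = (lj(r0 + h) - 2.0 * lj(r0) + lj(r0 - h)) / h**2
```

For A = B = 1 this gives r0 = 2^(1/6) and V(r0) = −1/4, and the finite-difference curvature agrees with the closed form to within the discretization error.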

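Next, let us compute the equilibrium spacing a0 of an infinite chain of identical atoms. To this end, consider the energy of a single atom, which is:

$$ V = 2\sum_{i=1}^{\infty}\left(-\frac{A}{(ia)^n} + \frac{B}{(ia)^m}\right). \tag{35.13} $$

Recall the definition of Riemann's zeta-function,

$$ \zeta(n) = \sum_{i=1}^{\infty}\frac{1}{i^n}, \tag{35.14} $$

so that

$$ V = 2\left(-\frac{A\,\zeta(n)}{a^n} + \frac{B\,\zeta(m)}{a^m}\right). \tag{35.15} $$

The minimizer is obtained from

$$ 0 = \frac{\partial V}{\partial a} = 2\left(\frac{nA\,\zeta(n)}{a^{n+1}} - \frac{mB\,\zeta(m)}{a^{m+1}}\right) = \frac{2}{a^{n+1}}\left(nA\,\zeta(n) - \frac{mB\,\zeta(m)}{a^{m-n}}\right), \tag{35.16} $$

so that

$$ a_0 = \left(\frac{mB\,\zeta(m)}{nA\,\zeta(n)}\right)^{1/(m-n)}. \tag{35.17} $$

Consider, e.g., the most common form (6-12 potential) with n = 6 and m = 12. Then

$$ a_0 = \left(\frac{1382\,\pi^6}{675675}\,\frac{B}{A}\right)^{1/6} \approx \left(1.96639\,\frac{B}{A}\right)^{1/6}. \tag{35.18} $$

Note that for the same choice of m = 12 and n = 6 we have

$$ r_0 = \left(\frac{2B}{A}\right)^{1/6}. \tag{35.19} $$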

That is, the equilibrium distance r0 between two isolated atoms does not coincide with the equilibrium distance a0 between atoms in an infinite chain (and yet another distance would be obtained when considering only a finite number of neighboring atoms).
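The factor 1.96639 in (35.18) can be reproduced by truncating the zeta-function sums. This is a minimal sketch with A = B = 1; the truncation at 10000 terms and the names `zeta` and `chain_spacing` are choices of this illustration.

```python
def zeta(s, num_terms=10000):
    """Truncated Riemann zeta function, cf. (35.14); adequate for s >= 6."""
    return sum(1.0 / i**s for i in range(1, num_terms + 1))


def chain_spacing(A=1.0, B=1.0, n=6, m=12):
    """Equilibrium spacing a0 of the infinite chain, cf. (35.17)."""
    return (m * B * zeta(m) / (n * A * zeta(n))) ** (1.0 / (m - n))


a0 = chain_spacing()            # infinite chain
r0 = 2.0 ** (1.0 / 6.0)         # two isolated atoms, cf. (35.19)
```

The result satisfies a0 < r0: the attraction of more distant neighbors compresses the chain slightly compared to an isolated pair.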

Finally, note that choices such as m = 12 and n = 6 result in quickly decaying interaction forces, which is why such potentials are called short-range.



If the equilibrium spacing a0 and the maximum binding energy V(a0) are known (which can both be determined experimentally), then the potential parameters can be identified by fitting.

Binding energies are typically on the order of electron-volts:

$$ 1\,\text{eV} = |q_e| \cdot 1\,\text{V} = 1.60217662 \cdot 10^{-19}\,\text{J}, \tag{35.20} $$

where $|q_e| = 1.60217662 \cdot 10^{-19}\,\text{C}$ is the magnitude of the charge of a single electron. Therefore, 1 eV corresponds to the energy gained by a single electron that has passed through a potential difference of 1 V.

Equilibrium spacings are on the order of a few Ångström (named after Swedish physicist Anders Jonas Ångström who studied the wavelengths of electromagnetic radiation, where this characteristic length comes up):

$$ 1\,\text{Å} = 10^{-10}\,\text{m}. \tag{35.21} $$

Example III: Morse potential

A further example of a two-body potential of practical relevance is the Morse potential (named after American physicist Philip McCord Morse):

$$ V(r) = D\left[1 - \exp\left(-a(r - r_0)\right)\right]^2 \tag{35.22} $$

with equilibrium distance r0 (between two single atoms) and well depth

$$ D = V_\infty - V(r_0), \tag{35.23} $$

i.e., the energetic difference between the equilibrium and the energy at r → ∞. The limit

$$ V_\infty = \lim_{r\to\infty}V(r) \tag{35.24} $$

is known as the dissociation energy.

35.2 (An)Harmonicity and the Quasiharmonic Approximation

All of the above potentials have one thing in common: they are anharmonic, i.e., the potential is not symmetric around the equilibrium point. One can always construct a harmonic potential by Taylor expansion:

$$ V(r) = V(r_0) + \left.\frac{\partial V}{\partial r}\right|_{r_0}(r - r_0) + \frac{1}{2}\left.\frac{\partial^2 V}{\partial r^2}\right|_{r_0}(r - r_0)^2 + \text{h.o.t.} \tag{35.25} $$

Since the linear term vanishes by definition of equilibrium, we obtain

$$ V(r) = V(r_0) + \frac{C}{2}(r - r_0)^2 + \text{h.o.t.} \tag{35.26} $$

This expansion is known as the quasiharmonic approximation with force constant

$$ C = \left.\frac{\partial^2 V}{\partial r^2}\right|_{r_0} > 0. \tag{35.27} $$
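As an illustration of (35.26) and (35.27), the sketch below estimates the force constant of the Morse potential (35.22) by central finite differences and builds the corresponding harmonic surrogate. The parameter values D = 1, a = 2, r0 = 1 are arbitrary, and the helper names are hypothetical; for the Morse potential, differentiating (35.22) twice gives C = 2Da², which serves as a reference.

```python
import math


def morse(r, D=1.0, a=2.0, r0=1.0):
    """Morse potential, cf. (35.22)."""
    return D * (1.0 - math.exp(-a * (r - r0))) ** 2


def force_constant(V, r0, h=1e-4):
    """Central finite-difference estimate of C = V''(r0), cf. (35.27)."""
    return (V(r0 + h) - 2.0 * V(r0) + V(r0 - h)) / h**2


def harmonic(r, V, r0):
    """Harmonic surrogate V(r0) + C/2 (r - r0)^2, cf. (35.26)."""
    return V(r0) + 0.5 * force_constant(V, r0) * (r - r0) ** 2
```

Comparing `morse(0.9)` with `morse(1.1)` shows the asymmetry of the true potential (compression costs more energy than extension), whereas `harmonic` returns identical values at both distances.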



This approximation is of particular importance when studying atomic vibrations, which are of small amplitude so that the quasiharmonic approximation can be used to compute the lowest-order vibrational frequencies (phonon modes). Note that this assumes that each atom is vibrating individually. Therefore, the quasiharmonic approximation generally yields accurate predictions at low absolute temperature, whereas it leads to significant errors at elevated temperatures where vibrational amplitudes become so large that the assumption no longer holds.

35.3 Multi-Body Potentials

Pair potentials are simple and efficient but not suitable to describe most material behavior including, e.g., the behavior of metals. Here, multi-body potentials gain importance.

Example I: Stillinger-Weber potential

The Stillinger-Weber potential for silicon is an example of a three-body potential. In addition to pair interactions, the third-order potential term V3(qi, qj, qk) is conveniently expressed in terms of the two distances and the bond angle between triples of atoms. The Stillinger-Weber potential thus has the form

$$ V(q) = \sum_{i,j\in L}V_2(r_{ij}) + \sum_{i,j,k\in L}V_3(r_{ij}, r_{ik}, \theta_{ijk}), \tag{35.28} $$

where θijk denotes the angle between rij and rik.

Example II: Embedded Atom Method

The most common example for metals is the so-called Embedded Atom Method whose potential for an atom i surrounded by a set L of atoms is defined by

$$ V_i(q) = \frac{1}{2}\sum_{j\in L}\Phi(r_{ij}) + U(\rho_i), \quad\text{with}\quad r_{ij} = \|q_i - q_j\|. \tag{35.29} $$

Φ(r) is a pair potential representing nucleus-nucleus interactions. U(ρ) is called the embedding function, which denotes the excess energy due to embedding atom i within the electron density ρi present at the location of atom i due to all other atoms. For simplicity, one commonly assumes that ρi depends on the distances rij to all other atoms j:

$$ \rho_i = \sum_{j\in L}f(r_{ij}). \tag{35.30} $$

In order to compute the force acting on an atom k, we must consider the total potential energy

$$ V(q) = \sum_{i\in L}V_i(q). \tag{35.31} $$

The force acting on an atom k is now derived as

$$ f_k = -\frac{\partial V}{\partial q_k} = -\left[\frac{1}{2}\sum_{i\in L}\sum_{j\in L}\Phi'(r_{ij}) + \sum_{i\in L}U'(\rho_i)\sum_{j\in L}f'(r_{ij})\right]\frac{\partial r_{ij}}{\partial q_k}. \tag{35.32} $$



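For atomic ensembles, the partial derivative can be computed exactly by using $r_{ij} = \|\mathbf{q}_i - \mathbf{q}_j\|$ and $\mathbf{r}_{ij} = \mathbf{q}_i - \mathbf{q}_j$, so that

$$ \sum_{i,j}(\cdot)_{ij}\frac{\partial r_{ij}}{\partial\mathbf{q}_k} = \sum_{i,j}(\cdot)_{ij}\frac{\mathbf{r}_{ij}}{r_{ij}}(\delta_{ik} - \delta_{jk}) = \sum_{i,j}(\cdot)_{ij}\,\hat{\mathbf{r}}_{ij}\,(\delta_{ik} - \delta_{jk}). \tag{35.33} $$

Note that $r_{ij} = r_{ji}$ and $\mathbf{r}_{ij} = -\mathbf{r}_{ji}$, so that the forces simplify to

$$ \begin{aligned} \mathbf{f}_k &= -\left[\frac{1}{2}\sum_{i\in L}\sum_{j\in L}\Phi'(r_{ij}) + \sum_{i\in L}U'(\rho_i)\sum_{j\in L}f'(r_{ij})\right]\frac{\mathbf{r}_{ij}}{r_{ij}}(\delta_{ik} - \delta_{jk}) \\ &= -\left[\frac{1}{2}\sum_{j\in L}\Phi'(r_{jk}) + U'(\rho_k)\sum_{j\in L}f'(r_{jk})\right]\frac{\mathbf{r}_{kj}}{r_{jk}} + \left[\frac{1}{2}\sum_{i\in L}\Phi'(r_{ik}) + \sum_{i\in L}U'(\rho_i)f'(r_{ik})\right]\frac{\mathbf{r}_{ik}}{r_{ik}} \\ &= \sum_{j\in L}\left[\Phi'(r_{jk}) + \left(U'(\rho_j) + U'(\rho_k)\right)f'(r_{jk})\right]\hat{\mathbf{r}}_{jk}, \end{aligned} \tag{35.34} $$

where $\hat{\mathbf{r}}_{jk} = \mathbf{r}_{jk}/r_{jk}$ is the unit vector between atoms j and k, and the amplitude of the force between atoms j and k is

$$ f_{jk} = \Phi'(r_{jk}) + \left(U'(\rho_j) + U'(\rho_k)\right)f'(r_{jk}) \tag{35.35} $$

so that

$$ \mathbf{f}_k = \sum_{j\in L}f_{jk}\,\hat{\mathbf{r}}_{jk}. \tag{35.36} $$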
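The force expression (35.35) and (35.36) can be verified against a finite-difference derivative of the total energy (35.31). The sketch below does this for a handful of atoms; the model functions `phi`, `dens` and `embed` are hypothetical smooth functions chosen purely for illustration and do NOT represent a fitted EAM potential.

```python
import math


def phi(r):            # hypothetical pair term Phi(r)
    return math.exp(-2.0 * (r - 1.0))

def dphi(r):           # Phi'(r)
    return -2.0 * math.exp(-2.0 * (r - 1.0))

def dens(r):           # hypothetical density contribution f(r)
    return math.exp(-r)

def ddens(r):          # f'(r)
    return -math.exp(-r)

def embed(rho):        # hypothetical embedding function U(rho)
    return -math.sqrt(rho)

def dembed(rho):       # U'(rho)
    return -0.5 / math.sqrt(rho)


def distance(qi, qj):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(qi, qj)))


def densities(q):
    """Electron density at each atom, cf. (35.30)."""
    n = len(q)
    return [sum(dens(distance(q[i], q[j])) for j in range(n) if j != i)
            for i in range(n)]


def total_energy(q):
    """Total energy as the sum of the atomic energies (35.29), cf. (35.31)."""
    n = len(q)
    pair = 0.5 * sum(phi(distance(q[i], q[j]))
                     for i in range(n) for j in range(n) if j != i)
    return pair + sum(embed(rho) for rho in densities(q))


def forces(q):
    """Forces f_k = sum_j f_jk * unit vector from atom k to atom j, cf. (35.35)-(35.36)."""
    n, dim = len(q), len(q[0])
    rho = densities(q)
    f = [[0.0] * dim for _ in range(n)]
    for k in range(n):
        for j in range(n):
            if j == k:
                continue
            r = distance(q[j], q[k])
            f_jk = dphi(r) + (dembed(rho[j]) + dembed(rho[k])) * ddens(r)
            for c in range(dim):
                f[k][c] += f_jk * (q[j][c] - q[k][c]) / r
    return f
```

Because f_jk is symmetric in j and k while the unit vectors are opposite, the forces returned by `forces` sum to zero over all atoms, as expected for an isolated ensemble.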

For most potentials, pair interactions and electron densities are short-range, so that summations over L can usually be restricted to summations over a small number of neighboring atoms. To this end, one introduces a cut-off radius rcutoff which defines the cut-off neighborhood around an atom i as

$$ C_i = \{\,j \in L : \|q_j - q_i\| \le r_{\text{cutoff}}\,\}. \tag{35.37} $$
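A brute-force construction of the cut-off neighborhoods (35.37) could look as follows. This is a sketch with O(N²) cost; the function name is hypothetical, and, as an assumption of this illustration, atom i itself is excluded from its own neighborhood. Production codes typically use cell or Verlet lists instead.

```python
import math


def cutoff_neighborhoods(q, r_cutoff):
    """Return, for each atom i, the list of neighbors j != i within r_cutoff."""
    neighbors = [[] for _ in q]
    for i in range(len(q)):
        for j in range(i + 1, len(q)):
            # each pair is visited once and stored for both atoms
            if math.dist(q[i], q[j]) <= r_cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)
    return neighbors


# 1D example: three equally spaced atoms plus one atom outside the cut-off
chain = [(0.0,), (1.0,), (2.0,), (3.5,)]
lists = cutoff_neighborhoods(chain, r_cutoff=1.2)
```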

Further, note that many crystals show centrosymmetry. Let us define the centrosymmetry parameter

$$ CS = \sum_i \|q_{i,1} - q_{i,2}\|^2, \tag{35.38} $$

where (qi,1, qi,2) are the positions of the ith pair of opposite neighbors of an atom. Note that, for centrosymmetric lattices (i.e., CS = 0), the net force on every atom in an infinite, uniform, equally-spaced crystal always vanishes by symmetry, irrespective of the actual lattice spacing. This is not the case for atoms near boundaries or defects.

Example III: Force Fields

Especially for organic materials such as, e.g., polymers, force fields have gained popularity. In a nutshell, a molecule is approximated by a collection of charged atoms linked by elastic connections.



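Consequently, the potential decomposes into contributions from bonds, angles, dihedral angles, and long-range non-bonded interactions with the general form:

$$ \begin{aligned} V(q) = {} & \sum_{i,j\in\text{bonds}}k^{\text{bond}}_{ij}\,r_{ij}^2 + \sum_{i\in\text{angles}}k^{\text{angle}}_i\,\theta_i^2 + \sum_{i,j\in\text{bonds}}k^{\text{dihedral}}_i\,f(\varphi_i,\theta_i) \\ & + \sum_{i,j\in L}4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \sum_{i,j\in L}\frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}}. \end{aligned} \tag{35.39} $$

The Lennard-Jones term represents van der Waals interactions, while the Coulombic term stands for ionic interactions.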



36 Cauchy-Born Approximation and Effective Properties

The Cauchy-Born rule(s) discussed before for structural lattices at the continuum scale can also be applied to atomic ensembles, which is a popular technique to extract, e.g., the zero-temperature elastic moduli of crystalline solids. It also forms the basis for the quasicontinuum method discussed later. Let us quickly review the underlying theory.

Consider an atomic ensemble of N atoms at positions $Q = \{Q_1, \ldots, Q_N\}$. For crystals, these positions are usually given by a Bravais lattice with basis $B = \{A_1, \ldots, A_d\}$ and $c_j \in \mathbb{Z}$:

$$ Q_i = \sum_{j=1}^{d}c_j A_j. \tag{36.1} $$

When an average deformation gradient F is applied to a simple Bravais lattice of equal atoms (such as in fcc or bcc metals), then it is reasonable to assume under small deformation that the response is an affine deformation of the crystal lattice. Thus, we may apply the local Cauchy-Born rule with average deformation gradient F to find the deformed atomic positions as

qa = FQa, a = 1, . . . ,N. (36.2)

This allows us to define an effective energy density

$$ W(F) = \frac{1}{|\Omega|}V(q_1, \ldots, q_N), \tag{36.3} $$

where V(q1, . . . , qN) includes the energy of atoms in an atomic unit cell, and ∣Ω∣ denotes its volume. In the simplest case of simple Bravais lattices, each atom experiences the exact same neighborhood changes (all atoms are equal). Then, V(q1, . . . , qN) can be taken as the energy of a single atom only (applying (36.2) to all of its neighbors within the cut-off radius).

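The first Piola-Kirchhoff stress tensor components now follow as (writing $q^a_i$ for the ith component of $q^a$)

$$ \begin{aligned} P_{iJ}(F) = \frac{\partial W}{\partial F_{iJ}}(F) &= \frac{1}{|\Omega|}\sum_{a=1}^{N}\frac{\partial V}{\partial q^a_k}(q)\,\frac{\partial q^a_k}{\partial F_{iJ}} \\ &= -\frac{1}{|\Omega|}\sum_{a=1}^{N}f^a_k(q)\,\frac{\partial}{\partial F_{iJ}}\left(F_{kL}Q^a_L\right) = -\frac{1}{|\Omega|}\sum_{a=1}^{N}f^a_i(q)\,Q^a_J \end{aligned} \tag{36.4} $$

or

$$ P(F) = -\frac{1}{|\Omega|}\sum_{a=1}^{N}f^a(q)\otimes Q^a, \tag{36.5} $$

where

$$ f^a(q) = -\frac{\partial V}{\partial q^a}. \tag{36.6} $$

Analogously, we can derive the incremental stiffness tensor components as

$$ C_{iJkL} = \frac{\partial P_{iJ}}{\partial F_{kL}} = -\frac{1}{|\Omega|}\frac{\partial}{\partial F_{kL}}\sum_{a=1}^{N}f^a_i(q)\,Q^a_J = -\frac{1}{|\Omega|}\sum_{a=1}^{N}\sum_{b=1}^{N}\frac{\partial f^a_i}{\partial q^b_k}(q)\,Q^a_J\,Q^b_L. \tag{36.7} $$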



The usual mapping relations apply to transform tensors between current and reference configurations.

For simple crystals, this gives a convenient recipe to compute the zero-temperature elastic constants from molecular statics.
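To make the recipe concrete, consider the simplest conceivable setting: a 1D chain with one atom per unit cell and nearest-neighbor interactions only, using a 6-12 Lennard-Jones pair potential with A = B = 1. All of this is an assumed toy example, not taken from the notes. In 1D the unit cell "volume" ∣Ω∣ is the reference spacing a0, and (36.3) reduces to W(F) = Φ(F a0)/a0. Stress and stiffness are obtained here by finite differences rather than from (36.4) and (36.7).

```python
def lj(r, A=1.0, B=1.0):
    """6-12 Lennard-Jones pair potential (illustrative parameters)."""
    return -A / r**6 + B / r**12


A0 = 2.0 ** (1.0 / 6.0)  # nearest-neighbor equilibrium spacing, cf. (35.19)


def energy_density(F, a0=A0):
    """W(F) = V_atom / |Omega| for a 1D nearest-neighbor chain, cf. (36.3).

    Each atom has two neighbors at distance F * a0, and each bond is shared
    by two atoms, hence the factor 1/2.
    """
    return 0.5 * (lj(F * a0) + lj(F * a0)) / a0


def stress(F, h=1e-5):
    """P = dW/dF by central finite differences."""
    return (energy_density(F + h) - energy_density(F - h)) / (2.0 * h)


def stiffness(F, h=1e-4):
    """C = d^2W/dF^2 by central finite differences."""
    return (energy_density(F + h) - 2.0 * energy_density(F) + energy_density(F - h)) / h**2
```

At F = 1 the stress vanishes because a0 was chosen as the nearest-neighbor equilibrium spacing, and the stiffness equals a0 Φ''(a0), which is the 1D analogue of a zero-temperature elastic modulus.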

Example: Embedded Atom Method

For example, for the embedded atom method we have the atomic potential of a single atom,

$$ V(q) = \frac{1}{2}\sum_{\alpha\in L}\Phi(r^\alpha) + U(\rho) \quad\text{with}\quad \rho = \sum_{\alpha\in L}f(r^\alpha), \tag{36.8} $$

where we sum over all neighboring atoms α ∈ L of an atom at q0 = FQ0, and we defined

rα = qα − q0 = FQα −FQ0 (36.9)

and rα = ∣rα∣. ρ is the effective electron density of the atom.

Without showing the full details (feel free to verify yourself), the first Piola-Kirchhoff stress tensor is obtained as

$$ P_{iJ} = \frac{1}{|\Omega|}\sum_{\alpha\in L}\left(\left[U'(\rho)f'(r^\alpha) + \frac{1}{2}\Phi'(r^\alpha)\right]\frac{r^\alpha_i r^\alpha_j}{r^\alpha}\right)F^{-1}_{Jj}. \tag{36.10} $$

The incremental modulus tensor in the current configuration follows as

$$ \begin{aligned} c_{ijkl} = \frac{1}{|\Omega|}\Bigg\{ & U''(\rho)\left[\sum_{\alpha\in L}f'(r^\alpha)\frac{r^\alpha_i r^\alpha_j}{r^\alpha}\right]\left[\sum_{\beta\in L}f'(r^\beta)\frac{r^\beta_k r^\beta_l}{r^\beta}\right] \\ & + \sum_{\alpha\in L}\left[\left(U'(\rho)f''(r^\alpha) + \frac{1}{2}\Phi''(r^\alpha)\right) - \frac{1}{r^\alpha}\left(U'(\rho)f'(r^\alpha) + \frac{1}{2}\Phi'(r^\alpha)\right)\right]\frac{r^\alpha_i r^\alpha_j r^\alpha_k r^\alpha_l}{(r^\alpha)^2}\Bigg\}. \end{aligned} \tag{36.11} $$

Thus, for given atomic reference positions Q in the crystal lattice and a given potential with known functions Φ, U and f, the above can be used to compute the (zero-temperature) elastic moduli of a single crystal. Note that for fcc and bcc lattices, the resulting moduli will not be isotropic but exhibit cubic symmetry (the usual isotropic elastic moduli will have to be obtained by averaging over many grains in a polycrystal).



37 Multiscale Modeling Techniques

Atomistics is a prime example of why multiscale techniques are required. Molecular dynamics and statics are convenient methods but computer architectures limit the accessible length and time scales. Length scales are limited by the total number of atoms that can be simulated. Here are recent benchmark simulations that indicate the trends in feasible numbers of particles over the years, limited by the size of computer memory (i.e., the need to store atomic positions and momenta):

2003: 10^8 atoms (LANL Q) (Kadau et al., 2004)

2007: 10^10 atoms (BlueGene/L) (Kadau et al., 2007)

2012: 10^12 atoms (BlueGene/Q) (Habib et al., 2012)

2015: 10^13 particles (Titan) (Koumoutsakos, 2015)

2018-20: 10^15 atoms (prediction of exascale computing initiatives)

Even though the number of particles that can be simulated continues to increase due to increasingly large memory on supercomputers, the same cannot be said about time scales because communication times and CPU speeds will not advance at the same pace as memory. Therefore, even though computer architectures promise increasing sample sizes, time scales will remain a bottleneck.
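A back-of-the-envelope estimate illustrates the memory argument. The sketch assumes that only positions and momenta are stored, in three dimensions and in double precision (8 bytes per number); neighbor lists, forces, and communication buffers are ignored, so actual requirements are higher. The function name is hypothetical.

```python
def state_memory_bytes(num_atoms, dim=3, bytes_per_float=8):
    """Memory needed to store positions and momenta (2 * dim numbers per atom)."""
    return num_atoms * 2 * dim * bytes_per_float


# Memory in terabytes (10^12 bytes) for the ensemble sizes listed above
for exponent in (8, 10, 12, 13, 15):
    terabytes = state_memory_bytes(10**exponent) / 1e12
    print(f"10^{exponent} atoms: {terabytes:g} TB")
```

Under these assumptions, 10^12 atoms already require 48 TB for the state alone.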

This motivates techniques to bridge across scales, which is usually achieved in one of two ways:

hierarchical scale-bridging, also known as vertical scale-bridging, assumes a separationof scales so that a hierarchy of scales can be established. Information from the lower scales ispassed on to the larger scales by the definition of effective measures. Computational homog-enization and FE2, discussed before, are prime examples: RVE-information is passed on tothe macroscale, based on the spatial separation between micro- and macroscales. Difficultiesarise when a clear spatial separation of scales is not warranted (e.g., when macroscale featuresare on the same scale as microscale specifics) or when localization occurs on the microscale(e.g., when a crack develops in the RVE which develops into a crack on the macroscale).

concurrent scale-bridging, also known as horizontal scale-bridging, does not assumea separation of scales but patches together multiple simulation techniques concurrently. Forexample, using atomistics in one half of the simulation domain and continuum mechanics andthe other half. Challenges here arise from identifying compatible constitutive descriptions onboth sides of the interface separating the two domains, and from establishing an interfacethat is compatible with both descriptions and allows information to flow from one domainto another. Examples are the Coupled Atomistics-Discrete Dislocation (CADD) method, theQuasicontinuum (QC) method, Macroscopic-Atomistic-Ab Initio-Dynamics (MAAD), and theBridging Domain Method (BDM).

The idea behind concurrent techniques is to limit lower-scale resolution to where it is indeed needed, while using an efficient large-scale description in the remaining simulation domain. For example, when running atomistic simulations at low temperature, interesting features occur in the vicinity of defects, interfaces and surfaces. This motivates the use of atomistic techniques near those features and of a more efficient description in those regions that deform more homogeneously, far away from defects, interfaces and surfaces. Ideally, such domain decomposition should be performed adaptively, so that no a-priori knowledge is required about where the expensive lower-scale resolution will be needed during a simulation.

Examples of both hierarchical and concurrent scale-bridging techniques are shown in Fig. 7.

Figure 7: Examples of hierarchical (left) and concurrent (right) scale-bridging techniques. Hierarchical techniques pass information between scales by assuming a separation of scales. Concurrent techniques such as the MAAD scheme shown here (adapted from Abraham et al. (1998)) couple different techniques in the same simulation domain. Here, e.g., highly accurate tight-binding (TB) is used at a crack tip, with molecular dynamics (MD) applied further away from the tip, and finite elements (FE) used in the far field.


38 Quasicontinuum Method

38.1 Local Quasicontinuum

The motivational challenge addressed by the quasicontinuum method (or QC method) is the quest to apply atomistic techniques at significantly larger length scales by restricting atomistic resolution to where it is indeed required. To this end, one uses the Cauchy-Born description in a continuum region, which is coupled to a molecular-statics domain. The continuum description of Section 36 is treated numerically by finite elements. Key pillars of the QC method are the selection of representative atoms, whose positions and momenta are interpolated to obtain the positions and momenta of all atoms, and the introduction of summation rules, which efficiently approximate the total energy of the coarse-grained atomistic ensemble (this can be achieved by the introduction of sampling atoms). Details are discussed in the multiscale modeling slides posted online.



39 Atomistics: Molecular Dynamics

39.1 Statistical Mechanics

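Recall that we describe the state of an atomistic system of N atoms by positions $\mathbf{q} = \{\mathbf{q}_1, \ldots, \mathbf{q}_N\}$ and momenta $\mathbf{p} = \{\mathbf{p}_1, \ldots, \mathbf{p}_N\}$ with $\mathbf{p}_i = m_i \dot{\mathbf{q}}_i$. The total Hamiltonian yields Hamilton's equations of motion:

$$
H(\mathbf{q},\mathbf{p}) = \sum_{i=1}^{N} \frac{|\mathbf{p}_i|^2}{2m_i} + V(\mathbf{q}), \qquad
\dot{\mathbf{q}}_i = \frac{\partial H}{\partial \mathbf{p}_i}, \qquad
\dot{\mathbf{p}}_i = -\frac{\partial H}{\partial \mathbf{q}_i}.
\tag{39.1}
$$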

Every state of the system is thus described by a point in the so-called phase space, which is often denoted by Γ. This space is 2N-dimensional if each vector q_i and p_i is counted as one coordinate (i.e., it has 6N scalar dimensions in 3D). The motion of the atoms can be reinterpreted as a trajectory (q(t), p(t)) through phase space with some initial condition (q(0), p(0)).

Example: harmonic oscillator

As a simple example, consider a harmonic oscillator (i.e., a mass m attached to a linear spring with stiffness k) whose equation of motion $m\ddot{q} + kq = 0$ has the solution

$$
q(t) = A \sin(\omega t) + B \cos(\omega t) \quad\text{with}\quad \omega = \sqrt{k/m}.
\tag{39.2}
$$

Also,

$$
p(t) = m\dot{q}(t) = m\omega \left[ A \cos(\omega t) - B \sin(\omega t) \right].
\tag{39.3}
$$

Applying initial conditions $q(0) = q_0$ and $p(0) = p_0$ yields $B = q_0$ and $A = p_0/(m\omega)$. Take, e.g., the simple case $p_0 = 0$ so that

$$
q(t) = q_0 \cos(\omega t), \qquad p(t) = -q_0 m \omega \sin(\omega t).
\tag{39.4}
$$

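Hence, the motion of the mass in phase space describes the ellipse

$$
\left(\frac{q}{q_0}\right)^2 + \left(\frac{p}{q_0 m \omega}\right)^2 = 1
\quad\Leftrightarrow\quad
q^2 + \left(\frac{p}{m\omega}\right)^2 = q_0^2,
\tag{39.5}
$$

with a unique ellipse for each initial condition $q_0 = q(0)$. Note that, using $m\omega^2 = k$,

$$
\begin{aligned}
H = \frac{p^2}{2m} + \frac{k}{2} q^2
&= \frac{1}{2m}\, q_0^2 m^2 \omega^2 \sin^2(\omega t) + \frac{k}{2}\, q_0^2 \cos^2(\omega t) \\
&= \frac{k}{2}\, q_0^2 \left[ \sin^2(\omega t) + \cos^2(\omega t) \right] = \frac{k}{2}\, q_0^2 .
\end{aligned}
\tag{39.6}
$$

That is, without external forcing the Hamiltonian is constant along the trajectory and, moreover, the area enclosed by the ellipse grows with increasing energy. Specifically, the ellipse's semi-axis along the q-direction is

$$
q_0 = \sqrt{2H/k}.
\tag{39.7}
$$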
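The statements (39.5) to (39.7) are easy to check numerically. The following is a minimal sketch, not part of the original notes; the parameter values `m`, `k` and `q0` are arbitrary assumptions chosen for illustration. It samples the exact solution (39.4), evaluates the Hamiltonian, and measures how far each sampled state lies from the ellipse (39.5).

```python
import math

def oscillator_state(t, q0, m, k):
    """Exact solution (39.4) for p0 = 0: returns (q, p) at time t."""
    omega = math.sqrt(k / m)
    q = q0 * math.cos(omega * t)
    p = -q0 * m * omega * math.sin(omega * t)
    return q, p

def hamiltonian(q, p, m, k):
    """Total energy H = p^2/(2m) + k q^2/2 of the oscillator."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

# Illustrative parameter values (assumptions, not taken from the notes).
m, k, q0 = 2.0, 8.0, 0.5
omega = math.sqrt(k / m)

energies = []
ellipse_residuals = []
for n in range(50):
    q, p = oscillator_state(0.1 * n, q0, m, k)
    energies.append(hamiltonian(q, p, m, k))
    # Left-hand side of (39.5) minus 1; vanishes on the trajectory.
    ellipse_residuals.append((q / q0) ** 2 + (p / (q0 * m * omega)) ** 2 - 1.0)

energy_spread = max(energies) - min(energies)
q0_from_energy = math.sqrt(2.0 * energies[0] / k)   # semi-axis from (39.7)
```

Up to round-off, `energy_spread` and all entries of `ellipse_residuals` should vanish, and `q0_from_energy` should recover the initial amplitude.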



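This can be generalized for systems in classical mechanics: motion is unique so that trajectories do not intersect and

$$
\frac{\mathrm{d}H}{\mathrm{d}t}
= \sum_{i=1}^{N} \left( \frac{\partial H}{\partial \mathbf{q}_i} \cdot \dot{\mathbf{q}}_i + \frac{\partial H}{\partial \mathbf{p}_i} \cdot \dot{\mathbf{p}}_i \right)
= \sum_{i=1}^{N} \left( -\dot{\mathbf{p}}_i \cdot \dot{\mathbf{q}}_i + \dot{\mathbf{q}}_i \cdot \dot{\mathbf{p}}_i \right) = 0,
\tag{39.8}
$$

i.e., energy is conserved and the Hamiltonian is constant along trajectories.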

Similar in spirit to homogenization, we would like to define macroscopic, effective observables like the state variables in continuum mechanics (e.g., energy, temperature, stress, etc.). Since the atomistic system is uniquely defined by (q, p), we may assume that each observable A can be expressed as a function A = A(q, p). Unlike in computational homogenization discussed before, the key challenge here is to bridge across time scales, which is why one may define the effective quantity $\bar{A}(t)$ at time t as the time average

$$
\bar{A}(t) = \frac{1}{\Delta t} \int_{t}^{t+\Delta t} A\big(\mathbf{q}(\tau),\mathbf{p}(\tau)\big)\, \mathrm{d}\tau,
\tag{39.9}
$$

where ∆t is large compared to the atomic fluctuations. Equilibrium in this context implies that $\bar{A}(t)$ = const. and hence independent of time t. For almost all initial conditions and without providing external power, the limit $\bar{A}$ obtained as ∆t → ∞ exists and represents the true macroscopic quantity:

$$
\bar{A} = \lim_{\Delta t \to \infty} \frac{1}{\Delta t} \int_{t}^{t+\Delta t} A\big(\mathbf{q}(\tau),\mathbf{p}(\tau)\big)\, \mathrm{d}\tau.
\tag{39.10}
$$

Numerically, such quantities can, more or less, easily be computed during MD simulations by taking

$$
\bar{A} \approx \frac{1}{n_\infty} \sum_{i=1}^{n_\infty} A\big(\mathbf{q}(t_i),\mathbf{p}(t_i)\big).
\tag{39.11}
$$

The above time averaging is numerically simple but computationally expensive and theoretically inconvenient. Therefore, one alternatively uses statistical approaches based on probability densities.

In statistical mechanics, one defines macrostates (described by the macroscopic observables) and the associated microstates (describing all realizations in agreement with a particular macrostate). For example, we may fix the total energy of a system, i.e.,

$$
\sum_{i=1}^{N} \frac{|\mathbf{p}_i|^2}{2m_i} + V(\mathbf{q}) = E = \text{const.}
\tag{39.12}
$$

Here, energy E is a macroscopic observable and there are many microstates (i.e., particular atomic positions and momenta) that satisfy the constraint (39.12). All microstates in agreement with a given macrostate are collectively referred to as an ensemble. Depending on the particular macro-constraints, one differentiates between the classical ensembles:

the microcanonical ensemble (NVE-ensemble) which describes an isolated system with constant energy (and constant particle number and volume),

the canonical ensemble (NVT-ensemble) which describes a system in contact with a heat bath, i.e., at constant temperature (and constant particle number and volume),

the grand canonical ensemble (µVT-ensemble) which is an extension of the canonical ensemble and describes a system in contact with both heat and particle baths (constant temperature, volume and chemical potential).

There are further ensembles of practical relevance such as the isothermal-isobaric ensemble (NpT-ensemble) which implies constant temperature and constant pressure.

If only a macrostate such as E is prescribed, there will be large numbers of potential microstates (e.g., for the harmonic oscillator the entire ellipse presents admissible microstates for a given E). If one takes an ensemble with a macro-constraint, the full system will follow trajectories that comply with the imposed macrostate and we can define other macroscopic quantities A for that system. The trajectory depends on the initial conditions, which means that in order to obtain the true average of A, one should ideally run very many, n to be specific, parallel simulations with the same E but starting with different initial conditions so as to sample over as many points in phase space as possible. Denoting the state sampled from the i-th simulation by $(\mathbf{q}^{(i)},\mathbf{p}^{(i)})$, this yields the approximate ensemble average

$$
\langle A \rangle \approx \frac{1}{n} \sum_{i=1}^{n} A\big(\mathbf{q}^{(i)},\mathbf{p}^{(i)}\big).
\tag{39.13}
$$

If we consider the limit n → ∞, we obtain the macroscopic observable as the ensemble average or phase average

$$
\langle A \rangle = \int_{\Gamma} A(\mathbf{q},\mathbf{p})\, \rho(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{q}
\tag{39.14}
$$

where ρ(q, p) is a distribution function or probability density which satisfies

$$
\int_{\Gamma} \rho(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{q} = 1.
\tag{39.15}
$$

Roughly speaking, the probability density ρ(q, p) defines the likelihood of finding a member of the ensemble in the configuration (q, p).

For the phase average to make physical sense, Ludwig Boltzmann introduced the so-called hypothesis of ergodicity: if one waits sufficiently long, any physical system will visit all possible points in phase space consistent with a given macrostate. This admits replacing the discrete sum over a countable number of measurements by the integral in phase space. Moreover, it also implies that the time average equals the phase average:

$$
\bar{A} = \langle A \rangle.
\tag{39.16}
$$
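The equality of time and ensemble averages can be illustrated with the harmonic oscillator from above. The sketch below is an illustration only and not part of the original notes. It compares the discrete time average (39.11) of the observable A = q² along a single trajectory with the sample average (39.13) over many microstates of equal energy. It assumes that such microstates can be drawn by choosing the phase angle on the energy ellipse uniformly at random; the values of `q0`, `omega`, the time window and the sample count are arbitrary.

```python
import math
import random

def time_average_q2(q0, omega, t_total, dt):
    """Discrete time average (39.11) of A = q^2 along q(t) = q0 cos(omega t)."""
    n = int(round(t_total / dt))
    return sum((q0 * math.cos(omega * i * dt)) ** 2 for i in range(n)) / n

def ensemble_average_q2(q0, n_samples, rng):
    """Sample average (39.13) of A = q^2 over random points on the energy ellipse.

    Assumption: microstates of equal energy are sampled by drawing the
    phase angle uniformly from [0, 2*pi).
    """
    total = 0.0
    for _ in range(n_samples):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        total += (q0 * math.cos(theta)) ** 2
    return total / n_samples

q0, omega = 1.0, 2.0
time_avg = time_average_q2(q0, omega, t_total=1000.0, dt=0.01)
ens_avg = ensemble_average_q2(q0, n_samples=100_000, rng=random.Random(0))
```

Both averages should approach q0²/2, the first as the time window grows and the second as the number of samples grows.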

39.2 Center of mass coordinates

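In the following, we will link thermal properties to atomic vibrations. Therefore, it is convenient to reformulate atomic positions and momenta as those relative to the center of mass and the mean momentum of all N atoms:

$$
\mathbf{Q} = \frac{\sum_{i=1}^{N} m_i \mathbf{q}_i}{\sum_{i=1}^{N} m_i}, \qquad
\mathbf{P} = \sum_{i=1}^{N} m_i \dot{\mathbf{q}}_i = M \dot{\mathbf{Q}},
\tag{39.17}
$$

with the total mass $M = \sum_{i=1}^{N} m_i$. Balance of linear momentum now becomes

$$
M \ddot{\mathbf{Q}} = \sum_{i=1}^{N} \mathbf{f}_i^{\text{ext}} = \mathbf{F}_{\text{ext}}.
\tag{39.18}
$$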



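Now, we can express all atomic positions and momenta with respect to the average values:

$$
\mathbf{q}_i = \mathbf{Q} + \delta\mathbf{q}_i, \qquad
\mathbf{p}_i = m_i \dot{\mathbf{Q}} + \delta\mathbf{p}_i
\quad\text{such that}\quad \delta\mathbf{p}_i = m_i\, \delta\dot{\mathbf{q}}_i.
\tag{39.19}
$$

The total Hamiltonian becomes

$$
\begin{aligned}
H &= \sum_{i=1}^{N} \frac{\| m_i \dot{\mathbf{Q}} + \delta\mathbf{p}_i \|^2}{2m_i} + V(\mathbf{Q} + \delta\mathbf{q}_1, \ldots, \mathbf{Q} + \delta\mathbf{q}_N) \\
&= \sum_{i=1}^{N} \left[ \frac{m_i^2 \|\dot{\mathbf{Q}}\|^2}{2m_i} + \frac{\|\delta\mathbf{p}_i\|^2}{2m_i} + \frac{2\, m_i \dot{\mathbf{Q}} \cdot \delta\mathbf{p}_i}{2m_i} \right] + V(\mathbf{Q} + \delta\mathbf{q}_1, \ldots, \mathbf{Q} + \delta\mathbf{q}_N) \\
&= \frac{1}{2} \left( \sum_{i=1}^{N} m_i \right) \|\dot{\mathbf{Q}}\|^2 + \sum_{i=1}^{N} \frac{\|\delta\mathbf{p}_i\|^2}{2m_i} + \dot{\mathbf{Q}} \cdot \sum_{i=1}^{N} \delta\mathbf{p}_i + V(\mathbf{Q} + \delta\mathbf{q}_1, \ldots, \mathbf{Q} + \delta\mathbf{q}_N).
\end{aligned}
\tag{39.20}
$$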

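Note that

$$
\sum_{i=1}^{N} \delta\mathbf{p}_i
= \sum_{i=1}^{N} m_i\, \delta\dot{\mathbf{q}}_i
= \frac{\mathrm{d}}{\mathrm{d}t} \sum_{i=1}^{N} m_i\, \delta\mathbf{q}_i
= \frac{\mathrm{d}}{\mathrm{d}t} \sum_{i=1}^{N} m_i \left( \mathbf{q}_i - \mathbf{Q} \right)
= \frac{\mathrm{d}}{\mathrm{d}t} \left( M\mathbf{Q} - M\mathbf{Q} \right) = \mathbf{0}.
\tag{39.21}
$$

Hence, the total Hamiltonian is

$$
H = \frac{\|\mathbf{P}\|^2}{2M} + \sum_{i=1}^{N} \frac{\|\delta\mathbf{p}_i\|^2}{2m_i} + V(\mathbf{Q} + \delta\mathbf{q}_1, \ldots, \mathbf{Q} + \delta\mathbf{q}_N)
\tag{39.22}
$$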
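The split of the kinetic energy into a mean-motion part and a fluctuation part can be verified directly. The following sketch is an illustration only; the masses and velocities are random numbers, not data from the notes, and the helper names are hypothetical. It checks that the fluctuation momenta sum to zero and that the kinetic energy equals ‖P‖²/(2M) plus the fluctuation contribution.

```python
import random

def split_mean_motion(masses, velocities):
    """Return total mass M, mean momentum P and fluctuation momenta delta_p."""
    dim = len(velocities[0])
    M = sum(masses)
    P = [sum(m * v[d] for m, v in zip(masses, velocities)) for d in range(dim)]
    Q_dot = [P[d] / M for d in range(dim)]            # velocity of the center of mass
    delta_p = [[m * (v[d] - Q_dot[d]) for d in range(dim)]
               for m, v in zip(masses, velocities)]   # delta p_i = m_i (v_i - Q_dot)
    return M, P, delta_p

def kinetic_energy(masses, momenta):
    """Sum of |p_i|^2 / (2 m_i) over all atoms."""
    return sum(sum(c * c for c in p) / (2.0 * m) for m, p in zip(masses, momenta))

# Arbitrary test data (assumption): 20 atoms with random masses and velocities.
rng = random.Random(1)
masses = [rng.uniform(1.0, 3.0) for _ in range(20)]
velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]

M, P, delta_p = split_mean_motion(masses, velocities)
full_momenta = [[m * c for c in v] for m, v in zip(masses, velocities)]
k_total = kinetic_energy(masses, full_momenta)
k_mean = sum(c * c for c in P) / (2.0 * M)            # |P|^2 / (2M)
k_fluct = kinetic_energy(masses, delta_p)
net_fluct_momentum = [sum(dp[d] for dp in delta_p) for d in range(3)]
```

Up to round-off, `net_fluct_momentum` vanishes and `k_total` equals `k_mean + k_fluct`, which is why the mixed term drops out of the Hamiltonian.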

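and

$$
\dot{\mathbf{Q}} = \frac{\partial H}{\partial \mathbf{P}}, \qquad
\dot{\mathbf{P}} = -\frac{\partial H}{\partial \mathbf{Q}}
= -\sum_{i=1}^{N} \frac{\partial V}{\partial \mathbf{q}_i}
= -\sum_{i=1}^{N} \left( -\mathbf{f}_i \right) = \mathbf{F}_{\text{ext}},
\tag{39.23}
$$

$$
\delta\dot{\mathbf{q}}_i = \frac{\partial H}{\partial\, \delta\mathbf{p}_i}, \qquad
\delta\dot{\mathbf{p}}_i = \dot{\mathbf{p}}_i - m_i \ddot{\mathbf{Q}}
= -\frac{\partial H}{\partial\, \delta\mathbf{q}_i} - \frac{m_i}{M} \dot{\mathbf{P}}
= \mathbf{f}_i - \frac{m_i}{M} \mathbf{F}_{\text{ext}}.
\tag{39.24}
$$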

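If no external forces act on the ensemble ($\mathbf{F}_{\text{ext}} = \mathbf{0}$), then the average motion decouples from the atomic fluctuations and we may work with

$$
H = \sum_{i=1}^{N} \frac{\|\delta\mathbf{p}_i\|^2}{2m_i} + V(\mathbf{Q} + \delta\mathbf{q}_1, \ldots, \mathbf{Q} + \delta\mathbf{q}_N)
\tag{39.25}
$$

for which

$$
\delta\dot{\mathbf{q}}_i = \frac{\partial H}{\partial\, \delta\mathbf{p}_i}, \qquad
\delta\dot{\mathbf{p}}_i = -\frac{\partial H}{\partial\, \delta\mathbf{q}_i}.
\tag{39.26}
$$

Since the mean motion is often of minor interest, we will work only with the perturbations (δq, δp) to describe atomic motion in the following. For convenience, we will drop the δ in all perturbations and write (q, p).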



39.3 The microcanonical ensemble

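The microcanonical ensemble considers an isolated system containing a constant number of N atoms, contained in a constant volume V and having a constant energy E. Because the Hamiltonian of an isolated system is constant along trajectories in phase space, all microstates satisfying H = E form a closed hypersurface which encloses a volume $V_R(E)$:

$$
V_R(E) = \int_{\Gamma} \Theta\big(E - H(\mathbf{q},\mathbf{p})\big)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} = \int_{H<E} \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p}
\tag{39.27}
$$

with the Heaviside jump function Θ(⋅) (written with a separate symbol here to avoid confusion with the Hamiltonian H). This allows us to define the density of states as

$$
D(E) = \frac{\mathrm{d}V_R}{\mathrm{d}E}(E).
\tag{39.28}
$$

D(E) dE is thus the phase-space volume contained between the hypersurfaces defined by E and E + dE. For a finite increment ∆E, these hypersurfaces bound the hypershell

$$
\Sigma(E,\Delta E) = \left\{ (\mathbf{q},\mathbf{p}) \,:\, E \le H(\mathbf{q},\mathbf{p}) \le E + \Delta E \right\}.
\tag{39.29}
$$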

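The distribution function ρ of the microcanonical ensemble can now be defined as follows. Let us define ρ such that it is constant within Σ(E, ∆E) (later taking the limit ∆E → 0) and zero everywhere else. Note that we must have

$$
\int_{\Gamma} \rho(\mathbf{q},\mathbf{p};E,\Delta E)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} = 1,
\tag{39.30}
$$

which results in

$$
\rho(\mathbf{q},\mathbf{p};E,\Delta E) =
\begin{cases}
\dfrac{1}{V_R(E+\Delta E) - V_R(E)} & \text{if } (\mathbf{q},\mathbf{p}) \in \Sigma(E,\Delta E), \\[2ex]
0 & \text{else.}
\end{cases}
\tag{39.31}
$$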

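The phase average now becomes

$$
\begin{aligned}
\langle A \rangle
&= \lim_{\Delta E \to 0} \int_{\Gamma} A(\mathbf{q},\mathbf{p})\, \rho(\mathbf{q},\mathbf{p};E,\Delta E)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p}
= \lim_{\Delta E \to 0} \frac{\int_{\Sigma(E,\Delta E)} A(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p}}{V_R(E+\Delta E) - V_R(E)} \\
&= \lim_{\Delta E \to 0} \frac{\dfrac{1}{\Delta E} \left[ \int_{V_R(E+\Delta E)} A(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} - \int_{V_R(E)} A(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \right]}{\dfrac{V_R(E+\Delta E) - V_R(E)}{\Delta E}} \\
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \int_{V_R(E)} A(\mathbf{q},\mathbf{p})\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \\
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \int_{\Gamma} A(\mathbf{q},\mathbf{p})\, \Theta\big(E - H(\mathbf{q},\mathbf{p})\big)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p},
\end{aligned}
\tag{39.32}
$$

where we used the Heaviside function Θ(⋅) in order to extend the integral to all of phase space in the last line. Thus, we obtain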

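$$
\langle A \rangle = \frac{1}{D(E)} \int_{\Gamma} A(\mathbf{q},\mathbf{p})\, \delta\big(E - H(\mathbf{q},\mathbf{p})\big)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p}
= \int_{\Gamma} A(\mathbf{q},\mathbf{p})\, \rho(\mathbf{q},\mathbf{p},E)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p}
\tag{39.33}
$$

with the microcanonical distribution function

$$
\rho(\mathbf{q},\mathbf{p},E) = \frac{1}{D(E)}\, \delta\big(E - H(\mathbf{q},\mathbf{p})\big).
\tag{39.34}
$$

Notice that the normalization constraint implies that

$$
\int_{\Gamma} \delta\big(E - H(\mathbf{q},\mathbf{p})\big)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} = D(E).
\tag{39.35}
$$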

Now that we have the probability density, we can look at macroscopic state variables. For example, we may define the internal energy as

$$
U = \langle H \rangle,
\tag{39.36}
$$

so that for an isolated system U = E. From thermodynamics, recall that

$$
\frac{\partial S}{\partial U} = \frac{1}{T} \quad\text{and}\quad T = \frac{1}{\partial S/\partial E}
\tag{39.37}
$$

with entropy S and temperature T.

Boltzmann postulated the entropy as

S = kB ln Ω, (39.38)

which in our case can be restated as

S(E) = kB lnD(E), (39.39)

which in the thermodynamic limit of N →∞ can be shown to approach

S(E) = kB lnVR(E). (39.40)

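Using (39.32) and the fact that E = const., let us compute, for arbitrary phase-space coordinates $x_i$ and $x_j$ (each being one component of q or p),

$$
\begin{aligned}
\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \int_{V_R(E)} x_i \frac{\partial H}{\partial x_j}\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \\
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \int_{V_R(E)} x_i \frac{\partial (H - E)}{\partial x_j}\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \\
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \left[ \int_{V_R(E)} \frac{\partial \left[ x_i (H - E) \right]}{\partial x_j}\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} - \int_{V_R(E)} \delta_{ij} (H - E)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \right] \\
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \left[ \int_{S(E)} x_i (H - E)\, n_j\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} - \int_{V_R(E)} \delta_{ij} (H - E)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \right],
\end{aligned}
\tag{39.41}
$$

where the last step uses the divergence theorem, with S(E) denoting the hypersurface H = E that bounds $V_R(E)$ and $n_j$ the components of its outward unit normal.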

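Note that H = E on S(E), so that the first integral vanishes and we are left with

$$
\begin{aligned}
\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle
&= \frac{1}{D(E)} \frac{\partial}{\partial E} \int_{V_R(E)} \delta_{ij} (E - H)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \\
&= \frac{\delta_{ij}}{D(E)} \frac{\partial}{\partial E} \int_{\Gamma} (E - H)\, \Theta(E - H)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \\
&= \frac{\delta_{ij}}{D(E)} \left[ \int_{\Gamma} \Theta(E - H)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} + \int_{\Gamma} (E - H)\, \delta(E - H)\, \mathrm{d}\mathbf{q}\, \mathrm{d}\mathbf{p} \right],
\end{aligned}
\tag{39.42}
$$

where Θ(⋅) denotes the Heaviside function. Recall that the first integral is by definition $V_R(E)$ and notice that the second integral vanishes because its integrand is nonzero only where E − H = 0. Therefore,

$$
\left\langle x_i \frac{\partial H}{\partial x_j} \right\rangle
= \delta_{ij} \frac{V_R(E)}{D(E)}
= \delta_{ij} \frac{V_R(E)}{V_R'(E)}
= \delta_{ij} \left[ \frac{\partial}{\partial E} \ln V_R(E) \right]^{-1}
= \delta_{ij} \frac{k_B}{\partial S/\partial E}
= \delta_{ij}\, k_B T.
\tag{39.43}
$$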

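Finally, let us take $x_i = p_i$ (where i refers to one degree of freedom of one of the N atoms) so that $\partial H/\partial x_j = \partial H/\partial p_j = \dot{q}_j$. Then,

$$
\delta_{ij}\, k_B T = \left\langle p_i \frac{\partial H}{\partial p_j} \right\rangle = \langle p_i \dot{q}_j \rangle = \left\langle \frac{p_i p_j}{m_j} \right\rangle
\tag{39.44}
$$

and considering i = j gives

$$
\left\langle \frac{p_i^2}{2m_i} \right\rangle = \frac{1}{2} k_B T.
\tag{39.45}
$$

Summing over all 3N degrees of freedom thus gives the relation

$$
\sum_{i=1}^{3N} \left\langle \frac{p_i^2}{2m_i} \right\rangle
= \sum_{i=1}^{3N} \left\langle \frac{1}{2} m_i \dot{q}_i^2 \right\rangle
= \langle K_{\text{vibr.}} \rangle = \frac{3N}{2} k_B T
\quad\Leftrightarrow\quad
T = \frac{2 \langle K_{\text{vibr.}} \rangle}{3N k_B},
\tag{39.46}
$$

which links the average vibrational energy to the temperature of the ensemble.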

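Alternatively, using $x_i = q_i$ gives

$$
\delta_{ij}\, k_B T = \left\langle q_i \frac{\partial H}{\partial q_j} \right\rangle,
\tag{39.47}
$$

so that summation over all 3 degrees of freedom of an atom α, again for i = j, yields

$$
3 k_B T = \sum_{i=1}^{3} \left\langle q_i \frac{\partial H}{\partial q_i} \right\rangle
= \sum_{i=1}^{3} \left\langle q_i \frac{\partial V}{\partial q_i} \right\rangle
= -\langle \mathbf{q}_\alpha \cdot \mathbf{f}_\alpha \rangle
\tag{39.48}
$$

and

$$
-\sum_{\alpha=1}^{N} \langle \mathbf{q}_\alpha \cdot \mathbf{f}_\alpha \rangle = -\langle W \rangle = 3N k_B T,
\tag{39.49}
$$

where $W = \sum_{\alpha=1}^{N} \mathbf{q}_\alpha \cdot \mathbf{f}_\alpha$ is often referred to as the virial of the system. Note that, by comparison with (39.46), $\langle W \rangle = -2 \langle K_{\text{vibr.}} \rangle$, which is referred to as the virial theorem.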
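As a plausibility check of (39.43) and of the virial theorem, one may return to the one-dimensional harmonic oscillator of Section 39.1. The short calculation below is an added illustration rather than part of the original derivation. It uses only (39.4) to (39.7), takes averages over one period of the motion, and treats the single degree of freedom purely formally (so "temperature" here is just the quantity defined by (39.43)).

```latex
\begin{aligned}
V_R(E) &= \pi\, q_0\, (q_0 m \omega) = \frac{2\pi E}{\omega}, \qquad
D(E) = \frac{\mathrm{d}V_R}{\mathrm{d}E} = \frac{2\pi}{\omega}
\quad\Rightarrow\quad k_B T = \frac{V_R(E)}{D(E)} = E, \\
\left\langle p\, \frac{\partial H}{\partial p} \right\rangle
&= \frac{\langle p^2 \rangle}{m} = q_0^2\, m\, \omega^2\, \langle \sin^2(\omega t) \rangle = \frac{k}{2}\, q_0^2 = E, \\
\langle W \rangle &= \langle q f \rangle = -k \langle q^2 \rangle = -k\, q_0^2\, \langle \cos^2(\omega t) \rangle = -E = -2 \langle K \rangle .
\end{aligned}
```

The first line uses the area of the ellipse (39.5) together with $E = k q_0^2/2$ and $k = m\omega^2$; the second and third lines reproduce (39.44) and the virial theorem for this one degree of freedom, since $\langle K \rangle = \langle p^2 \rangle/(2m) = E/2$.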

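Without derivation, we just note that other continuum quantities can be derived similarly, including the virial stress tensor (in the current configuration, i.e., the Cauchy stresses)

$$
\boldsymbol{\sigma} = -\frac{1}{|\Omega|} \sum_{a=1}^{N} \left\langle \frac{\mathbf{p}_a \otimes \mathbf{p}_a}{2m_a} + \mathbf{f}_a \otimes \mathbf{q}_a \right\rangle
\quad\text{with}\quad \mathbf{f}_a = -\frac{\partial V}{\partial \mathbf{q}_a},
\tag{39.50}
$$

which, in static equilibrium, agrees with the Cauchy-Born derivation of Section 36.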



40 Molecular Solution Algorithms

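The equations of motion,

$$
m_i \ddot{\mathbf{q}}_i = \mathbf{f}_i = -\frac{\partial V}{\partial \mathbf{q}_i} \quad\text{for } i = 1, \ldots, N,
\tag{40.1}
$$

are to be solved numerically.

40.1 Zero-Temperature Molecular Statics

The simplest case, molecular statics at zero temperature, seeks the solution of

$$
\mathbf{0} = \mathbf{f}_i = -\frac{\partial V}{\partial \mathbf{q}_i} \quad\text{for } i = 1, \ldots, N.
\tag{40.2}
$$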

Owing to the symmetry of atomic lattices, many local energy minima exist (and oftentimes multiple global minima), which is why the solution algorithm can be numerically demanding. In addition, for large numbers N the assembly and storage of system-wide matrices becomes computationally intractable, which is why matrix-free iterative solvers are popular. Typical solution algorithms include solvers of conjugate gradient and steepest descent type; the so-called Fast Inertial Relaxation Engine (FIRE) is a related popular scheme.

Special attention must be given to the application of boundary conditions. Essential BCs on individual atoms are generally to be avoided; one rather introduces padding regions to apply essential BCs over large numbers of atoms. Periodic BCs are a common way to simulate RVEs.

A general complication of particle methods such as atomistics (when compared to, e.g., FEM) is the necessity for frequent neighborhood searches and the associated updates of the neighbor lists of all N atoms.
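To make the matrix-free solution of (40.2) concrete, the sketch below relaxes a single pair of atoms by steepest descent. It is an illustration under stated assumptions, not the scheme used in any particular code: the interaction is a Lennard-Jones pair potential in reduced units (ε = σ = 1), the only unknown is the bond length r, and the fixed step size `step` and tolerance `tol` are hand-picked, hypothetical values. Production solvers would rather use a line search, conjugate gradients or FIRE.

```python
def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force f = -dV/dr for V(r) = 4 eps [(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

def relax_dimer(r_start, step=0.01, tol=1e-10, max_iter=10_000):
    """Steepest descent on the bond length: follow the force until |f| < tol."""
    r = r_start
    for iteration in range(max_iter):
        f = lj_force(r)
        if abs(f) < tol:
            return r, iteration
        r += step * f          # move downhill in energy, i.e., along the force
    raise RuntimeError("steepest descent did not converge")

r_eq, n_iter = relax_dimer(1.5)
```

For this potential the force vanishes at r = 2^(1/6) σ, which `r_eq` should reproduce. No stiffness matrix is ever assembled; only force evaluations are needed.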

40.2 Molecular Dynamics

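Molecular Dynamics (MD) integrates the dynamic equations of motion over time, which is generally done in an explicit fashion to avoid the assembly and solution of the global system for computational reasons.

The most common scheme is called velocity-Verlet and can be summarized as follows. Using time increments $\Delta t = t^{\alpha} - t^{\alpha-1}$ we integrate the equation of motion to obtain

$$
\mathbf{q}(t^{\alpha} + \Delta t) = \mathbf{q}^{\alpha+1} = \mathbf{q}^{\alpha} + \dot{\mathbf{q}}^{\alpha} \Delta t + \frac{1}{2} \ddot{\mathbf{q}}^{\alpha} \Delta t^2 + O(\Delta t^3),
\tag{40.3}
$$

and we know that

$$
\mathbf{a}_i^{\alpha} = \ddot{\mathbf{q}}_i^{\alpha} = \mathbf{f}_i^{\alpha} / m_i.
\tag{40.4}
$$

Thus, for known initial conditions $\mathbf{q}^0 = \{\mathbf{q}_1(0), \ldots, \mathbf{q}_N(0)\}$ and $\mathbf{p}^0 = \{\mathbf{p}_1(0), \ldots, \mathbf{p}_N(0)\}$ we can integrate the above equations of motion. Note that, in order to ensure time-reversible integration, one usually uses an average acceleration scheme, which uses

$$
\mathbf{v}^{\alpha+1} = \mathbf{v}^{\alpha} + \frac{\mathbf{a}^{\alpha} + \mathbf{a}^{\alpha+1}}{2} \Delta t
\tag{40.5}
$$

rather than a fully explicit scheme. The advantage is that we may conclude

$$
\mathbf{q}(t^{\alpha} - \Delta t) \approx \mathbf{q}^{\alpha} - \dot{\mathbf{q}}^{\alpha} \Delta t + \frac{1}{2} \ddot{\mathbf{q}}^{\alpha} \Delta t^2 = \mathbf{q}^{\alpha} - \mathbf{v}^{\alpha} \Delta t + \frac{1}{2} \mathbf{a}^{\alpha} \Delta t^2
\tag{40.6}
$$

so that

$$
\begin{aligned}
\mathbf{q}^{\alpha} &= \mathbf{q}^{\alpha-1} + \mathbf{v}^{\alpha} \Delta t - \frac{1}{2} \mathbf{a}^{\alpha} \Delta t^2 \\
&= \mathbf{q}^{\alpha-1} + \mathbf{v}^{\alpha-1} \Delta t + \frac{\mathbf{a}^{\alpha-1} + \mathbf{a}^{\alpha}}{2} \Delta t^2 - \frac{1}{2} \mathbf{a}^{\alpha} \Delta t^2 \\
&= \mathbf{q}^{\alpha-1} + \mathbf{v}^{\alpha-1} \Delta t + \frac{\Delta t^2}{2} \mathbf{a}^{\alpha-1},
\end{aligned}
\tag{40.7}
$$

i.e., stepping backward in time retraces the forward update (40.3).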

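Note that one generally stores q(t) and v(t) = p(t)/m (rather than q and p). The general algorithm for zero-temperature MD is as follows:

(i) start with $t^0 = 0$ and known $\mathbf{q}^0$ and $\mathbf{v}^0 = \mathbf{p}^0/m$

(ii) $\mathbf{f}^{\alpha} = -\partial V/\partial \mathbf{q}\,(\mathbf{q}^{\alpha})$

(iii) $\mathbf{a}^{\alpha} = \mathbf{f}^{\alpha}/m$

(iv) while $t \le t_{\text{end}}$:

    $\mathbf{q}^{\alpha+1} = \mathbf{q}^{\alpha} + \mathbf{v}^{\alpha} \Delta t + \mathbf{f}^{\alpha} \Delta t^2/(2m)$
    $\mathbf{f}^{\alpha+1} = -\partial V/\partial \mathbf{q}\,(\mathbf{q}^{\alpha+1})$
    $\mathbf{a}^{\alpha+1} = \mathbf{f}^{\alpha+1}/m$
    $\mathbf{v}^{\alpha+1} = \mathbf{v}^{\alpha} + \Delta t\,(\mathbf{a}^{\alpha} + \mathbf{a}^{\alpha+1})/2$
    $t^{\alpha+1} = t^{\alpha} + \Delta t$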
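The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production MD code: it stores one list entry per degree of freedom, the function name `velocity_verlet` and the `force` callback are hypothetical, and neighbor lists, boundary conditions and units are ignored. The usage example integrates a single linear spring over one period, with arbitrary values for `k` and `m`.

```python
import math

def velocity_verlet(q, v, masses, force, dt, n_steps):
    """Integrate m_i q_i'' = f_i(q) with the velocity-Verlet scheme.

    q, v, masses: lists with one entry per degree of freedom.
    force: callable returning the list of forces for given positions.
    Returns the final positions and velocities as new lists.
    """
    q = list(q)
    v = list(v)
    f = force(q)                                        # step (ii)
    a = [fi / mi for fi, mi in zip(f, masses)]          # step (iii)
    for _ in range(n_steps):                            # step (iv)
        q = [qi + vi * dt + 0.5 * ai * dt * dt for qi, vi, ai in zip(q, v, a)]
        f = force(q)                                    # forces at the new positions
        a_new = [fi / mi for fi, mi in zip(f, masses)]
        v = [vi + 0.5 * dt * (ai + an) for vi, ai, an in zip(v, a, a_new)]
        a = a_new
    return q, v

# Usage: one mass on a linear spring, integrated over exactly one period.
k, m = 4.0, 1.0
period = 2.0 * math.pi / math.sqrt(k / m)
n_steps = 1000
q_end, v_end = velocity_verlet([1.0], [0.0], [m],
                               lambda q: [-k * qi for qi in q],
                               dt=period / n_steps, n_steps=n_steps)
```

After one period the mass should be back near its starting point with nearly zero velocity. Note that only one force evaluation per step is needed, because the acceleration computed at the end of a step is reused at the start of the next.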

Note that the force calculation $\mathbf{f}^{\alpha} = -\partial V/\partial \mathbf{q}\,(\mathbf{q}^{\alpha})$ also involves neighborhood updates (i.e., for each atom whose force is to be computed one finds, stores and frequently updates the neighboring atoms within the Verlet radius, which is typically chosen larger than the cut-off radius to reduce the frequency of neighborhood updates).

Without proof, we mention that the velocity-Verlet scheme is (approximately) energy conserving; i.e., it conserves the total Hamiltonian in an approximate sense (more specifically, it conserves a quantity known as the shadow Hamiltonian, which converges to the exact Hamiltonian as ∆t → 0; the error in the Hamiltonian scales as O(∆t²)). This is an important feature, which makes the velocity-Verlet scheme directly applicable to the microcanonical (NVE) ensemble. For example, for the simple harmonic oscillator discussed before, the shadow Hamiltonian can be derived exactly as

$$
H^*(p,q) = H(p,q) - \frac{k^2 \Delta t^2}{8m}\, q^2,
\tag{40.8}
$$

which obviously agrees with the exact Hamiltonian H in the limit ∆t → 0.
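The role of the shadow Hamiltonian can be observed numerically. The sketch below is an added illustration; parameter values are arbitrary assumptions. It integrates the harmonic oscillator with velocity-Verlet and records both the exact Hamiltonian H and the shadow Hamiltonian H* of (40.8) at every step, using p = mv with the velocity available at the same time level as the position.

```python
def verlet_energies(m, k, q0, p0, dt, n_steps):
    """Velocity-Verlet for the harmonic oscillator; returns lists of H and H*."""
    q, v = q0, p0 / m
    a = -k * q / m
    exact, shadow = [], []
    for _ in range(n_steps + 1):
        h = 0.5 * m * v * v + 0.5 * k * q * q
        exact.append(h)
        shadow.append(h - k * k * dt * dt / (8.0 * m) * q * q)   # (40.8)
        q = q + v * dt + 0.5 * a * dt * dt
        a_new = -k * q / m
        v = v + 0.5 * dt * (a + a_new)
        a = a_new
    return exact, shadow

exact, shadow = verlet_energies(m=1.0, k=4.0, q0=1.0, p0=0.0, dt=0.1, n_steps=2000)
drift_exact = max(exact) - min(exact)      # bounded oscillation of order dt^2
drift_shadow = max(shadow) - min(shadow)   # zero up to round-off
```

With these values H oscillates with an amplitude of order ∆t² but shows no long-term drift, while H* stays constant to machine precision.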

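We can easily compute the instantaneous temperature of the system in a postprocessing step, viz. (39.46) yields

$$
T(t^{\alpha}) = \frac{2}{3N k_B} \sum_{i=1}^{N} \frac{m_i}{2} \| \delta\mathbf{v}_i^{\alpha} \|^2
\tag{40.9}
$$

where $\delta\mathbf{v}_i^{\alpha} = \mathbf{v}_i(t^{\alpha}) - \mathbf{P}(t^{\alpha})/M$ is the velocity relative to the mean motion $\dot{\mathbf{Q}} = \mathbf{P}/M$, as discussed in Section 39.2. Note that the temperature T(t) fluctuates, and it makes more sense to compute the long-term average of T as an approximation of the phase average.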



40.3 Finite Temperature

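Finite-temperature calculations usually require monitoring and/or controlling of the system temperature T. Irrespective of the ensemble being used, one can set the initial temperature $T_{\text{ini}}$ of an atomic ensemble as follows.

First, one initializes all particles with a random velocity and, in a subsequent step, ensures that the average momentum $\mathbf{P} = \sum_i m_i \mathbf{v}_i$ vanishes. This is accomplished, e.g., by adjusting all particle velocities as

$$
\mathbf{v}^0 \leftarrow \mathbf{v}^0 - \frac{\mathbf{P}^0}{M} \quad\text{with, as before,}\quad M = \sum_{i=1}^{N} m_i.
\tag{40.10}
$$

Next, using (40.9) compute the instantaneous temperature $T^0$ and rescale all atomic velocities by

$$
\mathbf{v}^0 \leftarrow \sqrt{\frac{T_{\text{ini}}}{T^0}}\; \mathbf{v}^0
\tag{40.11}
$$

such that the rescaled velocities $\delta\mathbf{v}_i^0$ yield the desired temperature (with $\mathbf{v}_i^0$ and $T^0$ on the right-hand side denoting the values before rescaling):

$$
\frac{2}{3N k_B} \sum_{i=1}^{N} \frac{m_i}{2} \| \delta\mathbf{v}_i^0 \|^2
= \frac{2}{3N k_B} \sum_{i=1}^{N} \frac{m_i}{2} \frac{T_{\text{ini}}}{T^0} \| \mathbf{v}_i^0 \|^2
= \frac{T_{\text{ini}}}{T^0}\, \frac{2}{3N k_B} \sum_{i=1}^{N} \frac{m_i}{2} \| \mathbf{v}_i^0 \|^2
= T_{\text{ini}}.
\tag{40.12}
$$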
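The initialization procedure can be written compactly as follows. This is a sketch under simplifying assumptions: reduced units with k_B = 1 (the constant `K_B` is a placeholder), Gaussian random velocities as one possible choice of "random velocity", and three degrees of freedom per atom. The function names are hypothetical.

```python
import random

K_B = 1.0  # Boltzmann constant in reduced units (assumption)

def mean_velocity(masses, velocities):
    """Velocity of the center of mass, P / M."""
    M = sum(masses)
    return [sum(m * v[d] for m, v in zip(masses, velocities)) / M for d in range(3)]

def instantaneous_temperature(masses, velocities):
    """Eq. (40.9), using velocity fluctuations about the mean motion."""
    v_mean = mean_velocity(masses, velocities)
    twice_kinetic = sum(m * sum((v[d] - v_mean[d]) ** 2 for d in range(3))
                        for m, v in zip(masses, velocities))
    return twice_kinetic / (3.0 * len(masses) * K_B)

def initialize_velocities(masses, t_ini, rng):
    """Random velocities with zero mean momentum and temperature t_ini."""
    velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]
    v_mean = mean_velocity(masses, velocities)
    velocities = [[v[d] - v_mean[d] for d in range(3)] for v in velocities]   # (40.10)
    scale = (t_ini / instantaneous_temperature(masses, velocities)) ** 0.5    # (40.11)
    return [[scale * c for c in v] for v in velocities]

masses = [1.0 + 0.5 * (i % 3) for i in range(100)]
v0 = initialize_velocities(masses, t_ini=0.7, rng=random.Random(42))
t0 = instantaneous_temperature(masses, v0)
p0 = [sum(m * v[d] for m, v in zip(masses, v0)) for d in range(3)]
```

Because the rescaling multiplies all velocities by the same factor, it leaves the total momentum at zero, so the two steps do not interfere with each other.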

Maintaining a constant temperature, such as when using the canonical (NVT) ensemble, is a whole different challenge that is commonly met by using special time-integration schemes that modify atomic velocities at each time step.

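Fixing the total kinetic energy can be accomplished, e.g., by Gauß' principle of least constraint. To this end, notice that the classical equations of motion can be restated as minimizers of

$$
C = \sum_{i=1}^{N} \frac{m_i}{2} \left( \ddot{\mathbf{q}}_i - \frac{\mathbf{f}_i}{m_i} \right)^2
\quad\text{because}\quad
\mathbf{0} = \frac{\partial C}{\partial \ddot{\mathbf{q}}_i} = m_i \left( \ddot{\mathbf{q}}_i - \frac{\mathbf{f}_i}{m_i} \right) = m_i \ddot{\mathbf{q}}_i - \mathbf{f}_i.
\tag{40.13}
$$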

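This allows us to add constraints via Lagrange multipliers. E.g., the constraint

$$
c = \sum_{i=1}^{N} \frac{m_i}{2} \| \dot{\mathbf{q}}_i \|^2 - \frac{3}{2} N k_B T = 0
\tag{40.14}
$$

can be differentiated to give

$$
\frac{\partial c}{\partial t} = \sum_{i=1}^{N} m_i\, \dot{\mathbf{q}}_i \cdot \ddot{\mathbf{q}}_i = 0,
\tag{40.15}
$$

which motivates the definition

$$
C^* = C - \lambda \sum_{i=1}^{N} m_i\, \dot{\mathbf{q}}_i \cdot \ddot{\mathbf{q}}_i.
\tag{40.16}
$$

The resulting modified equations of motion are

$$
\frac{\partial C^*}{\partial \ddot{\mathbf{q}}_i} = m_i \ddot{\mathbf{q}}_i - \mathbf{f}_i - \lambda m_i \dot{\mathbf{q}}_i = \mathbf{0}
\quad\Leftrightarrow\quad
m_i \ddot{\mathbf{q}}_i = \mathbf{f}_i + \lambda m_i \dot{\mathbf{q}}_i.
\tag{40.17}
$$

The Lagrange multiplier λ can be obtained from inserting $\ddot{\mathbf{q}}_i = \mathbf{f}_i/m_i + \lambda \dot{\mathbf{q}}_i$ into the constraint equation (40.15), which yields

$$
\sum_{i=1}^{N} m_i \left( \mathbf{f}_i/m_i + \lambda \dot{\mathbf{q}}_i \right) \cdot \dot{\mathbf{q}}_i = 0
\quad\Leftrightarrow\quad
\lambda = -\frac{\sum_{a=1}^{N} \mathbf{f}_a \cdot \dot{\mathbf{q}}_a}{\sum_{a=1}^{N} m_a \| \dot{\mathbf{q}}_a \|^2}.
\tag{40.18}
$$

Overall, we thus arrive at

$$
m_i \ddot{\mathbf{q}}_i = \mathbf{f}_i - \frac{\sum_{a=1}^{N} \mathbf{f}_a \cdot \dot{\mathbf{q}}_a}{\sum_{a=1}^{N} m_a \| \dot{\mathbf{q}}_a \|^2}\, m_i \dot{\mathbf{q}}_i.
\tag{40.19}
$$

Note that this formulation does fix the average total kinetic energy but it does not allow any control of the atomic fluctuations with respect to the mean motion (as required in the temperature calculation). This is commonly achieved by so-called thermostats.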
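A quick numerical check of the constrained equations of motion: the sketch below is illustrative only, with random masses, velocities and forces standing in for an actual atomistic state and a hypothetical function name. It evaluates λ from (40.18), forms the accelerations f_i/m_i + λ q̇_i from (40.17), and confirms that the rate of change of the kinetic energy in (40.15) vanishes.

```python
import random

def isokinetic_accelerations(masses, velocities, forces):
    """Accelerations f_i/m_i + lambda*v_i with lambda taken from (40.18)."""
    power = sum(sum(fc * vc for fc, vc in zip(f, v))
                for f, v in zip(forces, velocities))          # sum of f_a . v_a
    twice_kinetic = sum(m * sum(vc * vc for vc in v)
                        for m, v in zip(masses, velocities))  # sum of m_a |v_a|^2
    lam = -power / twice_kinetic
    acc = [[fc / m + lam * vc for fc, vc in zip(f, v)]
           for m, f, v in zip(masses, forces, velocities)]
    return lam, acc

# Arbitrary test state (assumption): 10 atoms in 3-D.
rng = random.Random(3)
masses = [rng.uniform(1.0, 2.0) for _ in range(10)]
velocities = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]
forces = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in masses]

lam, acc = isokinetic_accelerations(masses, velocities, forces)
# Left-hand side of (40.15): sum of m_i v_i . a_i, the rate of kinetic energy.
kinetic_rate = sum(m * sum(ac * vc for ac, vc in zip(a, v))
                   for m, a, v in zip(masses, acc, velocities))
```

The multiplier acts like a velocity-proportional friction (or forcing) whose sign follows the instantaneous power of the interatomic forces.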

40.4 Thermostats

For example, the Langevin thermostat modifies the equations of motion into

$$
m_i \ddot{\mathbf{q}}_i = \mathbf{f}_i - \gamma_i m_i \dot{\mathbf{q}}_i + \mathbf{g}_i(t),
\tag{40.20}
$$

where $\gamma_i$ is a damping/viscosity constant and $\mathbf{g}_i$ denotes a random, time-varying force that must satisfy

$$
\langle \mathbf{g}_i \rangle = \mathbf{0}.
\tag{40.21}
$$

In practice, this can be achieved while enforcing a temperature T if each component of the force is drawn from a normal distribution with zero mean and variance

$$
\sigma_i^2 = \frac{2 \gamma_i m_i k_B T}{\Delta t}
\tag{40.22}
$$

at each time step. The damping term can be interpreted as Brownian motion at a significantly higher collision rate than that of atomic interactions.
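The balance between damping and random forcing can be illustrated with free particles (f_i = 0). The sketch below is a toy example with several assumptions that are not taken from the notes: it reads the variance of the random force as 2γ m k_B T/∆t, uses a simple explicit Euler velocity update rather than velocity-Verlet, works in one dimension, and picks arbitrary parameter values. The function name is hypothetical.

```python
import math
import random

def langevin_free_particles(n_particles, mass, gamma, kT, dt, n_steps, rng):
    """Explicit-Euler Langevin dynamics of free particles (f_i = 0) in 1-D.

    Each step applies the damping force -gamma*m*v and a random force drawn
    from a normal distribution with variance 2*gamma*m*kT/dt.
    """
    sigma = math.sqrt(2.0 * gamma * mass * kT / dt)
    v = [0.0] * n_particles
    for _ in range(n_steps):
        v = [vi + dt * (-gamma * vi + rng.gauss(0.0, sigma) / mass) for vi in v]
    return v

mass, kT_target = 2.0, 1.5
v = langevin_free_particles(n_particles=1000, mass=mass, gamma=1.0, kT=kT_target,
                            dt=0.02, n_steps=500, rng=random.Random(7))
# Equipartition (39.45) for one degree of freedom per particle: kT = m <v^2>.
kT_measured = mass * sum(vi * vi for vi in v) / len(v)
```

Starting from rest, the particles heat up until `kT_measured` fluctuates around the target value; with this simple update a small systematic offset of order γ∆t remains.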

A similar scheme is the Andersen thermostat which also introduces random forces as in theLangevin approach but does so not at every time step.

Probably the most common thermostat is the Nosé-Hoover thermostat, which starts with a fictitious Hamiltonian

$$
H^* = \frac{p^2}{2m} + \sum_{i=1}^{N} \frac{\| \mathbf{p}_i \|^2}{2 m_i q^2} + V(\mathbf{q}) + 3N k_B T \ln q.
\tag{40.23}
$$

H* includes the 1-D motion of a fictitious atom having mass m, position q and momentum p (scalars, not to be confused with the atomic positions $\mathbf{q}$ and momenta $\mathbf{p}_i$); and it uses rescaled atomic momenta $\mathbf{p}_i = q\, m_i \dot{\mathbf{q}}_i$. Applying Hamilton's equations of motion to all particles i = 1, . . . , N as well as to the fictitious atom results in the system of equations

$$
\begin{aligned}
m_i \ddot{\mathbf{q}}_i &= \mathbf{f}_i - \gamma\, m_i \dot{\mathbf{q}}_i, \\
\dot{\gamma} &= \frac{1}{m} \left( \sum_{i=1}^{N} \frac{\| \mathbf{p}_i \|^2}{m_i} - 3N k_B T \right).
\end{aligned}
\tag{40.24}
$$

Thus, the Nosé-Hoover thermostat is similar in nature to the Langevin/Andersen thermostats; here, the viscous drag coefficient γ is not constant but evolves with time and presents an intrinsic force that aims to drive the system to the enforced temperature T. The only parameter to be adjusted is the fictitious mass m, which may be interpreted as the inertia of the viscous coefficient.

We note that the same MD algorithm introduced above for zero-temperature conditions can be applied at finite temperature, if we modify the atomic forces according to each thermostat. E.g., for the Langevin thermostat we would define atomic forces

$$
\mathbf{f}_i^*(t^{\alpha}) = \mathbf{f}_i(t^{\alpha}) - \gamma\, m_i \dot{\mathbf{q}}_i(t^{\alpha}) + \mathbf{g}_i(t^{\alpha}).
\tag{40.25}
$$

Analogously, the Nosé-Hoover thermostat uses forces

$$
\mathbf{f}_i^*(t^{\alpha}) = \mathbf{f}_i(t^{\alpha}) - \gamma(t^{\alpha})\, m_i \dot{\mathbf{q}}_i(t^{\alpha})
\tag{40.26}
$$

and additionally updates the viscous coefficient γ at each time step in the average-acceleration fashion according to

$$
\gamma^{\alpha+1} = \gamma^{\alpha} + \frac{\Delta t}{2} \left( \dot{\gamma}^{\alpha} + \dot{\gamma}^{\alpha+1} \right).
\tag{40.27}
$$

Finally, note that one can analogously impose an average stress tensor, the most frequent formulation of which is known as the Parrinello-Rahman approximation.



References

Abraham, F. F., Broughton, J. Q., Bernstein, N., Kaxiras, E., 1998. Spanning the continuum to quantum length scales in a dynamic simulation of brittle fracture. EPL (Europhysics Letters) 44 (6), 783. URL http://stacks.iop.org/0295-5075/44/i=6/a=783

Miehe, C., Koch, A., 2002. Computational micro-to-macro transitions of discretized microstructures undergoing small strains. Archive of Applied Mechanics 72 (4), 300–317. URL http://dx.doi.org/10.1007/s00419-002-0212-2


