automatic differentiation
DESCRIPTION
An introduction to automatic differentiation with examples.
TRANSCRIPT
![Page 1: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/1.jpg)
Automatic DifferentiationHigher derivatives and Taylor series
Automatic Differentiation
Tobias Hoeppner
18 May 2011
Tobias Hoeppner Automatic Differentiation
![Page 2: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/2.jpg)
Outline
Automatic Differentiation
Higher derivatives and Taylor series
![Page 3: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/3.jpg)
What is Automatic Differentiation
also known as:
- computational differentiation
- algorithmic differentiation
- differentiation of algorithms

AD is a process for evaluating derivatives which depends only on an algorithmic specification of the function to be differentiated. In practice, the specification of the function is part of a computer program.

It's not symbolic differentiation. It's not divided differences.
![Page 4: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/4.jpg)
An Example
the function

$$f(x, y) = (xy + \sin x + 4)(3y^2 + 6) \tag{1}$$

the goal of symbolic differentiation is to produce formulas for its derivatives:

$$\frac{\partial f}{\partial x} = (y + \cos x)(3y^2 + 6) = 3y^2 \cos x + 6 \cos x + 3y^3 + 6y, \tag{2}$$

$$\frac{\partial f}{\partial y} = 6y(xy + \sin x + 4) + x(3y^2 + 6) = 9xy^2 + 6y \sin x + 24y + 6x. \tag{3}$$

in principle, evaluating these formulas gives exact values of the derivatives; in practice, the results are still subject to roundoff error from floating-point arithmetic.
![Page 5: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/5.jpg)
divided differences
- produce approximations to the values of derivatives
- involve only function evaluations

$$\frac{\partial f}{\partial x} \approx \frac{f(x + \Delta x, y) - f(x - \Delta x, y)}{2\Delta x} = \frac{\partial f}{\partial x} + O(\Delta x^2) \tag{4}$$

where the term $O(\Delta x^2)$ denotes the (unknown) truncation error. In contrast, the values for derivatives obtained by AD are exact up to roundoff and are often much less expensive to compute.
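The truncation error in (4) can be observed numerically. The following is a minimal sketch, not part of the original slides: it compares the central divided difference with the symbolic formula (2) for the example function (1). The evaluation point `(1.0, 2.0)`, the step sizes, and all function names are assumptions chosen for illustration.

```python
import math

def f(x, y):
    # Example function (1) from the slides.
    return (x * y + math.sin(x) + 4.0) * (3.0 * y**2 + 6.0)

def dfdx_exact(x, y):
    # Symbolic partial derivative with respect to x, formula (2).
    return (y + math.cos(x)) * (3.0 * y**2 + 6.0)

def dfdx_central(x, y, dx):
    # Central divided difference, formula (4).
    return (f(x + dx, y) - f(x - dx, y)) / (2.0 * dx)

x0, y0 = 1.0, 2.0  # arbitrary evaluation point (assumption)
errors = []
for dx in (1e-1, 1e-2, 1e-3):
    err = abs(dfdx_central(x0, y0, dx) - dfdx_exact(x0, y0))
    errors.append(err)
    print(f"dx = {dx:g}   error = {err:.3e}")
```

For these step sizes, shrinking `dx` by a factor of 10 should shrink the error by roughly a factor of 100, which is the second-order behaviour the $O(\Delta x^2)$ term describes.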
![Page 6: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/6.jpg)
How and why does AD work?
- AD works whenever the chain rule holds
- the theoretical exactness of automatic differentiation stems from the fact that it uses the same rules of differentiation as learned in elementary calculus
- the rules are applied to an algorithmic specification rather than to a formula
- step back a little and consider how to evaluate (rather than differentiate) a formula
![Page 7: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/7.jpg)
evaluating a formula
- consider the formula given by (1)
- one starts with the values of x and y, builds up each factor, and then multiplies them to obtain the final result
- the steps involved:

$$\begin{aligned}
t_1 &= x, & t_6 &= t_5 + 4, \\
t_2 &= y, & t_7 &= t_2^2, \\
t_3 &= t_1 t_2, & t_8 &= 3 t_7, \\
t_4 &= \sin t_1, & t_9 &= t_8 + 6, \\
t_5 &= t_3 + t_4, & t_{10} &= t_6 t_9
\end{aligned} \tag{5}$$

the result is $t_{10} = f(x, y)$
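The code list (5) translates line by line into a straight-line program. The sketch below is an illustration added here, with the function names `f_code_list` and `f_direct` chosen as assumptions; each assignment corresponds to one entry of (5).

```python
import math

def f_code_list(x, y):
    # Each line mirrors one entry of the code list (5).
    t1 = x
    t2 = y
    t3 = t1 * t2
    t4 = math.sin(t1)
    t5 = t3 + t4
    t6 = t5 + 4.0
    t7 = t2 ** 2
    t8 = 3.0 * t7
    t9 = t8 + 6.0
    t10 = t6 * t9
    return t10

def f_direct(x, y):
    # Formula (1), written as a single expression for comparison.
    return (x * y + math.sin(x) + 4.0) * (3.0 * y**2 + 6.0)

print(f_code_list(1.0, 2.0), f_direct(1.0, 2.0))
```

Both functions perform the same arithmetic; the code list only makes the intermediate values explicit, which is what the differentiation rules are later applied to.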
![Page 8: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/8.jpg)
obtain derivatives
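- in the case of a function $f = f(x_1, \dots, x_m)$ of several variables, the first partial derivatives can be expressed compactly as the gradient vector

$$\nabla f = \left[\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_m}\right] \tag{6}$$

if u and v are functions whose gradients ∇u and ∇v are known or have been previously computed, we compute ∇f using the rules

$$\begin{aligned}
\nabla(u \pm v) &= \nabla u \pm \nabla v, \\
\nabla(uv) &= u \nabla v + v \nabla u, \\
\nabla(u/v) &= (\nabla u - (u/v) \nabla v)/v, \quad v \neq 0,
\end{aligned}$$

for the arithmetic operations, and the chain rule

$$\nabla \varphi(u) = \varphi'(u) \nabla u. \tag{7}$$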
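One common way to apply these rules mechanically is operator overloading: each quantity carries its value together with its gradient. The following is a minimal sketch under that assumption, not the implementation used by the author. The class name `GradVar` and the helper `sin` are hypothetical, and only the four arithmetic rules and the chain rule (7) for the sine are covered.

```python
import math

class GradVar:
    """A value paired with its gradient (hypothetical helper for illustration)."""

    def __init__(self, value, grad):
        self.value = float(value)
        self.grad = [float(g) for g in grad]

    def _lift(self, other):
        # Plain numbers are constants: their gradient is zero.
        if isinstance(other, GradVar):
            return other
        return GradVar(other, [0.0] * len(self.grad))

    def __add__(self, other):
        o = self._lift(other)
        return GradVar(self.value + o.value,
                       [a + b for a, b in zip(self.grad, o.grad)])
    __radd__ = __add__

    def __sub__(self, other):
        o = self._lift(other)
        return GradVar(self.value - o.value,
                       [a - b for a, b in zip(self.grad, o.grad)])

    def __mul__(self, other):
        # grad(uv) = u grad(v) + v grad(u)
        o = self._lift(other)
        return GradVar(self.value * o.value,
                       [self.value * b + o.value * a
                        for a, b in zip(self.grad, o.grad)])
    __rmul__ = __mul__

    def __truediv__(self, other):
        # grad(u/v) = (grad(u) - (u/v) grad(v)) / v, requires v != 0
        o = self._lift(other)
        q = self.value / o.value
        return GradVar(q, [(a - q * b) / o.value
                           for a, b in zip(self.grad, o.grad)])

def sin(u):
    # Chain rule (7) with phi = sin and phi' = cos.
    return GradVar(math.sin(u.value), [math.cos(u.value) * g for g in u.grad])

# Independent variables x and y at an arbitrary point (assumption).
x = GradVar(1.0, [1.0, 0.0])
y = GradVar(2.0, [0.0, 1.0])
f = (x * y + sin(x) + 4) * (3 * y * y + 6)
q = x / y
print(f.value, f.grad)
print(q.value, q.grad)
```

The expression for `f` is written exactly like formula (1); the gradient is produced as a side effect of evaluating it.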
![Page 9: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/9.jpg)
derivative of the code list
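the code list (5) can be augmented with the gradients of each entry:

$$\begin{aligned}
t_1 &= x, & \nabla t_1 &= [1, 0], \\
t_2 &= y, & \nabla t_2 &= [0, 1], \\
t_3 &= t_1 t_2, & \nabla t_3 &= t_1 \nabla t_2 + t_2 \nabla t_1 \to [t_2, t_1], \\
t_4 &= \sin t_1, & \nabla t_4 &= (\cos t_1) \nabla t_1 \to [\cos t_1, 0], \\
t_5 &= t_3 + t_4, & \nabla t_5 &= \nabla t_3 + \nabla t_4 \to [t_2 + \cos t_1, t_1], \\
t_6 &= t_5 + 4, & \nabla t_6 &= \nabla t_5 \to [t_2 + \cos t_1, t_1], \\
t_7 &= t_2^2, & \nabla t_7 &= 2 t_2 \nabla t_2 \to [0, 2 t_2], \\
t_8 &= 3 t_7, & \nabla t_8 &= 3 \nabla t_7 \to [0, 6 t_2], \\
t_9 &= t_8 + 6, & \nabla t_9 &= \nabla t_8 \to [0, 6 t_2], \\
t_{10} &= t_6 t_9, & \nabla t_{10} &= t_6 \nabla t_9 + t_9 \nabla t_6 \to [t_9 (t_2 + \cos t_1), 6 t_2 t_6 + t_1 t_9].
\end{aligned}$$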
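The augmented code list can be executed directly, carrying a numeric gradient next to each intermediate value. The sketch below is an illustration under that reading of the table; the function name `f_and_grad` and the tuple representation of gradients are assumptions.

```python
import math

def f_and_grad(x, y):
    # Each pair (t_k, g_k) mirrors one row of the augmented code list.
    t1, g1 = x, (1.0, 0.0)
    t2, g2 = y, (0.0, 1.0)
    t3 = t1 * t2
    g3 = (t1 * g2[0] + t2 * g1[0], t1 * g2[1] + t2 * g1[1])
    t4 = math.sin(t1)
    g4 = (math.cos(t1) * g1[0], math.cos(t1) * g1[1])
    t5 = t3 + t4
    g5 = (g3[0] + g4[0], g3[1] + g4[1])
    t6 = t5 + 4.0
    g6 = g5                      # adding a constant leaves the gradient unchanged
    t7 = t2 ** 2
    g7 = (2.0 * t2 * g2[0], 2.0 * t2 * g2[1])
    t8 = 3.0 * t7
    g8 = (3.0 * g7[0], 3.0 * g7[1])
    t9 = t8 + 6.0
    g9 = g8
    t10 = t6 * t9
    g10 = (t6 * g9[0] + t9 * g6[0], t6 * g9[1] + t9 * g6[1])
    return t10, g10

value, grad = f_and_grad(1.0, 2.0)
print(value, grad)
```

No formula for the derivative is ever built: only numbers flow through the list, yet the result agrees with the symbolic formulas (2) and (3) up to roundoff.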
![Page 10: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/10.jpg)
final results
- the final results are $t_{10} = f(x, y)$ and its gradient $\nabla t_{10} = \nabla f(x, y) = [t_9 (t_2 + \cos t_1), 6 t_2 t_6 + t_1 t_9]$
- count of operations: 22 = 2 + 10m (here m = 2)
![Page 11: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/11.jpg)
2nd order derivative
- in the preceding section, we computed first derivatives
- once a code list representation of a function has been obtained, one can also apply rules for higher derivatives or recurrence relations for Taylor coefficients
- the second partial derivatives of a function $f : \mathbb{R}^m \to \mathbb{R}$ constitute its Hessian matrix

$$H(f) = \left[\frac{\partial^2 f}{\partial x_i \partial x_j}\right]_{i,j=1,\dots,m} \tag{8}$$

- required for optimization algorithms that use Newton's method
![Page 12: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/12.jpg)
Rules for arithmetic operations
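- the rules for results of arithmetic operations are

$$\begin{aligned}
H(u \pm v) &= H(u) \pm H(v), \\
H(uv) &= u H(v) + \nabla u^T \nabla v + \nabla v^T \nabla u + v H(u), \\
H(u/v) &= \left(H(u) - \nabla(u/v)^T \nabla v - \nabla v^T \nabla(u/v) - (u/v) H(v)\right)/v, \quad v \neq 0,
\end{aligned}$$

and the chain rule takes the form

$$H(\varphi(u)) = \varphi''(u) \nabla u^T \nabla u + \varphi'(u) H(u)$$

for twice differentiable functions φ, such as the standard functions.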
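To see the Hessian rules in action, the sketch below propagates a triple (value, gradient, Hessian) through the example function (1). It is an illustration only: the class `H2` and the helpers `variable`, `const`, `add`, `mul`, and `sin` are hypothetical names, gradients are treated as row vectors so that $\nabla u^T \nabla v$ is an outer product, and the division rule is left out for brevity.

```python
import math

def outer(a, b):
    # Outer product a^T b of two row vectors.
    return [[ai * bj for bj in b] for ai in a]

def mat_add(*mats):
    # Entrywise sum of equally sized matrices.
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def mat_scale(c, A):
    return [[c * a for a in row] for row in A]

class H2:
    """Value, gradient and Hessian of one intermediate quantity."""
    def __init__(self, val, grad, hess):
        self.val, self.grad, self.hess = val, grad, hess

def variable(value, index, m):
    # Independent variable x_index: unit gradient, zero Hessian.
    grad = [1.0 if i == index else 0.0 for i in range(m)]
    return H2(float(value), grad, [[0.0] * m for _ in range(m)])

def const(c, m):
    return H2(float(c), [0.0] * m, [[0.0] * m for _ in range(m)])

def add(u, v):
    return H2(u.val + v.val,
              [a + b for a, b in zip(u.grad, v.grad)],
              mat_add(u.hess, v.hess))

def mul(u, v):
    # H(uv) = u H(v) + grad(u)^T grad(v) + grad(v)^T grad(u) + v H(u)
    grad = [u.val * b + v.val * a for a, b in zip(u.grad, v.grad)]
    hess = mat_add(mat_scale(u.val, v.hess), outer(u.grad, v.grad),
                   outer(v.grad, u.grad), mat_scale(v.val, u.hess))
    return H2(u.val * v.val, grad, hess)

def sin(u):
    # Chain rule with phi = sin, phi' = cos, phi'' = -sin.
    d1, d2 = math.cos(u.val), -math.sin(u.val)
    hess = mat_add(mat_scale(d2, outer(u.grad, u.grad)), mat_scale(d1, u.hess))
    return H2(math.sin(u.val), [d1 * g for g in u.grad], hess)

m = 2
x = variable(1.0, 0, m)   # arbitrary evaluation point (assumption)
y = variable(2.0, 1, m)
f = mul(add(add(mul(x, y), sin(x)), const(4, m)),
        add(mul(const(3, m), mul(y, y)), const(6, m)))
print(f.hess)
```

Differentiating (2) and (3) by hand gives $f_{xx} = -(3y^2 + 6)\sin x$, $f_{xy} = 9y^2 + 6 + 6y\cos x$, and $f_{yy} = 18xy + 6\sin x + 24$, which the propagated Hessian should match up to roundoff.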
![Page 13: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/13.jpg)
The Taylor Series
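$$\begin{aligned}
f(x) &= f(x_0) + \frac{1}{1!} f'(x_0) \cdot (x - x_0) + \frac{1}{2!} f''(x_0) (x - x_0)^2 + \dots + \frac{1}{n!} f^{(n)}(x_0) \cdot (x - x_0)^n + \dots \\
&= \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(x_0) \cdot (x - x_0)^n.
\end{aligned} \tag{9}$$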
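As a small numerical illustration of (9), the sketch below sums the first terms of the series for the exponential function, chosen here (as an assumption, not taken from the slides) because every derivative satisfies $f^{(n)}(x_0) = e^{x_0}$. The function name `taylor_exp` is hypothetical.

```python
import math

def taylor_exp(x, x0, n_terms):
    # Partial sum of series (9) for f = exp, where f^(n)(x0) = exp(x0) for all n.
    total = 0.0
    for n in range(n_terms):
        total += math.exp(x0) * (x - x0) ** n / math.factorial(n)
    return total

x0, x = 1.0, 1.5  # expansion point and evaluation point (assumptions)
for n_terms in (2, 5, 15):
    approx = taylor_exp(x, x0, n_terms)
    print(n_terms, approx, abs(approx - math.exp(x)))
```

The error shrinks rapidly as more terms are included, which is why a modest number of Taylor coefficients is often sufficient close to the expansion point.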
![Page 14: Automatic Differentiation](https://reader035.vdocuments.mx/reader035/viewer/2022073120/563db827550346aa9a9108a7/html5/thumbnails/14.jpg)
Taylor coefficients
- Taylor coefficients are scalars
- suppose that f is a function of m variables
- series expansion at the point $x_0 = (x_{01}, \dots, x_{0m})$
- in the direction $h = (h_1, \dots, h_m)$

$$f(x_0 + h) = \sum_{k=0}^{\infty} \frac{1}{k!} f^{(k)}(x_0) h^k = \sum_{k=0}^{\infty} f_k, \tag{10}$$

where $f_k = f^{(k)}(x_0) h^k / k!$, $k = 0, 1, \dots$, denote the normalized Taylor coefficients.
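Normalized Taylor coefficients can be propagated through a code list much like gradients. The sketch below uses the standard recurrence for products, $(uv)_k = \sum_{j=0}^{k} u_j v_{k-j}$, which is not stated on the slides. To keep it short it works on a simplified, hypothetical variant of the example, $g(x, y) = (xy + 4)(3y^2 + 6)$, which drops the $\sin x$ term so that only addition and multiplication are needed. All helper names, the point, and the direction are assumptions.

```python
def t_add(u, v):
    # Coefficients of a sum are the sums of the coefficients.
    return [a + b for a, b in zip(u, v)]

def t_mul(u, v):
    # Truncated Cauchy product: (uv)_k = sum_{j=0..k} u_j * v_{k-j}.
    n = len(u)
    return [sum(u[j] * v[k - j] for j in range(k + 1)) for k in range(n)]

def t_const(c, n):
    # A constant has only a zeroth coefficient.
    return [float(c)] + [0.0] * (n - 1)

def t_var(x0, h, n):
    # The line x0 + t*h has coefficients [x0, h, 0, ...].
    return [float(x0), float(h)] + [0.0] * (n - 2)

def g(x, y):
    # Simplified variant of (1) without the sine term (assumption).
    return (x * y + 4.0) * (3.0 * y**2 + 6.0)

n = 5                      # g is a polynomial of total degree 4, so 5 coefficients suffice
x0, y0 = 1.0, 2.0          # expansion point (assumption)
h1, h2 = 0.5, -0.25        # direction h (assumption)

x = t_var(x0, h1, n)
y = t_var(y0, h2, n)
coeffs = t_mul(t_add(t_mul(x, y), t_const(4, n)),
               t_add(t_mul(t_const(3, n), t_mul(y, y)), t_const(6, n)))

print(coeffs)
print(sum(coeffs), g(x0 + h1, y0 + h2))
```

Here `coeffs[k]` plays the role of $f_k$ in (10): `coeffs[0]` is the function value at the expansion point, `coeffs[1]` is the directional derivative along h, and because g is a polynomial the sum of all five coefficients reproduces $g(x_0 + h)$ up to roundoff.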