1
A Brief Overview of Methods for Computing Derivatives
Wenbin Yu
Department of Mechanical & Aerospace Engineering
Utah State University, Logan, UT
2
Finite Difference vs Complex Step
Forward finite difference
$$f'(x) = \frac{f(x+h) - f(x)}{h} + O(h)$$
Advantages: easy to use; no need to access the source codes; no need to understand the equations or the code
Disadvantages
• Step-size dilemma: h must be small enough to avoid truncation error, yet big enough to avoid subtractive cancellation error
• Expensive: always n+1 times the analysis time for n perturbations
Complex step approximation
Better than finite difference if implemented correctly
Complex variable: $z = x + \mathrm{I}y$
Complex function: $f(z) = u(z) + \mathrm{I}v(z) = u(x,y) + \mathrm{I}v(x,y)$
3
Finite Difference vs Complex Step
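Complex step approximation (cont.)
If f is analytic, the Cauchy-Riemann equation holds:
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = \lim_{h \to 0} \frac{v(x + \mathrm{I}(y+h)) - v(x + \mathrm{I}y)}{h}$$
We deal with real functions of real variables:
$$y = 0, \quad v(x) = 0, \quad f(x) = u(x)$$
$$\frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} = \lim_{h \to 0} \frac{v(x + \mathrm{I}h)}{h} = \lim_{h \to 0} \frac{\operatorname{Im}[f(x + \mathrm{I}h)]}{h}$$
Taylor series expansion:
$$f(x + \mathrm{I}h) = f(x) + \mathrm{I}h f'(x) - \frac{h^2}{2!} f''(x) - \mathrm{I}\frac{h^3}{3!} f'''(x) + \cdots$$
$$f'(x) = \frac{\operatorname{Im}[f(x + \mathrm{I}h)]}{h} + \frac{h^2}{3!} f'''(x) + \cdots = \frac{\operatorname{Im}[f(x + \mathrm{I}h)]}{h} + O(h^2)$$
Not explicitly subject to subtractive cancellation errors, and the truncation error can be made as small as desired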
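The difference between the two approximations can be seen in a small experiment. The following is a minimal sketch, not taken from the slides: the test function exp(x)*sin(x), the point x = 1.5, and the step sizes are all assumptions chosen for illustration.

```fortran
program complex_step_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: x = 1.5_dp        ! evaluation point (assumed)
  real(dp) :: h, exact, fd, cs
  integer :: k

  exact = exp(x) * (sin(x) + cos(x))       ! analytic derivative of exp(x)*sin(x)
  write(*,'(a)') '       h            FD error        CS error'
  do k = 2, 20, 6
     h = 10.0_dp**(-k)
     ! forward finite difference: (f(x+h) - f(x)) / h
     fd = (f_real(x + h) - f_real(x)) / h
     ! complex step: Im[f(x + I h)] / h, no subtraction of nearby values
     cs = aimag(f_cmplx(cmplx(x, h, kind=dp))) / h
     write(*,'(3es16.6)') h, abs(fd - exact), abs(cs - exact)
  end do

contains

  function f_real(t) result(r)
    real(dp), intent(in) :: t
    real(dp) :: r
    r = exp(t) * sin(t)
  end function f_real

  function f_cmplx(z) result(r)
    complex(dp), intent(in) :: z
    complex(dp) :: r
    r = exp(z) * sin(z)
  end function f_cmplx

end program complex_step_demo
```

As h shrinks, the finite-difference error should first fall and then grow again once x+h and x become indistinguishable in double precision, while the complex-step error should keep falling until it reaches round-off level.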
4
Dual Number Automatic Differentiation (DNAD)
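Extend all real numbers by adding a second component:
$$x_1 \rightarrow x_1 + x_1'\,d, \quad \text{or} \quad \langle x_1, x_1' \rangle$$
d is just a symbol, analogous to the imaginary unit, but all powers of d higher than one are equal to zero ($d^2 = 0$)
Example: $f(x_1, x_2) = x_1 x_2 + \sin(x_1)$
$$f(x_1 + x_1' d,\; x_2 + x_2' d) = (x_1 + x_1' d)(x_2 + x_2' d) + \sin(x_1 + x_1' d)$$
$$= x_1 x_2 + x_1 x_2' d + x_2 x_1' d + \sin(x_1) + \cos(x_1)\, x_1' d$$
$$= x_1 x_2 + \sin(x_1) + x_1 x_2' d + (x_2 + \cos(x_1))\, x_1' d$$
$$= f(x_1, x_2) + \left( \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2' \right) d$$
$$= \left\langle f(x_1, x_2),\; \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2' \right\rangle$$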
5
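Dual Number Automatic Differentiation (DNAD)
Dual-number arithmetic
$$\langle u, u' \rangle + \langle v, v' \rangle = \langle u + v,\; u' + v' \rangle, \quad \langle u, u' \rangle - \langle v, v' \rangle = \langle u - v,\; u' - v' \rangle$$
$$\langle u, u' \rangle * \langle v, v' \rangle = \langle uv,\; u'v + uv' \rangle, \quad \langle u, u' \rangle / \langle v, v' \rangle = \langle u/v,\; (u'v - uv')/v^2 \rangle$$
$$\sin\langle u, u' \rangle = \langle \sin u,\; u' \cos u \rangle, \quad \exp\langle u, u' \rangle = \langle \exp u,\; u' \exp u \rangle$$
$$\log\langle u, u' \rangle = \langle \log u,\; u'/u \rangle, \quad \langle u, u' \rangle^k = \langle u^k,\; k u^{k-1} u' \rangle$$
Complex step arithmetic
$$(u + \mathrm{I}u') * (v + \mathrm{I}v') = (uv - u'v') + \mathrm{I}(u'v + uv')$$
$$(u + \mathrm{I}u') / (v + \mathrm{I}v') = \frac{uv + u'v'}{v^2 + v'^2} + \mathrm{I}\,\frac{u'v - uv'}{v^2 + v'^2}$$
$$\log(u + \mathrm{I}u') = \frac{1}{2}\log(u^2 + u'^2) + \mathrm{I}\arg(u + \mathrm{I}u')$$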
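The dual-number rules for +, * and sin can be turned into code through operator overloading. The following is a minimal sketch under stated assumptions, not the DNAD module itself: the module name dual_sketch, the type name dual, and its components x and xp are hypothetical, and only three operations are overloaded. It evaluates the earlier example f = x1*x2 + sin(x1).

```fortran
module dual_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  type :: dual
     real(dp) :: x    ! function value u
     real(dp) :: xp   ! derivative u'
  end type dual

  interface operator(+)
     module procedure add_dd
  end interface
  interface operator(*)
     module procedure mul_dd
  end interface
  interface sin            ! extends the intrinsic SIN to the new type
     module procedure sin_d
  end interface

contains

  elemental function add_dd(u, v) result(res)
    type(dual), intent(in) :: u, v
    type(dual) :: res
    res = dual(u%x + v%x, u%xp + v%xp)               ! <u+v, u'+v'>
  end function add_dd

  elemental function mul_dd(u, v) result(res)
    type(dual), intent(in) :: u, v
    type(dual) :: res
    res = dual(u%x * v%x, u%xp * v%x + u%x * v%xp)   ! <uv, u'v+uv'>
  end function mul_dd

  elemental function sin_d(u) result(res)
    type(dual), intent(in) :: u
    type(dual) :: res
    res = dual(sin(u%x), u%xp * cos(u%x))            ! <sin u, u' cos u>
  end function sin_d

end module dual_sketch

program dual_demo
  use dual_sketch
  implicit none
  type(dual) :: x1, x2, f
  x1 = dual(1.0_dp, 1.0_dp)   ! seed 1: differentiate with respect to x1
  x2 = dual(2.0_dp, 0.0_dp)   ! seed 0: x2 is held fixed
  f = x1 * x2 + sin(x1)
  write(*,*) 'f      =', f%x    ! value of x1*x2 + sin(x1)
  write(*,*) 'df/dx1 =', f%xp   ! should equal x2 + cos(x1)
end program dual_demo
```

Swapping the seeds (0 for x1, 1 for x2) would return the derivative with respect to x2 without touching the expression for f.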
6
Dual Number Automatic Differentiation (cont.)
Comparing DNAD and complex step
• DNAD is more efficient, as its calculations are never more and mostly fewer (fewer for *, /, and most intrinsic functions)
• DNAD is more accurate, as it delivers the analytical derivatives up to machine precision, while complex step is accurate only for extremely small imaginary parts; cancellation and subtraction errors can still occur for some functions
• Complex step only has an implementation and compiler-optimization advantage for codes in languages supporting complex algebra (Fortran), while DNAD as a concept can be used for codes written in any strongly typed language with real numbers defined
• Complex step is not applicable to codes that already use complex operations, and it can only compute sensitivities with respect to one variable
• Complex step changes the calculation of the original analysis and the program flow, e.g. IF(ABS(x)>0) THEN … ELSE … ENDIF
• Complex step is hard to debug, as many complex operations are not defined the way you need, e.g. the derivative of Exp(|x|) at x=-3
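The Exp(|x|) case shows how an intrinsic complex operation can silently discard the derivative. The sketch below is an illustration under assumptions, not the implementation used for the slides: the step size h = 1e-20 and the helper cs_abs are hypothetical. It relies only on the fact that the intrinsic ABS of a complex argument returns the real modulus.

```fortran
program abs_complex_step
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: h = 1.0e-20_dp     ! complex step size (assumed)
  complex(dp) :: z
  real(dp) :: naive, fixed

  z = cmplx(-3.0_dp, h, kind=dp)

  ! Intrinsic ABS of a complex number is the real modulus sqrt(x**2 + h**2),
  ! so the imaginary part that carries the derivative is lost.
  naive = aimag(exp(cmplx(abs(z), 0.0_dp, kind=dp))) / h

  ! Hypothetical replacement: flip the sign of the whole complex number
  ! according to the sign of its real part, which keeps the imaginary part.
  fixed = aimag(exp(cs_abs(z))) / h

  write(*,*) 'intrinsic ABS :', naive
  write(*,*) 'sign-aware ABS:', fixed
  write(*,*) 'exact -exp(3) :', -exp(3.0_dp)

contains

  function cs_abs(w) result(r)
    complex(dp), intent(in) :: w
    complex(dp) :: r
    if (real(w) < 0.0_dp) then
       r = -w
    else
       r = w
    end if
  end function cs_abs

end program abs_complex_step
```

With the intrinsic ABS the estimate collapses to zero, because the argument of EXP is purely real. Every such operation in an analysis code has to be found and redefined before complex step gives usable sensitivities.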
7
Performance Comparison
Efficiency Comparison
x=1.0; y=2.0; z=3.0
ftot=0.0d0
DO i=1,500000000
   f=x*y-x*sin(y)*log(z)
   ftot= (ftot- f)/exp(z)
ENDDO
write(*,*) ftot
Time (seconds) used by different methods
# of Design Variables   1        3        9         15        16
Finite Difference       1.64*2   1.64*4   1.64*10   1.64*16   1.64*17
Complex Step            3.94     3.94*4   3.94*9    3.94*15   3.94*16
DNAD                    2.11     2.67     14.98     22.16     25.56
Accuracy Comparison
Derivative of Exp(|z|) at z=-3
Complex step: -20.0855362857837
DNAD:         -20.0855369231877
Exact:        -20.0855369231877
Both complex step and DNAD are implemented in F90/95
8
Implementation Using F90/95
A general-purpose F90/95 module for automatic differentiation of any Fortran codes including Fortran 77/90/95
Define a new data type DUAL_NUM
TYPE,PUBLIC:: DUAL_NUM
REAL(DBL_AD)::x_ad_
REAL(DBL_AD)::xp_ad_
END TYPE DUAL_NUM
Overload functions/operations needed in the analysis codes to this new data type: relational operators, arithmetic operators/functions
Change xp_ad_ to “xp_ad_(n)”, with n as the # of DVs, for sensitivities wrt multiple DVs
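Extending the derivative component to an array lets one analysis run carry the sensitivities with respect to several design variables at once. The following is a minimal sketch of that idea, assuming a hypothetical type dual_n with components x and xp(ndv) and only the product overloaded; the names and the two-variable rectangle-area example are illustrative and not part of DNAD.

```fortran
module dual_multi_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ndv = 2   ! number of design variables (assumed here)

  type :: dual_n
     real(dp) :: x          ! function value
     real(dp) :: xp(ndv)    ! one derivative per design variable
  end type dual_n

  interface operator(*)
     module procedure mul_nn
  end interface

contains

  elemental function mul_nn(u, v) result(res)
    type(dual_n), intent(in) :: u, v
    type(dual_n) :: res
    res%x  = u%x * v%x
    res%xp = u%xp * v%x + u%x * v%xp   ! product rule applied to every component
  end function mul_nn

end module dual_multi_sketch

program multi_dv_demo
  use dual_multi_sketch
  implicit none
  type(dual_n) :: b, h, area
  b = dual_n(0.1_dp, (/ 1.0_dp, 0.0_dp /))   ! seed: b is design variable 1
  h = dual_n(0.2_dp, (/ 0.0_dp, 1.0_dp /))   ! seed: h is design variable 2
  area = b * h
  write(*,*) 'area            =', area%x
  write(*,*) 'd(area)/d(b, h) =', area%xp    ! should equal (h, b)
end program multi_dv_demo
```

The cost of each overloaded operation grows with ndv, which is consistent with the DNAD timings rising with the number of design variables.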
9
Implementation Using F90/95 (cont.)
Define EXP: $\exp\langle u, u' \rangle = \langle \exp u,\; u' \exp u \rangle$
INTERFACE EXP
   MODULE PROCEDURE EXP_D
END INTERFACE
ELEMENTAL FUNCTION EXP_D(u) RESULT(res)
   TYPE (DUAL_NUM), INTENT(IN)::u
   REAL(DBL_AD)::tmp
   TYPE (DUAL_NUM)::res
   tmp=EXP(u%x_ad_)
   res%x_ad_ = tmp
   res%xp_ad_ =u%xp_ad_* tmp
END FUNCTION EXP_D
10
How to Use DNAD
To AD a Fortran code using DNAD:
1. Replace all the definitions of real numbers with dual numbers
REAL(8) :: x → TYPE(DUAL_NUM) :: x
REAL(8), PARAMETER:: ONE=1.0D0 → TYPE(DUAL_NUM),PARAMETER::ONE=DUAL_NUM(1.0D0,0.D0)
2. Insert “Use DNAD” right after Module/Function/Subroutine/Program statements.
3. Change IO commands correspondingly if the code does not use free-format read and write (can be automated by writing some general-purpose utility subroutines)
4. Recompile the source along with DNAD.o
5. The whole process can be automated, and even manually it takes only a few minutes for most real analysis codes, although step 3 is code dependent
11
How to Use DNAD (cont.)
To use the sensitivity capability
• Insert 0 after all real inputs not affected by the design variable
• Insert 1 after the real input if it directly represents the design variable
• Insert the corresponding sensitivities calculated by other codes if the real inputs are affected indirectly by the design variable, such as the sensitivity of nodal coordinates due to a change of geometry
The sensitivities are reported in the outputs as the number following the function value
Designers only need to manipulate inputs/outputs of the code
12
DNAD Example
Original code:
PROGRAM CircleArea
   REAL(8),PARAMETER:: PI=3.141592653589793D0
   REAL(8):: radius, area
   READ(*,*) radius
   Area=PI*radius**2
   WRITE(*,*) "AREA=", Area
END PROGRAM CircleArea
Input: 5
AREA=78.5398163397448
Code differentiated with DNAD:
PROGRAM CircleArea
   USE DNAD
   TYPE (DUAL_NUM),PARAMETER:: PI=DUAL_NUM(3.141592653589793D0,0.D0)
   TYPE (DUAL_NUM):: radius,area
   READ(*,*) radius
   Area=PI*radius**2
   WRITE(*,*) "AREA=",Area
END PROGRAM CircleArea
Input: 5,1
AREA=78.5398163397448, 31.4159265358979
13
Example (VABS-AD)
• VABS: 10,000+ lines of F90/95 codes
• An isotropic rectangular section meshed with two 8-noded quads
$$E = 2.6\ \text{GPa}, \quad \nu = 0.3, \quad b = 0.1\ \text{m}, \quad h = 0.2\ \text{m}$$
[Figure: rectangular section of width b and height h in the x2-x3 plane, meshed with Element 1 and Element 2, nodes 1 to 13]
Exact values:
$$EA = Ebh = 5200 \times 10^4\ \text{N}$$
$$EI_{22} = Ebh^3/12 = 17.333 \times 10^4\ \text{N.m}^2$$
$$EI_{33} = Ehb^3/12 = 4.333 \times 10^4\ \text{N.m}^2$$
$$GJ = 0.229\,Gb^3h = 4.58 \times 10^4\ \text{N.m}^2$$
VABS-AD results:
             Value (×10^4)   Sens. (E) (×10^-5)   Sens. (b) (×10^6)   Sens. (h) (×10^5)
EA           5200            2000                 520.0               2600
EI22         17.333          6.667                1.733               26.00
EI33         4.333           1.667                1.300               2.167
GJ           4.833           1.859                1.247               3.433
GJ (exact)   4.58            1.762                1.374               2.29
The loss of accuracy in GJ due to the coarse mesh is the same for the value and its sensitivities; this can be verified with the sensitivity wrt E, which equals GJ/E because GJ is linear in E
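The tabulated sensitivities can be checked by hand against the closed-form stiffness. As a worked example, assuming $EI_{22} = Ebh^3/12$ with E = 2.6 GPa, b = 0.1 m and h = 0.2 m as given for this section:

$$\frac{\partial EI_{22}}{\partial E} = \frac{bh^3}{12} = 6.667 \times 10^{-5}, \quad \frac{\partial EI_{22}}{\partial b} = \frac{Eh^3}{12} = 1.733 \times 10^{6}, \quad \frac{\partial EI_{22}}{\partial h} = \frac{Ebh^2}{4} = 26.00 \times 10^{5}$$

These agree with the EI22 row of the results. The same check for GJ gives $GJ/E = 4.833 \times 10^4 / 2.6 \times 10^9 = 1.859 \times 10^{-5}$, which matches the reported sensitivity wrt E even though the GJ value itself differs from the exact one.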
14
Example (VABS-AD)
Changes to the inputs
• Sensitivity wrt E:
Change 0.26E+10 .300000000E+00
To 0.26E+10 1. .300000000E+00 0.
• Sensitivity wrt b:
1 -0.0500000 -0.5 -0.1000000 0.0
2 0.0500000 0.5 -0.1000000 0.0
3 0.0000000 0.0 -0.1000000 0.0
4 0.0500000 0.5 0.1000000 0.0
5 0.0500000 0.5 -0.0500000 0.0
6 0.0500000 0.5 0.0000000 0.0
7 0.0500000 0.5 0.0500000 0.0
8 -0.0500000 -0.5 0.1000000 0.0
9 0.0000000 0.0 0.1000000 0.0
10 -0.0500000 -0.5 0.0500000 0.0
11 -0.0500000 -0.5 0.0000000 0.0
12 -0.0500000 -0.5 -0.0500000 0.0
13 0.0000000 0.0 0.0000000 0.0
• Sensitivity wrt h:
1 -0.0500000 0.0 -0.1000000 -0.5
2 0.0500000 0.0 -0.1000000 -0.5
3 0.0000000 0.0 -0.1000000 -0.5
4 0.0500000 0.0 0.1000000 0.5
5 0.0500000 0.0 -0.0500000 -0.25
6 0.0500000 0.0 0.0000000 0.0
7 0.0500000 0.0 0.0500000 0.25
8 -0.0500000 0.0 0.1000000 0.5
9 0.0000000 0.0 0.1000000 0.5
10 -0.0500000 0.0 0.0500000 0.25
11 -0.0500000 0.0 0.0000000 0.
12 -0.0500000 0.0 -0.0500000 -0.25
13 0.0000000 0.0 0.0000000 0.0
For more complex geometry and mesh, such inputs should be prepared by a mesh generator: the so-called geometry sensitivity. Note that the # of design variables is not the same as the # of seeds for inputs
15
Example (GEBT-AD)
GEBT:
• 5000 lines of Fortran 90/95 codes
• 20,000 lines of Fortran 77 codes
• Includes BLAS, MA28 sparse linear solver, LAPACK, ARPACK sparse eigensolver
• Sensitivity with respect to L
[Figure: beam along a1 with a tip force F3 in the a3 direction]
$$L = 1\ \text{m}, \quad F_3 = 5 \times 10^5\ \text{N}$$
$$U_{3\max} = \frac{F_3 L^3}{3 EI_{22}}, \quad \theta_{\max} = -\frac{F_3 L^2}{2 EI_{22}}$$
Responses and sensitivities by different analyses
                   U3max                  θmax
                   Value    Sensitivity   Value     Sensitivity
Exact (linear)     0.9615   2.8846        -1.4423   -2.8846
GEBT (linear)      0.9615   2.8845        -1.4423   -2.8846
GEBT (nonlinear)   0.5989   1.1157        -1.0540   -1.2665
GEBT (follower)    0.7114   1.3619        -1.5995   -3.8806
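The exact (linear) row can be checked by differentiating the beam formulas with respect to L. This is a worked example under assumptions: it takes the linear tip responses to be $U_{3\max} = F_3 L^3 / (3EI_{22})$ and $\theta_{\max} = -F_3 L^2 / (2EI_{22})$ with L = 1 m, which is what the tabulated ratios suggest.

$$\frac{dU_{3\max}}{dL} = \frac{F_3 L^2}{EI_{22}} = \frac{3\,U_{3\max}}{L}, \qquad \frac{d\theta_{\max}}{dL} = -\frac{F_3 L}{EI_{22}} = \frac{2\,\theta_{\max}}{L}$$

With L = 1 m these give roughly 3 × 0.9615 ≈ 2.885 and 2 × (-1.4423) ≈ -2.885, consistent with the reported sensitivities to the digits shown. The nonlinear and follower-force rows have no such closed form, which is where differentiating the code itself becomes useful.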
16
Analytic Method
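If we know the equations, there are even more efficient methods
A linear system: $K(x)\,q(x) = F(x)$
Unknowns: q; design parameter: x; objective function: $\psi(q, x)$
Derivative of the objective:
$$\frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q}\frac{dq}{dx}$$
Direct method
$$K(x)\,q'(x) + K'(x)\,q(x) = F'(x) \;\Rightarrow\; q'(x) = K^{-1}(x)\,[F'(x) - K'(x)\,q(x)]$$
$$\frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q}\,K^{-1}\,[F'(x) - K'(x)\,q(x)]$$
Adjoint method
$$K^{T}\lambda = \left(\frac{\partial \psi}{\partial q}\right)^{T}, \quad \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \lambda^{T}\,[F'(x) - K'(x)\,q(x)]$$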
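The direct and adjoint routes differ only in which linear solve is done first, and both should return the same derivative. The following is a minimal sketch on a made-up 2x2 system: the matrices, the load vector, and the linear objective psi = c.q (so the explicit partial derivative of psi with respect to x is zero) are all assumptions for illustration, and solve2 is a hypothetical helper.

```fortran
program direct_vs_adjoint
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), parameter :: x = 0.5_dp          ! design parameter value (assumed)
  real(dp) :: K(2,2), dK(2,2), F(2), dF(2), c(2)
  real(dp) :: q(2), rhs(2), dq(2), lam(2)

  ! Hypothetical system K(x) q = F(x) with objective psi = c . q
  K  = reshape((/ 2.0_dp + x, -1.0_dp, -1.0_dp, 2.0_dp /), (/ 2, 2 /))
  dK = reshape((/ 1.0_dp, 0.0_dp, 0.0_dp, 0.0_dp /), (/ 2, 2 /))   ! dK/dx
  F  = (/ 1.0_dp, x /)
  dF = (/ 0.0_dp, 1.0_dp /)                                        ! dF/dx
  c  = (/ 1.0_dp, 3.0_dp /)                                        ! d(psi)/dq

  q   = solve2(K, F)
  rhs = dF - matmul(dK, q)                   ! F'(x) - K'(x) q(x)

  ! Direct method: one solve per design variable, then project onto c
  dq = solve2(K, rhs)
  write(*,*) 'direct  d(psi)/dx =', dot_product(c, dq)

  ! Adjoint method: one solve per objective, with the transposed matrix
  lam = solve2(transpose(K), c)
  write(*,*) 'adjoint d(psi)/dx =', dot_product(lam, rhs)

contains

  function solve2(A, b) result(s)
    ! Solve a 2x2 linear system by Cramer's rule
    real(dp), intent(in) :: A(2,2), b(2)
    real(dp) :: s(2), det
    det  = A(1,1)*A(2,2) - A(1,2)*A(2,1)
    s(1) = (b(1)*A(2,2) - A(1,2)*b(2)) / det
    s(2) = (A(1,1)*b(2) - b(1)*A(2,1)) / det
  end function solve2

end program direct_vs_adjoint
```

With many design variables and one objective, the adjoint route needs a single extra solve, while the direct route needs one solve per design variable; with many objectives and few design variables the balance reverses.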
17
Recommendation
• Neither equations nor source codes are available: finite difference (FD) method, step-size dilemma
• Source codes are available: computational/algorithmic/automatic differentiation (AD) (apply the chain rule to each operation in the program flow)
  • Forward (direct): SRT (source transformation): ADIFOR, OpenAD, TAF, TAPENADE; OO (operator overloading): AUTO_DERIV, HSL_AD02, ADF, SCOOT
  • Reverse (adjoint): very difficult for a general-purpose implementation
  • Complex step: better than FD, less accurate/efficient than AD
• Equations are known: analytic methods
  • Continuous sensitivity: differentiate then solve
  • Discrete sensitivity: approximate then differentiate
  • Direct differentiation (forward) or adjoint (reverse) formulation
  • Source codes can be exploited if the algorithms are also known
18
Recommendation (cont.)
Forward (direct) vs reverse (adjoint)
• Forward mode is in principle more efficient if the number of objectives and constraints is larger than the number of design variables (geometric sensitivities, many stress constraints)
• Forward mode is easier and more straightforward to implement, and makes it easier to exploit sparsity, etc.
AD vs analytic methods
• AD: very little effort to differentiate a code (conditional compilation); can be done by analysts using AD tools developed by professional differentiators; not efficient
• Analytic methods: efficient; the equations must be known; exploiting existing codes is possible but requires knowing the algorithms and making more changes to the original codes
19
Recommendation (cont.)
Continuous vs discrete sensitivity
• The continuous method obtains an approximate solution for the exact derivatives, while the discrete method obtains the exact derivatives of an approximate solution
• Continuous sensitivity is more accurate/efficient, particularly for problems with a changing domain (topology/shape design)
• Continuous sensitivity requires a deep understanding of the problem (GDEs & BCs), and significant effort is needed to derive the sensitivity equations and BCs. Discrete sensitivity only requires a nominal understanding of the equations and algorithms
• Continuous sensitivity usually requires more changes to existing codes, while discrete sensitivity needs fewer changes and most of them can be made automatically
• Continuous sensitivity is the same as discrete sensitivity if the same discretization, numerical integration, and linear design velocity fields are used for both methods
20
Sensitivity Analysis of MDO
• Multiple analysis codes: preprocessors (CAD/mesh generator), aerodynamic codes, structure codes, performance analysis codes
• Different people involved:
  • Analysts: developers of analysis codes
  • Differentiators: sensitivity enablers of analysis codes
  • Designers: end users of multiple analysis codes for MDO
  • They could all be different, and collaboration may not be practical (e.g. sensitivity analysis of NASTRAN)
• Possible two-way communications between analysis codes (e.g. iterative process); only sensitivities of the converged state are needed; a linear problem could be solved directly for one code or iteratively for multiple coupled codes
21
Sensitivity Analysis of MDO (cont.)
Suggestions
1. If source codes are not accessible, use finite difference
2. If source codes are available, but we don’t know much about the equations/algorithms (NASTRAN, CAD), use AD; if possible, iterative nonlinear solvers should be avoided for efficiency
3. If we have some knowledge of the equations and algorithms, use the discrete analytical method
4. If we have a deep knowledge of the equations/algorithms, use the continuous analytical method
5. If the # of objectives/constraints is larger than the # of design variables, use forward mode; otherwise use reverse (adjoint) mode
6. Source codes differentiated by AD can be used as excellent tools to verify analytical sensitivity methods
7. The designer should only deal with inputs and outputs of the code and not have to access the source and recompile the codes: similar to finite difference but with the capability to handle multiple design variables
8. If possible, collaborations should be facilitated between designers, differentiators, and analysts