A Brief Overview of Methods for Computing Derivatives
Wenbin Yu, Department of Mechanical & Aerospace Engineering, Utah State University, Logan, UT

Upload: melina-moore

Post on 11-Jan-2016




Finite Difference vs Complex Step

Forward finite difference

$$ f'(x) = \frac{f(x+h) - f(x)}{h} + O(h) $$

Advantages: easy to use; no need to access the source code; no need to understand the equations or the code

Disadvantages
• Step-size dilemma: h must be small enough to avoid truncation error, yet big enough to avoid subtractive cancellation error
• Expensive: always n+1 times the analysis time for n perturbations

Complex step approximation
• Better than finite difference if implemented correctly
• Complex variable: $z = x + \mathrm{I}\,y$
• Complex function: $f(z) = u(z) + \mathrm{I}\,v(z) = u(x,y) + \mathrm{I}\,v(x,y)$
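The step-size dilemma can be made visible numerically. The following is a minimal Python sketch, not taken from the slides: the test function sin(x), the point x0 = 1 and the helper name `forward_difference` are illustrative assumptions.

```python
import math

def forward_difference(f, x, h):
    """First-order forward difference approximation of f'(x)."""
    return (f(x + h) - f(x)) / h

x0 = 1.0
exact = math.cos(x0)  # exact derivative of sin at x0

# Sweep the step size over many orders of magnitude and record the error.
errors = {}
for k in range(1, 17):
    h = 10.0 ** (-k)
    errors[k] = abs(forward_difference(math.sin, x0, h) - exact)
    print(f"h = 1e-{k:02d}   error = {errors[k]:.3e}")
```

For large h the error is dominated by truncation, for very small h by subtractive cancellation; the smallest error sits somewhere in between, which is why a suitable h has to be searched for.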


Finite Difference vs Complex Step

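Complex step approximation (cont.)

• If f is analytic, the Cauchy-Riemann equations hold:

$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} = \lim_{h\to 0} \frac{v(x + \mathrm{I}(y+h)) - v(x + \mathrm{I}y)}{h} $$

• We deal with real functions of real variables: $y = 0$, $v(x) = 0$, $f(x) = u(x)$, so that

$$ \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} = \lim_{h\to 0} \frac{v(x + \mathrm{I}h)}{h} = \lim_{h\to 0} \frac{\mathrm{Im}[f(x + \mathrm{I}h)]}{h} $$

• Taylor expansion with an imaginary step:

$$ f(x + \mathrm{I}h) = f(x) + \mathrm{I}h f'(x) - \frac{h^2}{2!} f''(x) - \mathrm{I}\frac{h^3}{3!} f'''(x) + \cdots $$

$$ f'(x) = \frac{\mathrm{Im}[f(x + \mathrm{I}h)]}{h} + \frac{h^2}{3!} f'''(x) + \cdots = \frac{\mathrm{Im}[f(x + \mathrm{I}h)]}{h} + O(h^2) $$

• Not explicitly subject to subtractive cancellation errors, and the truncation error can be made as small as desired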
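The complex-step formula above can be tried in a few lines. This is a minimal Python sketch under stated assumptions: the test function exp(x) sin(x), the step sizes and the helper names are illustrative choices, not part of the slides.

```python
import cmath
import math

def complex_step(f, x, h=1.0e-20):
    """Complex-step approximation f'(x) ~ Im[f(x + I h)] / h."""
    return f(complex(x, h)).imag / h

def forward_difference(f, x, h=1.0e-8):
    """Forward difference, for comparison only."""
    return (f(x + h) - f(x)).real / h

def f(z):
    # Written with cmath so it accepts both real and complex arguments.
    return cmath.exp(z) * cmath.sin(z)

x0 = 1.0
exact = math.exp(x0) * (math.sin(x0) + math.cos(x0))

cs_error = abs(complex_step(f, x0) - exact)
fd_error = abs(forward_difference(f, x0) - exact)
print("complex step error      :", cs_error)
print("forward difference error:", fd_error)
```

Because no difference of nearly equal numbers is formed, h can be taken extremely small here without the loss of accuracy seen in the finite difference.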


Dual Number Automatic Differentiation (DNAD)

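• Extend all real numbers by adding a second component: $x_1 \to x_1 + x_1'\,\mathrm{d}$

• d is just a symbol, analogous to the imaginary unit, but all powers of d higher than one are equal to zero

• Example: $f(x_1, x_2) = x_1 x_2 + \sin(x_1)$

$$
\begin{aligned}
f(x_1 + x_1'\mathrm{d},\; x_2 + x_2'\mathrm{d}) &= (x_1 + x_1'\mathrm{d})(x_2 + x_2'\mathrm{d}) + \sin(x_1 + x_1'\mathrm{d}) \\
&= x_1 x_2 + (x_1 x_2' + x_2 x_1')\,\mathrm{d} + \sin(x_1) + \cos(x_1)\, x_1'\,\mathrm{d} \\
&= x_1 x_2 + \sin(x_1) + \bigl(x_1 x_2' + (x_2 + \cos(x_1))\, x_1'\bigr)\,\mathrm{d} \\
&= f(x_1, x_2) + \Bigl(\frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2'\Bigr)\,\mathrm{d} \\
&= \Bigl(f(x_1, x_2),\; \frac{\partial f}{\partial x_1} x_1' + \frac{\partial f}{\partial x_2} x_2'\Bigr)
\end{aligned}
$$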
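The example can be reproduced with a very small dual-number type. The sketch below is in Python for illustration only (the slides use Fortran); the class name `Dual`, the field names `val` and `der`, the helper `dsin` and the evaluation point are assumptions.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Dual:
    val: float        # function value
    der: float = 0.0  # coefficient of d (the derivative part)

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # (u + u'd)(v + v'd) = uv + (u'v + uv')d, because d*d = 0
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def dsin(u):
    return Dual(math.sin(u.val), math.cos(u.val) * u.der)

def f(x1, x2):
    return x1 * x2 + dsin(x1)

x1, x2 = 0.7, 1.9
# Seed x1' = 1, x2' = 0 to get df/dx1; swap the seeds to get df/dx2.
df_dx1 = f(Dual(x1, 1.0), Dual(x2, 0.0)).der
df_dx2 = f(Dual(x1, 0.0), Dual(x2, 1.0)).der
print(df_dx1, x2 + math.cos(x1))
print(df_dx2, x1)
```

The seeds play the role of x1' and x2' in the derivation: choosing which seed is 1 selects which partial derivative appears in the second component.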


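Dual Number Automatic Differentiation (DNAD)

Dual-number arithmetic

$$
\begin{aligned}
(u, u') + (v, v') &= (u + v,\; u' + v') & (u, u') - (v, v') &= (u - v,\; u' - v') \\
(u, u') * (v, v') &= (uv,\; u'v + uv') & (u, u') / (v, v') &= \bigl(u/v,\; (u'v - uv')/v^2\bigr) \\
\sin(u, u') &= (\sin u,\; u' \cos u) & \exp(u, u') &= (\exp u,\; u' \exp u) \\
\log(u, u') &= (\log u,\; u'/u) & (u, u')^k &= (u^k,\; k u^{k-1} u')
\end{aligned}
$$

Complex step arithmetic

$$
\begin{aligned}
(u + \mathrm{I}u') * (v + \mathrm{I}v') &= (uv - u'v') + \mathrm{I}(u'v + uv') \\
(u + \mathrm{I}u') / (v + \mathrm{I}v') &= \frac{uv + u'v'}{v^2 + v'^2} + \mathrm{I}\,\frac{u'v - uv'}{v^2 + v'^2} \\
\log(u + \mathrm{I}u') &= \frac{1}{2}\log(u^2 + u'^2) + \mathrm{I}\arg(u + \mathrm{I}u')
\end{aligned}
$$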
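The dual-number rules can be checked against the complex step on a composite function. This is a minimal Python sketch: dual numbers are stored as plain (value, derivative) tuples, and the function g(x) = log(x) exp(x) / x^3 and all helper names are assumptions chosen for illustration.

```python
import cmath
import math

# Dual-number rules, with a dual number stored as the pair (u, u').
def d_mul(a, b):
    return (a[0] * b[0], a[1] * b[0] + a[0] * b[1])

def d_div(a, b):
    return (a[0] / b[0], (a[1] * b[0] - a[0] * b[1]) / b[0] ** 2)

def d_exp(a):
    e = math.exp(a[0])
    return (e, a[1] * e)

def d_log(a):
    return (math.log(a[0]), a[1] / a[0])

def d_pow(a, k):
    return (a[0] ** k, k * a[0] ** (k - 1) * a[1])

def g_dual(x):
    # g(x) = log(x) * exp(x) / x**3, evaluated with dual arithmetic
    return d_div(d_mul(d_log(x), d_exp(x)), d_pow(x, 3))

def g_complex(z):
    # Same function evaluated with built-in complex arithmetic
    return cmath.log(z) * cmath.exp(z) / z ** 3

x0 = 1.5
value, derivative = g_dual((x0, 1.0))

h = 1.0e-20
cs_derivative = g_complex(complex(x0, h)).imag / h
print(derivative, cs_derivative)
```

Each dual rule touches only real numbers, whereas the complex versions of *, / and log also compute terms such as u'v' and v'^2 that contribute nothing to the derivative.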


Dual Number Automatic Differentiation (cont.)

Comparing DNAD and complex step
• DNAD is more efficient: it never needs more calculations and mostly needs fewer (fewer for *, /, and most intrinsic functions)
• DNAD is more accurate: it delivers the analytical derivatives up to machine precision, while complex step is accurate only for extremely small imaginary parts; cancellation and subtraction errors can occur for some functions
• Complex step only has an implementation and compiler-optimization advantage for codes in languages supporting complex algebra (Fortran), while DNAD as a concept can be used for codes written in any strongly typed language with real numbers defined
• Complex step is not applicable to codes having complex operations, and it can only compute sensitivities with respect to one variable
   - Changes the calculation of the original analysis and the program flow
   - Hard to debug, as many complex operations are not defined the way you need
   - Example: derivative of Exp(|x|) at x=-3

IF(ABS(x)>0) THEN … ELSE … ENDIF
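The Exp(|x|) case can be illustrated in Python, where the built-in abs() of a complex number returns its modulus. This is a sketch only: the redefined `cs_abs` is one commonly used remedy and is an assumption here, not something stated in the slides, and all helper names are hypothetical.

```python
import cmath
import math

h = 1.0e-20
x0 = -3.0
exact = -math.exp(3.0)  # d/dx exp(|x|) = sign(x) * exp(|x|)

# (a) Complex step with the built-in abs(): the modulus is a real number,
#     so the imaginary part that carries the derivative is discarded.
naive = cmath.exp(abs(complex(x0, h))).imag / h

# (b) Complex step with a redefined abs (assumed remedy, hypothetical name):
#     flip the sign of the whole number when the real part is negative.
def cs_abs(z):
    return -z if z.real < 0.0 else z

fixed = cmath.exp(cs_abs(complex(x0, h))).imag / h

# (c) Dual-number rule: |(u, u')| = (|u|, sign(u) * u').
def dual_abs(u):
    return (abs(u[0]), math.copysign(1.0, u[0]) * u[1])

def dual_exp(u):
    e = math.exp(u[0])
    return (e, u[1] * e)

dnad = dual_exp(dual_abs((x0, 1.0)))[1]

print("naive complex step:", naive)
print("fixed complex step:", fixed)
print("dual number       :", dnad)
print("exact             :", exact)
```

With the language's own definition of abs the complex step silently returns a zero derivative, which is the kind of behaviour that makes such codes hard to debug.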


Performance Comparison

Both complex step and DNAD are implemented in F90/95

Efficiency Comparison

x=1.0; y=2.0; z=3.0
ftot=0.0d0
DO i=1,500000000
   f=x*y-x*sin(y)*log(z)
   ftot=(ftot-f)/exp(z)
ENDDO
write(*,*) ftot

Time (seconds) used by different methods

# of Design Variables    1        3        9         15        16
Finite Difference        1.64*2   1.64*4   1.64*10   1.64*16   1.64*17
Complex Step             3.94     3.94*4   3.94*9    3.94*15   3.94*16
DNAD                     2.11     2.67     14.98     22.16     25.56

Accuracy Comparison: derivative of Exp(|z|) at z=-3

Complex step: -20.0855362857837
DNAD:         -20.0855369231877
Exact:        -20.0855369231877


Implementation Using F90/95

A general-purpose F90/95 module for automatic differentiation of any Fortran codes including Fortran 77/90/95

• Define a new data type DUAL_NUM

TYPE,PUBLIC:: DUAL_NUM
   REAL(DBL_AD)::x_ad_
   REAL(DBL_AD)::xp_ad_
END TYPE DUAL_NUM

• Overload the functions/operations needed in the analysis codes for this new data type: relational operators, arithmetic operators/functions

• Change xp_ad_ to "xp_ad_(n)", with n the number of design variables (DVs), for sensitivities with respect to multiple DVs
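The idea of an array-valued derivative part can be sketched in Python (for illustration only; the slides implement it in Fortran as xp_ad_(n)). The class name `DualN`, its fields and the helpers `dsin` and `dlog` are assumptions; the function is the one used in the efficiency comparison.

```python
import math

class DualN:
    """Dual number whose derivative part holds one entry per design variable."""

    def __init__(self, val, der):
        self.val = val
        self.der = list(der)

    def __sub__(self, other):
        return DualN(self.val - other.val,
                     [a - b for a, b in zip(self.der, other.der)])

    def __mul__(self, other):
        return DualN(self.val * other.val,
                     [a * other.val + self.val * b
                      for a, b in zip(self.der, other.der)])

def dsin(u):
    return DualN(math.sin(u.val), [math.cos(u.val) * d for d in u.der])

def dlog(u):
    return DualN(math.log(u.val), [d / u.val for d in u.der])

# Three design variables, each seeded with its own unit vector.
x = DualN(1.0, [1.0, 0.0, 0.0])
y = DualN(2.0, [0.0, 1.0, 0.0])
z = DualN(3.0, [0.0, 0.0, 1.0])

f = x * y - x * dsin(y) * dlog(z)
print("f        =", f.val)
print("gradient =", f.der)
```

A single evaluation of the analysis returns the whole gradient, at the price of carrying n derivative entries through every operation.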


Implementation Using F90/95 (cont.)

• Define EXP: $\exp(u, u') = (\exp u,\; u' \exp u)$

INTERFACE EXP
   MODULE PROCEDURE EXP_D
END INTERFACE

ELEMENTAL FUNCTION EXP_D(u) RESULT(res)
   TYPE (DUAL_NUM), INTENT(IN)::u
   REAL(DBL_AD)::tmp
   TYPE (DUAL_NUM)::res
   tmp=EXP(u%x_ad_)              ! exp of the value part, computed once
   res%x_ad_ = tmp               ! function value
   res%xp_ad_ =u%xp_ad_* tmp     ! derivative part: u' * exp(u)
END FUNCTION EXP_D


How to Use DNAD

To differentiate a Fortran code using DNAD:

1. Replace all the definitions of real numbers with dual numbers
   REAL(8) :: x  →  TYPE(DUAL_NUM) :: x
   REAL(8), PARAMETER:: ONE=1.0D0  →  TYPE(DUAL_NUM),PARAMETER::ONE=DUAL_NUM(1.0D0,0.D0)

2. Insert "USE DNAD" right after Module/Function/Subroutine/Program statements.

3. Change I/O commands correspondingly if the code does not use free-format read and write (can be automated by writing some general-purpose utility subroutines)

4. Recompile the source along with DNAD.o

5. The whole process can be automated, and even manually it takes only a few minutes for most real analysis codes, although step 3 is code dependent


How to Use DNAD (cont.)

To use the sensitivity capability
• Insert 0 after all real inputs not affected by the design variable
• Insert 1 after the real input if it directly represents the design variable
• Insert the corresponding sensitivities calculated by other codes if the real inputs are affected indirectly by the design variable, such as the sensitivity of nodal coordinates due to a change of geometry

The sensitivities are reported in the outputs as the number following the function value

Designers only need to manipulate the inputs/outputs of the code


DNAD Example

Original code:

PROGRAM CircleArea
   REAL(8),PARAMETER:: PI=3.141592653589793D0
   REAL(8):: radius, area
   READ(*,*) radius
   Area=PI*radius**2
   WRITE(*,*) "AREA=", Area
END PROGRAM CircleArea

Input: 5
AREA=78.5398163397448

Code differentiated with DNAD:

PROGRAM CircleArea
   USE DNAD
   TYPE (DUAL_NUM),PARAMETER:: PI=DUAL_NUM(3.141592653589793D0,0.D0)
   TYPE (DUAL_NUM):: radius,area
   READ(*,*) radius
   Area=PI*radius**2
   WRITE(*,*) "AREA=",Area
END PROGRAM CircleArea

Input: 5,1
AREA=78.5398163397448, 31.4159265358979
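The same example can be mirrored in Python to show where the second output number comes from. This is an illustrative sketch, not the DNAD module itself: the class `Dual`, the function `circle_area` and the string-based input parsing are assumptions.

```python
class Dual:
    """Minimal dual number: value plus one derivative component."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    def __pow__(self, k):
        # (u, u')**k = (u**k, k * u**(k-1) * u') for a constant exponent k
        return Dual(self.val ** k, k * self.val ** (k - 1) * self.der)

PI = Dual(3.141592653589793, 0.0)  # constant: derivative part is zero

def circle_area(radius):
    return PI * radius ** 2

# Mimic the input line "5,1": the radius followed by its seed.
value, seed = (float(token) for token in "5,1".split(","))
area = circle_area(Dual(value, seed))
print("AREA=", area.val, area.der)
```

The seed 1 marks the radius as the design variable, so the second number is d(area)/d(radius) = 2 PI r.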


Example (VABS-AD)

• VABS: 10,000+ lines of F90/95 code
• An isotropic rectangular section meshed with two 8-noded quads

$E = 2.6$ GPa, $\nu = 0.3$, $b = 0.1$ m, $h = 0.2$ m

[Figure: rectangular cross-section of width b and height h in the x2-x3 plane, meshed with Element 1 and Element 2 (nodes 1-13)]

Closed-form values:

$$
\begin{aligned}
EA &= Ebh = 5200 \times 10^{4}\ \mathrm{N} \\
EI_{22} &= Ebh^3/12 = 17.333 \times 10^{4}\ \mathrm{N.m^2} \\
EI_{33} &= Ehb^3/12 = 4.333 \times 10^{4}\ \mathrm{N.m^2} \\
GJ &= 0.229\,G b^3 h = 4.58 \times 10^{4}\ \mathrm{N.m^2}
\end{aligned}
$$

              Value (×10^4)   Sens. (E) (×10^-5)   Sens. (b) (×10^6)   Sens. (h) (×10^5)
EA            5200            2000                 520.0               2600
EI22          17.333          6.667                1.733               26.00
EI33          4.333           1.667                1.300               2.167
GJ            4.833           1.859                1.247               3.433
GJ (exact)    4.58            1.762                1.374               2.29

The loss of accuracy in GJ due to the coarse mesh carries over unchanged to its sensitivities. This can be verified with the sensitivity wrt E, which equals GJ/E because GJ is linear in E.


Example (VABS-AD)

Changes to the inputs

• Sensitivity wrt E:

Change  0.26E+10 .300000000E+00
To      0.26E+10 1. .300000000E+00 0.

• Sensitivity wrt b (each line: node number, x2 and its seed, x3 and its seed):

 1 -0.0500000 -0.5 -0.1000000 0.0
 2 0.0500000 0.5 -0.1000000 0.0
 3 0.0000000 0.0 -0.1000000 0.0
 4 0.0500000 0.5 0.1000000 0.0
 5 0.0500000 0.5 -0.0500000 0.0
 6 0.0500000 0.5 0.0000000 0.0
 7 0.0500000 0.5 0.0500000 0.0
 8 -0.0500000 -0.5 0.1000000 0.0
 9 0.0000000 0.0 0.1000000 0.0
10 -0.0500000 -0.5 0.0500000 0.0
11 -0.0500000 -0.5 0.0000000 0.0
12 -0.0500000 -0.5 -0.0500000 0.0
13 0.0000000 0.0 0.0000000 0.0

• Sensitivity wrt h:

 1 -0.0500000 0.0 -0.1000000 -0.5
 2 0.0500000 0.0 -0.1000000 -0.5
 3 0.0000000 0.0 -0.1000000 -0.5
 4 0.0500000 0.0 0.1000000 0.5
 5 0.0500000 0.0 -0.0500000 -0.25
 6 0.0500000 0.0 0.0000000 0.0
 7 0.0500000 0.0 0.0500000 0.25
 8 -0.0500000 0.0 0.1000000 0.5
 9 0.0000000 0.0 0.1000000 0.5
10 -0.0500000 0.0 0.0500000 0.25
11 -0.0500000 0.0 0.0000000 0.
12 -0.0500000 0.0 -0.0500000 -0.25
13 0.0000000 0.0 0.0000000 0.0

For more complex geometry and meshes, such inputs should be prepared by a mesh generator (so-called geometry sensitivity). Note that the number of design variables is not the same as the number of seeds for the inputs.
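For this simple section the nodal seeds can be generated by hand. The Python sketch below assumes that the mesh scales uniformly with the section dimensions about its center, so that each seed is the coordinate divided by the dimension being varied; that assumption is not stated in the slides but reproduces the numbers listed above. The variable names are hypothetical.

```python
b, h = 0.1, 0.2  # section width and height in meters

# Nodal coordinates (x2, x3) of the 13 nodes, copied from the input listing.
nodes = [
    (-0.05, -0.10), (0.05, -0.10), (0.00, -0.10), (0.05, 0.10),
    (0.05, -0.05), (0.05, 0.00), (0.05, 0.05), (-0.05, 0.10),
    (0.00, 0.10), (-0.05, 0.05), (-0.05, 0.00), (-0.05, -0.05),
    (0.00, 0.00),
]

# Assumed uniform scaling: d(x2)/db = x2 / b and d(x3)/dh = x3 / h.
seeds_b = [(x2 / b, 0.0) for x2, x3 in nodes]
seeds_h = [(0.0, x3 / h) for x2, x3 in nodes]

print("Sensitivity wrt b:")
for i, ((x2, x3), (s2, s3)) in enumerate(zip(nodes, seeds_b), start=1):
    print(f"{i:2d} {x2:10.7f} {s2:5.2f} {x3:10.7f} {s3:5.2f}")

print("Sensitivity wrt h:")
for i, ((x2, x3), (s2, s3)) in enumerate(zip(nodes, seeds_h), start=1):
    print(f"{i:2d} {x2:10.7f} {s2:5.2f} {x3:10.7f} {s3:5.2f}")
```

One design variable (b or h) therefore produces a seed for every affected nodal coordinate, which is why the number of seeds differs from the number of design variables.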


Example (GEBT-AD)

GEBT:
• 5000 lines of Fortran 90/95 code
• 20,000 lines of Fortran 77 code
• Includes BLAS, MA28 sparse linear solver, LAPACK, ARPACK sparse eigensolver

[Figure: beam of length L along a1 loaded by a force F3 along a3]

$$ L = 1\ \mathrm{m}, \qquad F_3 = 5 \times 10^{5}\ \mathrm{N} $$

$$ U_{3\max} = \frac{F_3 L^3}{3\,EI_{22}}, \qquad \theta_{2\max} = -\frac{F_3 L^2}{2\,EI_{22}} $$

• Sensitivity with respect to L

Responses and sensitivities by different analyses

                    U3max                   θ2max
                    Value    Sensitivity    Value     Sensitivity
Exact (linear)      0.9615   2.8846         -1.4423   -2.8846
GEBT (linear)       0.9615   2.8845         -1.4423   -2.8846
GEBT (nonlinear)    0.5989   1.1157         -1.0540   -1.2665
GEBT (follower)     0.7114   1.3619         -1.5995   -3.8806
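The "Exact (linear)" row can be reproduced by differentiating the two closed-form expressions with respect to L using dual numbers. This Python sketch assumes that EI22 is the bending stiffness E b h^3 / 12 of the rectangular section from the VABS example, an assumption that matches the tabulated values; the class and variable names are hypothetical.

```python
class Dual:
    """Minimal dual number supporting the operations used below."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (zero derivative part).
        return other if isinstance(other, Dual) else Dual(float(other))

    def __mul__(self, other):
        other = Dual._lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual._lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val ** 2)

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __pow__(self, k):
        return Dual(self.val ** k, k * self.val ** (k - 1) * self.der)

F3 = 5.0e5                            # tip force in N
EI22 = 2.6e9 * 0.1 * 0.2 ** 3 / 12.0  # assumed: E*b*h^3/12 from the VABS example

L = Dual(1.0, 1.0)                    # design variable L = 1 m, seeded with 1

U3max = F3 * L ** 3 / (3.0 * EI22)
theta2max = -(F3 * L ** 2) / (2.0 * EI22)

print(f"U3max     = {U3max.val:.4f}, dU3max/dL     = {U3max.der:.4f}")
print(f"theta2max = {theta2max.val:.4f}, dtheta2max/dL = {theta2max.der:.4f}")
```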


Analytic Method

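If we know the equations, there are even more efficient methods

• A linear system: $K(x)\,q(x) = F(x)$
• Unknowns: q; design parameter: x; objective function: $\psi(q, x)$
• Derivative of the objective:

$$ \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q}\frac{dq}{dx} $$

• Differentiating the linear system:

$$ K'(x)\,q(x) + K(x)\,q'(x) = F'(x) \;\Rightarrow\; q'(x) = K^{-1}(x)\,[F'(x) - K'(x)\,q(x)] $$

• Direct method (solve for q' first, one solve per design parameter):

$$ \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \frac{\partial \psi}{\partial q}\,\bigl\{K^{-1}(x)\,[F'(x) - K'(x)\,q(x)]\bigr\} $$

• Adjoint method (form $\frac{\partial \psi}{\partial q} K^{-1}$ first, one solve per objective):

$$ \frac{d\psi}{dx} = \frac{\partial \psi}{\partial x} + \Bigl\{\frac{\partial \psi}{\partial q}\,K^{-1}(x)\Bigr\}\,[F'(x) - K'(x)\,q(x)] $$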
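The difference between the direct and the adjoint grouping can be seen on a tiny system. The following Python sketch uses a made-up 2x2 system with two design parameters and a linear objective; the matrices, the load vector, the objective weights and all function names are assumptions for illustration.

```python
def solve2(A, b):
    """Solve a 2x2 linear system A q = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def matvec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def K(x):
    return [[2.0 + x[0], -1.0], [-1.0, 2.0 + x[1]]]

def F(x):
    return [1.0, x[0]]

# Derivatives of K and F with respect to x[0] and x[1], written by hand.
dK = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
dF = [[0.0, 1.0], [0.0, 0.0]]

c = [1.0, 3.0]  # objective psi = c . q, so dpsi/dq = c and dpsi/dx = 0

def psi(x):
    q = solve2(K(x), F(x))
    return c[0] * q[0] + c[1] * q[1]

x = [0.5, 1.5]
q = solve2(K(x), F(x))

# Right-hand sides r_i = F'_i - K'_i q, one per design parameter.
r = []
for i in range(2):
    Kq = matvec(dK[i], q)
    r.append([dF[i][0] - Kq[0], dF[i][1] - Kq[1]])

# Direct method: one linear solve per design parameter.
direct = []
for i in range(2):
    dq = solve2(K(x), r[i])
    direct.append(c[0] * dq[0] + c[1] * dq[1])

# Adjoint method: one solve with the transposed matrix per objective.
Kx = K(x)
KT = [[Kx[0][0], Kx[1][0]], [Kx[0][1], Kx[1][1]]]
lam = solve2(KT, c)
adjoint = [lam[0] * r[i][0] + lam[1] * r[i][1] for i in range(2)]

print("direct :", direct)
print("adjoint:", adjoint)
```

Both groupings give the same derivatives; the direct method needs one solve per design parameter, while the adjoint method needs one solve per objective.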


Recommendation

• Neither equations nor source codes are available: finite difference (FD) method, step-size dilemma

• Source codes available: computational/algorithmic/automatic differentiation (AD), which applies the chain rule to each operation in the program flow
   - Forward (direct): SRT (source transformation): ADIFOR, OpenAD, TAF, TAPENADE; OO (operator overloading): AUTO_DERIV, HSL_AD02, ADF, SCOOT
   - Reverse (adjoint): very difficult for a general-purpose implementation
   - Complex step: better than FD, less accurate/efficient than AD

• Equations are known: analytic methods
   - Continuous sensitivity: differentiate then solve
   - Discrete sensitivity: approximate then differentiate
   - Direct differentiation (forward) or adjoint (reverse) formulation
   - Source codes can be exploited if the algorithms are also known


Recommendation (cont.)

Forward (direct) vs reverse (adjoint)
• Forward mode is in principle more efficient if the number of objectives and constraints is larger than the number of design variables (geometric sensitivities, many stress constraints)
• Forward mode is easier and more straightforward to implement, and makes it easier to exploit sparsity, etc.

AD vs analytic methods
• AD: very little effort to differentiate a code (conditional compilation); can be done by analysts using AD tools developed by professional differentiators; not efficient
• Analytic methods: efficient; the equations must be known; exploiting existing codes is possible but requires knowing the algorithms and making more changes to the original codes


Recommendation (cont.)

Continuous vs discrete sensitivity
• The continuous method obtains an approximate solution for the exact derivatives, while the discrete method obtains exact derivatives of the approximate solution
• Continuous sensitivity is more accurate/efficient, particularly for problems with a changing domain (topology/shape design)
• Continuous sensitivity requires deep understanding of the problem (governing differential equations (GDEs) and boundary conditions (BCs)); significant effort is needed to derive the sensitivity equations and BCs. Discrete sensitivity only requires nominal understanding of the equations and algorithms
• Continuous sensitivity usually requires more changes to existing codes, while discrete sensitivity needs fewer changes and most of the changes can be done automatically
• Continuous sensitivity is the same as discrete sensitivity if the same discretization, numerical integration, and linear design velocity fields are used for both methods


Sensitivity Analysis of MDO

Multiple analysis codes: preprocessors (CAD/mesh generator), aerodynamic codes, structure codes, performance analysis codes

Different people involved:
• Analysts: developers of analysis codes
• Differentiators: sensitivity enablers of analysis codes
• Designers: end users of multiple analysis codes for MDO
• They could all be different, and collaboration may not be practical (e.g. sensitivity analysis of NASTRAN)

Possible two-way communications between analysis codes (e.g. iterative process); only sensitivities of the converged state are needed; a linear problem could be solved directly for one code or iteratively for multiple coupled codes


Sensitivity Analysis of MDO (cont.)

Suggestions

1. If source codes are not accessible, use finite difference

2. If source codes are available, but we don’t know much about the equations/algorithms (NASTRAN, CAD), use AD; if possible, iterative nonlinear solvers should be avoided for efficiency

3. If we have some knowledge of the equations and algorithms, use the discrete analytical method

4. If we have a deep knowledge of the equations/algorithms, use the continuous analytical method

5. If the number of objectives/constraints is larger than the number of design variables, use forward mode; otherwise use reverse (adjoint) mode

6. Source codes differentiated by AD can be used as excellent tools to verify analytical sensitivity methods

7. The designer should only deal with the inputs and outputs of the code and should not have to access the source or recompile the codes: similar to finite difference but with the capability to handle multiple design variables

8. If possible, collaborations should be facilitated between designers, differentiators, and analysts