unc seminar numerical issues in fast maxwell’s equation ...vv vv vvvv. unc seminar mathematical...
TRANSCRIPT
UNC Seminar
Numerical IssuesNumerical IssuesIn Fast Maxwell’s Equation In Fast Maxwell’s Equation
Solvers for Integrated Circuit Solvers for Integrated Circuit InterconnectInterconnect
Zhenhai Zhu, Ben Song and Jacob White
RLE Computational prototyping group, MITrleweb.mit.edu/vlsi
UNC Seminar
•• BackgroundBackground•• PrePre--corrected FFT algorithmcorrected FFT algorithm•• Unit testing resultsUnit testing results•• Surface integral formulationSurface integral formulation•• Numerical ResultsNumerical Results
OutlineOutline
UNC SeminarMathematical PreliminariesMathematical Preliminaries
SrrfrrrKSdS
∈=′′′∫vvvvv ),()(),( ρ
A simple integral equation:A simple integral equation:
)(span ,)()(1
rbBrbr jn
n
jjjn ′=′=′ ∑
=
vvv αρ
Project the solution on a functional space:Project the solution on a functional space:
1( , ) ,
ik r reK r r
r r r r
′−
′ =′ ′− −
v vv v v v v v
UNC SeminarMathematical PreliminariesMathematical Preliminaries
)(span ,0)(e),( n rtTrrt inivvv ==
Enforce the residual to be orthogonal toEnforce the residual to be orthogonal toanother functional space:another functional space:
fA =αA dense linear system:A dense linear system:
)()(),()(en rfrrrKSdrS
nvvvvv −′′′= ∫ ρ
Residual:Residual:
UNC SeminarSome very useful applicationsSome very useful applications
Figures thank to Figures thank to CoventorCoventor
Electrostatic analysisto compute the capacitance
Magneto-quasi-static analysisto compute impedance
UNC Seminar
© Carleton University Hammerhead UAV Project, 2000, David Willis
Picture thanks to David Joe Willis
Some very useful applicationsSome very useful applications
Computational Aerodynamics
MirrorGimbal
Rotate
Picture thanks to Xin Wang
Stokes Flow SolverViscous drag
UNC SeminarIntroduction to Iterative SolversIntroduction to Iterative Solvers
A simple iterative solver:A simple iterative solver:
0
1
Solve step 1: guess an initial solution , let 0
step 2: compute the residual -
step 3: find an update from step 4: update the solution
step 5: 1, go to step 2
k
k k
Ax bx k
r b Ax
x rx x x
k k+
==
=
= += +
VV
UNC SeminarFast MatrixFast Matrix--Vector ProductVector Product
The most expensive step:The most expensive step:
AxGoal:
2( ) ( ) or ( log( ))O N O N O N N⇒
UNC SeminarWellWell--known Fast Algorithmsknown Fast Algorithms
•• Fast Multiple Method Fast Multiple Method •• Hierarchical SVDHierarchical SVD•• Panel Clustering MethodPanel Clustering Method
Key idea:interaction matrix is low rank
UNC SeminarKernel “Independent” TechniqueKernel “Independent” Technique
Basic requirements:
Reciprocity: ( , ) ( , )G r r G r r′ ′=v v v v
Commonly used Green’s function all satisfythese requirements
1 1, , ( ), ( )
n n
ik r r ik r re er r r r r r r r
′ ′− −∂ ∂′ ′ ′ ′− − ∂ − ∂ −
v v v v
v v v v v v v v
Shift invariance: ( , ) ( , )G r r r r G r r′ ′+ + =v v v v v vV V
UNC Seminar
•• BackgroundBackground•• PrePre--corrected FFT Algorithmcorrected FFT Algorithm•• Unit testing resultsUnit testing results•• Surface Integral FormulationSurface Integral Formulation•• Numerical ResultsNumerical Results
OutlineOutline
UNC SeminarFFTFFT--based Methodbased Method
( , ) ( ) ( ), S
dS G r r r f r r Sρ′ ′ ′ = ∈∫v v v v v
( , ) ( ,0) ( )G r r G r r G r r′ ′ ′= − = −v v v v v v%Key idea: kernel is shift-invariant
A simple example:
H fα =
UNC SeminarFFTFFT--based Methodbased Method
, ( )j
i j i jpanel
H dS G r r′ ′= −∫v v%
1,Only ( 1,2,..., ) are unique. H is a Toeplitz
matrix. Matrix vector product could be computed using FFT in O( log( )) time.
jH j N
N N
=
Operations: Operations: O(O(NNlog(log(NN)))) Memory: O(Memory: O(NN))
If collocation method with constant basis is used
UNC Seminar
Separation of Regular Grid From Separation of Regular Grid From DiscretizationDiscretization PanelsPanels
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Basic stepsBasic steps
(1)
(2)
(3)
(4)
[ ]αPQg = :Project (1)
[ ]αDd =Ψ :Direct (4)
[ ] gg I φ=Ψ :eInterpolat (3)
[ ] gg QH=φ :Convolve (2)
[ ] [ ][ ][ ] α)( PHIDdg +=Ψ+Ψ=Ψ
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Basic IdeaBasic Idea
bggggbbbbb NNNNNNNNNN PHIDA ××××× += ][][][][][
A sparse representation A sparse representation of the system matrixof the system matrix
)( ))log(( )( )( )( 2bggbbb NONNONONONO
2( ) ( ) ( ) ( ) ( )b b b g bO N O N O N O N O N
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Interpolation MatrixInterpolation Matrix
),( Compute
Given
yxg
φ
φ
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Interpolation MatrixInterpolation Matrix
( , ) ( , ) ( , )tk k
k
x y c f x y f x y cφ = =∑
2 2 2 2 2 2
An example of ( , ) :
1, , , , , , , ,kf x y
x x y xy x y y xy x y
UNC Seminar
[ ]cF
c
cc
yxfyxfyxf
yxfyxfyxfyxfyxfyxf
g
g
g
g =
=
=
9
2
1
999992991
229222221
119112111
9,
2,
1,
),(),(),(
),(),(),(),(),(),(
ML
MOMMLL
Mφ
φφ
φ
[ ] 1( , ) ( , )tgx y f x y Fφ φ−=
pFFTpFFT Algorithm:Algorithm:Interpolation MatrixInterpolation Matrix
[ ] cyxfc
cyxfyxfyxfyx t ),(),(),(),(),(
9
1
921 =
= MLφ
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Interpolation MatrixInterpolation Matrix
[ ] gt
gt
iii WFrfrdStrrtti
φφφ ===Ψ ∫∆
−1)()()(),( vvvv
Operations: 9Operations: 9NNbb Memory: 9Memory: 9NNbb
[ ] g
g
g
i IWW φφ
φ=
=
Ψ=Ψ
M
M
M
LLLM
M
9,
1,
91
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Outer Differential OperatorOuter Differential Operator
[ ] 1( ) ( )( )t
i
t tn iW dSt r f r F
n r−
∆
∂=
∂∫v vv
1( ) ( )( ) ( )
tgr f r F
n r n rφ φ−∂ ∂
=∂ ∂
v vv v
( , ) ( )( ) S
dS G r r rn r
ρ∂ ′ ′ ′
∂ ∫v v vv
If the kernel has a differential operator outside:
The operator works on the interpolation
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Projection MatrixProjection Matrix
),()1(EsE rrG vv=φ
gt
gi
EiigE rrG φρρφ )(),(,)2( == ∑ vv
gρ charge grid find
Assume a unit charge at point SE
s
)2()1(such that EE φφ =
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Projection MatrixProjection Matrix
gt
gi
EiigE rrG φρρφ )(),(,)2( == ∑ vv
kk
kE crfrrG ∑= )(),( vvv
match both sides at grid point irv
Expand the Green’s functionE
s gFc φ1−=
gst
EsE FrfrrG φφ 1)1( )(),( −== vvv
[ ] 1)()( −= Frf stt
gvρ
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Projection MatrixProjection Matrix
)r(b jvon distributi a
is charge theIf
[ ] 1)( )()()( −
∆∫= FrfrdSb t
jtj
gbj
vvρ
[ ] 1)()( −= Frf stt
gvρ
For unit point charge
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Projection MatrixProjection Matrix
∑=
=bN
j
tjgjgQ
1
)( )(ρα
For multiple panels:
( ) ( )j jj
r b rρ α= ∑v v
[ ] 1( )( ) ( ) ( )bj
j t tg jdSb r f r Fρ −
∆
= ∫v v
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Projection MatrixProjection Matrix
[ ],1 ,1
,9 ,9
( ) ( )
,1
( ) ( ),9
0 0
0
g g
g g
j k
g j
gj k
g k
QQ P
Q
ρ ρ αα
ρ ρ α
= = =
O M MM MM M M
M M M M M M MM M M
M MM M M M
Operations: 9Operations: 9NNbb Memory: 9Memory: 9NNbb
∑=
=bN
j
tjgjgQ
1
)( )(ρα
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Inner Differential OperatorInner Differential Operator
[ ]∫∆
−=bj
Frfrdn
drdSb t
jtg
1)()(
)( vvvρ
gst
s Frfrdn
dr
rdnd
φφ 1)()(
)()(
−= vvvv
)(),()(
rrrGrdn
dSd
S
′′′
′∫vvvv ρ
If the kernel has a differential operator inside:
The operator works on the projection
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Duality of [Duality of [II] and [] and [PP]]
[ ] 1)()( −
∆∫ FrfrdSb t
jbj
vv
[ ]∫∆
−
ti
FrfrdSt ti
1)()( vv
then,or ),()( If nnji BTrbrt == vv
ith row of [I]:
jth column of [P]:
tIP =
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Convolution MatrixConvolution Matrix
(1)
(2)
(3)
(4) [ ] gg
igi
jijg
QH
QrrG
=
′= ∑φ
φ ,, ),( vv
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Convolution MatrixConvolution Matrix
, ( )i j i jH G r r′= −v v%1,Only ( 1, 2,..., ) are uniquej gH j N=
Regular grid and position invariance
Operations: Operations: O(O(NNgglog(log(NNgg)))) Memory: O(Memory: O(NNgg))
, ,( )g j i j g ii
G r r Qφ ′= −∑ v v%
Fast convolution by FFT in O(Nlog(N)) time
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Direct MatrixDirect Matrix
[ ][ ][ ]
=Ψ==
g
gg
g
IQH
PQ
φφ
α
[ ][ ][ ]αPHI=Ψ
Summary of the first three steps:
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Direct MatrixDirect Matrix
[ ])(
)( )(),(,, )(
i
g
i
j
HWAD jjitjiji
Ν∈
−= ρ
Operations: Operations: O(O(NNbb)) Memory: Memory: O(O(NNbb))
UNC Seminar
pFFTpFFT Algorithm:Algorithm:Four sparse matricesFour sparse matrices
[ ] [ ] [ ][ ][ ] αα )( PHIDA +≈=Ψ•• Projection:Projection:
Operations: Operations: O(O(NNbb)) Memory: Memory: O(O(NNbb))•• Convolution:Convolution:
Operations: Operations: O(O(NNgglog(log(NNgg)))) Memory: O(Memory: O(NNgg))•• Interpolation:Interpolation:
Operations: Operations: O(O(NNbb)) Memory: Memory: O(O(NNbb))•• Nearby interaction:Nearby interaction:
Operations: Operations: O(O(NNbb)) Memory: Memory: O(O(NNbb))
UNC Seminar
•• BackgroundBackground•• PrePre--corrected FFT Algorithmcorrected FFT Algorithm•• Unit testing resultsUnit testing results•• Surface Integral FormulationSurface Integral Formulation•• Numerical ResultsNumerical Results
OutlineOutline
UNC Seminar
Unit testingUnit testing
Let x be a random vector
1
2
1 2 2
1 2
( )
y Ax
y pfft x
y yerror
y
==
−=
( , ) ( )S
dS K r r r Axρ′ ′ ′ ⇒∫v v v
On the surface of a sphere with radius R
UNC Seminar
SingleSingle--Layer KernelLayer KernelAccuracy (4Accuracy (4--5 digits)5 digits)
Number of Panel
1r
UNC Seminar
SingleSingle--Layer Kernel Layer Kernel Accuracy (4Accuracy (4--5 digits), 5 digits), R/R/λ λ = 1e= 1e--66
Number of Panel
ikrer
UNC Seminar
SingleSingle--Layer Kernel Layer Kernel Accuracy (4Accuracy (4--5 digits), 5 digits), R/R/λ λ = 1e2= 1e2
Number of Panel
ikrer
UNC Seminar
DoubleDouble--Layer Kernel Layer Kernel Accuracy (2Accuracy (2--3 digits)3 digits)
Number of Panel
1r
UNC Seminar
DoubleDouble--Layer Kernel Layer Kernel Accuracy (2Accuracy (2--3 digits ), 3 digits ), R/R/λ λ = 1e= 1e--66
ikrer
Number of Panel
UNC Seminar
DoubleDouble--Layer Kernel Layer Kernel Accuracy (2Accuracy (2--3 digits ), 3 digits ), R/R/λ λ = 1e2= 1e2
Number of Panel
ikrer
UNC SeminarMatrix vector product time O(Matrix vector product time O(NN))
Break even point
Number of Panel
UNC SeminarMemory O(Memory O(NN))
Number of Panel
UNC Seminar
•• BackgroundBackground•• PrePre--corrected FFT Algorithmcorrected FFT Algorithm•• Unit testing resultsUnit testing results•• Surface Integral FormulationSurface Integral Formulation•• Numerical ResultsNumerical Results
OutlineOutline
UNC Seminar
Summary of the Summary of the Surface Integral FormulationSurface Integral Formulation
iSSrrE
rnrrG
rnrE
rrGSdrEi
∈′′∂
′∂−
′∂′∂′′= ∫
vvvvv
vvvvvvv
))()(
),()()(
),(()(21 1
1
SrrrrrGrd sS∈=′′′∫
vvvvvv )()(),(0 εφρ
SrrrErn
rrGrnrE
rrGSdrES
∈∇+′′∂
′∂−
′∂′∂′′=− ∫
vvvvvv
vvv
vvvv )())(
)(),(
)()(
),(()(21 0
0 φ
contact ,)( ∈= rcrvvφ
Current Conservation0)( =•∇ rE vv
UNC SeminarSystem Matrix StructureSystem Matrix Structure
1 1
1 1
1 1
1 0 1 0 1 0 1 0 1 0 1 0 1
2 0 2 0 2 0 2 0 2 0 2 0 2
0
ˆˆ
x
y
z
x y z x y xx
x y z x y zy
zx y z
x y z
P D dE dnP D dE dn
P D dE dnt P t P t P t D t D t D t Et P t P t P t D t D t D t E
j En n n
a a a c c cA P
ωσ
φρ
∇ ∇
ii
UNC SeminarPrePre--conditionerconditioner
1
1
1
1 0 1 0 1 0 1 1 1 1
2 0 2 0 2 0 2 2 2 2
0
2
2
2ˆ2 2 2ˆ2 2 2
d
xd
yd
zd d d
x y z x y x xd d d
x y z x y z y
zx y z
x y zd
P I dE dnP I dE dn
P I dE dnt P t P t P t I t I t I t Et P t P t P t I t I t I t E
j En n n
a a a c c cA P
π
π
π
π π ππ π π
ωσ φ
ρ
− − − ∇ ∇
ii
UNC Seminar
•• BackgroundBackground•• PrePre--corrected FFT Algorithmcorrected FFT Algorithm•• Unit testing resultsUnit testing results•• Surface Integral FormulationSurface Integral Formulation•• Numerical ResultsNumerical Results
OutlineOutline
UNC SeminarA Ring ExampleA Ring Example
Cross-section: 0.5x0.5 mm2
Radius: 10 mm
Picture thanks to Junfeng Wang
UNC Seminar
Frequency(Hz)
Res
ista
nce
(O
hm
)
A Ring ExampleA Ring Example
+ FastHenry 960o FastHenry 3840* FastHenry 15360-- Surface 992-- DC Analytical formulae
UNC Seminar
Frequency(Hz)
Ind
uct
ance
(n
H)
A Ring ExampleA Ring Example
+ FastHenry 960o FastHenry 3840* FastHenry 15360-- Surface 992-- DC Analytical formulae
UNC SeminarA Spiral ExampleA Spiral Example
Cross-section: 0.5x0.5 mm2
Inner Radius: 10 mm Spacing: 0.5 mmNumber of turns: 2
Picture thanks to Junfeng Wang
UNC Seminar
Frequency(Hz)
Res
ista
nce
(O
hm
)A Spiral ExampleA Spiral Example
o FastHenry 7680+ FastHenry 1920- Surface 488
UNC Seminar
Frequency(Hz)
Ind
uct
ance
(n
H)
A Spiral ExampleA Spiral Example
o FastHenry 7680+ FastHenry 1920- Surface 488
UNC Seminar
Frequency(Hz)
Imp
edan
ce (
oh
m)
A Shorted Transmission LineA Shorted Transmission Line
x 1011
+ Without Ground- With Ground
UNC SeminarA Spiral Over GroundA Spiral Over Ground
4-turn spiral over substrate 15162 panelsMQS: 106k unknowns, 69 minutes, 348 MbEMQS: 121k unknowns, 93minutes, 379 Mb
UNC SeminarMultiMulti--conductor busconductor bus
3 layer, 10 conductors each layer 12540 panelsMQS: 87.5k unknowns, 41 minutes, 165 MbEMQS: 100k unknowns, 61 minutes, 218 Mb
UNC Seminar
•• PrePre--corrected FFT algorithmcorrected FFT algorithm•• Performance of the algorithmPerformance of the algorithm•• A surface integral formulationA surface integral formulation•• Numerical resultsNumerical results
ConclusionsConclusions