DAC'15 slides by Hao Zhuang and Chung-Kuan Cheng at UC San Diego
TRANSCRIPT
An Algorithmic Framework of Large-Scale Circuit Simulation Using Exponential Integrators
Hao Zhuang (1), Wenjian Yu (2), Ilgweon Kang (1), Xinan Wang (1), and Chung-Kuan Cheng (1)
(1) University of California, San Diego
(2) Tsinghua University
Outline
• Motivation & Contributions
• Background of time-domain circuit simulation
• Our algorithmic framework
• Exponential integrators
• Invert Krylov subspace method
• Experimental results
• Conclusions & future directions
Motivation
• SPICE
– critical to a wide range of ICs
• Modern IC
– billions of transistors
– complex interconnects
• Requirement:
– new structures, e.g., FinFET, 3D
– strongly coupled
– post-layout effects
– capability & accuracy
• Simulation runtime
– Long or ∞
[Figure sources: Dick Sites, "Datacenter Computers: modern challenges in CPU design," Google Inc., 2015, & Intel i7; Synopsys Inc., Technology Update, Issue 3, 2012, "FinFET: The Promises and the Challenges"]
Contributions
• Exponential integration: stable, explicit, no Newton-Raphson
• Target of matrix factorization: conductance matrix $G$ ONLY (less expensive)
• Handling tasks (even when traditional schemes FAIL): large-scale, strongly coupled, post-layout
A promising framework
Basic & BENR as An Example (1)
• Differential equations
• BE: Backward Euler
[Equations not captured in the transcript. Annotated terms: capacitance (/inductance), conductance (/incidence), time step, input, nonlinear device dynamics]
Basic & BENR as An Example (2)
• NR: Newton-Raphson
• BENR: Backward Euler + Newton-Raphson iterations
[Equations not captured in the transcript. Annotated term: Jacobian matrix]
Basic & BENR as An Example (3)
• NR: Newton-Raphson
• BENR: Backward Euler + Newton-Raphson iterations
[Equations not captured in the transcript. Annotated terms: Jacobian matrix, capacitance matrix]
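The BENR equations themselves did not survive in this transcript, so the following is only a minimal sketch of the idea on a scalar circuit. It assumes the model C dx/dt + f(x) = u (consistent with the chain-rule form used later in the deck). The component values and the helper names `benr_step` and `simulate` are hypothetical and chosen purely for illustration.

```python
import math

def benr_step(x_k, h, u_next, C, f, dfdx, tol=1e-12, max_iter=50):
    """One Backward Euler step solved with Newton-Raphson.

    Solves C*(x - x_k)/h + f(x) - u_next = 0 for x = x_{k+1}.
    Returns (x_{k+1}, number of NR iterations used).
    """
    x = x_k  # initial guess: previous solution
    for it in range(1, max_iter + 1):
        residual = C * (x - x_k) / h + f(x) - u_next
        jacobian = C / h + dfdx(x)  # scalar analogue of the Jacobian C/h + G
        dx = -residual / jacobian
        x += dx
        if abs(dx) < tol:
            return x, it
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical test circuit: a capacitor in parallel with a resistor and a
# diode, driven by a constant current source u.
C, R, I_S, V_T = 1e-9, 1e3, 1e-14, 0.025
f = lambda v: v / R + I_S * (math.exp(v / V_T) - 1.0)
dfdx = lambda v: 1.0 / R + (I_S / V_T) * math.exp(v / V_T)

def simulate(u=1e-3, h=1e-8, steps=2000):
    """Run the transient and report the final voltage and average NR count."""
    x, total_iters = 0.0, 0
    for _ in range(steps):
        x, iters = benr_step(x, h, u, C, f, dfdx)
        total_iters += iters
    return x, total_iters / steps

v_final, avg_nr = simulate()
print(f"final voltage = {v_final:.4f} V, average NR iterations = {avg_nr:.2f}")
```

Every NR iteration needs a fresh Jacobian C/h + G, which in the matrix case means a new factorization. That is the cost the rest of the deck tries to avoid.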
Matrix Exponential Method
• Our previous attempt [Weng12]
• It also uses NR
[Equations not captured in the transcript. Annotated terms: the Jacobian matrix, capacitance matrix]
Matrices from a Post-Layout Case
• $C$, $G$ matrices from FreeCPU [Zhang, Yu TCAD 2013]
• nnz: non-zero terms
[Figures: sparsity patterns of $G$ and $C$, and of the $L$ and $U$ factors of lu($C$), lu($C/h + G$), and lu($G$)]
In this example, lu($G$)
• contains fewer nnz (~10%) &
• has a less complicated nnz distribution
• Traditional methods are all challenged by $C$ when $C$ is complicated
Our proposed framework
• Two techniques:
– ER: Exponential Rosenbrock formulation
– Invert Krylov subspace to compute $e^{J}v$
• Computational advantages
– Simple matrix factorization target: exploit the features of lu($G$)
– Stable explicit method to solve the circuit system
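The factorization argument can be tried on synthetic data. The sketch below does not use the FreeCPU matrices from the slides. It assumes a made-up tridiagonal $G$ (a resistor chain) and a made-up $C$ with random coupling entries, then counts the nonzeros in the sparse LU factors with SciPy's `splu`. The sizes, values and the helper `lu_nnz` are all hypothetical.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def lu_nnz(A):
    """Number of stored nonzeros in the L and U factors of a sparse LU of A."""
    lu = splu(sp.csc_matrix(A))
    return lu.L.nnz + lu.U.nnz

n = 400
rng = np.random.default_rng(0)

# Synthetic G: conductance matrix of a resistor chain (tridiagonal,
# diagonally dominant).
G = sp.diags(
    [-1.0 * np.ones(n - 1), 2.1 * np.ones(n), -1.0 * np.ones(n - 1)],
    offsets=[-1, 0, 1],
    format="csc",
)

# Synthetic C: grounded capacitors plus random symmetric coupling entries,
# loosely mimicking the dense coupling extracted after layout.
rows = rng.integers(0, n, size=5 * n)
cols = rng.integers(0, n, size=5 * n)
coupling = sp.coo_matrix((np.full(5 * n, 0.01), (rows, cols)), shape=(n, n))
coupling = coupling + coupling.T
C = (sp.identity(n) + coupling).tocsc()

h = 0.1  # hypothetical time step
nnz_G = lu_nnz(G)
nnz_BE = lu_nnz(C / h + G)
print(f"nnz of lu(G)       = {nnz_G}")
print(f"nnz of lu(C/h + G) = {nnz_BE}")
```

The actual ratio depends entirely on the matrices. The ~10% figure on the slide refers to the authors' post-layout case, not to this toy example.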
ER: Exponential Rosenbrock
Start from
$\frac{dx(t)}{dt} = g(x, u, t)$
• The next time step solution [Hochbruck et al., SIAM09]
$x_{k+1} = x_k + h_k \varphi_1(h_k J_k)\, g(x_k, u, t_k) + h_k^2 \varphi_2(h_k J_k)\, b_k$
where $J_k = \partial g / \partial x$, $b_k = \partial g / \partial t$
$\varphi_1(h_k J_k) = (e^{h_k J_k} - I_n) / (h_k J_k)$
$\varphi_2(h_k J_k) = (e^{h_k J_k} - I_n) / (h_k^2 J_k^2) - I_n / (h_k J_k)$
Exponential Integrators:
Proven to be stable, explicit, and high-order accurate for ODEs
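A minimal dense sketch of one ER step, written directly from the formula above. It assumes a small system where $h_k J_k$ is nonsingular, so the $\varphi$ functions can be formed explicitly. The function names (`phi1`, `phi2`, `er_step`) and the test problem are illustrative choices. A large-scale solver would not form these matrices and would use Krylov projections instead.

```python
import numpy as np
from scipy.linalg import expm

def phi1(A):
    """phi_1(A) = A^{-1} (e^A - I); dense formula, assumes A is nonsingular."""
    return np.linalg.solve(A, expm(A) - np.eye(A.shape[0]))

def phi2(A):
    """phi_2(A) = A^{-2} (e^A - I) - A^{-1}, computed as A^{-1} (phi_1(A) - I)."""
    return np.linalg.solve(A, phi1(A) - np.eye(A.shape[0]))

def er_step(x_k, t_k, h, g, jac, dgdt):
    """x_{k+1} = x_k + h*phi_1(h J_k) g(x_k, t_k) + h^2*phi_2(h J_k) b_k."""
    J = jac(x_k, t_k)    # J_k = dg/dx at (x_k, t_k)
    b = dgdt(x_k, t_k)   # b_k = dg/dt at (x_k, t_k)
    return x_k + h * phi1(h * J) @ g(x_k, t_k) + h**2 * phi2(h * J) @ b

# Hypothetical linear test problem dx/dt = A x + c + d t. With an input that
# is linear in t, the ER step reproduces the exact solution up to rounding.
A = np.array([[-2.0, 1.0], [1.0, -3.0]])
c = np.array([1.0, 0.0])
d = np.array([0.0, 0.5])
g = lambda x, t: A @ x + c + d * t
x0 = np.array([1.0, -1.0])
x1 = er_step(x0, 0.0, 0.5, g, lambda x, t: A, lambda x, t: d)
print(x1)
```

No Newton-Raphson loop appears in the step. The nonlinearity enters only through the evaluation of $g$ and $J_k$ at the current point.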
ER in Circuit Simulation
Chain rule:
$\frac{dq(x(t))}{dx}\,\frac{dx(t)}{dt} = Bu(t) - f(x)$
where
$\frac{dq(x(t))}{dx} = C(x(t)) = C_k$, $J_k = -C_k^{-1} G_k$,
$g_k = J_k x_k + C_k^{-1}\bigl(F_k + Bu(t_k)\bigr)$, $b_k = C_k^{-1}\,\frac{Bu(t_{k+1}) - Bu(t_k)}{h_k}$
We have ALL the components to obtain $x_{k+1}$
$x_{k+1}(h_k) = x_k + h_k \varphi_1(h_k J_k)\, g(x_k, u, t_k) + h_k^2 \varphi_2(h_k J_k)\, b_k$
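One way to read how these components follow from the chain-rule equation is sketched below. Two assumptions are not stated on the slide. First, $C_k$ and $G_k = \partial f/\partial x|_{x_k}$ are frozen over the step. Second, $F$ is what remains after splitting off the linear part, i.e. $f(x) = G_k x - F(x)$, so that $\partial F/\partial x$ vanishes at $x_k$.

```latex
\begin{align*}
C_k \frac{dx}{dt} &= Bu(t) - f(x)
  && \text{(chain rule, } C(x(t)) \approx C_k\text{)} \\
\frac{dx}{dt} &= C_k^{-1}\bigl(Bu(t) - G_k x + F(x)\bigr)
  && \text{(assumed splitting } f(x) = G_k x - F(x)\text{)} \\
 &= J_k x + C_k^{-1}\bigl(F(x) + Bu(t)\bigr) = g(x, u, t),
  && J_k = -C_k^{-1} G_k \\
b_k &= \frac{\partial g}{\partial t}\Big|_{t_k}
  = C_k^{-1} B \,\frac{du}{dt}\Big|_{t_k}
  \approx C_k^{-1}\,\frac{Bu(t_{k+1}) - Bu(t_k)}{h_k}
\end{align*}
```

Under these assumptions $\partial g/\partial x$ at $x_k$ reduces to $J_k$, matching the definition used in the ER step, and $b_k$ is a finite-difference approximation of the input slope.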
Local Nonlinear Error Control
The local nonlinear error estimator [Caliari09]
$\mathrm{err}(x_{k+1}, x_k) = \varphi_1(h_k J_k)\, C_k^{-1} \Delta F_k$
where $\Delta F_k = F(x_{k+1}) - F(x_k)$
ER-C: ER with Correction Term
Reuse $\Delta F_k$ to improve the accuracy by adding the extra term
$D_k = \gamma h_k \varphi_2(h_k J_k)\, C_k^{-1} \Delta F_k$
The further corrected solution is
$x_{k+1,c} = x_{k+1} - D_k$
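The estimator and the correction term can be sketched for a small dense system as follows. The slides do not give the value of the scalar coefficient written here as `gamma`, so it is left as a parameter, and the value in the example is a placeholder. The matrices, the nonlinear term `F`, and the "previous" and "next" states are hypothetical. The dense $\varphi$ formulas assume $h_k J_k$ is nonsingular.

```python
import numpy as np
from scipy.linalg import expm

def phi1(A):
    """phi_1(A) = A^{-1} (e^A - I); dense formula, assumes A is nonsingular."""
    return np.linalg.solve(A, expm(A) - np.eye(A.shape[0]))

def phi2(A):
    """phi_2(A) = A^{-1} (phi_1(A) - I)."""
    return np.linalg.solve(A, phi1(A) - np.eye(A.shape[0]))

def nonlinear_error(h, J, C, dF):
    """Error estimator vector: phi_1(h J) C^{-1} dF."""
    return phi1(h * J) @ np.linalg.solve(C, dF)

def erc_correct(x_next, h, J, C, dF, gamma):
    """ER-C: x_{k+1,c} = x_{k+1} - D_k, D_k = gamma*h*phi_2(h J) C^{-1} dF."""
    D = gamma * h * phi2(h * J) @ np.linalg.solve(C, dF)
    return x_next - D

# Hypothetical two-node example.
C = np.array([[2.0, 0.0], [0.0, 1.0]])
G = np.array([[3.0, -1.0], [-1.0, 2.0]])
J = -np.linalg.solve(C, G)            # J_k = -C_k^{-1} G_k
F = lambda x: -0.1 * x**3             # made-up nonlinear term
x_k = np.array([1.0, 0.5])
x_next = np.array([0.9, 0.45])        # stand-in for an ER step result
h = 0.1

dF = F(x_next) - F(x_k)               # the same dF serves both formulas
err = nonlinear_error(h, J, C, dF)
x_corr = erc_correct(x_next, h, J, C, dF, gamma=1.0)  # gamma is a placeholder
print(np.linalg.norm(err), x_corr)
```

The point of the reuse is that $\Delta F_k$ is evaluated once per step. When the circuit is linear, $\Delta F_k = 0$ and both the estimate and the correction vanish.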
Krylov Method for MEVP $e^{J}v$
• $e^{J}v$: Matrix Exponential and Vector Product (MEVP) via standard Krylov subspace [Weng12]
$K_m(J, v) := \mathrm{span}\{v, Jv, J^2 v, \dots, J^{m-1} v\}$
– Arnoldi process and matrix reduction:
$J V_m = V_m H_m + h_{m+1,m}\, v_{m+1} e_m^T$
• MEVP is computed by
$e^{J}v \approx \|v\|_2 V_m e^{H_m} e_1$
• Explicit feature: time stepping only by scaling $H_m$ with $h$,
$e^{hJ}v \approx \|v\|_2 V_m e^{hH_m} e_1$
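The Arnoldi reduction and the scaling property can be sketched as below. This is a small dense illustration with modified Gram-Schmidt orthogonalization. The matrix `J`, the vector `v`, the subspace size and the helper names `arnoldi` and `mevp` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi(matvec, v, m, tol=1e-12):
    """Arnoldi process: returns (V_m, H_m, ||v||_2) for the operator `matvec`."""
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = matvec(V[:, j])
        for i in range(j + 1):          # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < tol:           # breakdown: the subspace is invariant
            return V[:, :j + 1], H[:j + 1, :j + 1], beta
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m], beta

def mevp(V, H, beta, h):
    """e^{hJ} v ~= ||v||_2 V_m e^{h H_m} e_1; only the small H_m is scaled by h."""
    e1 = np.zeros(H.shape[0])
    e1[0] = 1.0
    return beta * V @ (expm(h * H) @ e1)

# Hypothetical test matrix: J = -G for a resistor chain with unit capacitors.
n = 50
J = -(np.diag(2.1 * np.ones(n))
      - np.diag(np.ones(n - 1), 1)
      - np.diag(np.ones(n - 1), -1))
v = np.linspace(1.0, 2.0, n)

V, H, beta = arnoldi(lambda x: J @ x, v, m=30)
y_full = mevp(V, H, beta, h=1.0)   # one basis ...
y_half = mevp(V, H, beta, h=0.5)   # ... reused for a different step size
```

Changing the step size only requires a new exponential of the small matrix $hH_m$, so the basis $V_m$ is built once per expansion point.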
Standard Krylov subspace
(a) Standard Krylov basis [Weng12]
$K_m(J, v) := \mathrm{span}\{v, Jv, J^2 v, \dots, J^{m-1} v\}$
[Figure: spectrum of $J = -C^{-1}G$ in the complex plane (axes Re and Im, origin 0), with two groups marked: eigenvalues of $J$ with small magnitude of Re and eigenvalues of $J$ with large magnitude of Re]
• The standard Krylov subspace "likes" the eigenvalues with large magnitude of Re
• The eigenvalues with small magnitude of Re define the major dynamical behavior
• They demand more bases to characterize
Invert Krylov subspace
Invert Krylov basis [Zhuang et al., DAC14]
$K_m(J^{-1}, v) := \mathrm{span}\{v, J^{-1}v, J^{-2}v, \dots, J^{-m+1}v\}$
[Figure: spectrum of $J^{-1}$ shown next to the spectrum of $J$ (axes Re and Im, origin 0), with the same two eigenvalue groups marked]
• The invert Krylov subspace method captures the "important" eigenvalues in the original spectrum
Simple Matrix Factorization Target
The invert Krylov subspace approach transfers $J = -C^{-1}G$ to $J^{-1} = -G^{-1}C$
At each iteration, we generate the invert Krylov subspace basis
$V_m = [v_1, v_2, \cdots, v_m]$
by solving
$-G\,v = C\,v_{m-1}$
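A minimal dense sketch of this idea follows. $G$ is factorized once, and each new basis vector costs one pair of triangular solves with that factorization. The final projection $e^{hJ}v \approx \|v\|_2 V_m e^{h (H'_m)^{-1}} e_1$, where $H'_m$ is the Hessenberg matrix of $J^{-1}$, is an assumption about how the basis is used and is not written on the slide. The test circuit and the function name `invert_krylov_mevp` are hypothetical.

```python
import numpy as np
from scipy.linalg import expm, lu_factor, lu_solve

def invert_krylov_mevp(C, G, v, h, m, tol=1e-12):
    """Approximate e^{hJ} v, J = -C^{-1} G, from the subspace K_m(J^{-1}, v)."""
    lu_G = lu_factor(G)                     # the only factorization: G itself
    n = v.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))                # Hessenberg matrix of J^{-1}
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    k = m
    for j in range(m):
        # w = J^{-1} v_j, obtained by solving -G w = C v_j
        w = lu_solve(lu_G, -(C @ V[:, j]))
        for i in range(j + 1):              # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < tol:               # breakdown: subspace is invariant
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    H_inv = np.linalg.inv(H[:k, :k])        # assumed projection of J
    e1 = np.zeros(k)
    e1[0] = 1.0
    return beta * V[:, :k] @ (expm(h * H_inv) @ e1)

# Hypothetical circuit: resistor chain with conductances spread over two
# decades and unit capacitors. A uniform C keeps J symmetric in this toy case.
rng = np.random.default_rng(0)
n = 20
g = rng.uniform(1.0, 100.0, size=n - 1)
G = (np.diag(np.r_[g, 0.0] + np.r_[0.0, g] + 1.0)
     - np.diag(g, 1) - np.diag(g, -1))
C = np.eye(n)
v = rng.standard_normal(n)
h = 0.05

approx = invert_krylov_mevp(C, G, v, h, m=n)
exact = expm(-h * np.linalg.solve(C, G)) @ v
print(np.linalg.norm(approx - exact))
```

The example uses the full subspace (m = n), so the result should match the dense reference up to rounding. In practice m is much smaller than n, and the accuracy for a given m depends on the spectrum.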
Overall Framework
• ER-C: further improve the solution
• No Newton-Raphson
• Built upon exponential integrators
• Explicit method for DAE solver
• Adjust error by step size control
Experimental Results
• Implemented in MATLAB 2013a & C/C++ (GCC 4.7.3)
– Open-source BSIM3 device model in C
– MATLAB Executable (MEX) external interface between device evaluation and matrix solvers
• Linux workstation
– Intel i7 CPU, 3.4 GHz
– 32 GB memory
– Single-thread mode
Runtime Performance
[Table of test circuits not captured in the transcript. Column legend:]
• #Dev.: the number of devices
• nnzC & nnzG: the number of non-zero elements in linear C and G
• #step: the number of steps for transient simulation
For each time step,
• #NRa: the average number of NR iterations
• #ma: the average dimension of the invert Krylov subspace
• RT(s): the runtime
• SP: the runtime speedup
Conclusions and Future Directions
Accelerate SPICE-level time-domain simulation
• Exponential integrators
• Stable explicit formulation
• $e^{J}v$ with invert Krylov subspace & less expensive matrix factorizations
• Handles tasks even when traditional methods fail
Future directions:
• Parallelism: can be accelerated further by multicore/many-core computing systems
• Many derivatives & tools can be built upon it