DAC15 slides by Hao Zhuang and Chung-Kuan Cheng at UC San Diego

An Algorithmic Framework of Large-Scale Circuit Simulation Using Exponential Integrators
Hao Zhuang¹, Wenjian Yu², Ilgweon Kang¹, Xinan Wang¹, and Chung-Kuan Cheng¹
1. University of California, San Diego
2. Tsinghua University

Upload: hao-zhuang

Post on 05-Aug-2015



Outline
• Motivation & Contributions
• Background of time-domain circuit simulation
• Our algorithmic framework
• Exponential integrators
• Invert Krylov subspace method
• Experimental results
• Conclusions & future directions

Motivation
• SPICE
– critical to a wide range of ICs
• Modern IC
– billions of transistors
– complex interconnects
• Requirement:
– new structures, e.g., FinFET, 3D
– strongly coupled
– post-layout effects
– capability & accuracy
• Simulation runtime
– Long or ∞
[Figure credits: Dick Sites, “Datacenter Computers: modern challenges in CPU design,” Google Inc., 2015 & Intel i7; Synopsys Inc., Technology Update, Issue 3, 2012, “FinFET: The Promises and the Challenges”]

Contributions
• Exponential integration
– Stable, explicit
– No Newton-Raphson
• Target of matrix factorization: conductance matrix $G$ ONLY
– Less expensive
• Handling tasks (even when traditional schemes FAIL)
– large-scale, strongly coupled, post-layout
A promising framework

Basic & BENR as An Example (1)
• Differential Equations
• BE: Backward Euler
[Equation annotations: capacitance (/inductance), conductance (/incidence), time step, input, nonlinear devices dynamics]
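The equations on this slide are not in the transcript. A plausible reconstruction, assuming the circuit is written in the same form that the later ER slides use ($J = -C^{-1}G$, nonlinear term $F$, input $Bu(t)$) and a fixed time step $h$, is sketched below. The mapping of the annotations is an assumption: $C$ is capacitance (/inductance), $G$ is conductance (/incidence), $h$ is the time step, $Bu(t)$ is the input, and $F(x)$ is the nonlinear device dynamics.

```latex
% Circuit differential equation (assumed form)
C\,\frac{dx(t)}{dt} = -G\,x(t) + F(x(t)) + B\,u(t)

% Backward Euler: replace dx/dt by (x_{k+1} - x_k)/h and evaluate the right side at t_{k+1}
\left(\frac{C}{h} + G\right) x_{k+1} = \frac{C}{h}\,x_k + F(x_{k+1}) + B\,u(t_{k+1})
```

In this form the matrix to factorize is $C/h + G$, which is the matrix whose LU factors the post-layout slides compare against $lu(G)$.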

Basic & BENR as An Example (2)-(3)
• NR: Newton-Raphson
• BENR: Backward Euler + Newton-Raphson iterations
[Equation annotations: Jacobian matrix, capacitance matrix]
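To make the BENR loop concrete, here is a minimal scalar sketch, not the authors' implementation. The one-node circuit (capacitor, resistor, and diode driven by a current step), all element values, and the function names are hypothetical choices for illustration. Each time step solves the backward Euler equation with Newton-Raphson, and the scalar Jacobian `C/h + df(x)` plays the role of the matrix that BENR must factorize in every iteration.

```python
import math

# Hypothetical one-node circuit: capacitor C to ground, a resistor (conductance G)
# and a diode to ground, driven by a current source u(t).
C, G = 1e-9, 1e-3            # farads, siemens
IS, VT = 1e-14, 0.02585      # diode saturation current (A), thermal voltage (V)

def f(x):
    """Static current leaving the node: resistor plus diode."""
    return G * x + IS * (math.exp(x / VT) - 1.0)

def df(x):
    """Derivative of f, the small-signal conductance at x."""
    return G + (IS / VT) * math.exp(x / VT)

def u(t):
    """Input current: a constant 1 mA step applied at t = 0."""
    return 1e-3

def benr_step(x_k, t_k, h, tol=1e-12, max_iter=50):
    """One backward Euler step solved by Newton-Raphson.

    Residual: T(x) = C*(x - x_k)/h + f(x) - u(t_k + h)
    Jacobian: dT/dx = C/h + df(x)
    Returns the new state and the number of NR iterations used.
    """
    x = x_k
    for it in range(1, max_iter + 1):
        residual = C * (x - x_k) / h + f(x) - u(t_k + h)
        jac = C / h + df(x)
        dx = -residual / jac
        x += dx
        if abs(dx) < tol:
            return x, it
    raise RuntimeError("Newton-Raphson did not converge")

def simulate(t_end=2e-5, h=1e-7):
    """Run fixed-step BENR from x = 0; return final state and average NR count."""
    n_steps = round(t_end / h)
    x, total_iters = 0.0, 0
    for k in range(n_steps):
        x, iters = benr_step(x, k * h, h)
        total_iters += iters
    return x, total_iters / n_steps

x_final, avg_nr = simulate()
print(f"x(t_end) = {x_final:.6f} V, average NR iterations per step = {avg_nr:.2f}")
```

The average iteration count printed at the end corresponds to the #NRa column described on the runtime slide.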

Matrix Exponential Method
• Our previous attempt [Weng12]
• It also uses NR
[Equation annotations: the Jacobian matrix, capacitance matrix]

Matrices from a Post-Layout Case
• $C$, $G$ matrices from FreeCPU [Zhang, Yu TCAD 2013]
• nnz: non-zero terms
[Figures: sparsity patterns of $G$ and $C$, and of the $L$ and $U$ factors of $lu(C)$, $lu(C/h + G)$, and $lu(G)$]
In this example, $lu(G)$
• contains fewer nnz (~10%) &
• has less complicated nnz distributions

Our proposed framework
• Traditional methods are all challenged by $C$, when $C$ is complicated
• Two techniques:
– ER: Exponential Rosenbrock formulation
– Invert Krylov subspace to compute $e^{J}v$
• Computational advantages
– Simple matrix factorization target: exploit the feature of $lu(G)$
– Stable explicit method to solve circuit system

ER: Exponential Rosenbrock
Start from
$\frac{dx(t)}{dt} = g(x, u, t)$
• The next time step solution [Hochbruck, et al. SIAM09]
$x_{k+1} = x_k + h_k \phi_1(h_k J_k)\, g(x_k, u, t_k) + h_k^2 \phi_2(h_k J_k)\, b_k$
where $J_k = \partial g / \partial x$, $b_k = \partial g / \partial t$,
$\phi_1(h_k J_k) = (e^{h_k J_k} - I_n) / (h_k J_k)$
$\phi_2(h_k J_k) = (e^{h_k J_k} - I_n) / (h_k^2 J_k^2) - I_n / (h_k J_k)$
Exponential integrators: proved to be stable, explicit, and high-order accurate for ODEs
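The ER update can be checked on a scalar problem, where $J_k$ is a number and $I_n = 1$. The sketch below is illustrative only: the function names and the test equation $dx/dt = ax + c + dt$ are assumptions chosen because the step is exact for a right-hand side that is linear in both $x$ and $t$, so the result can be compared with the closed-form solution.

```python
import math

def phi1(z):
    """phi_1(z) = (e^z - 1)/z, with the limit 1 at z = 0."""
    return math.expm1(z) / z if z != 0.0 else 1.0

def phi2(z):
    """phi_2(z) = (e^z - 1)/z^2 - 1/z = (e^z - 1 - z)/z^2, with the limit 1/2 at z = 0."""
    if abs(z) < 1e-4:
        return 0.5 + z / 6.0 + z * z / 24.0   # Taylor series avoids cancellation
    return (math.expm1(z) - z) / (z * z)

def er_step(g, dg_dx, dg_dt, x, t, h):
    """One exponential Rosenbrock step for the scalar ODE dx/dt = g(x, t)."""
    J = dg_dx(x, t)     # J_k
    b = dg_dt(x, t)     # b_k
    z = h * J
    return x + h * phi1(z) * g(x, t) + h * h * phi2(z) * b

# Hypothetical test problem: dx/dt = a*x + c + d*t
a, c, d = -2.0, 1.0, 3.0
g = lambda x, t: a * x + c + d * t
dg_dx = lambda x, t: a
dg_dt = lambda x, t: d

def exact(t, x0):
    """Closed-form solution with x(0) = x0."""
    beta = -d / a
    alpha = (beta - c) / a
    return alpha + beta * t + (x0 - alpha) * math.exp(a * t)

x0, h = 0.5, 0.5
x1 = er_step(g, dg_dx, dg_dt, x0, 0.0, h)
x1_exact = exact(h, x0)
print(x1, x1_exact)
```

No nonlinear solve appears anywhere in the step, which is the sense in which the slide calls the method explicit.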

ER in Circuit Simulation
Chain rule:
$\frac{dq(x(t))}{dx} \frac{dx(t)}{dt} = Bu(t) - f(x)$
where
$\frac{dq(x(t))}{dx} = C(x(t)) = C_k$, $J_k = -C_k^{-1} G_k$,
$g_k = J_k x_k + C_k^{-1}\left(F_k + Bu(t)\right)$, $b_k = C_k^{-1} \frac{Bu(t_{k+1}) - Bu(t_k)}{h_k}$
We have ALL the components to obtain $x_{k+1}$
$x_{k+1}(h_k) = x_k + h_k \phi_1(h_k J_k)\, g(x_k, u, t_k) + h_k^2 \phi_2(h_k J_k)\, b_k$
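The step from the chain rule to $g_k$ and $J_k$ is compressed on the slide. One way to fill it in, under the assumption that $G_k$ is the linearization of $f$ at $x_k$ and $F$ is the nonlinear remainder (this split is not spelled out in the transcript), is:

```latex
\begin{aligned}
C_k \frac{dx}{dt} &= Bu(t) - f(x),
  \qquad f(x) = G_k x - F(x), \quad G_k = \left.\frac{\partial f}{\partial x}\right|_{x_k}
  \quad \text{(assumed split)} \\
\frac{dx}{dt} &= -C_k^{-1} G_k\, x + C_k^{-1}\bigl(F(x) + Bu(t)\bigr) = g(x, u, t) \\
g_k &= J_k x_k + C_k^{-1}\bigl(F_k + Bu(t_k)\bigr), \qquad J_k = -C_k^{-1} G_k \\
\left.\frac{\partial g}{\partial x}\right|_{x_k}
  &= J_k + C_k^{-1} \left.\frac{\partial F}{\partial x}\right|_{x_k} = J_k
  \qquad \text{since } \left.\frac{\partial F}{\partial x}\right|_{x_k} = G_k - \left.\frac{\partial f}{\partial x}\right|_{x_k} = 0 \\
b_k &= \frac{\partial g}{\partial t} \approx C_k^{-1}\, \frac{Bu(t_{k+1}) - Bu(t_k)}{h_k}
\end{aligned}
```

With this split, $J_k$ agrees with the definition $J_k = \partial g / \partial x$ on the ER slide, and $b_k$ is a finite-difference approximation of the input slope over the step.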

Local Nonlinear Error Control
The local nonlinear error estimator [Caliari09]
$err(x_{k+1}, x_k) = \phi_1(h_k J_k)\, C_k^{-1} \Delta F_k$
where $\Delta F_k = F(x_{k+1}) - F(x_k)$

ER-C: ER with Correction Term
Reuse $\Delta F_k$ to improve the accuracy by adding the extra term
$D_k = \gamma h_k \phi_2(h_k J_k)\, C_k^{-1} \Delta F_k$
The further corrected solution is
$x_{k+1,c} = x_{k+1} - D_k$

Krylov Method for MEVP $e^{J}v$
• $e^{J}v$: Matrix Exponential and Vector Product (MEVP) via standard Krylov subspace [Weng12]
$K_m(J, v) := \mathrm{span}\{v, Jv, J^2 v, \ldots, J^{m-1} v\}$
– Arnoldi process and matrix reduction:
$J V_m = V_m H_m + h_{m+1,m}\, v_{m+1} e_m^{T}$
• MEVP is computed by
$e^{J}v \approx \|v\|_2 V_m e^{H_m} e_1$
• Explicit feature: time stepping only by scaling $H_m$ with $h$,
$e^{hJ}v \approx \|v\|_2 V_m e^{hH_m} e_1$
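The Arnoldi-based MEVP can be sketched with the standard library alone. This is a small dense illustration under stated assumptions, not the authors' code: the helper names, the Taylor-based `expm_dense`, and the 3x3 test matrix are all hypothetical, and a production solver would use sparse matrices and a tuned matrix-exponential routine. The basis is built once, and several step sizes $h$ are then evaluated by scaling $H_m$ only.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def norm2(x):
    return math.sqrt(sum(xi * xi for xi in x))

def expm_dense(A, squarings=10, terms=18):
    """Exponential of a small dense matrix: Taylor series with scaling and squaring."""
    n = len(A)
    scale = 2.0 ** squarings
    S = [[a / scale for a in row] for row in A]
    E = [[float(i == j) for j in range(n)] for i in range(n)]   # running sum
    T = [row[:] for row in E]                                   # current term S^k / k!
    for k in range(1, terms + 1):
        T = [[t / k for t in row] for row in matmul(T, S)]
        E = [[e + t for e, t in zip(row_e, row_t)] for row_e, row_t in zip(E, T)]
    for _ in range(squarings):
        E = matmul(E, E)
    return E

def arnoldi(A, v, m):
    """Return (beta, V, H): beta = ||v||_2, V = orthonormal basis vectors, H = Hessenberg matrix."""
    beta = norm2(v)
    V = [[vi / beta for vi in v]]
    H = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = matvec(A, V[j])
        for i in range(j + 1):                      # modified Gram-Schmidt
            H[i][j] = sum(a * b for a, b in zip(V[i], w))
            w = [wk - H[i][j] * vk for wk, vk in zip(w, V[i])]
        if j + 1 == m:
            break
        h_next = norm2(w)
        if h_next < 1e-14:                          # invariant subspace reached early
            H = [row[:j + 1] for row in H[:j + 1]]
            break
        H[j + 1][j] = h_next
        V.append([wk / h_next for wk in w])
    return beta, V, H

def krylov_expmv(beta, V, H, h):
    """Evaluate beta * V_m * exp(h * H_m) * e_1 for a given step size h."""
    E = expm_dense([[h * x for x in row] for row in H])
    coeffs = [beta * row[0] for row in E]           # first column of exp(h H_m), scaled
    n = len(V[0])
    return [sum(c * V[j][i] for j, c in enumerate(coeffs)) for i in range(n)]

# Hypothetical test matrix and vector
J = [[-2.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -2.0]]
v = [1.0, 2.0, 3.0]
beta, V, H = arnoldi(J, v, m=3)                     # basis built once
results = {h: krylov_expmv(beta, V, H, h) for h in (0.1, 0.5)}
for h, y in results.items():
    print(h, y)
```

Here $m$ equals the matrix dimension, so the result matches $e^{hJ}v$ up to rounding; with $m$ smaller than the dimension it is an approximation whose quality depends on the spectrum.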

Standard Krylov subspace
[Figure: spectrum of $J = -C^{-1}G$ in the complex plane (axes Re, Im), marking eigenvalues of $J$ with small magnitude of Re and eigenvalues of $J$ with large magnitude of Re]
(a) Standard Krylov Basis [Weng12]
$K_m(J, v) := \mathrm{span}\{v, Jv, J^2 v, \ldots, J^{m-1} v\}$
Figure callouts:
• the standard Krylov subspace “likes” these eigenvalues
• these eigenvalues define the major dynamical behavior
• demand more bases to characterize

Invert Krylov subspace
[Figure: spectrum of $J$ and spectrum of $J^{-1}$ in the complex plane (axes Re, Im), marking eigenvalues of $J$ with small magnitude of Re and eigenvalues of $J$ with large magnitude of Re]
Invert Krylov subspace method captures “important” eigenvalues in the original spectrum
Invert Krylov Basis [Zhuang, et al. DAC14]
$K_m(J^{-1}, v) := \mathrm{span}\{v, J^{-1}v, J^{-2}v, \ldots, J^{-m+1}v\}$

Simple Matrix Factorization Target
Invert Krylov subspace approach transfers
$J = -C^{-1}G$ to $J^{-1} = -G^{-1}C$
At each iteration, we generate the invert Krylov subspace
$V_m = [v_1, v_2, \cdots, v_m]$
by solving
$-Gw = Cv_{i-1}$
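The basis generation described above can be sketched as an Arnoldi process on $J^{-1} = -G^{-1}C$, where each new direction comes from solving $-Gw = Cv_{i-1}$. The code below is a small dense illustration with hypothetical matrices and helper names. It uses plain Gaussian elimination for clarity; in practice $G$ would be LU-factorized once and the factors reused for every solve, which is the point of making $G$ the only factorization target.

```python
import math

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]     # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def inverted_krylov_basis(C, G, v, m):
    """Arnoldi on J^{-1} = -G^{-1} C; every new direction solves -G w = C v_{i-1}."""
    neg_G = [[-g for g in row] for row in G]
    beta = math.sqrt(dot(v, v))
    V = [[vi / beta for vi in v]]
    H = [[0.0] * m for _ in range(m)]
    for j in range(m):
        w = solve(neg_G, matvec(C, V[j]))            # only G appears in the solve
        for i in range(j + 1):                       # modified Gram-Schmidt
            H[i][j] = dot(V[i], w)
            w = [wk - H[i][j] * vk for wk, vk in zip(w, V[i])]
        h_next = math.sqrt(dot(w, w))
        if j + 1 == m or h_next < 1e-14:
            break
        H[j + 1][j] = h_next
        V.append([wk / h_next for wk in w])
    return beta, V, H

# Hypothetical small C and G matrices
C = [[2.0, -0.5, 0.0], [-0.5, 2.0, -0.5], [0.0, -0.5, 2.0]]
G = [[3.0, -1.0, 0.0], [-1.0, 3.0, -1.0], [0.0, -1.0, 3.0]]
v = [1.0, 0.0, 0.0]
beta, V, H = inverted_krylov_basis(C, G, v, m=2)
print(V)
print(H)
```

Note that $C$ is only multiplied, never factorized or inverted, so a complicated $C$ does not affect the cost of the solves.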

Overall Framework
ER-C: further improve the solution
• No Newton-Raphson
• Build upon exponential integrators
• Explicit method for DAE solver
• Adjust error by step size control
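The loop implied by this framework (take an ER step, estimate the local nonlinear error, shrink the step if needed, never run Newton-Raphson) can be sketched on a scalar circuit. Everything specific here is an assumption for illustration: the normalized element values, the device law $f(x) = x + x^3$, the ramp input, the split $F(x) = G_k x - f(x)$, the tolerance, and the halve/grow-by-1.5 step policy. The error estimate follows the formula on the error-control slide as transcribed.

```python
import math

def phi1(z):
    return math.expm1(z) / z if z != 0.0 else 1.0

def phi2(z):
    if abs(z) < 1e-4:
        return 0.5 + z / 6.0 + z * z / 24.0      # Taylor series near z = 0
    return (math.expm1(z) - z) / (z * z)

C, B = 1.0, 1.0                                   # hypothetical normalized values

def f(x):
    return x + x ** 3                             # static nonlinear device current

def df(x):
    return 1.0 + 3.0 * x * x

def u(t):
    return min(t / 0.5, 1.0)                      # input ramps to 1, then holds

def er_step(x_k, t_k, h):
    """One ER step and its local nonlinear error estimate (no Newton-Raphson)."""
    G_k = df(x_k)                                 # linearized conductance at x_k
    F = lambda x: G_k * x - f(x)                  # nonlinear remainder, dF/dx = 0 at x_k
    J_k = -G_k / C
    g_k = J_k * x_k + (F(x_k) + B * u(t_k)) / C
    b_k = (B * u(t_k + h) - B * u(t_k)) / (h * C)
    z = h * J_k
    x_next = x_k + h * phi1(z) * g_k + h * h * phi2(z) * b_k
    err = abs(phi1(z) * (F(x_next) - F(x_k)) / C)
    return x_next, err

def simulate(t_end=10.0, h=0.5, tol=1e-4):
    """Adaptive transient run; returns final state, accepted and rejected step counts."""
    x, t, accepted, rejected = 0.0, 0.0, 0, 0
    while t_end - t > 1e-12:
        h_try = min(h, t_end - t)
        x_next, err = er_step(x, t, h_try)
        if err > tol:
            h = 0.5 * h_try                       # reject: shrink the step and retry
            rejected += 1
            continue
        x, t = x_next, t + h_try                  # accept the step
        accepted += 1
        h = 1.5 * h_try                           # cautious growth (illustrative choice)
    return x, accepted, rejected

x_final, accepted, rejected = simulate()
print(x_final, accepted, rejected)
```

A rejected step costs only a re-evaluation with a smaller $h$; in the matrix setting the Krylov basis can be reused for that, because changing $h$ only rescales $H_m$.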

Experimental Results
• Implemented in MATLAB R2013a & C/C++ (GCC 4.7.3)
– Open-source BSIM3 device model in C
– MATLAB Executable (MEX) external interface between device evaluation and matrix solvers
• Linux workstation
– Intel i7 CPU, 3.4 GHz
– 32 GB memory
– Single-thread mode

Accuracy
[Accuracy plots are not captured in the transcript.]

Runtime Performance
Test circuits
• #Dev.: the number of devices.
• nnzC & nnzG: the number of non-zero elements in linear $C$ and $G$.
• #step: the number of steps for transient simulation.
For each time step,
• #NRa: the average NR iterations
• #ma: the average dimension of invert Krylov subspace
• RT(s): the runtime.
• SP: the runtime speedup

Conclusions and Future Directions
Accelerate SPICE-level time-domain simulation
• Exponential integrators
• Stable explicit formulation
• $e^{J}v$ w/ invert Krylov subspace & less expensive matrix factorizations.
• Handling tasks even when traditional methods fail.
Future directions:
• Parallelism: can be accelerated further by multicore/many-core computing systems.
• Many derivatives & tools can be built upon it.
Thanks and Q&A