HiCap: A Fast Hierarchical Algorithm for 3D Capacitance Extraction
Weiping Shi
Department of Computer Science
University of North Texas
Outline
Introduction
Previous Research
Integral Equation & N-Body Problem
New Algorithm
Experimental Results
Conclusion
Future Work
Introduction
Capacitance Extraction: Given a set of conductors in 3-D space, compute the capacitance between all pairs of conductors.
[Figure: conductors carrying + and - charges with 1 V applied, so that C = Q.]
Signal delay = gate delay + interconnect delay
Interconnect delay is caused by RC (resistance and capacitance) parasitics.
[Figure: RC model of an interconnect, with resistance R and capacitances C.]
Interconnect delay dominates gate delay in deep sub-micron VLSI.
[Figure: Delay (ps, axis 0 to 45) versus technology generation (micron: 0.85, 0.5, 0.35, 0.25, 0.18, 0.13, 0.11). Series: Gate; Interconnect (Al+SiO2); Interconnect (Cu+low-k); Sum (Al+SiO2); Sum (Cu+low-k).]
Importance in VLSI
Fast and accurate capacitance extraction is crucial in the design and verification of VLSI circuits and packaging.
Current 3D tools are too slow: FastCap, Raphael, QuickCap, etc.
2D/2.5D/quasi-3D tools use 3D engines to generate their libraries, so their accuracy depends on the 3D engines: Dracula, HyperExtract, Arcadia, Fire&Ice, Star-RC, Columbus, etc.
For critical nets and clock trees, 3D accuracy is necessary.
Importance in MEMS
Accurate capacitance extraction of complex 3-D structures is also important in the design of MEMS (MicroElectroMechanical Systems).
Design of most motion sensors needs an accurate estimate of capacitance.
Design of most drivers needs to solve a similar potential problem.
A recent ARPA report estimates the market for these applications at 1 to 3 billion dollars by 2004.
[Figure: Enlarged comb driver.]
Previous Research
Differential Maxwell equations (finite difference method or finite element method): Raphael field solver.
Integral Laplace equation (boundary element method):
Multipole algorithm: FastCap by Nabors & White. O(N) time. Kernel dependent.
Pre-corrected FFT algorithm by Phillips & White. O(N log N) time. Kernel independent.
SVD algorithm: IES3 by Kapur & Long. O(N log N) time. Kernel independent.
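Integral Equation Approach
$$\phi(x) = \int_{\text{surfaces}} G(x, x')\, \sigma(x')\, da'$$
where $\phi(x)$ is the known surface potential,
$\sigma(x')$ is the charge density,
$da'$ is an incremental conductor surface area,
$x'$ is on $da'$,
$G(x, x')$ is the kernel (in free space, $1/\|x - x'\|$ up to a constant factor).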
Partition conductor surfaces into N panels and assume uniform charge density on each panel. Then we have a linear system:
Pq = v
where P is an NxN matrix of potential coefficients,
q is an N-vector of panel charges,
v is an N-vector of known panel potentials.
Each entry $p_{ij}$ of the potential coefficient matrix P represents the potential at panel $A_i$ due to a unit charge on panel $A_j$.
The solution q of the linear system Pq = v gives the capacitance.
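As a concrete illustration of the dense formulation, the sketch below builds P for a single isolated square plate, solves Pq = v directly, and sums the panel charges to get the capacitance. It is not HiCap's method: the function names are hypothetical, distinct panels are assumed to interact like point charges at their centers, and the diagonal uses the closed-form potential at the center of a uniformly charged square. The O(N^2) storage and O(N^3) solve here are exactly the cost that the rest of the talk sets out to avoid.

```python
import math

EPS0 = 8.854187817e-12  # vacuum permittivity in F/m


def build_potential_matrix(n, side=1.0):
    """Potential coefficients for a square plate cut into n x n equal panels."""
    a = side / n  # panel edge length
    centers = [((i + 0.5) * a, (j + 0.5) * a) for i in range(n) for j in range(n)]
    # Potential at the center of a uniformly charged square carrying unit charge.
    p_self = math.log(1.0 + math.sqrt(2.0)) / (math.pi * EPS0 * a)
    P = []
    for (xi, yi) in centers:
        row = []
        for (xj, yj) in centers:
            r = math.hypot(xi - xj, yi - yj)
            # Assumption: distinct panels act like point charges at their centers.
            row.append(p_self if r == 0.0 else 1.0 / (4.0 * math.pi * EPS0 * r))
        P.append(row)
    return P


def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def plate_capacitance(n, side=1.0):
    """Capacitance of the isolated plate: total charge when it is held at 1 V."""
    P = build_potential_matrix(n, side)
    v = [1.0] * (n * n)       # every panel of the conductor sits at 1 V
    q = solve_dense(P, v)     # panel charges
    return sum(q)


c_coarse = plate_capacitance(1)  # one panel for the whole plate
c_fine = plate_capacitance(4)    # 16 panels
```

With a single panel the result is simply 1/p_11, about 31.6 pF for a 1 m plate; refining the mesh raises the estimate, which is why panel count (and hence solver cost) matters.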
Challenge
Partition the conductor surfaces into N panels,
calculate and store the dense NxN matrix P, and
solve the linear system Pq = v,
all in O(N) time?
N-body Problem
N-body Problem: Given N particles in 3D space, compute all forces between the particles.
Hierarchical algorithm (Appel 85). O(N) time (Esselink). Radiosity (Hanrahan, Salzman & Aupperle).
Multipole algorithm (Greengard & Rokhlin 87). O(N) time. FastCap.
Appel’s Key Ideas
For practical purposes, forces acting on a particle need only be calculated to within the given precision.
The force due to a cluster of particles at some distance can be approximated with a single term.
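The second idea can be checked numerically. The sketch below uses potentials rather than forces (an assumption made here to match the capacitance setting), omits the constant factor 1/(4*pi*eps0), and replaces a cluster of 200 unit charges by one term: the total charge placed at the center of charge. The cluster, the evaluation point, and the function names are all made up for illustration.

```python
import math
import random


def exact_potential(charges, target):
    """Sum q / r over every particle in the cluster."""
    return sum(q / math.dist(pos, target) for pos, q in charges)


def cluster_potential(charges, target):
    """Single-term approximation: total charge placed at the center of charge."""
    total = sum(q for _, q in charges)
    center = tuple(sum(q * pos[k] for pos, q in charges) / total for k in range(3))
    return total / math.dist(center, target)


rng = random.Random(0)  # fixed seed so the run is repeatable
cluster = [((rng.random(), rng.random(), rng.random()), 1.0) for _ in range(200)]
far_point = (50.0, 0.0, 0.0)  # far away compared with the unit-size cluster

exact = exact_potential(cluster, far_point)
approx = cluster_potential(cluster, far_point)
rel_error = abs(exact - approx) / exact
```

The relative error shrinks as the evaluation point moves away from the cluster, so a given precision fixes how close a cluster may be before it has to be split into smaller clusters.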
Outline of New Algorithm
Adaptively partition conductor surfaces into small panels according to a user-supplied error bound Pe.
Approximate potential coefficient matrix P and store it in a hierarchical data structure of size O(N).
The data structure permits O(N) time matrix-vector product Px for any N-vector x.
Solve linear system Pq = v using iterative methods.
Adaptive Panel Partition
If the potential coefficient estimate between two panels is greater than Pe, then partition the panels. Otherwise, record the coefficient.
[Figure: adaptive partition of conductor surfaces into a hierarchy of panels labeled A through N.]
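The refinement rule above can be sketched as a short recursion. Everything specific here is an assumption for illustration rather than HiCap's actual code: panels are strips on two parallel conductors, the coefficient estimate is the larger panel's length divided by the center distance, only the larger panel of a pair is split, and a minimum length guards against endless recursion.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Panel:
    lo: float   # start of the strip along x
    hi: float   # end of the strip along x
    y: float    # height of the conductor carrying this strip
    children: list = field(default_factory=list)
    links: list = field(default_factory=list)  # (other panel, estimate) pairs

    @property
    def length(self):
        return self.hi - self.lo

    @property
    def center(self):
        return (0.5 * (self.lo + self.hi), self.y)


def estimate(a, b):
    """Hypothetical estimate: larger panel length over center distance."""
    (ax, ay), (bx, by) = a.center, b.center
    dist = ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
    return max(a.length, b.length) / dist


def split(p):
    """Create the two halves of a panel once and reuse them afterwards."""
    if not p.children:
        mid = 0.5 * (p.lo + p.hi)
        p.children = [Panel(p.lo, mid, p.y), Panel(mid, p.hi, p.y)]
    return p.children


def refine(a, b, pe, min_length=1e-3):
    """Record a link if the estimate is at most pe, otherwise split and recurse."""
    est = estimate(a, b)
    larger, other = (a, b) if a.length >= b.length else (b, a)
    if est <= pe or larger.length <= min_length:
        a.links.append((b, est))
        b.links.append((a, est))
        return
    for child in split(larger):
        refine(child, other, pe, min_length)


def leaves(p):
    """Panels that were never split."""
    return [p] if not p.children else [q for c in p.children for q in leaves(c)]


def all_links(p):
    """Every link stored anywhere in the hierarchy rooted at p."""
    found = list(p.links)
    for c in p.children:
        found.extend(all_links(c))
    return found


top = Panel(0.0, 1.0, 1.0)
bottom = Panel(0.0, 1.0, 0.0)
refine(top, bottom, pe=0.6)
n_panels = len(leaves(top)) + len(leaves(bottom))
n_links = len(all_links(top))  # each link is stored once on each side
```

A smaller `pe` forces deeper subdivision where panels are close and leaves distant pairs linked at coarse levels, which is what keeps the number of recorded coefficients small.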
Coefficient Matrix Representation
[Figure: panel hierarchies rooted at A (panels B to G) and H (panels I to N).]
Entries of P are stored in a hierarchical data structure as links.
[Figure: the panel hierarchies and the corresponding matrix with block entries.]
It can be shown the matrix contains O(N) block entries, where N is the number of panels.
If expanded explicitly, the matrix would contain NxN entries.
If panel sizes were uniform, the matrix would be much larger than NxN.
Matrix-Vector Product Px
Compute charge for all panels in O(N) time.
Compute potential for all panels in O(N) time.
Distribute potential to leaf panels in O(N) time.
[Figure, repeated for each step: the panel hierarchies rooted at A (B C, D E, F G) and H (I J, K L, M N).]
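The three passes can be sketched on a toy hierarchy. The `Node` class, the helper names, and the coefficient values below are hypothetical; the sketch assumes symmetric coefficients and that a link between two internal panels stands for the whole block of leaf-to-leaf entries beneath them.

```python
class Node:
    """A panel in the hierarchy; leaves are the panels that carry unknowns."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.links = []       # (source node, coefficient) pairs
        self.charge = 0.0
        self.potential = 0.0


def link(a, b, coeff):
    """Store one block entry of P; a symmetric coefficient is assumed."""
    a.links.append((b, coeff))
    if a is not b:
        b.links.append((a, coeff))


def leaves(node):
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def gather_charge(node):
    """Pass 1 (bottom-up): a parent's charge is the sum of its children's."""
    if node.children:
        node.charge = sum(gather_charge(c) for c in node.children)
    return node.charge


def apply_links(node):
    """Pass 2: potential at each node from the charges of its linked nodes."""
    node.potential = sum(coeff * src.charge for src, coeff in node.links)
    for c in node.children:
        apply_links(c)


def push_potential(node, inherited=0.0):
    """Pass 3 (top-down): add the ancestors' potential to every descendant."""
    node.potential += inherited
    for c in node.children:
        push_potential(c, node.potential)


def matvec(roots, x):
    """Return P x, with x and the result ordered like the leaves of the roots."""
    leaf_list = [leaf for r in roots for leaf in leaves(r)]
    for leaf, value in zip(leaf_list, x):
        leaf.charge = value
    for r in roots:
        gather_charge(r)
    for r in roots:
        apply_links(r)
    for r in roots:
        push_potential(r)
    return [leaf.potential for leaf in leaf_list]


B, C, I, J = Node("B"), Node("C"), Node("I"), Node("J")
A, H = Node("A", [B, C]), Node("H", [I, J])
for leaf in (B, C, I, J):
    link(leaf, leaf, 2.0)   # self terms (made-up values)
link(B, C, 0.5)
link(I, J, 0.5)
link(A, H, 0.1)             # one link stands for the whole 2 x 2 far block

y = matvec([A, H], [1.0, 2.0, 3.0, 4.0])
```

Each pass visits every node and every link once, so the cost is proportional to the number of nodes plus links rather than to the N x N entries of the expanded matrix.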
Solving Linear Systems
Use iterative methods such as GMRES or MINRES.
Each iteration requires a matrix-vector product Px and can be completed in O(N) time.
Number of iterations needed is very small, normally 10-20 regardless of N.
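The point that matters for the algorithm is that an iterative solver touches P only through products Px. The sketch below swaps in conjugate gradients, which is not one of the methods named above; it is shorter to write than GMRES or MINRES and is valid only because the small made-up matrix used here is symmetric positive definite. The matrix, the voltages, and the function names are assumptions for illustration.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve P q = b using only products P x; returns (q, iterations used)."""
    n = len(b)
    q = [0.0] * n
    r = list(b)               # residual b - P q for the initial guess q = 0
    d = list(r)               # search direction
    rr = sum(ri * ri for ri in r)
    b_norm = rr ** 0.5
    if b_norm == 0.0:
        return q, 0
    for it in range(1, max_iter + 1):
        Pd = matvec(d)        # the only place the matrix is used
        alpha = rr / sum(di * pi for di, pi in zip(d, Pd))
        q = [qi + alpha * di for qi, di in zip(q, d)]
        r = [ri - alpha * pi for ri, pi in zip(r, Pd)]
        rr_new = sum(ri * ri for ri in r)
        if rr_new ** 0.5 <= tol * b_norm:
            return q, it
        d = [ri + (rr_new / rr) * di for ri, di in zip(r, d)]
        rr = rr_new
    return q, max_iter


# Made-up 4-panel system: panels 0-1 form one conductor, panels 2-3 another.
P = [[2.0, 0.5, 0.1, 0.1],
     [0.5, 2.0, 0.1, 0.1],
     [0.1, 0.1, 2.0, 0.5],
     [0.1, 0.1, 0.5, 2.0]]


def dense_matvec(x):
    """Stand-in for the hierarchical product; any routine returning P x works."""
    return [sum(pij * xj for pij, xj in zip(row, x)) for row in P]


v = [1.0, 1.0, 0.0, 0.0]      # first conductor at 1 V, second grounded
q, iterations = conjugate_gradient(dense_matvec, v)
self_term = q[0] + q[1]       # total charge on the conductor held at 1 V
coupling_term = q[2] + q[3]   # charge induced on the grounded conductor
```

Summing the panel charges per conductor for this right-hand side gives one column of the capacitance matrix; repeating with each conductor in turn at 1 V fills in the rest.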
Error and Complexity
Error of approximation can be controlled by the user-supplied error bound Pe.
Time complexity is O(N) because each of the above steps is O(N).
Experimental Results
Test examples: Bus crossing 2x2, 3x3, …, 6x6. In commercial tools, thousands of these crossings will be computed to build the library.
[Figure: 2x2 bus crossing.]
Previous 3D Algorithms
FastCap, expansion order 2 (assumed accurate).
FastCap expansion order 0.
Pre-corrected FFT. 40% faster than FastCap(2) and uses 1/4 of memory of FastCap(2).
IES3. 60% faster than FastCap(2) and uses 1/5 of memory of FastCap(2).
[Figure: CPU time in seconds (axis 0 to 250) for bus crossings 2x2 through 6x6. Series: FastCap(2), FastCap(0), New.]
40 - 100 times faster than FastCap(2), 14 - 40 times faster than FastCap(0).
[Figure: Memory in MB (axis 0 to 100) for bus crossings 2x2 through 6x6. Series: FastCap(2), FastCap(0), New.]
1/60 - 1/100 of memory of FastCap(2), 1/80 - 1/280 of memory of FastCap(0).
[Figure: Error with respect to FastCap(2) (axis 0% to 10%) for bus crossings 2x2 through 6x6. Series: FastCap(0), New.]
Less than 2.7% error with respect to FastCap(2), 3 times more accurate than FastCap(0).
Conclusion
A new algorithm significantly faster than previous best algorithms. It provides the possibility for 3D extraction of clock trees and critical nets. It can also be used to generate libraries for commercial 2D/2.5D tools.
Kernel independent. Can be applied to multi-layered dielectrics.
Adaptive refinement scheme produces good partition of conductor surfaces.
Hierarchical data structure is much more efficient than previous data structures.
Future Research
Capacitance extraction: high-order basis functions; bottom-up construction of the hierarchy; full-chip and critical net extraction.
Inductance extraction: FastHenry is too slow; no commercial tool for mutual inductance.
Variational Parasitic Extraction
MEMS application