Post on 17-Dec-2015
A Timed-Automaton Based Method for Accurate Computation of Delay in the
Presence of Cross-Talk
Serdar Tasiran, Sunil P. Khatri, Sergio Yovine,
Robert K. Brayton, Alberto L. Sangiovanni-Vincentelli
Department of Electrical Engineering & Computer Sciences
University of California, Berkeley
Overview
Problem: Computing the delay of a combinational circuit.
OUTLINE
Why a new method?
Timed automata
  Input waveforms
  Gate delay models
  Cross-talk models
Computing delay with timed automata
How to fight computational complexity
  A conjunctively-decomposed representation
  Conservative delay computation
Experimental results
Future research
Why a new method?
Before deep-submicron, a “solved problem”
  Devadas, Keutzer, Malik ‘93
  McGeer, Saldanha, Brayton, Sangiovanni ‘93
  Lam, Brayton ‘94
Higher clock speeds
  Fewer levels of logic
  Greater timing accuracy required
Increased effect of parasitics: cross-talk (coupling)
New process technologies, circuit families, dynamic logic, complex gates
  Conventional gate delay models no longer adequate
  Must model new effects at circuit level
Boolean behavior and timing very interdependent
  Delay depends on relative arrival times and values of inputs
What about topological analysis or simulation?
BUT: Number of possible input patterns exponential in # of inputs: For large circuits, infeasible to simulate all patterns. Delay not guaranteed unless all patterns are simulated.
From ICCAD ‘97 tutorial on timing analysis (Devgan et al.)
Topological delay does not account for cross-talk. Assuming worst case cross-talk on all wires is too conservative. Only transistor-level simulation provides desired accuracy
OUR APPROACH
Timed automata serve as delay models for circuit components
  Delay parameters obtained by
    Simulation
    Analytical methods
Formal timing verification used to compute delay
  All patterns covered; delay guaranteed.
[Table from the ICCAD ‘97 tutorial on timing analysis (Devgan et al.), with an added row. OUR METHOD: GOOD, MEDIUM, HEURISTICS, HEURISTICS, YES]
Timed Automata
Clocks (timers): real-valued variables that all increase at the same rate.
For each location:
  an output assignment
  an invariant: a clock predicate
Clock predicate: positive Boolean combination of constraints x ≤ d and x ≥ d.
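[Figure: a component with input i and output o whose delay lies between 2 and 3, next to its timed automaton. The initial location has o = 0. When i = 1 the clock x is reset. The output switches from o = 0 to o = 1 on a transition guarded by 2 ≤ x ≤ 3, with x bounded by 3 while waiting.]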
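The delay element described on this slide can be sketched in Python. This is a minimal illustration, not the authors' implementation: the location names (`idle`, `pending`, `done`) and the helper `output_rise_time` are hypothetical, and only the rising transition is modeled.

```python
from dataclasses import dataclass

INF = float("inf")

@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    lo: float = 0.0      # guard lower bound: lo <= x
    hi: float = INF      # guard upper bound: x <= hi
    reset: bool = False  # whether taking the edge resets clock x

# Location names are hypothetical; outputs and bounds follow the slide.
OUTPUT = {"idle": 0, "pending": 0, "done": 1}
INVARIANT = {"idle": INF, "pending": 3.0, "done": INF}  # x <= bound
ON_INPUT_RISE = Edge("idle", "pending", reset=True)       # taken when i = 1
ON_OUTPUT_RISE = Edge("pending", "done", lo=2.0, hi=3.0)  # 2 <= x <= 3

def output_rise_time(input_rise_time, wait):
    """Absolute time at which o becomes 1 if the automaton waits `wait`
    time units in 'pending', or None if that run is not allowed."""
    x = 0.0  # ON_INPUT_RISE resets the clock
    if x + wait > INVARIANT[ON_OUTPUT_RISE.src]:
        return None  # the invariant x <= 3 would be violated while waiting
    x += wait
    if not (ON_OUTPUT_RISE.lo <= x <= ON_OUTPUT_RISE.hi):
        return None  # the guard 2 <= x <= 3 is not satisfied
    return input_rise_time + wait
```

The invariant forces the automaton to leave `pending` by x = 3 and the guard forbids leaving before x = 2, which together give a delay between 2 and 3.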
Timed Automata as Delay Models
Example: NAND gate
  Determine delay parameters using SPICE simulation
  Construct timed automaton model with these parameters.
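[Figure: NAND gate with inputs a and b, output o, and transistors T1 to T4, next to its timed automaton. Locations carry o = 0 or o = 1. Transitions are labeled with input conditions (a = b = 0; a = 0, b = 1; a = 1, b = 0 or a = b = 0) and reset the clock x. Invariants bound x from above by d1fall,max, d2fall,max, and drise,max; guards bound x from below by d1fall,min, d2fall,min, and drise,min.]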
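A state-dependent delay lookup for the NAND gate might look as follows. This is a sketch under stated assumptions: the numeric values are invented placeholders (the slides obtain them from SPICE), and the rule choosing between the two fall delays is a guess based on the input conditions named in the figure.

```python
# Hypothetical delay parameters in picoseconds, as (min, max) pairs.
PARAMS = {
    "rise": (40.0, 55.0),   # (drise,min, drise,max)
    "fall1": (30.0, 45.0),  # (d1fall,min, d1fall,max)
    "fall2": (35.0, 60.0),  # (d2fall,min, d2fall,max)
}

def nand(a, b):
    return 0 if (a and b) else 1

def delay_window(old, new):
    """Return (d_min, d_max) for the output change caused by the inputs
    going from old=(a, b) to new=(a, b), or None if o does not change."""
    if nand(*old) == nand(*new):
        return None
    if nand(*new) == 1:
        return PARAMS["rise"]
    # Output falls. Assumption: the previous input state selects the fall
    # delay, with (a, b) = (0, 1) using the first and all others the second.
    return PARAMS["fall1"] if old == (0, 1) else PARAMS["fall2"]
```

Keying the window on the previous input state is what a single [dmin, dmax] pair per gate cannot express.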
Timed Automata as Delay Models
Delay of this gate depends on
  Old and new values of a, b, c, d, e
  Relative arrival times of a, b, c, d, e
Modeling this circuit with [dmin, dmax] is too coarse.
Delay models with state are more powerful
  Timed automata can express sophisticated delay models
  SPICE-simulate an individual circuit component exhaustively
  Capture delay information into a timed automaton.
Desired amount of detail can be incorporated into delay model
  Allows complexity-accuracy trade-off
Modeling Cross-talk
[Figure: cross-section of neighboring wires, labeled W, S, H, and T]
As feature sizes shrink
  Wire delays become dominant
  W and S shrink linearly
  T shrinks sub-linearly
  Wire-to-wire capacitance becomes more significant.
Transitions on wires affect the delays of neighboring wires
Timed automaton model obtained by
  Extraction of parasitics from layout
  SPICE simulation for various input patterns
Simple cross-talk model
  One wire switches
  Wires switch in the same direction
  Wires switch in the opposite direction
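[Figure: timed automaton for the simple cross-talk model, with locations stable, one switch, same, and opposite. Leaving stable resets the clock x. The invariants are x ≤ done,max, x ≤ dsame,max, and x ≤ dopposite,max; the guards on the return transitions are done,min ≤ x, dsame,min ≤ x, and dopposite,min ≤ x.]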
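The three cases of the simple cross-talk model can be sketched as a classification of what two neighboring wires do. The delay values below are invented placeholders, and the sketch assumes that both transitions, when present, overlap in time; the timed automaton in the slides tracks that with a clock instead.

```python
# Hypothetical (min, max) wire delays in picoseconds for each case.
WIRE_DELAY = {
    "one": (20.0, 25.0),       # (done,min, done,max)
    "same": (12.0, 16.0),      # (dsame,min, dsame,max)
    "opposite": (30.0, 38.0),  # (dopposite,min, dopposite,max)
}

def classify(wire1, wire2):
    """wire1, wire2: (old, new) logic values of two neighboring wires.
    Returns 'stable', 'one', 'same', or 'opposite'."""
    sw1 = wire1[0] != wire1[1]
    sw2 = wire2[0] != wire2[1]
    if not sw1 and not sw2:
        return "stable"
    if sw1 != sw2:
        return "one"
    # Both switch: compare the directions via the new values.
    return "same" if wire1[1] == wire2[1] else "opposite"

def delay_window(wire1, wire2):
    """Delay interval for the pair, or None when nothing switches."""
    return WIRE_DELAY.get(classify(wire1, wire2))
```

For example, `classify((0, 1), (1, 0))` selects the opposite-direction interval, which is the slowest case in this made-up parameter set.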
Representing Sets of Input Waveforms
Two-vector delay: All inputs are initially stable and then switch simultaneously.
  For each input signal: i = iold, then i = inew when clock = high (the clock x is compared against 0)
Different arrival times
  i = iold, then i = inew at x = arrivei
Asynchronous input
  i = iold, then i = inew for dmin ≤ x ≤ dmax
Floating-mode:
  For each input signal: i = arbitrary, then i = inew when clock = high (the clock x is compared against 0)
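The four input models can be summarized as a switching window plus a set of allowed initial values. The sketch below uses hypothetical mode names and assumes that in the two-vector and floating modes every input switches at time 0; the slides encode each model as a small timed automaton instead.

```python
def switching_window(mode, arrive=None, d_min=None, d_max=None):
    """Return (earliest, latest) switch time of one input signal."""
    if mode in ("two_vector", "floating"):
        return (0.0, 0.0)      # assumption: all inputs switch at time 0
    if mode == "arrival":
        return (arrive, arrive)  # fixed arrival time per input
    if mode == "asynchronous":
        return (d_min, d_max)    # may switch anywhere in the interval
    raise ValueError(f"unknown mode: {mode}")

def initial_values(mode, old):
    """Possible values of the input before it switches."""
    # Floating mode leaves the initial value arbitrary.
    return {0, 1} if mode == "floating" else {old}
```

A set of input waveforms is then one such window and initial-value set per primary input.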
Delay Computation with Timed Automata
GIVEN
  Set of primary input waveforms, represented by timed automaton I.
  A combinational circuit, described as an interconnection of components G1, G2, …, Gk
MUST EXPLORE THE STATE SPACE OF
  Automaton representing primary output waveforms
  F = ∃ (primary inputs, internal variables) ( I || G1 || G2 || ... || Gk )
COMPUTE
  Earliest and latest time each output of F changes its value
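To make the goal concrete, the sketch below computes the earliest and latest output change for a tiny circuit, o = NOT(NAND(a, b)), by brute-force enumeration of all two-vector input patterns with interval delays. This is a stand-in for illustration only, not the timed-automaton state exploration of the slides, and the delay values are invented.

```python
from itertools import product

# Hypothetical (min, max) delays per output transition direction.
NAND_DELAY = {"rise": (2, 3), "fall": (3, 5)}
INV_DELAY = {"rise": (1, 2), "fall": (1, 1)}

def nand(a, b):
    return 1 - (a & b)

def output_change_window():
    """Earliest and latest time at which o changes, over all patterns
    (a0, b0) -> (a1, b1) that switch simultaneously at time 0."""
    earliest, latest = None, None
    for a0, b0, a1, b1 in product((0, 1), repeat=4):
        n0, n1 = nand(a0, b0), nand(a1, b1)
        if n0 == n1:
            continue  # internal node, and hence the output, stays put
        nd = NAND_DELAY["rise" if n1 == 1 else "fall"]
        # The inverter output moves in the opposite direction to n.
        inv = INV_DELAY["fall" if n1 == 1 else "rise"]
        lo, hi = nd[0] + inv[0], nd[1] + inv[1]
        earliest = lo if earliest is None else min(earliest, lo)
        latest = hi if latest is None else max(latest, hi)
    return earliest, latest
```

With these placeholder numbers the result is (3, 7): the earliest change comes from a rising NAND output and the latest from a falling one. Enumeration like this grows exponentially with the number of inputs, which is the cost the symbolic approach is meant to avoid.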
Exploiting the Structure of the Problem
OBSERVATIONS:
  State space has no cycles: otherwise circuit oscillates
  Depth of state space limited by longest topological path: linear in circuit size
  S(k): Set of states that system can be in after k transitions.
  Need to store S(k) only: Savings in space
  May revisit states: Trading off time for memory
  The representation for S(k) can be kept in conjunctively decomposed form.
[Figure: successive state sets S(0), S(1), S(2), S(3), S(4), S(5)]
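The idea of storing only S(k) can be sketched with explicit Python sets in place of BDDs. The function name and the callback are hypothetical; the loop terminates only because the state space is assumed to be acyclic, as the slide observes.

```python
def explore_by_depth(initial, successors, on_layer=None):
    """Layer-by-layer exploration that keeps only the current set S(k).

    No global visited set is stored, so a state reachable by paths of
    different lengths is processed once per layer in which it appears.
    Returns the number of non-empty layers.
    """
    layer = set(initial)
    k = 0
    while layer:
        if on_layer is not None:
            on_layer(k, layer)
        # S(k+1): everything reachable in one transition from S(k).
        layer = {t for s in layer for t in successors(s)}
        k += 1
    return k

# Usage on a small hypothetical acyclic graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": []}
layers = []
depth = explore_by_depth({"a"}, lambda s: graph[s],
                         lambda k, layer: layers.append(set(layer)))
```

Here state `c` shows up in both S(1) and S(2), which is the time-for-memory trade-off mentioned above.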
Conjunctively Decomposed Representations
[Figure: circuit partitioned into slices 1 to 4; slice i has discrete state v_i and timers x_i]
Represent S(k) = ∧_i S_i(k), where S_i(k)(v_i, x_i, v_{i-1}, x_{i-1}) represents (v_i, x_i) as a function of (v_{i-1}, x_{i-1}).
Compute S_i(k,k+1) separately for each i, based on S_{i-1}(k,k+1) only:
  S_i(k,k+1) = S_{i-1}(k,k+1) ∧ S_i(k) ∧ T_i
Support of each partition kept small: Smaller BDDs.
MORE OBSERVATIONS:
  Circuit components have bounded memory
  State of component is correlated with components in its vicinity.
  Partition circuit into slices so that at step (k) possible values of (v_i, x_i) are determined uniquely by (v_{i-1}, x_{i-1})
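A conjunctively decomposed state set can be illustrated with explicit relations instead of BDDs. In this sketch, which uses hypothetical helper names, the first entry holds the allowed states of the first slice and every later entry holds pairs (state of the previous slice, state of this slice); a global state belongs to the set exactly when every conjunct accepts it.

```python
def member(state, slices):
    """state: tuple with one entry per slice.
    slices[0]: set of allowed first-slice states.
    slices[i], i >= 1: set of (previous-slice state, this-slice state)."""
    if state[0] not in slices[0]:
        return False
    return all((state[i - 1], state[i]) in slices[i]
               for i in range(1, len(state)))

def expand(slices):
    """Enumerate the full set represented by the conjunction.
    Only for checking small examples; avoiding this blow-up is the point."""
    states = [(v,) for v in slices[0]]
    for rel in slices[1:]:
        states = [s + (cur,) for s in states
                  for (prev, cur) in rel if prev == s[-1]]
    return set(states)

# Three slices, each determined uniquely by its predecessor.
slices = [{0, 1}, {(0, 0), (1, 1)}, {(0, 1), (1, 0)}]
full = expand(slices)
```

Each conjunct mentions only two neighboring slices, which mirrors the small-support argument on the slide.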
Implementation
Timed-automaton-based delay computation algorithm implemented inside MOCHA.
BDD based implementation
  Circuit is partitioned into slices
  Decomposed representation of state sets
  Reached state computation is performed on a per-partition basis.
Case study: n-bit carry skip adder
Case Study
Potential cross-talk
  Doesn’t actually occur, because c_out and A3 are separated in time
  Algorithm must be cross-talk aware not to overestimate delay
Experimental Results (1)
                     With cross-talk (K1)            Without cross-talk (K2)
METHOD               Run time (s)   Max delay (ps)   Run time (s)   Max delay (ps)
SPICE                2.94 x 10^6    611              3.09 x 10^6    616
Timed Automata       602            660              617            660
Conv. Exact Timing   1.75           770              1.62           740
Topological Delay    0.1            920              0.1            890
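One way to read these numbers is as overestimation relative to the SPICE maximum delay. The short sketch below does that arithmetic on the values reported above; the dictionary layout and function name are illustrative only.

```python
# Maximum delays in picoseconds, copied from the results above.
SPICE_PS = {"with": 611, "without": 616}
MAX_DELAY_PS = {
    "Timed Automata": {"with": 660, "without": 660},
    "Conv. Exact Timing": {"with": 770, "without": 740},
    "Topological Delay": {"with": 920, "without": 890},
}

def overestimate_percent(method, case):
    """Relative overestimation of the max delay versus SPICE, in percent."""
    ref = SPICE_PS[case]
    return 100.0 * (MAX_DELAY_PS[method][case] - ref) / ref

for method in MAX_DELAY_PS:
    print(method, round(overestimate_percent(method, "with"), 1))
```

With cross-talk, this gives roughly 8.0% for timed automata, 26.0% for conventional exact timing, and 50.6% for topological delay.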
Experimental Results (2)
# of CSAs   # of timers   Delay (ps)   # of BDD vars   BDD Memory (MB)   CPU time
2           30            634          526             29                2 min
3           45            767          820             58                8 min
4           60            900          1114            78                13 min
5           75            1034         1408            85                21 min
6           90            1167         1702            102               30 min
8           120           1440         1180            63                15 min
16          240           2480         2362            78                2.8 hrs.
Compare: Monolithic representation cannot complete the 4-bit example in 1 GB.
Advantages of Approach
Modeling issues and verification and analysis issues are decoupled.
Timed automata serve as clean interface between the two.
The same algorithms remain applicable
  For different delay models
  At different levels of the hierarchy
Efficiency can be traded off for accuracy without modifying analysis algorithm.
Exact characterization of delay computation problem
  Allows sound conservative simplifications.
Timing properties other than delay can be verified
  Hold and set-up times
  For dynamic logic, is the input pulse wide enough to discharge output?
  Is there a channel-connected path from supply to ground?
Status and Future Work
Timed-automaton-based delay computation algorithm implemented inside MOCHA.
BDD based implementation
  Circuit is partitioned into slices
  Decomposed representation of state sets
  Reached state computation is performed on a per-partition basis.
Best performance so far: 32-bit carry skip adder
  3 hours, ~80 MB
FUTURE WORK: Exploit hierarchy