clustering of flip-flops for useful-skew clock tree synthesis · clock tree synthesis (cts) •...
TRANSCRIPT
![Page 1: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/1.jpg)
Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis
Chuan Yean Tan • Purdue University Rickard Ewetz • University of Central Florida Cheng-Kok Koh • Purdue University
1
![Page 2: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/2.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
2
![Page 3: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/3.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
3
![Page 4: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/4.jpg)
Multi-Bit Flip Flop (MBFF)
D0 Q0
CLK
D0 Q0
CLK
D1 Q1D0 Q0
CLK
D1 Q1
D2 Q2D3 Q3
1-Bit Flip-Flop 2-Bit Flip-Flop 4-Bit Flip-Flop
4
![Page 5: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/5.jpg)
Benefits of utilizing MBFF
• Reduce power consumption• Reduce routing complexity and clock tree complexity,
less elements to connect leads to less routing complexity
5[3] L.Chen et al. Using MBFF for clock power saving by design compiler. – Proceedings of Synopsys User Group, 2010
MBFF Power And Area Savings[3]
![Page 6: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/6.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
6
![Page 7: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/7.jpg)
Previous Works Summary
7
Key Methodologies of Previous Works:• Timing-Driven Intersecting of Feasible Regions[14] • Maximum Independent Set based on heuristic approach [2]• x- and y- interval graphs to determine maximum clique [9]• K-means Algorithm [15,10]
[2] Y.T.Chang et al. Post-placement power optimization with multi-bit flip-flops - ICCAD 2010[9] I.H.R. Jiang et al. INTEGRA: Fast multi-bit flip-flop cluster for clock power saving. - IEEE Transactions CAD of I.C.S. 2015[10] A.B.Khang et al. Improve flop tray-based design implementation for power reduction. - ICCAD 2016[14] S.H.Wang et al. Power-driver flip-flop merging and relocation. - IEEE Transactions CAD of I.C.S 2012[15] G.Wu et al. Flip-Flop clustering by weighted k-means algorithm. - DAC 2016
![Page 8: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/8.jpg)
D Q
CLK
fanini
fanouti
8
Fanout’s slack
Timing-Driven Clustering of Flip-Flops
Feasible Region
![Page 9: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/9.jpg)
D Q
CLK
fanini
fanouti
9
Feasible Region
Flip-Flop is allowed to moved with the region
![Page 10: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/10.jpg)
Analysis of Previous Works
• Previous work cluster flip-flop without consideration of clock tree synthesis (CTS)
• Timing slacks are obtained pre-CTS, timing does not correlate as accurately after CTS
• Most modern designs adopt useful-skew tree to reduce area and power consumption
• However, once the UST has been built, updating or displacing FF’s placement is expensive!
10
![Page 11: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/11.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
11
![Page 12: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/12.jpg)
D Q
CLK
D Q
CLK
Launch FFi Capture FFj
Combinational Logic
tiCQ tij
max tjS
tjHtij
mintiCQ
Setup Constraints: ti + tiCQ + tij
max + tjS ≤ tj + T
Hold Constraints: ti + tiCQ + tij
min ≥ tj + tjH
12
![Page 13: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/13.jpg)
Setup Constraints: ti + tiCQ + tij
max + tjS ≤ tj + T
Hold Constraints: ti + tiCQ + tij
min ≥ tj + tjH
Constraints Formulation: uij = T – ti
CQ – tijmax – tj
S
lij = tjH – ti
CQ – tijmin
Skew Constraints: lij ≤ ti – tj ≤ uijlij ≤ skewij ≤ uij
13
![Page 14: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/14.jpg)
Skew Constraint Graph (SCG)
14
FF1 FF4
Skew Constraint Graph
FF3
FF2l12 ≤ t1 – t2 ≤ u12 l24 ≤ t2 – t4 ≤ u24
l13 ≤ t1 – t3 ≤ u13 l34 ≤ t3 – t4 ≤ u34
Weighted Edges: w12 = u12w21 = -l12
From the skew constraints: t1 – t2 ≤ u12t2 – t1 ≤ -l12
Generalize weighted constraint: ti – tj ≤ wij
![Page 15: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/15.jpg)
Useful-Skew Tree based on Arrival Time Constraints [5]
FF1 FF4
Skew Constraint Graph (SCG)
FF3
FF2l12 ≤ t1 – t2 ≤ u12 l24 ≤ t2 – t4 ≤ u24
l13 ≤ t1 – t3 ≤ u13 l34 ≤ t3 – t4 ≤ u34
ff4ub
ff4lb
ff2ub
ff2lb
ff3ub
ff3lb
Arrival Time Ranges
15
ff1ub
ff1lb
[5] R. Ewetz et al. Clock Tree Construction based on arrival time constraints – ACM 2017
LP Formulation
No negative cyclefrom SCG
![Page 16: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/16.jpg)
[5] R. Ewetz et al. Clock Tree Construction based on arrival time constraints – ACM 2017
D Q
CLK
D Q
CLK
D Q
CLK
D Q
CLK
Arrival Time Ranges Clock Tree Construction
16
ff4ub
ff4lb
ff2ub
ff2lb
ff3ub
ff3lb
ff1ub
ff1lb
Useful-Skew Tree based on Arrival Time Constraints [5]
Clock Tree Synthesis
![Page 17: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/17.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
17
![Page 18: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/18.jpg)
Clustering of FFs Mechanics
D Q
CLK
D Q
CLK
FFiMerge FF Candidate 1
FFjMerge FF Candidate 2
fanini
fanouti
fanoutj
faninj
18
![Page 19: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/19.jpg)
Displacing FFs
D Q
CLK
D Q
CLK
FFiMerge FF Candidate 1
FFjMerge FF Candidate 2
MBFF Merged Location
ΔkiΔkj
Worst Case: FF’s displacement introduces increase in wirelength 19
![Page 20: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/20.jpg)
Displacing FFs
D Q
CLK
D Q
CLK
FFiMerge FF Candidate 1
FFjMerge FF Candidate 2
update1
Moving each FF will require update to SCG to avoid “out-dated” skew constraints!!
D0 Q0
CLK
D1 Q1
20
MBFFij
update2
![Page 21: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/21.jpg)
Displacing FFs
D Q
CLK
D Q
CLK
FFiMerge FF Candidate 1
FFjMerge FF Candidate 2
MBFFij
update1
Furthermore, displacing FFi does not guarantee that all skew constraints will be met after update1
D0 Q0
CLK
D1 Q1update2
21
![Page 22: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/22.jpg)
Timing Delays
D Q
CLK
D Q
CLK
FFiMerge FF Candidate 1
FFjMerge FF Candidate 2
Merge Location
ΔkiΔkj
Let Δk = timing delay for increase in wirelength 22
![Page 23: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/23.jpg)
Timing Delays
fanini
fanouti
fanoutj
faninj
D0 Q0
CLK
D1 Q1
MBFFij
Δk1
Δk4
Δk3
Δk2
ΣΔki = total timing delay increased from displacing FFs23
![Page 24: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/24.jpg)
Comparing FF’s Arrival Time Ranges
FF1ub
FF1lb
FF2ub
FF3ub
FF4ub
FF5ub
FF2lb
FF3lb
FF4lb
FF4lb
Example of arrival time ranges derived from skew constraints24
![Page 25: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/25.jpg)
Merging Flip-Flop Candidates
FF1ub
FF1lb
FF2ub
FF3ub
FF4ub
FF5ub
FF2lb
FF3lb
FF4lb
FF4lb
Merge Candidate
25
![Page 26: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/26.jpg)
Overlapping Arrival Time Ranges Between FFs
FF1ub
FF1lb
FF2ub
FF3ub
FF4ub
FF5ub
FF2lb
FF3lb
FF4lb
FF4lb
Overlapping Region
26
![Page 27: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/27.jpg)
Updating New Arrival Time Range
FF1ub
FF1lb
FF2ub
MBFF1ub
FF5ub
FF2lb
MBFF1lb
FF4lb
Resulting Merge FF Arrival Time Range
Fast computation to update Arrival Time Ranges after merging FFs 27
![Page 28: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/28.jpg)
Outline • MBFF Introduction • Bounded Arrival Time Useful Skew Tree • Previous MBFF/Clustering Solutions • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
28
![Page 29: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/29.jpg)
Proposed Methodology
Design Netlist
Specify Arrival Time Range
Select a FF Candidate Randomly
Cluster Flip-Flops Algorithm
Clock Tree Synthesis
if not all FF evaluated
29
![Page 30: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/30.jpg)
Cluster Flip-Flops Algorithm
Find Flip-Flop Nearest Neighbor
Arrival Time Ranges Overlaps?
Compute Cluster Location
Scaled Arrival Time Ranages Overlaps?
Cluster Flip-Flops
New FF Candidate
YES
NO
NO
YES
30
![Page 31: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/31.jpg)
Cluster Flip-Flops Algorithm
Find Flip-Flop Nearest Neighbor
Arrival Time Ranges Overlaps?
Compute Cluster Location
Scaled Arrival Time Ranages Overlaps?
Cluster Flip-Flops
New FF Candidate
YES
NO
NO
YES
31
![Page 32: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/32.jpg)
Nearest Neighbor FF Selection HeuristicD Q
CLK
D Q
CLK
D Q
CLK
D Q
CLK
D Q
CLK
D Q
CLK
D Q
CLK
Cluster Pair 1
Cluster Pair 2
Cluster Pair 3
32
![Page 33: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/33.jpg)
Cluster Flip-Flops Algorithm
Find Flip-Flop Nearest Neighbor
Arrival Time Ranges Overlaps?
Compute Cluster Location
Scaled Arrival Time Ranages Overlaps?
Cluster Flip-Flops
New FF Candidate
YES
NO
NO
YES
33
![Page 34: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/34.jpg)
Compute Clustering Location
Cluster Location Algorithm:Weighted FF candidate based on FF’s fanout Higher fanout Higher capacitive load
D Q
CLK
D Q
CLK
Merge FF Candidate 1, FFiNo. of Fanout = 5
Merge FF Candidate 2, FFjNo. of Fanout = 2
Cluster Location
34
![Page 35: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/35.jpg)
Compute Clustering Location
D Q
CLK
D Q
CLK
Merge FF Candidate 1, FFiNo. of Fanout = 5
Merge FF Candidate 2, FFjNo. of Fanout = 2
Cluster Location
Displacing FF with higher capacitive load will incur higher delay change
35
![Page 36: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/36.jpg)
Cluster Flip-Flops Algorithm
Find Flip-Flop Nearest Neighbor
Arrival Time Ranges Overlaps?
Compute Cluster Location
Scaled Arrival Time Ranges Overlaps?
Cluster Flip-Flops
New FF Candidate
YES
NO
NO
YES
36
![Page 37: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/37.jpg)
Arrival Time Range Overlaps
FF1ub
FF1lb
FF2ub
FF3ub
FF4ub
FF5ub
FF2lb
FF3lb
FF4lb
FF4lb
Overlapping Region
37
![Page 38: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/38.jpg)
Updating Scaled Arrival Time Ranges
FF1ub
FF1lb
FF2ub
MBFF1ub
FF5ub
FF2lb
MBFF1lb
FF4lb
38
Scaled Arrival Time Range update after FF has been moved
Scaled from FF’s displacement
![Page 39: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/39.jpg)
Cluster Flip-Flops Algorithm
Find Flip-Flop Nearest Neighbor
Arrival Time Ranges Overlaps?
Compute Cluster Location
Scaled Arrival Time Ranages Overlaps?
Cluster Flip-Flops
New FF Candidate
YES
NO
NO
YES
39
![Page 40: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/40.jpg)
Outline • MBFF Introduction • Previous MBFF/Clustering Solutions • Bounded Arrival Time Useful Skew Tree • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
40
![Page 41: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/41.jpg)
Evaluation Framework
41
Monte Carlo Framework: • Process Variation (10%)• Voltage Variation (15%) • Temperature Variation (30%)
Tree Structure: • Zero-Skew Tree (ZST) • Useful-Skew Tree (UST) [5]
Cluster FF
CTS[5]
CTO[16,17,18]
Quality of Results (QOR)
[5] R. Ewetz et al. Clock Tree Construction based on arrival time constraints – ACM 2017[16] R. Ewetz et al. MCMM clock tree optimization based on slack redistribution using a reduced slack graph. ASP-DAC ’16[17] V. Ramachandran. Construction of minimal functional skew clock trees. ISPD’12[18] S. Roy, et al. Clock tree resynthesis for multi-corner multi-mode timing closure. ISPD’14
![Page 42: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/42.jpg)
Clustering FF Results
42
8 – 14% decrease in total capacitance
Benchmark from IWLS 2005 [8]
[8] International Workshop for Logic Synthesis – http://irwls.org/iwls2005/benchmarks.html - 2005
![Page 43: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/43.jpg)
Clustering FF + MBFF Transformation Results
43
20% – 34% decrease in total capacitance
![Page 44: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/44.jpg)
Data & Clock Path Cap Reduction
44
Assuming worst casedatapath wirelength increase
Clock Path WireCap Reduction >> Data Path Wire Cap Reduction
![Page 45: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/45.jpg)
Outline • MBFF Introduction • Bounded Arrival Time Useful Skew Tree • Previous MBFF/Clustering Solutions • Clustering of Flip-Flops Mechanics• Proposed Methodology • Evaluation & Experimental Results• Summary
45
![Page 46: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/46.jpg)
Summary
46
• Displacing FFs requires update to skew constraints to ensure timing is met
• Clustering of Flip-Flops based on Bounded Arrival Time Range Clock Tree Synthesis provides fast update to the skew constraints while ensuring timing correctness
• Clustering of FFs reduces routing complexity • MBFF transformation further reduces power consumption
![Page 47: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/47.jpg)
47
Q&A
![Page 48: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/48.jpg)
Backup Slides
48
![Page 49: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/49.jpg)
Bounded Arrival Time Ranges
FF1ub
FF1lb
FF2ub
MBFF1ub
FF5ub
FF2lb
MBFF1lb
FF4lb
Resulting Merge FF Arrival Time Range
49
![Page 50: Clustering of Flip-Flops for Useful-Skew Clock Tree Synthesis · clock tree synthesis (CTS) • Timing slacks are obtained pre -CTS, timing does not correlate . as accurately after](https://reader030.vdocuments.mx/reader030/viewer/2022040109/5e8acbefae5308585540d09d/html5/thumbnails/50.jpg)
Skew Constraint Graph (SCG)
FF1 FF4
Skew Constraint Graph
FF3
FF2l12 ≤ t1 – t2 ≤ u12 l24 ≤ t2 – t4 ≤ u24
l13 ≤ t1 – t3 ≤ u13 l34 ≤ t3 – t4 ≤ u34
50