xin-wei shih and yao-wen chang. introduction problem formulation algorithms experimental results...
![Page 1: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/1.jpg)
Fast Timing-Model Independent Buffered Clock-Tree Synthesis
Xin-Wei Shih and Yao-Wen Chang
![Page 2: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/2.jpg)
Introduction
Problem formulation
Algorithms
Experimental results
Conclusions
Outline
![Page 3: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/3.jpg)
Skew-minimized buffered clock-tree synthesis plays an important role in high-performance VLSI designs for synchronous circuits.
Because existing timing models are not accurate enough for modern chip design, embedding a simulation process into the clock-tree synthesis flow becomes inevitable.
Introduction
![Page 4: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/4.jpg)
A possible way to improve the speed is to perform the clock construction by structure optimization.
◦ Mesh
In this paper, a novel timing-model independent buffered clock-tree synthesis method is proposed.
◦ Buffering and wiring structures of all paths from the clock source to its sinks are almost the same.
Introduction
![Page 5: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/5.jpg)
Problem: Buffered Clock-Tree Synthesis (BCTS)
Instance: Given a set of clock sinks, a slew-rate constraint, and a library of buffers.
Question: Construct a buffered clock tree to minimize its skew, subject to no slew-rate violation.
Problem formulation
![Page 6: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/6.jpg)
Algorithm
![Page 7: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/7.jpg)
The number of leaves (sinks) can be treated as a multiplication sequence of branching.
◦ This multiplication sequence exactly forms a factorization.
Then, the BNP is arranged in the non-increasing order of the factorization list.
Branch-Number Planning
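The planning step above can be illustrated with a minimal sketch. It assumes the factorization is the prime factorization of the sink count, which the slides do not state explicitly; the function name `branch_number_plan` is hypothetical.

```python
def branch_number_plan(num_sinks):
    """Return the prime factors of num_sinks in non-increasing order.

    Assumption: the "factorization" on the slide is the prime
    factorization; each factor is read as one level's branch number.
    """
    if num_sinks < 1:
        raise ValueError("num_sinks must be a positive integer")
    factors = []
    remaining = num_sinks
    divisor = 2
    # Trial division: strip each prime factor as often as it divides.
    while divisor * divisor <= remaining:
        while remaining % divisor == 0:
            factors.append(divisor)
            remaining //= divisor
        divisor += 1
    if remaining > 1:
        factors.append(remaining)  # leftover prime factor
    return sorted(factors, reverse=True)


print(branch_number_plan(12))  # [3, 2, 2]
```

For 12 sinks the sketch yields the sequence 3, 2, 2, whose product is again 12, so every leaf of the resulting symmetrical tree corresponds to exactly one sink.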
![Page 8: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/8.jpg)
Branch-Number Planning
![Page 9: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/9.jpg)
Existing partitioning methods, whether top-down like [10] or bottom-up like [7, 11], can hardly be applied to non-binary tree structures.
Therefore, we propose a novel partitioning method, which can not only handle non-binary tree structures, but also achieve good quality in terms of the cluster diameter.
◦ Cluster diameter: the maximum distance among sub-trees within the same cluster.
Tree Construction-Partitioning
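The cluster-diameter definition above can be made concrete with a small sketch. It assumes each sub-tree is represented by a single (x, y) point and that distance is Manhattan distance; neither choice is stated on the slide, and `cluster_diameter` is a hypothetical name.

```python
from itertools import combinations


def cluster_diameter(points):
    """Maximum pairwise distance among the points of one cluster.

    Assumptions: each sub-tree is reduced to one (x, y) point, and
    distance is measured as Manhattan distance.
    """
    return max(
        (abs(x1 - x2) + abs(y1 - y2)
         for (x1, y1), (x2, y2) in combinations(points, 2)),
        default=0,  # a cluster with fewer than two points has diameter 0
    )


print(cluster_diameter([(0, 0), (3, 4), (1, 1)]))  # 7
```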
![Page 10: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/10.jpg)
We borrow the idea of cake cutting, i.e., slicing a cake into pieces from the center of the cake.
Partitioning
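One way to picture the cake-cutting idea is the sketch below. It is not the authors' procedure: it assumes the center of the cake is the centroid of the points, that slices are taken in order of polar angle starting from a fixed direction, and that every slice receives the same number of points. The name `cake_cut` is hypothetical.

```python
import math


def cake_cut(points, num_slices):
    """Split points into equal-size clusters by angle around the centroid.

    Assumptions: the "center of the cake" is the centroid, and the
    first cut is fixed at angle -pi; the slides specify neither.
    """
    if len(points) % num_slices != 0:
        raise ValueError("point count must be divisible by num_slices")
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Order the points by polar angle as seen from the center.
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    size = len(points) // num_slices
    # Consecutive runs of `size` points form one slice each.
    return [ordered[i * size:(i + 1) * size] for i in range(num_slices)]


slices = cake_cut([(1, 0), (0, 1), (-1, 0), (0, -1)], 2)
print(slices)  # [[(0, -1), (1, 0)], [(0, 1), (-1, 0)]]
```

Because each slice holds the same number of points, the result matches the identical branch numbers that a symmetrical tree needs at one level. The sketch does not try to optimize where the first cut falls.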
![Page 11: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/11.jpg)
Embedding-Region Construction
![Page 12: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/12.jpg)
Node Embedding
![Page 13: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/13.jpg)
Since identical branch numbers at the same level are required in the symmetrical structure, a pseudo sink should be transformed into a dangling wire to maintain the symmetry.
For partitioning, we relax the size constraint so that the sizes of clusters in a partition can differ by at most one in the first recursion.
For node embedding, we let the embedding regions of pseudo sinks cover the entire chip.
Pseudo Sink Handling
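The relaxed size rule can be sketched as follows. This is an interpretation rather than the authors' stated procedure: it assumes the real sinks are spread as evenly as possible over the first-level clusters and that each smaller cluster is then padded with a pseudo sink. Both function names are hypothetical.

```python
def relaxed_cluster_sizes(num_sinks, num_clusters):
    """Cluster sizes that differ by at most one and sum to num_sinks."""
    base, extra = divmod(num_sinks, num_clusters)
    # The first `extra` clusters take one additional sink.
    return [base + 1 if i < extra else base for i in range(num_clusters)]


def pseudo_sinks_needed(sizes):
    """Pseudo sinks per cluster so that all clusters reach the same size.

    Assumption: pseudo sinks pad the smaller clusters up to the largest.
    """
    target = max(sizes)
    return [target - size for size in sizes]


sizes = relaxed_cluster_sizes(10, 3)
print(sizes)                       # [4, 3, 3]
print(pseudo_sinks_needed(sizes))  # [0, 1, 1]
```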
![Page 14: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/14.jpg)
A top-down manner
◦ By tracing along the tree edges, once the slew rate is about to violate the constraint, identical buffers are inserted for all branches.
◦ Insert identical buffers in terms of the type and the size at the same level.
◦ The slew rate is approximated by accumulated capacitance starting from the latest inserted buffer.
Buffer Insertion
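The top-down rule above can be sketched along a single source-to-sink path; in a symmetrical tree, a decision made at one level would then apply to every branch at that level. The sketch assumes the slew constraint is represented by a capacitance limit and that each level contributes a known capacitance to the path. `plan_buffer_levels`, `level_caps`, and `cap_limit` are hypothetical names, and the values are illustrative only.

```python
def plan_buffer_levels(level_caps, cap_limit):
    """Return the levels at which identical buffers are inserted.

    Assumptions: level_caps[i] is the capacitance added by level i
    along one path, and cap_limit stands in for the slew constraint.
    """
    buffered_levels = []
    accumulated = 0.0  # capacitance since the latest inserted buffer
    for level, cap in enumerate(level_caps):
        # Insert a buffer just before the limit would be exceeded.
        if accumulated > 0 and accumulated + cap > cap_limit:
            buffered_levels.append(level)
            accumulated = 0.0
        accumulated += cap
    return buffered_levels


print(plan_buffer_levels([4, 4, 4, 4, 4], 10))  # [2, 4]
```

A single level whose capacitance alone exceeds the limit is not handled here; a fuller version would have to split that wire segment.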
![Page 15: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/15.jpg)
Buffer Insertion
![Page 16: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/16.jpg)
Implemented in the C++ programming language on a 2.6 GHz AMD-64 workstation.
Four ISPD’09 Clock Network Synthesis Contest benchmarks with no blockages [17] and the IBM benchmarks [19].
Use ngspice [13] simulation based on the 45nm process technology [14] to evaluate the quality.
Experimental results
![Page 17: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/17.jpg)
clock skew (skew), clock-latency range (CLR), total resource usage (usage)
Experimental results
![Page 18: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/18.jpg)
Experimental results
![Page 19: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/19.jpg)
Experimental results
![Page 20: Xin-Wei Shih and Yao-Wen Chang. Introduction Problem formulation Algorithms Experimental results Conclusions](https://reader036.vdocuments.mx/reader036/viewer/2022082710/56649e375503460f94b27243/html5/thumbnails/20.jpg)
We have presented a fast timing-model independent buffered clock tree synthesis method to construct a symmetrical clock tree with little wiring overhead.
By symmetrically constructing a clock tree, the clock skew can be minimized without referring to simulation information.
Conclusions