Un/DoPack: Re-Clustering of Large System-on-Chip Designs with
Interconnect Variation for Low-Cost FPGAs
Marvin Tom*, Xilinx Inc. (marvin.tom@xilinx.com), San Jose, CA, USA
*Work performed at University of British Columbia
David Leong, University of British Columbia ([email protected]), Vancouver, BC, Canada
Guy Lemieux, University of British Columbia ([email protected]), Vancouver, BC, Canada
Overview
• Introduction, Goals and Motivation
– Reduce channel width, lower cost, make circuits “routable”
• Benchmark Circuits
– Varying amount of interconnect variation
• Un/DoPack CAD Tool:
– Iterative channel width reduction by whitespace insertion
• Results
• Conclusion
Mesh-Based FPGA Architecture
• 9 logic blocks
– 4 wires per channel
– 3*4=12 total horizontal tracks
• 16 logic blocks
– 4 wires per channel
– 4*4=16 total horizontal tracks
• Larger FPGAs have more “aggregate” interconnect
[Figure: two mesh arrays of logic blocks (L), one with 9 blocks and one with 16 blocks]
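The track counts on this slide are simple arithmetic, and a minimal sketch makes the scaling explicit. It follows the slide's own counting (one horizontal channel per row of logic blocks); the function name is illustrative and not part of any tool.

```python
def total_horizontal_tracks(rows: int, wires_per_channel: int) -> int:
    """Total horizontal tracks, counting one horizontal channel per row as the slide does."""
    return rows * wires_per_channel


for side in (3, 4):
    blocks = side * side
    tracks = total_horizontal_tracks(side, wires_per_channel=4)
    print(f"{blocks} logic blocks: {tracks} total horizontal tracks")
```

With the channel width held at 4 wires, growing the array from 9 to 16 logic blocks raises the aggregate from 12 to 16 tracks, which is the sense in which a larger FPGA offers more aggregate interconnect.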
Motivation: Area of FPGA Devices
[Figure: MCNC circuits mapped onto an FPGA. Routed Channel Width (10 to 90) plotted against CLB Count (0 to 300) for alu4, apex2, apex4, bigkey, des, diffeq, dsip, elliptic, ex1010, ex5p, frisc, misex3, pdc, s298, s38417, s38584, seq, spla, tseng]
• Number of Layout Tiles
• SIZE of Layout Tile
• Total Layout AREA = SIZE * Number
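The relation "Total Layout AREA = SIZE * Number" can be sketched numerically. The slide does not say how tile size depends on channel width, so the linear tile model below, with its constants `logic_area` and `area_per_track`, is purely an assumption chosen for illustration.

```python
def tile_size(channel_width: int, logic_area: float = 30.0,
              area_per_track: float = 1.0) -> float:
    """Hypothetical tile model: fixed logic area plus a per-track routing area."""
    return logic_area + area_per_track * channel_width


def total_layout_area(num_tiles: int, channel_width: int) -> float:
    """Total layout area = size of one tile * number of tiles."""
    return tile_size(channel_width) * num_tiles


# Same tile count, two different channel widths (arbitrary units).
print(total_layout_area(100, 80))
print(total_layout_area(100, 60))
```

Under this assumed model, narrowing the channel shrinks every tile, so the saving is multiplied by the tile count.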
Motivation: Channel Width Demand
[Figure: MCNC circuits mapped onto an FPGA. Routed Channel Width (10 to 90) plotted against CLB Count (0 to 300)]
• Logic Range: User buys bigger device.
• Interconnect Range: User has no choice!
• Devices built for worst-case channel width (fixed width)
• Interconnect dominates area (>70%)
Goal: Reduce Channel Width
[Figure: MCNC circuits mapped onto an FPGA. Routed Channel Width (10 to 90) plotted against CLB Count (0 to 300)]
• Altera Cyclone
– Channel width constraint of 80 routing tracks
• Constrained FPGA
– Channel width constraint of 60 routing tracks
– Smaller area, lower cost for low-channel-width circuits
• But { apex4, elliptic, frisc, ex1010, spla, pdc } are unroutable…
• Can we make them routable in a Constrained FPGA?
Possible Solution
• Trade-off logic utilization for channel width
– User can always buy more logic… (not more wires)
[Figure: logic-block (L) arrays of FPGA 1 and FPGA 2]
[Figure: Routed Channel Width (10 to 90) plotted against CLB Count (0 to 700) for the MCNC circuits, including clma, with separate labels for pdc, ex1010, frisc, spla, apex4, elliptic]
• Trade-off: CLB count for Channel width
• What about area??
Features and Costs of Two FPGA Families
• Sample Benchmark Circuit
– 10,000 LEs
– 150 Routing Tracks
– No Multipliers
– 100 K Memory

| Altera Device | LEs | Memory (bits) | Mult. | Routing (tracks) | Cost |
|---|---|---|---|---|---|
| Cyclone 1C12 | 12,060 | 239,616 | 0 | 80 | $56 |
| Stratix 1S10 | 10,570 | 920,448 | 48 | 232 | $190 |
| Cyclone 1C20 | 20,060 | 294,912 | 0 | 80 | $100 |
| Stratix 1S20 | 18,460 | 1,669,249 | 80 | 232 | $350 |

• Sample Benchmark Circuit
– 20,000 LEs
– 75 Routing Tracks
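The comparison on this slide amounts to picking the cheapest device that satisfies every resource requirement. The sketch below uses the table values as given. It makes three assumptions: "100 K Memory" means 100,000 bits, the "Routing" column is the number of available tracks, and the second sample circuit keeps the first circuit's memory and multiplier needs (the slide lists only its LEs and tracks). `Device` and `cheapest_fit` are illustrative names.

```python
from typing import NamedTuple, Optional


class Device(NamedTuple):
    name: str
    les: int
    memory_bits: int
    multipliers: int
    routing_tracks: int
    cost_usd: int


# Values copied from the slide's table.
DEVICES = [
    Device("Cyclone 1C12", 12_060, 239_616, 0, 80, 56),
    Device("Stratix 1S10", 10_570, 920_448, 48, 232, 190),
    Device("Cyclone 1C20", 20_060, 294_912, 0, 80, 100),
    Device("Stratix 1S20", 18_460, 1_669_249, 80, 232, 350),
]


def cheapest_fit(les: int, tracks: int, memory_bits: int = 100_000,
                 multipliers: int = 0) -> Optional[Device]:
    """Return the lowest-cost device meeting all requirements, or None."""
    fits = [d for d in DEVICES
            if d.les >= les and d.routing_tracks >= tracks
            and d.memory_bits >= memory_bits and d.multipliers >= multipliers]
    return min(fits, key=lambda d: d.cost_usd, default=None)


print(cheapest_fit(10_000, 150).name)  # Stratix 1S10
print(cheapest_fit(20_000, 75).name)   # Cyclone 1C20
```

Under these assumptions, the circuit that needs 150 tracks is forced onto the $190 Stratix part, while the version that uses twice the logic but only 75 tracks fits the $100 Cyclone part.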
GNL Circuit Benchmark Suite
• Create benchmark circuits with variation
– SoC <==> Randomly integrate/stitch together “IP Blocks”
– IP Blocks have varied interconnect needs
• Generate Netlist (GNL)
– Stroobandt @ Ghent University
– Synthetic benchmark generator
• GNL circuits generated hierarchically
– Root: # I/Os, # IP blocks
– Second Level: 20 IP blocks, # LEs, Rent parameter
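The Rent parameter is what sets each IP block's interconnect need. In the standard form of Rent's rule, a block of B logic elements has roughly T = t * B^p external terminals, where p is the Rent parameter. The sketch below evaluates that formula. The value of `t` (average terminals per logic element) is an assumed placeholder, and the code does not reproduce GNL's actual generator.

```python
def rent_terminals(num_blocks: int, rent_p: float, t: float = 4.0) -> float:
    """Rent's rule: T = t * B**p. The default t is an assumed value."""
    return t * num_blocks ** rent_p


# Two IP blocks of equal size but different Rent parameters.
for p in (0.50, 0.74):
    print(f"p = {p:.2f}: about {rent_terminals(1000, p):.0f} external terminals")
```

A higher Rent parameter means more external connections for the same amount of logic, which is why stitching together IP blocks with different parameters produces a circuit with interconnect variation.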
Rent Linear Interpolation
• 7 benchmark circuits
• Average Rent = 0.62, Stdev Rent = 0 to 0.12
• 240/120 primary inputs/outputs
[Figure: Rent Parameter (0.35 to 0.80) for each IP Block (bigkey, s38584.1, elliptic, diffeq, s298, alu4, misex3, pdc, ex5p, ex1010); one series per circuit: Stdev000, Stdev002, Stdev004, Stdev006, Stdev008 / meta clone, Stdev010, Stdev012]
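One way to read "Rent linear interpolation" is that each circuit's IP blocks get Rent parameters spaced evenly around the average and scaled to hit a target standard deviation. The slide does not give the exact procedure, so the sketch below is an assumption: it uses 20 blocks (the second-level count from the GNL slide), the population standard deviation, and the illustrative name `rent_parameters`.

```python
from statistics import mean, pstdev


def rent_parameters(num_blocks: int = 20, avg: float = 0.62,
                    stdev: float = 0.12) -> list:
    """Evenly spaced Rent parameters with the requested mean and population stdev."""
    if stdev == 0 or num_blocks == 1:
        return [avg] * num_blocks
    # Offsets are symmetric around zero, so the mean stays at avg.
    offsets = [i - (num_blocks - 1) / 2 for i in range(num_blocks)]
    scale = stdev / pstdev(offsets)
    return [avg + scale * o for o in offsets]


for s in (0.0, 0.06, 0.12):
    values = rent_parameters(stdev=s)
    print(f"stdev={s:.2f}: min={min(values):.3f} max={max(values):.3f} "
          f"mean={mean(values):.3f}")
```

Every circuit generated this way has the same average Rent parameter, so any difference in routability between them can be attributed to variation rather than to overall interconnect demand.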
Un/DoPack Flow
• Iterative non-uniform cluster depopulation tool
• Step 1: Traditional SIS/VPR
• Step 2: UnPack:
– Congestion Calculator
• Step 3: DoPack:
– Incremental Re-Cluster
• Step 4,5: Fast Place/Route
[Flowchart]
• Inputs: Circuit Description, Architecture Description, Channel Width Constraint, Array Size Constraint
• Synthesize and Technology Map (SIS/Flowmap), then Cluster (iRAC Replica), then Placement (VPR), then Routing (VPR)
• Channel Width Constraint Met? Yes: Success! No: Congestion Calculator (UnPack)
• Array Size Limits Reached? Yes: Failure. No: Incremental Cluster (DoPack)
• Fast Placement (Incremental or VPR), then Fast Routing (VPR)
• Channel Width Constraint Met? Yes: Success! No: back to Congestion Calculator (UnPack)
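The flowchart's control structure can be sketched as a loop. Every callable passed in below is a hypothetical stand-in for a tool in the flow (SIS/Flowmap plus clustering, VPR place and route, UnPack, DoPack). None of the names is a real API, and the toy stand-ins at the bottom exist only so the sketch runs.

```python
def un_do_pack(design, cw_limit, max_array_size,
               cluster, place_and_route, unpack, dopack, fast_place_and_route):
    """Iterate UnPack/DoPack until the routed channel width meets cw_limit.

    place_and_route and fast_place_and_route return (array_size, channel_width).
    Returns ("success", clusters) or ("failure", clusters).
    """
    clusters = cluster(design)                                   # Step 1
    array_size, channel_width = place_and_route(clusters)
    while channel_width > cw_limit:                              # constraint not met
        region = unpack(clusters, array_size)                    # Step 2: find congestion
        if array_size >= max_array_size:                         # array size limit reached
            return "failure", clusters
        clusters = dopack(clusters, region)                      # Step 3: re-cluster
        array_size, channel_width = fast_place_and_route(clusters)  # Steps 4, 5
    return "success", clusters


# Toy stand-ins: "clusters" is just the array side length, and each extra
# row/column is assumed (arbitrarily) to lower the channel width by 5 tracks.
def toy_pr(side):
    return side, 100 - 5 * (side - 10)


status, side = un_do_pack(
    design=None, cw_limit=80, max_array_size=20,
    cluster=lambda d: 10,
    place_and_route=toy_pr,
    unpack=lambda c, size: None,
    dopack=lambda c, region: c + 1,
    fast_place_and_route=toy_pr,
)
print(status, side)  # success 14
```

The loop has two exits, matching the flowchart: the channel width constraint is met, or the array size limit is reached first.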
Un/DoPack Flow: SIS/VPR
• Step 1: Traditional SIS/VPR
– Inputs: Circuit Description, Architecture Description, Channel Width Constraint, Array Size Constraint
Un/DoPack Flow: SIS/VPR
• Step 1: Traditional SIS/VPR
– Synthesize and Technology Map (SIS/Flowmap)
– Cluster (iRAC Replica)
– Placement (VPR)
– Routing (VPR)
Un/DoPack Flow: SIS/VPR
• Step 1: Traditional SIS/VPR
– Channel Width Constraint Met? Yes: Success! No: continue
Un/DoPack Flow: UnPack
• Step 2: UnPack:
– Congestion Calculator (UnPack)
– Array Size Limits Reached? Yes: Failure. No: continue
Un/DoPack Flow: UnPack
• Step 2: UnPack
– Generate Congestion Map
– CLB Label = Largest channel width (CW) occupancy in the 4 adjacent channels
[Figure: two congestion maps; axes: CLB X-Location, CLB Y-Location, CLB Label (0 to 120)]
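The labelling rule can be sketched as below: each CLB takes the largest occupancy among the four channel segments that border it. The data layout (`h_occ`, `v_occ` as nested lists) is an assumption made for illustration and does not reflect VPR's internal structures.

```python
def clb_labels(h_occ, v_occ):
    """Label each CLB with the largest occupancy of its 4 adjacent channels.

    h_occ[j][x]: occupancy of horizontal channel j (0..M) along column x.
    v_occ[i][y]: occupancy of vertical channel i (0..M) along row y.
    Returns labels[y][x] for an M x M array of CLBs.
    """
    m = len(h_occ) - 1
    return [[max(h_occ[y][x],      # channel below the CLB
                 h_occ[y + 1][x],  # channel above
                 v_occ[x][y],      # channel to the left
                 v_occ[x + 1][y])  # channel to the right
             for x in range(m)]
            for y in range(m)]


# 2 x 2 array: 3 horizontal and 3 vertical channels, 2 segments each.
h = [[1, 2], [3, 4], [5, 6]]
v = [[7, 0], [1, 1], [2, 9]]
print(clb_labels(h, v))  # [[7, 4], [5, 9]]
```

The resulting grid is the congestion map; its peaks are the candidate locations for depopulation.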
Un/DoPack Flow: UnPack
• Step 2: UnPack:
– Depop Center = Largest CLB label
[Figure: M × M array]
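Choosing the depopulation center is an arg-max over the congestion map. The sketch assumes the map is a nested list `labels[y][x]` and that ties go to the first CLB in row-major order; the slide does not specify a tie-break.

```python
def depop_center(labels):
    """Return (x, y) of the CLB with the largest label (first one wins on ties)."""
    best_xy, best_label = None, float("-inf")
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label > best_label:
                best_xy, best_label = (x, y), label
    return best_xy


print(depop_center([[7, 4], [5, 9]]))  # (1, 1)
```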
Un/DoPack Flow: UnPack
• Step 2: UnPack:
– Option 1 Coarse Grain:
• Depop Radius = M/4
• Depop Amount: 1 new row/col in array
[Figure: M × M array]
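A rough sketch of the region selection follows. It rests on two interpretations that the slide does not confirm: the radius is a Euclidean distance from the depopulation center, and "1 new row/col in array" means the array grows from M × M to (M+1) × (M+1). Both function names are illustrative.

```python
def depop_region(center, m, radius_divisor=4):
    """CLB coordinates within radius M / radius_divisor of the center (Euclidean, assumed)."""
    cx, cy = center
    r = m / radius_divisor
    return [(x, y) for y in range(m) for x in range(m)
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r]


def new_clbs_for_one_row_col(m):
    """Extra CLB locations gained when an M x M array becomes (M+1) x (M+1)."""
    return (m + 1) ** 2 - m ** 2


region = depop_region(center=(4, 4), m=8)
print(len(region), "CLBs in region;", new_clbs_for_one_row_col(8), "new CLB locations")
```

The same function accepts other divisors, so a smaller region is obtained by passing a larger `radius_divisor`.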
Un/DoPack Flow: UnPack
• Step 2: UnPack:
– Option 2 Fine Grain:
• Depop Radius = M/4, M/5, M/6, M/8
• Depop Amount: 1 new row/col in region
[Figure: M × M array]
Un/DoPack Flow: DoPack
• Step 3: DoPack:
– Incremental Re-Cluster
– Incremental Cluster (DoPack)
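The effect of depopulation can be shown with a deliberately naive sketch: the logic elements of the clusters in a congested region are spread over more clusters by lowering the per-cluster capacity. This is not the iRAC algorithm and ignores connectivity entirely. It is an order-preserving split, and `repack_region` is a hypothetical name.

```python
def repack_region(region_clusters, max_les_per_cluster):
    """Split a region's LEs into clusters of at most max_les_per_cluster each."""
    les = [le for cluster in region_clusters for le in cluster]
    return [les[i:i + max_les_per_cluster]
            for i in range(0, len(les), max_les_per_cluster)]


before = [["a", "b", "c", "d"], ["e", "f", "g", "h"]]
after = repack_region(before, max_les_per_cluster=3)
print(len(before), "clusters ->", len(after), "clusters:", after)
```

Fewer logic elements per cluster means fewer nets entering and leaving each CLB location, at the price of occupying more locations.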
Un/DoPack Flow: Fast P&R
• Step 4,5: Fast Place/Route
– Fast Placement (Incremental or VPR)
– Fast Routing (VPR)
– Channel Width Constraint Met? Yes: Success! No: continue
Un/DoPack Flow: Fast P&R
• Step 4,5: Fast Place/Route
• Fast Placement
– UBC Incremental Placer (under development)
– VPR -fast
• Fast Router
– Use illegal PathFinder solution from first iterations
• Unsuccessful so far
– Use full routed solution
• Slow but reliable
Un/DoPack: Baseline Flow
• UnPack: Coarse grained congestion calculator
• DoPack: iRAC replica
• Fast Place: UBC Incremental Placer
• Fast Route: None
• FPGA Architecture:
– LUT size (k) = 6
– Cluster size (N) = 16
– Inputs per cluster (I) = 51
– Wires of length (L) = 4
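The value I = 51 matches a rule of thumb commonly cited in the FPGA architecture literature, I = (k/2)(N+1), for the number of cluster inputs needed to keep clusters well utilized. The slide does not say how I was chosen, so the connection is an assumption; the sketch only checks the arithmetic.

```python
def cluster_inputs(k: int, n: int) -> int:
    """Rule-of-thumb cluster input count, I = (k / 2) * (N + 1), rounded down."""
    return k * (n + 1) // 2


print(cluster_inputs(k=6, n=16))  # 51
```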
Area of GNL Benchmarks
[Figure: Normalized Area (0.90 to 2.00) against % of Maximum Channel Width (0.5 to 1.05); series: stdev0, stdev002, stdev004, stdev006, stdev008 / meta clone, stdev010, stdev012]
Interconnect Variation: Impact on FPGA Architecture Design
[Figure: Minimum Routed Channel Width (70 to 140); series: Baseline, 10% Area Increase, 20% Area Increase, 25% Area Increase]
• High Variation Circuits Require Wide Channel Width
Critical Path of GNL Benchmarks
[Figure: Normalized Critical Path (0.95 to 1.25) against % of Max Channel Width (0.5 to 1.05)]
Un/DoPack Congestion Map
[Figure: two congestion maps, Before and After Un/DoPack; axes: CLB X-Location, CLB Y-Location, CLB Label (0 to 120)]
Multi-Region Un-Pack
• Depopulate multiple regions at once
– Depopulate each region separately
– Smaller radius = M/10
• Handle overlapping regions
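The slide says overlapping regions are handled but not how. One simple possibility, shown purely as an assumption, is to pick region centers greedily by congestion label and skip any candidate whose region would overlap one already chosen. `pick_regions` and `max_regions` are illustrative names, and the congestion map is assumed to be a nested list `labels[y][x]`.

```python
def pick_regions(labels, max_regions=4, radius_divisor=10):
    """Greedily choose non-overlapping depopulation centers, most congested first."""
    m = len(labels)
    r = m / radius_divisor
    ranked = sorted(((labels[y][x], x, y)
                     for y in range(m) for x in range(m)), reverse=True)
    centers = []
    for _label, x, y in ranked:
        # Two circles of radius r overlap when their centers are within 2r.
        if all((x - cx) ** 2 + (y - cy) ** 2 > (2 * r) ** 2 for cx, cy in centers):
            centers.append((x, y))
            if len(centers) == max_regions:
                break
    return centers


labels = [[0] * 10 for _ in range(10)]
labels[2][2], labels[2][3], labels[7][7] = 9, 8, 7   # stored as labels[y][x]
print(pick_regions(labels, max_regions=2))  # [(2, 2), (7, 7)]
```

In this example the second most congested CLB sits next to the first, so it is skipped in favor of a separate hot spot elsewhere in the array.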
Normalized Area
[Figure: Normalized Area (0.80 to 3.00) against Channel Width Constraint (% of max MRCW, 0.5 to 1); series: stdev000, stdev008 / clone, stdev010]
Normalized Critical Path
[Figure: Normalized Critical Path Delay (0.95 to 1.25) against Channel Width Constraint (% of max MRCW, 0.5 to 1); series: stdev000, stdev008 / clone, stdev010]
Run-Time Comparisons
[Figure: Log Run Time (in hours) against Channel Width Constraint (% of max MRCW, 0.5 to 1); series: stdev000, stdev008, stdev010, MR stdev000, MR stdev008 / clone, MR stdev010]
Conclusion
• Un/DoPack: FPGA CAD flow
– Find “local” congestion, depopulate, reduce interconnect demand
• FPGA benchmark circuit “suite”
– Stdev: Used to vary interconnect demand
• Discoveries…
– “Non-uniform” depopulation limits area inflation
– “Interconnect variation” important for area inflation and FPGA architecture design
– “Routing closure” achieved by re-clustering and incremental place & route
• UNROUTABLE circuits made ROUTABLE: buy an FPGA with MORE LOGIC!!!
End of Talk