interconnect architectures
Interconnect Architectures for Modulo-Scheduled
Coarse-Grained Reconfigurable Arrays
Ahmed Hassan Mohammed
Outline
Introduction
Reconfigurable Arrays
Device Architecture
Mapping Technology
Proposed Architectures
Experimental Results
Overall Results
References
Introduction
[Figure: Platforms (ASIC, reconfigurable arrays, µProcessors) plotted on axes of flexibility and performance]
Reconfigurable Arrays
| | Fine-grained | Coarse-grained |
|---|---|---|
| Purpose | General purpose | Application specific |
| Basic unit | LUT | ALU |
| Level | Bit-level | Word-level |
| Reconfigurability | High overhead | Reduced overhead |
| Performance | Low | High |
Device Architecture
Mapping Technology
[Figure: Dataflow graph and architecture]
[Figure: Overlapped loop iterations: Iteration 1, Iteration 2, Iteration 3]
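The overlap of iterations under modulo scheduling can be illustrated with a minimal sketch. It assumes a fixed initiation interval between successive iterations; the function names, the `ii` and `schedule_length` parameters, and the example values are illustrative assumptions and do not come from the slides.

```python
def iteration_start_cycles(num_iterations, ii):
    """Start cycle of each loop iteration.

    Assumption: iteration k starts exactly k * ii cycles after iteration 0,
    where ii is the initiation interval.
    """
    return [k * ii for k in range(num_iterations)]


def active_iterations(cycle, num_iterations, ii, schedule_length):
    """Return the 0-based iterations that are executing at the given cycle.

    Assumption: every iteration occupies schedule_length consecutive cycles.
    """
    starts = iteration_start_cycles(num_iterations, ii)
    return [k for k, start in enumerate(starts)
            if start <= cycle < start + schedule_length]


if __name__ == "__main__":
    # Hypothetical values: 3 iterations, a new one every 2 cycles, 5 cycles each.
    for cycle in range(9):
        print(cycle, active_iterations(cycle, 3, 2, 5))
```

With these hypothetical values, all three iterations are in flight at cycle 4, which is the kind of overlap the figure depicts.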
Proposed Architectures
Closest Topology
Clique Topology
Directional Topology
Heterogeneous Topology
Finput: number of possible inputs for a CFU
[Figure: Example with Finput = 3]
Closest Topology
Label by the closest units; keep the connections with Label <= Finput.
[Figure: Closest topology for Finput = 2, 3, 4, and 5]
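The rule Label <= Finput can be sketched as a filter over labeled candidate sources. The labelling function below is a hypothetical stand-in: it ranks the other cells of a grid by Manhattan distance, numbers them from 1, and breaks ties by row and column. None of these details are specified in the slides, so the sketch shows only the selection rule, not the exact topology in the figure.

```python
def manhattan_rank_labels(rows, cols, target):
    """Hypothetical labelling: rank all other cells by Manhattan distance.

    Labels are 1, 2, 3, ... in order of increasing distance from target;
    ties are broken by (row, col). This ordering is an assumption.
    """
    tr, tc = target
    others = [(r, c) for r in range(rows) for c in range(cols)
              if (r, c) != target]
    others.sort(key=lambda p: (abs(p[0] - tr) + abs(p[1] - tc), p))
    return {cell: label for label, cell in enumerate(others, start=1)}


def connected_sources(labels, finput):
    """Keep the candidate sources whose label does not exceed Finput."""
    return sorted(cell for cell, label in labels.items() if label <= finput)


if __name__ == "__main__":
    labels = manhattan_rank_labels(3, 3, (1, 1))
    for finput in (2, 3, 4, 5):
        print(finput, connected_sources(labels, finput))
```

Swapping in a different labelling function (by row and column, by next row and column, and so on) would leave `connected_sources` unchanged, which is why the same Label <= Finput rule appears on each topology slide.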
Clique Topology
Label by the row and column; keep the connections with Label <= Finput.
[Figure: Clique topology for Finput = 2, 3, 4, and 5]
Directional Topology
Label by the next row and column; keep the connections with Label <= Finput.
[Figure: Directional topology for Finput = 2, 3, 4, and 5]
Heterogeneous Topology
Label by the third column; keep the connections with Label <= Finput.
[Figure: Heterogeneous topology for Finput = 2, 3, 4, and 5]
Experimental Results
Ten benchmark kernels are used for comparison. Each is a single loop containing between 18 and 184 operations per iteration.
How does Finput affect IPC (instructions per cycle)?
How does Finput affect the area?
[Figure: Finput vs. IPC]
[Figure: Finput vs. Area]
Overall Results
[Figure: All topologies]
[Figure: Finput vs. IPC/Area]
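A minimal sketch of the two quantities in this comparison, assuming that IPC is operations executed divided by cycles taken and that IPC/Area is the ratio of IPC to area. The function names and all numbers in the usage lines are hypothetical and are not results from the slides.

```python
def ipc(total_operations, total_cycles):
    """Instructions per cycle: operations executed divided by cycles taken."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return total_operations / total_cycles


def ipc_per_area(ipc_value, area):
    """Performance per unit area (assumed meaning of 'IPC/Area')."""
    if area <= 0:
        raise ValueError("area must be positive")
    return ipc_value / area


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    value = ipc(total_operations=184, total_cycles=46)
    print(value, ipc_per_area(value, area=2.0))
```

Under this reading, a topology with lower IPC can still rank higher if its area is proportionally smaller.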
Different interconnect topologies affect both performance and area.
A partially interconnected fabric is better than a fully connected fabric.
Software pipelining is affected by the amount of flexibility in the interconnect architecture.
References
Steven J.E. Wilton, Noha Kafafi, Bingfeng Mei, and Serge Vernalde, “Interconnect Architectures for Modulo-Scheduled Coarse-Grained Reconfigurable Arrays”, IEEE, 2004.
Frank Bouwens, Mladen Berekovic, Andreas Kanstein, and Georgi Gaydadjiev, “Architectural Exploration of the ADRES Coarse-Grained Reconfigurable Array”, 2007.
Reiner Hartenstein, “Coarse Grain Reconfigurable Architectures”, 2001.
Lu Wan, Chen Dong, and Deming Chen, “A New Coarse-Grained Reconfigurable Architecture with Fast Data Relay and Its Compilation Flow”.
Thanks