![Page 1: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/1.jpg)
Reconfigurable Computing Systems: An Overview
Presented by:
Gurwant Kaur Koonar
Vijay Pandya
14th March 2003
![Page 2: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/2.jpg)
Introduction
Reconfigurable Computing (RC) is an emerging paradigm for digital systems design. Its key feature is the ability to perform computations in hardware, aiming for the performance of an ASIC with the flexibility of a general-purpose (GP) processor.
Technology improvements have made possible new programmable logic devices (FPGAs, CPLDs).
Objective of the talk: Give an overview of reconfigurable computing, covering its hardware architectures and the software that targets these machines, such as compilation tools.
![Page 3: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/3.jpg)
Definition
Reconfigurable Computing (RC) is a computing paradigm in which algorithms are implemented as a temporally and spatially ordered set of very complex tasks. These tasks are executed on a large set of interconnected programmable hardware elements.
![Page 4: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/4.jpg)
Definition (cont’d)
Computing paradigm: defines the basic RC computing model without reference to implementation.
Very complex tasks: commonly referred to as configurations. RC tasks require more time than general-purpose computing instructions and more area than the typical general-purpose execution unit.
Spatial and temporal partitioning: algorithms are decomposed into tasks in both the space and time domains.
Hardware elements: at their core, RC devices consist of a very large set of simple programmable elements, collectively called the Reconfigurable Execution Unit (REU).
![Page 5: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/5.jpg)
General Characteristics of RC
Stored configuration algorithms
No software
Pipeline architectures are common
Real-time applications
Advantages
Flexible: configurable
Cost comparable to GPP: hardware is readily available; shorter development cycle than ASICs
Parallelism: algorithm parallelism exploited in custom architecture; problem-specific operators and control
High performance: reduced memory dependence; exploits fine-grained algorithm parallelism
Timesharing: hardware can be time-multiplexed by multiple applications
![Page 6: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/6.jpg)
Disadvantages
Additional area requirements: configuration memory (internal/external), internal switches, and other hardware overhead
Time overhead: device configuration and internal switches
![Page 7: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/7.jpg)
Traditional Computing
Using Application-Specific Integrated Circuits (ASICs) to “hard-wire” an algorithm in hardware.
Extremely fast
Require less silicon area
Less power hungry than GP architectures
Extremely inflexible
Expensive both in design and fabrication
Errors are difficult to correct
Examples: Consumer Electronics, Telecommunications, Automotive Industry
![Page 8: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/8.jpg)
Traditional Computing (Cont'd)
General-purpose hardware, combined with application-specific software
Extremely flexible due to versatile instruction set.
Much less expensive to develop.
Poor performance compared to ASICs.
Errors can be dynamically patched.
Examples: Commodity PC hardware running commercial software.
![Page 9: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/9.jpg)
Reasons for Poor Software Performance
Fetching of instructions
Interpretation of instructions
Scheduling of instructions
Wrong mix of hardware resources to suit a particular application’s needs
Therefore, reconfigurable computing is intended to fill the gap between hardware and software.
![Page 10: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/10.jpg)
Flexibility and Efficiency Tradeoffs
![Page 11: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/11.jpg)
Can FPGAs be called reconfigurable processing units?
Traditional FPGAs are configurable, but not run-time reconfigurable.
Traditional FPGAs expect to read their configuration out of a serial EEPROM, one bit at a time.
Therefore, the FPGA must be reprogrammed in its entirety, and its previous internal state cannot be captured beforehand.
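To get a rough feel for why one-bit-at-a-time configuration matters, load time can be estimated as bitstream size divided by the serial clock rate. The sketch below assumes exactly one configuration bit per clock cycle; the function name, the bitstream size, and the clock rate are hypothetical illustration values, not figures from the talk.

```python
def serial_config_time_s(bitstream_bits: int, clock_hz: float) -> float:
    """Seconds needed to shift in a bitstream at one bit per clock cycle."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return bitstream_bits / clock_hz


# Hypothetical device: 1,000,000 configuration bits, 1 MHz serial clock.
full_load_s = serial_config_time_s(1_000_000, 1_000_000.0)
print(f"full reconfiguration takes {full_load_s:.3f} s")
```

Under this simple model, every change to the design costs the full load time, since the whole device has to be reprogrammed.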
![Page 12: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/12.jpg)
Features for Reconfigurable Hardware
On-the-Fly Reprogrammability
Partial Reprogrammability
Externally-Visible Internal State
![Page 13: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/13.jpg)
Kress ALU Array-III (KrAA-III)
Instruction-level parallelism
Transparently scalable
Fast routing and placement (seconds only)
Dynamically and partially reconfigurable (microseconds)
Suitable for full custom design
On microprocessor chip: much higher acceleration than by caches
On microprocessor chip: fast and low power by full custom design
Acceleration by massive run-time to compile-time migration
![Page 14: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/14.jpg)
Kress ALU Array-III (KrAA-III)
KrAA-III consists of processing elements (PEs) called rDPU-III (reconfigurable DataPath Unit III), arranged in a NEWS network.
The figure shows the KrAA-III chip containing 9 rDPUs.
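NEWS usually denotes north/east/west/south nearest-neighbor links in a mesh. The sketch below lists the neighbors of each processing element under the assumption that the 9 rDPUs form a 3 x 3 grid without wraparound; the grid layout and the function name are assumptions for illustration, since the figure itself is not reproduced here.

```python
def news_neighbors(row: int, col: int, rows: int = 3, cols: int = 3) -> dict:
    """Return the N/E/W/S neighbors of a PE in a mesh without wraparound."""
    candidates = {
        "N": (row - 1, col),
        "E": (row, col + 1),
        "W": (row, col - 1),
        "S": (row + 1, col),
    }
    # Keep only the neighbors that fall inside the array boundaries.
    return {
        direction: (r, c)
        for direction, (r, c) in candidates.items()
        if 0 <= r < rows and 0 <= c < cols
    }


# The centre PE has four neighbors; a corner PE has only two.
print(news_neighbors(1, 1))
print(news_neighbors(0, 0))
```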
![Page 15: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/15.jpg)
Basic Architecture of today’s commercial reconfigurable processor
![Page 16: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/16.jpg)
Devices that combine an FPGA with a standard processor core
Triscend’s E5 and A7
Altera’s two Excalibur families
Atmel’s FPSLIC
Chameleon Systems’ CS2000
![Page 17: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/17.jpg)
Zippy Architecture
It is used to develop reconfigurable processor technology for the domain of handheld and wearable computing.
It investigates new trade-offs between performance, power consumption, and system cost.
It is an international research effort led by the Swiss Federal Institute of Technology.
![Page 18: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/18.jpg)
Reconfigurable Computing Merging Efficiency and Versatility
![Page 19: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/19.jpg)
Hardware Design steps
![Page 20: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/20.jpg)
Examples
SPLASH II
Multi-FPGA parallel computer with orchestrated systolic communications to perform inter-FPGA data transfer
![Page 21: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/21.jpg)
Garp
For general-purpose loop acceleration
![Page 22: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/22.jpg)
CMC Rapid Prototyping Platform
![Page 23: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/23.jpg)
RC Applications
RC has demonstrated >10x performance density advantage over microprocessors and DSPs
Pattern matching
Data encryption
Data compression
Video and image processing
Commercial Push
Handheld devices: PDAs, mobile phones, specialized tools
Networks: telecom switches, network routers, network bridges
High-performance computing: supercomputers, medical appliances, robot navigation and planning
Defense: ballistic missiles, KV navigation, spacecraft processing
![Page 24: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/24.jpg)
RC Implementations
Hardware
Catalina Research Incorporated - http://www.catalinaresearch.com/Chameleon
Annapolis Microsystems - http://www.annapmicro.com/Wildstar
Alpha Data Parallel Systems - http://www.alpha-data.com
Tools
Celoxica - http://www.celoxica.com
Star Bridge Systems - http://www.starbridgesystems.com
Annapolis Microsystems - http://www.annapmicro.com/ CoreFire
![Page 25: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/25.jpg)
Content
Coupling Approaches (Reconfigurable Hardware with General Processor)
Granularity of the FPGA as an RCS
Implementation Approaches: Compile Time Reconfiguration; Run Time Reconfiguration
Some more advantages
Challenges
Software-like design environment
![Page 26: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/26.jpg)
Coupling Approaches for Reconfigurable Hardware (RH)
RH can be coupled to a general-purpose processor (GP) as:
A functional unit (tightly coupled)
A coprocessor
An attached processing unit
A standalone processing unit (loosely coupled)
![Page 27: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/27.jpg)
Coupling Approaches Cont’d
As a functional unit: within a host processor (general purpose, GP); uses the datapath of the host machine
As a coprocessor: runs without constant supervision of the GP; the GP initializes the RH; independent parallel computation; less communication overhead
![Page 28: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/28.jpg)
Coupling Approaches Cont’d
As an attached processing unit: behaves as an additional processor; memory cache not visible; independent computation but high communication overhead
As a standalone unit: the most loosely coupled to the GP; infrequent communication with the GP; independent computation for very long periods of time
![Page 29: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/29.jpg)
Different levels of coupling
[Figure: workstation block diagram with the labels FU, Coprocessor, CPU, Memory Caches, I/O Interface, Attached Processing Unit, and Standalone Processing Unit]
![Page 30: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/30.jpg)
Pros and Cons of different coupling approaches
Tight integration: very low communication overhead; RH cannot operate “alone” for long periods of time; amount of reconfigurable logic is limited
Loose integration: greater parallelism; higher communication overhead
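The trade-off can be made concrete with a simple break-even check: offloading a kernel to the RH only pays off when hardware execution time plus communication time stays below the software time. The following is a minimal sketch under that assumption; the function name and all timing values are hypothetical, not measurements from the talk.

```python
def offload_pays_off(t_sw_ms: float, t_hw_ms: float, t_comm_ms: float) -> bool:
    """True when running on the RH (compute + communication) beats software."""
    return t_hw_ms + t_comm_ms < t_sw_ms


# Hypothetical kernel: 10 ms in software, 1 ms on the reconfigurable hardware.
T_SW_MS, T_HW_MS = 10.0, 1.0

# Hypothetical per-call communication costs for two coupling styles.
tight = offload_pays_off(T_SW_MS, T_HW_MS, t_comm_ms=0.1)   # e.g. functional unit
loose = offload_pays_off(T_SW_MS, T_HW_MS, t_comm_ms=12.0)  # e.g. standalone unit

print(tight, loose)
```

With these made-up numbers the tightly coupled case wins and the loosely coupled case does not, which is consistent with loosely coupled RH being better suited to long-running work that communicates rarely.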
![Page 31: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/31.jpg)
Logic Block Granularity
Refers to the size and complexity of the configurable logic block (CLB)
Fine-grained logic block
Less complex; for example, the Altera Flex 10k logic block consists of a single 4-input LUT with a flip-flop
Useful for bit-level manipulation
Can exceed the performance of a GP for operations on variable bit-width data
Smaller area, high amount of computation (compact)
Encryption and image processing applications
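A 4-input LUT is in effect a 16-entry truth table indexed by its four input bits, optionally followed by a flip-flop that registers the output. The sketch below models that behaviour in software; the class and helper names are hypothetical and it does not model any specific vendor's logic block.

```python
class Lut4:
    """A 4-input lookup table with an output flip-flop."""

    def __init__(self, truth_table: int):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table  # bit i holds the output for input index i
        self.ff = 0                     # registered output

    def evaluate(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational output: look up the bit selected by the inputs."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> int:
        """Latch the current combinational output into the flip-flop."""
        self.ff = self.evaluate(a, b, c, d)
        return self.ff


def table_from_function(fn) -> int:
    """Build the 16-bit configuration word for any 4-input Boolean function."""
    table = 0
    for index in range(16):
        bits = [(index >> k) & 1 for k in range(4)]
        if fn(*bits):
            table |= 1 << index
    return table


# "Configure" the LUT as a 4-input XOR (parity) function.
xor4 = Lut4(table_from_function(lambda a, b, c, d: a ^ b ^ c ^ d))
print(hex(xor4.truth_table), xor4.evaluate(1, 0, 0, 0), xor4.evaluate(1, 1, 0, 0))
```

Reconfiguring the block amounts to loading a different 16-bit word, which is why the same hardware can implement any 4-input function.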
![Page 32: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/32.jpg)
Logic Block Granularity cont’d
Coarse-grained logic block
Larger granularity of the CLB
Helps perform more complex operations
Four 2-bit inputs (GARP) and multiplier in each logic block for 4 x 4 multiplication
Finite state machine
Word-width (16-bit) datapath circuit implementation in very coarse-grained structures
Logic block closer to a small processor
![Page 33: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/33.jpg)
Implementation Approaches
Compile Time Reconfiguration (CTR): static implementation strategy; single system-wide configuration; configuration doesn’t change during computation; similar to using an ASIC for application acceleration
Run Time Reconfiguration (RTR): dynamic implementation strategy; multiple time-exclusive configurations; dynamic hardware allocation (run-time)
![Page 34: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/34.jpg)
RTR
Main task: dividing the algorithm into time-exclusive segments
Global RTR: allocates the whole FPGA's resources for each configuration; single system-wide configuration for each phase
Local RTR: locally reconfigures subsets of logic at run-time; partial reconfiguration, flexibility; functional division of labor
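The difference between the two styles can be sketched with a simple timing model: global RTR pays a full-device load before every phase, while local RTR reloads only the fraction of the device that changes. The model below assumes that loading and execution never overlap and that load time scales linearly with the fraction reconfigured; the function names and all numbers are hypothetical.

```python
def global_rtr_time(exec_times_ms, full_load_ms):
    """Total time when the whole device is reloaded before every phase."""
    return sum(full_load_ms + t for t in exec_times_ms)


def local_rtr_time(exec_times_ms, full_load_ms, fractions):
    """Total time when only a fraction of the device is reloaded per phase."""
    if len(exec_times_ms) != len(fractions):
        raise ValueError("need one reconfigured fraction per phase")
    return sum(full_load_ms * f + t for t, f in zip(exec_times_ms, fractions))


# Hypothetical three-phase algorithm (times in milliseconds).
phases_ms = [5.0, 3.0, 4.0]
FULL_LOAD_MS = 2.0

# Assume the first phase loads the full device and later phases swap a quarter.
print(global_rtr_time(phases_ms, FULL_LOAD_MS))
print(local_rtr_time(phases_ms, FULL_LOAD_MS, [1.0, 0.25, 0.25]))
```

For these made-up values the totals are 18.0 ms for global RTR and 15.0 ms for local RTR; the gap grows as configuration time becomes large relative to execution time.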
![Page 35: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/35.jpg)
RTR Cont’d
[Figure: Global RTR (sequence EXE. A, LOAD B, EXE. B, LOAD C, EXE. C, LOAD A) compared with Local RTR (device regions labeled A, B, C, and D)]
![Page 36: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/36.jpg)
Implementation Issues
Temporal partitioning is an iterative process
Possibly inefficient usage of FPGA resources in global RTR
Simulation
Efficient usage of hardware in local RTR
Current CAD tools: poor match for local RTR
(Examples of local RTR: RRANN-2 and DISC)
![Page 37: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/37.jpg)
Power Savings in RC system
Exploitation of numerical properties of an application
Higher number of operations per clock due to deep pipelines
Sensor/actuator pre-conditioning and “glue logic” functions on chip
![Page 38: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/38.jpg)
Some Challenges
Access to the development of RCS restricted to hardware developers
Run-time environment, RTR scheduling
Difficulties in routing for RC hardware having large number of CLBs
Connection scheme in multi-FPGA system
![Page 39: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/39.jpg)
Software Aspect
Software-like design environment: SystemC (Synopsys), Handel-C (Celoxica)
Hardware-software co-design (ARM Rapid Prototyping Platform (RPP))
Generation of a detailed gate-level description (netlist) from an HLL (high-level language)
Technology mapping, placement, and routing
Generation of .bit files (language of the FPGA)
![Page 40: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/40.jpg)
Software Aspect Cont’d
Programming language/HDL: SoC consists of 50 to 90% software; wide acceptability of C/C++
Simulation timing: simulation takes a long time in current CAD tools; C/C++ debugger very efficient
![Page 41: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/41.jpg)
RC1000 Celoxica platform
DK1 design suite (Handel-C)
RC1000 plug-in card, PCI bus interfacing
Xilinx Virtex-1000 FPGA (1 million gates)
Design Flow
[Figure: design flow with the stages Handel-C Source Files, Compile, Generate EDIF (netlist), Generate VHDL/Verilog, Simulate & netlist, Place & Route Tools, and Bitstream Generation]
![Page 42: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/42.jpg)
Hardware-Software Co-design
Amdahl’s Law
$$T = \frac{1}{(1 - a) + \frac{a}{s}}$$
T = Overall speedup
a = Fraction of the original program that could be enhanced by transferring to h/w
s = Speedup obtained for particular fraction of program
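As a worked example, Amdahl's law in its standard form, T = 1 / ((1 - a) + a / s), can be evaluated directly. The sketch below uses the same symbols a and s; the function name and the sample values are hypothetical.

```python
def amdahl_speedup(a: float, s: float) -> float:
    """Overall speedup when a fraction `a` of the program is sped up by `s`."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - a) + a / s)


# Hypothetical case: 90% of the run time moves to hardware that is 10x faster.
print(f"{amdahl_speedup(0.9, 10.0):.2f}")

# Even with an arbitrarily fast accelerator the speedup is capped at 1 / (1 - a).
print(f"{amdahl_speedup(0.9, 1e9):.2f}")
```

In this example the overall speedup is about 5.26, and it cannot exceed 10 however fast the hardware is, because the remaining 10% of the program still runs in software.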
![Page 43: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/43.jpg)
Summary
RCS to bridge the gap between Software and hardware (flexibility and performance)
FPGA is an ideal candidate for RH: spatial execution; reprogrammability; design time; design and synthesis flow for CAD tools
Hybrid architecture
Recent advancement in CAD tools
![Page 44: Reconfigurable Computing Systems: An Overview Presented by: Gurwant Kaur Koonar Vijay Pandya 14 th March 2003](https://reader038.vdocuments.mx/reader038/viewer/2022110103/56649e395503460f94b2a61e/html5/thumbnails/44.jpg)
Questions?