OPTIMIZATION OF POWER REDUCTION
IN FPGA INTERCONNECT BY
CHARGE RECYCLING
Deepa Soman, HyunSuk Nam, Rekha Srinivasaraghavan, Shashank Sivakumar
Agenda
Day 1: Intro, Power Consumption Techniques, Power Reduction Techniques, Discussions
Day 2: Power Reduction Techniques (Contd.), Charge Recycling, Our Project, Discussions
Motivation
Achilles’ Heel
• Logic flexibility & re-programmability lead to longer wires
• Power consumption is 7-14X higher than ASICs
Introduction
Power Consumption
Dynamic power: power consumed while the inputs are active
$P_{dynamic} = C V_{dd}^{2} f$
Static power: power consumed even when there is no circuit activity!
$P_{sub} = V_{dd} \cdot I_{leakage}$
$I_{leakage} \propto e^{-q V_{th} / kT}$
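A minimal Python sketch of the two power terms above, for experimenting with the numbers. The function names and the operating point (1 nF switched capacitance, 1.0 V, 100 MHz, 10 mA leakage) are illustrative assumptions, not values from the slides.

```python
def dynamic_power(c_farads, vdd_volts, f_hertz):
    """Dynamic power of the switched-capacitance model: P = C * Vdd^2 * f."""
    return c_farads * vdd_volts ** 2 * f_hertz


def static_power(vdd_volts, i_leak_amps):
    """Static power as supply voltage times leakage current: P = Vdd * I_leakage."""
    return vdd_volts * i_leak_amps


# Hypothetical operating point, chosen only for illustration.
p_dyn = dynamic_power(c_farads=1e-9, vdd_volts=1.0, f_hertz=100e6)
p_stat = static_power(vdd_volts=1.0, i_leak_amps=10e-3)
print(f"dynamic: {p_dyn:.3f} W, static: {p_stat:.3f} W")
```

Dynamic power vanishes when f is zero, while the static term remains as long as the supply is on.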
Why Panic about Power?
Why Static Power??
Low Power Opportunities
Hardware Techniques
• Voltage Scaling, Dual Vdd
• Frequency Scaling
• Clock Gating
Voltage Scaling
• Select the core voltage based on performance requirements
• How to choose? From timing analysis
• Types: 1) Static Voltage Scaling 2) Dynamic Voltage Scaling
1. Static Voltage Scaling
• A single core voltage is selected
• Realized using an on-chip Low-Dropout regulator (LDO)
• Voltage controlled by the configuration bitstream
• 0.8 V: minimum dynamic and leakage power
• 1.0 V: overall highest performance
[1] "A FPGA Prototype Design Emphasis on Low Power Technique", Xu, Jian
[Figure: LDO selecting between the 1.0 V and 0.8 V core supplies]
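Worked example, assuming the switched capacitance C and the clock frequency f stay the same at both LDO settings, and using the dynamic power relation $P_{dynamic} = C V_{dd}^{2} f$:

```latex
\frac{P_{dynamic}(0.8\,\mathrm{V})}{P_{dynamic}(1.0\,\mathrm{V})}
  = \frac{C \cdot (0.8)^{2} \cdot f}{C \cdot (1.0)^{2} \cdot f}
  = 0.64
```

Under these assumptions, dropping from 1.0 V to 0.8 V cuts dynamic power by roughly 36%. Leakage is not included in this ratio.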
2. Dynamic Voltage Scaling
• Provides different voltage levels
• Realized using a voltage controlling unit
• Can be a level shifter or a DC-DC converter
• DVS implementation (LDMC: Logic Delay Measurement Circuit)
• Delay error
"Dynamic Voltage Scaling for Commercial FPGAs", C.T. Chow, L.S.M. Tsui, P.H.W.
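A rough sketch of a closed-loop DVS controller of this kind, in Python. It is not the cited implementation: the delay model, the 50 mV step, the 0.5 ns margin and the 800-1000 mV range are hypothetical, and `measured_delay_ns` merely stands in for the delay measurement hardware.

```python
def measured_delay_ns(vdd_mv):
    # Hypothetical stand-in for the delay measurement: delay grows as Vdd drops.
    return 9000.0 / vdd_mv


def dvs_step(vdd_mv, delay_ns, target_ns, margin_ns=0.5,
             step_mv=50, v_min_mv=800, v_max_mv=1000):
    """One control step driven by the delay error (measured minus target)."""
    if delay_ns > target_ns:                 # too slow: raise the voltage
        vdd_mv += step_mv
    elif delay_ns < target_ns - margin_ns:   # comfortable slack: lower it
        vdd_mv -= step_mv
    # Otherwise hold; the margin avoids oscillating around the target.
    return max(v_min_mv, min(v_max_mv, vdd_mv))


def run_dvs(vdd_mv=1000, target_ns=10.0, steps=10):
    """Iterate the controller and return the voltage trace in millivolts."""
    trace = [vdd_mv]
    for _ in range(steps):
        vdd_mv = dvs_step(vdd_mv, measured_delay_ns(vdd_mv), target_ns)
        trace.append(vdd_mv)
    return trace


print(run_dvs())
```

With this toy delay model the loop steps the supply down until the measured delay just meets the target, then holds.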
Dual Supply Voltage (Vdd)
• Separate voltage supplies for configuration SRAM and other elements
• Purpose: to support sleep mode
• Shut down most logic except SRAM using the LDO
"A Dual-VDD Low Power FPGA Architecture", A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M.J. Irwin, and T. Tuan
Performance
• Static voltage scaling: nearly 53% power reduction
• Dynamic voltage scaling: up to 54%
• Dual Vdd: 14%
Merits:
• SVS: simple hardware
• DVS: self-adaptive
• Dual Vdd: eliminates speed penalty
Demerits:
• SVS: voltage is fixed
• DVS: design complexity
• Dual Vdd: area overhead
[1] "A FPGA Prototype Design Emphasis on Low Power Technique", Xu, Jian
[2] "A 90-nm Low-Power FPGA for Battery-Powered Applications", Tuan, Das, Steve, Sean
Frequency Scaling
$P_{dynamic} = C V_{dd}^{2} f$, where f is the frequency of switching
Dynamic Clock Management Implementations
(a) Simple dynamic clock management circuit
(b) Using feedback, a PLL circuit can reduce skew, at the cost of lock time
(c) Dynamic clock division
Merits:
• Voltage can subsequently be reduced
Demerits:
• Increased latency
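Since dynamic power is proportional to f, the effect of dynamic clock division can be estimated from the fraction of clock edges actually delivered. A small Python sketch, assuming a made-up schedule of (input cycles, divisor) intervals; the function names and numbers are hypothetical.

```python
def delivered_edges(schedule):
    """Clock edges reaching the logic for a list of (input_cycles, divisor) intervals."""
    return sum(cycles // divisor for cycles, divisor in schedule)


def relative_clock_power(schedule):
    """Average dynamic clock power relative to running undivided the whole time."""
    total_cycles = sum(cycles for cycles, _ in schedule)
    return delivered_edges(schedule) / total_cycles


# Hypothetical workload: 1000 cycles at full speed, then 3000 cycles divided by 4.
schedule = [(1000, 1), (3000, 4)]
print(relative_clock_power(schedule))
```

The same work takes longer at the divided rate, which is the latency demerit listed above.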
Benefits of Frequency Scaling
As frequency decreases, power consumption also decreases
"Dynamic Clock Management for Low Power Applications in FPGAs", Lan, Zilic
Clock Gating
• Controlling the clock flow
• Purpose: to temporarily disable blocks
• Can be realized in hardware using clock enable signals
• Minimizes power dissipation in clock circuits/network
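A behavioural Python sketch of a clock-enabled register, to illustrate how gating removes clock activity while a block is idle. The enable trace and data values are invented for illustration; this models behaviour only, not any specific FPGA primitive.

```python
def gated_clock_edges(enable_trace):
    """Count the clock edges that reach a block gated by a clock-enable signal."""
    return sum(1 for enabled in enable_trace if enabled)


def run_register(data, enable_trace, initial=0):
    """Register with clock enable: captures data when enabled, otherwise holds."""
    q = initial
    outputs = []
    for d, enabled in zip(data, enable_trace):
        if enabled:
            q = d            # clock edge delivered: capture the input
        outputs.append(q)    # gated: output does not toggle
    return outputs


# Hypothetical trace: the block is only needed in 3 of 8 cycles.
enable = [1, 1, 0, 0, 0, 1, 0, 0]
data = [5, 6, 7, 8, 9, 10, 11, 12]
print(gated_clock_edges(enable), "of", len(enable), "edges delivered")
print(run_register(data, enable))
```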
Clock Gating - Performance
• industry-a, b, c and d are DSP circuits; the remaining circuits were collected from customers and are of unknown function
• Over 20% power reduction is observed for the DSP circuits
Merits:
• Eliminates unnecessary toggling on outputs, gates of FFs and clock signals
Demerits:
• Clock skew
"Clock Power Reduction for Virtex-5 FPGAs", Wang, Gupta, Anderson
Software Techniques
• System Level:
  • Algorithm Modification
• CAD Tools:
  • Logic Partitioning
  • Mapping
  • Clustering
  • Placement & Routing
Low Power FFT Implementation
Architecture:
• Matrix multiplication -> 1D array has lower power dissipation than 2D array
• Module disabling: clock gating to disable modules, e.g. twiddle factor calculation
• Dynamic memory activation
• Multiple time multiplexed Pipeline uP
• Parallel Processing
Algorithm: Block Matrix Multiplication
FFT implementation results: 17% to 26% power reduction
"High throughput energy efficient multi-FFT architecture on FPGAs", Chen, Park, Prasanna
Energy Reduction Contributions of CAD Stages
Clustering contributes the major share!
"On the interaction between power aware FPGA CAD algorithms", Julien, Steven
Power Aware Clustering
• Power-aware T-VPack
• How? Cost function modified to include power
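The slide does not give the modified cost function, so the following Python sketch is only a hypothetical illustration of the idea: blend a connectivity term with a switching-activity term, so that high-activity nets tend to be absorbed inside a cluster. The weight `alpha`, the normalisation and the candidate data are all assumptions.

```python
def gain(shared_activities, alpha, max_nets, max_activity):
    """Blend connectivity and switching activity of the nets shared with the cluster."""
    connectivity = len(shared_activities) / max_nets
    power_term = sum(shared_activities) / max_activity
    return (1 - alpha) * connectivity + alpha * power_term


def pick_block(candidates, alpha):
    """candidates maps block name -> activities of nets it shares with the cluster."""
    max_nets = max(len(acts) for acts in candidates.values()) or 1
    max_activity = max(sum(acts) for acts in candidates.values()) or 1
    return max(candidates,
               key=lambda name: gain(candidates[name], alpha, max_nets, max_activity))


# Hypothetical candidates: A shares two quiet nets, B shares one very active net.
candidates = {"A": [0.1, 0.1], "B": [0.9]}
print(pick_block(candidates, alpha=0.0), pick_block(candidates, alpha=1.0))
```

With `alpha = 0` the choice is purely connectivity-driven; raising `alpha` shifts it toward the block that hides the most switching activity inside the cluster.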
Results: Power Aware Clustering
"Netlength Based Routability Driven Power Aware Clustering", Akoglu, Easwaran
Power Aware Placement
Results
"On the interaction between power aware FPGA CAD algorithms", Julien, Steven
Temperature Aware Routing
• Leakage current increases exponentially with temperature
• Switching capacitance
Algorithm
• Discourage the routing algorithm from forming connections that cross hotspot regions
• Cost Function Modification:
• Power savings range between 30% and 63%
"A Temperature-Aware Placement and Routing targeting 3D FPGAs", Kostas, Soudris
Power-Aware FPGA Design Flow
Step 1
• Power Based Architectural (High level modelling)
• RTL
  • Voltage scaling, Dual Vdd
  • Freq Scaling, Clock gating
Step 2: CAD Tools
• Power Aware Packing or Clustering
• Power Aware Placement
• Power Aware Routing
Main/Baseline Paper
Problem Addressed
• Power consumption in FPGAs is dominated by interconnect (62%)
Proposed Idea
• Charge recycling for power reduction in FPGA interconnect
Charge Recycling (CR)
Charge Recycling in FPGAs
How? “Unused routing resources” as reservoirs
• Reduces charge drawn from Vdd
• 25% reduction in energy
[Figure: routing tracks 1-7; unused tracks act as reservoirs, and one unused track has no friends]
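A back-of-the-envelope Python sketch of the charge-sharing idea, assuming ideal capacitors, complete equalisation, equal load and reservoir capacitance, and a reservoir that starts empty. All of these are simplifying assumptions; the saving this toy prints depends entirely on them and should not be read as the measured result.

```python
def share_charge(c_a, v_a, c_b, v_b):
    """Common voltage after connecting two ideal capacitors (charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


def supply_energy(c, v_start, vdd):
    """Energy drawn from Vdd to charge capacitance c from v_start up to vdd."""
    return c * vdd * (vdd - v_start)


C_LOAD = C_RES = 1.0   # normalised, hypothetical capacitances
VDD = 1.0

# Falling transition: the load dumps charge into the reservoir before discharging.
v_res = share_charge(C_LOAD, VDD, C_RES, 0.0)

# Rising transition: the reservoir pre-charges the load, Vdd supplies the rest.
v_pre = share_charge(C_LOAD, 0.0, C_RES, v_res)
saving = 1 - supply_energy(C_LOAD, v_pre, VDD) / supply_energy(C_LOAD, 0.0, VDD)
print(f"reservoir after fall: {v_res} V, pre-charge: {v_pre} V, saving: {saving:.0%}")
```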
CR-Capable FPGA Interconnect
Analysis
Four components:
SRAM Cell
• Produces signals CR and TS, which control a switch (normal, CR, tri-state)
Delay Line
• Transition between VIN and DLOUT
CR Circuit
• Performs the charge sharing between the load and reservoir
Input Stage
Experiments/Methodology
• VPR 6.0
• Baseline: island style, unidirectional, Wilton (K=6, N=4)
• Router: PathFinder, with cost function modification
• Post-routing CR mode
• VPR place/route tool helps in finding the % increase in area
VPR Cost Function
Cost Function: PathFinder
Modified Cost Function
Post-Routing
Mixed Integer Linear Program
• Tries to maximize the number of nodes to be put into CR mode
• Constraint: critical delay of the circuit
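A toy Python stand-in for this optimisation, using exhaustive search instead of an MILP solver (only workable for tiny instances). It assumes, hypothetically, that putting a node into CR mode adds a fixed delay penalty and that timing is checked per path; the paths, delays and names are invented.

```python
from itertools import combinations


def max_cr_nodes(paths, base_delay, cr_penalty, critical_delay):
    """Largest set of nodes that can be in CR mode with every path within the critical delay."""
    nodes = sorted({n for path in paths.values() for n in path})
    for size in range(len(nodes), -1, -1):        # try the largest sets first
        for chosen in combinations(nodes, size):
            cr = set(chosen)
            if all(sum(base_delay[n] + (cr_penalty[n] if n in cr else 0) for n in path)
                   <= critical_delay
                   for path in paths.values()):
                return cr
    return set()


# Hypothetical routed design: two paths sharing node "c".
paths = {"p1": ["a", "b", "c"], "p2": ["c", "d"]}
base_delay = {n: 2 for n in "abcd"}
cr_penalty = {n: 1 for n in "abcd"}
print(max_cr_nodes(paths, base_delay, cr_penalty, critical_delay=7))
```

Path p1 has only one unit of slack here, so at most one of its nodes can be switched to CR mode, while p2 has slack to spare.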
Results
• Dynamic power in the FPGA interconnect is reduced by up to ∼15-18.4%
Results Continued…
• Number of min-width transistors used as the area metric
• Reductions in power savings are not directly proportional to the reduction in CR-capable switches (area)
What do we propose that is new?
• Not all unused wires become friends
• Unused wires connected to constant voltage
• “URekha”: unused wires tri-stated, “further power savings!!”
• ~6% savings
Thank you