Design Framework for Partial Run-Time FPGA Reconfiguration

Chris Conger, Ann Gordon-Ross, and Alan D. George
Presented by: Abelardo Jara-Berrocal
HCS Research Laboratory, College of Engineering, University of Florida
ERSA 2008, Las Vegas, NV, July 14–17, 2008


Page 1: Design Framework for Partial Run-Time FPGA Reconfiguration

Design Framework for Partial Run-Time FPGA Reconfiguration

Chris Conger, Ann Gordon-Ross, and Alan D. George

Presented by: Abelardo Jara-Berrocal

HCS Research Laboratory
College of Engineering
University of Florida

ERSA 2008
Las Vegas, NV
July 14–17, 2008

Page 2: Design Framework for Partial Run-Time FPGA Reconfiguration

Outline
  Introduction
  Partial Reconfiguration (PR) Overview
  Proposed Design Methodologies
  Framework analysis
  Conclusions

Page 3: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – Fully reconfigurable systems

1. Device too small for complex designs
2. Big full bitstreams (long reconfiguration time)
3. Complete system operation is halted prior to reconfiguration

[Figure: a fully reconfigurable system. Blocks: design station, required design (Modules A, B and C), system controller, FPGA, configuration lines, bitstreams storage (Config 1, Config 2, Config 3), shared memory, general purpose I/O, external I/O, battery. Annotations: "Doesn't fit" next to the modules, Config 1 and Config 2 requests, and modules marked enabled or disabled.]

Page 4: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – The Virtex 4 PR architecture

Newer Xilinx FPGA families offer a partial reconfiguration feature
  A rectangular region of the FPGA can be reconfigured without affecting the remaining FPGA area
  The system can continue operating without interruption

[Figure: FPGA floorplan with Reconfigurable region 1 and Reconfigurable region 2]

Page 5: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – A sample PR architecture

1. System controller does not need to be placed in an external device
2. Access to fast Internal Configuration Access Port (ICAP – 32 bits, 100 MHz)
3. Smaller partial bitstreams
4. No need to halt complete system when reconfiguring a module
5. Time multiplexing of FPGA resources, load and unload HW modules on demand

[Figure: sample PR system. The FPGA holds a static area (base system configuration: MicroBlaze controller, ICAP, flash controller) and a reconfigurable area. Bitstreams storage holds Modules A, B and C; a "Module A request" loads Module A into the reconfigurable area. Other blocks: JTAG, external I/O, battery. Modules are marked enabled or disabled.]
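As a rough illustration of points 2 and 3, the ideal transfer time of a bitstream through the ICAP follows from the port width and clock quoted on the slide. This is a minimal sketch, assuming the port accepts one full 32-bit word on every clock cycle and ignoring controller and flash-read overhead, so real systems will be slower. The function name and the example bitstream sizes are hypothetical.

```python
ICAP_WIDTH_BITS = 32    # port width quoted on the slide
ICAP_CLOCK_HZ = 100e6   # port clock quoted on the slide


def ideal_reconfig_time_ms(bitstream_bytes: int) -> float:
    """Lower bound on transfer time, assuming one full word per clock cycle."""
    bytes_per_second = (ICAP_WIDTH_BITS / 8) * ICAP_CLOCK_HZ  # peak rate
    return bitstream_bytes / bytes_per_second * 1e3


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to contrast a large and a small bitstream.
    for label, size in [("2 MB bitstream", 2_000_000),
                        ("200 kB bitstream", 200_000)]:
        print(f"{label}: {ideal_reconfig_time_ms(size):.2f} ms")
```

Under these assumptions the transfer time scales linearly with bitstream size, which is why smaller partial bitstreams shorten reconfiguration.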

Page 6: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – Current PR Design Flow Steps

Partition the system into modules
Define static modules and reconfigurable modules
Decide the number of PR regions (PRRs)
Decide PRR sizes, shapes and locations
Map modules to PRRs
Define PRR interfaces, instantiate slice macros for PRR interfaces

Optimization problems
  Design partitioning
  Number of PRRs
  PRR sizes, shapes and locations
  Mapping PRMs to PRRs
  Type and placement of PRR interfaces

[Figure: two stages. Design partitioning: Modules A, B and C are split into static modules and reconfigurable modules (PRMs). Design floorplanning and budgeting: the FPGA is divided into a static region (static modules), PRR 1 (Modules: A and B) and PRR 2 (Modules: C), with the open question "# of PRRs?". The static region contains the MicroBlaze controller, ICAP and flash controller.]

Page 7: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – Early Access PR Design Flow

Introduced by Xilinx in FPL’06

Major improvements:
  Automatic implementation scripts
  Rectangular regions (not full column reconfiguration)
  Static nets can cross reconfigurable regions
  Slice macros replace bus macros

Partitioning and floorplanning steps are manually executed
  Design guidelines for these steps are not provided

Potential for development of automatic CAD tools

[Figure: design flow. Manual part: reconfigurable design specifications, design partitioning, design floorplanning and budgeting, producing placement and PRR constraints. Automatic part: Xilinx PR Implementation Flow, producing the full initial bitstream and the PRM bitstreams.]

Page 8: Design Framework for Partial Run-Time FPGA Reconfiguration

Introduction – Current PR design tools limitations

PR design is a very specialized task
  Only a physical level of support is provided
  Architectural knowledge of the target device is a must
  Not very flexible, many design constraints

Partitioning and floorplanning steps are manually executed
  No performance-sensitive design guidelines are provided
  No automatic heuristics-based design flow is available either

Lack of abstraction from low-level details discourages designers from using PR
  Difficult for many end users

In this work, we propose a taxonomy of PR system design flows and an efficient methodology for each type.

Page 9: Design Framework for Partial Run-Time FPGA Reconfiguration

PR Overview – Taxonomy of PR systems design flows

PR System Design Flow: Special purpose or Multipurpose

Special purpose
  Highly specialized systems design
  All PRMs that will exist on the system are known at design time
  Each PRR is independently optimized (size, shape, location, interface) based on the PRMs that will be mapped to it
  Output is:
    1) Floorplan defining a static region and a set of optimized PRRs
    2) The set of PRMs that can be placed in each PRR (PRMs to PRRs mapping)

Multipurpose
  Not optimized for a specific application
  PRMs required by the application are not known when designing the base system
  Goal is to design a flexible and reusable base design that can be used for several different PR systems
  Base system designer defines a set of PRRs with fixed shapes, sizes, locations and interfaces
  Generated floorplan is used as input template for the PRMs implementation

Page 10: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Partition the system into several hardware modules
Synthesize the hardware modules
Use a control flow graph (CFG) and a states table to represent:
  Application states and the transitions between them (execution path coverage)
  Set of modules required in each application state

Let’s see an example

Page 11: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Define region partitioning constraints (establishing constraints)

[Figure: control flow graph over application states S1 to S5]

STATE   MODULES
S1      A, B, C
S2      A, B, C, F
S3      A, B, C, G
S4      A, B, D
S5      A, B, E

Static: A, B
Reconfigurable: C, F, G, D, E

1. A and B are present in all states (static modules)
2. C, D, E, F and G are reconfigurable modules (PRMs)
3. F and G are mutually exclusive with C as far as PRR placement goes: each is needed at the same time as C (states S2 and S3), so neither can be placed in the same PRR as C
4. F, G, D and E can be placed in the same PRR
5. C, D and E can be placed in the same PRR
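The constraints in this example can be derived mechanically from the states table. Below is a minimal sketch, assuming the table is held as a plain dictionary. The function names are hypothetical and this is not the authors' tool, only one way to express the reasoning: a module needed in every state is static, and two PRMs needed in the same state cannot share a PRR.

```python
from itertools import combinations

# States table from the slide: application state -> modules required in it.
STATES = {
    "S1": {"A", "B", "C"},
    "S2": {"A", "B", "C", "F"},
    "S3": {"A", "B", "C", "G"},
    "S4": {"A", "B", "D"},
    "S5": {"A", "B", "E"},
}


def classify(states):
    """Split modules into static (needed in every state) and reconfigurable."""
    all_modules = set().union(*states.values())
    static = set.intersection(*states.values())
    return static, all_modules - static


def conflicts(states, prms):
    """Pairs of PRMs needed in the same state; they cannot share a PRR."""
    pairs = set()
    for modules in states.values():
        active = sorted(modules & prms)
        pairs.update(combinations(active, 2))
    return pairs


if __name__ == "__main__":
    static, prms = classify(STATES)
    print("static:", sorted(static))
    print("PRMs:", sorted(prms))
    print("cannot share a PRR:", sorted(conflicts(STATES, prms)))
```

Any pair of PRMs that does not appear in the conflict set (for example D and E, or F and G) is free to share a PRR.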

Page 12: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Define the number of PRRs to be used
  Optimization variable
  Number is computed based on CFG and states table
  # PRRs = ? (1? 2? 4?)

Define a PRMs to PRRs mapping
  Optimization problem
  Combinatorial design space
  Design space is reduced using design constraints

Possible solution (not necessarily the optimal):
  Static Region: A, B
  PRR 1: C, D, E
  PRR 2: F, G
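One simple way to obtain a feasible mapping is a greedy pass over the PRMs. The sketch below is an illustration under stated assumptions, not the authors' method. The formula for the number of PRRs did not survive in the transcript, so the lower bound used here (the largest number of PRMs active in any single state) is an assumption, and the greedy assignment is a generic heuristic that does not guarantee an optimal mapping. All function names are hypothetical.

```python
from itertools import combinations

# States table and static set from the example slide.
STATES = {
    "S1": {"A", "B", "C"},
    "S2": {"A", "B", "C", "F"},
    "S3": {"A", "B", "C", "G"},
    "S4": {"A", "B", "D"},
    "S5": {"A", "B", "E"},
}
STATIC = {"A", "B"}


def conflict_pairs(states, static):
    """PRM pairs that are active in the same state."""
    pairs = set()
    for modules in states.values():
        pairs.update(combinations(sorted(modules - static), 2))
    return pairs


def min_prrs_lower_bound(states, static):
    """Assumed bound: at least as many PRRs as PRMs active at once."""
    return max(len(modules - static) for modules in states.values())


def greedy_mapping(states, static):
    """Put each PRM into the first PRR that holds no conflicting PRM."""
    pairs = conflict_pairs(states, static)
    prms = sorted(set().union(*states.values()) - static)
    prrs = []  # one set of PRMs per PRR
    for prm in prms:
        for region in prrs:
            if all(tuple(sorted((prm, other))) not in pairs
                   for other in region):
                region.add(prm)
                break
        else:
            prrs.append({prm})  # no compatible PRR found, open a new one
    return prrs


if __name__ == "__main__":
    print("lower bound on # PRRs:", min_prrs_lower_bound(STATES, STATIC))
    for index, region in enumerate(greedy_mapping(STATES, STATIC), start=1):
        print(f"PRR {index}:", sorted(region))
```

For the example states table this reproduces the possible solution listed above (C, D, E in one PRR and F, G in another), but other valid mappings exist and may score better under a given cost function.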

Page 13: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

And when do we size our PRRs? Don’t worry, it is our next step

Modules profile (Slices, BRAMs, DSP48s) for Modules A to G

Required static region resources (resources are added)
Required PRR 1 resources (maximum of each resource type)
Required PRR 2 resources (maximum of each resource type)
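The sizing rule on this slide (add resources for the static region, take the per-type maximum for each PRR) can be sketched as follows. The per-module numbers are hypothetical placeholders, not measured profiles, and the mapping is the possible solution from the earlier example. Function and variable names are illustrative.

```python
RESOURCE_TYPES = ("slices", "brams", "dsp48s")

# Hypothetical module profiles, for illustration only (not measured data).
PROFILE = {
    "A": {"slices": 800, "brams": 4, "dsp48s": 0},
    "B": {"slices": 1200, "brams": 2, "dsp48s": 8},
    "C": {"slices": 2000, "brams": 6, "dsp48s": 4},
    "D": {"slices": 1500, "brams": 10, "dsp48s": 0},
    "E": {"slices": 900, "brams": 2, "dsp48s": 16},
    "F": {"slices": 700, "brams": 8, "dsp48s": 2},
    "G": {"slices": 1100, "brams": 3, "dsp48s": 6},
}


def static_requirements(modules, profile):
    """Static modules coexist, so their resources are added."""
    return {kind: sum(profile[m][kind] for m in modules)
            for kind in RESOURCE_TYPES}


def prr_requirements(modules, profile):
    """Only one PRM occupies a PRR at a time: take the per-type maximum."""
    return {kind: max(profile[m][kind] for m in modules)
            for kind in RESOURCE_TYPES}


if __name__ == "__main__":
    print("static:", static_requirements(["A", "B"], PROFILE))
    print("PRR 1:", prr_requirements(["C", "D", "E"], PROFILE))
    print("PRR 2:", prr_requirements(["F", "G"], PROFILE))
```

Note that the per-type maximum can come from different PRMs: with these placeholder values, PRR 1 takes its slice count from C, its BRAM count from D and its DSP48 count from E.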

Page 14: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Define the PRR sizes, shapes, locations inside the FPGA fabric
  Floorplanning optimization problem
  Proper metrics for PRR performance analysis are required
  Design guidelines for efficient PRR floorplanning are also a necessity

Define PRR interfaces
  Place slice macros

[Figure: final optimized custom base system floorplan. The FPGA is divided into a static region, PRR1 (a reconfigurable region with enough resources for the PRR 1 requirements) and PRR2 ("We do the same for PRR2").]

Page 15: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Methodology outputs
  Custom base system
  PRMs to PRRs mapping

They are used as input files for the automatic Xilinx PR Design Flow

Page 16: Design Framework for Partial Run-Time FPGA Reconfiguration

Proposed Design Methodology: Special-Purpose

Opportunity to automate this flow through design tools

Optimization variables
  Number of PRRs
  PRRs sizes, shapes, and locations
  PRMs to PRRs mapping
  Other additional optimization variables can be defined

Several possible cost functions:
  Area wastage
  Power usage
  Application latency
  Throughput
  …
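To make the idea of searching the mapping space under a cost function concrete, here is a small exhaustive search. It is a sketch under loud assumptions: the cost used (total slices reserved across PRRs, where each PRR is as large as its largest PRM) is a simple stand-in and not the authors' definition of area wastage; the slice counts are hypothetical; the conflict pairs come from the earlier states table example.

```python
from itertools import product

PRMS = ["C", "D", "E", "F", "G"]
CONFLICTS = {("C", "F"), ("C", "G")}  # from the states table example
# Hypothetical slice counts, for illustration only.
SLICES = {"C": 2000, "D": 1500, "E": 900, "F": 700, "G": 1100}


def is_valid(assignment):
    """No two conflicting PRMs may share a PRR."""
    return all(assignment[a] != assignment[b] for a, b in CONFLICTS)


def reserved_slices(assignment):
    """Stand-in cost: each used PRR is as large as its largest PRM."""
    largest = {}
    for prm, prr in assignment.items():
        largest[prr] = max(largest.get(prr, 0), SLICES[prm])
    return sum(largest.values())


def best_mapping(num_prrs):
    """Try every PRM-to-PRR assignment and keep the cheapest valid one."""
    best = None
    for choice in product(range(num_prrs), repeat=len(PRMS)):
        assignment = dict(zip(PRMS, choice))
        if not is_valid(assignment):
            continue
        cost = reserved_slices(assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best  # None when no valid assignment exists


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "PRR(s):", best_mapping(n))
```

Exhaustive search is only practical for a handful of PRMs, since the number of assignments grows as num_prrs to the power of the PRM count. That growth is the motivation for pruning the design space with constraints, as the methodology suggests.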

Page 17: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – PRR Geometries

PR system design flows require:
  Proper metrics for PRR performance analysis
  Design guidelines for efficient PRR floorplanning

Study of the effects of varying PRR shape over
  Maximum Clock Frequency
  Partial Bitstream Size

Five separate test cores:
  Beamforming (DSP/slice)
  CFAR (slice/memory)
  AES (register)
  ARM7 softcore (hybrid)
  Sine/Cosine LUT (memory)

Performed on V4SX55 thus far

Aspect ratio = PRR Height / PRR Width
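The aspect ratio definition above can be used to enumerate candidate rectangular PRR shapes for a given slice requirement. The sketch below makes several simplifying assumptions: dimensions are counted in CLBs with four slices per CLB (as in Virtex-4), BRAM and DSP48 columns and configuration-frame granularity are ignored, and the width and height limits are caller-supplied values, not authoritative device data. The function name is hypothetical.

```python
SLICES_PER_CLB = 4  # a Virtex-4 CLB contains four slices


def candidate_shapes(required_slices, max_width_clbs, max_height_clbs):
    """Yield (width, height, aspect_ratio) in CLBs for each feasible width.

    For every width, the smallest height that provides enough slices is
    used; widths whose required height exceeds the limit are skipped.
    """
    required_clbs = -(-required_slices // SLICES_PER_CLB)  # ceiling division
    for width in range(1, max_width_clbs + 1):
        height = -(-required_clbs // width)
        if height <= max_height_clbs:
            yield width, height, height / width


if __name__ == "__main__":
    # 5022 slices is the beamforming core size from the slides; the
    # 48 x 128 CLB limits are assumed values for illustration.
    for width, height, ratio in candidate_shapes(5022, 48, 128):
        print(f"{width:2d} x {height:3d} CLBs, aspect ratio {ratio:.2f}")
```

Each candidate could then be implemented and measured for clock frequency and partial bitstream size, which is the kind of sweep the study describes.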

Page 18: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – Beamforming (~125 MHz, 40%)

5022 slices, 16 DSP48s, 17 RAMB16s
Baseline, non-PR performance = 1614 kB, 127.845 MHz

[Figure: two plots, clock frequency (MHz) vs. aspect ratio and bitstream size (kB) vs. aspect ratio]

Page 19: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – CFAR (~100 MHz, 16%)

2610 slices, 2 DSP48s, 34 RAMB16s
Baseline, non-PR performance = 1001 kB, 103.616 MHz

[Figure: two plots, clock frequency (MHz) vs. aspect ratio and bitstream size (kB) vs. aspect ratio]

Page 20: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – AES (~80 MHz, 13.75%)

3634 slices, 3943 registers, 4 RAMB16s
Baseline, non-PR performance = 1393 kB, 80.483 MHz

[Figure: two plots, clock frequency (MHz) vs. aspect ratio and bitstream size (kB) vs. aspect ratio]

Page 21: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – ARM7 (~40 MHz, 6.8%)

1826 slices, 16 DSP48s, 10 RAMB16s
Baseline, non-PR performance = 872 kB, 40.985 MHz

[Figure: two plots, clock frequency (MHz) vs. aspect ratio and bitstream size (kB) vs. aspect ratio]

Page 22: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – Sine/Cosine LUT

107 slices, 27 RAMB16s
Baseline, non-PR performance = 571 kB, 204.918 MHz

[Figure: two plots, clock frequency (MHz) vs. aspect ratio and bitstream size (kB) vs. aspect ratio]

Page 23: Design Framework for Partial Run-Time FPGA Reconfiguration

Framework analysis – PRR Geometries

Slice-intensive designs show best bitstream size/clock frequency performance with aspect ratio around 2-4
  Roughly equivalent to aspect ratio of the FPGA as a whole

Non-slice intensive designs show best bitstream performance with aspect ratio >> 4
  Due to columnar distribution of RAMB16/DSP48 resources on chip
  Clock frequency relatively insensitive to aspect ratio
  Not shown in graph: resource wastage also improved

Results are more pronounced for high frequency designs

However, aspect ratio not the only design consideration
  Placement on a chip relative to other regions, pins, or resources may affect (restrict) choice of PRR shape

Page 24: Design Framework for Partial Run-Time FPGA Reconfiguration

Conclusions - Contributions of this work

Taxonomy for PR systems design flows and a design methodology for efficient development of each type

Identification of relevant optimization variables and constraints
  Number of PRRs, optimal mapping of PRMs to PRRs, system floorplanning
  Propose their incorporation in a future automatic design tool

Study of the effects of varying PRR shape
  Maximum Clock Frequency
  Partial Bitstream Size
  Multiple classes of cores/designs
    Memory-intensive
    DSP-intensive
    Combinational Logic-intensive
    Register-intensive
    Etc.

PRR floorplanning guidelines definitions and delivery

Page 25: Design Framework for Partial Run-Time FPGA Reconfiguration

Questions