reconfigurable computing - · pdf file• reconfigurable computing is intended to fill the...


Upload: trinhcong

Post on 06-Feb-2018


Page 1: Reconfigurable computing - · PDF file• Reconfigurable computing is intended to fill the gap between hard and soft, ... •1999: 442 millions dollars (semiconductors total : 196’136

Reconfigurable computing

Eduardo Sanchez

EPFL


Reconfigurable computing

• Methods for execution of algorithms:

• hardwired technology: high performance

• software-programmed microprocessors: high flexibility


• Why are hardwired solutions faster than software solutions?


• Reconfigurable computing is intended to fill the gap between hard and soft, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware (Compton and Hauck, “Reconfigurable computing”, ACM Computing Surveys, June 2002)

• Reconfigurable computing:

• systems incorporating some form of hardware programmability

• when we talk about reconfigurable computing we are usually talking about FPGA-based systems design

• Main motivations:

• accelerators for compute-intensive applications

• tools for system validation: prototyping, emulation


Moore's law


Transistors (predicted for 2004):

• Moore's law (2x - 12 months): 27.4 trillion

• Moore's law (2x - 18 months): 3.3 billion

• Moore's law (2x - 24 months): 37 million

Transistors (actual in 2004):

• Pentium 4: 125 million

• Itanium 2: 410 million
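The three predicted figures differ only in the assumed doubling period. A minimal sketch of the compounding, assuming a baseline of 50 transistors in 1965: the slide does not state its baseline, and this value is chosen here only because it yields roughly 27 trillion, 3.4 billion and 37 million for 2004.

```python
# Hypothetical baseline: NOT given on the slide. 50 transistors in 1965 is an
# assumption picked because it lands close to the slide's predicted values.
BASE_YEAR = 1965
BASE_TRANSISTORS = 50


def predicted_transistors(year: int, doubling_months: int) -> float:
    """Transistor count if the count doubles every `doubling_months` months."""
    months = (year - BASE_YEAR) * 12
    return BASE_TRANSISTORS * 2 ** (months / doubling_months)


for period in (12, 18, 24):
    count = predicted_transistors(2004, period)
    print(f"2x every {period} months: {count:.3g} transistors")
```

Shortening the doubling period from 24 to 12 months changes the prediction by almost six orders of magnitude, which is why the choice of period matters more than the baseline.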


• Intel introduces a new chip-fabrication process every two years:

• 2001: 0.13 micron

• 2003: 90 nm

• 2005: 65 nm

• 2007: 45 nm

• 2009: 32 nm

• ....


Problem: Wirth's law

• Software is slowing faster than hardware is accelerating

• Expressed in Biblical cadences: Grove giveth and Gates taketh away

Andy Grove, Intel Chairman

Bill Gates, Microsoft President


• Computing requirements increase even faster than Moore's law


Problem: time to market


Problem: power consumption

[Figure: power consumption at 50 MHz, 100 MHz, 200 MHz and 400 MHz]


Embedded systems

• Embedded systems, which are hidden from the user and cannot usually be manipulated or reprogrammed, are found in virtually all electronic equipment used today, from wireless telephones and DVD players to cars and airplanes

• A fifth of the value of each car produced in the EU is due to embedded electronics, a value that is expected to rise to about 40 percent by 2015


Pervasive computing

• "The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it" (Mark Weiser, "The Computer for the 21st Century", Scientific American, September 1991)

• In the near future, the computer will disappear by being everywhere: it will be ubiquitous, pervasive

• Pervasive systems will be so integrated with their users that they will be invisible; they will disappear


• Evolution of computer systems:

• mainframes: one computer, many users

• PCs: one computer, one user

• pervasive systems: many computers, one user


[Figure: from distributed systems to mobile computing to pervasive systems, each stage adding new concerns]

• distributed systems: remote communication, fault tolerance, high availability, remote information access, distributed security

• mobile computing: mobile networking, adaptive applications, energy-aware systems, mobile information access, location sensitivity

• pervasive systems: smart spaces, invisibility, localized scalability, uneven conditioning


Smart Dust project


• A 20 MIPS CPU embedded in a shoe

• Four times more powerful than early Silicon Graphics workstations (Motorola 68000)


• A lot of new devices to design


Integrated circuits

• standard circuits

• full-custom (ASIC): hand-made, libraries

• semi-custom, mask programmable: gate array, ROM

• semi-custom, field programmable: PROM, PAL, PLA, CPLD, FPGA


Field Programmable Gate Arrays

• Array of logic cells

• Each cell can implement a logic function, chosen among several possible functions; the choice is made by programming

• Interconnections between cells are also programmable

• Two types, depending on the cell’s complexity:

• fine grain

• coarse grain

• Two types, depending on the programming mode:

• RAM: every logic cell contains a LUT (look-up table), accompanied by a flip-flop, all interconnected with programmable routing pathways

• anti-fuses


[Figure: FPGA structure: logic cells and I/O cells, with programmable functions and programmable interconnections set by the configuration]

[Figure: programmable interconnect and programmable logic blocks]


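Required function: y = (a & b) | !c

Truth table, which is also the content of the SRAM cells of the programmed LUT:

a b c | y
0 0 0 | 1
0 0 1 | 0
0 1 0 | 1
0 1 1 | 0
1 0 0 | 1
1 0 1 | 0
1 1 0 | 1
1 1 1 | 1

The inputs a, b and c drive the select lines of an 8:1 multiplexer, which routes the addressed SRAM cell to the output y.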
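The LUT mechanism can be sketched in a few lines: programming fills the SRAM cells with the truth table of the required function, and reading uses the inputs as the multiplexer address. This is an illustrative model only; the function names are hypothetical and not taken from any vendor tool.

```python
def program_lut(func, n_inputs=3):
    """Fill the SRAM cells by evaluating func on every input combination."""
    cells = []
    for index in range(2 ** n_inputs):
        # Most significant bit first, so index 0b110 means a=1, b=1, c=0.
        bits = [(index >> (n_inputs - 1 - i)) & 1 for i in range(n_inputs)]
        cells.append(func(*bits) & 1)
    return cells


def lut_read(cells, *inputs):
    """Model the multiplexer: the inputs form the address of one SRAM cell."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return cells[index]


def required(a, b, c):
    """Required function from the slide: y = (a & b) | !c."""
    return (a & b) | (1 - c)


cells = program_lut(required)
print("SRAM cells:", cells)
print("y for a=1, b=1, c=1:", lut_read(cells, 1, 1, 1))
```

Once programmed, the LUT never evaluates the logic expression again: any 3-input function costs the same eight cells and one multiplexer.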


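• An example of a logic cell:

[Figure: logic cell built from a LUT (inputs G1 to G4), carry and control logic (Cin, Cout, BY) and a D flip-flop (D, EC, SP, RC, Q), with outputs Y, YB and YQ]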

• Functional frequencies are design-dependent


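[Figure: logic cell in which a block usable as a 4-input LUT, a 16x1 RAM or a 16-bit shift register (inputs a, b, c, d) feeds a multiplexer (with input e) and a flip-flop with clock, clock enable and set/reset; outputs y and q]

[Figure: a slice made of two logic cells (LC), each containing a LUT (4-input LUT, 16x1 RAM or 16-bit SR), a MUX and a REG]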


[Figure: a configurable logic block (CLB) made of four slices, each containing two logic cells; CLBs are arranged in an array]

[Figure: columns of embedded RAM blocks between arrays of programmable logic blocks]


[Figure: columns of RAM blocks and multipliers among the logic blocks]

[Figure: multiply-and-accumulate (MAC): a multiplier takes A[n:0] and B[n:0]; an adder and an accumulator sum the products into Y[(2n - 1):0]]
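The MAC structure (multiplier, adder, accumulator) can be sketched as a behavioural model. The 8-bit operand width and the unbounded accumulator are assumptions made for illustration; real hardware fixes both widths.

```python
def mac(a_values, b_values, width=8):
    """Multiply pairs of unsigned `width`-bit operands and accumulate the products.

    `width` is a hypothetical parameter, not a value from the slides.
    """
    mask = (1 << width) - 1
    acc = 0  # accumulator register, cleared before the first product
    for a, b in zip(a_values, b_values):
        product = (a & mask) * (b & mask)  # multiplier stage
        acc += product                     # adder feeding the accumulator
    return acc


# Dot product of two short vectors, the typical DSP use of a MAC.
print(mac([1, 2, 3], [4, 5, 6]))
```

In an FPGA the loop body corresponds to one clock cycle of a dedicated multiplier block followed by the adder and register.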


[Figure: the “Stripe”, holding a microprocessor core (uP), special RAM, peripherals and I/O, etc., next to the main FPGA fabric]

[Figure: FPGA fabric with (a) one embedded core, (b) four embedded cores]


[Figure: configuration SRAM cells chained from “Configuration data in” to “Configuration data out”, with I/O pins/pads around the device]


Mode pins | Mode
0 0 | Serial load with FPGA as master
0 1 | Serial load with FPGA as slave
1 0 | Parallel load with FPGA as master
1 1 | Parallel load with FPGA as slave
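The mode selection amounts to a two-bit lookup. A minimal sketch, assuming the codes 00 to 11 pair with the four modes in the order listed; the pin names m1 and m0 are hypothetical, and real devices define their own pins and encodings.

```python
# Mapping assumed from the slide's listing order; pin names are hypothetical.
CONFIG_MODES = {
    (0, 0): ("serial", "master"),
    (0, 1): ("serial", "slave"),
    (1, 0): ("parallel", "master"),
    (1, 1): ("parallel", "slave"),
}


def decode_mode(m1: int, m0: int) -> str:
    """Return the configuration mode selected by the two mode pins."""
    load, role = CONFIG_MODES[(m1, m0)]
    return f"{load.capitalize()} load with FPGA as {role}"


print(decode_mode(1, 0))
```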


[Figure: serial configuration: under FPGA control, a memory device sends configuration data to Cdata In; Cdata Out passes configuration data out]

[Figure: parallel configuration: the FPGA drives control and address lines to a memory device and receives configuration data [7:0] on Cdata In[7:0]]

[Figure: parallel configuration: the FPGA drives only control lines to a memory device and receives configuration data [7:0] on Cdata In[7:0]]

[Figure: a microprocessor, connected through address, data and control lines to a memory device and to a peripheral port, loads the FPGA through Cdata In[7:0]]


• Total area = active logic + configuration memory + interconnect

[Figure: die area divided among interconnect, active logic and configuration memory]


• Advantages over PLDs:

• enhanced flexibility

• reduced board space, power and cost

• increased performance

• Advantages over ASICs:

• reprogrammability

• off-the-shelf availability

• zero NRE (non-recurring engineering) costs

• reduced time-to-market

• ease-of-use


[Figure: cumulative NRE + unit cost versus cumulative volume (K units), with curves for ASIC .25, ASIC .15, FPGA .25 and FPGA .15]

• ASIC costs start higher, but the slope is flatter

• For each technology advance, FPGAs become more cost effective
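The cost curves follow a simple model: cumulative cost is the NRE plus the unit cost times the volume, so the two technologies cross at one break-even volume. The figures below are hypothetical and chosen only to illustrate the calculation; they are not read from the chart.

```python
def cumulative_cost(nre: float, unit_cost: float, volume: int) -> float:
    """Total spend after `volume` units: one-off NRE plus per-unit cost."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume above which the ASIC is cheaper; None if it never is."""
    if fpga_unit <= asic_unit:
        return None
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical numbers for illustration only (not from the slide).
ASIC_NRE, ASIC_UNIT = 500_000, 5.0
FPGA_NRE, FPGA_UNIT = 0, 30.0

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"ASIC becomes cheaper above {volume:.0f} units")
```

A higher ASIC NRE, which is typical of each newer process, pushes the break-even volume up. That is the sense in which FPGAs become more cost effective with each technology advance.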


• As performance requirements increase, the implementation of control elements in embedded applications is moving from 8 bits to 32 bits

• At the same time, the implementation vehicle of choice for embedded applications is moving from ASICs to FPGAs due to cost and time-to-market pressures


Synthesis methodology

[Figure: design flow: schematic (graphic editor) or VHDL, then partition, placement and routing, producing the configuration bit-string]


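[Figure: a register transfer level (RTL) description is checked by RTL functional verification in a logic simulator; logic synthesis produces a gate-level netlist, which is checked by gate-level functional verification in a logic simulator and then passed to place-and-route]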


[Figure: a top-level block-level schematic whose blocks are described as a graphical state diagram, a graphical flowchart, textual HDL or a block-level schematic]

Textual HDL:

When clock rises
    If (s == 0) then y = (a & b) | c;
    else y = c & !(d ^ e);
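The textual HDL fragment describes a register that updates only on a rising clock edge. Its behaviour can be sketched as a small software model; the class and method names are hypothetical, and a real design would be written in an HDL such as VHDL and simulated with a logic simulator.

```python
class ClockedBlock:
    """Behavioural model of the fragment: y changes only when the clock rises."""

    def __init__(self):
        self.y = 0
        self._prev_clock = 0

    def tick(self, clock, s, a, b, c, d, e):
        """Apply one sample of the inputs (all 0 or 1) and return y."""
        if clock == 1 and self._prev_clock == 0:  # rising edge detected
            if s == 0:
                self.y = (a & b) | c
            else:
                self.y = c & (1 - (d ^ e))  # 1 - x is logical NOT for one bit
        self._prev_clock = clock
        return self.y


block = ClockedBlock()
print(block.tick(1, 0, 1, 1, 0, 0, 0))  # rising edge, s == 0 branch
print(block.tick(1, 1, 0, 0, 0, 0, 0))  # clock still high: y is held
```

Between edges the output holds its value whatever the inputs do. That is what distinguishes this clocked description from the purely combinational LUT function.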


Intellectual property (IP)

• A semiconductor IP block is a predesigned function to be implemented in a semiconductor device. In some cases, the functions are parameterizable, allowing a degree of customization. These functions include physical library functions (analog or digital), basic blocks (such as counters and muxes) and system-level macros (also known as cores or virtual components), including memory blocks

• Market:

• 1999: 442 million dollars (semiconductor total: 196,136 M$)

• 2000: 620 million dollars (semiconductor total: 231,601 M$)

• 2004: 2,940 million dollars (semiconductor total: 339,545 M$)
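The market figures are easier to compare as a share of the semiconductor total. A short sketch of that arithmetic, using only the numbers listed above (in millions of dollars):

```python
# year: (IP market, total semiconductor market), both in M$, from the slide.
MARKET = {
    1999: (442, 196_136),
    2000: (620, 231_601),
    2004: (2_940, 339_545),
}


def ip_share_percent(year: int) -> float:
    """IP market as a percentage of the total semiconductor market."""
    ip, total = MARKET[year]
    return 100 * ip / total


for year in sorted(MARKET):
    print(f"{year}: {ip_share_percent(year):.2f}% of the semiconductor market")
```

The share stays below one percent throughout, but it grows several times faster than the semiconductor market itself.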


System-on-a-chip (SOC)

[Figure: an SOC can be built as an ASIC or on an FPGA; compared with the ASIC, the FPGA means:]

• expensive circuit

• lower performance

• higher consumption

• lower development cost

• faster adaptation to change


ASIC SOC

Year | 2000 | 2004 | 2011
Technology (micron) | 0.18 | 0.09 | 0.05
Gates/cm2 | 5x10^6 | 30x10^6 | 200x10^6
MIPS/watt | 1000 | 2200 | 4000
SRAM Mb/cm2 | 20 | 120 | 240


MicroBlaze soft processor

• Thirty-two 32-bit general purpose registers

• 32-bit instruction word with three operands and two addressing modes

• Separate 32-bit instruction and data buses that conform to IBM’s OPB (On-chip Peripheral Bus) specification

• Separate 32-bit instruction and data buses with direct connection to on-chip block RAM through an LMB (Local Memory Bus)

• 32-bit address bus

• Single issue, 3-stage pipeline (instruction fetch, operand fetch, execution)

• Hardware multiplier

• Big-endian
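Big-endian means that the most significant byte of a 32-bit word is stored at the lowest address. A generic illustration with Python's standard `struct` module; the word value is arbitrary, and nothing here is specific to MicroBlaze tooling.

```python
import struct

word = 0x12345678  # arbitrary example value

big = struct.pack(">I", word)     # big-endian: most significant byte first
little = struct.pack("<I", word)  # little-endian, shown for contrast

print("big-endian bytes:   ", big.hex(" "))
print("little-endian bytes:", little.hex(" "))
```

Byte order matters whenever the soft processor shares memory or a data stream with a little-endian host, since each side must agree on how multi-byte words are laid out.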


Virtex-II Pro family

• Virtex-II Pro FPGAs provide up to four embedded 32-bit IBM PowerPC 405 RISC processors, each delivering over 420 Dhrystone MIPS at 300 MHz

• 16KB data / 16KB instruction caches

• memory management unit

• variable page size (1KB-16MB)

• five-stage datapath pipeline

• integer multiply/divide unit

• thirty-two 32-bit general purpose registers

• dedicated on-chip memory interface

• it takes up as little as 2% of the total die area of XC2VP50

• it does not have a hardware floating point unit

• Up to twenty-four on-chip 3.125 Gbps Rocket I/O transceivers

• Based on a 0.13 μm, 9-layer copper/low-K dielectric technology
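Two of these figures are easier to compare with other devices once normalised. A quick sketch of the arithmetic, using only the numbers quoted above; the aggregate is the raw line rate in one direction and ignores any encoding overhead.

```python
# Figures quoted on the slide.
dhrystone_mips = 420      # per PowerPC 405 core ("over 420", so a lower bound)
clock_mhz = 300
transceivers = 24
gbps_per_transceiver = 3.125

dmips_per_mhz = dhrystone_mips / clock_mhz
aggregate_gbps = transceivers * gbps_per_transceiver

print(f"At least {dmips_per_mhz:.1f} Dhrystone MIPS per MHz per core")
print(f"Up to {aggregate_gbps:.0f} Gbps aggregate raw serial bandwidth")
```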


Virtex-4 family from Xilinx

• Columnar architecture


• A Configurable Logic Block (CLB) contains 4 interconnected slices


• A simplified view of the slice is:


• BRAM

• Multiplier (DSP) blocks


• Integrated PowerPC 405

• Fully integrated Ethernet Media Access Controller (EMAC)

• Bitstreams encrypted with 256-bit AES algorithm

• 90-nm, 11-layer technology

• 500 MHz for memory and multipliers

• Lowest power


• Three platforms:

• LX platform: optimized for high-performance logic

• SX platform: optimized for high-performance signal processing

• FX platform: optimized for embedded processing and high-speed serial connectivity

[Figure: each platform combines logic, memory, DCMs and DSP in different proportions; the FX platform adds RocketIO and PowerPC]


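Device | Logic cells | Block RAM [Kb] | DCM | SelectIO | XtremeDSP slices | PowerPC | 10/100/1000 EMAC | RocketIO transceivers
XC4VLX15 | 13,824 | 864 | 4 | 320 | 32 | - | - | -
XC4VLX25 | 24,192 | 1,296 | 8 | 448 | 48 | - | - | -
XC4VLX40 | 41,472 | 1,728 | 8 | 640 | 64 | - | - | -
XC4VLX60 | 59,904 | 2,880 | 8 | 640 | 64 | - | - | -
XC4VLX80 | 80,640 | 3,600 | 12 | 768 | 80 | - | - | -
XC4VLX100 | 110,592 | 4,320 | 12 | 960 | 96 | - | - | -
XC4VLX160 | 152,064 | 5,184 | 12 | 960 | 96 | - | - | -
XC4VLX200 | 200,448 | 6,048 | 12 | 960 | 96 | - | - | -
XC4VSX25 | 23,040 | 2,304 | 4 | 320 | 128 | - | - | -
XC4VSX35 | 34,560 | 3,456 | 8 | 448 | 192 | - | - | -
XC4VSX55 | 55,296 | 5,760 | 8 | 640 | 512 | - | - | -
XC4VFX12 | 12,312 | 648 | 4 | 320 | 32 | 1 | 2 | -
XC4VFX20 | 19,224 | 1,224 | 4 | 320 | 32 | 1 | 2 | 8
XC4VFX40 | 41,904 | 2,592 | 8 | 448 | 48 | 2 | 4 | 12
XC4VFX60 | 56,880 | 4,176 | 12 | 576 | 128 | 2 | 4 | 16
XC4VFX100 | 94,896 | 6,768 | 12 | 768 | 160 | 2 | 4 | 20
XC4VFX140 | 142,128 | 9,936 | 20 | 896 | 192 | 2 | 4 | 24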


Virtex-5 family from Xilinx


• 32-Kb block RAM running at 550 MHz

• Compared to Virtex-4, Virtex-5 devices offer 30% higher average speed and 65% higher capacity in the largest device. Dynamic power consumption is reduced by 35% and chip area is 45% smaller


Stratix II family from Altera


• Each Logic Array Block (LAB) contains 8 Adaptive Logic Modules (ALM)


Stratix III family from Altera


• Adaptive Logic Module (ALM)


Cyclone II family from Altera


Nios soft processor

• RISC-like processor

• Full 32-bit instruction set, data path and address space

• 32 general-purpose registers

• 32 external interrupt sources

• Single-instruction 32x32 multiply and divide producing a 32-bit result

• Single-instruction barrel shifter

• 6-level pipeline

• Branch prediction
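A 32x32 multiply that produces a 32-bit result cannot hold the full 64-bit product. A minimal sketch of the usual convention, assuming that the low 32 bits are kept; the slide does not say which half the instruction returns, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mul32(a: int, b: int) -> int:
    """Unsigned 32x32 multiply truncated to the low 32 bits (assumed behaviour)."""
    return ((a & MASK32) * (b & MASK32)) & MASK32


print(hex(mul32(3, 4)))                # small product fits unchanged
print(hex(mul32(0x10000, 0x10000)))    # 2**32 overflows: low 32 bits are zero
```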


Fusion family from Actel

• This FPGA family integrates the standard programmable logic with configurable analog and Flash memory

• Configurable analog-to-digital converter (ADC), supporting resolutions up to 12 bits and sample rates up to 600 ksamples per second

• A 32-bit ARM7 soft-core is available
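The ADC figures translate directly into a number of output codes and a minimum sample period. A short sketch of that arithmetic; the 3.3 V reference voltage is a hypothetical value used only to show the step size, and is not taken from the slide.

```python
def adc_levels(bits: int) -> int:
    """Number of distinct output codes for a given resolution."""
    return 2 ** bits


def lsb_volts(vref: float, bits: int) -> float:
    """Voltage step of one code for a full-scale range of vref."""
    return vref / adc_levels(bits)


def sample_period_us(ksamples_per_second: float) -> float:
    """Time between samples, in microseconds."""
    return 1000.0 / ksamples_per_second


VREF = 3.3  # hypothetical reference voltage, for illustration only

print(adc_levels(12), "codes at 12 bits")
print(f"{lsb_volts(VREF, 12) * 1000:.3f} mV per code with a {VREF} V reference")
print(f"{sample_period_us(600):.2f} us between samples at 600 ksamples per second")
```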


• VersaTile configurations:
