FPL 2003 - Sept. 2, 2003 Software Decelerators Eric Keller, Gordon Brebner and Phil James- Roxby Xilinx Research Labs


FPL 2003 - Sept. 2, 2003

Software Decelerators

Eric Keller, Gordon Brebner and Phil James-Roxby

Xilinx Research Labs


Talk Outline

• Background
• Software Decelerators
• Case Study: Finite State Machines
• Results
• Conclusions


Modern Platform FPGA

[Figure: platform FPGA block diagram. Labeled features:]
• High Performance Sync Dual-Port™ RAM
• SelectIO™-Ultra Technology
• Advanced FPGA Logic
• Embedded DSP Functionality (diagram labels: 18 Bit, 18 Bit, 36 Bit)
• PowerPC™ Processors, 400+ MHz clock rate
• Digital Clock Management (DCM)
• Digitally Controlled Impedance
• High-speed Serial Transceivers, 622 Mbps to 3.125 Gbps


Hardware Accelerator

• Processor-Centric
• Algorithms executed on processor
– key functions performed by hardware
• Goal: Increase overall performance

[Figure: Processor and Mem with JPEG2000 hardware blocks: DWT, Tier 1 Coder, RCT]


Motherboard On A Chip

• Processor running an operating system
• Common board peripherals on FPGA
– Ethernet MAC
– SVGA controller


Logic-centric viewpoint

• Consistent with an interface-centric view that is appropriate for reactive systems - highly relevant for future ambient intelligence/ubiquitous computing

• Processors have no special status in systems, and indeed play only a secondary role as ‘function units’

• Explicit ‘hardware-software co-design’ becomes lesser issue - certainly no top-level partitioning

• Hardware accelerators of processor-centric model are inverted and replaced by ‘software decelerators’


Software Decelerators

• Algorithms are executed in logic
– Processor executes software to perform one or more services for programmable logic

[Figure: inputs and outputs connected through logic operators (&, *, +), with a PPC block]


Motivation

• Emergence of platform FPGAs
• To increase overall system quality
– by making use of services provided by processor
• Ease of designing a complex function
• Offload non time-critical logic
– to achieve a better partition (e.g. saving area)
• Offload corner cases
– e.g. in MIR, IPv4 packets handled in logic, IPv6 handled in processor


Goals

• Overall area consumed by software decelerator should not be greater than logic counterpart
• Interfacing logic should consume minimal logic
• Interface should shield logic from processor
– and vice versa
• Provide timing and resource usage information
• Implementation neutral method to capture design


Example: finite state machines

• Implement a general class of sequential functions that are recognizable in digital designs
• Processor determines next state and state outputs to meet schedule determined by logic-based system
– possibility to support multiple state machines

[Figure: FSM decelerator generator flow. Labels: Graphical Representation, Textual Representation, FSM decelerator generator, Hardware platform, Software, Timing report]
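To make the idea of a processor "determining next state and state outputs" concrete, here is a minimal table-driven sketch in Python. It is an illustration only: the slides describe generated PowerPC assembly, not Python, and the state names, signal names, and the `step` helper are all hypothetical.

```python
def step(fsm, state, inputs):
    """Evaluate one tick: return (next_state, outputs) for the current state."""
    entry = fsm[state]
    outputs = entry["outputs"](inputs)      # state output equations
    next_state = entry["next"](inputs)      # transition decision
    return next_state, outputs

# Hypothetical two-state machine; none of these names come from the slides.
fsm = {
    "idle": {
        "outputs": lambda i: {"busy": 0},
        "next": lambda i: "add" if i["start"] else "idle",
    },
    "add": {
        "outputs": lambda i: {"busy": 1, "out0": i["in1"] + i["in2"]},
        "next": lambda i: "idle",
    },
}

inputs = {"start": 1, "in1": 2, "in2": 3}
state_after_idle, idle_outputs = step(fsm, "idle", inputs)
state_after_add, add_outputs = step(fsm, "add", inputs)
```

In a real decelerator each call to `step` would have to complete within the schedule set by the surrounding logic, which is why the later slides focus on cycle counting.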


Design Entry

• Graphical front end
– e.g. StateCAD
• Textual intermediate representation
– XML to support many design entry methods
• Define interface

    <variables>
      <variable name="op" dir="in" width="4"/>
    </variables>

• Define state

    <state name="stateADD">
      <eqns>
        <eqn lhs="out0" rhs="in1+in2"/>
      </eqns>
      <transitions>
        <tran next="state1"/>
      </transitions>
    </state>
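As a rough sketch of how a tool might read this textual representation, the Python below parses the two fragments with the standard `xml.etree.ElementTree` module. The enclosing `<fsm>` root element and the `parse_fsm` helper are assumptions added so the fragments form one parseable document; the slides do not show the full schema.

```python
import xml.etree.ElementTree as ET

# The <fsm> wrapper is hypothetical; the inner elements follow the slide.
FSM_XML = """
<fsm>
  <variables>
    <variable name="op" dir="in" width="4"/>
  </variables>
  <state name="stateADD">
    <eqns>
      <eqn lhs="out0" rhs="in1+in2"/>
    </eqns>
    <transitions>
      <tran next="state1"/>
    </transitions>
  </state>
</fsm>
"""

def parse_fsm(text):
    """Return (variables, states) read from the XML description."""
    root = ET.fromstring(text)
    variables = [
        {"name": v.get("name"), "dir": v.get("dir"), "width": int(v.get("width"))}
        for v in root.iter("variable")
    ]
    states = {}
    for s in root.iter("state"):
        states[s.get("name")] = {
            "eqns": [(e.get("lhs"), e.get("rhs")) for e in s.iter("eqn")],
            "next": [t.get("next") for t in s.iter("tran")],
        }
    return variables, states

variables, states = parse_fsm(FSM_XML)
```

Keeping the equations as plain strings mirrors the slide, where `rhs` is an expression that a later generation step would translate into processor instructions.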


Logic-Processor Interface

• Rest of system doesn’t see processor signals
• Choice of interface
– PowerPC’s native busses: PLB, OCM, DCR
• With only two nodes, optimizations are possible
– interface logic always being addressed
– No need for arbiter

[Figure: diagram labeled PowerPC]


Clocking

• Polling/Interrupt on external clock
– processing time for state must be less than clock period
– processor uses polling to detect clock edges
– clock edge causes an interrupt
• Software Generated
– processor generates clock pulse using a memory mapped circuit
– allows different states to take different processing time
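The difference between the two clocking schemes can be sketched numerically. In the Python below, the per-state cycle counts, clock rates, and function names are all hypothetical; the sketch only restates the constraints on the slide, not the authors' implementation.

```python
def meets_external_clock(state_cycles, cpu_hz, ext_clock_hz):
    """External clock: every state must finish in less than one clock period."""
    cycles_per_period = cpu_hz / ext_clock_hz
    return all(c < cycles_per_period for c in state_cycles.values())

def software_clock_edges(state_cycles, visit_order):
    """Software-generated clock: the processor pulses the clock after each
    state's work, so edge times (in CPU cycles) are a running sum."""
    edges, now = [], 0
    for name in visit_order:
        now += state_cycles[name]
        edges.append(now)
    return edges

# Hypothetical processing cost of each state, in processor cycles.
cycles = {"idle": 20, "add": 35, "send": 60}

ok_at_4mhz = meets_external_clock(cycles, 300e6, 4e6)   # 75 cycles per period
ok_at_6mhz = meets_external_clock(cycles, 300e6, 6e6)   # 50 cycles per period
edges = software_clock_edges(cycles, ["idle", "add", "send", "idle"])
```

With an external clock, the slowest state bounds the usable clock rate; with a software-generated clock, the period simply stretches per state, which is the flexibility the second bullet describes.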


Software Design

• General case is complex requiring timing analysis
• Assembly code generation
– each state has same structure (clock/reset, equations, transitions)
• Execute out of cache
– predictable memory accesses
• Accurate timing generation
– count the exact number of cycles it will take for each state and transition
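Because every state shares the same three-part structure, a timing report can be built by summing per-section cycle counts. The sketch below assumes such counts are already known; the `StateCode` type, the `timing_report` helper, and all numbers are hypothetical and stand in for whatever the generator derives from the emitted assembly.

```python
from dataclasses import dataclass, field

@dataclass
class StateCode:
    clock_reset: int                                  # cycles for clock/reset handling
    equations: int                                    # cycles to evaluate output equations
    transitions: dict = field(default_factory=dict)   # next state -> cycles

def timing_report(states):
    """Map each state to (best, worst) cycle counts over its transitions."""
    report = {}
    for name, code in states.items():
        fixed = code.clock_reset + code.equations
        costs = code.transitions.values()
        report[name] = (fixed + min(costs), fixed + max(costs))
    return report

# Hypothetical cycle counts, for illustration only.
states = {
    "idle": StateCode(8, 4, {"idle": 3, "add": 5}),
    "add": StateCode(8, 12, {"idle": 4}),
}
report = timing_report(states)
worst_case = max(worst for _, worst in report.values())
```

Exact counts like these are only meaningful when memory behavior is predictable, which is the reason the slide pairs timing generation with executing out of cache.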


Results: Resource Usage

            OCM                   DCR                   PLB
sys         FFs  LUTs  Ratio*     FFs  LUTs  Ratio*     FFs  LUTs  Ratio*
rs232         1     4    3.6%       2     6    5.4%       4     8    7.2%
miim         20    38   62.3%      21    40   65.6%      23    42   68.9%
tx_host_io   94    75   23.4%      95    77   24.1%      97    79   24.7%

*Ratio is the area of the decelerator as a percentage of the area consumed by a logic implementation


Results: Performance

system      Worst-case Perf.  Worst-case Perf.  % time   Code Size  Code Size
            (cycles)          (MHz)             in I/O   (bytes)    (% of cache)
rs232        40               8.75              30.95%   1416        8.6%
miim         74               4.73              25.22%   2968       18.1%
tx_host_io  135               2.59              33.99%   1952       11.9%
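The columns of this table are related by simple arithmetic, which the following Python check makes explicit. The slides do not state the processor clock or the cache size; the 350 MHz clock and 16 KB cache used here are assumptions, chosen because they reproduce the listed figures when the code-size column is read as bytes.

```python
CPU_HZ = 350e6            # assumed processor clock (not stated on the slide)
CACHE_BYTES = 16 * 1024   # assumed cache size (not stated on the slide)

# system -> (worst-case cycles, code size in bytes), copied from the table
rows = {
    "rs232": (40, 1416),
    "miim": (74, 2968),
    "tx_host_io": (135, 1952),
}

def worst_case_mhz(cycles):
    """Highest state-machine clock rate if each tick costs `cycles` CPU cycles."""
    return CPU_HZ / cycles / 1e6

def cache_share(code_bytes):
    """Code size as a percentage of the assumed cache."""
    return 100 * code_bytes / CACHE_BYTES

derived = {
    name: (round(worst_case_mhz(c), 2), round(cache_share(b), 1))
    for name, (c, b) in rows.items()
}
```

Under these assumptions the derived values match the MHz and percent-of-cache columns for all three systems.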


Conclusions

• Software decelerators
– through example of FSM based design methodology
– extendable to other functions
– can provide an increased overall system quality
• Methodology applicable to subset of designs
– achievable speeds vary with characteristics of FSM
• I/O takes a lot of processing time (roughly 25% to 34% in the three measured systems)


Future Work

• Further study implications of logic centric model
• Automatic selection and synthesis of logic-processor interfaces
• Characteristics of hard/soft processors
– e.g. I/O takes large percentage of time
• FSM based architectural components
• Domain-specific high-level design entry and tools