Virtex-5 FPGA HDL Coding Techniques Part 1

Upload: rebecca-greene

Post on 24-Dec-2015


Curriculum Path for ASIC Design

[Figure: curriculum path diagram. Courses shown:]
– FPGA and ASIC Technology Comparison
– FPGA vs. ASIC Design Flow
– ASIC to FPGA Coding Conversion
– Virtex-5 FPGA Coding Techniques / Spartan-3 FPGA Coding Techniques
– Intro to VHDL or Intro to Verilog (3 days)
– Fundamentals of FPGA Design (1 day)
– Designing for Performance (2 days)
– Advanced FPGA Implementation (2 days)

Welcome

This training will help you build Virtex®-5 FPGA designs that are compact and run at high speed

We will show you how to avoid some of the most common design mistakes

This content is essential if you have never coded a design for the Virtex-5 FPGA or are converting an ASIC design


Objectives

After completing this module, you will be able to:

Optimize ASIC code for implementation in a Virtex-5 FPGA

Build a checklist of tips for optimizing your code for the Virtex-5 FPGA

Introduction

There is no single “perfect” way to create a design
– Different synthesis options and implementation options will lead to different results
  • One method will NOT work best in all cases

There are, however, guidelines that usually lead to improved results
– The coding techniques described here are strongly recommended because they have the biggest impact on device utilization and speed

Tactics to Meet Timing

As always, use as many of the dedicated resources as possible (SRLs, DSP48s, and block RAMs)

Different tactics must be used when your device is full
– Timing does not matter if your design does not fit in the device
– The tactics that will be discussed generally work best in designs that are not full

One of the most effective ways to reduce power in FPGAs is to reduce the number of resources
– One of the side benefits of these techniques is that they will allow you to improve performance and reduce power

Limiting Virtex-5 FPGA Resources

Build a design that uses fewer “limiting” resources
– Fewer registers
  • Many designs run out of registers before other components (especially if the design is heavily pipelined)
  • Registers are most often the limiting resource in Virtex-5 designs
– Fewer LUTs
  • The LUT6 is 40 percent more efficient than a LUT4
  • But this does not yield performance benefits for every design

Virtex-5 FPGA Registers

Why are registers sometimes a scarce resource?
– The Virtex-5 FPGA has ~30 percent fewer registers for a given logic or array size compared to the Virtex-4 FPGA
  • The 4VLX80 device has 71,680 slice flip-flops versus 51,840 for a 5VLX85 device
  • So you should NOT need to pipeline as much!
– Be aware of control signal limitations (this will be covered later)
– Lack of use (inference) of SRLs, block RAMs, and DSP slice resources
– Replication of registers (logic replication)
  • Careful use of synthesis options that may increase your design size is important

Introduction to Control Sets

A control signal is
– Clock Enable / Gate Enable
– Write Enable
– Set / Preset
– Reset / Clear
– Clock / Gate
– Slice: WE, CE, SR, REV, CLK

A control set is
– A group of enable, set, reset, and clock
  • This includes Vcc / Gnd when they are not used

Unique control sets are
– The number of groups of unique control signals that your design has

Tip: The implementation tools cannot group flip-flops into the same slice if they do not share the same control signals

What Creates Control Signals?

Control signals are the signals that are connected to the actual control ports on the register

Inference code
– Clocks and asynchronous set/resets always become control signals
  • They cannot be moved to the datapath
– Clock enables and synchronous set/resets sometimes become control signals (this is decided by the synthesis tool)
  • These control signals can be moved to the datapath
– How will a global asynchronous reset and a local reset inferred on a single register be implemented?
  • Asynchronous reset gets the port on the register
  • Synchronous reset gets a LUT input

Tip: Clock enables and synchronous sets and resets can be moved to the datapath
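The global-plus-local reset case described above can be sketched in Verilog as follows. This is only an illustration: the module and signal names are hypothetical, and how the resets are actually mapped is decided by the synthesis tool.

```verilog
// Sketch: one register with a global asynchronous reset and a local
// synchronous reset. All names are hypothetical.
module mixed_reset_reg (
    input  wire clk,
    input  wire global_rst,  // asynchronous, active-high
    input  wire local_rst,   // synchronous, active-high
    input  wire d,
    output reg  q
);
    always @(posedge clk or posedge global_rst) begin
        if (global_rst)
            q <= 1'b0;       // asynchronous: expected to take the register's reset port
        else if (local_rst)
            q <= 1'b0;       // synchronous: expected to become a LUT input ahead of D
        else
            q <= d;
    end
endmodule
```

Because only one reset port exists on the register, the asynchronous reset claims it and the synchronous reset has to be built from logic in front of the D input.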

What Creates Control Signals?

Instantiation of primitives and cores
– Gate-level connection of UNISIM and core primitives dictates control signal usage

Synthesis optimization
– Synthesis may choose to build a control signal for logic optimization

Physical synthesis
– Can change control sets from original specifications
– Global or logic optimization may choose to build a control signal for logic optimization

Tip: The cores you instantiate should share the same control signals you infer to minimize the number of control sets

Why Be Concerned?

Four registers per slice; all share the same control signals
– If the number of registers in the control set does not divide cleanly by four, some registers must go unused

This is of concern for designs that have several very low fanout control signals

A design with a large number of control sets potentially can show lower utilization of registers (but not always)

Tip: Try to build datapaths in byte-wide widths for the highest device utilization
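As a rough worked example of the packing rule above (a sketch under the slide's premise that four registers share a slice and that no register from another control set can fill the leftover positions), a control set with n registers occupies and wastes:

```latex
\text{slices}(n) = \left\lceil \frac{n}{4} \right\rceil,
\qquad
\text{unused}(n) = 4\left\lceil \frac{n}{4} \right\rceil - n
% n = 5: slices = 2, unused = 3
% n = 8: slices = 2, unused = 0
```

A 5-bit register group with its own enable therefore leaves three flip-flop positions idle, while a byte-wide group leaves none.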

What Designs Are Okay?

Designs with plenty of flip-flops to spare
– Designs with low flip-flop-to-LUT ratios
  • These are generally slow or lightly pipelined designs or ASIC prototypes
– Designs with lots of room in a particular device

Designs with a small number of control sets are preferable
– The key is to evaluate slices and CLBs that have wasted registers
– Try to build designs with common control signals (plan)

Designs with datapaths divisible by four are not affected even if they have a high number of control sets
– Such as byte-wide enables or data control registers, for example

Active-Low Control Signals

Problem: Active-low control signals can produce sub-optimal results

Why?
– Control ports on Virtex-5 FPGA registers are active-high
– Hierarchical design methods

This results in…
– Poorer utilization
  • More LUTs
  • Less dense slice packing
  • More routing resources necessary
– Longer run times
  • Prohibits hierarchical design flows
  • More difficult timing
– Worse timing and power

Tip: Use active-high signals for CEs, sets, and resets
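A minimal sketch of a register coded with active-high controls is shown below. The module and port names are hypothetical; the reset is given priority over the clock enable, which matches the register behavior described elsewhere in this module.

```verilog
// Sketch: byte-wide register with active-high synchronous reset and
// active-high clock enable. Names are hypothetical.
module active_high_reg (
    input  wire       clk,
    input  wire       rst,   // synchronous, active-high
    input  wire       ce,    // active-high clock enable
    input  wire [7:0] d,
    output reg  [7:0] q = 8'h00
);
    always @(posedge clk) begin
        if (rst)
            q <= 8'h00;      // reset has priority over the enable
        else if (ce)
            q <= d;
    end
endmodule
```

If a board-level signal is active-low, inverting it once at the top level and distributing the active-high version is one way to keep inverters out of the lower levels of hierarchy.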

Use Active-High Control Signals

Hierarchical design methods can proliferate LUT usage on active-low control signals

[Figure: flip-flop diagram with the following callouts]
– The inverters cannot be combined into the same slice
– This consumes more power and makes timing difficult

Why Synchronous Resets?

Each DSP48E has ~250 registers; none have asynchronous reset

The DSP slice is more versatile than most realize
– The XC5VLX50 device has ~12,000 DSP slice registers
– The XC5VLX330 device has ~48,000 DSP slice registers
– Can be used for multipliers, add/sub, MACC, counters (with programmable terminal count), comparators, shifters, multiplexers, pattern match, and many other logic functions

Many designs that run out of slices are not fully utilizing the DSP48E
– Synthesis tools will infer the DSP48E for multipliers, but they are not smart enough to infer other functions
  • Can control synthesis use with attributes, but NOT if an asynchronous reset is used

Tip: Use sync reset when using the DSP slice resources
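As an illustration of the tip above, here is a hedged sketch of a registered multiplier with a synchronous reset. The names and widths are hypothetical, and whether the multiplier and its output register land in a DSP slice depends on the synthesis tool and its settings.

```verilog
// Sketch: registered signed multiplier with a synchronous reset.
// Names and widths are illustrative only.
module mult_sync_rst (
    input  wire               clk,
    input  wire               rst,  // synchronous, active-high
    input  wire signed [17:0] a,
    input  wire signed [17:0] b,
    output reg  signed [35:0] p = 36'sd0
);
    always @(posedge clk) begin
        if (rst)
            p <= 36'sd0;     // synchronous reset, compatible with DSP slice registers
        else
            p <= a * b;
    end
endmodule
```

Changing the sensitivity list to include an asynchronous reset would describe a register that, per the slide, the DSP slice cannot implement.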

Why Synchronous Resets?

Block RAMs obtain minimum clock-to-output time by using the output registers
– Output registers only have synchronous resets

Unused block RAMs can be used for many alternative purposes
– ROMs, large LUTs, complex logic, state machines, deep-shift registers, etc.

Using unused block RAMs for other purposes can free up hundreds of flip-flops
– Using the block RAM in dual-port mode allows for greater utilization of this resource

Many designs that run out of slices are not fully utilizing the block RAM resources
– Synthesis tools are not yet smart enough to infer less obvious functions

Tip: Use sync reset when using the block RAM resources
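The following sketch shows one way a memory with a synchronously reset output register might be described. Names and sizes are hypothetical, and whether the second register is absorbed into the block RAM output register is up to the synthesis tool.

```verilog
// Sketch: single-port RAM (write-first) followed by an output register
// that has a synchronous reset. Names and sizes are illustrative.
module bram_out_reg #(
    parameter ADDR_W = 10,
    parameter DATA_W = 18
) (
    input  wire              clk,
    input  wire              rst,   // synchronous; affects the output register only
    input  wire              we,
    input  wire [ADDR_W-1:0] addr,
    input  wire [DATA_W-1:0] din,
    output reg  [DATA_W-1:0] dout = {DATA_W{1'b0}}
);
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];
    reg [DATA_W-1:0] rd_data;

    always @(posedge clk) begin
        if (we) begin
            mem[addr] <= din;
            rd_data   <= din;        // write-first: new data appears on the read path
        end else begin
            rd_data   <= mem[addr];  // synchronous read
        end
    end

    always @(posedge clk) begin
        if (rst)
            dout <= {DATA_W{1'b0}};  // synchronous reset of the output stage
        else
            dout <= rd_data;
    end
endmodule
```

The extra register adds one cycle of read latency in exchange for the shorter clock-to-output time mentioned above.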

Why Synchronous Resets?

Synthesis could choose to move low-fanout synchronous resets from a control signal to the datapath to free up more registers
– Synthesis tools can do this, but it may depend on synthesis settings (may not be on by default)
– The Xilinx implementation tools cannot change what is synthesized

This could allow packing of this register into a slice previously not possible
– Can improve timing as well as register density

[Figure: register with D and S inputs; labels: Low Fanout]

Why Synchronous Resets?

Synchronous resets are automatically timed
– Do not need any special timing constraints
– Do not need special switches or settings to analyze timing

Synchronous resets are inherently more predictable
– Less susceptible to accidentally missed timing, runt pulses, or other phenomena upsetting logical functionality
– Less prone to a race condition
  • Release of an asynchronous signal may not always have predictable results

Tip: Synchronous resets minimize the testing your design needs

Caveats to Synchronous Resets

Synchronous resets may make timing more difficult, make the design larger, and result in longer run times

Why?
– The implementation tools automatically time synchronous reset paths
– This can result in
  • More timing paths to analyze and meet timing (on average, a ~five percent increase in the number of timing paths)
  • More replication of design resources
  • With some synthesis tools this will use fewer SRLs, block RAM, DSP48s, and other dedicated hardware

Changing to Synchronous Resets

All new code should use synchronous resets when a reset is necessary

For existing code, you have three choices
– Leave alone
  • Acknowledge the possible drawbacks of asynchronous resets
– Use synthesis switch
  • Not the same as changing to synchronous reset but can help
  • Synplify: syn_clean_reset
  • XST: -async_to_sync YES
– Manually (or with a script) change the asynchronous reset to synchronous
– Removing the top-level reset port does not get the same result
  • Remove the reset from your code

No Resets Is Best

Why No Resets at All?

Using synchronous logic frees up additional logic

Designs in which the resets were removed resulted in an average of 3.5 percent fewer registers

Synthesis can realize this additional logic automatically

Tip: This makes it easier for the mapper to group this register with registers of a different control set

Why No Resets at All?

Synthesis can infer SRL-based shift registers
– But only if no resets are used (otherwise flip-flops are wasted)
– Or, the synthesis tool can emulate the reset (not what you want)

The SRL is also useful for synchronous FIFOs, non-binary counters, terminal count logic, pattern generators, and reconfigurable LUTs

Tip: NO reset saves a lot of flip-flops
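A shift register written without any reset, as the slide recommends, might look like the sketch below. The module name and the DEPTH parameter are hypothetical (DEPTH is assumed to be at least 2), and SRL inference itself remains a synthesis-tool decision.

```verilog
// Sketch: reset-free shift register that a synthesis tool can map to
// an SRL. Names and DEPTH are illustrative; DEPTH >= 2 is assumed.
module srl_delay #(
    parameter DEPTH = 16
) (
    input  wire clk,
    input  wire ce,
    input  wire d,
    output wire q
);
    reg [DEPTH-1:0] sr = {DEPTH{1'b0}};  // initial value instead of a reset

    always @(posedge clk)
        if (ce)
            sr <= {sr[DEPTH-2:0], d};    // shift in at bit 0

    assign q = sr[DEPTH-1];              // output after DEPTH enabled clocks
endmodule
```

Only the last tap is read, which keeps the description compatible with a single shift-register LUT; adding a reset branch would force individual flip-flops or reset-emulation logic.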

Why No Resets at All?

Routing can be considered one of the most valuable resources

Resets compete for the same resources as the rest of the active signals of the design
– Including timing-critical paths
– More available routing gives the tools a better chance to meet your timing objectives

Tip: NO reset saves routing and improves design speed

Why No Resets at All?

Even more block RAM inference

Why?
– Virtex-5 FPGA RAMs
  • RAM enable has precedence over reset
– Virtex-5 FPGA registers
  • Reset has priority over the clock enable
– Coding for this functionality makes no sense
– With no reset, the enable precedence has no consequence

Tip: NO reset gets more block RAMs

Why No Resets at All?

Designs without resets have fewer timing paths
– On average, 18 percent fewer timing paths

Results in less run time

Improved performance

Less memory necessary during PAR

Tip: NO reset builds a faster design and saves run time

How Do I Get By?

Some designs can get away without any resets, but many designs need some resets
– Very few designs require resets on all registers
  • Most ASICs require a described reset on every register
  • Implement this with the built-in Global Set/Reset (GSR)

Suggestion
– Be selective when you code resets (FSMs, I/O, and flushing data)
  • Only place resets that have impact on functionality

How Do I Get By?

Initialize all registers in VHDL / Verilog code
– This should be done whether using a reset or not

Perform RTL simulation of the design
– If it functions during simulation, it should function on the FPGA

VHDL:

signal my_register : std_logic_vector(7 downto 0) := (others => '0');

Verilog:

reg [7:0] my_register = 8'h00;
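To show the initialization style in context, here is a small sketch of a free-running counter that relies on its declared initial value rather than on a reset. The module name is hypothetical.

```verilog
// Sketch: counter with an initial value and no reset.
// The initial value is what the register holds after configuration
// and what the simulator starts from.
module init_counter (
    input  wire       clk,
    output reg  [7:0] count = 8'h00
);
    always @(posedge clk)
        count <= count + 8'h01;   // wraps from 8'hFF back to 8'h00
endmodule
```

Because simulation starts from the same declared value, RTL simulation does not need a reset pulse to bring this register out of the unknown state.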

Summary

If your design barely fits, Xilinx recommends reducing the size of your design before trying to gain timing closure
– Most of these tips reduce design size

Try to minimize the number of control sets your design uses

Asynchronous resets can inhibit optimization of general logic (can force additional LUT inputs to be used)

Synchronous resets allow synthesis tools to convert a control signal reset to the datapath

Avoid the use of global resets
– Initialize all registers from your HDL
– If you have to, use the Startup_Virtex5 primitive to access the GSR net

Summary (continued)

Xilinx recommends NOT using the synthesis option to convert asynchronous resets to synchronous

Avoid resets on SRLs (no reset functionality)

Avoid asynchronous resets on block RAMs (the block RAM’s output register only supports a synchronous reset)

Avoid asynchronous resets on DSP slice resources (their flip-flops only support a synchronous reset)

Be aware of the difference between coding for a block RAM’s control signal precedence and a flip-flop’s precedence

Use active-high control signals

If you can design out your global reset, you will save a lot of routing and build a faster design

Where Can I Learn More?

Xilinx Online Documents – support.xilinx.com
  • To search for an Application Note or White Paper, click the Documentation tab and enter the document number (WP231 or XAPP215) in the search window
  • White papers for reference
    WP231 – HDL Coding Practices to Accelerate Design Performance
    WP248 – Retargeting Guidelines for Virtex-5 FPGAs
    WP275 – Get your Priorities Right – Make your Design Up to 50% Smaller
  • User guides for reference
    UG193 - Virtex-5 FPGA XtremeDSP Design Considerations

Additional Online Training
– www.xilinx.com/training

Virtex-5 FPGA HDL Coding Techniques

Part 2

Clock Enable

Control the use of clock enables from the code
– Code them only when needed
– If a low-fanout CE is necessary, use synthesis attributes to control the use of control signals at the signal or module level
  • Do not use global switches to turn off the use of CEs (results in an average 25-percent LUT increase)
– Consider using alternative coding methods for low-fanout clock enables
  • This will map the CE as an input to the LUT

Conventional coding (CE uses the register's clock-enable port):

VHDL: if CE = '1' then Q <= A; end if;

Verilog: if (CE) Q <= A;

Alternative coding (CE becomes a LUT input; Q loads A when CE is high and holds otherwise):

VHDL: Q <= (CE AND A) OR ((not CE) AND Q);

Verilog: Q <= (CE & A) | (~CE & Q);

Tip: Code low-fanout CEs for a LUT input. This will enable the flip-flop to be part of a larger control set
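The two clock-enable styles can be placed side by side in a small sketch. Module and signal names are hypothetical. Both registers behave identically (load `a` when `ce` is high, hold otherwise); a synthesis tool may still re-extract a clock enable from the second form unless attributes tell it not to.

```verilog
// Sketch: the same enabled register coded two ways.
module ce_styles (
    input  wire clk,
    input  wire ce,
    input  wire a,
    output reg  q_ce  = 1'b0,   // enable inferred on the register's CE port
    output reg  q_lut = 1'b0    // enable written as datapath logic
);
    // Conventional style: CE becomes a control signal.
    always @(posedge clk)
        if (ce)
            q_ce <= a;

    // Datapath style: hold-or-load mux in front of D, so CE is a LUT input.
    always @(posedge clk)
        q_lut <= (ce & a) | (~ce & q_lut);
endmodule
```

In the second form the flip-flop has no dedicated enable, so it can share a slice with registers from a control set that has no CE.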

Map Report

MAP will report on the number of control sets for a particular design (Virtex-5 FPGA only)

Running MAP with the -detail switch will give a detailed analysis of the number of unique control signals (can be a large report)

Control sets with a low number of members (the fewest flip-flops per control set) are of concern

Global Clock Enable

To gate entire clock domains for power reduction, use the clock-enabled global buffer resource BUFGCE
– For applications that only pause the clock on small areas of the design, use the clock enable pin of the FPGA register

Tip: This will save general routing resources
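A minimal sketch of gating a whole clock domain with the BUFGCE primitive is shown below. The wrapper module and its signal names are hypothetical; BUFGCE is the Xilinx library primitive with ports O, CE, and I, so simulating this requires the Xilinx simulation libraries.

```verilog
// Sketch: gate an entire clock domain with a BUFGCE.
// Wrapper and signal names are illustrative only.
module gated_domain (
    input  wire clk_in,
    input  wire domain_en,   // high = clock runs, low = clock held
    input  wire d,
    output reg  q = 1'b0
);
    wire clk_gated;

    BUFGCE bufgce_inst (
        .O  (clk_gated),     // gated global clock
        .CE (domain_en),     // clock enable for the whole domain
        .I  (clk_in)
    );

    always @(posedge clk_gated)
        q <= d;              // no per-register enable needed in this domain
endmodule
```

Registers in the gated domain then need no individual clock-enable routing, which is where the routing saving in the tip comes from.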

DSP Slice

Use adder chains instead of adder trees
– Adder trees tend to have varying size
  • This usually makes larger adders in the last stages, which increases logic levels
– The Virtex-5 FPGA uses adder chains which obtain peak performance and use minimal power
  • Requires pipelining
  • Adds latency

[Figure: Adder Tree versus Adder Chain]

Tip: Use adder chains instead of adder trees
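As an illustration of the chain structure, the sketch below sums four operands through three registered adder stages, delaying the later operands so that values from the same input cycle meet at each adder. Names and widths are hypothetical, and mapping onto cascaded DSP slices is left to the synthesis tool.

```verilog
// Sketch: pipelined adder chain for a + b + c + d (latency: 3 clocks).
module adder_chain #(
    parameter W = 16
) (
    input  wire         clk,
    input  wire [W-1:0] a, b, c, d,
    output reg  [W+1:0] sum = 0
);
    reg [W:0]   s1   = 0;             // a + b
    reg [W+1:0] s2   = 0;             // a + b + c
    reg [W-1:0] c_d1 = 0;             // c delayed 1 clock
    reg [W-1:0] d_d1 = 0, d_d2 = 0;   // d delayed 1 and 2 clocks

    always @(posedge clk) begin
        // balancing registers keep operands aligned with the partial sums
        c_d1 <= c;
        d_d1 <= d;
        d_d2 <= d_d1;

        // each stage is one adder plus one register
        s1  <= a + b;
        s2  <= s1 + c_d1;
        sum <= s2 + d_d2;
    end
endmodule
```

Every stage has the same single-adder depth, which is the property the slide contrasts with the growing adders at the end of a tree; the cost is the extra latency and the balancing registers.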

Block RAM

Avoid “read before write” mode for fastest performance
– This is easily inferred from the coding style of your memory or by instantiation from the CORE Generator™ tool

Synplify and other third-party synthesis tools can insert bypass logic to prevent a possible mismatch error between your RTL and hardware behavior
– Intended to force RAM outputs to a known value when read and write operations occur on the same memory cell
– If you know this will never happen, you can prevent this logic from being added and damaging your performance with an attribute
  • attribute syn_ramstyle of mem : signal is "no_rw_check";

Tip: Infer or instantiate the memory that is most appropriate
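One coding style that avoids describing read-before-write behavior is the "no change" style sketched below, in which the output register is updated only on cycles without a write. Names and sizes are hypothetical, and this is just one of several styles a synthesis tool may recognize.

```verilog
// Sketch: single-port RAM in a "no change" style.
// During a write, dout keeps its previous value.
module ram_no_change #(
    parameter ADDR_W = 10,
    parameter DATA_W = 16
) (
    input  wire              clk,
    input  wire              en,
    input  wire              we,
    input  wire [ADDR_W-1:0] addr,
    input  wire [DATA_W-1:0] din,
    output reg  [DATA_W-1:0] dout
);
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];

    always @(posedge clk) begin
        if (en) begin
            if (we)
                mem[addr] <= din;     // write: output unchanged
            else
                dout <= mem[addr];    // read only when not writing
        end
    end
endmodule
```

Since the old contents are never read in the same cycle as a write to that address, the description does not depend on read-before-write hardware behavior.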

I/O Registers

IOB registers provide fixed setup and clock-to-output times
– Fastest way to capture input data and clock data off the device

IOB registers can make it difficult to meet internal timing
– Their use can lengthen route delays to internal logic
– Only use IOB registers when it is necessary to meet I/O timing
  • It is best to allow your synthesis tool to put registers into IOBs based on timing constraints (if your tool supports this)
  • Otherwise complete the following steps…

1) Disable global I/O register usage in your synthesis tool

2) Disable the Map option to pack registers into IOBs (PAR)

3) Selectively move registers into IOB with a UCF attribute

Tip: Only use IOB registers when necessary to meet I/O timing

Design Hierarchy

Register all inputs and outputs to each hierarchical block
– Or at least register the outputs

Place all I/O components at the top level
– This includes I/O registers, DDR, SERDES, and delay elements
– If not, place them in one block of hierarchy

Any logic that needs to be placed in a single resource (such as a single DSP slice) should be contained in a single hierarchical block

Any logic that needs the synthesis tool to use resource sharing should be placed in a single hierarchical block

Manually duplicate registers with high fanout at a hierarchical boundary

Tip: Following these guidelines makes your design less likely to interfere with design optimization and incremental design practices

Synthesis Options

Replicate registers with high fan-out
– This allows high fan-out logic to be moved closer to destinations
– This can be determined from a timing report
– Manual duplication or replication constraints with the synthesis tools should be applied

Retiming option should be used, especially if design has been pipelined
– Pipelining is still encouraged, but not as essential

Tip: Duplicate high fan-out logic, pipeline as needed, and if you pipeline use retiming
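Manual duplication of a high-fanout register could be sketched as follows. All names are hypothetical. The `equivalent_register_removal` attribute is shown on the assumption that XST is the synthesis tool; other tools use different attributes to stop the duplicates from being merged again, so check your tool's documentation.

```verilog
// Sketch: one enable register duplicated by hand so that each copy
// drives only part of the load. Names are illustrative only.
module dup_enable (
    input  wire       clk,
    input  wire       en_in,
    input  wire [7:0] d_a, d_b,
    output reg  [7:0] q_a = 8'h00,
    output reg  [7:0] q_b = 8'h00
);
    // Assumption: XST-style attribute asking the tool to keep both copies.
    (* equivalent_register_removal = "no" *) reg en_a = 1'b0;
    (* equivalent_register_removal = "no" *) reg en_b = 1'b0;

    always @(posedge clk) begin
        en_a <= en_in;           // copy 1 serves the q_a logic
        en_b <= en_in;           // copy 2 serves the q_b logic
        if (en_a) q_a <= d_a;
        if (en_b) q_b <= d_b;
    end
endmodule
```

Each copy can then be placed near the logic it drives, which is the benefit the slide describes for replication.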

Synthesis Options

Overconstraining during synthesis can significantly increase register use
– Seen as an average increase of 1–5 percent
– Do NOT over-constrain during synthesis

Global optimization can lead to mixed results
– Can achieve ~10 percent flip-flop reduction
  • Gives back much of that (and sometimes more) due to control signals

FSM optimization
– Turning off FSM optimization can yield a small flip-flop savings
– One-hot encoding is not as useful

Do NOT use slice or LUT compression switches
– In some cases, latch-thrus are used and consume registers

Tip: Do NOT over-constrain and do NOT use slice or LUT compression

Synthesis Options Summary

To help meet your timing objectives…
– Turn ON logic replication and retiming
– Turn OFF resource sharing
– Turn ON logic optimization (widening deep data paths)
– Turn OFF FSM optimization
– Do NOT over-constrain during synthesis
– Do NOT use slice or LUT compression switches

These synthesis options make the design larger, but save FFs and give the PAR algorithms more flexibility to meet timing

Easiest Designs to Migrate to the Virtex-5 FPGA

Designs that can utilize the new hard IP
– EMAC, DSP slice, block RAM, PowerPC® 440 processor, and PCI™ technology, for example

Low-power designs that use the dedicated IP

“Slow” designs
– Designs with several LUT levels generally see greater speed due to the LUT6 and improved routing architecture

Tip: Add as much IP to your design as you can

Toughest Designs to Migrate to the Virtex-5 FPGA

Structural designs
– Designs that have not been coded properly (as just discussed)
– Designs that have NOT been resynthesized
– Designs that use many old netlists and cores from previous architectures

Some types of DSP designs

Heavily pipelined designs

What is in common?
– They were not optimized!

Tip: Use the coding techniques described in these recorded modules and you will get the high-speed design you hoped for

Common Questions

“Why can’t I code how I want to?”
– You can. As long as it is synthesizable (RTL), Xilinx can build it. This module highlights some of the lesser known trade-offs of coding styles in terms of area, power, and performance.

“Shouldn’t the tools be able to make my code optimal?”
– Some coding styles make this more difficult
  • While FPGAs are programmable, the underlying dedicated hardware is fixed

Common Questions

“The Virtex-5 FPGA should always be a speed grade faster than the Virtex-4 FPGA, right?”
– No, this is not always true, particularly for heavily pipelined designs.

“This design easily fit in the Virtex-4 FPGA and now it can’t fit in the Virtex-5 FPGA. What’s wrong?”
– Check how many control sets your design has. If you have too many, you may need to evaluate your use of control signals. Also, check that your cores and your use of the dedicated hardware are optimal.

“Why can’t the software just optimize my inverters across a partition?”
– Remember that partitions are there to preserve hierarchy and parts of your design. Allowing any tool to selectively remove an option is counterintuitive.

Summary

Follow our synthesis recommendations…
– Turn ON logic replication and retiming
– Turn OFF resource sharing
– Turn ON logic optimization (widening deep data paths)
– Turn OFF FSM optimization
– Do NOT over-constrain during synthesis
– Do NOT use slice or LUT compression switches

Be careful with coding unnecessary clock enables

IOB registers can make it more difficult to meet internal timing
– Follow our directions to use the IOB registers only for I/O timing

Follow our guidelines to ensure that your design does not interfere with design optimization and incremental design practices

Where Can I Learn More?

Xilinx Online Documents – support.xilinx.com
  • White papers for reference
    WP231 – HDL Coding Practices to Accelerate Design Performance
    WP248 – Retargeting Guidelines for Virtex-5 FPGAs
    WP275 – Get your Priorities Right – Make your Design Up to 50% Smaller
  • User guides for reference
    UG193 - Virtex-5 FPGA XtremeDSP Design Considerations
  • Software Manuals (found from the web or the Help menu)
    Constraints Guide

Additional Online Training
– www.xilinx.com/training


Xilinx is disclosing this Document and Intellectual Property (hereinafter “the Design”) to you for use in the development of designs to operate on, or interface with Xilinx FPGAs. Except as stated herein, none of the Design may be copied, reproduced, distributed, republished, downloaded, displayed, posted, or transmitted in any form or by any means including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of Xilinx. Any unauthorized use of the Design may violate copyright laws, trademark laws, the laws of privacy and publicity, and communications regulations and statutes.

Xilinx does not assume any liability arising out of the application or use of the Design; nor does Xilinx convey any license under its patents, copyrights, or any rights of others. You are responsible for obtaining any rights you may require for your use or implementation of the Design. Xilinx reserves the right to make changes, at any time, to the Design as deemed desirable in the sole discretion of Xilinx. Xilinx assumes no obligation to correct any errors contained herein or to advise you of any correction if such be made. Xilinx will not assume any liability for the accuracy or correctness of any engineering or technical support or assistance provided to you in connection with the Design.

THE DESIGN IS PROVIDED “AS IS" WITH ALL FAULTS, AND THE ENTIRE RISK AS TO ITS FUNCTION AND IMPLEMENTATION IS WITH YOU. YOU ACKNOWLEDGE AND AGREE THAT YOU HAVE NOT RELIED ON ANY ORAL OR WRITTEN INFORMATION OR ADVICE, WHETHER GIVEN BY XILINX, OR ITS AGENTS OR EMPLOYEES. XILINX MAKES NO OTHER WARRANTIES, WHETHER EXPRESS, IMPLIED, OR STATUTORY, REGARDING THE DESIGN, INCLUDING ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND NONINFRINGEMENT OF THIRD-PARTY RIGHTS.

IN NO EVENT WILL XILINX BE LIABLE FOR ANY CONSEQUENTIAL, INDIRECT, EXEMPLARY, SPECIAL, OR INCIDENTAL DAMAGES, INCLUDING ANY LOST DATA AND LOST PROFITS, ARISING FROM OR RELATING TO YOUR USE OF THE DESIGN, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THE TOTAL CUMULATIVE LIABILITY OF XILINX IN CONNECTION WITH YOUR USE OF THE DESIGN, WHETHER IN CONTRACT OR TORT OR OTHERWISE, WILL IN NO EVENT EXCEED THE AMOUNT OF FEES PAID BY YOU TO XILINX HEREUNDER FOR USE OF THE DESIGN. YOU ACKNOWLEDGE THAT THE FEES, IF ANY, REFLECT THE ALLOCATION OF RISK SET FORTH IN THIS AGREEMENT AND THAT XILINX WOULD NOT MAKE AVAILABLE THE DESIGN TO YOU WITHOUT THESE LIMITATIONS OF LIABILITY.

The Design is not designed or intended for use in the development of on-line control equipment in hazardous environments requiring fail-safe controls, such as in the operation of nuclear facilities, aircraft navigation or communications systems, air traffic control, life support, or weapons systems (“High-Risk Applications”). Xilinx specifically disclaims any express or implied warranties of fitness for such High-Risk Applications. You represent that use of the Design in such High-Risk Applications is fully at your risk.

© 2012 Xilinx, Inc. All rights reserved. XILINX, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx, Inc. All other trademarks are the property of their respective owners.

Trademark Information