avoiding metastability in fpga devices mapld 2009 david landoll applications architect mentor...

32
Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp.

Upload: janice-conley

Post on 11-Jan-2016

233 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Avoiding Metastability in FPGA Devices

MAPLD 2009

David LandollApplications ArchitectMentor Graphics Corp.

Page 2: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Today’s FPGAs

Onboard Computer

System on Chip / FPGA

Memory Controller

Timers IRQCtrl

UART DMA Controller

AHB/APB Bridge

AMBA AHB

32bit Data bus

Data Memory Parity Memory

CAN Controller

HDLC Controller

SpaceWire Controller

AMBA APB

FPU

HDLC Controller

CAN Switch

Leon CPU

EDAC Controller

Boot PROM

SDRAM SDRAM

Address/Control bus

RS422 SpaceWire HDLC1

Up/Downlink HDLC0

Up/Downlink CAN0 CAN1

Dual CAN transceiver

1.5V Linear Regulator

+3.3V

+3.3V +1.5V

CLK Generator

Configuration PROM

JTAG

I/O port

PPS out PPS in

DMA Controller UART

CPU AMBAAHB

• Fabrication advances provide more available silicon area• More functionality can weigh less and take up less space• Integrating/reusing capabilities lowers cost

2

Page 3: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Flight Management

MaintenanceDiagnostics

Image Processing

Integrated Avionics Processing

Flight Control & Avoidance Systems

Boeing 787: Integration’s Next StepFrom its central processor to its common data network, surveillance system and navigation system, the theme of the Boeing 787 Dreamliner is integration.

James W. Ramsey Communications

Weather Radar

3

Integration Presents New Challenges

Such integration usually involves multiple independent clock domains, which leads to clock-domain crossings and metastability errors!

Page 4: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Clock Domain Crossing (CDC) ErrorsUnpredictable Loss of Data

CDC problems— corrupt control and data signals— are subtle, intermittent, unpredictable— are the 2nd major cause of respins— are difficult to reproduce and debug — are temperature, voltage, and process sensitive— will only occur in hardware; often in the final design

Traditional verification techniques do not work for CDC signals

4

A CDC Verification methodology is needed to reduce the risk of CDC related data errors

4

Page 5: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityWhat the heck is it, anyway?

What is a clock?— Periodic pulsing signal— Digital logic uniformly connected to this signal— Acts as the Symphony Conductor – keeps logic in sync— Action happens across the logic at one specific point

Typically the “rising edge”

55

Vee, Vss: GND: 0V

Vcc, Vdd : +5V, +3.3V

Page 6: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityWhat the heck is it, anyway?

What’s in a register?— (Also known as a latch, flip-flop, etc)— Contain transistors that “trap” the input value at the

appropriate time E.g. rising edge of the clock

— How does this happen?

66

Page 7: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityThe Physics of a Register

Let’s take a look at a register— CMOS D-type transmission gate flip-flop

7

CLK

D Q

D

CLK

Q0

0

0

-- simple D-type flip-flop

process(CLK)

begin

if rising_edge(CLK) then

Q <= D;

end if;

end process;

7

Transistor Model of a D Flip-Flop

Page 8: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityThe Physics of a Register

Let’s take a look at a register— CMOS D-type transmission gate flip-flop

8

CLK

D Q

D

CLK

Q1

0

0

-- simple D-type flip-flop

process(CLK)

begin

if rising_edge(CLK) then

Q <= D;

end if;

end process;

8

Transistor Model of a D Flip-Flop

Page 9: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityThe Physics of a Register

Let’s take a look at a register— CMOS D-type transmission gate flip-flop

9

CLK

D Q

D

CLK

Q1

1

1

-- simple D-type flip-flop

process(CLK)

begin

if rising_edge(CLK) then

Q <= D;

end if;

end process;

9

Transistor Model of a D Flip-Flop

Page 10: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Transistor Model of a D flip-flop

MetastabilityThe Physics of a Register

Let’s take a look at a register— CMOS D-type transmission gate flip-flop

10

CLK

D Q

D

CLK

Q0

0

1

Only works if D has a “good value” at the rising edge of the clock(no Set-up/hold time violations)

-- simple D-type flip-flop

process(CLK)

begin

if rising_edge(CLK) then

Q <= D;

end if;

end process;

10

Page 11: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

MetastabilityThe Physics of a Register

When setup/hold conditions are violated, the output of a storage element becomes unpredictable

This effect is called metastability If not contained, metastability can propagate…

11

din tffMTBF

clk

1

fclk = Clock Frequency

fin = Input Signal Frequency

td = Duration of critical time window

CLK

D

Q

CLK

D Q

Setup/hold window

Metastability is UNAVOIDABLE in designs with multiple asynchronous clocks

11

Page 12: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Clock Domain Crossing signal

Clock Domain CrossingsGuaranteed to Cause Metastability

When 2 or more designs run on disparate clocks:— The clocks will continually skew, guaranteeing setup/hold violations — Signals from one design to another are “Clock Domain Crossings” (CDCs)

12

D

CLK

Q

Sensor System Guidance System

TxTx

Clock B

Clock AClock A

Setup/hold window

Signals that cross asynchronous clock

domains (CDC signals) WILL violate setup and

hold conditions

12

D

CLK

Q

Page 13: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Mitigating Clock Domain Crossing Issues

Problem:— Signals crossing a clock domain will violate set-up/hold— Impact: Control/data signals will be dropped/corrupted

Loss of Data

Approaches:— Avoid having systems that have multiple clocks

Although sensible, it’s becoming impossible

— Design around the problem Designer can add “synchronizers” to the design Metastability still happens, but nobody else sees it

— E.g. 2DFF, FIFO, etc.— “Fences in” metastability

13

13

Page 14: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Isolate Metastability: Synchronizers

Designers add synchronizers to reduce the probability of metastable signals

Synchronizers are sub-circuits that can prevent metastable values from being sampled across clock domains

— Take unpredictable metastable signals and create predictable behavior

14

Page 15: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Mitigating Clock Domain Crossing IssuesIsolate Metastability: Synchronizers

15

QQ

Clock AClock A Clock BClock B

ii i +1i +1 i +2i +2i -1i -1ii i +1i +1 i +2i +2i -1i -1

TxTx

Metastability window

When metastability occurs, the delay through a synchronizer becomes unpredictable

RxRx

ii i +1i +1 i +2i +2i -1i -1 i +3i +3

15

Page 16: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Invalid Command

Synchronizer Delays Can Reconverge with unexpected resultswith unexpected results

CDC signals cross with an assumed relationship Can be combinational, sequential, or deeply sequential Unpredictable delays on CDC paths lead to reconvergence errors

— Designs need logic to correctly handle reconvergence— Can occur on single-bit or multiple-bit signals

16

S1

S2

S3 S4FS

M In

pu

t

tx_d0

tx_d2

tx_d1

Sync 2

Sync 1

Sync 0

0 0

00 1

00 1

0 0 0

00 0

00 1

0

Valid Command – but delayedG

rey E

nco

der

Gre

y D

eco

der

0 0

01 1

11 1

1 0 0

01 0

11 1

1

Sync 10

0

00 0

01 1

1

16

Page 17: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

And, Synchronizers Fail if Misused

Synchronization between clock domains requires a transfer protocol

— Ensures data is predictably transferred between domains

These protocols must be verified

When protocol is violated— Data is lost— Simulation may not show a failure— Silicon will eventually show a

functional error

Synchronizer won’t function properly if the required Transfer Protocol is violated

17

Page 18: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verification Must Cover All Three CDC Problems

18

Clock domain crossings need:— Structured synchronization— Transfer protocols— Global reconvergence checking

Reconvergence problem

Missing sync problem Possible

protocol problem

18

Page 19: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Mitigating Clock Domain Crossing Issues

Problem:— Signals crossing a clock domain will violate set-up/hold— Impact: Control/data signals will be dropped/corrupted

Approaches:— Avoid having systems that have multiple clocks— Designer can add “synchronizers” to the design— Designer-added synchronizers + full CDC verification

Assures synchronizers are present and used correctly

19

19

Page 20: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Recommendations

During design planning1. Create systems/designs using 1 clk, 1 edge when possible

2. If multiple clocks are required, try to use 1 designer for both clock domains, and use coding guidelines

1. Use signal naming conventions

2. Many clock domain errors come from design changes, not the initial design

3. Limit “clock domain crossings” to specific areas or blocks in the design, when possible.

4. NOTE: These techniques can help assure synchronizers are present, but are unlikely to help identify reconvergence or CDC protocol issues.

3. When multi-clock design is required, plan for proper verification

1. How to we accomplish this?

20

20

My_sig_Tx

My_data_Tx_RegMy_data_Rx_sync1

My_data_Rx_sync2

Comb_sig_Rx

For Example:

• Append “_A_reg” to signals leaving A-clk register, “_A” for A-clk combo signals

• Leverage during code reviews - help identify missing synchronizers

• Make sure ONLY _A_reg signals go to synchronizers (no combo logic)

Page 21: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verifying CDC Synchronization

Problem:— Missing synchronizers will create metastability— Correctly placed but misused synchronizers won’t work— Reconvergence of synchronized signals can create

unexpected behavior Approaches:

— Simulation Digital logic simulators do NOT model transistor behavior Do not model “metastability”

21

21

Page 22: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

For example …

22

Q

D

CLK

Simulation captures a ‘1’ while silicon produces

either a ‘1’ or ‘0’

Setup Violation

Q

D

CLK

Hold Violation

Simulation Does NOT Reflect Silicon Behavior

Q in silicon Q in silicon

Simulation captures a ‘0’ while silicon produces

either a ‘1’ or ‘0’

22

Q in simulation Q in simulation

Page 23: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verifying CDC Synchronization

Problem:— Missing synchronizers will create metastability— Correctly placed but misused synchronizers won’t work— Reconvergence of synchronized Control logic bugs

Approaches:— Simulation

Won’t model CDC’s correctly to detect errors

— Static Timing Analysis Can be used to identify signals that cross domains Can be used as input for a manual review But…Won’t detect missing or incorrectly used synchronizers, or

reconvergence

23

23

Page 24: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verifying CDC Synchronization

Problem:— Missing synchronizers will create metastability— Correctly placed but misused synchronizers won’t work— Reconvergence of synchronized Control logic bugs

Approaches:— Simulation

Won’t model CDC’s correctly to detect errors

— Static Timing Analysis Identifies signals for manual review, but otherwise useless

— Manual Design Reviews Error prone (and very time consuming) Typically only identifies synchronizer structures, misses

reconvergence and invalid sync protocol usage Evidence suggests at least some synchronizers will be missed

24

24

Page 25: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

For Example…Trivial Reconvergence Error

Reconverging synchronized CDC signals - timing is unpredictable. Need to verify the downstream logic can handle variations

— Manually identifying the reconvergence is very hard— Manually identifying all possible behaviors is harder— Manually assuring logic will behave correctly – typically intractable

25

25

Page 26: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verifying CDC Synchronization

Problem:— Missing synchronizers will create metastability— Correctly placed but misused synchronizers won’t work— Reconvergence of synchronized Control logic bugs

Approaches:— Simulation - Won’t model CDC’s correctly to detect errors

— Timing Analysis - Identifies signals for review, but otherwise useless

— Manual Design Reviews - error prone, incomplete

— Lab Verification? Problem is intermittent, debug is impossible

— Spice simulation? – It *does* model transistors, but… Where will you get the “Spice deck”? (transistor level model)

Would be far too slow on a large FPGA

26

26

Page 27: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Verifying CDC Synchronization

Problem:— Missing synchronizers will create metastability— Correctly placed but misused synchronizers won’t work— Reconvergence of synchronized Control logic bugs

Approaches:— So - we need a new method that reliably:

Identifies ALL CDC signals, structures, reconvergence Assures ALL connected, functioning correctly Creates reports for manual reviews The EDA industry has responded

— 6 commercial tools now available…and counting— But…most won’t identify all 3 of our CDC issues

27

27

Page 28: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Mentor’s CDC Verification Technology

Who’s using our technology? Mil-Aero

— Honeywell, Inc. — L-3 Communications — Lockheed Martin Co — Ministry of Aerospace & Aeronautics — Northrop Grumman Corp — Raytheon — Rockwell Collins Inc. — SAAB Group — Thales

Commercial— Widely used in commercial space

The market leader in CDC verificationThe market leader in CDC verification

28

28

Page 29: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

Example Value from One Customer Design

— IEEE standard serial communications core— Used in 50-60 other COMMERCIAL ASIC products— Widely deployed (millions in use daily)

Placed core in a sensor guidance system— Found issues in the lab— Debugged FPGA for weeks— Suspected a CDC issue, but not sure…

Deployed Mentor’s CDC solution— Results same day— Found 199 serious CDC bugs!

■        45 Missing Synchronizers■        83 Incorrect Synchronizers■        76 Reconverging Signals■        11 other problems

— Most resulting from “more stressful” usage In production:

— Commercial ASIC : Customer issue – device is erratic, locks up— Avionics: Could result in an Airworthiness Directive

29

29

Page 30: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

SummaryRecommendations

During design planning1. Create systems/designs using 1 clk, 1 edge when possible

2. If multiple clocks are required, try to use 1 designer for all clock domains

3. When multi-clock design is required, plan for proper verification

During verification1. Watch for multiple clocks in designs (Tip – Count PLLs)

2. Ask how CDC issues are mitigated (remember there are 3)

Utilize commercial tools designed for detecting these problems1. Verify all 3 classes of CDC problems

1. Structural Verification

2. Protocols Verification

3. Reconvergance Verification

2. Use reports to aid manual reviews

3. Use CDC tools to support ROBUSTNESS

30

30

Page 31: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

In Conclusion …

Every multi-clock design is subject to metastability Traditional verification methodologies CANNOT

assure robustness

To properly mitigate the dangers of CDC, we strongly recommend a solution that… :

— Supports Manual Reviews— Automatically reports all sources of CDC problems— Has a proven CDC verification methodology &

customer success

31

31

Page 32: Avoiding Metastability in FPGA Devices MAPLD 2009 David Landoll Applications Architect Mentor Graphics Corp

32