Dealing with the Three Horrible Problems in Verification (transcript)
Dealing with the Three Horrible Problems in Verification
Prof. David L. Dill
Department of Computer Science
Stanford University
An excursion out of the ivory tower
0-In, July 1996, initial product design discussions: There are three horrible problems in verification:
1. Specifying the properties to be checked
2. Specifying the environment
3. Computational complexity of attaining high coverage
Up to then, I had assumed that the first two were someone else’s problem, and focused on the last.
I still think this is a reasonable framework for thinking about verification.
Topics
• Mutation coverage (Certess)
• System-Level Equivalence Checking (Calypto)
• Integrating verification into early system design (research)
• Conclusions
Typical verification experience
[Chart: bugs found per week over weeks of functional testing, tailing off into "tapeout purgatory" (based on fabricated data)]
Coverage Analysis: Why?
• What aspects of the design haven’t been exercised?
  – Guides test improvement
• How comprehensive is the verification so far?
  – Stopping criterion
• Which aspects of the design have not been well tested?
  – Helps allocate verification resources
Coverage Metrics
• A metric identifies important
  – structures in a design representation
    • HDL lines, FSM states, paths in a netlist
  – classes of behavior
    • Transactions, event sequences
• Metric classification based on level of representation
  – Code-based metrics (HDL code)
  – Circuit structure-based metrics (netlist)
  – State-space-based metrics (state transition graph)
  – Functionality-based metrics (user-defined tasks)
  – Spec-based metrics (formal or executable spec)
Code-Based Coverage Metrics
• On the HDL description
  – Line/code block coverage
  – Branch/conditional coverage
  – Expression coverage
  – Path coverage
• Useful guide for writing test cases
• Little overhead
• Inadequate in practice
  always @(a or b or s) // mux
  begin
    if (~s && p)
      begin
        d = a;
        r = x;
      end
    else if (s)
      d = b;
    else
      d = 'bx;
    if (sel == 1)
      q = d;
    else if (sel == 0)
      q = z;
  end
Circuit Structure-Based Metrics
• Toggle coverage: Is each node in the circuit toggled?
• Register activity: Is each register initialized? Loaded? Read?
• Counters: Are they reset? Do they reach the max/min value?
• Register-to-register interactions: Are all feasible paths exercised?
• Datapath-control interface: Are all possible combinations of control and status signals exercised?
[Figure: control FSM (states sinit, s2-s6) connected to a datapath]
(0-In checkers have these kinds of measures.)
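Toggle coverage in particular is easy to sketch. A minimal illustration in Python, with an invented trace format (node name mapped to a list of sampled values); this is not any vendor's implementation:

```python
# Toggle-coverage sketch: a node counts as covered only if it has been seen
# both rising (0 -> 1) and falling (1 -> 0) during simulation.
# The trace format and signal names are illustrative assumptions.

def toggle_coverage(traces):
    """traces: dict mapping node name -> list of sampled 0/1 values."""
    covered = {}
    for node, samples in traces.items():
        pairs = list(zip(samples, samples[1:]))
        rose = any(a == 0 and b == 1 for a, b in pairs)
        fell = any(a == 1 and b == 0 for a, b in pairs)
        covered[node] = rose and fell
    return covered

traces = {
    "req":   [0, 1, 1, 0, 1],    # toggles both ways: covered
    "ack":   [0, 0, 1, 1, 1],    # only rises: not covered
    "rst_n": [1, 1, 1, 1, 1],    # never toggles: not covered
}
cov = toggle_coverage(traces)
```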
Observability problem
• A buggy assignment may be stimulated, but still missed
• Examples:
  – Wrong value generated speculatively, but never used
  – Wrong value computed and stored in a register, read 1M cycles later, but simulation doesn’t run that long
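A toy illustration of this observability gap, with invented register names: the buggy assignment is exercised, yet the only value the checker compares is untouched, so the test passes.

```python
# Observability sketch: the buggy assignment is activated (a wrong value is
# computed and stored), but the checker never reads the corrupted register,
# so the bug goes undetected. Register names and values are invented.

class Design:
    def __init__(self):
        self.regs = {"r0": 0, "r1": 0}

    def step(self, op):
        if op == "speculate":
            self.regs["r1"] = 41 + 2   # BUG: should be 41 + 1
        elif op == "compute":
            self.regs["r0"] = 10       # correct logic on the checked path

dut = Design()
dut.step("speculate")                  # bug activated: r1 holds 43, not 42
dut.step("compute")
checker_passes = (dut.regs["r0"] == 10)   # checker only observes r0
```

Line coverage of the `speculate` branch is 100%, yet the bug is invisible to the checker.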
Detection terminology
• To detect a bug– Stimuli must activate buggy logic
[Diagram: stimuli from the verification environment activate the bug inside the design under verification; outputs are compared against a reference model]
Detection terminology
• To detect a bug– Stimuli must activate buggy logic– The bug must propagate to a checker
[Diagram: as before, with the bug's effect propagating from the activation point toward the design's outputs]
Detection terminology
• To detect a bug– Stimuli must activate buggy logic– The bug must propagate to a checker– The checker must detect the bug
[Diagram: activation, propagation, and detection shown along the path from stimuli through the design to the comparison against the reference model]
Detection terminology
• Traditional verification metrics do not account for non-propagated or non-detected bugs
[Diagram: traditional verification metrics cover only activation; there is no visibility into propagation and detection]
Mutation testing
• To evaluate a testbench’s bug detection ability
  – Inject fake bugs into the design (“mutations”)
  – Simulate and see whether they are detected
  – If not, there is a potential gap in the testbench
• There can be many kinds of mutations
  – “Stuck-at” faults
  – Wrong logical or other operators
• Idea originates in software testing
  – But is obviously related to testability
• Efficient implementation is a challenge.
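A minimal sketch of the idea in Python, with an invented function and test suites (not the Certess implementation): the mutant survives a weak suite, exposing a gap, and is killed by a stronger one.

```python
# Mutation-testing sketch: inject a fake bug by swapping an operator, then
# see whether the test suite notices. The function and both suites are
# invented for illustration.

def saturating_add(a, b, limit=255):
    s = a + b
    return s if s <= limit else limit

def saturating_add_mutant(a, b, limit=255):
    s = a - b                          # mutation: '+' replaced by '-'
    return s if s <= limit else limit

def weak_suite(fn):
    return fn(2, 3) <= 255             # never checks the exact sum

def strong_suite(fn):
    return fn(2, 3) == 5 and fn(200, 100) == 255

mutant_survives_weak = weak_suite(saturating_add_mutant)    # gap in the suite
mutant_killed_strong = not strong_suite(saturating_add_mutant)
```

A surviving mutant points at stimuli, propagation, or checking that the testbench is missing: here the weak suite activates the mutated code but never checks its effect.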
Certess approach to Mutation Analysis
• Fault Model Analysis: static analysis of the design → Report
• Fault Activation Analysis: analysis of the verification environment behavior → Report
• Qualify the Verification Environment: measure its ability to detect mutations → Report
• Iterate if needed
Avoiding the horrible problems
• Qualify the test framework, not the design
  – Environment and properties are in the existing testbench
• High-quality coverage metric targets resources at maximizing useful coverage.
SEC (Sequential Equivalence Checking) Advantages
• SEC vs. Simulation
  – Simulation is resource intensive, with lengthy run times; SEC runs orders of magnitude faster than simulation
  – Vector generation is effort-laden and may be a source of errors; SEC requires minimal setup and no test vectors
  – Simulation output often requires further processing for answers; SEC is exhaustive (all sequences over all time)
• SEC vs. Property Checkers
  – Properties are created to convey specification requirements; SEC uses the golden model as the specification
  – Properties are often incomplete, and not independently verifiable
  – Properties are time consuming to construct
Enabling ESL™
• SLEC comprehensively proves functional equivalence
• Identifies design differences (bugs)
• Supports sequential design changes
  – State changes
  – Temporal differences
  – I/O differences
[Diagram: SLEC™ checks a reference model against an implementation model for equivalence]
SLEC Finds Functional Differences in C-C Verification
• Customer example
  – Verify that the HLS model is functionally equivalent to the reference model
  – Simulation uncovered no differences, for the given testbench
  – SLEC System found differences between the two models
    • The reference model was incorrect: probably a corner case not easily detectable by simulation
• Typical functional differences introduced during refinement
  – Code optimization for HLS
  – Datapath word-size optimization
  – Ambiguous ESL code (e.g., out-of-array-bounds access)
SLEC System finds all possible errors or inconsistencies. Simulation is not exhaustive, and therefore cannot fully prove equivalence.
[Diagram: behavioral C/C++ reference model and HLS C/C++ model, each in a wrapper, compared under user-defined input constraints; result: "C to C Verification Failed! DIFFERENCE FOUND!"]
Application Bugs Found
Design bugs caught by SLEC System:
• Wireless baseband: high-level synthesis bug in an array’s access range
• Video processing: design bug in logic on an asynchronous reset line
• Video processing: high-level synthesis bug in sign extension
• Custom DSP block: design bug in normalization of operands
• DCT function: high-level synthesis bug in “wait_until()” interpretation
• Image resizer: design bug at proof depth = 1
System-level Formal Verification
• Sequential Logic Equivalence Checking (SLEC)
  – Leverages system-level verification
  – Comprehensive verification: 100% coverage
  – Quick setup: no testbenches required
  – Rapid results: eliminates long regressions
  – Focused debug: short counterexamples
• Why is it needed?
  – Independent verification
  – Finds bugs caused by language ambiguities or incorrect synthesis constraints (e.g., shift left by -1, divide by zero)
  – Verifies RTL ECOs
  – Parallels the current RTL synthesis methodology
Application Bugs Found
High-level synthesis bugs found by SLEC:
• Multimedia processor: dead-end states created
• FFT: combinational loops created
• Quantize: divide by zero defined in RTL, but undefined in C code
• Ultra-wideband filter: shift left or right by N bits, when the value being shifted is less than N bits
• Multimedia processor: a shift by an integer in the C code could be a shift by a negative number, which is undefined in C
RTL to RTL Verification with Sequential Differences
• RTL pipelining
  – Latency and throughput changes
  – Clock speed enhancement
[Diagram: a single-stage schedule (cmd, data, calc, out per transaction) versus a pipelined schedule splitting calc into calcA and calcB; SLEC reports "verified equivalent" or a counterexample]
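The latency-shifted comparison can be sketched in Python with a toy datapath. The function and the pipeline split are invented, and a real SEC tool proves this for all input sequences rather than a sample trace:

```python
# Bounded sequential-equivalence sketch: a two-stage pipelined datapath
# should match the unpipelined one, with outputs shifted by one cycle of
# extra latency (absorbed here by draining the pipeline at the end).

def calc(x):
    return (x * 3 + 1) & 0xFF          # toy combinational datapath

def unpipelined(inputs):
    return [calc(x) for x in inputs]

def pipelined(inputs):
    stage = None                       # pipeline register between calcA and calcB
    out = []
    for x in inputs:
        if stage is not None:
            out.append((stage + 1) & 0xFF)   # calcB: add 1
        stage = (x * 3) & 0xFF               # calcA: multiply by 3
    if stage is not None:
        out.append((stage + 1) & 0xFF)       # drain the pipeline
    return out

equivalent = unpipelined(range(16)) == pipelined(range(16))
```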
RTL to RTL Verification with Sequential Differences
• RTL resource sharing
  – State and latency changes
  – Size optimization
[Diagram: two adders computing Sum = A + B + C in one cycle versus a single shared adder over two cycles; SLEC reports "verified equivalent" or a counterexample]
RTL to RTL Verification with Sequential Differences
• RTL re-timing
  – State changes
  – Slack adjustment
• Allows micro-architecture modifications without breaking testbenches
[Diagram: combinational logic moved across a register boundary, leaving reduced combinational logic before and after the flop; SLEC reports "verified equivalent" or a counterexample]
Designer’s Dilemma: Efficient Design for Power
• At 90nm and below, power is becoming the most critical design constraint
  – Exponential increase in leakage power consumption
  – Quadratic increase in power density
• Clock gating is the most common design technique used for reducing power
  – Designers manually add clock gating to control dynamic power
• Clock gating is most efficiently done at the RTL level, but is error prone
  – Mistakes in implementation cause delays and re-spins
  – Difficult to verify with simulation regressions
    • Requires modifications to testbenches
    • Insufficient coverage of clock gating dependencies
  – Aggressive clock gating approaches are sometimes rejected due to verification complexity
Addressing Power in the Design Flow
• Power management schemes are considered globally as part of the system model and initial RTL functionality
  – Sleep modes
  – Power down
• Power optimizations are local changes made to the RTL that do not affect the design functionality
  – Disabling previous pipeline stages when the data is not used
  – Data-dependent computation, like multiply by zero
[Diagram: design flow from system model, through high-level synthesis or manual creation, to RTL, then manual RTL optimization to optimized RTL, then physical implementation]
[Diagram: combinational clock gating (a CG cell driven by the enable, replacing the mux recirculation on each register) is checked with combinational equivalence checking; sequential clock gating requires sequential equivalence checking]
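A bounded-simulation sketch of the sequential check for the simplest case, an enable-gated register. The trace-based comparison is a stand-in for the exhaustive proof a real SEC tool performs:

```python
# Sequential-equivalence sketch for the simplest clock-gating case: a
# register whose clock is gated by its enable must behave like the ungated
# register that recirculates its value when the enable is low.

import random

def ungated(trace):
    q, out = 0, []
    for en, d in trace:
        q = d if en else q     # mux recirculation: reloads the old value
        out.append(q)
    return out

def gated(trace):
    q, out = 0, []
    for en, d in trace:
        if en:                 # clock edge suppressed when en is low
            q = d
        out.append(q)
    return out

random.seed(0)
trace = [(random.randint(0, 1), random.randint(0, 255)) for _ in range(64)]
equivalent = ungated(trace) == gated(trace)
```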
Research
• Verification is currently based on finding and removing bugs
• Finding bugs earlier in the design process would be beneficial
  – Early descriptions (protocol, microarchitecture) are smaller and more tractable
  – Early bugs are likely to be serious, possibly lethal
  – Bug cost goes up by >10x at each stage of design
• People have been saying this for years.
Why can’t we start verifying at the beginning of the design?
An Experiment
• DARPA-sponsored “Smart Memories” project starting up
• Have a verification PhD student (Jacob Chang) work with the system designers
  – Try to verify subsystems as soon as possible
  – Understand what “keeps designers awake at night”
  – Try to understand “design for verification” (willingness to trade off some system efficiency for verification efficiency)
Initial Results: Dismal
• Used a variety of formal verification tools
  – SRI’s PVS system
  – Cadence SMV
• Did some impressive verification work
• Didn’t help the design much
  – By the time something was verified, the design had changed
  – We knew this would be a problem, but our solutions weren’t good enough
Desperate measures required
• We discarded tools and used pen and paper
• This actually helped!
• Real bugs were found
• Design principles were clarified
• Designers started listening
What did we learn?
• Early verification methods need to be nimble
  – Must be able to keep up with design changes
• Existing formal methods are not nimble
  – Require comprehensive descriptions
  – A high level of abstraction helps…
  – But one description still takes on too many issues
  – So design changes necessitate major changes in descriptions: too slow!
Approach: Perspective-based Verification
• Need to minimize the number of issues that we tackle at one time.
• Perspective: a minimal high-level formalization of a design to analyze a particular class of properties
  – Perspectives should be based on the designer’s abstractions
    • What does he/she draw on the whiteboard?
  – Should capture the designer’s reasoning about correctness
Example: Resource dependencies
• Verify that the cache coherence message system is deadlock free
• Model
  – Dependency graph; check for cycles
• Analysis method
  – Search for cycles
  – In this case: by hand!
• System-level deadlocks are notoriously hard to find using conventional formal verification tools.
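The cycle search itself is ordinary graph traversal. A sketch in Python with an invented dependency graph (not the actual Smart Memories graph):

```python
# Resource-dependency sketch: "message class X can wait on Y" becomes an
# edge X -> Y; any cycle is a potential deadlock. Standard DFS with
# white/gray/black coloring finds a cycle if one exists.

def find_cycle(graph):
    """graph: dict node -> list of nodes it can wait on; returns a cycle or None."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def dfs(n, path):
        color[n] = GRAY
        for m in graph.get(n, []):
            if color.get(m, WHITE) == GRAY:        # back edge closes a cycle
                return path[path.index(m):] + [m]
            if color.get(m, WHITE) == WHITE:
                hit = dfs(m, path + [m])
                if hit:
                    return hit
        color[n] = BLACK
        return None

    for n in graph:
        if color[n] == WHITE:
            hit = dfs(n, [n])
            if hit:
                return hit
    return None

deps = {
    "proc_request":   ["out_cache_miss"],
    "out_cache_miss": ["mem_request"],
    "mem_request":    ["out_reply"],
    "out_reply":      ["proc_request"],   # closes the loop: deadlock risk
    "writeback":      [],
}
cycle = find_cycle(deps)
```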
Dependency graph (cache & memory)
[Figure: dependency graph between the cache controller and memory controller, covering DMA, uncached, sync-miss, cache-miss, wakeup, writeback, reply, and cancel message queues (MSHR/USHR), with annotations: "either drop wakeup if no entry, or store wakeup"; "always an entry in MSHR to accept replies"; "cache controller: any operation for which the MSHR or USHR and cancel are already allocated can sink input messages"]
Resource Dependency Perspective
1. Partial formalization of the design
  – Relevant details
    • Request buffer dependencies
    • Protocol dependencies (e.g., cancel must follow all other SyncOp commands)
    • Virtual channels in networks
  – Irrelevant details
    • Network ordering requirements
    • Cache controller ordering requirements
    • Buffer implementation
2. One class of verification properties
  – Deadlock freedom
3. Captures why the property is correct
  – Ensure no cycle in the resource dependency graph
Bug found
• Dependency cycle found
  – Takes into account the dependency behavior of
    • Virtual channels
    • Memory controller
    • Cache controller
• Easy to find once the formal model is constructed
  – Hard to find using simulation: all channels must be congested
• Bug found before implementation
[Diagram: cycle through cache controller (sync miss), memory controller (sync op unsuccessful/successful, wake up), and cache controller (replay)]
Parallel Transaction Perspective
• Many systems process a set of transactions
  – Memory reads/writes/updates
  – Packet processing/routing
• The user thinks of transactions as non-interfering processes
• Hardware needs to maintain this illusion
• Model: state transaction diagram
• Analysis: systematically check whether one transaction can interfere with another
Several important bugs were found by manually applying this method.
Parallel Transaction Perspective
1. Partial formalization of the design
  – Relevant details
    • Effect of a transition on self and others
    • Locking mechanism
  – Irrelevant details
    • Timing and ordering information
    • Buffering issues
    • Deadlock issues
2. Targets one verification property
  – Same behavior of a single process in a multi-process environment
3. Captures why the property is correct
  – Interrupts are conflict free
Transaction Diagram Verifier
• Tool developed for verification of the parallel transaction perspective
• User input
  – Invariants
  – Transition guards
  – Transition state changes
• Invariants are easy to see to be true for a single process
• TDV verifies the invariant for a single process, plus
  – that invariants remain true even if other processes execute at the same time
TDV
• User supplies
  – Blocks (transaction steps)
    • Pre-conditions, post-conditions, guards, assignments
  – Links between blocks (control flow)
• Tool loops through all pairs of blocks
  – Constructs the verification tasks
  – Verifies the tasks through another tool
    • The STP decision procedure
• Not a model checker
  – Verifies an unbounded number of transactions
  – Uses theorem-proving technology
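A brute-force sketch of the pair loop, in the style of interference-freedom checking. The blocks, preconditions, and tiny enumerated state space are invented, and the enumeration stands in for the STP decision procedure:

```python
# TDV pair-loop sketch: for every ordered pair of blocks, verify that
# executing one block cannot invalidate the other's precondition.

from itertools import product

STATES = list(product(range(4), range(4)))    # (a, b): one counter per transaction

blocks = {
    "incr_a": {"pre": lambda a, b: a < 3, "apply": lambda a, b: (a + 1, b)},
    "incr_b": {"pre": lambda a, b: b < 3, "apply": lambda a, b: (a, b + 1)},
}

def interference_failures(blocks):
    failures = []
    for name1, b1 in blocks.items():          # block whose precondition we protect
        for name2, b2 in blocks.items():      # potentially interfering block
            if name1 == name2:
                continue
            for s in STATES:
                if b1["pre"](*s) and b2["pre"](*s):
                    if not b1["pre"](*b2["apply"](*s)):   # b2 broke b1's precondition
                        failures.append((name1, name2, s))
    return failures

failures = interference_failures(blocks)      # empty: no interference found
```

A real decision procedure replaces the state enumeration, which is what lets the check cover unbounded state spaces and transaction counts.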
Tradeoffs
• Sacrifices must be made
  – Perspectives are necessarily partial
  – Not easy to link perspectives to RTL
  – Not easy to link perspectives to each other
• …but, at least, you can verify or find bugs while they’re still relevant to the design!
The horrible problems
• Perspectives omit irrelevant details
  – including irrelevant environmental constraints
• Properties are at the level the designer thinks at, so they are easier to extract
• Computational complexity is reduced as well
Conclusions
• Practical verification technology must take account of the three horrible problems
• Products currently on the market do this in innovative ways
  – Coverage analysis that is a closer match to actual bug-finding ability
    • Evaluates the existing verification environment
  – System-level equivalence checking avoids the need to add assertions
    • The environmental-constraint problem is reduced
• We need a new perspective on system-level verification