Page 1: Verification of Hierarchical Cache Coherence Protocols for Future Processors

Verification of Hierarchical Cache Coherence Protocols for Future Processors

Student: Xiaofang Chen

Advisor: Ganesh Gopalakrishnan

Page 2

Outline

Background

Proposed solutions

– High level hierarchical coherence protocol verification

– Refinement check: specifications vs. RTL implementations

Conclusion

Page 3

Hierarchical Cache Coherence Protocols

Chip-level protocols

Inter-cluster protocols

Intra-cluster protocols

[Figure labels: dir, mem]

Page 4

Modeling and Verification of Coherence Protocols

High-level modeling approaches

– Model checking

Low-level modeling: RTL or VHDL

– Simulation

Page 5

Problems with Hierarchical Coherence Protocols

For high level modeling

– Handle the complexity of hierarchical protocols

For RTL implementations

– Verify that an RTL implementation correctly implements the specification

Page 6

Example: Verification Complexity (I)

[Figure: three clusters (Home Cluster, Remote Cluster 1, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC, together with Main Mem and Global Dir]

Page 7

Example: Verification Complexity (II)

Tool: Murphi Verification

– IA-64 machine

– 18GB memory

– 40-bit hash compaction

– Non-conclusive after >30 hours of state enumeration
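The hash compaction mentioned above can be illustrated with a minimal sketch. It is not Murphi's implementation: the `compact` helper, the use of SHA-256 as the hash, and the toy two-counter system are all assumptions made for illustration. The idea shown is that the visited set stores a short signature (here 40 bits) per state instead of the full state, trading a small chance of missed states for memory.

```python
import hashlib
from collections import deque


def compact(state, bits=40):
    """Return a `bits`-wide signature of a state (hypothetical helper)."""
    digest = hashlib.sha256(repr(state).encode()).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def enumerate_states(initial, successors, bits=40):
    """Breadth-first state enumeration that stores only compacted signatures."""
    seen = {compact(initial, bits)}
    queue = deque([initial])
    visited = 0
    while queue:
        state = queue.popleft()
        visited += 1
        for nxt in successors(state):
            sig = compact(nxt, bits)
            if sig not in seen:       # a signature collision would wrongly skip nxt
                seen.add(sig)
                queue.append(nxt)
    return visited


def toy_successors(state):
    """Toy system (assumption): two counters, each incremented modulo 4."""
    a, b = state
    return [((a + 1) % 4, b), (a, (b + 1) % 4)]


count = enumerate_states((0, 0), toy_successors)
print(count)
```

Because two different states can share a signature, a run that finishes without errors is a probabilistic result rather than a proof; a wider signature lowers that risk at the cost of memory.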

Page 8

Differences in Modeling: Specs vs. Impls

[Figure: units home, client, buf, local, and cache, with step 1 and its sub-steps 1.1 to 1.5]

One step in high-level

Multiple steps in low-level

Page 9

Differences in Execution: Specs vs. Impls

[Figure: steps 1, 2, 3 and their sub-steps 1.1 to 1.3, 2.1 to 2.2, 3.1 to 3.3]

Interleaving in HL

Concurrency in LL

Page 10

Proposed Mechanisms

For high level modeling, develop

– A few M-CMP coherence protocols

– A compositional approach

For specifications vs. implementations, develop

– A formal theory

– A compositional approach

– A practical tool

Page 11

Thesis Timeline

[Timeline figure with marks at 2005, 2006, 2007, 2008, and Present]

Starting practices

– Predicate abstraction for Murphi

– Bounded transaction based testing

– Chen et al. UUCS-06-002, UUCS-06-003

Hierarchical protocols verification

– Abstraction + assume guarantee

– Inclusive M-CMP protocols

– Chen et al. FMCAD 2006

Transaction based refinement check

– Complete case study for a benchmark

– Chen et al. TECHCON 2007 (best session paper in verification)

Hierarchical protocols verification

– Improved approach: one level at a time

– Automated abstraction

– Non-inclusive M-CMP protocols

– Chen et al. HLDVT 2007

Extensions: refinement check

– Refinement theory

– Modular refinement check

– Chen et al. FMCAD 2007

Make muv a practical tool

Page 12

Outline

Background

Proposed solutions

– High level hierarchical coherence protocol verification

– Refinement check: specifications vs. RTL implementations

Conclusion

Page 13

An M-CMP Benchmark Protocol

[Figure: three clusters (Home Cluster, Remote Cluster 1, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC, together with Main Mem and Global Dir; the two protocol levels are labeled Inter-cluster and Intra-cluster]

Page 14

Protocol Features

Both levels use MESI protocols

– Intra-cluster: FLASH

– Inter-cluster: DASH

Silent drop on non-Modified cache lines

Network channels are non-FIFO

Inclusive caches

Page 15

Another Benchmark: Non-inclusive Caches

[Figure: three clusters (Home Cluster, Remote Cluster 1, Remote Cluster 2), each with two L1 Caches, an L2 Cache+Local Dir, and a RAC, together with Main Mem and Global Dir]

Page 16

Our Compositional Approach

Original protocol

Page 17

Our Compositional Approach

Page 18

One Way to Decompose Protocols

Create three abstract protocols

Each with 1 detailed cluster + 2 abstracted clusters

Page 19

Abstract Protocol #1

[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2 with Main Mem and Global Dir. One cluster is detailed (RAC, L2 Cache+Local Dir, two L1 Caches); the other two are abstracted (RAC, L2 Cache+Local Dir’)]

Page 20

Abstract Protocol #2

[Figure: Home Cluster, Remote Cluster 1, and Remote Cluster 2 with Main Mem and Global Dir. One cluster is detailed (RAC, L2 Cache+Local Dir, two L1 Caches); the other two are abstracted (RAC, L2 Cache+Local Dir’)]

Page 21

Problems with This Approach

Every abstract protocol contains 2 protocols

Duplicated behaviors in abstract protocols

State space still large

      # of states    Time (hour)   Mem (GB)
M1    284,088,425    12            18
M2    636,613,051    18            18

Page 22

Second Way to Decompose Protocols

[Figure: three abstract protocols labeled ABS #1, ABS #2, and ABS #3. One shows Home Cluster, Remote Cluster 1, and Remote Cluster 2 each reduced to a RAC and an L2 Cache+Local Dir’, together with Main Mem and Global Dir. The other two each show a single cluster (Home Cluster, Remote Cluster 1) with its L2 Cache+Local Dir and two L1 Caches]

Page 23

Model Checking Results

Approach             Model          # of states      Model check time (sec)   Use mem (GB)   Model check passed
Classical approach   Full model     > 438,120,000    > 125,410                18             Nonconclusive
First approach       Abs. model 1   284,088,425      44,978                   18             Yes
First approach       Abs. model 2   636,613,051      66,249                   18             Yes
Second approach      Abs. model 1   1,500,621        270                      1.8            Yes
Second approach      Abs. model 2   574,198          50                       1.8            Yes
Second approach      Abs. model 3   198,162          21                       1.8            Yes
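As a rough reading of the numbers above, the following sketch computes how much smaller the second approach's state spaces are. The figures are copied from the results on this slide; the variable names are illustrative, and the full-model count is only a lower bound because that run was inconclusive.

```python
# State counts copied from the Model Checking Results slide.
full_model_states = 438_120_000            # lower bound: the run did not finish
first_approach = [284_088_425, 636_613_051]
second_approach = [1_500_621, 574_198, 198_162]

# Total work for the second approach is the sum over its three abstract models.
total_second = sum(second_approach)

# Reduction factors (ratios of state counts, not of run time).
reduction_vs_full = full_model_states / total_second
reduction_vs_first = max(first_approach) / max(second_approach)

print(total_second, round(reduction_vs_full), round(reduction_vs_first))
```

The ratio against the full model is itself a lower bound, since the full model's true state count is unknown.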

Page 24

Details of Our Approach

Abstraction

– States

– Transitions, properties

Constraining

– Assume guarantee reasoning

Page 25

Abstraction on States

Intra-cluster

Inter-cluster

Page 26

State Representation

[Figure: state components of clusters. Original cluster: L1s, Network, L2, Local Dir, RAC. Abstract clusters: one keeps L1s, Network, L2, Local Dir; the other keeps L2, Local Dir’, RAC]

Page 27

Abstracting Transitions and Properties

Rule: guard ==> action

guard

– Become more permissive

action

– Allow more behaviors

Page 28

An Example of Abstraction

[Figure: a detailed cluster (RAC, L2 Cache+Local Dir, two L1 Caches) beside an abstracted cluster (RAC, L2 Cache+Local Dir’), with a WB message; labels: Abstract inter-cluster protocol, Abstract intra-cluster protocol]

Original rule:

Clusters[c].WbMsg.Cmd = WB
==>
Clusters[c].L2.Data := Clusters[c].WbMsg.Data;
Clusters[c].L2.HeadPtr := L2; …

Abstracted rule:

True
==>
Clusters[c].L2.Data := nondet; …
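The effect of this abstraction can be sketched in a few lines. This is an illustrative model, not the Murphi source: the dictionary fields `wb_cmd`, `wb_data`, and `l2_data`, and the two-value data domain, are assumptions. The sketch checks that every step of the original rule, projected onto the L2 data, is also a step of the abstracted rule, which is what makes the abstraction an over-approximation.

```python
DATA = (0, 1)  # assumed tiny data domain


def concrete_wb(state):
    """Original rule: fires only when a WB message is pending."""
    if state["wb_cmd"] == "WB":
        return [{**state, "l2_data": state["wb_data"], "wb_cmd": None}]
    return []


def abstract_wb(l2_data):
    """Abstracted rule: guard True, L2 data chosen nondeterministically."""
    return list(DATA)


def project(state):
    """Keep only what the abstract model still tracks."""
    return state["l2_data"]


def simulated(state):
    """Every concrete successor must be matched by some abstract successor."""
    abstract_next = set(abstract_wb(project(state)))
    return all(project(n) in abstract_next for n in concrete_wb(state))


states = [{"wb_cmd": c, "wb_data": d, "l2_data": l}
          for c in ("WB", None) for d in DATA for l in DATA]
all_ok = all(simulated(s) for s in states)
print(all_ok)
```

The abstract rule also allows steps the original never takes, so a property that fails on the abstract model may be a false alarm.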

Page 29

Abstraction, Now Constraining

Page 30

An Example of Constraining

[Figure: a detailed cluster (RAC, L2 Cache+Local Dir, two L1 Caches) beside an abstracted cluster (RAC, L2 Cache+Local Dir’), with a WB message]

Original guard:

Clusters[c].WbMsg.Cmd = WB

Constraint:

Clusters[c].L2.State = Excl

Constrained abstract rule:

True & Clusters[c].L2.State = Excl
==>
Clusters[c].L2.Data := nondet; …
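A small sketch of the constraining step, under stated assumptions: the state names `Invld` and `Shrd`, the field names, and the two-value data domain are invented for illustration; only `Excl` and the shape of the rule come from the slide. Strengthening the guard removes abstract behaviors, and the added conjunct then becomes an obligation to check on the model that still has the details.

```python
L2_STATES = ("Invld", "Shrd", "Excl")  # only Excl appears on the slide
DATA = (0, 1)                          # assumed tiny data domain


def abstract_wb(state):
    """Abstract rule before constraining: guard True."""
    return [{**state, "l2_data": d} for d in DATA]


def constrained_wb(state):
    """Abstract rule after constraining: guard True & L2.State = Excl."""
    if state["l2_state"] == "Excl":
        return [{**state, "l2_data": d} for d in DATA]
    return []


abstract_states = [{"l2_state": s, "l2_data": d} for s in L2_STATES for d in DATA]
enabled_before = sum(1 for s in abstract_states if abstract_wb(s))
enabled_after = sum(1 for s in abstract_states if constrained_wb(s))


def guarantee_holds(detailed_states):
    """Obligation on the detailed model: a pending WB implies L2 is Excl."""
    return all(s["l2_state"] == "Excl"
               for s in detailed_states if s["wb_cmd"] == "WB")


print(enabled_before, enabled_after)
```

In assume guarantee style, the constraint is assumed in one abstract model and has to be established in another; `guarantee_holds` stands in for that second check.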

Page 31

Non-inclusive Protocols: History Variables

[Figure: the decomposition into abstract protocols. One shows Home Cluster, Remote Cluster 1, and Remote Cluster 2 each reduced to a RAC and an L2 Cache+Local Dir’, together with Main Mem and Global Dir. The other two each show a single cluster (Home Cluster, Remote Cluster 1) with its L2 Cache+Local Dir and two L1 Caches]

Page 32

Experimental Results

Approach             Model          # of states      Model check time (sec)   Use mem (GB)   Model check passed
Classical approach   Full model     > 473,260,000    > 161,398                18             Nonconclusive
Second approach      Abs. model 1   4,070,484        770                      1.8            Yes
Second approach      Abs. model 2   2,424,719        250                      1.8            Yes
Second approach      Abs. model 3   2,424,719        248                      1.8            Yes

Page 33

Outline

Background

Proposed solutions

– High level hierarchical coherence protocol verification

– Refinement check: specifications vs. RTL implementations

Conclusion

Page 34

Our Approach

Use a hardware language

– Hardware Murphi

Develop a formal theory of refinement check

Develop a compositional approach

– Abstraction

– Assume guarantee

Develop a practical tool

Page 35

Hardware Murphi

Murphi extension by S. German and G. Janssen

A concurrent shared variable language

– On each cycle

• Multiple transitions execute concurrently

• Exclusive write to a variable

• Shared reads to variables

• Write immediately visible within the same transition

• Write visible to other transitions on the next cycle

Support transactions, signals, etc.
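The per-cycle rules listed above can be mimicked with a short simulation. This is a sketch of the semantics as described on the slide, not of the Hardware Murphi tool: `run_cycle`, the transition table, and the variables `x`, `y`, `z` are hypothetical. Each enabled transition works on its own copy of the start-of-cycle state, so it sees its own writes at once while other transitions see them only in the next cycle, and two writers to one variable in the same cycle are rejected.

```python
def run_cycle(state, transitions):
    """Fire every enabled transition once against the same start-of-cycle state."""
    next_state = dict(state)
    writers = {}                      # variable name -> transition that wrote it
    for name, (guard, action) in transitions.items():
        if not guard(state):
            continue
        local = dict(state)           # private copy: own writes are visible at once
        for var in action(local):     # action mutates `local`, returns names written
            if var in writers:
                raise RuntimeError(f"{name} and {writers[var]} both write {var!r}")
            writers[var] = name
            next_state[var] = local[var]
    return next_state


def inc_x(s):
    s["x"] = s["x"] + 1
    s["y"] = s["x"] * 10              # reads the x it just wrote
    return {"x", "y"}


def copy_x(s):
    s["z"] = s["x"]                   # reads x as it was at the start of the cycle
    return {"z"}


transitions = {"inc_x": (lambda s: True, inc_x), "copy_x": (lambda s: True, copy_x)}
after_one = run_cycle({"x": 0, "y": 0, "z": 0}, transitions)
after_two = run_cycle(after_one, transitions)
print(after_one, after_two)
```

After the first cycle `z` is still 0 because `copy_x` read the old `x`; it picks up the incremented value one cycle later.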

Page 36

Transaction

Group multiple steps in impl

Transaction
  Rule-1 …
  …
  Rule-6 …
End;

[Figure: steps numbered 1 to 6]

Page 37

Workflow of Our Refinement Check

[Figure: workflow with boxes Murphi Spec model, Hardware Murphi Impl model, Product model in Hardware Murphi, Muv, Product model in VHDL, and Property check]

Check that the low-level model correctly implements the high-level model

Page 38

Full List of Assertions for Refinement Check

1. Serializability for specifications

2. No write-write conflicts

3. Initial states containment

4. Write set variables containment

5. Enabledness for specifications

6. Joint variables match at the end of transactions

Page 39

An Example

Impl transaction

Transaction
  Rule-1 guard1 ==> action1;
  Rule-2 guard2 ==> action2;
  Rule-3 guard3 ==> action3;
End;

Spec rule

Rule spec_guard ==> spec_action;

Page 40

An Example (Cont’d)

Transaction
  Rule-1 guard1 ==> action1; assert spec_guard; spec_action;
  Rule-2 guard2 ==> action2;
  Rule-3 guard3 ==> action3;
End;

assert impl_var1 = spec_var1;
assert impl_var2 = spec_var2; …
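The instrumented transaction above can be imitated with a small driver. This is a sketch under assumptions, not the Muv tool: `check_transaction`, the toy request/buffer/cache rules, and the choice of joint variables are all invented for illustration. The spec rule is fired alongside the first implementation rule, and the joint variables are compared once the transaction ends.

```python
def check_transaction(impl, spec, impl_rules, spec_guard, spec_action, joint_vars):
    """Run one impl transaction next to its spec rule, asserting as on the slide."""
    for step, (guard, action) in enumerate(impl_rules, start=1):
        if not guard(impl):
            raise ValueError(f"scenario does not enable Rule-{step}")
        action(impl)
        if step == 1:
            assert spec_guard(spec), "spec guard must hold when the transaction starts"
            spec_action(spec)
    for var in joint_vars:            # end of transaction: joint variables must match
        assert impl[var] == spec[var], f"joint variable {var!r} differs"


# Toy implementation (assumption): a request is served through a buffer in 3 steps.
def send(s):
    s.update(buf=s["mem"], req=False)


def deliver(s):
    s.update(cache=s["buf"])


def clear(s):
    s.update(buf=None)


# Toy specification (assumption): the same request is served in one step.
def spec_fill(s):
    s.update(cache=s["mem"], req=False)


impl_rules = [
    (lambda s: s["req"], send),
    (lambda s: s["buf"] is not None, deliver),
    (lambda s: s["cache"] is not None, clear),
]
impl = {"req": True, "buf": None, "cache": None, "mem": 7}
spec = {"req": True, "cache": None, "mem": 7}
check_transaction(impl, spec, impl_rules, lambda s: s["req"], spec_fill,
                  joint_vars=("req", "cache", "mem"))
print(impl["cache"], spec["cache"])
```

If the implementation delivered the wrong data, the final comparison on `cache` would fail, which is the kind of mismatch the end-of-transaction assertions are meant to expose.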

Page 41

Driving Benchmark

[Figure: Local, Home, and Remote units, with Dir, Cache, and Mem, connected through buffers (Buf) and a Router]

S. German and G. Janssen, IBM Research Tech Report 2006

Page 42

Bugs Found with Refinement Check

Benchmark satisfies cache coherence already

Bugs still found

– Bug 1: router unit loses messages

– Bug 2: home unit replies twice for one request

– Bug 3: cache unit gets updated twice from one reply

Refinement check is an automatic way of constructing checks

Page 43

Model Checking Approaches

Monolithic

– Straightforward property check

Compositional

– Divide and conquer

[Figure: Product model in VHDL, with the Monolithic and Compositional approaches]

Page 44

Compositional Refinement Check

Reduce the verification complexity

Basic Techniques

– Abstraction

• Removing details to make verification easier

– Assume guarantee

• A simple form of induction which introduces assumptions and justifies them

Page 45

In More Detail

Abstraction

– Change variables to free input variables

– E.g. change a latch to free input signal

Assume guarantee

(spec.Var = impl.Var) holds

Assume for reads of a transaction

Page 46

Experimental Results

Configurations

– 2 nodes, 2 addresses, SixthSense

[Chart: Verification Time (axis marks at 30 min and 1-day) against Datapath (1-bit, 10-bit) for the Monolithic approach and the Compositional approach]

Page 47

Outline

Background

Proposed solutions

– High level hierarchical coherence protocol verification

– Refinement check: specifications vs. RTL implementations

Conclusion

Page 48

Thesis Timeline (repeated from Page 11)

Page 49

Thank you.

Page 50

Related Work

Parameterized verification

– Chou et al.

Bluespec

– Arvind et al.

Aggregation of distributed actions

– Park and Dill

Compositional verification

– Many previous works including McMillan, Jones, etc.