

Rapid Development of a Flexible Validated Processor Model

David A. Penry

Manish Vachharajani

David I. August

The Liberty Architecture Research Group

Princeton University

Dept. of Electrical and Computer Engineering

University of Colorado, Boulder

2

Architectural Exploration

Want baseline simulator to be validated

Want rapid model changes (flexibility)

[Figure labels: Architecture Ideas, Baseline Simulator, Simulator]

3

Why are flexible, validated simulators hard?

Barriers to flexibility
  Concurrent hardware, but sequential simulators
  Lack of reuse

Validation: 3 kinds of errors [Black & Shen, 98]
  Specification: knowing what to model
  Abstraction: deciding how much to model
  Modeling: doing it right

Liberty Simulation Environment addresses flexibility and modeling error through concurrent structural modeling
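To illustrate what concurrent structural modeling means in practice, here is a minimal Python sketch. It is not LSE code and not LSE's scheduler: the `Module` and `Netlist` classes, the toy three-stage pipeline, and the two-phase (read, then commit) update are all assumptions made for illustration. The point is only that when modules are wired structurally and updated as if concurrently, the result does not depend on the order in which the simulator visits them.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Module:
    """A hypothetical hardware block: a name plus a next-state function."""
    name: str
    # Maps (own current state, current inputs by port name) -> next state.
    step: Callable[[int, Dict[str, int]], int]
    state: int = 0


@dataclass
class Netlist:
    """Structural description: module instances plus the wires between them."""
    modules: Dict[str, Module] = field(default_factory=dict)
    # Each wire is (source module, destination module, destination port).
    wires: List[Tuple[str, str, str]] = field(default_factory=list)

    def add(self, module: Module) -> None:
        self.modules[module.name] = module

    def connect(self, src: str, dst: str, port: str) -> None:
        self.wires.append((src, dst, port))

    def cycle(self) -> None:
        # Phase 1: every module reads the *old* outputs of its neighbours.
        inputs: Dict[str, Dict[str, int]] = {name: {} for name in self.modules}
        for src, dst, port in self.wires:
            inputs[dst][port] = self.modules[src].state
        next_state = {
            name: m.step(m.state, inputs[name]) for name, m in self.modules.items()
        }
        # Phase 2: commit all new states at once, as a clock edge would.
        for name, value in next_state.items():
            self.modules[name].state = value


def build(order: List[str]) -> Netlist:
    """Build a toy 3-stage pipeline, adding modules in the given order."""
    templates = {
        "fetch": lambda s, i: s + 1,                 # produces 1, 2, 3, ...
        "decode": lambda s, i: i.get("in", 0),       # latches fetch's old output
        "execute": lambda s, i: i.get("in", 0) * 2,  # doubles decode's old output
    }
    net = Netlist()
    for name in order:
        net.add(Module(name, templates[name]))
    net.connect("fetch", "decode", "in")
    net.connect("decode", "execute", "in")
    return net


def run(order: List[str], cycles: int) -> Dict[str, int]:
    """Simulate for a number of cycles and return the final module states."""
    net = build(order)
    for _ in range(cycles):
        net.cycle()
    return {name: net.modules[name].state for name in sorted(net.modules)}
```

Calling `run(["fetch", "decode", "execute"], 3)` and `run(["execute", "decode", "fetch"], 3)` gives the same states. A hand-written sequential simulator that updated stages in place would have to pick a visiting order, and changing the pipeline would mean revisiting that order.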

4

The Itanium 2

11 weeks, 1 designer

Systematic, incremental approach

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU. Callouts: Structural Hazard Detection; Data Hazard Detection; Register Renaming & uOp insertion; Branch Resolution]

5

Designing the model: Front-end

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: 2 weeks, 1½ weeks (legend: Modeling, Investigation)]

6

Designing the model: EXP

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: ½ day, ½ day (legend: Modeling, Investigation)]

7

Designing the model: REN

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: 1 week, 1 week (legend: Modeling, Investigation)]

8

Designing the model: Rest of backend

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: ½ week, 1½ weeks (legend: Modeling, Investigation)]

9

Designing the model: DCU

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: 1½ weeks, 2 weeks (legend: Modeling, Investigation)]

10

Initial model results

11

Is this good enough?

Evaluate effectiveness of instruction prefetching (186.crafty)

Model                 CPI: No Prefetching   CPI: Prefetching   Speedup
Itanium 2 hardware    0.636                 0.623              2.1%
Simulation model      0.649                 0.597              8.7%
Modified model        0.647                 0.571              13.3%
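The speedup figures on this slide can be reproduced from the CPI values. The sketch below assumes speedup is defined as CPI without prefetching divided by CPI with prefetching, minus one; the slide does not state the formula, but this definition matches the reported percentages. The function names are illustrative. It also computes the relative CPI error of each model against hardware, which shows why the slide asks whether the model is good enough: each CPI is within a few percent of hardware, yet the predicted benefit of prefetching is several times larger than the measured one.

```python
# CPI values copied from the slide (186.crafty): (no prefetching, prefetching).
CPI = {
    "Itanium 2 hardware": (0.636, 0.623),
    "Simulation model": (0.649, 0.597),
    "Modified model": (0.647, 0.571),
}


def speedup_percent(cpi_no_prefetch: float, cpi_prefetch: float) -> float:
    """Assumed definition: relative reduction in cycles, as a percentage."""
    return (cpi_no_prefetch / cpi_prefetch - 1.0) * 100.0


def relative_error_percent(model: float, reference: float) -> float:
    """Signed error of a model value relative to the hardware value."""
    return (model - reference) / reference * 100.0


hw_no_pf, hw_pf = CPI["Itanium 2 hardware"]
for name, (no_pf, pf) in CPI.items():
    print(
        f"{name:20s} "
        f"speedup={speedup_percent(no_pf, pf):5.1f}%  "
        f"CPI error (no prefetch)={relative_error_percent(no_pf, hw_no_pf):+5.1f}%  "
        f"CPI error (prefetch)={relative_error_percent(pf, hw_pf):+5.1f}%"
    )
```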

12

Initial model component analysis

13

Refinements: Front end

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: ½ day, ½ day (legend: Modeling, Investigation)]

Added prefetch unit

14

Refinements: L1D

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: 1 day (legend: Modeling, Investigation)]

Fix page size

Add real hardware page table walk

15

Refinements: Load-use stalls

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Time bars: 8 days, 5 days (legend: Modeling, Investigation)]

Add detailed L2/L3 cache behavior

16

Refined model component analysis

17

Comparison of component errors

18

Refined model results

Was 19.8%

19

Uses of the model within our group

New pipeline organization with re-cycling of instructions (4 weeks)

Rangan, et al. Decoupled Software Pipelining with the Synchronization Array. PACT ’04. (2 weeks)

Reis, et al. Design and Evaluation of Hybrid Fault-Detection Systems. ISCA-32. (a few hours to modify, a couple days to learn LSE)

20

Conclusions

Liberty Simulation Environment is effective at reducing modeling errors and increasing flexibility

To control abstraction and specification errors:
  Use disciplined refinement: investigate, decide, model
  Quantitatively verify documents
  Quantitatively verify abstractions

Single metrics are not enough to validate models

21

Backup slides

22

Simulator Construction Systems

Reuse simulator infrastructure

[Figure labels: Architecture Description, Simulator Builder, Architectural Simulator Instance]

But still must be able to reuse descriptions
  Structural composition
  Medium-grained components
  Standard communication contracts
  High parameterizability
  Separation of concerns

23

This study: Itanium 2

HW complexity ≠ model complexity

11 weeks, 1 designer

24

Liberty Simulation Environment

Simulator construction system for high reuse

Two-tiered specifications
  Leaf module templates in C
  Netlisting language for instantiation and customization

Three-signal standard communications contract with overrides (control functions)

Code is generated

Enable

Data

Ack
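As a rough illustration of a three-signal contract with an overridable control function, the Python sketch below models one connection between a sender and a receiver. It is an assumption-laden toy, not LSE's actual semantics or API: the rule that a transfer is enabled only when data is offered and acknowledged, the `stall_when` override, and the `Queue` receiver are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Signals:
    data: Optional[int]  # value offered by the sender, or None for "nothing"
    ack: bool            # receiver is willing to accept this cycle
    enable: bool         # the transfer actually happens this cycle


ControlFunction = Callable[[Optional[int], bool], Signals]


def default_control(data: Optional[int], ack: bool) -> Signals:
    """Assumed default: enable only when data is offered and acknowledged."""
    return Signals(data=data, ack=ack, enable=(data is not None and ack))


def stall_when(predicate: Callable[[int], bool]) -> ControlFunction:
    """Hypothetical override: veto the transfer for values matching predicate."""
    def control(data: Optional[int], ack: bool) -> Signals:
        sig = default_control(data, ack)
        if sig.data is not None and predicate(sig.data):
            sig.ack = False
            sig.enable = False
        return sig
    return control


class Queue:
    """Toy receiver with bounded capacity."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.items: List[int] = []

    def can_accept(self) -> bool:
        return len(self.items) < self.capacity


def transfer(
    data: Optional[int],
    receiver: Queue,
    control: ControlFunction = default_control,
) -> bool:
    """Resolve the three signals for one cycle; return True if data moved."""
    sig = control(data, receiver.can_accept())
    if sig.enable and sig.data is not None:
        receiver.items.append(sig.data)
    return sig.enable
```

With the default control function, a `Queue(1)` accepts the first value and refuses the second because it no longer acknowledges. Swapping in `stall_when(...)` changes the stall policy without touching either the sender or the receiver, which is the kind of separation a standard contract with overrides is meant to allow.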

25

Shapes

26

Designing the model

[Pipeline figure: IPG, ROT, I-BUF, EXP, REN, REG, EXE, DET, WRB, plus DCU]

[Legend: Modeling, Investigation]