EE 382N Microarchitecture: Yale Patt, instructor; Eiman Ebrahimi, Khubaib, TAs

Post on 21-Feb-2016

EE 382N Microarchitecture

Yale Patt, instructor

Eiman Ebrahimi, Khubaib, TAs

The University of Texas at Austin

Spring semester, 2010

Compile-time outline of what we will do

• Introduction and Focus on the Fundamentals

• Tradeoffs

• Mechanisms: run-time and compile-time

• Approaches to concurrency

• Impact of Multi-core

Lecture 1: Introduction and Focus

Outline

• A science of tradeoffs

• The transformation hierarchy

• The algorithm, the compiler, the microarchitecture

• The microarchitecture view

• The physical view

• Speculation

• Design points

• Design Principles

• Role of the Architect

• Numbers

• Embedded processors – because…

• Thinking outside the box

Trade-offs, the overriding consideration:

What is the cost?

What is the benefit?

• Global view
– Global vs. Local transformations

• Microarchitecture view
– The three ingredients to performance

• Physical view
– Wire delay (recently relevant)
– Power, energy (recently relevant)
– Soft errors (recently relevant)
– Partitioning (since the beginning of time)

The transformation hierarchy

Problem

Algorithm

Program

ISA (Instruction Set Arch)

Microarchitecture

Circuits

Electrons

The Triangle (originally from George Michael)

• Only the programmer knows the ALGORITHM
– Pragmas
– Pointer chasing
– Partition code, data

• Only the COMPILER knows the future (sort of ??)
– Predication
– Prefetch/Poststore
– Block-structured ISA

• Only the HARDWARE knows the past
– Branch directions
– Cache misses
– Functional unit latency

Speculation

• Why good? – improves performance

• How? – we guess
– Branch prediction
– Way prediction
– Data prefetching
– Value prediction
– Address prediction
– Recent visibility: Memory disambiguation

• Why bad? – consumes energy
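As an illustration of the "we guess" idea, here is a minimal Python sketch of one common branch prediction scheme, a 2-bit saturating counter. The slides do not say which predictor is meant, so the class name, the starting state, and the sample branch pattern are all assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter: states 0-1 predict not-taken, 2-3 predict taken."""

    def __init__(self, state=2):
        # Starting at "weakly taken" is an arbitrary choice for this sketch.
        self.state = state

    def predict(self):
        return self.state >= 2

    def update(self, taken):
        # Move one step toward the observed outcome, saturating at 0 and 3.
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_mispredictions(outcomes):
    """Run one predictor over a sequence of branch outcomes and count wrong guesses."""
    predictor = TwoBitPredictor()
    wrong = 0
    for taken in outcomes:
        if predictor.predict() != taken:
            wrong += 1  # a wrong guess means wrong-path work that gets thrown away
        predictor.update(taken)
    return wrong


# Hypothetical loop branch: taken 9 times, then not taken once, repeated 3 times.
loop_branch = ([True] * 9 + [False]) * 3
mispredicts = count_mispredictions(loop_branch)  # 3 wrong guesses out of 30
```

The correct guesses are where the performance comes from; each wrong guess is work done and discarded, which is the energy cost the last bullet refers to.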

Design Principles

• Critical path design

• Bread and Butter design

• Balanced design

Numbers (because comparch is obsessed with numbers)

• The Baseline – Make sure it is the best
– Superlinear speedup
– Recent example, one core vs. 4 cores with ability to fork

• The Simulator you use – Is it bug-free?

• Understanding vs “See, it works!”
– 16/64

• You get to choose your experiments
– SMT, throughput: run the idle process
– Combining cores: what should each core look like

• You get to choose the data you report
– Wrong path detection: WHEN was the wrong path detected

• Never gloss over anomalous data
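The baseline point can be made concrete with a small Python sketch. The function names and all timings below are invented for illustration and are not taken from the lecture; the sketch only shows how a weak baseline can make a 4-core result look superlinear.

```python
def speedup(baseline_seconds, improved_seconds):
    """Speedup of the improved run relative to the chosen baseline."""
    return baseline_seconds / improved_seconds


def is_superlinear(baseline_seconds, parallel_seconds, cores):
    """True when the reported speedup exceeds the number of cores used."""
    return speedup(baseline_seconds, parallel_seconds) > cores


# Hypothetical timings (seconds) for the same workload.
weak_baseline = 100.0   # single core, untuned
tuned_baseline = 60.0   # single core, best known configuration
four_core_run = 20.0    # 4 cores

against_weak = speedup(weak_baseline, four_core_run)     # 5.0 on 4 cores
against_tuned = speedup(tuned_baseline, four_core_run)   # 3.0 on 4 cores
suspicious = is_superlinear(weak_baseline, four_core_run, cores=4)
```

A speedup larger than the core count is worth treating as anomalous data: in this made-up case it disappears once the single-core baseline is the best available one.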

Embedded processors vs General Purpose (because it represents 99%+ of the market)

• Mostly everything about general purpose applies

• Advantage: easier to design holistically
– advantage of special purpose or limited purpose

• Greater use of ASICs
– Although that could change

• Why VLIW is good (though bad for general purpose)

• Partitioning is critical

• Memory latency

• Reconfigurability

Finally, people are always telling you: Think outside the box

I prefer: Expand the box

Something we are all familiar with: Look-ahead Carry Generators

• They speed up ADDITION

• But why do they work?

[Figure: Addition. The small example reads 12 + 9 = 21; the long operands appear run together as 182378645259827637.]
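One way to approach the "why do they work?" question is to write the carries out. The Python sketch below uses the standard generate/propagate formulation (g = a AND b, p = a XOR b), in which every carry is expressed directly in terms of g, p and the carry-in rather than the previous carry. The function name, the default width, and the list-of-bits representation are assumptions for illustration; the slides do not give an implementation.

```python
def carry_lookahead_add(a, b, width=4, carry_in=0):
    """Add two width-bit unsigned integers using generate/propagate signals.

    Returns (sum modulo 2**width, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    generate = [x & y for x, y in zip(a_bits, b_bits)]   # bit i creates a carry
    propagate = [x ^ y for x, y in zip(a_bits, b_bits)]  # bit i passes a carry on

    carries = [carry_in]
    for i in range(width):
        # Carry into bit i+1, written only in terms of g, p and carry_in:
        # c[i+1] = g[i] | p[i]g[i-1] | ... | p[i]...p[1]g[0] | p[i]...p[0]c[0]
        carry = 0
        for j in range(i + 1):
            term = generate[j]
            for k in range(j + 1, i + 1):
                term &= propagate[k]
            carry |= term
        from_carry_in = carry_in
        for k in range(i + 1):
            from_carry_in &= propagate[k]
        carries.append(carry | from_carry_in)

    total = 0
    for i in range(width):
        total |= (propagate[i] ^ carries[i]) << i
    return total, carries[width]


# The small example from the slide, using 5 bits so the result fits.
result, carry_out = carry_lookahead_add(12, 9, width=5)  # (21, 0)
```

In software the nested loops are slower than a plain ripple, but in hardware each carry expression is a fixed two-level AND/OR function of inputs that are all available at once, so no carry has to wait for the one below it.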
