Computer Architecture From Many Perspectives
Peter Hsu, Ph.D.
Presented 13 September 2001 at Departament d’Arquitectura de Computadors, Universitat Politècnica de Catalunya (UPC), Barcelona, Spain
Industry’s View
Computer architecture vs. design?
Tradition: architect creates a plan to transform client’s desire into physical reality
Interpretation:
  Plan = logical design, project schedule, cost projections, …
  Reality = mechanical, thermal, electrical issues; reliability, …
  Desire = profit, return on investment
Agenda
Challenge to think more broadly about computer design
  Physics: materials, signal integrity, …
  Manufacturing: tolerances, infrastructure, …
  Financial: design, fabrication costs, …
Interconnectedness of issues, solutions
  Architect interfaces with many different audiences
  Optimality depends on scope of view
  Novel solutions to current problems
Presentation Methodology
By examples
1. Office environment multiprocessor server
  Mechanical, manufacturing issues
  Electrical signal, supply issues
2. 3-D graphics chip for PC
  Project scheduling, costs, return on investment
Caveats
Author’s bias: performance
  1. latency (memory, inter-processor, etc.),
  2. bandwidth, then
  3. micro-architecture
Decisions for presentation clarity
  Not advocating particular design
  No claims of “right” formula, ways of doing things
  One person’s opinion; “your mileage may vary”
Example #1
Office environment multiprocessor server
Topics:
  Interconnect/packaging scheme
    Manufacturability considerations → performance
    Material characteristics → micro-architecture
  Power supply
    Product usage environment → micro-architecture
Interconnect/Packaging
Assumptions
  Multiple chips (more powerful than desktop)
  Not cheap (e.g. US$50,000)
  Have control of CPU design
  i.e. traditional computer system company
Approach:
  Low latency → physically close together
Multichip Module
[Figure: multichip module cross-section. Labels: 8×8 chip stacks, silicon substrate, printed circuit board, pressure plate, 3000 wire bonds, alignment cage, heat distributor. Dimensions shown: 12mm, 4mm, 14cm, 10mm.]
Chip Stack
[Figure: chip stack, shown upside down. Labels: DRAMs, processors, router. Dimensions shown: 12mm, 10mm, 0.3mm, 12mm; 10µm width, 20µm pitch.]
Manufacturing Issues
Stacking technology
  Limited production today, not discussed here
Silicon substrate
  Process compatibility, availability
Mechanically compliant connection
  Thermal expansion mismatch, reliability
Repair strategy
  Inventory, product mix
Silicon Substrate
[Figure: silicon substrate on a 200mm (8 inch) wafer. Labels: 12mm chip, 4mm spacer, maximum cut-set 2048 p-to-p links, 150µm pitch, 3200 wire bonds substrate to PCB, maximum trace length 24.8cm. Dimensions shown: 12mm, 4mm, 14cm.]
Stack to Substrate Connection
[Figure: stack to substrate connection. Labels: silicon substrate, router chip, DRAMs, wirebond springs, conventional wirebond pads, heat.]
Mechanical Constraints
Machined parts need several mils tolerance
[Figure: chip stack in alignment cage over substrate. Labels: chip stack, alignment cage, substrate, 75µm clearance (0.003 inch or 3 mils), 250µm pitch, 125µm pad, 125µm space, 75µm tolerance.]
Implications
Manufacturing infrastructure
  64 stacks, 200mm wafer
  12×12mm die, 250µm pad pitch → 2304 pads
Thermal density
  Stacked CPUs not feasible
Goal: low latency interconnect scheme in this context
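The pad count on this slide follows from the die size and the pad pitch. A minimal sketch of that arithmetic, assuming a full square array of pads across the die; the function name is hypothetical and not from the talk:

```python
def pads_per_die(die_edge_um: int, pad_pitch_um: int) -> int:
    """Number of pads in a full square array covering a square die."""
    pads_per_edge = die_edge_um // pad_pitch_um
    return pads_per_edge * pads_per_edge


# 12mm x 12mm die at 250um pad pitch, the values quoted on the slide
print(pads_per_die(12_000, 250))
```

With the slide's numbers this gives 48 pads per edge, which is the 2304 total quoted above.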
64×64 Full Crossbar?
Tradeoffs
  Electrical delay: off-chip crossings, data skew
  Logical latency: contention, queuing
  Per-link bandwidth: memory hot-spots
Design
  Source synchronous links
    8 data, 2 (differential) clock wires (20% overhead)
    63 × 2 × 10 = 1260 signals / stack (45% power/ground)
  Cut-set: 20,480 signals (track 7µm)
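The signal counts on this slide can be re-derived from the link width and the stack count. The sketch below is one reading of the numbers, not the author's own derivation: it assumes each stack has one outgoing and one incoming link to each of the other 63 stacks, that the cut-set is the bisection of the 64 stacks into two halves of 32, and that the cut-set wires share one 14cm substrate edge. All variable names are hypothetical.

```python
DATA_WIRES = 8
CLOCK_WIRES = 2            # one differential clock pair
NUM_STACKS = 64
SUBSTRATE_EDGE_UM = 140_000  # 14cm, from the substrate figure

wires_per_link = DATA_WIRES + CLOCK_WIRES
clock_overhead = CLOCK_WIRES / wires_per_link  # clock share of each link

# Assumption: one link in each direction to every other stack.
signals_per_stack = (NUM_STACKS - 1) * 2 * wires_per_link

# Assumption: bisection cut, 32 x 32 stack pairs, one link per direction.
half = NUM_STACKS // 2
cut_set_links = half * half * 2
cut_set_signals = cut_set_links * wires_per_link

# Assumption: all cut-set wires cross a single 14cm line on one layer.
track_um = SUBSTRATE_EDGE_UM / cut_set_signals

print(wires_per_link, clock_overhead, signals_per_stack)
print(cut_set_links, cut_set_signals, round(track_um, 1))
```

Under these assumptions the results match the slide: 1260 signals per stack, 2048 links and 20,480 signals in the cut-set, and a track of just under 7µm.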
Physics
Delay ∝ distance (speed of light)
Distant bandwidth → wire pipelining
  Transmission line → low resistance
    R < Z0: reflections, need terminator
    Z0 ≤ R ≤ 2Z0: self terminating
    2Z0 < R: cannot wave pipeline
Impedance Z0 is function of material, dimensions
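The three resistance regimes on this slide amount to a simple classification of a wire by its total resistance R relative to its characteristic impedance Z0. A minimal sketch, with a hypothetical function name, applied to R and Z0 values quoted elsewhere in the talk:

```python
def classify_line(r_ohms: float, z0_ohms: float) -> str:
    """Classify a wire by total resistance R against impedance Z0."""
    if r_ohms < z0_ohms:
        return "reflections, need terminator"
    if r_ohms <= 2 * z0_ohms:
        return "self terminating"
    return "cannot wave pipeline"


# (R, Z0) pairs in ohms: the three substrate lines from the talk,
# then the narrow 4um wire from the Design Challenge slide.
for r, z0 in [(51, 27), (52, 26), (47, 27), (150, 25)]:
    print(r, z0, classify_line(r, z0))
```

The three substrate lines all fall in the self-terminating band, while the 150Ω narrow wire exceeds 2Z0 and cannot be wave pipelined, which is the problem the substrate design sets out to solve.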
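Basic Formulas
$$\frac{C}{\varepsilon} = 1.15\left(\frac{W}{T}\right) + 2.80\left(\frac{H}{T}\right)^{0.222} + \left[0.06\left(\frac{W}{T}\right) + 1.66\left(\frac{H}{T}\right) - 0.14\left(\frac{H}{T}\right)^{0.222}\right]\left(\frac{T}{S}\right)^{1.34}$$
$$R = \frac{\rho L}{W H}$$
$$Z_0 = \frac{\sqrt{\varepsilon_r}}{c_0 C}$$
(W = wire width, S = wire spacing, H = wire height, T = insulation thickness, L = wire length, ρ = resistivity, ε = insulator permittivity, c0 = speed of light in vacuum, C = capacitance per unit length)
Bakoglu, H.B., Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley, 1990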
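The formulas on this slide can be turned into a small calculator. The sketch below is an illustration under stated assumptions, not the author's tool: it uses the slide's symbol convention (H is wire height, T is insulation thickness), treats C as capacitance per unit length, and takes Z0 = sqrt(εr) / (c0 · C), which holds for a lossless line in a uniform dielectric. The example dimensions (W = 4µm, H = 6µm, S = 3.5µm, T = 3µm) are one reading of the garbled dimension figure and should be treated as hypothetical.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m
C0 = 2.998e8      # speed of light in vacuum, m/s


def capacitance_per_length(w, s, h, t, eps_r):
    """Capacitance per unit length (F/m) from the empirical fit on the slide."""
    ground = 1.15 * (w / t) + 2.80 * (h / t) ** 0.222
    coupling = (0.06 * (w / t) + 1.66 * (h / t)
                - 0.14 * (h / t) ** 0.222) * (t / s) ** 1.34
    return eps_r * EPS0 * (ground + coupling)


def resistance(rho, length, w, h):
    """Total wire resistance (ohms): R = rho * L / (W * H), SI units."""
    return rho * length / (w * h)


def characteristic_impedance(c_per_len, eps_r):
    """Z0 (ohms) of a lossless line in a uniform dielectric (assumption)."""
    return math.sqrt(eps_r) / (C0 * c_per_len)


# Copper (1.7 micro-ohm cm = 1.7e-8 ohm m), 7.2cm line, 4um x 6um cross-section
r = resistance(1.7e-8, 0.072, 4e-6, 6e-6)
print(round(r, 1))  # about 51 ohms, matching the shortest substrate line

c = capacitance_per_length(4e-6, 3.5e-6, 6e-6, 3e-6, 3.0)
print(c, characteristic_impedance(c, 3.0))
```

The resistance result reproduces the 51Ω quoted for the 7.2cm line, which supports the W and H reading; the capacitance and impedance outputs depend on the assumed S and T and are shown only to illustrate how the formulas combine.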
Design Challenge
Materials
  Copper, resistivity 1.7 µΩ·cm
  “Low-K” insulator, relative permittivity 3.0
Problem
  Cut-set → narrow (4µm) wire
  Corner-to-corner 25cm
  Resistance ≈ 150Ω
  Impedance Z0 ≈ 25Ω
Interconnect Dimensions
[Figure: wire cross-sections between VDD and VSS planes, annotated with width W, space S, pitch, height H and insulation thickness T. Dimension values shown: 4, 3.5, 7.5, 6, 5, 10, 5.5, 15, 7.5, 8, 2, 3, 4.5.]
Substrate Features
Self-terminating transmission lines
  Line   L (cm)   R (Ω)   Z0 (Ω)
  1      7.2      51      27
  2      18.4     52      26
  3      24.8     47      27
Integral power grid: shielding, image current
“Sweet spot”
  Fast: sub-2ns corner-to-corner
  High bandwidth: 2 Gbits/s/wire
  Cheap: 7 metal (3 X•Y pad) wafer, 2µm lithography
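The sub-2ns corner-to-corner claim can be sanity-checked against the speed-of-light bound. This is a rough sketch that assumes an ideal transmission line in a uniform dielectric with relative permittivity 3.0 (the low-K insulator value from the talk) and ignores resistive losses; the function name is hypothetical.

```python
import math

C0 = 2.998e8  # speed of light in vacuum, m/s


def flight_time_ns(length_m: float, eps_r: float) -> float:
    """Ideal time of flight (ns) along a line in a uniform dielectric."""
    velocity = C0 / math.sqrt(eps_r)
    return length_m / velocity * 1e9


longest_trace_m = 0.248  # maximum trace length, 24.8cm
t_ns = flight_time_ns(longest_trace_m, 3.0)

# At 2 Gbit/s per wire, each nanosecond of flight holds 2 bits.
bits_in_flight = t_ns * 2.0

print(round(t_ns, 2), round(bits_in_flight, 1))
```

Under these assumptions the longest trace takes roughly 1.4ns, inside the 2ns figure, and holds close to three bits in flight at 2 Gbits/s, which is why wire pipelining matters at this distance.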
Power Considerations
Installation environment
  Home, office, server room?
Heat density
  Liquid cooling, heat pipe?
Heat dissipation
  Physical size, fan noise?
Energy
[Figure: two-stage power conversion. 15A 110V AC (10% variation) → first stage → 120A 12V DC → second stage → 1,280A 1V DC. Other labels: −10%, −5%.]
Office (USA)
  Peak ≈ 1300W
  Sustained ≈ 500W
  Human comfort
Implications
Stack: 20W
[Figure: per-stack power budget]
  DRAMs: 4×1W active simultaneously (4W total)
  Processor: 10W CPU + 2W L2 cache (12W total)
  Router: 3W logic + 1W substrate (4W total)
Limit
  MHz
  CPU micro-architecture
  Memory bandwidth
  Router performance
  …
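The 20W per-stack figure can be cross-checked two ways from numbers in the talk: top-down from the 1V supply rail, and bottom-up from the component budgets. The sketch assumes the 1,280A rail is shared equally by 64 stacks; that split and the variable names are assumptions, not stated by the author.

```python
NUM_STACKS = 64

# Top-down: 1,280A at 1V DC shared equally by all stacks (assumption).
rail_watts = 1280 * 1
budget_per_stack = rail_watts / NUM_STACKS

# Bottom-up: component budgets quoted on the slide, in watts.
stack_watts = {
    "processor": 10 + 2,  # CPU + L2 cache
    "router": 3 + 1,      # logic + substrate
    "drams": 4 * 1,       # four DRAMs active simultaneously at 1W each
}
total_per_stack = sum(stack_watts.values())

print(budget_per_stack, total_per_stack)
```

Both routes arrive at 20W, so the component split exactly consumes the supply budget and leaves no headroom for a faster clock or a wider memory interface.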
Example #2
3-D graphics chip for PC
Topics
  Design cost
    Example numbers (huge variances!)
  Return on investment
    Impact on development cost
Example Project Schedule
[Figure: Gantt chart over years 0.25 to 3 in quarter-year steps. Milestones: 1st tapeout, customer samples, 2nd tapeout.]
  Phase                               Duration
  Architecture                        9 months
  Logical Design (RTL)                12 months
  Logical Verification                9 months
  Physical Design (Synthesis, P&R)    6 months
  Physical Verification (Timing)      3 months
  Prototype Fabrication, Package      2 months
  Silicon Debug                       4 months
  In-System Verification              6 months
  2nd Physical Design                 3 months
  2nd Physical Verification           3 months
  Production Fabrication              9 months
Variations
  Startup company: architecture −6 months
  Big company: verification +6 months
Example Resource Needs
[Figure: staffing chart over years 0.25 to 3 in quarter-year steps. Activities: Performance, Logical Design, Logical Verification, Physical Design, Physical Verification, Package Design, Reference Board Design, In-System Verification, 2nd Physical Design, 2nd Physical Verification, Documentation. Headcount ranges shown (people): 5–25, 5–25, 5–50, 2–30, 1–3, 1–5, 1–5, 2–10. Range: startup – mature company.]
Costs
Development
  Approximation: person year = $⅓M
    $150K salary + 50% benefits + 50% equipment + 20% facilities
  Variation ($20M - $200M) → risk exposure
  Fallacy: design cost ∝ design complexity
Manufacturing
  60mm² die ≈ $10 (estimate 70% yield)
  Ball grid array package, assembly ≈ $5
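The person-year approximation on this slide is a loaded-cost calculation: salary plus overheads expressed as percentages of salary. A minimal sketch with a hypothetical function name, using integer arithmetic to avoid rounding noise:

```python
def person_year_cost(salary: int, benefits_pct: int = 50,
                     equipment_pct: int = 50, facilities_pct: int = 20) -> int:
    """Loaded annual cost: salary plus overheads given as % of salary."""
    total_pct = 100 + benefits_pct + equipment_pct + facilities_pct
    return salary * total_pct // 100


cost = person_year_cost(150_000)

# Person-years that a $20M development budget buys at that rate.
person_years_20m = 20_000_000 / cost

print(cost, round(person_years_20m, 1))
```

The loaded cost comes to $330K, the slide's "$⅓M", so the low end of the quoted range ($20M) buys roughly 60 person-years.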
Return On Investment
Factors
  Amount of money expended
  Time value
  Opportunity cost
Reasonable ROI: 5-10× after 4 years
  New development very risky compared to selling existing products
  Many, many non-technical risks
Case Study
PC graphics chip
  $20M development, 3 years
  $15 per-unit manufacturing cost
  Lifetime volume: 2M units?
  Desired price: $15 + 5 × ($20M / 2M units) = $65
Architecture impacts development cost: e.g. super-pipeline → circuit style → CAD tools → people resources
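The desired-price line in the case study is a per-unit calculation: manufacturing cost plus the development cost spread over lifetime volume and scaled by the target return multiple. A minimal sketch; the function and parameter names are hypothetical:

```python
def desired_price(dev_cost: float, unit_cost: float,
                  volume: float, roi_multiple: float) -> float:
    """Unit price that recovers roi_multiple times the development cost."""
    return unit_cost + roi_multiple * dev_cost / volume


# Case-study values: $20M development, $15 unit cost, 2M units, 5x return
print(desired_price(20e6, 15, 2e6, 5))

# Same chip if lifetime volume turns out to be half the estimate
print(desired_price(20e6, 15, 1e6, 5))
```

The first call reproduces the $65 on the slide. The second shows how sensitive the price is to the volume estimate, which the slide itself marks with a question mark: halving the volume raises the required price to $115.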
Conclusion
Industrial computer architecture: plan mapping vision to reality
Vision
  Performance goals; micro-architecture, ROI, …
Reality
  Electrical, mechanical, thermal physics; financial constraints; people’s feelings; …
Plan
  “Convince me to bet on you…” [author’s opinion]
Comment
As the computer industry moves to “System-On-a-Chip” (SOC) products, there is a huge demand for computer architects who understand and are able to optimize in broad contexts.