SS-SQ03-W: 1 March 11, 2003

Stanford Streaming Supercomputer (SSS) Winter Quarter 2002-2003 Wrapup Meeting

Bill Dally, Computer Systems Laboratory

Stanford University

March 11, 2003

SS-SQ03-W: 2 March 11, 2003

Year 2 Overview

• Where we are today
  – First year goal was met: demonstrated feasibility on single node
  – Feedback from site visit team was very positive
  – Potential for a big impact on scientific computing
  – But still much to do!
• Key FY03 goals
  – Get long-term software infrastructure in place
    • Select approach, implement baseline Brook to SSS compiler
  – Multi-node versions that scale
    • Language, compiler, simulator
  – Tackle hard problems: 3-D, irregular neighborhoods/sparse matrix solve
    • Language support, numerics support, evaluate on simulator
  – Refine architecture
    • Cluster organization, aspect ratio, register organization, memory organization
  – Industrial partner
    • Start serious discussions, outreach to build support, close partner in 04

SS-SQ03-W: 3 March 11, 2003

Some concerns

• We’re doing a great job – but…
• Losing a bit of focus and momentum
  – Tooling on the detail
  – Need to take a step back and reexamine the big picture
• Need to raise our outside profile
  – Publish
    • Overview paper
    • Brook paper
  – Generate some more convincing evidence of advantages
    • Need a control for bandwidth measures
  – Update the web page
  – Visit the labs

SS-SQ03-W: 4 March 11, 2003

Let’s review our overall goal

Exploit capabilities of VLSI to realize cost-effective scientific computing.

SS-SQ03-W: 5 March 11, 2003

Review – What is the SSS Project About?

• Exploit streams to give 100x improvement in performance/cost for scientific applications vs. ‘cluster’ supercomputers
  – From 100 GFLOPS PCs to TFLOPS single-board computers to PFLOPS supercomputers
• Use layered programming system to simplify development and tuning of applications
  – Stream languages
  – Streaming virtual machine
• Demonstrated feasibility of streaming scientific computing in year 1
• Refine architecture and programming system in year 2
  – Demonstrate realistic applications (3D, irregular)
  – Build usable compiler
  – Resolve architecture questions – aspect ratio, conditional execution, sparse clusters, reg organization, memory system, etc…
• Build a prototype and demonstrate CITS applications in years 3-6
  – With industrial and government partners
  – Broaden our base of support

SS-SQ03-W: 6 March 11, 2003

Industrial Partner Update

• Candidates
  – Cray, IBM, Sun, HP, SGI, Intel
• Initial discussion
  – Present SSS project and results to date
  – Discuss collaboration models
  – Identify next steps
• Met with Cray, Sun, and SGI
  – They listened politely, but little traction
  – Need more convincing evidence
  – Need to address programming issue
    • Have to provide a path for legacy codes

SS-SQ03-W: 7 March 11, 2003

Outreach

• National Labs
  – Los Alamos
  – Livermore
  – Sandia
• Other Government
  – NASA
  – DARPA
  – DoD (Charlie Holland)
  – AFOSR

• User communities

SS-SQ03-W: 8 March 11, 2003

Software Win 02 Goals

Brook
  – Define carefully the semantics of the operators
    • No progress
  – Work on “views of memory” abstraction
    • Proposed API – will write up for next SW meeting
  – Support for partitioning, shared memory, naming, fitting into stream abstraction
    • Adopting UPC – will write up for next SW meeting
  – Support for irregular neighborhoods
    • Failed to find an application
  – Multithreaded version (Christos)
    • Have simple model for multi-node – written up
  – (NEW) Preliminary Brooktran spec
  – Concrete Winter goals [Ian/Frank]
    • Review of the language [Pat]
    • Partitioning (UPC)
    • Multi-node/Multi-threaded version
    • Irregular support – w/ application
    • PPoPP paper
    • MD on BRT

SS-SQ03-W: 9 March 11, 2003

Brook Spring 03 Goals

• Refine semantics of operators
  – New version of spec
• Implement views of memory API (UPC)
• Find application for irregular structures
  – Dijkstra, incomplete LU
• Dynamic structure
• Start switching to new compiler
• Brooktran spec/implementation
  – Implemented in Open64
• Concern – have lost metacompiler support

SS-SQ03-W: 10 March 11, 2003

Software Win 02 Goals

SVM
  – Spec has evolved
    • Consensus between MIT, Texas, Stanford, USC
  – Implement multinode version
    • No progress
  – SVM to simulator path
    • No progress
  – Multi-thread

SS-SQ03-W: 11 March 11, 2003

SVM – Spring 03 Goals

• Spec is complete – and supports SSS
• Revise single-node simulator
• Multi-node simulator (prelim)

SS-SQ03-W: 12 March 11, 2003

Software Win 02 Goals (3 of 3)

• Start regular meetings [Done]
• Compiler
  – Decide on flow from Brook->SVM->SSS [Mattan]
    • Done
  – Select base compiler [Jayanth]
    • ORC, Gnu, SUIF, Tendra, others…
    • Done
  – “Spike” a simple program from Brook->SSS [Mattan/Jayanth ++]
    • Started – modified front end – operating on WHIRL
  – Brook to Nvidia
  – Optimizations [Spring]
• Run time
  – Write a white paper

SS-SQ03-W: 13 March 11, 2003

Compiler Spring 03 Goals

• Complete feasibility study
• Brook to C path
  – Parse Brook
  – Generate C
• Optimizations
  – See Mattan’s document
• Need to generate SVM code by mid summer
• Parse Brooktran [Alan, Fatica, Jayanth]
• Kernel scheduler MULADD [Das]
• SVM to SSS [Francois – long term – need plan]

SS-SQ03-W: 14 March 11, 2003

Application Win 02 Goals

• StreamFLO [Fatica]
  – Base version is complete
  – Not running on simulator
  – Early start on 3D version – partitioning waiting on API def
• StreamFEM [Barth]
  – Waiting on spec for partitioning
  – 3D arithmetic kernels done
  – Tridiagonal in Brook
• StreamMD [Eric/student]
  – Ported GROMACS to the NV30 – benchmarks
    • Performance dependent on number of registers
    • Doesn’t work with CG compiler
• Model applications [Ron/Frank]
  – Started
• Look at Sierra, purple benchmarks: ppm, sweep3D [delay]

SS-SQ03-W: 15 March 11, 2003

Application Spring 03 Goals

• StreamFLO [Fatica]
  – Parse Brooktran – F to WHIRL [Alan, Fatica]
  – Partitioned version – multi-node UPC
  – 3D version
• StreamFEM [Barth]
  – Simulate 3D
  – Sparse LUD
  – Partitioned version
• StreamMD [Eric/student]
  – Hand-tune NV30 assembly code
  – GROMACS in Brook
• Model applications [Ron/Frank]
  – C implementations of adaptive structures
• Look at Sierra, purple benchmarks: ppm, sweep3D [delay]

SS-SQ03-W: 16 March 11, 2003

Architecture Win 02 Goals

• Single-Node Simulator [Jung-Ho, Knight]
  – 64-bit support, MULADD, Scalar Processor
  – Not yet
• Multi-Node Simulator [Jung-Ho, Abhishek]
  – Network model
  – Multi-node mechanisms
  – Not yet
• Point Studies
  – Aspect ratio
    • SSE vs VLIW
    • Planning
  – Conditional execution [Mattan/Ujval]
    • Started
  – Sparse clusters
  – SRF organization [Nuwan]
    • Complete
  – Cache alternatives [Jung Ho]
  – Add and store study [Jung Ho]
    • Started
  – I/O
  – Iterative operations [Francois]
    • Planned

SS-SQ03-W: 17 March 11, 2003

Architecture Spring 03 Goals

• Multi-node simulator
• Point Studies
  – Aspect ratio [TIM]
  – Conditional execution [Mattan/Ujval]
  – Sparse clusters [Delay]
  – SRF organization [Nuwan]
    • Refine
  – Cache alternatives [Jung Ho]
  – Add and store study [Jung Ho]
  – I/O [?]
  – Iterative operations [Francois]
• 64-bit [delay]
• Scalar Processor [delay]

SS-SQ03-W: 18 March 11, 2003

Special Win 02 Goals

• Fix website [Pat]
  – Public and private websites
• Name that computer
  – Mississippi
  – Axios
  – Submit names to Mattan
  – Bill, Pat, Bill to choose
• Project Party [Mattan – Pat’s house]

SS-SQ03-W: 19 March 11, 2003

Name Resolution

• From now on, the SSS is called

Merrimac

SS-SQ03-W: 20 March 11, 2003

Winter Quarter Meeting Schedule

• 4/1 Fedkiw Party
• 4/8 Alan, Fatica Brooktran
• 4/15 Kapasi Conditionals
• 4/22 Fatica StreamFLO update
• 4/29 Review Prep
• 5/6 Review Prep
• 5/13 Tim, Tim StreamFEM 3D
• 5/20 Ian, Pat Brook Specification
• 5/27 Mattan Bandwidth Comparison
• 6/3 Jayanth Compiler
• 6/10 Bill Wrapup

SS-SQ03-W: 21 March 11, 2003

Papers

• Arch
  – Indexable SRFs (Nuwan)
  – Streaming Supercomputer Overview (Tim K.)
  – Streaming on conventional CPUs (Mattan)
  – Conditionals (Ujval)
  – Remote Ops (Jung Ho)
  – Aspect Ratio (?)
  – Data parallel (SSE) vs. ILP (VLIW)
• Software
  – Design of Brook (Ian)
  – Data parallel programming on graphics HW (Pat)
  – Brook to CG
• Compiler
• Apps
  – Gromacs
  – StreamFEM (Tim2)
• Overview (Bill and Pat)