![Page 1: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/1.jpg)
URCS 3/12/02 1
Is S-DSM Dead?
Michael L. Scott
Department of Computer Science
University of Rochester
Keynote Address
2nd Workshop on Software-Distributed Shared Memory
In conjunction with ICS 2000
Santa Fe, NM, May 2000
![Page 2: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/2.jpg)
The Company We Keep?
• David Cheriton, SOSP 1999:
  » Capabilities are dead; no
  » S-DSM is dead; no
  » Active networks are dead; no
  » IPv6 is dead
• MLS:
  » Interconnection network topology
  » Byzantine agreement
  » Load balancing and scheduling
![Page 3: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/3.jpg)
Why Are We Doing This?
• Shared memory an attractive programming model
  » Familiar
  » Arguably simpler, esp. for non-performance-critical code
• Hardware coherence is faster, but software
  » Is cheaper
  » Can be built faster (sooner to market, faster processors)
  » Can use more complex protocols
  » Is easier to tune/enhance/customize
  » Is the only option on distributed machines
![Page 4: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/4.jpg)
The Price-Performance Curve
S-DSM may maximize “bang for the buck” for shared-memory parallel computing
[Figure: price-performance curve with cost ($) on one axis. Labels: S-DSM; CC-NUMA; Memory Channel / VIA, Shrimp, S-COMA, Typhoon, Hamlyn / Scheduled Transfer, PCI-SCI, T3E, Mercury, etc.]
![Page 5: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/5.jpg)
Is S-DSM Dead?
• Yes
  » The key ideas seem to have been discovered; the rate of innovation is way down
  » Application writers are still choosing MPI
  » Program committees aren’t looking for papers
• No
  » Speedups for well-written apps are very good
  » Widespread use awaits production-quality systems
  » The ideas are valuable in a wider domain; hence InterWeave (more later)
![Page 6: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/6.jpg)
Outline
• Historical perspective
• Some thoughts about current status
• What we don’t need
• What we do need
• A project that’s doing some of it (joint work with Sandhya Dwarkadas and students)
Caveat: I’m trying to stir things up a bit; please don’t take offense if I oversimplify, overstate things, or ignore your favorite project!
![Page 7: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/7.jpg)
A Brief History of the Field
» 1986 Ivy
» 1989 Shiva
» 1990 Munin
» 1992 LRC (TreadMarks)
» 1993 Sh. Regions, Midway
» 1994 AURC (Shrimp)
» 1995 CRL
» 1996 Shasta
» 1997 Cashmere
» 1998 HLRC
The original idea (Kai Li)
Relaxed memory model, optimized protocols
Software-only protocols
Leverage special HW (User-level messages, multiprocessor nodes)
![Page 8: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/8.jpg)
Papers at Leading Conferences
Anybody see a trend?
[Figure: chart of paper counts (vertical axis 0 to 12) by year, 1987 to 1999, for SOSP, ASPLOS, ISCA, and OSDI]
![Page 9: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/9.jpg)
Why Haven’t We Conquered the World?
Been at this for 14 years, but:
• No major OS vendor packages S-DSM
• TreadMarks the only commercially available system
  » Rice spin-off
  » Kuck and Associates OpenMP implementation
  » Some noteworthy successes (e.g. NIH/FastLink)
• High-end users (who might tolerate research code) still stick to MPI
Where are the other success stories?
![Page 10: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/10.jpg)
So Where Do We Stand Today?
• Converging on the “right” implementation
  » Relaxed memory model
  » VM-based protocols (false sharing not a major issue)
  » Multiprocessor nodes
  » User-level network interface
• So-so performance
  » OK for well-written apps on modest numbers of nodes, but
  » Nobody has demonstrated real scalability
![Page 11: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/11.jpg)
Cashmere Speedups
[Figure: Cashmere speedups on 32 processors (vertical axis 0 to 35) for Barnes, LU, CLU, WaterSp, WaterNSq, EM3D, Ilink, Gauss, SOR, and TSP]
![Page 12: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/12.jpg)
Cashmere and MPI
[Figure: chart comparing MPI and CSM on EP, IS, and SOR (vertical axis 0 to 30)]
![Page 13: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/13.jpg)
HPC Is Not Our Niche
• We’re not going to run S-DSM on 4000 nodes
• We’re not going to match the performance of hand-tuned MPI code
• We’re not going to convert the national labs
What is our niche? Modest-sized clusters and the applications they run
![Page 14: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/14.jpg)
What We Don’t Need
• More protocol tweaks
• New APIs
• More isomorphic implementations
• More SPLASH benchmarks
• More “scalable” systems tested on 16 nodes
Are there only four big ideas?
![Page 15: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/15.jpg)
What We Need
• Single system image
  » good debuggers
  » process and memory management
• Compiler integration
• Non-scientific apps
  » CSCW, OLTP, e-commerce, games
• Wide-area distribution (for functionality, not performance)
  » Heterogeneity
  » Application-specific memory models
  » Fault tolerance
![Page 16: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/16.jpg)
Cashmere Plans
• Protocol tweaks; VIA, Linux, Myrinet ports
• Compiler integration (ARCH)
• Global memory management
• Applications
  » CFD in Astrophysics
  » Protein folding
  » Laser fusion
• MPI comparison
• InterWeave
  » Object recognition
  » Volumetric reconstruction
  » Neural simulation
![Page 17: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/17.jpg)
InterWeave Motivation
• Convenient support for “satellite” nodes
  » remote visualization and steering (Astrophysics)
  » client-server division of labor (data mining)
• True distributed apps
  » intelligent environments (AI group)
• Speedup probably not feasible; convenience the principal goal
![Page 18: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/18.jpg)
InterWeave Overview
• Sharing of persistent versioned segments, named by URLs
• User-specified relaxed coherence; reader-writer locks
• Hash-based consistency
• Full support for heterogeneity, using XDR and pointer swizzling (Eduardo’s talk this afternoon)
• Cashmere functions as a single InterWeave node
![Page 19: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/19.jpg)
Segment Creation
• Communicates with server to create segment (checking access rights)
• Allocates local cached copy (not necessarily contiguous)
• Can follow pointers to other segments
    IW_handle_t h = IW_create_segment (URL);
    IW_wl_acquire (h);
    my_type* p = (my_type *) IW_malloc (h, my_type_desc);
    *p = ...
    IW_wl_release (h);
![Page 20: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/20.jpg)
Coherence
• Relaxed reader-writer locks
  » Writer grabs current version (does not exclude readers)
  » Reader checks to see if current cached copy (if any) is “recent enough”
• Multiple notions of “recent enough”
  » e.g. immediate, polled, temporal, delta, diff-based
  » from Beehive [Singla97], InterAct [LCR ’98]
• Cache whole segments; server need only keep most recent version, in machine-independent form
• Diff-based updates (both ways) based on block timestamps
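The reader-side “recent enough” test and the timestamp-driven diff can be sketched in C. This is only an illustration under stated assumptions, not InterWeave code: every type and function name here is hypothetical, the readings of "immediate", "temporal", and "delta" (same version, bounded age, bounded number of versions behind) are one plausible interpretation of the slide, and the server's current version is assumed to reach the reader by polling or notification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the InterWeave API. */
typedef enum { COH_IMMEDIATE, COH_TEMPORAL, COH_DELTA } coherence_t;

typedef struct {
    coherence_t model;
    double      max_age;  /* COH_TEMPORAL: oldest acceptable copy, in seconds */
    uint32_t    max_lag;  /* COH_DELTA: how many versions behind is acceptable */
} policy_t;

typedef struct {
    uint32_t version;     /* segment version held in the local cache */
    double   fetched_at;  /* local clock reading when the copy was fetched */
} cached_copy_t;

/* Reader-side check: may the cached copy be used as is?
 * server_version is the newest version the reader knows of (assumed to
 * arrive by polling or notification); now is the local clock reading. */
bool recent_enough(const cached_copy_t *c, const policy_t *p,
                   uint32_t server_version, double now)
{
    assert(c != NULL && p != NULL);
    switch (p->model) {
    case COH_IMMEDIATE: return c->version == server_version;
    case COH_TEMPORAL:  return now - c->fetched_at <= p->max_age;
    case COH_DELTA:     return server_version - c->version <= p->max_lag;
    }
    return false;  /* unknown model: be conservative and refetch */
}

/* Diff selection: block_stamp[i] is the segment version at which block i
 * was last modified. Writes the indices of blocks newer than have_version
 * into out (capacity nblocks) and returns how many there are. */
size_t blocks_to_send(const uint32_t *block_stamp, size_t nblocks,
                      uint32_t have_version, size_t *out)
{
    size_t n = 0;
    for (size_t i = 0; i < nblocks; i++)
        if (block_stamp[i] > have_version)
            out[n++] = i;
    return n;
}
```

A reader would call `recent_enough` when acquiring a read lock and, if it returns false, ask for only the blocks that `blocks_to_send` selects for its cached version. The same selection works in the other direction when a writer ships its changes back.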
![Page 21: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/21.jpg)
Consistency
• Need to respect happens-before
• Invalidate cached version of A that is older than version on which newly obtained version of B depends
• Ideally want to know entire object history
• Can approximate using hashing
  » slot i of vector contains timestamp of most recent antecedent hashing to i
  » invalidate A if B.vec[hash(A)] is newer
![Page 22: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/22.jpg)
InterWeave Related Work
• Distributed objects
  » Language-specific (Emerald/Amber/VDOM, Argus, Ada, ORCA, numerous Java systems)
  » Language-neutral (PerDiS, Legion, Globe, DCOM, CORBA, Fresco)
• Distributed state (Khazana, Active Harmony, Linda)
• Metacomputing (GLOBUS/GRID, Legion, WebOS)
• Heterogeneity, swizzling (RPC, LOOM, Java pickling)
• Multiple consistency models (Friedman et al., Agrawal et al., Alonso et al., Ramachandran et al., web caching work)
• Transactions, persistence (Feeley et al., Thor)
![Page 23: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/23.jpg)
Status
• Prototype running on Alpha cluster
• Barnes-Hut demo running: Cashmere plus remote front end
• Distributed game in the works (MazeWars)
• Coherence models in; consistency in the works
• XDR compiler finished; heterogeneity in the works
• Extended abstracts at WSDSM and LCR
see http://www.cs.rochester.edu/research/interweave
![Page 24: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/24.jpg)
What We Need (Reprise)
• Single system image
  » good debuggers
  » process and memory management
• Compiler integration
• Non-scientific apps
  » CSCW, OLTP, games, e-commerce
• Wide-area distribution
  » Heterogeneity
  » Application-specific memory models
  » Fault tolerance
What we don’t need:
• Protocol tweaks
• New APIs
• Re-implementations
• SPLASH benchmarks
• 16-node “scalable” systems
![Page 25: In conjunction with ICS 2000 Is S-DSM Dead?](https://reader035.vdocuments.mx/reader035/viewer/2022071613/6157e3967ead36228b6cc8a8/html5/thumbnails/25.jpg)
A Plug for LCR
Languages, Compilers, and Run-Time Systems for Scalable Computers
May 25-27, 2000
Rochester, NY
Program and registration information available at
http://www.cs.rochester.edu/~LCR2k