OGO 2.1 SGI Origin 2000 Robert van Liere CWI, Amsterdam TU/e, Eindhoven 11 September 2001

Post on 23-Jan-2016


Page 1:

OGO 2.1

SGI Origin 2000

Robert van Liere

CWI, Amsterdam

TU/e, Eindhoven

11 September 2001

Page 2:

unite.sara.nl

• SGI Origin 2000
• Located at SARA in Amsterdam

• Hardware configuration:
  – 128 MIPS R10000 CPUs @ 250 MHz
  – 64 Gbyte main memory
  – 1 Tbyte disk storage
  – 11 Ethernet interfaces @ 100 Mbit/s
  – 1 Ethernet interface @ 1 Gbit/s

Page 3:

Contents

• Architecture
  – Overview
  – Module interconnect
  – Memory hierarchies

• Programming
  – Parallel models
  – Data placement

• Pros and cons

Page 4:

Overview - Features

• 64 bit RISC microprocessors

• Large main memory

• “Scalable” in CPU, memory and I/O

• Shared memory programming model

Page 5:

Overview - Applications

• Worldwide: approximately 30,000 systems
  – ~50 with >128 CPUs
  – ~100 with 64-128 CPUs
  – ~500 with 32-64 CPUs

• Compute serving: many CPUs and much memory
• Database serving: many disks
• Web serving: much I/O

Page 6:

System architecture – 1 CPU

• CPU + cache
• One system bus
• Memory
• I/O (network + disk)

• Cached data

Page 7:

System architecture – N CPU

• Symmetric multi-processing (SMP)

• Multi-CPU + caches
• One shared bus
• Memory
• I/O

Page 8:

N CPU – cache coherency

• Problem:
  – Inconsistent cached data

• Solution:
  – Snooping
  – Broadcasting

• Not scalable
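
Why snooping does not scale can be illustrated with a rough message count. The sketch below assumes a simplified model that is not taken from the slides: every cache miss is broadcast on the shared bus and must be observed by all other caches. The function name and the per-CPU miss count are hypothetical.

```python
def snoop_messages(n_cpus: int, misses_per_cpu: int) -> int:
    """Messages observed on the bus when each miss is broadcast to every other cache."""
    # Each of the n_cpus issues misses_per_cpu broadcasts,
    # and each broadcast is seen by the other n_cpus - 1 caches.
    return n_cpus * misses_per_cpu * (n_cpus - 1)


# Traffic grows roughly with the square of the CPU count.
for n in (2, 8, 32, 128):
    print(n, snoop_messages(n, misses_per_cpu=1000))
```

Under this model, doubling the number of CPUs roughly quadruples the coherency traffic on the single bus.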

Page 9:

Architecture – Origin 2000

• Node board

• 2 CPUs + cache
• Memory
• Directory
• HUB
• I/O
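
The role of the directory can be sketched as follows. This is a toy model under stated assumptions, not the actual Origin 2000 protocol: the directory records which CPUs hold a copy of each memory block, so a write sends invalidations only to those sharers instead of broadcasting to every cache. The class and method names are hypothetical.

```python
from collections import defaultdict


class Directory:
    """Toy directory: tracks which CPUs cache each memory block."""

    def __init__(self):
        self.sharers = defaultdict(set)  # block address -> set of CPU ids

    def read(self, cpu: int, block: int) -> None:
        # A read adds the CPU to the sharer set of the block.
        self.sharers[block].add(cpu)

    def write(self, cpu: int, block: int) -> set:
        # A write invalidates every other cached copy; only the
        # current sharers receive a message, not all CPUs.
        invalidated = self.sharers[block] - {cpu}
        self.sharers[block] = {cpu}
        return invalidated


d = Directory()
d.read(0, block=42)
d.read(5, block=42)
print(d.write(5, block=42))  # CPUs that must drop their copy
```

The number of messages per write depends on how many caches share the block, not on the total number of CPUs in the machine.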

Page 10:

Origin 2000 Interconnect

• Node boards

• Routers
  – Six ports
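
How far apart two routers are depends on the wiring, which the slides show only in figures. The sketch below assumes an idealised binary hypercube of routers; that is a simplification introduced here for illustration, and real configurations may differ. Under that assumption, the hop count between two routers equals the number of bits in which their ids differ. The function names are hypothetical.

```python
def hops(router_a: int, router_b: int) -> int:
    """Hop count between two routers in an idealised binary hypercube."""
    # Neighbouring routers differ in exactly one bit of their id,
    # so the distance is the Hamming distance between the ids.
    return bin(router_a ^ router_b).count("1")


def max_hops(n_routers: int) -> int:
    """Diameter of a hypercube with n_routers routers (must be a power of two)."""
    return n_routers.bit_length() - 1


print(hops(0b0000, 0b1011))
print(max_hops(16))
```

In this model the worst-case distance grows with the logarithm of the number of routers, which is what makes the interconnect more scalable than a single shared bus.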

Page 11:

Interconnect Topology

Page 12:

Sample Topologies

Page 13:

128 Topology

Page 14:

Virtual Memory

• One CPU, multiple programs

• Page
• Paging disk
• Page replacement
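
Page replacement can be illustrated with a small simulation. The slide does not name a policy, so the sketch below assumes least-recently-used (LRU) replacement purely as an example; the function name and the reference string are hypothetical.

```python
from collections import OrderedDict


def count_page_faults(references, n_frames):
    """Count page faults for a reference string under LRU replacement."""
    frames = OrderedDict()  # resident pages, least recently used first
    faults = 0
    for page in references:
        if page in frames:
            frames.move_to_end(page)        # mark as most recently used
        else:
            faults += 1                     # page must be fetched from the paging disk
            if len(frames) >= n_frames:
                frames.popitem(last=False)  # evict the least recently used page
            frames[page] = True
    return faults


print(count_page_faults([1, 2, 3, 1, 4, 2, 5], n_frames=3))
```

Every fault stands for a transfer from the paging disk, so fewer faults means less time waiting on disk.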

Page 15:

O2000 Virtual Memory

• Multiple CPUs, multiple programs

• Non-Uniform Memory Access (NUMA)

• Efficient programs:
  – Minimize data movement
  – Keep data “close” to the CPU that uses it
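
The benefit of keeping data close to the CPU can be estimated with a weighted average. The sketch below is a minimal model; the latency values are placeholders chosen only for illustration, not measurements of the Origin 2000, and the function name is hypothetical.

```python
def average_latency(local_fraction, local_ns, remote_ns):
    """Mean memory latency when a fraction of accesses hit node-local memory."""
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Placeholder latencies, chosen only for illustration.
LOCAL_NS, REMOTE_NS = 300.0, 900.0
for frac in (1.0, 0.9, 0.5):
    print(frac, average_latency(frac, LOCAL_NS, REMOTE_NS))
```

The lower the fraction of local accesses, the closer the average moves to the remote latency, which is why data placement matters on a NUMA machine.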

Page 16:

Latencies and Bandwidth

Page 17:

Application performance

• Scientific computing
  – LU, Ocean, Barnes, Radiosity

• Linear speedup
  – More CPUs -> proportionally more performance
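
Speedup is usually reported as the ratio of single-CPU run time to parallel run time, and efficiency as speedup divided by the number of CPUs. The sketch below uses these standard definitions; the timings are made up for illustration and the function names are hypothetical.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-CPU run time to parallel run time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_cpus):
    """Speedup per CPU; 1.0 corresponds to linear speedup."""
    return speedup(t_serial, t_parallel) / n_cpus


# Made-up timings in seconds: 100 s on 1 CPU, 12.5 s on 8 CPUs.
print(speedup(100.0, 12.5), efficiency(100.0, 12.5, n_cpus=8))
```

An efficiency below 1.0 indicates that part of the added CPU capacity is lost to overhead such as communication or waiting.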

Page 18:

Programming support

• IRIX operating system

• Parallel programming
  – C source level with compiler pragmas
  – POSIX threads
  – UNIX processes

• Data placement
  – dplace, dlook, dprof

• Profiling
  – timex, ssrun

Page 19:

Parallel Programs

• Functional Decomposition
  – Decompose the problem into different tasks

• Domain Decomposition
  – Partition the problem’s data structure

• Consider
  – Mapping tasks/parts onto CPUs
  – Coordinating work and communication between CPUs
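
Domain decomposition can be sketched as follows: the data is split into contiguous slices, each slice is handed to one worker, and the partial results are combined at the end. This is a minimal illustration, not code from the course; the function names are hypothetical. Note that CPython threads show the mapping of parts onto workers but will not actually speed up this computation.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_items, n_parts):
    """Split range(n_items) into n_parts contiguous, nearly equal slices."""
    base, extra = divmod(n_items, n_parts)
    bounds, start = [], 0
    for i in range(n_parts):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def parallel_sum(data, n_workers):
    """Sum data by giving each worker one contiguous slice."""
    bounds = partition(len(data), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = pool.map(lambda b: sum(data[b[0]:b[1]]), bounds)
        return sum(partial)  # combining step: the only communication


data = list(range(1000))
print(partition(10, 3))
print(parallel_sum(data, n_workers=4))
```

Each worker touches only its own slice, so the single point of coordination is the final combination of the partial sums.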

Page 20:

Task Decomposition

• Decompose problem

• Determine dependencies
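
Once the dependencies are known, the tasks can be grouped into levels such that all tasks in one level are independent of each other. The sketch below is one way to do this under stated assumptions; the task graph and the function name are hypothetical.

```python
def ready_levels(deps):
    """Group tasks into levels; tasks within one level are independent.

    deps maps each task to the set of tasks it depends on.
    """
    remaining = {task: set(d) for task, d in deps.items()}
    levels = []
    while remaining:
        # Tasks with no unfinished dependencies can run now.
        ready = sorted(t for t, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic dependency")
        levels.append(ready)
        for t in ready:
            del remaining[t]
        for d in remaining.values():
            d.difference_update(ready)
    return levels


# Hypothetical task graph: C needs A and B, D needs C.
deps = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
print(ready_levels(deps))
```

Tasks in the same level are candidates for running on different CPUs at the same time.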

Page 21:

Task Decomposition

• Map tasks onto threads

• Compare:
  – Sequential case
  – Parallel case
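
The two cases can be compared with a simple cost model. In the sequential case the run time is the sum of all task costs; in the parallel case, assuming one thread per task and no overhead, it is the length of the longest dependency chain. The task costs, the graph, and the function names below are hypothetical, and the graph is assumed to be acyclic.

```python
def sequential_time(cost):
    """All tasks run one after another on a single CPU."""
    return sum(cost.values())


def parallel_time(cost, deps):
    """Critical path length: finish time with one thread per task."""
    finish = {}

    def finish_time(task):
        if task not in finish:
            # A task starts when its slowest dependency has finished.
            start = max((finish_time(d) for d in deps[task]), default=0)
            finish[task] = start + cost[task]
        return finish[task]

    return max(finish_time(t) for t in cost)


cost = {"A": 3, "B": 2, "C": 4, "D": 1}
deps = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
print(sequential_time(cost), parallel_time(cost, deps))
```

Here only A and B can overlap, so the parallel case saves just the cost of the shorter of the two.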

Page 22:

Efficient programs

• Use many CPUs
  – Measure speedups

• Avoid:
  – Excessive data dependencies
  – Excessive cache misses
  – Excessive inter-node communication
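
The cost of data dependencies can be estimated with Amdahl's law, a standard model that the slides do not mention by name: if a fraction of the work cannot be parallelised, that fraction bounds the achievable speedup regardless of the CPU count. The serial fraction used below is an assumed value for illustration, and the function name is hypothetical.

```python
def amdahl_speedup(serial_fraction, n_cpus):
    """Upper bound on speedup when serial_fraction of the work cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_cpus)


# With an assumed 5% serial work, extra CPUs give rapidly diminishing returns.
for n in (1, 8, 32, 128):
    print(n, round(amdahl_speedup(0.05, n), 2))
```

With 5% serial work the speedup can never exceed 20, however many CPUs are used.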

Page 23:

Pros vs Cons

• Multi-processor (128 CPUs)
• Large memory (64 Gbyte)

• Shared memory programming

• Slow integer CPU

• Performance penalty:
  – Data dependencies
  – Off-board memory