
Sections: Introduction, Analysis of concurrent communications, Modeling, Evaluation, Conclusion

Predictive models for bandwidth sharing in high performance clusters

Jérôme VIENNE, Maxime MARTINASSO, Jean-Marc VINCENT, Jean-François MEHAUT

Laboratoire LIG, Mescal Team, Grenoble, France.

BULL - HPC, Benchmark Team, Échirolles, France.

30 September 2008


Composition of Clusters

[Figure: Node 1, Node 2 and Node 3, each with its memory, connected through a switch.]


Network concurrency


Sharing of the network resource


[Figure: tasks t1 to t6 placed on Node 1, Node 2 and Node 3, which are connected through the switch. Legend: task, communication, contention point.]


Plan

1 Experiments and analysis of concurrent communications
    Experimental protocol
    Experiments

2 Modeling of concurrent communications
    Existing models
    Our model

3 Evaluation of the models
    Synthetic graphs
    Comparison with simple models

4 Conclusion


Objectives

Study of network concurrency

High Performance Networks (Myrinet, Gigabit Ethernet)

Different kinds of conflicts

Analysis of performance

⇒ Observation of communication times


Experimental Method

Software Support

MPI communication interface: MPICH

Synchronous communication: MPI_Send/MPI_Recv

Creation of a benchmark

A set of MPI tasks, each of which only sends or only receives.

Communications: same start time and same data size

Measurement: issue rate


Communication patterns

Synthetic graphs

[Figure: synthetic tasks are bound to the cluster, which yields a synthetic graph.]

Communication patterns: conflicts

⇒ Comparison of times without conflict and with conflicts

⇒ Progressive increase in pattern complexity


Basic experiments

Outgoing/Incoming Conflict

[Figure: Myrinet, log-log plot.]

Myrinet

                        O/I conflict
Size         Ref.       Com 1     Com 2
16B [µs]     0.5        0.5       0.5
16KB [ms]    0.003      0.003     0.003
16MB [s]     0.071      0.071     0.071
Ratio        1          1         1


[Figure: Gigabit Ethernet, log-log plot.]

Gigabit Ethernet

                        O/I conflict
Size         Ref.       Com 1     Com 2
16B [µs]     4.2        4.3       5.0
16KB [ms]    0.033      0.033     0.037
16MB [s]     0.194      0.218     0.207
Ratio        1          1.12      1.06


Basic experiments

Incoming/Incoming or Outgoing/Outgoing Conflicts

[Figure: Myrinet, log-log plot.]

Myrinet

                        I/I conflict
Size         Ref.       Com 1     Com 2
16B [µs]     0.5        0.5       0.6
16KB [ms]    0.003      0.004     0.003
16MB [s]     0.071      0.138     0.138
Ratio        1          1.94      1.94


[Figure: Gigabit Ethernet, log-log plot.]

Gigabit Ethernet

                        I/I conflict
Size         Ref.       Com 1     Com 2
16B [µs]     4.2        4.7       4.2
16KB [ms]    0.033      0.033     0.037
16MB [s]     0.194      0.290     0.291
Ratio        1          1.5       1.5
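The Ratio rows of the four tables above appear to be the 16 MB time under conflict divided by the 16 MB reference time. The sketch below recomputes them under that assumption. The dictionary and function names are illustrative and do not come from the slides.

```python
# 16 MB timings in seconds, copied from the tables above:
# (reference, Com 1 under conflict, Com 2 under conflict).
TIMINGS_16MB = {
    ("Myrinet", "O/I"): (0.071, 0.071, 0.071),
    ("Gigabit Ethernet", "O/I"): (0.194, 0.218, 0.207),
    ("Myrinet", "I/I"): (0.071, 0.138, 0.138),
    ("Gigabit Ethernet", "I/I"): (0.194, 0.290, 0.291),
}


def slowdown_ratios(reference, *conflict_times):
    """Return each conflict time divided by the conflict-free reference time."""
    return tuple(t / reference for t in conflict_times)


RATIOS = {key: slowdown_ratios(*times) for key, times in TIMINGS_16MB.items()}

for (network, conflict), ratios in RATIOS.items():
    print(network, conflict, [round(r, 2) for r in ratios])
```

The recomputed values agree with the Ratio rows to within rounding of the published timings.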


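More elaborate experiments

Myrinet

[Figure: communication graphs of increasing complexity on Myrinet; each communication is labeled with a slowdown factor in parentheses. The graph layout was lost in extraction. Labels in extraction order: 1, 1.9, 1.9, 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 1.45, 4.4, 4.2, 4.2, 2.5, 2.5, 4.5, 4.5, 4.5, 2.5, 1.3, 2.5.]

Gigabit Ethernet

[Figure: the same communication graphs on Gigabit Ethernet. Labels in extraction order: 1, 1.5, 1.5, 2.25, 2.25, 2.25, 2.15, 2.15, 2.15, 1.15, 4.4, 2.6, 2.6, 2.6, 2.6, 4.4, 2.0, 3.3, 2.6, 1.4, 2.6.]

Gigabit Ethernet + TCP: better concurrency management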


Existing models of communication (without concurrency)

Parameters:

L: Latency

β: Bandwidth

m: Message size

LogP, LogGP, pLogP

LogP and LogGP:

2o + L + (m − 1) ∗ G

o: overhead, G = 1/β

Generalization: pLogP

o(m), g(m)

Hockney Model (linear)

t(m) = m/β + L
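A minimal sketch of the two closed-form models listed above, written as plain functions. The function names and the numeric parameter values are illustrative assumptions; none of them come from the slides.

```python
def hockney_time(m, latency, bandwidth):
    """Hockney linear model: t(m) = m / beta + L."""
    return m / bandwidth + latency


def loggp_time(m, latency, overhead, gap_per_byte):
    """LogGP expression from the slide: 2o + L + (m - 1) * G."""
    return 2 * overhead + latency + (m - 1) * gap_per_byte


# Illustrative parameters only (assumed, not measured values).
LATENCY = 5e-6        # L, seconds
BANDWIDTH = 1e8       # beta, bytes per second
OVERHEAD = 1e-6       # o, seconds
GAP = 1 / BANDWIDTH   # G = 1 / beta, seconds per byte

for size in (16, 16 * 1024, 16 * 1024 * 1024):
    print(size, hockney_time(size, LATENCY, BANDWIDTH),
          loggp_time(size, LATENCY, OVERHEAD, GAP))
```

With G = 1/β the two expressions differ only by the constant 2o − G, so for large messages both are dominated by the m/β term.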


Modeling: goals

Complete existing models so that they account for concurrency

⇒ Predict communication times taking conflicts into account

Structure of the model

Enriched linear model with penalty p:

t(m) = p ∗ m/β + L

Evolution of communication patterns
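As an illustration of the penalized linear model t(m) = p ∗ m/β + L, the sketch below plugs in numbers from the basic experiments: the 16 MB Myrinet reference time (0.071 s) and the ratio 1.94 observed for the incoming/incoming conflict. Deriving β from the reference time while neglecting L is an assumption made here for simplicity, and the names are illustrative.

```python
def penalized_time(m, penalty, bandwidth, latency=0.0):
    """Penalized linear model: t(m) = p * m / beta + L."""
    return penalty * m / bandwidth + latency


# Assumption: 16 MB is taken as 16 * 2**20 bytes; the exact byte count
# cancels out because beta is derived from the same message size.
MESSAGE_SIZE = 16 * 2**20
REFERENCE_TIME = 0.071                        # seconds, Myrinet, no conflict
BANDWIDTH = MESSAGE_SIZE / REFERENCE_TIME     # effective beta, latency neglected

PREDICTED = penalized_time(MESSAGE_SIZE, penalty=1.94, bandwidth=BANDWIDTH)
print(round(PREDICTED, 3))
```

The prediction is close to the 0.138 s measured for each of the two conflicting communications, which is expected because the penalty used here is the measured ratio itself.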


Determine penalties for each network

Gigabit Ethernet

Complex flow control: TCP

Many manufacturers: heterogeneous components

⇒ Model: quantitative approach (parameters + measurements)

Myrinet

Simple flow control

A single manufacturer: homogeneous components

⇒ Model: descriptive approach


Evaluation method

Comparisons of predicted and measured values

Synthetic graphs: trees and complete graphs

Application graph: Linpack benchmark (HPL)

Formulas

$E_{rel}(c_k) = \frac{T_p - T_m}{T_m} \times 100$

$E_{abs}(G) = \frac{1}{N} \sum_{k=1}^{N} |E_{rel}(c_k)|$
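The two error measures can be checked numerically. The sketch below assumes that Tp is the predicted time and Tm the measured time of a communication, as the column names of the result tables suggest, and uses the Myrinet values reported for the tree Ar1 in the evaluation results. Function and variable names are illustrative.

```python
def relative_error(t_pred, t_meas):
    """E_rel in percent: (Tp - Tm) / Tm * 100."""
    return (t_pred - t_meas) / t_meas * 100.0


def absolute_error(pairs):
    """E_abs: mean of |E_rel| over all (Tm, Tp) pairs of a graph."""
    errors = [abs(relative_error(t_pred, t_meas)) for t_meas, t_pred in pairs]
    return sum(errors) / len(errors)


# (Tm, Tp) per communication, Myrinet, tree Ar1.
MYRINET_AR1 = {
    "a": (0.097, 0.107),
    "b": (0.103, 0.107),
    "c": (0.092, 0.107),
    "d": (0.067, 0.071),
    "e": (0.070, 0.071),
    "f": (0.065, 0.071),
    "g": (0.070, 0.071),
}

for name, (t_meas, t_pred) in MYRINET_AR1.items():
    print(name, round(relative_error(t_pred, t_meas), 1))
print("Eabs =", round(absolute_error(MYRINET_AR1.values()), 1))
```

For this graph the recomputed values match the reported Erel column and Eabs = 6.9.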


Synthetic graphs: trees

[Figure: two tree-shaped communication graphs, Ar1 and Ar2, each with communications a to g.]

Myrinet

        Ar1                        Ar2
com.    Tm      Tp      Erel       Tm      Tp      Erel
a       0.097   0.107   10.3       0.087   0.089   2.3
b       0.103   0.107   3.9        0.087   0.089   2.3
c       0.092   0.107   16.3       0.070   0.071   1.4
d       0.067   0.071   6.0        0.052   0.053   1.9
e       0.070   0.071   1.4        0.037   0.035   -5.4
f       0.065   0.071   9.2        0.051   0.053   3.9
g       0.070   0.071   1.4        0.070   0.071   1.4
Eabs                    6.9                        2.6

Gigabit Ethernet

        Ar1                        Ar2
com.    Tm      Tp      Erel       Tm      Tp      Erel
a       0.223   0.214   -4.0       0.166   0.147   -11.4
b       0.170   0.214   25.9       0.164   0.147   -10.4
c       0.226   0.214   -4.0       0.147   0.143   -2.7
d       0.147   0.143   -2.7       0.140   0.137   -2.1
e       0.146   0.143   -2.0       0.131   0.100   -23.7
f       0.147   0.143   -2.7       0.131   0.137   4.6
g       0.146   0.143   -2.0       0.145   0.143   -1.4
Eabs                    6.0                        8.0


Synthetic graphs: complete graphs

[Figure: two complete-graph communication patterns, Kn1 and Kn2, each with communications a to j.]

Myrinet

        Kn1                        Kn2
com.    Tm      Tp      Erel       Tm      Tp      Erel
a       0.171   0.191   11.7       0.164   0.177   7.9
b       0.171   0.191   11.7       0.164   0.177   7.9
c       0.171   0.191   11.7       0.164   0.177   7.9
d       0.171   0.191   11.7       0.164   0.177   7.9
e       0.149   0.144   -3.3       0.043   0.053   23.2
f       0.149   0.144   -3.3       0.086   0.085   -1.2
g       0.149   0.144   -3.3       0.087   0.085   -2.3
h       0.123   0.117   -4.9       0.108   0.101   -6.5
i       0.123   0.117   -4.9       0.108   0.101   -6.5
j       0.083   0.099   19.3       0.059   0.073   23.7
Eabs                    8.6                        9.5

Gigabit Ethernet

        Kn1                        Kn2
com.    Tm      Tp      Erel       Tm      Tp      Erel
a       0.257   0.252   -1.9       0.270   0.285   5.5
b       0.256   0.252   -1.6       0.284   0.304   7.0
c       0.270   0.252   -6.7       0.284   0.304   7.0
d       0.311   0.302   -2.9       0.270   0.285   5.5
e       0.181   0.189   4.4        0.184   0.137   -25.5
f       0.207   0.206   -0.5       0.167   0.137   -17.9
g       0.274   0.276   0.7        0.194   0.206   6.2
h       0.182   0.206   13.2       0.198   0.206   4.4
i       0.258   0.276   7.0        0.215   0.206   -4.2
j       0.287   0.276   -3.8       0.188   0.206   9.6
Eabs                    4.3                        9.3


Comparison with simple models

Simple models

p = 1: Hockney model without concurrency

p = ∆s: proportional sharing between communications with the same origin

p = ∆e: proportional sharing between communications with the same destination
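The three simple models can be sketched as follows, assuming that ∆s is the number of communications leaving the same source node and ∆e the number of communications arriving at the same destination node. The slides do not spell these definitions out, so this reading, the function name and the example pattern are all assumptions.

```python
from collections import Counter


def simple_model_penalties(comms):
    """Return the penalty given by each simple model for every (src, dst) pair."""
    same_origin = Counter(src for src, _ in comms)
    same_destination = Counter(dst for _, dst in comms)
    return [
        {
            "p=1": 1,                               # no concurrency
            "p=delta_s": same_origin[src],          # sharing at the source
            "p=delta_e": same_destination[dst],     # sharing at the destination
        }
        for src, dst in comms
    ]


# Hypothetical pattern: node 0 sends to nodes 1 and 2, node 3 also sends to node 2.
PATTERN = [(0, 1), (0, 2), (3, 2)]
PENALTIES = simple_model_penalties(PATTERN)

for comm, penalties in zip(PATTERN, PENALTIES):
    print(comm, penalties)
```

Each penalty would then be used as p in t(m) = p ∗ m/β + L to obtain a predicted time.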

Comparison

               Eabs on Myrinet [%]            Eabs on Gigabit Ethernet [%]
Models         Ar1    Ar2    Kn1    Kn2       Ar1    Ar2    Kn1    Kn2
p = 1          54.2   40.7   74.8   62.8      42.3   33.7   60     55.7
p = ∆s         54.2   15.7   30.8   21.8      42.3   26.2   37.9   28.7
p = ∆e         6.9    30.7   33.4   31.3      37.3   31     36.4   27.4
Our models     6.9    2.6    8.6    9.5       6      8      4.3    9.3


Conclusion

Problem

Bandwidth sharing between concurrent communications

High performance networks: Myrinet and Gigabit Ethernet

Experiments

Study of communication patterns

Concept of conflict and penalty


Modeling

Model to get penalties:

Descriptive approach: Myrinet

Quantitative approach: Gigabit Ethernet

Estimation of communication times

Dynamic evolution of patterns

Discrete event simulation
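One way to picture how a discrete event simulation could follow the dynamic evolution of patterns is sketched below: penalties are recomputed each time a communication finishes, and every active communication progresses at rate β/p in between. This is only an illustration, not the authors' simulator. The penalty rule used here is the simple same-origin sharing model, latency is ignored, and all names and values are hypothetical.

```python
from collections import Counter


def same_origin_penalties(active):
    """Hypothetical penalty rule: number of active communications sharing the source."""
    per_source = Counter(c["src"] for c in active)
    return [per_source[c["src"]] for c in active]


def simulate(comms, bandwidth, penalty_fn=same_origin_penalties):
    """Return {name: completion time}, recomputing penalties at every completion."""
    remaining = {c["name"]: float(c["size"]) for c in comms}
    active = list(comms)
    now = 0.0
    finish = {}
    while active:
        rates = [bandwidth / p for p in penalty_fn(active)]
        # Next event: the earliest completion at the current rates.
        step = min(remaining[c["name"]] / r for c, r in zip(active, rates))
        now += step
        still_active = []
        for c, rate in zip(active, rates):
            remaining[c["name"]] -= rate * step
            if remaining[c["name"]] <= 1e-9 * c["size"]:
                finish[c["name"]] = now
            else:
                still_active.append(c)
        active = still_active
    return finish


# Hypothetical pattern: two messages leave node 0 at the same time.
COMMS = [
    {"name": "a", "src": 0, "dst": 1, "size": 100},
    {"name": "b", "src": 0, "dst": 2, "size": 200},
]
FINISH = simulate(COMMS, bandwidth=100.0)
print(FINISH)
```

In this example both messages first progress at half the bandwidth; once the shorter one completes, the longer one continues alone at full bandwidth.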

Future work

Study of contention on InfiniBand networks

Estimation of computation time
