algorithms and scheduling techniques for heterogeneous...

143
Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on Algorithms and scheduling techniques for heterogeneous platforms Yves Robert ´ Ecole Normale Sup´ erieure de Lyon [email protected] http://graal.ens-lyon.fr/ yrobert joint work with Olivier Beaumont, Anne Benoit, Larry Carter, Henri Casanova, Jeanne Ferrante, Arnaud Legrand, Loris Marchal, Fr´ ed´ eric Vivien Tokyo, July 2007 [email protected] POP’07 Algorithms and scheduling techniques 1/ 79

Upload: others

Post on 21-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Algorithms and scheduling techniques forheterogeneous platforms

Yves Robert

Ecole Normale Superieure de [email protected]

http://graal.ens-lyon.fr/∼yrobert

joint work withOlivier Beaumont, Anne Benoit, Larry Carter, Henri Casanova,

Jeanne Ferrante, Arnaud Legrand, Loris Marchal, Frederic Vivien

Tokyo, July 2007

[email protected] POP’07 Algorithms and scheduling techniques 1/ 79

Page 2: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Evolution of parallel machines

From (good old) parallel architectures . . .

[email protected] POP’07 Algorithms and scheduling techniques 2/ 79

Page 3: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Evolution of parallel machines

. . . to heterogeneous clusters . . .

[email protected] POP’07 Algorithms and scheduling techniques 2/ 79

Page 4: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Evolution of parallel machines

. . . and to large-scale grid platforms?

[email protected] POP’07 Algorithms and scheduling techniques 2/ 79

Page 5: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Evolution of parallel machines

. . . and to large-scale grid platforms?

Parallel algorithm design and scheduling were already difficult taskswith homogeneous machines

[email protected] POP’07 Algorithms and scheduling techniques 2/ 79

Page 6: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Evolution of parallel machines

. . . and to large-scale grid platforms?

Parallel algorithm design and scheduling were already difficult taskswith homogeneous machinesOn heterogeneous platforms, it gets worse

[email protected] POP’07 Algorithms and scheduling techniques 2/ 79

Page 7: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

New platforms, new problems, new solutions

Target platforms: Large-scale heterogenous platforms(networks of workstations, clusters, collections of clusters, grids, ...)

New problems

Heterogeneity of processors (CPU power, memory)

Heterogeneity of communication links

Irregularity of interconnection networks

Non-dedicated platforms

Need to adapt algorithms and scheduling strategies: new objectivefunctions, new models

[email protected] POP’07 Algorithms and scheduling techniques 3/ 79

Page 8: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

New platforms, new problems, new solutions

Target platforms: Large-scale heterogenous platforms(networks of workstations, clusters, collections of clusters, grids, ...)

New problems

Heterogeneity of processors (CPU power, memory)

Heterogeneity of communication links

Irregularity of interconnection networks

Non-dedicated platforms

Need to adapt algorithms and scheduling strategies: new objectivefunctions, new models

[email protected] POP’07 Algorithms and scheduling techniques 3/ 79

Page 9: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 4/ 79

Page 10: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 5/ 79

Page 11: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Independent chunks

B independent equal-size tasks • • • • • • • • ••p processors P1, P2, . . ., Pp

wi = time for Pi to process a task •

Intuition: load of Pi proportional to its speed 1/wi

Assign ni tasks to Pi

Objective: minimize Texe = max∑pi=1 ni=B

(ni × wi )

[email protected] POP’07 Algorithms and scheduling techniques 6/ 79

Page 12: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 13: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 14: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 15: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 16: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 17: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 18: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 19: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 20: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 21: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 22: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Dynamic programming

With 3 processors: w1 = 3, w2 = 5, and w3 = 8

P1 • • • • • • • • •P2 N N N N N M M M M M N N N N NP3

Task n1 n2 n3 Texe Selected proc.0 0 0 0 11 1 0 0 3 22 1 1 0 5 13 2 1 0 6 34 2 1 1 8 15 3 1 1 9 26 3 2 1 10 17 4 2 1 12 18 5 2 1 15 29 5 3 1 15 310 5 3 2 16

[email protected] POP’07 Algorithms and scheduling techniques 7/ 79

Page 23: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Static versus dynamic

Greedy (demand-driven) would have done a perfect job

Would even be better (possible variations in processor speeds)

Static assignment required useless thinking /

[email protected] POP’07 Algorithms and scheduling techniques 8/ 79

Page 24: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Static versus dynamic

Greedy (demand-driven) would have done a perfect job

Would even be better (possible variations in processor speeds)

Static assignment required useless thinking /

[email protected] POP’07 Algorithms and scheduling techniques 8/ 79

Page 25: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Coping with dependences

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

x x xx x x

j

i

x x xx x x

x x xx x x

N

N

1

2

x x xx x x

x x xx x x

T2,3

A simple finite difference problem

Iteration space: 2D rectangle of size N1 × N2

Dependences between tiles

(10

),

(01

)

[email protected] POP’07 Algorithms and scheduling techniques 9/ 79

Page 26: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 27: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 28: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 29: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 30: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 31: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 32: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (1/3)

Use column-wise allocation to enhance locality

...

6

5 → 6

4 → 5 → 6

3 → 4 → 5

2 → 3 → 4

1 → 2 → 3 . . .

Stepwise execution

[email protected] POP’07 Algorithms and scheduling techniques 10/ 79

Page 33: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (2/3)

With column-wise allocation,

Topt ≈N1 × N2∑p

i=11wi

.

Greedy (demand-driven) allocation ⇒ slowdown ?!Execution progresses at the pace of the slowest processor /

[email protected] POP’07 Algorithms and scheduling techniques 11/ 79

Page 34: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (2/3)

With column-wise allocation,

Topt ≈N1 × N2∑p

i=11wi

.

Greedy (demand-driven) allocation ⇒ slowdown ?!Execution progresses at the pace of the slowest processor /

[email protected] POP’07 Algorithms and scheduling techniques 11/ 79

Page 35: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Allocation strategy (3/3)

With 3 processors, w1 = 3, w2 = 5, and w3 = 8:

...

33

w =3

w =5 1

2

w =8P

P

P1

2

i

j

Texe ≈8

3N1N2 ≈ 2.67 N1N2

Topt ≈120

79N1N2 ≈ 1.52 N1N2

[email protected] POP’07 Algorithms and scheduling techniques 12/ 79

Page 36: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Periodic static allocation (1/2)

With 3 processors, w1 = 3, w2 = 5, and w3 = 8:

...

33

w =3

w =5 1

2

w =8P

P

P1

2

i

j

Assigning blocks of B = 10 columns, Texe ≈ 1.6 N1N2

[email protected] POP’07 Algorithms and scheduling techniques 13/ 79

Page 37: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Periodic static allocation (2/2)

L = lcm(w1,w2, . . . ,wp)Example: L = lcm(3, 5, 8) = 120

P1 receives first n1 = L/w1 columns, P2 next n2 = L/w2

columns, and so on

Period: block of B = n1 + n2 + . . . + np contiguous columnsExample: B = n1 + n2 + n3 = 40 + 24 + 15 = 79

Change schedule:- Sort processors so that n1w1 ≤ n2w2 ≤ . . . ≤ npwp

- Process horizontally within blocks

Optimal ,

[email protected] POP’07 Algorithms and scheduling techniques 14/ 79

Page 38: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lesson learnt?

With different-speed processors . . .. . . we need to think (design static schedules)

. . . but implementation may remain dynamic ,

Example: demand-driven allocation of blocks of adequate size

. . . well, in some cases it gets truly complicated /

[email protected] POP’07 Algorithms and scheduling techniques 15/ 79

Page 39: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lesson learnt?

With different-speed processors . . .. . . we need to think (design static schedules)

. . . but implementation may remain dynamic ,

Example: demand-driven allocation of blocks of adequate size

. . . well, in some cases it gets truly complicated /

[email protected] POP’07 Algorithms and scheduling techniques 15/ 79

Page 40: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lesson learnt?

With different-speed processors . . .. . . we need to think (design static schedules)

. . . but implementation may remain dynamic ,

Example: demand-driven allocation of blocks of adequate size

. . . well, in some cases it gets truly complicated /

[email protected] POP’07 Algorithms and scheduling techniques 15/ 79

Page 41: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 16/ 79

Page 42: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Traditional scheduling – Framework

Application = DAG G = (T ,E ,w)

T = set of tasksE = dependence constraintsw(T ) = computational cost of task T (execution time)c(T ,T ′) = communication cost (data sent from T to T ′)

Platform

Set of p identical processors

Schedule

σ(T ) = date to begin execution of task Talloc(T ) = processor assigned to it

[email protected] POP’07 Algorithms and scheduling techniques 17/ 79

Page 43: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Traditional scheduling – Constraints

w(T’)

time

w(T)

comm(T,T’)

σ(T ) + w(T )σ(T )

T T’

σ(T ′)

Data dependences If (T ,T ′) ∈ E then

if alloc(T ) = alloc(T ′) then σ(T ) + w(T ) ≤ σ(T ′)if alloc(T ) 6= alloc(T ′) then σ(T ) + w(T ) + c(T ,T ′) ≤ σ(T ′)

Resource constraints

alloc(T ) = alloc(T ′) ⇒(σ(T ) + w(T ) ≤ σ(T ′)) or (σ(T ′) + w(T ′) ≤ σ(T ))

[email protected] POP’07 Algorithms and scheduling techniques 18/ 79

Page 44: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Traditional scheduling – Objective functions

Makespan or total execution time

MS(σ) = maxT∈T

(σ(T ) + w(T ))

Other classical objectives:

Sum of completion timesWith release dates: max flow (response time), or sum flowFairness oriented: max stretch, or sum stretch

[email protected] POP’07 Algorithms and scheduling techniques 19/ 79

Page 45: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Traditional scheduling – About the model

Simple but OK for computational resources

No CPU sharing, even in models with preemptionAt most one task running per processor at any time-step

Very crude for network resources

Unlimited number of simultaneous sends/receives per processorNo contention → unbounded bandwidth on any linkFully connected interconnection graph (clique)

In fact, model assumes infinite network capacity

[email protected] POP’07 Algorithms and scheduling techniques 20/ 79

Page 46: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Makespan minimization

NP-hardness

Pb(p) NP-complete for independent tasks and nocommunications(E = ∅, p = 2 and c = 0)Pb(p) NP-complete for UET-UCT graphs (w = c = 1)

Approximation algorithms

Without communications, list scheduling is a(2− 1

p )-approximationWith communications, result extends to coarse-grain graphsWith communications, no λ-approximation in general

[email protected] POP’07 Algorithms and scheduling techniques 21/ 79

Page 47: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

List scheduling – Without communications

Initialization:

Compute priority level of all tasks

Priority queue = list of free tasks (tasks without predecessors)sorted by priority

While there remain tasks to execute:

Add new free tasks, if any, to the queue.

If there are q available processors and r tasks in the queue,remove first min(q, r) tasks from the queue and execute them

Priority level

Use critical path: longest path from the task to an exit node

Computed recursively by a bottom-up traversal of the graph

[email protected] POP’07 Algorithms and scheduling techniques 22/ 79

Page 48: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

List scheduling – Without communications

Initialization:

Compute priority level of all tasks

Priority queue = list of free tasks (tasks without predecessors)sorted by priority

While there remain tasks to execute:

Add new free tasks, if any, to the queue.

If there are q available processors and r tasks in the queue,remove first min(q, r) tasks from the queue and execute them

Priority level

Use critical path: longest path from the task to an exit node

Computed recursively by a bottom-up traversal of the graph

[email protected] POP’07 Algorithms and scheduling techniques 22/ 79

Page 49: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

List scheduling – With communications (1/2)

Priority level

Use pessimistic critical path: include all edge costs in theweightComputed recursively by a bottom-up traversal of the graph

MCP Modified Critical Path

Assign free task with highest priority to best processorBest processor = finishes execution first, given already takenscheduling decisionsFree tasks may not be ready for execution (communicationdelays)May explore inserting the task in empty slots of scheduleComplexity O(|V | log |V |+ (|E |+ |V |)p)

[email protected] POP’07 Algorithms and scheduling techniques 23/ 79

Page 50: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

List scheduling – With communications (2/2)

EFT Earliest Finish Time

Dynamically recompute priorities of free tasksSelect free task that finishes execution first (on bestprocessor), given already taken scheduling decisionsHigher complexity O(|V |3p)May miss “urgent” tasks on critical path

Other approaches

Two-step: clustering + load balancing- DSC Dominant Sequence Clustering O((|V |+ |E |) log |V |)- LLB List-based Load Balancing O(C log C + |V |) (C numberof clusters generated by DSC)Low-cost: FCP Fast Critical Path- Maintain constant-size sorted list of free tasks:- Best processor = first idle or the one sending last message- Low complexity O(|V | log p + |E |)

[email protected] POP’07 Algorithms and scheduling techniques 24/ 79

Page 51: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Extending the model to heterogeneous clusters

Task graph with n tasks T1, . . . ,Tn.

Platform with p heterogeneous processors P1, . . . ,Pp.

Computation costs:- wiq = execution time of Ti on Pq

- wi =∑p

q=1 wiq

p average execution time of Ti

- particular case: consistent tasks wiq = wi × γq

Communication costs:- data(i , j): data volume for edge eij : Ti → Tj

- vqr : communication time for unit-size message from Pq toPr (zero if q = r)- com(i , j , q, r) = data(i , j)× vqr communication time from Ti

executed on Pq to Pj executed on Pr

- comij = data(i , j)×∑

1≤q,r≤p,q 6=r vqr

p(p−1) average communicationcost for edge eij : Ti → Tj

[email protected] POP’07 Algorithms and scheduling techniques 25/ 79

Page 52: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Rewriting constraints

Dependences For eij : Ti → Tj , q = alloc(Ti ) and r = alloc(Tj):

σ(Ti ) + wiq + com(i , j , q, r) ≤ σ(Tj)

Resources If q = alloc(Ti ) = alloc(Tj), then

(σ(Ti ) + wiq ≤ σ(Tj)) or (σ(Tj) + wjq ≤ σ(Ti ))

Makespanmax

1≤i≤n

(σ(Ti ) + wi ,alloc(Ti )

)

[email protected] POP’07 Algorithms and scheduling techniques 26/ 79

Page 53: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

HEFT: Heterogeneous Earliest Finish Time

Priority level:

rank(Ti ) = wi + maxTj∈Succ(Ti )

(comij + rank(Tj)),

where Succ(T ) is the set of successors of TRecursive computation by bottom-up traversal of the graph

Allocation

For current task Ti , determine best processor Pq:minimize σ(Ti ) + wiq

Enforce constraints related to communication costsInsertion scheduling: look for t = σ(Ti ) s.t. Pq is availableduring interval [t, t + wiq[

Complexity: same as MCP without/with insertion

[email protected] POP’07 Algorithms and scheduling techniques 27/ 79

Page 54: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What’s wrong?

, Nothing (still may need to map a DAG onto a platform!)

/ Absurd communication model:complicated: many parameters to instantiatewhile not realistic (clique + no contention)

/ Wrong metric: need to relax makespan minimizationobjective

[email protected] POP’07 Algorithms and scheduling techniques 28/ 79

Page 55: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What’s wrong?

, Nothing (still may need to map a DAG onto a platform!)

/ Absurd communication model:complicated: many parameters to instantiatewhile not realistic (clique + no contention)

/ Wrong metric: need to relax makespan minimizationobjective

[email protected] POP’07 Algorithms and scheduling techniques 28/ 79

Page 56: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What’s wrong?

, Nothing (still may need to map a DAG onto a platform!)

/ Absurd communication model:complicated: many parameters to instantiatewhile not realistic (clique + no contention)

/ Wrong metric: need to relax makespan minimizationobjective

[email protected] POP’07 Algorithms and scheduling techniques 28/ 79

Page 57: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 29/ 79

Page 58: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Problem

A

E

G

C

H

DF

B

Routing sets of messages from sources to destinations

Paths not fixed a priori

Packets of same message may follow different paths

[email protected] POP’07 Algorithms and scheduling techniques 30/ 79

Page 59: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Hypotheses

A

E

G

C

H

DF

B

A packet crosses an edge within one time-step

At any time-step, at most one packet crosses an edge

[email protected] POP’07 Algorithms and scheduling techniques 31/ 79

Page 60: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Hypotheses

A

E

G

C

H

DF

B

A packet crosses an edge within one time-step

At any time-step, at most one packet crosses an edge

Scheduling: for each time-step, decide which packet crosses anygiven edge

[email protected] POP’07 Algorithms and scheduling techniques 31/ 79

Page 61: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Notation

k

i

l

j

nk ,l

nk ,li ,j

nk,l : total number of packets to be routed from k to l

nk,li ,j : total number of packets routed from k to l and crossing

edge (i , j)

[email protected] POP’07 Algorithms and scheduling techniques 32/ 79

Page 62: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lower bound

Congestion Ci ,j of edge (i , j)= total number of packets that cross (i , j)

Ci ,j =∑

(k,l)|nk,l>0

nk,li ,j Cmax = maxi ,j Ci ,j

Cmax lower bound on schedule makespanC ∗ ≥ Cmax

⇒ “Fluidified” solution in Cmax?

[email protected] POP’07 Algorithms and scheduling techniques 33/ 79

Page 63: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lower bound

Congestion Ci ,j of edge (i , j)= total number of packets that cross (i , j)

Ci ,j =∑

(k,l)|nk,l>0

nk,li ,j Cmax = maxi ,j Ci ,j

Cmax lower bound on schedule makespanC ∗ ≥ Cmax

⇒ “Fluidified” solution in Cmax?

[email protected] POP’07 Algorithms and scheduling techniques 33/ 79

Page 64: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations (1/2)

A

E

G

C

H

DF

B

A

E

G

B

Initialization (packets leave node k):∑

j |(k,j)∈A

nk,lk,j = nk,l

Reception (packets reach node l):∑

i |(i ,l)∈A

nk,li ,l = nk,l

Conservation law (crossing intermediate node i):∑i |(i ,j)∈A

nk,li ,j =

∑i |(j ,i)∈A

nk,lj ,i ∀(k, l), j 6= k, j 6= l

[email protected] POP’07 Algorithms and scheduling techniques 34/ 79

Page 65: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations (1/2)

A

E

G

C

H

DF

B

G

H

D

Initialization (packets leave node k):∑

j |(k,j)∈A

nk,lk,j = nk,l

Reception (packets reach node l):∑

i |(i ,l)∈A

nk,li ,l = nk,l

Conservation law (crossing intermediate node i):∑i |(i ,j)∈A

nk,li ,j =

∑i |(j ,i)∈A

nk,lj ,i ∀(k, l), j 6= k, j 6= l

[email protected] POP’07 Algorithms and scheduling techniques 34/ 79

Page 66: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations (1/2)

G G

Initialization (packets leave node k):∑

j |(k,j)∈A

nk,lk,j = nk,l

Reception (packets reach node l):∑

i |(i ,l)∈A

nk,li ,l = nk,l

Conservation law (crossing intermediate node i):∑i |(i ,j)∈A

nk,li ,j =

∑i |(j ,i)∈A

nk,lj ,i ∀(k, l), j 6= k, j 6= l

[email protected] POP’07 Algorithms and scheduling techniques 34/ 79

Page 67: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations (2/2)

CongestionCi ,j =

∑(k,l)|nk,l>0 nk,l

i ,j

Objective function

Cmax ≥ Ci ,j , ∀i , j

Minimize Cmax

Linear program in rational numbers: polynomial-time solution. Inpractice use GLPK, Maple, Mupad . . .

[email protected] POP’07 Algorithms and scheduling techniques 35/ 79

Page 68: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations (2/2)

CongestionCi ,j =

∑(k,l)|nk,l>0 nk,l

i ,j

Objective function

Cmax ≥ Ci ,j , ∀i , j

Minimize Cmax

Linear program in rational numbers: polynomial-time solution. Inpractice use GLPK, Maple, Mupad . . .

[email protected] POP’07 Algorithms and scheduling techniques 35/ 79

Page 69: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Routing algorithm

Compute optimal solution Cmax, nk,li ,j of previous linear

program

Periodic schedule:

Define Ω =√

Cmax

Use⌈

Cmax

Ω

⌉periods of length Ω

During each period, edge (i , j) forwards (at most)

mk,li,j =

⌊nk,l

i,j Ω

Cmax

packets that go from k to l

Clean-up: sequentially process residual packets inside network

[email protected] POP’07 Algorithms and scheduling techniques 36/ 79

Page 70: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Performance

Schedule is feasible

Schedule is asymptotically optimal:

Cmax ≤ C ∗ ≤ Cmax + O(√

Cmax)

[email protected] POP’07 Algorithms and scheduling techniques 37/ 79

Page 71: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Why does it work?

Relaxation of objective function

Rational number of packets in LP formulation

Periods long enough so that rounding down to integernumbers has negligible impact

Periods numerous enough so that loss in first and last periodshas negligible impact

Periodic schedule, described in compact form

[email protected] POP’07 Algorithms and scheduling techniques 38/ 79

Page 72: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 39/ 79

Page 73: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Master-worker tasking: framework

Heterogeneous resources

Processors of different speedsCommunication links with various bandwidths

Large number of independent tasks to process

Tasks are atomicTasks have same size

Single data repository

One master initially holds data for all tasksSeveral workers arranged along a star, a tree ora general graph

[email protected] POP’07 Algorithms and scheduling techniques 40/ 79

Page 74: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Application examples

Monte Carlo methods

SETI@home

Factoring large numbers

Searching for Mersenne primes

Particle detection at CERN (LHC@home)

... and many others: see BOINC at http://boinc.berkeley.edu

[email protected] POP’07 Algorithms and scheduling techniques 41/ 79

Page 75: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Makespan vs. steady state

Two-different problems

Makespan Maximize total number of tasks processed within atime-bound

Steady state Determine periodic task allocation which maximizestotal throughput

[email protected] POP’07 Algorithms and scheduling techniques 42/ 79

Page 76: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

!" #" $ #" % # $ &' %'!" #" $ #" % # $ &' %'

( ) *,+ -./0 1 23 45 6

78 9:; < 9=> ?@

AB CDE FE G B D E H I AJ K LM NO PQ RS T U M UV Q N T

W N UX RY X O V M UX NQ O X TZ M N ZQ Y [\ UX UQ Q

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 77: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

Time for computingTime for computing

one task in Cone task in C

Time for sending Time for sending

one task from A to Bone task from A to B

A is the root of the tree;A is the root of the tree;

all tasks start at Aall tasks start at A

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 78: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 79: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 80: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 81: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 82: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 83: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

A computeA computeA sendA send

B receiveB receiveB computeB compute

C receiveC receiveC computeC compute

C sendC send

D receiveD receiveD computeD compute

11 22 33

StartupStartupRepeatedRepeated

patternpatternCleanClean--upup

SteadySteady--state: 7 tasks every 6 time unitsstate: 7 tasks every 6 time units

[email protected] POP’07 Algorithms and scheduling techniques 43/ 79

Page 84: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Solution for star-shaped platforms

Communication links between master and workers havedifferent bandwidths

Workers have different computing power

[email protected] POP’07 Algorithms and scheduling techniques 44/ 79

Page 85: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Rule of the game

M

P1 P2 Pi Pp

w1 w2 wi wp

ci

cpc1

c2

Master sends tasks to workers sequentially, and withoutpreemption

Full computation/communication overlap for each worker

Worker Pi receives a task in ci time-units

Worker Pi processes a task in wi time-units

[email protected] POP’07 Algorithms and scheduling techniques 45/ 79

Page 86: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Equations

M

P1 P2 Pi Pp

w1 w2 wi wp

ci

cpc1

c2

Worker Pi executes αi tasks per time-unit

Computations: αiwi ≤ 1

Communications:∑

i αici ≤ 1

Objective: maximize throughput

ρ =∑

i

αi

[email protected] POP’07 Algorithms and scheduling techniques 45/ 79

Page 87: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Solution

Faster-communicating workers first: c1 ≤ c2 ≤ . . .

Make full use of first q workers, where q largest index s.t.

q∑i=1

ci

wi≤ 1

Make partial use of next worker Pq+1

Discard other workers

Bandwidth-centric strategy- Delegate work to the fastest communicating workers- It doesn’t matter if these workers are computing slowly- Slow workers will not contribute much to overall throughput

[email protected] POP’07 Algorithms and scheduling techniques 46/ 79

Page 88: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

Fully active

210

201

3 6 1 1 1

M

3

Discarded

Tasks Communication Computation6 tasks to P1 6c1 = 6 6w1 = 183 tasks to P2 3c2 = 6 3w2 = 182 tasks to P3 2c3 = 6 2w3 = 2

11 tasks every 18 time-units (ρ = 11/18 ≈ 0.6)

[email protected] POP’07 Algorithms and scheduling techniques 47/ 79

Page 89: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Example

Fully active

210

201

3 6 1 1 1

M

3

Discarded

, Compare to purely greedy (demand-driven) strategy!5 tasks every 36 time-units (ρ = 5/36 ≈ 0.14)

Even if resources are cheap and abundant,resource selection is key to performance

[email protected] POP’07 Algorithms and scheduling techniques 47/ 79

Page 90: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Extension to trees

Fully used node

Partially used node

Idle node

1432

1 1 1

3

0

2

555

26666

5 5

9

10

12

12

Resource selection based on local information (children)

[email protected] POP’07 Algorithms and scheduling techniques 48/ 79

Page 91: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)?

Can we deal with return messages?

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)?

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 92: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)? Yes

Can we deal with return messages?

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)?

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 93: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)? Yes

Can we deal with return messages?

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)?

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 94: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)? Yes

Can we deal with return messages? Yes

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)?

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 95: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)? Yes

Can we deal with return messages? Yes

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)?

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 96: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Does this really work?

Can we deal with arbitrary platforms (including cycles)? Yes

Can we deal with return messages? Yes

In fact, can we deal with more complex applications (arbitrarycollections of DAGs)? Yes, I mean, almost!

[email protected] POP’07 Algorithms and scheduling techniques 49/ 79

Page 97: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

LP formulation still works well . . .

Tm

file emnPj

Pi

Pk

wi

cik

cji

Tn

Conservation law

∀m, n∑

j

sent(Pj → Pi , emn) + executed(Pi ,Tm)

= executed(Pi ,Tn) +∑k

sent(Pi → Pk , emn)

Computations∑m

executed(Pi ,Tm)× flops(Tm)× wi ≤ 1

Outgoing communications∑m,n

∑j

sent(Pj → Pi , emn)× bytes(emn)× cij ≤ 1

[email protected] POP’07 Algorithms and scheduling techniques 50/ 79

Page 98: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

. . . but schedule reconstruction is harder

P1 → P2

P2 → P1

P1 → P3

P3 → P1

P2 → P4

P4 → P2

P3 → P4

P4 → P3

P2 → P3

P3 → P2

P4

P3

P2

P1

χ3 χ4 χ1 χ2 χ3 χ4 χ1 χ2 χ3 χ4 χ1 χ2 χ3 χ4 χ1 χ2

0 40 80 120 160

A5A4A3A2A1

, Actual (cyclic) schedule obtained in polynomial time

, Asymptotic optimality

/ A couple of practical problems (large period, # buffers)

/ No local scheduling policy

[email protected] POP’07 Algorithms and scheduling techniques 51/ 79

Page 99: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

The beauty of steady-state scheduling

Rationale Maximize throughput (total load executed perperiod)

Simplicity Relaxation of makespan minimization problem

Ignore initialization and clean-up phasesPrecise ordering of tasks/messages not neededCharacterize resource activity per time-unit:- which (rational) fraction of time is spentcomputing for which application?- which (rational) fraction of time is spentreceiving from or sending to which neighbor?

Efficiency Optimal throughput ⇒ optimal schedule (up to aconstant number of tasks)

Periodic schedule, described in compact form⇒ compiling a loop instead of a DAG!

[email protected] POP’07 Algorithms and scheduling techniques 52/ 79

Page 100: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lesson learnt?

Resource selection is mandatory

. . . implementation may still be dynamic,provided that static allocation is enforced by scheduler

Example: demand-driven assignment of enrolled workers

[email protected] POP’07 Algorithms and scheduling techniques 53/ 79

Page 101: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Lesson learnt?

Resource selection is mandatory

. . . implementation may still be dynamic,provided that static allocation is enforced by scheduler

Example: demand-driven assignment of enrolled workers

[email protected] POP’07 Algorithms and scheduling techniques 53/ 79

Page 102: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 54/ 79

Page 103: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Good news and bad news

, One-port model: first step towards designing realisticscheduling heuristics

, Steady-state circumvents complexity of schedulingproblems. . . while deriving efficient (often asympotically optimal)scheduling algorithms

/ Need to acquire a good knowledge of the platform graph

/ Need to run extensive experiments or simulations

[email protected] POP’07 Algorithms and scheduling techniques 55/ 79

Page 104: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Good news and bad news

, One-port model: first step towards designing realisticscheduling heuristics

, Steady-state circumvents complexity of schedulingproblems. . . while deriving efficient (often asympotically optimal)scheduling algorithms

/ Need to acquire a good knowledge of the platform graph

/ Need to run extensive experiments or simulations

[email protected] POP’07 Algorithms and scheduling techniques 55/ 79

Page 105: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Knowledge of the platform graph

For regular problems, the structure of the task graph (nodesand edges) only depends upon the application, not upon thetarget platform

Problems arise from weights, i.e. the estimation of executionand communication times

Classical answer: “use the past to predict the future”

Divide scheduling into phases, during which machine andnetwork parameters are collected (with NWS)⇒ This information guides scheduling decisions for next phase

Moving from heterogeneous clusters to computational gridscauses further problems (even discovering the characteristicsof the surrounding computing resources may prove a difficulttask)

[email protected] POP’07 Algorithms and scheduling techniques 56/ 79

Page 106: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Knowledge of the platform graph

For regular problems, the structure of the task graph (nodesand edges) only depends upon the application, not upon thetarget platform

Problems arise from weights, i.e. the estimation of executionand communication times

Classical answer: “use the past to predict the future”

Divide scheduling into phases, during which machine andnetwork parameters are collected (with NWS)⇒ This information guides scheduling decisions for next phase

Moving from heterogeneous clusters to computational gridscauses further problems (even discovering the characteristicsof the surrounding computing resources may prove a difficulttask)

[email protected] POP’07 Algorithms and scheduling techniques 56/ 79

Page 107: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Knowledge of the platform graph

For regular problems, the structure of the task graph (nodesand edges) only depends upon the application, not upon thetarget platform

Problems arise from weights, i.e. the estimation of executionand communication times

Classical answer: “use the past to predict the future”

Divide scheduling into phases, during which machine andnetwork parameters are collected (with NWS)⇒ This information guides scheduling decisions for next phase

Moving from heterogeneous clusters to computational gridscauses further problems (even discovering the characteristicsof the surrounding computing resources may prove a difficulttask)

[email protected] POP’07 Algorithms and scheduling techniques 56/ 79

Page 108: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Experiments versus simulations

Real experiments difficult to drive (genuine instability ofnon-dedicated platforms)

Simulations ensure reproducibility of measured data

Key issue: run simulations against a realistic environment

Trace-based simulation: record platform parameters today,and simulate the algorithms tomorrow, against recorded data

Use SIMGRID, an event-driven simulation toolkit

[email protected] POP’07 Algorithms and scheduling techniques 57/ 79

Page 109: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Experiments versus simulations

Real experiments difficult to drive (genuine instability ofnon-dedicated platforms)

Simulations ensure reproducibility of measured data

Key issue: run simulations against a realistic environment

Trace-based simulation: record platform parameters today,and simulate the algorithms tomorrow, against recorded data

Use SIMGRID, an event-driven simulation toolkit

[email protected] POP’07 Algorithms and scheduling techniques 57/ 79

Page 110: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

SIMGRID traces

server #1

client #1

client #2

client #3

server #2

router

switch

hub

Internet

CPU availability

Network bandwidth

Transient Failure X

See http://simgrid.gforge.inria.fr/

[email protected] POP’07 Algorithms and scheduling techniques 58/ 79

Page 111: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Across physical links

Network = directed graph P = (V ,E )

P0

P1

P3

P2

time

recv 2,3P3

T2,3(L)link e2,3

send2,3P2

General case: affine model (includes latencies)

Common variant: sending and receiving processors busyduring whole transfer

[email protected] POP’07 Algorithms and scheduling techniques 59/ 79

Page 112: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Across physical links

Network = directed graph P = (V ,E )

P0

P1

P3

P2

time

r2,3

r2,3 · LP3

α2,3

β2,3 · Llink e2,3

s2,3 · Ls2,3P2

General case: affine model (includes latencies)

Common variant: sending and receiving processors busyduring whole transfer

[email protected] POP’07 Algorithms and scheduling techniques 59/ 79

Page 113: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Across physical links

Network = directed graph P = (V ,E )

P0

P1

P3

P2

time

recv 2,3P3

T2,3(L)link e2,3

send2,3P2

General case: affine model (includes latencies)

Common variant: sending and receiving processors busyduring whole transfer

[email protected] POP’07 Algorithms and scheduling techniques 59/ 79

Page 114: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Multi-port

Bar-Noy, Guha, Naor, Schieber:occupation time of sender Pu independent of target Pv

time

recv vPv

Tu,v(L)link eu,v

senduPu

not fully multi-port model, but allows for starting a new transfer

from Pu without waiting for previous one to finish

[email protected] POP’07 Algorithms and scheduling techniques 60/ 79

Page 115: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

One-port

Bhat, Raghavendra and Prasanna:same parameters for sender Pu, link eu,v and receiver Pv

time

ru,v · Lru,vPv

βu,v · Lαu,vlink eu,v

su,v · Lsu,vPu

two flavors:- bidirectional: simultaneous send and receive transfers allowed

- unidirectional: only one send or receive transfer at a given

time-step

[email protected] POP’07 Algorithms and scheduling techniques 61/ 79

Page 116: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 117: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

S

l1

l3

l2

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 118: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

S

l1

l3

l2

Store & Forward : bad model for contention

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 119: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

S

l1

l3

l2

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 120: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

pi ,j

s

S

l1

l3

l2

WormHole : computation intensive (packets), not that realistic

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 121: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

∀l ∈ L,∑

r∈R s.t. l∈r

ρr ≤ cl

Analytical model

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 122: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Store & Forward, WormHole, TCP

How to model a file transfer along a path?

∀l ∈ L,∑

r∈R s.t. l∈r

ρr ≤ cl

Max-Min Fairness maximize minr∈R

ρr

Proportional Fairness maximize∑r∈R

ρr log(ρr )

MCT minimization maximize minr∈R

1

ρr

TCP behavior Close to max-min.In SIMGRID: max-min + bound by 1/RTT

[email protected] POP’07 Algorithms and scheduling techniques 62/ 79

Page 123: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Bandwidth sharing

Traditional assumption: Fair Sharing

Open i TCP connections, receive bw(i) bandwidth perconnection

bw(i) = bw(1)/i on a LAN

Experimental evidence → bw(i) = bw(1) on a WAN

Backbone links have so many connections that interferenceamong a few selected connections is negligible

Better model: bw(i) =bw(1)

1 + (i − 1).γ

γ = 1 for a perfect LAN, γ = 0 for a perfect WAN

[email protected] POP’07 Algorithms and scheduling techniques 63/ 79

Page 124: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Bandwidth sharing

Traditional assumption: Fair Sharing

Open i TCP connections, receive bw(i) bandwidth perconnection

bw(i) = bw(1)/i on a LAN

Experimental evidence → bw(i) = bw(1) on a WAN

Backbone links have so many connections that interferenceamong a few selected connections is negligible

Better model: bw(i) =bw(1)

1 + (i − 1).γ

γ = 1 for a perfect LAN, γ = 0 for a perfect WAN

[email protected] POP’07 Algorithms and scheduling techniques 63/ 79

Page 125: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Bandwidth sharing

Traditional assumption: Fair Sharing

Open i TCP connections, receive bw(i) bandwidth perconnection

bw(i) = bw(1)/i on a LAN

Experimental evidence → bw(i) = bw(1) on a WAN

Backbone links have so many connections that interferenceamong a few selected connections is negligible

Better model: bw(i) =bw(1)

1 + (i − 1).γ

γ = 1 for a perfect LAN, γ = 0 for a perfect WAN

[email protected] POP’07 Algorithms and scheduling techniques 63/ 79

Page 126: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Sample large-scale platform

Primergy

Primergy

backbone link

router

front end

cluster

[email protected] POP’07 Algorithms and scheduling techniques 64/ 79

Page 127: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What topology?

Generated (GT-ITM, BRITE, etc.) or obtained frommonitoring?

Very complex (Layer 2 information)Not clear that a scheduling algorithm could exploit/know allthat information

Need a simple model that is

More accurate than traditional models (e.g., LAN links,fully-connected)Still amenable to analysis

[email protected] POP’07 Algorithms and scheduling techniques 65/ 79

Page 128: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What topology?

Generated (GT-ITM, BRITE, etc.) or obtained frommonitoring?

Very complex (Layer 2 information)Not clear that a scheduling algorithm could exploit/know allthat information

Need a simple model that is

More accurate than traditional models (e.g., LAN links,fully-connected)Still amenable to analysis

[email protected] POP’07 Algorithms and scheduling techniques 65/ 79

Page 129: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What topology? (cont’d)

Primergy

Primergy

uuγ = 0.5

Unknown topology

(complete graph)

[email protected] POP’07 Algorithms and scheduling techniques 66/ 79

Page 130: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

What topology? (cont’d)

Primergy

Primergy

backbone link

router

front end

cluster

Hierarchy + BW sharing, but assume knowledge of

Routing

Backbone bandwidths

CPU speeds

[email protected] POP’07 Algorithms and scheduling techniques 67/ 79

Page 131: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

A first trial

sk

sl

gk

gl

b3

b1

Lk,l

b2

C k

C kmaster

C krouter

C lrouter

C lmaster

C l

Clusters and backbone links

[email protected] POP’07 Algorithms and scheduling techniques 68/ 79

Page 132: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

A first trial (cont’d)

sk

sl

gk

gl

b3

b1

Lk,l

b2

C k

C kmaster

C krouter

C lrouter

C lmaster

C l

Clusters

K clusters C k , 1 ≤ k ≤ K

C kmaster front-end processor

C krouter router to external world

sk cumulated speed of C k

gk bandwidth of the LAN link (γ = 1) from C kmaster to C k

router

[email protected] POP’07 Algorithms and scheduling techniques 69/ 79

Page 133: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

A first trial (cont’d)

sk

sl

gk

gl

b3

b1

Lk,l

b2

C k

C kmaster

C krouter

C lrouter

C lmaster

C l

Network

Set R of routers and B of backbone links li

bw(li ) bandwidth available for a new connection

max-connect(li ) max. number of connections that can beopened

Fixed routing: path Lk,l of backbones from C krouter to C l

router

[email protected] POP’07 Algorithms and scheduling techniques 69/ 79

Page 134: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Scheduling multiple applications

Large-scale platforms not likely to be exploited in dedicatedmode/single application

Investigate scenarios in which multiple applications aresimultaneously executed on the platform⇒ competition for CPU and network resources

[email protected] POP’07 Algorithms and scheduling techniques 70/ 79

Page 135: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Target problem

Large complex platform: several clusters and backbone links

One (divisible load) application running on each cluster

Which fraction of the job to delegate to other clusters?

Applications have different communication-to-computationratios

How to ensure fair scheduling and good resource utilization?

[email protected] POP’07 Algorithms and scheduling techniques 71/ 79

Page 136: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Linear program

Maximize mink

αkπk

,

under the constraints

(1a) ∀C k ,∑

l

αk,l = αk

(1b) ∀C k ,∑

l

αl ,k .τl ≤ sk

(1c) ∀C k ,∑l 6=k

αk,l .δk +∑j 6=k

αj ,k .δj ≤ gk

(1d) ∀li ,∑

li∈Lk,l

βk,l ≤ max-connect(li )

(1e) ∀k, l , αk,l .δk ≤ βk,l × gk,l

(1f) ∀k, l , αk,l ≥ 0

(1g) ∀k, l , βk,l ∈ N

(1)

[email protected] POP’07 Algorithms and scheduling techniques 72/ 79

Page 137: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Approach

Solution to rational linear problem as comparator/upperbound

Several heuristics, greedy and LP-based

Use Tiers as topology generator, and then SIMGRID

[email protected] POP’07 Algorithms and scheduling techniques 73/ 79

Page 138: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Methodology

-1000

0

1000

2000

3000

4000

5000

-1000 0 1000 2000 3000 4000 5000

Ver

tical

Dis

tanc

e

Horizontal Distance Number of nodes: 641, Number of links:934

Original Network

WANMANLAN

500

1000

1500

2000

2500

3000

3500

4000

0 500 1000 1500 2000 2500

Ver

tical

Dis

tanc

e

Horizontal Distance Number of nodes: 41, Number of links:45

Pruned Network WANMANLAN

Sample full and pruned Tiers topology

distribution

K 5, 7, . . . , 90log(bw(lk)), log(gk) normal (mean= log(2000), std=log(10))sk uniform, 1000 — 10000max-connect, δk , τk , πk uniform, 1 — 10

Platform parameters used in simulation

[email protected] POP’07 Algorithms and scheduling techniques 74/ 79

Page 139: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Hints for implementation

Participants sharing resources in a Virtual Organization

Centralized broker managing applications and resources

Broker gathers all parameters of LP program

Priority factors

Various policies and refinements possible⇒ e.g. fixed number of connections per application

[email protected] POP’07 Algorithms and scheduling techniques 75/ 79

Page 140: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Outline

1 Parallel algorithms for heterogeneous clusters

2 Background on traditional scheduling

3 Packet routing

4 Master-worker on heterogeneous platforms

5 Limitations & going further

6 Conclusion

[email protected] POP’07 Algorithms and scheduling techniques 76/ 79

Page 141: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Key advantages of steady-state scheduling

Simplicity

From local equations to global behaviorThroughput characterized from activity variables

Efficiency

Periodic schedule, described in compact formAsymptotic optimality

Adaptability

Record observed performance during currentperiod, and inject information to computeschedule for next periodReact on the fly to resource variations

[email protected] POP’07 Algorithms and scheduling techniques 77/ 79

Page 142: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Open problems

Decentralized scheduling

From local strategies to provably good performance?Adapt Awerbuch-Leighton algorithm for multicommodityflows?

Concurrent scheduling

Several workflows competing for resourcesMulti-criteria and fairness?Adapt economic models and buzz-words (e.g., Nashequilibrium)?

[email protected] POP’07 Algorithms and scheduling techniques 78/ 79

Page 143: Algorithms and scheduling techniques for heterogeneous ...graal.ens-lyon.fr/~yrobert/slides/tokyo2007.pdf · Yves.Robert@ens-lyon.fr POP’07 Algorithms and scheduling techniques

Introduction Parallel algorithms for heterogeneous clusters Background on traditional scheduling Packet routing Master-worker on heterogeneous platforms Limitations & going further Conclusion

Scheduling for large-scale platforms

If platform is well identified and relatively stable, try to:(i) accurately model hierarchical structure(ii) design well-suited scheduling algorithms

If platform is not stable enough, or if it evolves too fast,dynamic schedulers are the only option

Otherwise, grab any opportunity to

inject static knowledge into dynamic schedulers

/ Is this opportunity a niche?, Does it encompass a wide range of applications?

[email protected] POP’07 Algorithms and scheduling techniques 79/ 79