optimal dynamic frequency scaling for energy - performance...

18
Optimal Dynamic Frequency Scaling for Energy - Performance of Parallel MPI Programs Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh and Arnaud Giersch FEMTO-ST - DISC Department - AND Team June 3th, 2014

Upload: ngohanh

Post on 19-Jul-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

Optimal Dynamic Frequency Scaling forEnergy - Performance of Parallel MPIPrograms

Jean-Claude Charr, Raphaël Couturier, Ahmed Fanfakh andArnaud Giersch

FEMTO-ST - DISC Department - AND Team

June 3th, 2014

Outline

1. Definitions and objectives

2. Energy and performance models

3. Performance and energy reduction trade-off

4. Experimental results and comparison

5. Conclusions and future works

DISC Department - AND Team 2 / 18

Definitions

• Modern processors provide Dynamic Voltage and FrequencyScaling (DVFS) technique.

• DVFS is used to reduce the frequency and thus to reduce theenergy consumption by a CPU while computing.

• But scaling the frequency to lower level reduces theperformance (execution time) of parallel program.

• Energy consumption for individual processor depends on twopower metrics: the static power Pstatic and the dynamic powerPdyn.

DISC Department - AND Team 3 / 18

Definitions

• Pdyn = α · CL · V 2 · F .

• Pstatic = V · Ntrans · Kdesign · Ileak .

• Energy consumption by individual processor of a synchronousparallel program:Eind = Pdyn · TComp + Pstatic · (TComp + TComm).

• The frequency scaling factor is the ratio between the maximumand the new frequency, S = Fmax

Fnew.

DISC Department - AND Team 4 / 18

Objectives• Study the effect of the scaling factor S on energy consumption

of parallel iterative applications such as NAS Benchmarks.

• Study the effect of the scaling factor S on performance of thesebenchmarks.

• Discovering the energy-performance trade-off relation whenchanging the frequency.

• We propose an algorithm for selecting the scaling factor Sproducing optimal trade-off between the energy andperformance.

• Improving Rauber and Rünger’s1 method that our method beston.

1Thomas Rauber and Gudula Rünger. Analytical modeling and simulation of the energy consumptionof independent tasks. In Proceedings of the Winter Simulation Conference, 2012.

DISC Department - AND Team 5 / 18

Energy model for homogeneous platform

The dynamic power is exponentially related to the scaling factor Sand the static consumed energy is linearly related to this factor.

Rauber and Rünger’s energy model

E = Pdyn · S−21 ·

(T1 +

∑Ni=2

T 3i

T 21

)+ Pstatic · S1 · T1 · N

S1: is the max. scaling factor, TI : is the time of the slower task, Ti : isthe time of the other tasks and N: is the number of nodes.

Rauber and Rünger’s optimal scaling factor

Sopt =3

√2N · Pdyn

Pstatic·(

1 +∑N

i=2T 3

iT 3

1

)They reduce degradation of the performance by setting the highestfrequency to the slowest task.

DISC Department - AND Team 6 / 18

Slack times of the sync. parallel program

Communication Time

Time

Barrier

Idle time

Barrier

Task 3

Task 2

Task 1

Task 4

Task N

(a) Sync. imbalanced communications

Computation Time

Time

Barrier

Idle time

Barrier

Task 3

Task 2

Task 1

Task 4

Task N

(b) Sync. imbalanced computations

ProgramTime = maxi=1,2,...,N Ti

DISC Department - AND Team 7 / 18

Performance evaluation of MPI programs

Execution time prediction model

Tnew = TMaxCompOld · S + TMaxCommOld

1

1.2

1.4

1.6

1.8

1 1.5 2 2.5 3

No

rma

lize

d t

ime

Frequency scaling factors

CG Class B

Normalized predicted timeNormalized real time

1.2

1.5

1.8

2.1

2.4

2.7

1 1.5 2 2.5 3N

orm

aliz

ed

tim

e

Frequency scaling factors

LU Class B

Normalized predicted timeNormalized real time

The maximum normalized error for CG=0.0073 (the smallest) andLU=0.031 (the worst).

DISC Department - AND Team 8 / 18

Performance and energy reduction trade-off

0.5

1

1.5

2

2.5

1 1.5 2 2.5 3Norm

aliz

ed e

nerg

y a

nd e

xecution tim

e

Frequency scaling factors

Normalized execution timeNormalized energy

(c) Real relation.

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1 1.5 2 2.5 3

Norm

aliz

ed e

nerg

y a

nd p

erf

orm

ance

Frequency scaling factors

Optimal scaling factor

Normalized performanceNormalized energy

(d) Converted relation.

Performance = 1execution time

Our objective function

MaxDist = maxj=1,2,...,F (

Maximize︷ ︸︸ ︷PNorm(Sj)−

Minimize︷ ︸︸ ︷ENorm(Sj))

DISC Department - AND Team 9 / 18

Scaling factor selection algorithm

Enumerate the available scaling factors and find Soptimal forwhich PNorm − ENorm is maximal.

Where:

ENorm = EReducedEOriginal

=Pdyn·S−2

1 ·(

T1+∑N

i=2T3

iT2

1

)+Pstatic ·T1·S1·N

Pdyn·(

T1+∑N

i=2T 3

iT 2

1

)+Pstatic ·T1·N

PNorm = ToldTnew

=TMaxCompOld+TMaxCommOld

TMaxCompOld ·S+TMaxCommOld

DISC Department - AND Team 10 / 18

Scaling factor selection algorithm

Algorithm characteristics• It works online.

• It predicts both the energy consumption and performance.

• It is simultaneously reduces the energy consumption andmaintaining performance of iterative algorithm.

• It takes into account the communication time.

• It is well adapted to imbalanced tasks. Fi =Fmax ·Ti

Soptimal ·Tmax

• It has a very small overhead. It takes 6.65 µs for 32 nodes.

DISC Department - AND Team 11 / 18

Experimental results

• Our experiments are executed on the simulator SimGrid/SMPIv3.10.

• Our algorithm is applied to NAS parallel benchmarks.

• Each node in the cluster has 18 frequency values from 2.5GHzto 800MHz.

• We run the classes A, B and C on 4, 8 or 9 and 16 nodesrespectively.

• The dynamic power with the highest frequency is equal to 20 Wand the power static is equal to 4 W .

DISC Department - AND Team 12 / 18

Experimental results

0.2

0.4

0.6

0.8

1

1.2

1 1.5 2 2.5 3

No

rma

lize

d e

ne

rgy a

nd

pe

rfo

rma

nce

Frequency scaling factors

Optimal scaling factor=1.04

EP Class C Normalized performanceNormalized energy

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1 1.5 2 2.5 3

No

rma

lize

d e

ne

rgy a

nd

pe

rfo

rma

nce

Frequency scaling factors

Optimal scaling factor=1.56

CG Class C Normalized performanceNormalized energy

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

1.1

1.2

1 1.5 2 2.5 3

No

rma

lize

d e

ne

rgy a

nd

pe

rfo

rma

nce

Frequency scaling factors

Optimal scaling factor=1.315

BT Class C Normalized performanceNormalized energy

CG MG EP LU BT SP FT0

5

10

15

20

25

30

35

40

45

39.23

34.97

22.14

35.83

29.6

33.4834.72

14.88

21.69 20.7322.49

21.28 21.3519

Energy Saving % Performance Degradation %

NAS Benchmarks Class C

DISC Department - AND Team 13 / 18

Results comparison

CG MG EP LU BT SP FT0

5

10

15

20

25

30

35

40

45

Energy Saving %Performance Degradation %

Our Method for NAS benchmarks Class C

CG MG EP LU BT SP FT0

10

20

30

40

50

60

Energy Saving %Performance Degradation %

Rauber and Rünger for NAS benchmarks Class C

DISC Department - AND Team 14 / 18

Results comparison

EP CG MG BT LU SP FT

-20

-15

-10

-5

0

5

10

15

20

25

30Rauber E-P EPSA

Comparing our method with Rauber and Rünger method for NAS benchmarks class C

En

ergy

to

Per

form

ance

Dis

tan

ce

DISC Department - AND Team 15 / 18

Conclusions

• We have presented a new online scaling factor selection methodthat optimizes simultaneously the energy and performance.

• It predicts the energy consumption and the performance ofthe parallel applications.

• Our algorithm saves more energy when the communicationand the other slacks times are big.

• It gives the best trade-off between energy reduction andperformance.

• Our method outperforms Rauber and Rünger’s method interms of energy-performance ratio.

DISC Department - AND Team 16 / 18

Future works

• We will apply the proposed algorithm to a heterogeneousplatform.

• While the nodes of a heterogeneous platform are different in:

- Dynamic and static power.- Individual energy consumption.- The available frequencies.- Performance capabilities.

• We will apply the proposed algorithm to a real cluster.

• We will apply the proposed algorithm to a real applications.

DISC Department - AND Team 17 / 18

Thanks for Listening

To appear

This work will be appear in ISPA conference proceedings,August 2014

Questions?

DISC Department - AND Team 18 / 18