Task Scheduling and Voltage Selection for Energy shu/research/papers/dac02.pdf · 2005-09-30 · Task…

Download Task Scheduling and Voltage Selection for Energy shu/research/papers/dac02.pdf · 2005-09-30 · Task…

Post on 22-Jul-2018




0 download

Embed Size (px)


<ul><li><p>Task Scheduling and Voltage Selection for EnergyMinimization</p><p>Yumin ZhangSynopsys, Inc.</p><p>700 East Middlefield RoadMountain View, CA 94043</p><p>yumin@synopsys.com</p><p>Xiaobo (Sharon) Hu Danny Z. ChenDepartment of Computer Science and Engineering</p><p>University of Notre DameNotre Dame, IN 46556, USA</p><p>fshu, cheng@cse.nd.edu</p><p>Categories and Subject DescriptorsI.2.8 [Problem Solving, Control Methods, and Search]:Scheduling</p><p>General TermsAlgorithms, Design</p><p>Keywordsvoltage selection, task scheduling</p><p>ABSTRACTIn this paper, we present a two-phase framework that inte-grates task assignment, ordering and voltage selection (VS)together to minimize energy consumption of real-time de-pendent tasks executing on a given number of variable volt-age processors. Task assignment and ordering in the rstphase strive to maximize the opportunities that can be ex-ploited for lowering voltage levels during the second phase,i.e., voltage selection. In the second phase, we formulate theVS problem as an Integer Programming (IP) problem andsolve the IP eciently. Experimental results demonstratethat our framework is very eective in executing tasks atlower voltage levels under dierent system congurations.</p><p>1. INTRODUCTIONEnergy consumption has become a primary concern in</p><p>today's IC systems. Energy minimization techniques ap-plied at higher design levels tend to be more eective thantechniques applied at lower levels [4]. Processors that canoperate at variable supply voltages or be switched to sleepmode while idling are becoming available [1, 11]. Two mainsystem level energy saving techniques are: voltage selection(VS) (also called voltage scheduling [4]) which selects proces-sor's supply voltage according to the performance require-ment, and power management (PM) [2] which shuts downa processor when it is idle. In general VS is more eectivethan PM [10]. The switching of voltages values can be doneon the y and the incurred overhead can be ignored if suchswitching does not happen very frequently [16]. In real-timesystems where tasks can take tens or hundreds of thousands</p><p>Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.DAC2002, June 10-14, 2002, New Orleans, Louisiana, USA.Copyright 2002 ACM 1-58113-461-4.02.0006 ...$5.00.</p><p>of cycles to execute, applying VS judiciously can achieve alarge amount of energy saving [17].In this paper, we consider real-time dependent tasks with</p><p>deadlines. These tasks are to be executed on variable voltageprocessors. Our challenge is to nd a system implementationthat consumes the least amount of energy. A system imple-mentation is determined by the task assignment (whichtask runs on which processor), the task ordering (the ex-ecution order of tasks on each processor), and the voltageselection (which cycles of each task use which voltage level).We refer the combination of task assignment and orderingas task scheduling in the rest of the paper.Most related work, e.g., [3, 10, 12, 13, 17], concentrates on</p><p>saving energy of independent tasks or on a single processor.However, tasks in real-world applications usually have con-trol or data dependencies and many systems have multipleprocessors. Approaches in [8, 15] solve the energy minimiza-tion problem for dependent tasks on multiple variable volt-age processors. [8] assumes a given task assignment, while[15] assumes a given task scheduling. However, task schedul-ing without consideration of VS may not present the bestenergy saving potential. Furthermore, the approach in [8],which evaluates eligible tasks one by one, fails to recognizethe joint eect of slowing down certain tasks simultaneously.The joint eect is critical in nding global optimal voltagesettings and an example in Section 2 explains the eect inmuch detail. The heuristics in [15] try to distribute slacksevenly among tasks, which does not necessarily lead to bestenergy saving because the slowdown of dierent tasks hasdierent eect on other tasks and contributes dierently tothe overall energy saving. To achieve the best energy saving,the scheduling should try to present the best potential forVS and the VS decision should be made in a global manner.In this paper, we present a framework that integrates task</p><p>scheduling and voltage selection together to minimize energyconsumption of dependent tasks on systems with a givennumber of variable voltage processors. The integration isimportant because scheduling results determine the poten-tial energy saving that can be achieved through VS. In orderto nd the scheduling with the best slowing down opportu-nity in the rst phase, we apply an earliest-deadline-rst(EDF) scheduling which can be proved to be optimal for asingle processor, and a scheduling with priority-based taskordering and a best-t processor assignment for multipleprocessors. Tasks after scheduling can be modeled as a Di-rect Acyclic Graph (DAG). In the second phase, we formu-late the VS problem on a DAG as an Integer-Programming(IP) problem. Each task can have cycles running at dier-</p></li><li><p>0 2 4 6 8 10 120</p><p>5</p><p>10</p><p>15</p><p>20</p><p>25</p><p>Cycle time at voltage 1.0 to 5.0</p><p>Ene</p><p>rgy </p><p>cons</p><p>umpt</p><p>io</p><p>Relation of energy and cycle time</p><p>a=2 a=1.6a=1.2</p><p>Figure 1: Convex relation of energy and delay</p><p>ent voltages. We have implemented the framework and con-ducted experiments on dierent systems and tasksets. Theresults show that our framework is very eective in reducingenergy consumption at the system level. Our IP approachfor solving the VS problem can slowdown 58% more cyclesthan a baseline scaling VS approach.Our main contributions can be summarized as:</p><p>First integrated framework: we solve the energy min-imization problem of dependent tasks on variable voltageprocessors in two phases with the rst phase task schedul-ing being guided by the VS approach in the second phase.Exact algorithm for VS problem: our IP formulationsolves the VS problem exactly and we prove the continuousvoltage and special discrete voltages case are polynomial-time solvable. The ILP for general discrete voltages is solvedwith approximation in short runtime and the approximationis tested to be within 97% of the optimal solution.Substantial energy saving: the framework can slowdown7-98% of cycles with very short runtime. It can be used indesign space exploration to meet energy target and nd thebest system conguration.</p><p>2. PRELIMINARIES AND A MOTIVATIONALEXAMPLE</p><p>The number of cycles, Nu, that task u needs to nish re-mains a constant during VS, while processor's cycle timeCT , u's delay du and the the dominant part of total energyconsumption, dynamic energy consumption, Eu, change withthe supply voltage Vdd. CT , du and Eu can be computed as</p><p>CT =k Vdd</p><p>(Vdd Vth)a(1)</p><p>du = Nu CT = Nu k Vdd</p><p>(Vdd Vth)a(2)</p><p>Eu = Nu Cu V2</p><p>dd (3)</p><p>where k is a device related parameter, Vth is the thresholdvoltage, a ranges from 2 to 1.2 and Cu is the eective switch-ing capacitance per cycle. We depict the relation of Eu anddu for dierent a in Figure 1. It is clear that Eu is a convexfunction of du [12] and both are functions of Vdd. Note thatour work allows tasks to have dierent power characteristics,such as switching capacitance.We use a DAG to represent a taskset. In the DAG, each</p><p>node u represents a task u, while an edge e(u; v) indicatesthat v can only start after u nishes. Deadline dlu constrainsu's nish time and Tcon on a set restricts all tasks' nishtime. A taskset after scheduling can still be representedas a DAG with additional edges, if necessary, to captureprocessor sharing. The DAG for a 5-task set before andafter scheduling to two processors is shown in Figure 2 (a)and (b). Slack is the maximum amount of time that a taskcan be slowed down without violating timing constraints.</p><p>OUT</p><p>p1 p2IN</p><p>4</p><p>(a)</p><p>t35</p><p>t1 t26 5</p><p>(b)</p><p>t16</p><p>5</p><p>4t4</p><p>Tcon = 19 Tcon = 19</p><p>t4</p><p>t35</p><p>t1 t26 5</p><p>4t4</p><p>Tcon = 19</p><p>(c)</p><p>t25</p><p>t3</p><p>1</p><p>1t5 1</p><p>t5</p><p>t5</p><p>Figure 2: (a) A 5-task set (b) Tasks scheduled on P1and P2 (c) DAG with IN and OUT</p><p>p1</p><p>p2</p><p>p1</p><p>p2</p><p>p2</p><p>p1</p><p>320 1 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19</p><p>p1</p><p>p2</p><p>(a)</p><p>(b)</p><p>(d)</p><p>(c)</p><p>t1</p><p>t2</p><p>t1</p><p>t2</p><p>t2t1</p><p>t1</p><p>t4t3</p><p>t3 t4</p><p>t3 t4</p><p>t3 t4</p><p>t2</p><p>t5</p><p>t5</p><p>t5</p><p>t5</p><p>Figure 3: VS results by dierent approaches. Cyclesin shade are executed at Vl.</p><p>2.1 A motivational exampleWe use the example in Figure 2 to show that dierent VS</p><p>approaches on multiple processors perform quite dierently.This example motivated us to make VS decisions in a globalmanner. In Figure 2 (b), the number inside each task isNu, the number of cycles needed to nish the task. AssumeP1 and P2 can operate on Vh and Vl and Vh &gt; Vl. Forsimplicity, assume CTVh is 1 time unit, CTVl is 2, and theaverage energy saving per slowdown cycle is the same for alltasks on P1 and P2. Let Tcon be 19.The VS results by approaches in [8, 15] are shown in Fig-</p><p>ure 3 (b) and (c). Figure 3 (a) shows the execution of taskson P1 and P2 when the two processors are always on Vh andFigure 3 (d) shows an optimal solution for this problem. It isclear that slowing down t1 and t2 simultaneously is the besteven though slowing down each will take away the potentialof slowing down t3, t4 and t5. Also the even distribution ofslack is not optimal in saving energy. The number of cyclesat Vl is 10 in Figure 3 (d) and this is an optimal solution,which is is 67% more than the 6 Vl cycles by approach in [8]and 43% more than the 7 Vl cycles by approach in [15]. OurVS approach, which is to be presented in the next section,achieves the optimal solution for this example.</p><p>3. VOLTAGE SELECTIONOur approach integrates task scheduling and voltage se-</p><p>lection together and strives to achieve the maximum energysaving on variable voltage processors. The framework canbe used to optimize energy at the system level in a design</p><p>Task graph </p><p>System configuration</p><p>NoNo Meet energytarget?</p><p>Yes</p><p> Framework Our</p><p>Voltage selection</p><p>System implementations determined for each task</p><p>constraintsTask scheduling</p><p>Figure 4: Design ow with our approach</p></li><li><p>ow like the one shown in Figure 4. In this section, wewill present an IP formulation that solves the VS problemon a given scheduling of dependent tasks on multiple pro-cessor. A simplied version of the IP formulation can beused to solve the VS problem on a single processor. The IPformulation solves the VS problem exactly and it providesguidance for task scheduling to be discussed in Section 4.</p><p>3.1 Unified IP formulation for VSWe need to guarantee that after the VS technique is ap-</p><p>plied, tasks with deadlines still nish before their deadlinesand the last task on each processor nishes no later thanTcon. That is the delay on each path in the DAG should stillnot be greater than Tcon. However, the number of paths in aDAG can be exponential to the number of edges and makesit not practical to do optimization on a path base. This isthe reason that VS approaches on single processor systemscannot be readily extended to solve the VS problem on mul-tiple processors. Here we adopt a model proposed in [5] toformulate timing constraints on a DAG not on paths, but onnodes and edges. We add two nodes, IN and OUT , edgesfrom IN to the rst task on each processor, and edges fromthe last task on each processor to OUT to form a new DAG.Figure 2 (c) shows the new DAG, G(V;E), for the taskset inFigure 2 (a). Execution time of task u at the highest supplyvoltage Vh is a constant Tu, and its deadline dlu remains aconstant. TOUT and TIN are set to be 0. Tcon is the timingconstraint on the DAG. Besides task's delay du, we associateanother variables Du with u. Timing constraints on G(V;E)in Figure 2 (c) can be modeled as</p><p>DOUT DIN Tcon (4)</p><p>Dv Du du 0 8e(u; v) 2 E (5)</p><p>Du + du dlu 8u with deadline (6)</p><p>du Tu; Du 0; int; 8u 2 V (7)</p><p>Intuitively, Du represents u's start time. For a feasiblescheduling, if DIN is set to be 0, the above constraints guar-antee that tasks with deadlines will nish before their dead-lines and the nish time of all task is not greater than Tcon.Single processor systems are a special case of systems with</p><p>multiple processors. The timing constraints in (4)-(7) onmultiple processors also constrain the timing on a singleprocessor. However, the scheduling on a single processorhas special features that we can utilize to simplify the con-straints. The main feature on a single processor is that alltasks are on one path. The constraints for single processorcan be simplied as follows.X</p><p>u2V</p><p>du Tcon (8)</p><p>X</p><p>v2Bu</p><p>dv dlu 8u withdeadline (9)</p><p>du Tu; int; 8u 2 V (10)</p><p>where Bu is the set that includes u and all tasks scheduledbefore u, du's are variables whose values need to be deter-mined, Tcon, dlu and Tu are constants.The objective of VS is to minimize the sum of each task</p><p>u's energy consumption,P</p><p>u2VEu. Combing the objective</p><p>and the constraints in (4)-(7) for multiple processors andconstraints in (8)-(10) for single processor, we have the IPformulation for the VS problem. To trade the increase ofdelay for energy saving, we need to establish the relation-</p><p>ship between du and Eu. If the supply voltage can changecontinuously, du can also change continuously. In this case,Eu is a convex function of du, as depicted in Figure 1, andwe have Eu = f(du), where f() is a convex function. Sub-stituting Eu with f(du), we have the IP formulation for thecontinuous voltage case.In the discrete voltage case, only a certain number of volt-</p><p>ages are available. For example, the Athlon 4 Processor fromAMD [1] can operate at ve voltage levels. The discretenessof voltage levels implies that a task's delay can only takediscrete values. Let the highest voltages be Vh and otherm available voltages be V1; V2; : : : ; Vm, where Vi &lt; Vh; 1 i m. We introduce m new variables, Nu;i, to representthe number of cycles that task u is executed at Vi. For anygiven Vi, cycle time CTi and energy consumption per cycleEu;i are constants as computed in (1) and (3) in Section 2.The total number of cycles u takes, Nu, remains a constant.du and Eu are computed as linear functions of Nu;i.</p><p>du = Tu +mX</p><p>i=1</p><p>Nu;i(CTi CTh) (11)</p><p>Eu = Cu (mX</p><p>i=1</p><p>Nu;iV2</p><p>i + (Nu mX</p><p>i=1</p><p>Nu;i) V2</p><p>h ) (12)</p><p>Substituting (11)-(12) as Eu and du and addingP</p><p>iNu;i Nu, we have the ILP for the discrete voltage case, whereDu's and Nu;i's are variables.</p><p>3.2 Solving the IP problemIn general, IP problems are hard to solve. But the IP</p><p>problem for the continuous voltage case on systems with agiven number of processors is polynomial time solvable. Aconvex programming solver, like COPL LC from [6], can beused to solve the problem in polynomial time.</p><p>Theorem 1. The VS problem of executing dependent taskson systems with one or more processors with continuous volt-age is polynomial time solvable.</p><p>Proof: The VS problem is formulated as an IP problemwith objective</p><p>Pu f(du), constraints in (4)-(7) for multiple</p><p>processors, and constraints in (8)-(10) for a single processor.The objective f...</p></li></ul>


View more >