Research Article
Probabilistic Analysis of Steady-State Temperature and Maximum Frequency of Multicore Processors considering Workload Variation

Biying Zhang,1,2 Zhongchuan Fu,1 Hongsong Chen,3 and Gang Cui1

1 School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
2 School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China
3 School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Correspondence should be addressed to Zhongchuan Fu; 15124585561@163.com

Received 3 December 2015; Accepted 6 April 2016

Academic Editor: Veljko Milutinovic

Copyright © 2016 Biying Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A probabilistic method is presented to analyze the temperature and the maximum frequency of multicore processors, taking workload variation into account. First, at the microarchitecture level, dynamic power is modeled as a linear function of IPC (instructions per cycle), and leakage power is approximated as a linear function of temperature. Second, the microarchitecture-level hotspot temperatures of both active and inactive cores are derived as linear functions of IPCs. The normal probability distribution of hotspot temperatures is derived under the assumption that the IPCs of all cores follow the same normal distribution. Third, the probability distribution over the set of discrete frequencies is determined. The experimental results show that hotspot temperatures of multicore processors are not deterministic and vary significantly, and that the number of active cores and the running frequency jointly determine the probability distribution of hotspot temperatures. The number of active cores not only results in different probability distributions of frequencies but also leads to different probabilities of triggering DFS (dynamic frequency scaling).
1. Introduction

Continuous technology scaling and miniaturization have escalated the power density and temperature of multicore processors. To decrease manufacturing costs, the packages of multicore processors are mostly designed for average rather than maximum power dissipation, and temperature is controlled with dynamic thermal management (DTM) techniques such as dynamic voltage and frequency scaling (DVFS) and dynamic frequency scaling (DFS) [1]. When the temperature of the processor reaches or approaches the critical point, DVFS or DFS is invoked to enforce the thermal constraint at the cost of processor speed. Therefore, it is crucial for design space exploration to analyze temperature and running frequency accurately and quickly at the early design stage.
1.1. Motivation. To explore the design space of thermal-aware multicore processors at the early stage, several thermal models have been proposed to estimate the temperature and performance of processors [2–5], and most estimation approaches are based on transient analysis [6–14]. Transient analysis traces the temporal variations of temperature and performance under a given workload, which yields high estimation accuracy. However, transient analysis is time-consuming, and for multicore processors in particular its time complexity is unacceptable at the early design stage. Accordingly, to speed up the estimation of temperature and performance of multicore processors, researchers resort to steady-state analysis [15, 16]. Nevertheless, to the best of our knowledge, all previous work on steady-state analysis assumes that every workload has the same thermal contribution, which greatly hinders the estimation accuracy. In fact, the temperature of multicore processors varies greatly between workloads [10, 17]. Our preliminary work has demonstrated that the dynamic power of processors is highly correlated with IPC (instructions per cycle) and that, within a small temperature range, leakage power depends linearly on temperature [18]. According to the HotSpot thermal model, temperature can be derived given the power of processors [3]. Thus, processor temperature is highly correlated with IPC. According to the CLT (central limit theorem), when a large number of instructions are executed in the processor, the probability distribution of IPC tends toward the normal distribution. Accordingly, given both the probability distribution of IPC and the relationship between temperature and IPC, the probability distribution of processor temperature can be derived and analyzed. Subsequently, the probability distribution of the maximum running frequency can be inferred, given that the processor uses the zero-slack DTM policy, which sets the speed of the processor to the value at which the hotspot temperature equals the maximum threshold allowed by the processor [7].

Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2016, Article ID 2462504, 17 pages, http://dx.doi.org/10.1155/2016/2462504
1.2. Contributions. In this paper, a probabilistic method is proposed to analyze the steady-state temperature and frequency of multicore processors, taking into account the variation of workloads. To simplify the analysis, the DFS technique rather than the DVFS technique is adopted to manage processor temperature, so the voltage is constant and only the frequency is adjusted. The dynamic power can then be modeled as a linear function of the frequency [12, 16]. The main contributions of this work are as follows:

(i) At the microarchitecture level, the dynamic power of processors is modeled as a linear function of IPC and running frequency, and the leakage power of the processor is approximated by a linear model of temperature.
(ii) The microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as linear functions of the IPCs of all active cores.
(iii) It is inferred that the hotspot temperatures of both active cores and inactive cores follow the normal probability distribution, based on the assumption that the IPCs of all active cores follow the same normal distribution.
(iv) The probability distribution over the set of frequencies is determined given the zero-slack DTM policy [7].
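
The zero-slack idea behind contribution (iv) can be illustrated with a small sketch. It assumes, purely for illustration, that the steady-state hotspot temperature is an affine function of the normalized frequency and that the processor offers a handful of discrete DFS levels; the function names, the levels, and all numeric values are hypothetical and do not come from the paper.

```python
def max_safe_frequency(frequencies, hotspot_temp, t_max):
    """Return the highest frequency whose hotspot temperature is <= t_max.

    Returns None when even the lowest level violates the threshold.
    """
    safe = [f for f in frequencies if hotspot_temp(f) <= t_max]
    return max(safe) if safe else None


def make_affine_model(slope, offset):
    """Hypothetical temperature model T(f) = slope * f + offset (deg C)."""
    return lambda f: slope * f + offset


levels = [0.4, 0.6, 0.8, 1.0]                  # assumed normalized DFS levels
model = make_affine_model(slope=50.0, offset=40.0)
chosen = max_safe_frequency(levels, model, t_max=85.0)
```

With these assumed values the level 1.0 would reach 90 °C, so the sketch selects 0.8. When IPC varies, the slope and offset vary with it, which is why the selected level becomes a random variable.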
The remainder of this paper is organized as follows. Section 2 reviews related work. In Section 3, the microarchitecture-level steady-state temperature of a core is formulated as a linear function of its power, based on the HotSpot thermal model. In Section 4, at the microarchitecture level, the dynamic power of processors is modeled as a linear function of IPC and running frequency, and the leakage power is approximated by a linear model of temperature. In Section 5, the microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as linear functions of the IPCs of all active cores, and it is inferred that these hotspot temperatures follow the normal probability distribution, based on the assumption that the IPCs of all active cores follow the same normal probability distribution. In Section 6, the probability distribution over the set of frequencies is determined given the zero-slack DTM policy. Section 7 presents experimental results, and Section 8 concludes the paper.
2. Related Work

Approaches to estimating the temperature and performance of processors can be classified into transient analysis and steady-state analysis; so far, most research has been based on transient analysis.
2.1. Transient Analysis. To analyze temperature transiently and explore the design space of processors at the early stage, several thermal models for processors have been proposed. Skadron et al. and Huang et al. [2, 3] presented HotSpot, a compact thermal modeling methodology based on the analogy between thermal and electrical phenomena. Using HotSpot, the spatial and temporal variations of processor temperature can be obtained through transient analysis. To improve the accuracy of thermal simulation, Jang et al. [11] extended the HotSpot thermal model by taking into account the different ambient temperatures caused by workload variations. To accelerate thermal analysis of multicore processors at the architecture level, Wang et al. [5] presented a composite thermal model termed ThermComp to optimize the model for different large processors. Li et al. [4] proposed a parameterized architecture-level dynamic thermal model, ParThermPOF, in which many parameters can be set, such as the location of thermal sensors and the conductivity of different components.

To improve the performance of thermal-aware multicore processors, various DTM techniques have been investigated on top of thermal models such as HotSpot and analyzed transiently to estimate performance. Hanumaiah et al. [7, 19] presented an online thermal management algorithm for thermal-aware multicore processors in which DVFS and task allocation techniques are adopted simultaneously. In the context of hard real-time systems, the time-varying voltage and frequency of multicore processors are computed to satisfy not only the thermal constraint but also the deadline constraint [6]. Shi et al. [12] presented a DTM policy under a soft thermal constraint, in which the temperature constraint may occasionally be exceeded.
Several studies have aimed to simulate the thermal behavior of multicore processors quickly and accurately. Wojciechowski et al. [10, 20] analyzed the transient characteristics of workloads using a finite Fourier series expansion to accurately predict the thermal behavior of multicore processors, and presented a new DVFS approach. Liu et al. [13] proposed a transient temperature analysis method for multicore processors based on moment matching and used it to guide task migration. To account for the nondeterministic behavior of tasks in terms of execution times and decision branches, Das et al. [14] formulated thermal analysis as a hybrid automata reachability verification problem and provided an algorithm for constructing the automata.

Transient analysis traces the temporal variations of temperature and performance under a given workload and therefore achieves high estimation accuracy. However, transient analysis is time-consuming, and for multicore processors in particular its time complexity is unacceptable at the early design stage.
2.2. Steady-State Analysis. To speed up the estimation of temperature and performance at the early design stage of multicore processors, researchers have resorted to steady-state analysis. Based on Amdahl's Law, Lee and Kim [15, 21] introduced variations of process and workload parallelism into the analysis model and optimized the throughput of thermal-aware multicore processors by exploiting DVFS and per-core power-gating (PCPG). Based on HotSpot, Rao et al. [16, 22] described an approximate thermal model for homogeneous multicore processors that quickly and accurately predicts the maximum steady-state throughput under thermal constraints. In the context of a hard real-time system on a single-core processor, Mohaqeqi et al. [23] studied the stochastic behavior of the system, for example, performance, temperature, and reliability, from a Markovian view.

To the best of our knowledge, all previous work on steady-state analysis assumes that every workload has the same thermal contribution to the processor, which results in inaccurate temperature and performance estimates. This is the focus of our work. In this paper, the variation of workloads is taken into account to model temperature and frequency more accurately.
3. Thermal Model

In this paper, a microarchitecture-level thermal model for a multicore processor is created by replicating a single-core processor model based on HotSpot [3]. The multicore processor is divided into four layers, that is, chip, thermal interface material (TIM), heat spreader, and heat sink. There are $m$ thermal blocks in the chip and TIM, and there are five and nine thermal blocks in the heat spreader and heat sink, respectively. In total, there are $N = nm + 14$ thermal blocks in the multicore processor, where $n$ is the number of cores.
The microarchitecture-level thermal model can be represented by the following state-space differential equation [16]:

$$
\frac{d\mathbf{T}}{dt} = \mathbf{A}\mathbf{T} + \mathbf{B}\mathbf{P}, \tag{1}
$$

where $\mathbf{T}$ and $\mathbf{P}$ are $N$-dimensional vectors denoting the temperature and power of the multicore processor, respectively, and $\mathbf{A}$ and $\mathbf{B}$ are constant $N \times N$ matrices that depend on the thermal conduction and capacitance of the processor.
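
As a rough illustration of how the state-space form (1) settles into a steady state, the following sketch integrates a single thermal block with forward Euler. It is a minimal one-node example, not the HotSpot model: the conductance, capacitance, power, step size, and function name are all assumed values chosen for illustration.

```python
def euler_steady_state(a, b, power, temp0=0.0, dt=0.01, steps=5000):
    """Integrate dT/dt = a*T + b*P with forward Euler; return the final T."""
    temp = temp0
    for _ in range(steps):
        temp += dt * (a * temp + b * power)
    return temp


# Hypothetical single block: conductance g (W/K) and capacitance c (J/K).
g, c = 2.0, 1.0
a, b = -g / c, 1.0 / c        # scalar counterparts of A and B in (1)
power = 10.0                  # W; temperature is measured relative to ambient

transient_limit = euler_steady_state(a, b, power)
steady_state = -(b / a) * power   # setting dT/dt = 0 in (1)
```

In this scalar case the transient solution converges to the value obtained by setting the derivative to zero, which is the quantity the steady-state analysis works with directly.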
[Figure 1 placeholder: block matrix whose row and column blocks are Die Core 1, Die Core 2, TIM Core 1, TIM Core 2, spreader center, and spreader peripheral and heat sink; the nonzero blocks are G_die, G_die-int, G_int-die, G_int, G_int-spr, G_spr-int, and G_pkg, and the remaining blocks are zero.]

Figure 1: Thermal conduction matrix G of dual-core processor [16].
When the temperature of the processor is in the steady state, $d\mathbf{T}/dt = \mathbf{0}$, and then

$$
\mathbf{G}\mathbf{T} = \mathbf{P}, \tag{2}
$$

where $\mathbf{G} = -\mathbf{B}^{-1}\mathbf{A}$ is the thermal conductance matrix of the HotSpot model.

The thermal conductance matrix $\mathbf{G}$ can be obtained from the HotSpot simulation tool. The thermal conductance matrix of a dual-core processor is shown in Figure 1, where the submatrices $\mathbf{G}_{\mathrm{die}}$, $\mathbf{G}_{\mathrm{int}}$, and $\mathbf{G}_{\mathrm{pkg}}$ along the diagonal are the lateral thermal conductances of the chip, TIM, and package, respectively; the submatrices $\mathbf{G}_{\text{die-int}}$ and $\mathbf{G}_{\text{int-die}}$ are the vertical conductances between die and TIM; and the vectors $\mathbf{G}_{\text{int-spr}}$ and $\mathbf{G}'_{\text{spr-int}}$ are the vertical conductances between the TIM and the spreader. $\mathbf{G}_{\text{die-int}}$ is equal to $\mathbf{G}_{\text{int-die}}$, and $\mathbf{G}_{\text{int-spr}}$ is equal to $\mathbf{G}'_{\text{spr-int}}$. The detailed conductance matrix $\mathbf{G}_{\mathrm{pkg}}$ is shown in Figure 2.
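
To make the steady-state relation (2) concrete, the sketch below solves $\mathbf{G}\mathbf{T} = \mathbf{P}$ for a toy three-block network using plain Gaussian elimination. The matrix and power values are hypothetical and far smaller than a real HotSpot conductance matrix; `solve_linear` is an illustrative helper, not part of any tool mentioned in the paper.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("conductance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - partial) / aug[r][r]
    return x


# Hypothetical 3-block conductance matrix (W/K) and block powers (W).
G = [[3.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 2.0]]
P = [10.0, 5.0, 0.0]
T = solve_linear(G, P)   # steady-state temperatures relative to ambient
```

A single linear solve replaces the time stepping of a transient simulation, which is the source of the speed advantage of steady-state analysis.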
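Lemma 1. According to the HotSpot thermal model, when the temperature of the multicore processor is in the steady state, there exist $m$-dimensional matrices $\mathbf{R}$, $\mathbf{Q}$, and $\mathbf{Z}$ such that

$$
\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}, \tag{3}
$$

where $\mathbf{T}_i$ and $\mathbf{P}_i$ are $m$-dimensional vectors representing, respectively, the temperature and power of the $i$th core of the processor.

Proof. According to the arrangement of elements in the conductance matrix $\mathbf{G}$ of the HotSpot thermal model, (2) can be decomposed into the following equations:

$$
\mathbf{G}_{\mathrm{die}}\mathbf{T}_i + \mathbf{G}_{\text{die-int}}\mathbf{T}_{\mathrm{int},i} = \mathbf{P}_i, \tag{4}
$$

$$
\mathbf{G}_{\text{die-int}}\mathbf{T}_i + \mathbf{G}_{\mathrm{int}}\mathbf{T}_{\mathrm{int},i} + \mathbf{G}_{\text{int-spr}}T_{\mathrm{spr}} = \mathbf{0}, \tag{5}
$$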
[Figure 2 placeholder: block matrix whose row and column blocks are spreader center, and spreader peripheral and heat sink; the blocks are G_spr, G_spr-other, G_other-spr, and G_other-other.]

Figure 2: Thermal conduction matrix of the package [16].
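$$
\mathbf{G}'_{\text{int-spr}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + G_{\mathrm{spr}}T_{\mathrm{spr}} + \mathbf{G}'_{\text{spr-other}}\mathbf{T}_{\mathrm{other}} = 0, \tag{6}
$$

$$
\mathbf{G}_{\text{spr-other}}T_{\mathrm{spr}} + \mathbf{G}_{\text{other-other}}\mathbf{T}_{\mathrm{other}} = \mathbf{P}_{\mathrm{ambient}}, \tag{7}
$$

where $\mathbf{T}_{\mathrm{int},i}$ is an $m$-dimensional vector representing the temperature of the TIM layer of the $i$th core, $T_{\mathrm{spr}}$ is a scalar representing the temperature of the central block of the spreader, and $\mathbf{T}_{\mathrm{other}}$ is a 13-dimensional vector representing the temperature of the other blocks of the spreader and the sink.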
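According to (6) and (7),

$$
T_{\mathrm{spr}} = \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\mathrm{spr}}\right)^{-1}\left(\mathbf{G}'_{\text{int-spr}}\sum_{i=1}^{n}\mathbf{T}_{\mathrm{int},i} + \mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{P}_{\mathrm{ambient}}\right). \tag{8}
$$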
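Set

$$
\begin{aligned}
\mathbf{G}_{\mathrm{SPRO}} &= \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\mathrm{spr}}\right)^{-1}\mathbf{G}_{\text{int-spr}}, \\
G_{\mathrm{spamb}} &= \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\mathrm{spr}}\right)^{-1}\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{P}_{\mathrm{ambient}}.
\end{aligned} \tag{9}
$$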
Equation (8) can be converted into

$$
T_{\mathrm{spr}} = \mathbf{G}'_{\mathrm{SPRO}}\sum_{i=1}^{n}\mathbf{T}_{\mathrm{int},i} + G_{\mathrm{spamb}}. \tag{10}
$$
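Substituting (10) into (5), with the summation index renamed to $j$ to avoid a clash with the core index $i$, gives

$$
\mathbf{G}_{\text{die-int}}\mathbf{T}_i + \mathbf{G}_{\mathrm{int}}\mathbf{T}_{\mathrm{int},i} + \mathbf{G}_{\text{int-spr}}\left(\mathbf{G}'_{\mathrm{SPRO}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + G_{\mathrm{spamb}}\right) = \mathbf{0}. \tag{11}
$$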
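According to (4) and (11),

$$
\begin{aligned}
&\left(\mathbf{G}_{\text{die-int}} - \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\mathrm{die}}\right)\mathbf{T}_i - \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\mathrm{die}}\sum_{j=1}^{n}\mathbf{T}_j \\
&\quad + \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{P}_i + \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_{\text{int-spr}}G_{\mathrm{spamb}} = \mathbf{0}.
\end{aligned} \tag{12}
$$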
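Set

$$
\begin{aligned}
\mathbf{G}_1 &= \mathbf{G}_{\text{die-int}} - \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\mathrm{die}}, \\
\mathbf{G}_2 &= \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\mathrm{die}}, \\
\mathbf{G}_3 &= \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_4 &= \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_5 &= \mathbf{G}_{\text{int-spr}}G_{\mathrm{spamb}}.
\end{aligned} \tag{13}
$$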
Then (12) can be transformed into

$$
\mathbf{G}_1\mathbf{T}_i - \mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\mathbf{P}_i + \mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_5 = \mathbf{0}. \tag{14}
$$
Summing (14) over all $n$ cores gives

$$
\mathbf{G}_1\sum_{j=1}^{n}\mathbf{T}_j - n\mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_5 = \mathbf{0}. \tag{15}
$$
According to (15),

$$
\sum_{j=1}^{n}\mathbf{T}_j = (n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4)\sum_{j=1}^{n}\mathbf{P}_j + n(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}\mathbf{G}_5. \tag{16}
$$
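Substituting (16) into (14) gives

$$
\begin{aligned}
\mathbf{T}_i ={}& \mathbf{G}_1^{-1}\left(\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right)\sum_{j=1}^{n}\mathbf{P}_j - \mathbf{G}_1^{-1}\mathbf{G}_3\mathbf{P}_i \\
& + \mathbf{G}_1^{-1}\left(n\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right)\mathbf{G}_5,
\end{aligned} \tag{17}
$$

where $\mathbf{E}$ denotes the identity matrix.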
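Set

$$
\begin{aligned}
\mathbf{R} &= -\mathbf{G}_1^{-1}\mathbf{G}_3, \\
\mathbf{Q} &= \mathbf{G}_1^{-1}\left(\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right), \\
\mathbf{Z} &= \mathbf{G}_1^{-1}\left(n\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right)\mathbf{G}_5.
\end{aligned} \tag{18}
$$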
This gives

$$
\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}. \tag{19}
$$

This means that, in the steady state of a multicore processor, the temperature of any core can be calculated at the microarchitecture level from the powers of all cores according to Lemma 1.
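
The algebra of the proof can be sanity-checked in the scalar case $m = 1$, where every matrix reduces to a number. The sketch below builds $R$, $Q$, and $Z$ from (18), evaluates the per-core temperatures with (3), and then evaluates the left-hand side of (14), which should vanish. The coefficient values are hypothetical and chosen only so that $nG_2 - G_1 \neq 0$; the function names are illustrative.

```python
def lemma1_coefficients(g1, g2, g3, g4, g5, n):
    """Scalar (m = 1) versions of R, Q and Z as defined in (18)."""
    k = 1.0 / (n * g2 - g1)
    r = -g3 / g1
    q = (g2 * k * (g3 + n * g4) - g4) / g1
    z = (n * g2 * k - 1.0) * g5 / g1
    return r, q, z


def core_temperatures(powers, r, q, z):
    """Apply T_i = R*P_i + Q*sum_j(P_j) + Z, i.e. (3), to every core."""
    total = sum(powers)
    return [r * p + q * total + z for p in powers]


def residuals_eq14(temps, powers, g1, g2, g3, g4, g5):
    """Left-hand side of (14) for each core; every entry should be ~0."""
    sum_t, sum_p = sum(temps), sum(powers)
    return [g1 * t - g2 * sum_t + g3 * p + g4 * sum_p + g5
            for t, p in zip(temps, powers)]


# Hypothetical scalar coefficients (not taken from HotSpot).
g = dict(g1=2.0, g2=0.1, g3=-1.0, g4=-0.05, g5=-3.0)
powers = [10.0, 20.0]
r, q, z = lemma1_coefficients(n=len(powers), **g)
temps = core_temperatures(powers, r, q, z)
res = residuals_eq14(temps, powers, **g)
```

The point of the check is structural: each core's temperature depends on its own power through $R$ and on the total power of all cores through $Q$, with $Z$ collecting the ambient contribution.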
4. Power Model

It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. Global dynamic frequency scaling (DFS) is used, so the frequencies of all cores are scaled uniformly.
4.1. Active Mode. When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as

$$
\mathbf{P}_{\mathrm{a},i} = \mathbf{P}_{\mathrm{dyn,a},i} + \mathbf{P}_{\mathrm{lea,a},i}, \tag{20}
$$

where $\mathbf{P}_{\mathrm{a},i}$, $\mathbf{P}_{\mathrm{dyn,a},i}$, and $\mathbf{P}_{\mathrm{lea,a},i}$ are, respectively, the total power, dynamic power, and leakage power of the $i$th active core, each of size $m \times 1$.

For a processor using the DVFS technique, the dynamic power is proportional to the product of the square of the voltage and the frequency [7], that is, $P_{\mathrm{dyn}} \propto \text{voltage}^2 \times \text{frequency}$. In this paper, the primary purpose is to analyze the impact of workload variation on processor temperature. To simplify the analysis, the DFS technique rather than the DVFS technique is used to manage processor temperature, so the voltage is constant and only the frequency is adjusted. Hence, the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with IPC.
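
The linear relationship between dynamic power and IPC can be estimated by ordinary least squares. The sketch below fits a slope and an intercept for a single functional unit at maximum frequency; the sample data are invented for illustration, and the variable names `g_dyn` and `g_d0` are only labels for the fitted slope and intercept.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical (IPC, dynamic power in W) samples for one unit at f = 1.
ipc = [0.5, 1.0, 1.5, 2.0]
power = [4.1, 5.9, 8.2, 9.8]
g_dyn, g_d0 = fit_line(ipc, power)

# What the line does not explain is the regression residual.
residuals = [p - (g_dyn * x + g_d0) for x, p in zip(ipc, power)]
```

Repeating such a fit per functional unit would yield one slope and one intercept per unit, which is why the coefficients in the power model are vectors.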
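Let $X_{\mathrm{a},i}$ be the IPC of the $i$th core when workloads are running on it, and let $f$ be the normalized frequency between 0 and 1. The microarchitecture-level dynamic power $\mathbf{P}_{\mathrm{dyn,a},i}$ of the $i$th active core is then defined as

$$
\mathbf{P}_{\mathrm{dyn,a},i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f, \tag{21}
$$

where $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ are the linear regression coefficients obtained when $f$ is set to its maximum of 1, and $\mathbf{E}_{\mathrm{dyn}}$ is the regression residual, which follows the normal distribution with zero mean, namely, $\mathbf{E}_{\mathrm{dyn}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{Ed})$. $\mathbf{G}_{\mathrm{dyn}}$, $\mathbf{G}_{d0}$, $\mathbf{E}_{\mathrm{dyn}}$, and $\boldsymbol{\sigma}_{Ed}$ are vectors of size $m \times 1$, and $\boldsymbol{\sigma}^2_{Ed}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ed}$.

At the microarchitecture level, to simplify the analysis, the relationship between the leakage power $\mathbf{P}_{\mathrm{lea,a},i}$ and the temperature $\mathbf{T}_{\mathrm{a},i}$ of the $i$th active core is approximated by the following linear model:

$$
\mathbf{P}_{\mathrm{lea,a},i} = \mathbf{G}_{\mathrm{lea,a}}\mathbf{T}_{\mathrm{a},i} + \mathbf{G}_{l0,\mathrm{a}} + \mathbf{E}_{\mathrm{lea,a}}, \tag{22}
$$

where $\mathbf{G}_{\mathrm{lea,a}}$ and $\mathbf{G}_{l0,\mathrm{a}}$ are the regression coefficients and $\mathbf{E}_{\mathrm{lea,a}}$ is the regression residual, which follows the normal distribution with zero mean, namely, $\mathbf{E}_{\mathrm{lea,a}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\mathrm{a}})$. $\mathbf{G}_{\mathrm{lea,a}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0,\mathrm{a}}$, $\mathbf{E}_{\mathrm{lea,a}}$, $\boldsymbol{\mu}_{E\mathrm{a}}$, and $\boldsymbol{\sigma}_{E\mathrm{a}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\mathrm{a}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{a}}$.

According to (20), (21), and (22), the power of the active core $\mathbf{P}_{\mathrm{a},i}$ is

$$
\mathbf{P}_{\mathrm{a},i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f + \mathbf{G}_{\mathrm{lea,a}}\mathbf{T}_{\mathrm{a},i} + \mathbf{G}_{l0,\mathrm{a}} + \mathbf{E}_{\mathrm{lea,a}}. \tag{23}
$$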
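
For a single functional unit ($m = 1$), the active-core power model reduces to a few scalar operations. The following sketch evaluates the sum of the dynamic term, scaled by the normalized frequency, and the linear leakage term; the function name and all coefficient values are illustrative assumptions, and the residuals default to zero.

```python
def active_core_power(ipc, freq, temp, g_dyn, g_d0, g_lea, g_l0,
                      e_dyn=0.0, e_lea=0.0):
    """Scalar active-core power: dynamic part times f plus linear leakage."""
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq is the normalized frequency in [0, 1]")
    dynamic = (g_dyn * ipc + g_d0 + e_dyn) * freq   # cf. (21)
    leakage = g_lea * temp + g_l0 + e_lea           # cf. (22)
    return dynamic + leakage                        # cf. (20)


# Hypothetical coefficients: W per IPC, W, W per deg C, W.
p_full = active_core_power(ipc=1.5, freq=1.0, temp=70.0,
                           g_dyn=4.0, g_d0=2.0, g_lea=0.02, g_l0=0.5)
p_scaled = active_core_power(ipc=1.5, freq=0.8, temp=70.0,
                             g_dyn=4.0, g_d0=2.0, g_lea=0.02, g_l0=0.5)
```

Lowering the frequency reduces only the dynamic term directly; the leakage term changes indirectly, once the temperature responds to the lower power.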
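4.2. Inactive Mode. When a core is in the inactive mode, it is powered off using the power-gating technique. The inactive core dissipates only leakage power, which is much lower than that of the active core. Hence, the power of the $i$th inactive core $\mathbf{P}_{\mathrm{ina},i}$ is the leakage power $\mathbf{P}_{\mathrm{lea,ina},i}$, which depends on the core's temperature $\mathbf{T}_{\mathrm{ina},i}$ and can be approximated at the microarchitecture level by the following linear model:

$$
\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{lea,ina},i} = \mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina},i} + \mathbf{G}_{l0,\mathrm{ina}} + \mathbf{E}_{\mathrm{lea,ina}}, \tag{24}
$$

where $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0,\mathrm{ina}}$ are regression coefficients and $\mathbf{E}_{\mathrm{lea,ina}}$ is the regression residual, which follows the normal distribution with zero mean, namely, $\mathbf{E}_{\mathrm{lea,ina}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\mathrm{ina}})$. $\mathbf{G}_{\mathrm{lea,ina}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0,\mathrm{ina}}$, $\mathbf{E}_{\mathrm{lea,ina}}$, and $\boldsymbol{\sigma}_{E\mathrm{ina}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\mathrm{ina}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{ina}}$.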
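5. Thermal Analysis

Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same, that is, $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina},j}$ and $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina},j}$ for all $i, j$ ($i \neq j$), where $\mathbf{T}_{\mathrm{ina},i}$ and $\mathbf{P}_{\mathrm{ina},i}$ are the temperature and the power of the $i$th inactive core, respectively.

Proof. According to (3), for any inactive cores $i$ and $j$,

$$
\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = \mathbf{R}\left(\mathbf{P}_{\mathrm{ina},i} - \mathbf{P}_{\mathrm{ina},j}\right). \tag{25}
$$

Substituting (24) into (25) gives

$$
\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}\left(\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j}\right). \tag{26}
$$

According to (26),

$$
\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}\right)\left(\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j}\right) = \mathbf{0}. \tag{27}
$$

The parameters $\mathbf{R}$ and $\mathbf{G}_{\mathrm{lea,ina}}$ are invertible matrices, so $\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}$ is an invertible matrix. Therefore, $\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = \mathbf{0}$, that is, $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina},j}$. $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina},j}$ then follows from (24).

For convenience, set $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina}}$ and $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina}}$; (24) can then be simplified into

$$
\mathbf{P}_{\mathrm{ina}} = \mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} + \mathbf{G}_{l0,\mathrm{ina}} + \mathbf{E}_{\mathrm{lea,ina}}. \tag{28}
$$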
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_1, h_2, \ldots, h_m]$ be the selection vector of the hotspot, where exactly one element of $\mathbf{H}$ is set to 1 and the others are 0, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,a},i}$ of the $i$th active core can be formulated as

$$
\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
& + \psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{29}
$$

There exist functions $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core can be formulated as

$$
\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,a}} \\
& + \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{30}
$$
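
Because (29) is linear in the IPCs, a normal distribution on the IPCs carries over to the hotspot temperature. The sketch below illustrates this for scalar coefficients using `statistics.NormalDist`. It makes two simplifying assumptions that are not stated in the theorem: the IPCs of different cores are treated as mutually independent, and the residual terms are ignored. The coefficient values, the threshold, and the function names are hypothetical.

```python
from statistics import NormalDist


def hotspot_distribution(psi1, psi2, psi6, n_active, ipc_mean, ipc_stdev):
    """Distribution of psi1*X_i + psi2*sum_k(X_k) + psi6 for i.i.d. IPCs."""
    ipc = NormalDist(ipc_mean, ipc_stdev)
    temp = ipc * (psi1 + psi2)        # X_i appears in both terms
    for _ in range(n_active - 1):     # the other active cores
        temp = temp + ipc * psi2      # NormalDist '+' assumes independence
    return temp + psi6


def exceed_probability(temp_dist, t_max):
    """Probability that the hotspot temperature exceeds the threshold."""
    return 1.0 - temp_dist.cdf(t_max)


# Hypothetical coefficients (deg C per IPC, deg C) and IPC statistics.
dist = hotspot_distribution(psi1=5.0, psi2=1.0, psi6=40.0,
                            n_active=4, ipc_mean=1.2, ipc_stdev=0.3)
p_exceed = exceed_probability(dist, t_max=52.0)
```

Under these assumptions the mean is $(\psi_1 + n_{\mathrm{a}}\psi_2)\mu + \psi_6$ and the variance is $\left((\psi_1 + \psi_2)^2 + (n_{\mathrm{a}} - 1)\psi_2^2\right)\sigma^2$, so both the location and the spread of the temperature distribution change with the number of active cores.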
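Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\mathrm{ina}}$ can be derived as

$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} \\
& + f\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + fn_{\mathrm{a}}\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} \\
& + n_{\mathrm{a}}\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
& + \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} \\
& + \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\left((\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + fn_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}\right).
\end{aligned} \tag{31}
$$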
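Let

$$
\begin{aligned}
\Phi_1(n, n_{\mathrm{a}}, f) &= (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + fn_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}, \\
\Phi_2(n, n_{\mathrm{a}}) &= \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}.
\end{aligned} \tag{32}
$$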
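Then (31) can be transformed into

$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + fn_{\mathrm{a}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} + n_{\mathrm{a}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
& + \Phi_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} + \Phi_2(n, n_{\mathrm{a}})\Phi_1(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{33}
$$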
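
A special case helps in reading (33). If every core is inactive ($n_{\mathrm{a}} = 0$) and the residual is set to zero, (33) collapses to $\mathbf{T}_{\mathrm{ina}} = \Phi_2\Phi_1$ with $\Phi_1 = (\mathbf{R} + n\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}$. The scalar sketch below compares that closed form with a direct fixed-point iteration of the leakage-temperature feedback expressed by (3) and (28). All numeric values and function names are hypothetical, and the iteration converges only because the assumed loop gain $(R + nQ)G_{\mathrm{lea,ina}}$ is well below 1.

```python
def inactive_temp_closed_form(r, q, z, g_lea, g_l0, n):
    """Scalar T_ina = Phi2 * Phi1 for n inactive cores and zero residual."""
    rq = r + n * q
    phi1 = rq * g_l0 + z
    phi2 = 1.0 / (1.0 - rq * g_lea)
    return phi2 * phi1


def inactive_temp_iterative(r, q, z, g_lea, g_l0, n, temp0=0.0, iters=200):
    """Alternate between leakage power (28) and temperature (3)."""
    rq = r + n * q
    temp = temp0
    for _ in range(iters):
        power = g_lea * temp + g_l0   # leakage at the current temperature
        temp = rq * power + z         # all n cores dissipate the same power
    return temp


# Hypothetical scalar parameters for a 4-core processor.
params = dict(r=0.5, q=0.05, z=45.0, g_lea=0.02, g_l0=0.5, n=4)
t_closed = inactive_temp_closed_form(**params)
t_iter = inactive_temp_iterative(**params)
```

The inverse in $\Phi_2$ is what resolves the circular dependence between leakage power and temperature in one step instead of by iteration.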
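According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{\mathrm{a},i}$ can be derived as

$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} \\
& + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
& + (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} \\
& + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\left(f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}\right).
\end{aligned} \tag{34}
$$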
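Let

$$
\begin{aligned}
\Phi_3(n, n_{\mathrm{a}}, f) &= f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}, \\
\Phi_4 &= (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}.
\end{aligned} \tag{35}
$$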
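Then (34) can be transformed into

$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + f\Phi_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} \\
& + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} + f\Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + \Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
& + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} + \Phi_4\Phi_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{36}
$$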
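Substituting (33) into (36) gives

$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \left(\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\right)\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} \\
& + f\left(\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + f\Phi_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} \\
& + f\left(\Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
& + \left(\Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{lea,a}} \\
& + \left((n - n_{\mathrm{a}})\Phi_4\mathbf{Q} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
& + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\Phi_1(n, n_{\mathrm{a}}, f) + \Phi_4\Phi_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{37}
$$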
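Let

$$
\begin{aligned}
\Phi_5(n, n_{\mathrm{a}}) &= \Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}, \\
\Phi_6(n, n_{\mathrm{a}}, f) &= f\left(\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right), \\
\Phi_7(f) &= f\Phi_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}, \\
\Phi_8(n, n_{\mathrm{a}}, f) &= f\left(\Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\Phi_9(n, n_{\mathrm{a}}) &= \Phi_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\mathbf{Q}, \\
\Phi_{10}(n, n_{\mathrm{a}}) &= (n - n_{\mathrm{a}})\Phi_4\mathbf{Q} + (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}), \\
\Phi_{11}(n, n_{\mathrm{a}}, f) &= (n - n_{\mathrm{a}})\Phi_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\Phi_2(n, n_{\mathrm{a}})\Phi_1(n, n_{\mathrm{a}}, f) + \Phi_4\Phi_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{38}
$$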
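Then (37) is transformed into

$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \Phi_5(n, n_{\mathrm{a}})\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + \Phi_6(n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \Phi_7(f)X_{\mathrm{a},i} + \Phi_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
& + \Phi_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \Phi_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} + \Phi_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{39}
$$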
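Summing (39) over all $n_{\mathrm{a}}$ active cores and solving for the sum of temperatures gives

$$
\begin{aligned}
\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} ={}& (\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\Phi_6(n, n_{\mathrm{a}}, f) + \Phi_7(f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
& + n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
& + n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} \\
& + n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{40}
$$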
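Substituting (40) into (39) gives

$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \Phi_7(f)X_{\mathrm{a},i} \\
& + \left(\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\Phi_6(n, n_{\mathrm{a}}, f) + \Phi_7(f)\right) + \Phi_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + \left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_8(n, n_{\mathrm{a}}, f) + \Phi_8(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
& + \left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_9(n, n_{\mathrm{a}}) + \Phi_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
& + \left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{10}(n, n_{\mathrm{a}}) + \Phi_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
& + n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{11}(n, n_{\mathrm{a}}, f) + \Phi_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{41}
$$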
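The hotspot temperature $T_{\mathrm{hot,a},i}$ of the active core is given by

$$
\begin{aligned}
T_{\mathrm{hot,a},i} = \mathbf{H}\mathbf{T}_{\mathrm{a},i} ={}& \mathbf{H}\Phi_7(f)X_{\mathrm{a},i} \\
& + \mathbf{H}\left(\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\Phi_6(n, n_{\mathrm{a}}, f) + \Phi_7(f)\right) + \Phi_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
& + \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_8(n, n_{\mathrm{a}}, f) + \Phi_8(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
& + \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_9(n, n_{\mathrm{a}}) + \Phi_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
& + \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{10}(n, n_{\mathrm{a}}) + \Phi_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
& + \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{11}(n, n_{\mathrm{a}}, f) + \Phi_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{42}
$$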
Let

$$
\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H}\Phi_7(f), \\
\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\Phi_6(n, n_{\mathrm{a}}, f) + \Phi_7(f)\right) + \Phi_6(n, n_{\mathrm{a}}, f)\right), \\
\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_8(n, n_{\mathrm{a}}, f) + \Phi_8(n, n_{\mathrm{a}}, f)\right), \\
\psi_4(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_9(n, n_{\mathrm{a}}) + \Phi_9(n, n_{\mathrm{a}})\right), \\
\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{10}(n, n_{\mathrm{a}}) + \Phi_{10}(n, n_{\mathrm{a}})\right), \\
\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\Phi_5(n, n_{\mathrm{a}}))^{-1}\Phi_{11}(n, n_{\mathrm{a}}, f) + \Phi_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{43}
$$
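Then (42) is transformed into

$$
\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, f)X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
& + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{44}
$$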
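Substituting (40) into (33) gives

$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \Bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)\bigr)+f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\Bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ \Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)+f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}})+n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\bigl(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q}\bigr)\Bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{45}
$$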
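The hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core is given by

$$
\begin{aligned}
T_{\mathrm{hot,ina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}}
={}& \mathbf{H}\Bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)\bigr)+f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\Bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)+f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}})+n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\bigl(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q}\bigr)\Bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f)\Bigr).
\end{aligned}
\tag{46}
$$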
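Let

$$
\begin{aligned}
\xi_1(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\Bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)\bigr)+f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\Bigr), \\
\xi_2(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)+f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr), \\
\xi_3(\mathbf{H},n,n_{\mathrm{a}}) &= \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}})+n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\Bigr), \\
\xi_4(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\bigl(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q}\bigr)\Bigr), \\
\xi_5(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\Bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\bigr)^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f)\Bigr).
\end{aligned}
\tag{47}
$$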
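Then (46) is transformed into

$$
\begin{aligned}
T_{\mathrm{hot,s}} ={}& \xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H},n,n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{lea,a}} \\
&+ \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H},n,n_{\mathrm{a}},f),
\end{aligned}
\tag{48}
$$

where $T_{\mathrm{hot,s}}$ denotes the same quantity as the inactive-core hotspot temperature $T_{\mathrm{hot,ina}}$ in (46).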
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a},i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{\mathrm{a},i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a},i}$ and $X_{\mathrm{a},j}$ are independent for all $1 \le i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a},i}$ and $X_{\mathrm{a},j}$ follow the same normal distribution. Then there exist functions $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hot,a},i}$ of the active core follows the normal distribution with mean $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

$$
T_{\mathrm{hot,a},i} \sim N\bigl(\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{49}
$$

There exist functions $\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

$$
T_{\mathrm{hot,ina}} \sim N\bigl(\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{50}
$$
Proof. According to (29), we get

$$
\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \bigl(\psi_1(\mathbf{H},f)+\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a},i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{\substack{k=1 \\ k\neq i}}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H},n,n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{lea,a}} \\
&+ \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{51}
$$
The normally distributed random variables $X_{\mathrm{a},i}$ and $X_{\mathrm{a},j}$ are independent for all $1 \le i, j \le m$, so the linear combination $\bigl(\psi_1(\mathbf{H},f)+\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a},i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,\,k\neq i}^{n_{\mathrm{a}}}X_{\mathrm{a},k}$ of the IPCs still follows the normal distribution.

All elements $E_{\mathrm{dyn},k}$ of the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$ of all elements of $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H},n,n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{lea,a}}$ and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{lea,ina}}$ also follow the normal distribution. The terms $\bigl(\psi_1(\mathbf{H},f)+\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a},i}$, $\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,\,k\neq i}^{n_{\mathrm{a}}}X_{\mathrm{a},k}$, $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H},n,n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{lea,a}}$, and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{lea,ina}}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot,a},i}$ follows the normal distribution, where the mean $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f)+n_{\mathrm{a}}\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)\mu_X + \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{d}} \\
&+ \psi_4(\mathbf{H},n,n_{\mathrm{a}})\,\mu_{E\mathrm{a}} + \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{ina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{52}
$$

and the variance $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f)+\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)^2\sigma_X^2 + (n_{\mathrm{a}}-1)\,\psi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_X^2 \\
&+ \psi_3^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{d}}^2 + \psi_4^2(\mathbf{H},n,n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2 + \psi_5^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{53}
$$
In the same way, $\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k}$, $\xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H},n,n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{lea,a}}$, and $\xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{lea,ina}}$ in (30) all follow the normal distribution and are mutually independent, so the inactive-core hotspot temperature $T_{\mathrm{hot,s}}$ (written $T_{\mathrm{hot,ina}}$ in the theorem statement) follows the normal distribution, where the mean $\mu_{T\mathrm{s}}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{s}}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\,\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_X + \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{d}} + \xi_3(\mathbf{H},n,n_{\mathrm{a}})\,\mu_{E\mathrm{a}} \\
&+ \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{ina}} + \xi_5(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{54}
$$

and the variance $\sigma_{T\mathrm{s}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{s}}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\,\xi_1^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_X^2 + \xi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{d}}^2 \\
&+ \xi_3^2(\mathbf{H},n,n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2 + \xi_4^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{55}
$$
According to Theorem 4, the hotspot temperature of any core follows the normal probabilistic distribution.
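As a numerical illustration of (52) and (53), the following Python sketch evaluates the mean and variance of the active-core hotspot temperature and checks them against a Monte Carlo simulation of (44). It is only a sketch under simplifying assumptions: the coefficients $\psi_1, \ldots, \psi_6$ and the error terms are treated as scalars, and the function names and every numerical value are hypothetical placeholders, not values from the experiments.

```python
import random
import statistics


def active_core_moments(psi, n_a, mu_x, var_x, mu_e, var_e):
    """Mean and variance of the active-core hotspot temperature, after (52)-(53).

    psi   : (psi1, ..., psi6), treated as scalars here (assumption)
    mu_e  : means of (E_dyn, E_lea_a, E_lea_ina)
    var_e : variances of (E_dyn, E_lea_a, E_lea_ina)
    """
    p1, p2, p3, p4, p5, p6 = psi
    mean = ((p1 + n_a * p2) * mu_x
            + p3 * mu_e[0] + p4 * mu_e[1] + p5 * mu_e[2] + p6)
    var = ((p1 + p2) ** 2 * var_x
           + (n_a - 1) * p2 ** 2 * var_x
           + p3 ** 2 * var_e[0] + p4 ** 2 * var_e[1] + p5 ** 2 * var_e[2])
    return mean, var


def simulate_active_core(psi, n_a, mu_x, var_x, mu_e, var_e, samples, seed=0):
    """Sample the linear form (44) directly and return the sample mean and variance."""
    rng = random.Random(seed)
    p1, p2, p3, p4, p5, p6 = psi
    temps = []
    for _ in range(samples):
        # Independent, identically distributed IPCs of the n_a active cores.
        ipcs = [rng.gauss(mu_x, var_x ** 0.5) for _ in range(n_a)]
        # Independent regression residuals (scalar stand-ins for the error vectors).
        errs = [rng.gauss(m, v ** 0.5) for m, v in zip(mu_e, var_e)]
        temps.append(p1 * ipcs[0] + p2 * sum(ipcs)
                     + p3 * errs[0] + p4 * errs[1] + p5 * errs[2] + p6)
    return statistics.fmean(temps), statistics.pvariance(temps)


if __name__ == "__main__":
    psi = (10.0, 2.0, 1.0, 1.0, 1.0, 40.0)   # hypothetical coefficients
    args = (psi, 8, 1.5, 0.04, (0.0, 0.0, 0.0), (0.25, 0.01, 0.01))
    print("analytic :", active_core_moments(*args))
    print("simulated:", simulate_active_core(*args, samples=20000))
```

The term $(n_{\mathrm{a}}-1)\psi_2^2\sigma_X^2$ in (53) appears in the code because core $i$ contributes to both the $\psi_1$ term and the sum, so its coefficient is $\psi_1+\psi_2$ while every other core has coefficient $\psi_2$.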
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to the value that makes the hotspot temperature equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set makes the hotspot temperature exactly equal to the threshold.
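With discrete frequencies, one natural reading of the zero-slack policy is to run at the highest frequency whose hotspot temperature does not exceed the threshold. The short Python sketch below illustrates that reading only; the function name `select_frequency`, the fallback to the lowest frequency when no frequency qualifies, and the temperature values are illustrative assumptions rather than part of the original method.

```python
def select_frequency(temps_by_freq, threshold):
    """Pick the highest frequency whose hotspot temperature is <= threshold.

    temps_by_freq : dict mapping normalized frequency -> hotspot temperature (deg C)
    threshold     : thermal threshold (deg C)
    If no frequency qualifies, fall back to the lowest one (assumption).
    """
    feasible = [f for f, t in temps_by_freq.items() if t <= threshold]
    return max(feasible) if feasible else min(temps_by_freq)


if __name__ == "__main__":
    # Hypothetical hotspot temperatures for the four normalized frequencies.
    temps = {0.5: 70.0, 0.67: 85.0, 0.83: 97.0, 1.0: 104.0}
    print(select_frequency(temps, threshold=100.0))
```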
Theorem 5. Let $F\mathrm{Set} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{fcount}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{\mathrm{fcount}}$; then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{fcount}}, \\
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{fcount}},
\end{cases}
\tag{56}
$$

where $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi)$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$ representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{\mathrm{fcount}}$, according to the zero-slack DTM policy, obviously

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}.
\tag{57}
$$

When $f_\pi < f_{\mathrm{fcount}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) If $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0.
\tag{58}
$$

(b) If $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
$$

The right-hand side of (59) can be derived by

$$
\begin{aligned}
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
$$

According to (59) and (60), we get

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{61}
$$

Hence, when $f_\pi < f_{\mathrm{fcount}}$, according to (58) and (61), we get

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{62}
$$
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained based on the assumption that the zero-slack policy is used.
Given the probabilistic distribution of the frequency and the mean of the hotspot temperature of the active core for a certain frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by

$$
T_{\mathrm{averhota}} = \sum_{f \in F\mathrm{Set}} \mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f) \cdot P(f),
\tag{63}
$$

where $F\mathrm{Set} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{fcount}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ denotes the mean of the hotspot temperature of the active core given the frequency $f$.
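To make (56) and (63) concrete, the following Python sketch computes the frequency distribution and the average hotspot temperature from per-frequency normal distributions. The means and standard deviations are the values read from Table 1 for eight active cores and the threshold is 100 °C; the function names are hypothetical, and the sketch is an illustration of the formulas rather than the authors' implementation.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{T <= x} for T ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def frequency_distribution(freqs, mu, sigma, threshold):
    """Apply (56). freqs must be ascending; mu[i], sigma[i] belong to freqs[i]."""
    below = [normal_cdf(threshold, m, s) for m, s in zip(mu, sigma)]
    probs = {}
    for i, f in enumerate(freqs):
        if i == len(freqs) - 1:
            probs[f] = below[i]                  # highest frequency
        else:
            probs[f] = below[i] - below[i + 1]   # lower frequencies
    return probs


def average_hotspot_temperature(freqs, mu, probs):
    """Apply (63): probability-weighted mean hotspot temperature."""
    return sum(m * probs[f] for f, m in zip(freqs, mu))


if __name__ == "__main__":
    freqs = [0.5, 0.67, 0.83, 1.0]
    mu = [66.69, 78.46, 89.54, 101.31]     # Table 1, eight active cores
    sigma = [4.48, 6.00, 7.43, 8.95]
    probs = frequency_distribution(freqs, mu, sigma, threshold=100.0)
    print(probs)
    print(average_hotspot_temperature(freqs, mu, probs))
```

With these inputs the sketch gives a probability of roughly 0.44 for running at the maximum frequency and an average hotspot temperature close to the 93.85 °C listed in Table 3. Note that the probabilities in (56) sum to $P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_1) \le T_{\mathrm{valve}}\}$, which is practically 1 here.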
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, the active state and the inactive state, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies, that is, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector H, which corresponds to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C.
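The normalization and the hotspot selection vector described above can be sketched in a few lines of Python. The function names and the number of functional units passed to `hotspot_selection_vector` are hypothetical; only the four frequencies and the position of the branch predictor (second element) come from the text.

```python
def normalize_frequencies(freqs_ghz):
    """Scale frequencies so that the maximum becomes 1 (rounded to two decimals)."""
    f_max = max(freqs_ghz)
    return [round(f / f_max, 2) for f in freqs_ghz]


def hotspot_selection_vector(n_units, hotspot_index):
    """One-hot vector H that selects the hotspot unit's temperature."""
    h = [0] * n_units
    h[hotspot_index] = 1
    return h


if __name__ == "__main__":
    print(normalize_frequencies([1.5, 2.0, 2.5, 3.0]))
    # The unit count below is an assumed value; index 1 is the second element.
    print(hotspot_selection_vector(n_units=18, hotspot_index=1))
```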
HotSpot is used as the thermal model of the multicore processor, and the parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a},i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27] are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{E\mathrm{d}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]

$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta,
\tag{64}
$$

where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{lea,a}}$ and $\mathbf{G}_{l0,\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0,\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E\mathrm{a}}^2$ and $\sigma_{E\mathrm{ina}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{lea,a}}$ and $\mathbf{E}_{\mathrm{lea,ina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature interval.
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and the estimated values of leakage power for active cores, and Figure 4 shows the same comparison for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. It can be seen that the error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers of the various functional units are different, and a lower error rate results from a higher correlation. Obviously, the IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea,a}}$, and $\mathbf{E}_{\mathrm{lea,ina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores (actual and estimated leakage power in W versus temperature in °C).

Figure 4: Leakage power estimation of hotspot for inactive cores (actual and estimated leakage power in units of 10^-4 W versus temperature in °C).

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

| Number of active cores | Frequency of 0.5, μ (°C) | Frequency of 0.5, σ | Frequency of 0.67, μ (°C) | Frequency of 0.67, σ | Frequency of 0.83, μ (°C) | Frequency of 0.83, σ | Frequency of 1, μ (°C) | Frequency of 1, σ |
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

| Number of active cores | Frequency of 0.5, μ (°C) | Frequency of 0.5, σ | Frequency of 0.67, μ (°C) | Frequency of 0.67, σ | Frequency of 0.83, μ (°C) | Frequency of 0.83, σ | Frequency of 1, μ (°C) | Frequency of 1, σ |
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |

Figure 5: Error rate of the power estimation (error rate in % of the dynamic power, the leakage power for active cores, and the leakage power for inactive cores, for the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg).
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. According to the characteristics of the normal distribution curve, the shape of the probability density curve reflects the variation of the data distribution, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data distribution, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
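The relation between peak height and standard deviation follows from the normal density, whose maximum is $1/(\sigma\sqrt{2\pi})$. The short Python sketch below evaluates this peak for the standard deviations read from Table 1 for eight active cores; the function name is a hypothetical placeholder.

```python
import math


def normal_pdf_peak(sigma):
    """Maximum of the N(mu, sigma^2) density, attained at the mean."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    # Standard deviations from Table 1, eight active cores.
    sigmas = {0.5: 4.48, 0.67: 6.00, 0.83: 7.43, 1.0: 8.95}
    for freq, sigma in sigmas.items():
        print(freq, normal_pdf_peak(sigma))
```

The peaks decrease as the frequency rises, which matches the ordering of the curves described for Figure 6.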
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and the variation of the hotspot temperature of inactive cores are lower than those of active cores for the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency less than 1 implies that the hotspot temperature surpassed the threshold and the DFS technique was triggered to reduce the frequency of the processor. So the probability of triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores, plotted against temperature (°C) for frequencies of 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight.
If a core is powered off and made inactive using the power-gating technique, it only dissipates leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores of the processor to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
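Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |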
Figure 7: The probability density of the hotspot temperature for inactive cores, plotted against temperature (°C) for frequencies of 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven.
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability of triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability of triggering DFS is 0%.
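The probability of triggering DFS equals the probability that the hotspot temperature at the maximum frequency exceeds the threshold, $1 - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, 1) \le T_{\mathrm{valve}}\}$. The Python sketch below evaluates it from the means and standard deviations read from the "Frequency of 1" columns of Table 1 with a 100 °C threshold; the function names are hypothetical, and the sketch only illustrates the calculation.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{T <= x} for T ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def dfs_trigger_probability(mu_full, sigma_full, threshold):
    """Probability that the hotspot temperature at f = 1 exceeds the threshold."""
    return 1.0 - normal_cdf(threshold, mu_full, sigma_full)


if __name__ == "__main__":
    # (mean, standard deviation) at the maximum frequency, from Table 1.
    table1_full_speed = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}
    for n_active, (mu, sigma) in table1_full_speed.items():
        p = dfs_trigger_probability(mu, sigma, threshold=100.0)
        print(n_active, round(100.0 * p, 2))
```

The resulting percentages agree closely with the probabilities reported in this section and in Table 3, up to rounding of the tabulated means and standard deviations.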
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed; that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature can exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.

Figure 8: Probabilistic distribution of frequencies (probability of frequency in % for frequencies of 1, 0.83, 0.67, and 0.5 when the number of active cores is 6, 7, and 8).

Figure 9: Probability for triggering DFS (in %) when the number of active cores is 6, 7, and 8.
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of DFS on the processor weakens until it disappears. Therefore, the average hotspot temperatures of processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even if DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0 and the average hotspot temperatures of the processor with and without DFS are the same. Since there is no difference between the cases with and without DFS when the number of active cores is lower than 6, the comparisons for this situation are not given in Table 3.
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows the normal probabilistic distribution based on the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.

From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers of the various functional units are different; (2) the estimation error rates of leakage powers for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across the functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
Competing Interests
The authors declare no competing interests.

Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
accuracy. In fact, the temperature of multicore processors varies greatly between different workloads [10, 17]. Our preliminary work has demonstrated that the dynamic power of processors is highly correlated with IPC (instructions per cycle) and that, within a small temperature range, leakage power depends linearly on temperature [18]. According to the HotSpot thermal model, temperature can be derived given the power of processors [3]. Thus, processor temperature is highly correlated with IPC. According to the CLT (central limit theorem), when a large number of instructions are executed in the processor, the probabilistic distribution of IPC tends to follow the normal distribution. Accordingly, given both the probabilistic distribution of IPC and the relationship between temperature and IPC, the probabilistic distribution of processor temperature can be derived and analyzed. Subsequently, the probabilistic distribution of the maximum running frequency can be inferred, given that the processor uses the zero-slack DTM policy, under which the speed of the processor is set to the value at which the hotspot temperature equals the maximum threshold allowed by the processor [7].
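The step from a normally distributed IPC to a normally distributed temperature relies on a standard fact: a linear function of a normal random variable is again normal. The following minimal Python sketch illustrates this for a single scalar relationship. The slope, offset, and IPC parameters are hypothetical values chosen for illustration only; they are not taken from the experiments in this paper.

```python
import math
import random


def linear_normal(mu_x, sigma_x, slope, offset):
    """Return (mean, std) of T = slope * X + offset for X ~ N(mu_x, sigma_x^2)."""
    return slope * mu_x + offset, abs(slope) * sigma_x


# Hypothetical values, assumed only for this illustration.
MU_IPC, SIGMA_IPC = 1.2, 0.3    # assumed IPC distribution
SLOPE, OFFSET = 15.0, 55.0      # assumed temperature-versus-IPC line (deg C)

mu_t, sigma_t = linear_normal(MU_IPC, SIGMA_IPC, SLOPE, OFFSET)

# Monte Carlo cross-check of the analytic mean and standard deviation.
rng = random.Random(0)
samples = [SLOPE * rng.gauss(MU_IPC, SIGMA_IPC) + OFFSET for _ in range(100_000)]
mc_mean = sum(samples) / len(samples)
mc_std = math.sqrt(sum((s - mc_mean) ** 2 for s in samples) / (len(samples) - 1))

print(f"analytic: mean={mu_t:.2f} std={sigma_t:.2f}")
print(f"sampled:  mean={mc_mean:.2f} std={mc_std:.2f}")
```

The sampled statistics should agree closely with the analytic ones, which is why a linear temperature model lets the temperature distribution be written down directly instead of being estimated by simulation.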
1.2. Contributions. In this paper, a probabilistic method is proposed to analyze the steady-state temperature and frequency of multicore processors, taking into account the variation of workloads. To simplify the analysis, the DFS technique rather than the DVFS technique is adopted to manage processor temperature, so the voltage is constant and only the frequency is adjusted. The dynamic power can therefore be modeled as a linear function of the frequency [12, 16]. The main contributions of this work are as follows:
(i) At the microarchitecture level, the dynamic power of processors is modeled as a linear function of IPC and running frequency, and the leakage power of the processor is approximated by a linear model of temperature.
(ii) The microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as linear functions of the IPCs of all active cores.
(iii) It is inferred that the hotspot temperatures of both active cores and inactive cores follow the normal probabilistic distribution, based on the assumption that the IPCs of all active cores follow the same normal distribution.
(iv) The probabilistic distribution of the set of frequencies is determined given the zero-slack DTM policy [7].
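To make contribution (iv) more concrete, the sketch below shows one way a normal hotspot-temperature distribution could be turned into a distribution over a discrete frequency set. It is an illustration under stated assumptions, not the derivation used in this paper: the functions `mean_temp` and `std_temp`, the frequency set, and the threshold are all hypothetical, and the sketch assumes that a workload that is thermally safe at a frequency is also safe at every lower frequency.

```python
import math


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def max_frequency_distribution(freqs, mean_temp, std_temp, t_max):
    """Return ({f: P(max frequency == f)}, P(no listed frequency is safe)).

    freqs must be sorted in ascending order. Assumes (hypothetically) that
    the hotspot temperature at frequency f is N(mean_temp(f), std_temp(f)^2)
    and that a workload safe at f is also safe at every lower frequency.
    """
    safe = [normal_cdf(t_max, mean_temp(f), std_temp(f)) for f in freqs]
    dist = {}
    for k, f in enumerate(freqs):
        safe_at_next = safe[k + 1] if k + 1 < len(freqs) else 0.0
        dist[f] = max(safe[k] - safe_at_next, 0.0)
    return dist, 1.0 - safe[0]


# Hypothetical normalized frequencies and linear temperature statistics.
FREQS = [0.6, 0.8, 1.0]
T_MAX = 85.0  # assumed thermal threshold in deg C

dist, p_none = max_frequency_distribution(
    FREQS,
    mean_temp=lambda f: 40.0 + 50.0 * f,
    std_temp=lambda f: 5.0 * f,
    t_max=T_MAX,
)
p_dfs = 1.0 - dist[1.0]  # probability that the top frequency cannot be held

for f in FREQS:
    print(f"P(max frequency = {f}) = {dist[f]:.4f}")
print(f"P(DFS is triggered) = {p_dfs:.4f}")
```

Under these assumptions, the probability of triggering DFS is simply the probability that the hotspot temperature at the highest frequency exceeds the threshold.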
The remainder of this paper is organized as follows. In Section 2, related work is overviewed. In Section 3, the microarchitecture-level steady-state temperature of a core is formulated as a linear function of its powers based on the HotSpot thermal model. In Section 4, at the microarchitecture level, the dynamic powers of processors are modeled as a linear function of IPC and the running frequency, and the leakage powers are approximated by a linear model of temperature. In Section 5, the microarchitecture-level hotspot temperatures of both active cores and inactive cores are derived as linear functions of the IPCs of all active cores. It is inferred that the hotspot temperatures of both active cores and inactive cores follow the normal probabilistic distribution, based on the assumption that the IPCs of all active cores follow the same normal probabilistic distribution. In Section 6, the probabilistic distribution of the set of frequencies is determined given the zero-slack DTM policy. In Section 7, experimental results are presented. This paper is concluded in Section 8.
2. Related Work
Approaches for estimating the temperature and performance of processors can be classified into transient analysis and steady-state analysis; so far, most research has been based on transient analysis.
2.1. Transient Analysis. In order to analyze the temperature transiently and explore the design space of processors at an early stage, several thermal models for processors have been proposed. Skadron et al. and Huang et al. [2, 3] presented a compact thermal modeling methodology based on the analogy between thermal and electrical phenomena, namely, HotSpot. Using HotSpot, the spatial and temporal variations of processor temperature can be obtained through transient analysis. To improve the accuracy of thermal simulation, Jang et al. [11] extended the HotSpot thermal model by taking into account the different ambient temperatures caused by workload variations. To accelerate thermal analysis of multicore processors at the architecture level, Wang et al. [5] presented a composite thermal model termed ThermComp to optimize the model for different large processors. Li et al. [4] proposed a parameterized architecture-level dynamic thermal model, namely, ParThermPOF, in which many parameters can be set, such as the location of thermal sensors and the conductivity of different components.
In order to improve the performance of thermal-aware multicore processors, various DTM techniques have been investigated based on thermal models such as HotSpot and analyzed transiently to estimate performance. Hanumaiah et al. [7, 19] presented an online thermal management algorithm for thermal-aware multicore processors in which DVFS and task allocation techniques are adopted simultaneously. In the context of hard real-time systems, the time-varying voltage and frequency of multicores are computed to satisfy not only the thermal constraint but also the deadline constraint [6]. Shi et al. [12] presented a DTM policy under a soft thermal constraint, in which the temperature constraint can occasionally be exceeded.
Several studies have aimed to simulate the thermal behavior of multicore processors quickly and accurately. Wojciechowski et al. [10, 20] analyzed the transient characteristics of workloads based on a finite Fourier series expansion to accurately predict the thermal behavior of multicore processors and presented a new DVFS approach. Liu et al. [13] proposed a transient temperature analysis method for multicore processors based on moment matching and used it to guide task migration. To account for the nondeterministic behavior of tasks in terms of execution times and decision branches, Das et al. [14] formulated thermal analysis as a hybrid automata reachability verification problem and provided an algorithm for constructing the automata.
In transient analysis, the temporal variations of temperature and performance that depend on the workload are traced, and high estimation accuracy is therefore obtained. However, transient analysis is time-consuming; for multicore processors in particular, its time complexity is unacceptable at the early design stage.
2.2. Steady-State Analysis. In order to speed up the estimation of temperature and performance at the early design stage of multicore processors, researchers have resorted to steady-state analysis and carried out a considerable amount of work. Based on Amdahl's Law, Lee and Kim [15, 21] introduced variations of process and workload parallelism into the analysis model and optimized the throughput of thermal-aware multicore processors by exploiting DVFS and per-core power-gating (PCPG). Based on HotSpot, Rao et al. [16, 22] described an approximate thermal model for homogeneous multicore processors to predict the maximum steady-state throughput under thermal constraints quickly and accurately. In the context of a hard real-time system on a single-core processor, Mohaqeqi et al. [23] studied the stochastic behavior of the system, for example, performance, temperature, and reliability, based on a Markovian view.
To the best of our knowledge, all previous work on steady-state analysis is based on the assumption that every workload makes the same thermal contribution to the processor, which results in inaccurate temperature and performance estimation. This is the focus of our work. In this paper, the variation of workloads is taken into account to model the temperature and frequency more accurately.
3. Thermal Model
In this paper, a microarchitecture-level thermal model for a multicore processor is created by replication of a single-core processor based on HotSpot [3]. The multicore processor is divided into four layers, that is, chip, thermal interface material (TIM), heat spreader, and heat sink. There are $m$ thermal blocks in the chip and TIM, and there are five and nine thermal blocks in the heat spreader and heat sink, respectively. In total, there are $N = nm + 14$ thermal blocks in the multicore processor, where $n$ is the number of cores.
The microarchitecture-level thermal model can be represented by the following state-space differential equation [16]:
$$\frac{d\mathbf{T}}{dt} = \mathbf{A}\mathbf{T} + \mathbf{B}\mathbf{P}, \tag{1}$$
where $\mathbf{T}$ and $\mathbf{P}$ are $N$-dimension vectors denoting, respectively, the temperature and power of the multicore processor, and $\mathbf{A}$ and $\mathbf{B}$ are constant matrices of dimension $N \times N$ that depend on the thermal conduction and capacitance of the processor.
Figure 1: Thermal conduction matrix $\mathbf{G}$ of dual-core processor [16].
When the temperature of the processor is in the steady state, $d\mathbf{T}/dt = 0$; then
$$\mathbf{G}\mathbf{T} = \mathbf{P}, \tag{2}$$
where $\mathbf{G} = -\mathbf{B}^{-1}\mathbf{A}$ is the thermal conductance matrix of the HotSpot model.
The thermal conductance matrix $\mathbf{G}$ can be obtained from the HotSpot simulation tool. The thermal conductance matrix of a dual-core processor is shown in Figure 1, where the submatrices $\mathbf{G}_{\text{die}}$, $\mathbf{G}_{\text{int}}$, and $\mathbf{G}_{\text{pkg}}$ along the diagonal are the lateral thermal conductances of the chip, TIM, and package, respectively; the submatrices $\mathbf{G}_{\text{die-int}}$ and $\mathbf{G}_{\text{int-die}}$ are the vertical conductances between the die and the TIM; and the vectors $\mathbf{G}_{\text{int-spr}}$ and $\mathbf{G}'_{\text{spr-int}}$ are the vertical conductances between the TIM and the spreader. $\mathbf{G}_{\text{die-int}}$ is equal to $\mathbf{G}_{\text{int-die}}$, and $\mathbf{G}_{\text{int-spr}}$ is equal to $\mathbf{G}'_{\text{spr-int}}$. The detailed conductance matrix $\mathbf{G}_{\text{pkg}}$ is shown in Figure 2.
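As a small illustration of the steady-state relation (2), the following Python sketch solves $\mathbf{G}\mathbf{T} = \mathbf{P}$ for a toy three-node chain (die, spreader, sink). The conductance values and the power vector are hypothetical, the temperatures are expressed as a rise over ambient (an assumption made here to keep the toy system small), and the matrix is not the HotSpot matrix used in this paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical conductances in W/K: die-spreader 2.0, spreader-sink 4.0,
# sink-ambient 1.0. Rows and columns are ordered die, spreader, sink.
G = [
    [2.0, -2.0, 0.0],
    [-2.0, 6.0, -4.0],
    [0.0, -4.0, 5.0],
]
P = [10.0, 0.0, 0.0]  # 10 W dissipated in the die block only (assumed)

T = solve(G, P)  # steady-state temperature rise over ambient, in K
print(T)
```

Because all 10 W must leave through the sink-to-ambient conductance, the sink rise is 10 K, and each upstream node adds the drop across its own conductance.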
Lemma 1. According to the HotSpot thermal model, when the temperature of the multicore processor is in the steady state, there exist $m$-dimension matrices $\mathbf{R}$, $\mathbf{Q}$, and $\mathbf{Z}$ such that
$$\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}, \tag{3}$$
where $\mathbf{T}_i$ and $\mathbf{P}_i$ are $m$-dimension vectors representing, respectively, the temperature and power of the $i$th core of the processor.
Proof. According to the arrangement of elements in the conductance matrix $\mathbf{G}$ in the HotSpot thermal model, (2) can be decomposed into the following equations:
$$\mathbf{G}_{\text{die}}\mathbf{T}_i + \mathbf{G}_{\text{die-int}}\mathbf{T}_{\text{int},i} = \mathbf{P}_i, \tag{4}$$
$$\mathbf{G}_{\text{die-int}}\mathbf{T}_i + \mathbf{G}_{\text{int}}\mathbf{T}_{\text{int},i} + \mathbf{G}_{\text{int-spr}}T_{\text{spr}} = 0, \tag{5}$$
Figure 2: Thermal conduction matrix of the package [16].
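$$\mathbf{G}'_{\text{int-spr}}\sum_{j=1}^{n}\mathbf{T}_{\text{int},j} + G_{\text{spr}}T_{\text{spr}} + \mathbf{G}'_{\text{spr-other}}\mathbf{T}_{\text{other}} = 0, \tag{6}$$
$$\mathbf{G}_{\text{spr-other}}T_{\text{spr}} + \mathbf{G}_{\text{other-other}}\mathbf{T}_{\text{other}} = \mathbf{P}_{\text{ambient}}, \tag{7}$$
where $\mathbf{T}_{\text{int},i}$ is an $m$-dimension vector representing the temperature of the TIM layer of the $i$th core, $T_{\text{spr}}$ is a scalar representing the temperature of the central block of the spreader, and $\mathbf{T}_{\text{other}}$ is a 13-dimension vector representing the temperature of the other blocks of the spreader and the sink.
According to (6) and (7),
$$T_{\text{spr}} = \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1} \cdot \left(\mathbf{G}'_{\text{int-spr}}\sum_{i=1}^{n}\mathbf{T}_{\text{int},i} + \mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{P}_{\text{ambient}}\right). \tag{8}$$
Set
$$\begin{aligned}
\mathbf{G}_{\text{SPRO}} &= \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1}\mathbf{G}_{\text{int-spr}}, \\
G_{\text{spamb}} &= \left(\mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1} \cdot \mathbf{G}'_{\text{spr-other}}\mathbf{G}^{-1}_{\text{other-other}}\mathbf{P}_{\text{ambient}}.
\end{aligned} \tag{9}$$
Equation (8) can be converted into
$$T_{\text{spr}} = \mathbf{G}'_{\text{SPRO}}\sum_{i=1}^{n}\mathbf{T}_{\text{int},i} + G_{\text{spamb}}. \tag{10}$$
Substitute (10) into (5) and get
$$\mathbf{G}_{\text{die-int}}\mathbf{T}_i + \mathbf{G}_{\text{int}}\mathbf{T}_{\text{int},i} + \mathbf{G}_{\text{int-spr}}\left(\mathbf{G}'_{\text{SPRO}}\sum_{j=1}^{n}\mathbf{T}_{\text{int},j} + G_{\text{spamb}}\right) = 0. \tag{11}$$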
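According to (4) and (11), get
$$\begin{aligned}
&\left(\mathbf{G}_{\text{die-int}} - \mathbf{G}_{\text{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\text{die}}\right)\mathbf{T}_i - \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\text{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\text{die}}\sum_{j=1}^{n}\mathbf{T}_j \\
&\quad + \mathbf{G}_{\text{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{P}_i + \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\text{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_{\text{int-spr}}G_{\text{spamb}} = 0.
\end{aligned} \tag{12}$$
Set
$$\begin{aligned}
\mathbf{G}_1 &= \mathbf{G}_{\text{die-int}} - \mathbf{G}_{\text{int}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\text{die}}, \\
\mathbf{G}_2 &= \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\text{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}\mathbf{G}_{\text{die}}, \\
\mathbf{G}_3 &= \mathbf{G}_{\text{int}}\mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_4 &= \mathbf{G}_{\text{int-spr}}\mathbf{G}'_{\text{SPRO}}\mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_5 &= \mathbf{G}_{\text{int-spr}}G_{\text{spamb}}.
\end{aligned} \tag{13}$$
Then (12) can be transformed into
$$\mathbf{G}_1\mathbf{T}_i - \mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\mathbf{P}_i + \mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_5 = 0. \tag{14}$$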
According to (14), summing over all $n$ cores, get
$$\mathbf{G}_1\sum_{j=1}^{n}\mathbf{T}_j - n\mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_5 = 0. \tag{15}$$
According to (15), get
$$\sum_{j=1}^{n}\mathbf{T}_j = (n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4)\sum_{j=1}^{n}\mathbf{P}_j + n(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}\mathbf{G}_5. \tag{16}$$
Substitute (16) into (14) and get
$$\begin{aligned}
\mathbf{T}_i ={}& \mathbf{G}_1^{-1}\left(\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right)\sum_{j=1}^{n}\mathbf{P}_j - \mathbf{G}_1^{-1}\mathbf{G}_3\mathbf{P}_i \\
&+ \mathbf{G}_1^{-1}\left(n\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right)\mathbf{G}_5.
\end{aligned} \tag{17}$$
Set
$$\begin{aligned}
\mathbf{R} &= -\mathbf{G}_1^{-1}\mathbf{G}_3, \\
\mathbf{Q} &= \mathbf{G}_1^{-1}\left(\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1}(\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right), \\
\mathbf{Z} &= \mathbf{G}_1^{-1}\left(n\mathbf{G}_2(n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right)\mathbf{G}_5.
\end{aligned} \tag{18}$$
Get
$$\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}. \tag{19}$$
This means that, at the microarchitecture level, when the temperature of a multicore processor is in the steady state and the powers of all cores are given, the temperature of any core can be calculated according to Lemma 1.
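The use of Lemma 1 can be sketched in a few lines of Python. The example below evaluates $\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_j \mathbf{P}_j + \mathbf{Z}$ for a processor with $n = 3$ cores and $m = 2$ thermal blocks per core. The entries of `R`, `Q`, and `Z` and the power vectors are hypothetical placeholders; in practice they would be computed from the HotSpot conductance matrix via (18).

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def vadd(*vectors):
    """Element-wise sum of one or more equal-length vectors."""
    return [sum(components) for components in zip(*vectors)]


def core_temperature(i, powers, r, q, z):
    """Evaluate T_i = R P_i + Q * sum_j P_j + Z as in Lemma 1."""
    total_power = vadd(*powers)
    return vadd(matvec(r, powers[i]), matvec(q, total_power), z)


# Hypothetical m = 2 example; values are assumptions for illustration only.
R = [[0.8, 0.1], [0.1, 0.6]]      # self-heating of a core (K/W)
Q = [[0.05, 0.05], [0.05, 0.05]]  # coupling through the shared package (K/W)
Z = [45.0, 45.0]                  # constant term, includes ambient (deg C)

# Power of each block for n = 3 cores: one busy core and two lightly loaded.
POWERS = [[10.0, 5.0], [2.0, 1.0], [2.0, 1.0]]

temps = [core_temperature(i, POWERS, R, Q, Z) for i in range(len(POWERS))]
for i, t in enumerate(temps):
    print(f"core {i}: {t}")
```

Cores with identical power vectors obtain identical temperatures, since the shared term $\mathbf{Q}\sum_j \mathbf{P}_j + \mathbf{Z}$ is the same for every core; Lemma 2 later relies on this structure for inactive cores.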
4. Power Model
It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. The global dynamic frequency scaling (DFS) technique is used, where the frequencies of all cores are scaled uniformly.
4.1. Active Mode. When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as
$$\mathbf{P}_{a,i} = \mathbf{P}_{\text{dyna},i} + \mathbf{P}_{\text{leaa},i}, \tag{20}$$
where $\mathbf{P}_{a,i}$, $\mathbf{P}_{\text{dyna},i}$, and $\mathbf{P}_{\text{leaa},i}$ are, respectively, the total power, dynamic power, and leakage power of the $i$th active core, each of size $m \times 1$.
For a processor using the DVFS technique, the dynamic power is proportional to the product of the square of the voltage and the frequency [7], that is, $P_{\text{dyn}} \propto \text{voltage}^2 \times \text{frequency}$. In this paper, the primary purpose is to analyze the impact of workload variation on the temperatures of processors. To simplify the analysis, the DFS technique rather than the DVFS technique is used to manage processor temperature, so the voltage is constant and only the frequency is adjusted. Hence, the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with IPC.
Let $X_{a,i}$ be the IPC of the $i$th core when workloads are running on it, and let $f$ be the normalized frequency between 0 and 1; then the microarchitecture-level dynamic power $\mathbf{P}_{\text{dyna},i}$ of the $i$th active core is defined as
$$\mathbf{P}_{\text{dyna},i} = \left(\mathbf{G}_{\text{dyn}}X_{a,i} + \mathbf{G}_{d0} + \mathbf{E}_{\text{dyn}}\right)f, \tag{21}$$
where $\mathbf{G}_{\text{dyn}}$ and $\mathbf{G}_{d0}$ are the linear regression coefficients obtained when $f$ is set to its maximum of 1, and $\mathbf{E}_{\text{dyn}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{dyn}} \sim N(0, \boldsymbol{\sigma}^2_{Ed})$. $\mathbf{G}_{\text{dyn}}$, $\mathbf{G}_{d0}$, $\mathbf{E}_{\text{dyn}}$, and $\boldsymbol{\sigma}_{Ed}$ are vectors of size $m \times 1$, and $\boldsymbol{\sigma}^2_{Ed}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ed}$.
At the microarchitecture level, in order to simplify the analysis, the relationship between the leakage power $\mathbf{P}_{\text{leaa},i}$ and the temperature $\mathbf{T}_{a,i}$ of the $i$th active core is approximated by the following linear model:
$$\mathbf{P}_{\text{leaa},i} = \mathbf{G}_{\text{leaa}}\mathbf{T}_{a,i} + \mathbf{G}_{l0a} + \mathbf{E}_{\text{leaa}}, \tag{22}$$
where $\mathbf{G}_{\text{leaa}}$ and $\mathbf{G}_{l0a}$ are the regression coefficients and $\mathbf{E}_{\text{leaa}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{leaa}} \sim N(0, \boldsymbol{\sigma}^2_{Ea})$. $\mathbf{G}_{\text{leaa}}$ is a diagonal matrix of size $m \times m$. $\mathbf{G}_{l0a}$, $\mathbf{E}_{\text{leaa}}$, $\boldsymbol{\mu}_{Ea}$, and $\boldsymbol{\sigma}_{Ea}$ are vectors of size $m \times 1$. $\boldsymbol{\sigma}^2_{Ea}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ea}$.
According to (20), (21), and (22), the power of the active core $\mathbf{P}_{a,i}$ is represented by
$$\mathbf{P}_{a,i} = \left(\mathbf{G}_{\text{dyn}}X_{a,i} + \mathbf{G}_{d0} + \mathbf{E}_{\text{dyn}}\right)f + \mathbf{G}_{\text{leaa}}\mathbf{T}_{a,i} + \mathbf{G}_{l0a} + \mathbf{E}_{\text{leaa}}. \tag{23}$$
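The active-core power model can be sketched as a short Python function that evaluates the sum of (21) and (22) per functional unit. All coefficient values below are hypothetical and serve only to show how the terms combine; the regression residuals default to zero, which corresponds to using the fitted line without noise.

```python
def active_core_power(ipc, freq, temps, g_dyn, g_d0, g_lea_diag, g_l0,
                      e_dyn=None, e_lea=None):
    """Per-unit power of an active core: dynamic part (21) plus leakage (22).

    g_lea_diag holds the diagonal of the leakage coefficient matrix, so the
    matrix-vector product reduces to an element-wise multiplication.
    """
    m = len(temps)
    e_dyn = e_dyn if e_dyn is not None else [0.0] * m
    e_lea = e_lea if e_lea is not None else [0.0] * m
    power = []
    for u in range(m):
        dynamic = (g_dyn[u] * ipc + g_d0[u] + e_dyn[u]) * freq
        leakage = g_lea_diag[u] * temps[u] + g_l0[u] + e_lea[u]
        power.append(dynamic + leakage)
    return power


# Hypothetical coefficients for m = 2 functional units (not from the paper).
G_DYN = [4.0, 2.0]        # W per unit of IPC at f = 1
G_D0 = [1.0, 0.5]         # W at f = 1
G_LEA_DIAG = [0.02, 0.01] # W per deg C
G_L0 = [-0.5, -0.2]       # W

power = active_core_power(ipc=1.5, freq=0.8, temps=[80.0, 70.0],
                          g_dyn=G_DYN, g_d0=G_D0,
                          g_lea_diag=G_LEA_DIAG, g_l0=G_L0)
print(power)
```

Note that only the dynamic term scales with the normalized frequency `freq`, while the leakage term depends on temperature alone, which mirrors the constant-voltage DFS assumption.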
4.2. Inactive Mode. When a core is in the inactive mode, it is powered off using the power-gating technique. The inactive core dissipates only leakage power, which is much lower than that of the active core. Hence, the power of the $i$th inactive core $\mathbf{P}_{\text{ina},i}$ is the leakage power $\mathbf{P}_{\text{leaina},i}$, which depends on the core's temperature $\mathbf{T}_{\text{ina},i}$, and it can be approximated at the microarchitecture level by the following linear model:
$$\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{leaina},i} = \mathbf{G}_{\text{leaina}}\mathbf{T}_{\text{ina},i} + \mathbf{G}_{l0\text{ina}} + \mathbf{E}_{\text{leaina}}, \tag{24}$$
where $\mathbf{G}_{\text{leaina}}$ and $\mathbf{G}_{l0\text{ina}}$ are regression coefficients and $\mathbf{E}_{\text{leaina}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{leaina}} \sim N(0, \boldsymbol{\sigma}^2_{E\text{ina}})$. $\mathbf{G}_{\text{leaina}}$ is a diagonal matrix of size $m \times m$. $\mathbf{G}_{l0\text{ina}}$, $\mathbf{E}_{\text{leaina}}$, and $\boldsymbol{\sigma}_{E\text{ina}}$ are vectors of size $m \times 1$. $\boldsymbol{\sigma}^2_{E\text{ina}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\text{ina}}$.
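The trade-off behind a linear leakage model can be illustrated with a small least-squares fit. The Python sketch below fits a line to a hypothetical exponential leakage curve over the interval from 60°C to 100°C and compares the relative error inside and outside that interval. The exponential curve and its constants are assumptions made for this illustration and do not come from the measurements in this paper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_leakage(temp_c):
    """Hypothetical exponential leakage curve in watts (an assumption)."""
    return 0.5 * math.exp(0.02 * (temp_c - 60.0))


def relative_error(temp_c, slope, intercept):
    """Relative error of the fitted line against the assumed true curve."""
    exact = true_leakage(temp_c)
    return abs(slope * temp_c + intercept - exact) / exact


# Fit only on the interval of interest, 60 to 100 deg C in 5-degree steps.
fit_temps = list(range(60, 101, 5))
slope, intercept = fit_line(fit_temps, [true_leakage(t) for t in fit_temps])

inside_error = max(relative_error(t, slope, intercept) for t in fit_temps)
outside_error = relative_error(30.0, slope, intercept)

print(f"max relative error inside 60-100 deg C: {inside_error:.3f}")
print(f"relative error at 30 deg C:             {outside_error:.3f}")
```

The fitted line tracks the curve well inside the fitting interval but degrades quickly when extrapolated, which is the reason a linear leakage model is only used within a limited temperature range.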
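5. Thermal Analysis
Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same, that is, $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina},j}$ and $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina},j}$ for $\forall i, j$ $(i \neq j)$, where $\mathbf{T}_{\text{ina},i}$ and $\mathbf{P}_{\text{ina},i}$ are the temperatures and the powers of the $i$th inactive core, respectively.
Proof. According to (3), for any inactive cores $i$ and $j$,
$$\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = \mathbf{R}\left(\mathbf{P}_{\text{ina},i} - \mathbf{P}_{\text{ina},j}\right). \tag{25}$$
Substitute (24) into (25) and get
$$\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = \mathbf{R}\mathbf{G}_{\text{leaina}}\left(\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j}\right). \tag{26}$$
According to (26), get
$$\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaina}}\right)\left(\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j}\right) = 0. \tag{27}$$
The parameters $\mathbf{R}$ and $\mathbf{G}_{\text{leaina}}$ are invertible matrices, so $\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaina}}$ is an invertible matrix. Therefore, $\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = 0$, that is, $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina},j}$. $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina},j}$ can also be obtained according to (24).
For convenience, set $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina}}$ and $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina}}$; then (24) can be simplified into
$$\mathbf{P}_{\text{ina}} = \mathbf{G}_{\text{leaina}}\mathbf{T}_{\text{ina}} + \mathbf{G}_{l0\text{ina}} + \mathbf{E}_{\text{leaina}}. \tag{28}$$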
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_1\ h_2\ \cdots\ h_m]$ be the selection vector of the hotspot, where only one element of $\mathbf{H}$ is set to 1 and the others are 0s, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_1(\mathbf{H}, n, n_a, f)$, $\psi_2(\mathbf{H}, n, n_a, f)$, $\psi_3(\mathbf{H}, n, n_a, f)$, $\psi_4(\mathbf{H}, n, n_a, f)$, $\psi_5(\mathbf{H}, n, n_a, f)$, and $\psi_6(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hota},i}$ of the $i$th active core can be formulated by
$$\begin{aligned}
T_{\text{hota},i} ={}& \psi_1(\mathbf{H}, n, n_a, f)X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f)\sum_{k=1}^{n_a}X_{a,k} + \psi_3(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{leaa}} + \psi_5(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{leaina}} + \psi_6(\mathbf{H}, n, n_a, f).
\end{aligned} \tag{29}$$
There exist functions $\xi_1(\mathbf{H}, n, n_a, f)$, $\xi_2(\mathbf{H}, n, n_a, f)$, $\xi_3(\mathbf{H}, n, n_a, f)$, $\xi_4(\mathbf{H}, n, n_a, f)$, and $\xi_5(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hotina}}$ of the inactive core can be formulated by
$$\begin{aligned}
T_{\text{hotina}} ={}& \xi_1(\mathbf{H}, n, n_a, f)\sum_{k=1}^{n_a}X_{a,k} + \xi_2(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{dyn}} + \xi_3(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{leaa}} \\
&+ \xi_4(\mathbf{H}, n, n_a, f)\mathbf{E}_{\text{leaina}} + \xi_5(\mathbf{H}, n, n_a, f).
\end{aligned} \tag{30}$$
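Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\text{ina}}$ can be derived as
$$\begin{aligned}
\mathbf{T}_{\text{ina}} ={}& \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\text{leaa}}\sum_{k=1}^{n_a}\mathbf{T}_{a,k} \\
&+ f\left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\text{dyn}}\sum_{k=1}^{n_a}X_{a,k} \\
&+ f n_a\left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\text{dyn}} \\
&+ n_a\left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\text{leaa}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}(\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{E}_{\text{leaina}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}\left((\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{l0\text{ina}} + f n_a\mathbf{Q}\mathbf{G}_{d0} + n_a\mathbf{Q}\mathbf{G}_{l0a} + \mathbf{Z}\right).
\end{aligned} \tag{31}$$
Let
$$\begin{aligned}
\mathbf{\Phi}_1(n, n_a, f) &= (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{l0\text{ina}} + f n_a\mathbf{Q}\mathbf{G}_{d0} + n_a\mathbf{Q}\mathbf{G}_{l0a} + \mathbf{Z}, \\
\mathbf{\Phi}_2(n, n_a) &= \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{G}_{\text{leaina}}\right)^{-1}.
\end{aligned} \tag{32}$$
Then (31) can be transformed into
$$\begin{aligned}
\mathbf{T}_{\text{ina}} ={}& \mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{leaa}}\sum_{k=1}^{n_a}\mathbf{T}_{a,k} + f\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{dyn}}\sum_{k=1}^{n_a}X_{a,k} \\
&+ f n_a\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{E}_{\text{dyn}} + n_a\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{E}_{\text{leaa}} \\
&+ \mathbf{\Phi}_2(n, n_a)(\mathbf{R} + (n - n_a)\mathbf{Q})\mathbf{E}_{\text{leaina}} + \mathbf{\Phi}_2(n, n_a)\mathbf{\Phi}_1(n, n_a, f).
\end{aligned} \tag{33}$$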
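According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{a,i}$ can be derived as
$$\begin{aligned}
\mathbf{T}_{a,i} ={}& (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\mathbf{Q}\mathbf{G}_{\text{leaa}}\sum_{k=1}^{n_a}\mathbf{T}_{a,k} + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\mathbf{Q}\mathbf{G}_{\text{dyn}}\sum_{k=1}^{n_a}X_{a,k} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\mathbf{R}\mathbf{G}_{\text{dyn}}X_{a,i} + (n - n_a)(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{T}_{\text{ina}} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}(\mathbf{R} + n_a\mathbf{Q})\mathbf{E}_{\text{dyn}} + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}(\mathbf{R} + n_a\mathbf{Q})\mathbf{E}_{\text{leaa}} \\
&+ (n - n_a)(\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\mathbf{Q}\mathbf{E}_{\text{leaina}} \\
&+ (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}\left(f\mathbf{R}\mathbf{G}_{d0} + n_a f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0a} + n_a\mathbf{Q}\mathbf{G}_{l0a} + (n - n_a)\mathbf{Q}\mathbf{G}_{l0\text{ina}} + \mathbf{Z}\right).
\end{aligned} \tag{34}$$
Let
$$\begin{aligned}
\mathbf{\Phi}_3(n, n_a, f) &= f\mathbf{R}\mathbf{G}_{d0} + n_a f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0a} + n_a\mathbf{Q}\mathbf{G}_{l0a} + (n - n_a)\mathbf{Q}\mathbf{G}_{l0\text{ina}} + \mathbf{Z}, \\
\mathbf{\Phi}_4 &= (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}.
\end{aligned} \tag{35}$$
Then (34) can be transformed into
$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaa}}\sum_{k=1}^{n_a}\mathbf{T}_{a,k} + f\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{dyn}}\sum_{k=1}^{n_a}X_{a,k} + f\mathbf{\Phi}_4\mathbf{R}\mathbf{G}_{\text{dyn}}X_{a,i} \\
&+ (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{T}_{\text{ina}} + f\mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q})\mathbf{E}_{\text{dyn}} + \mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q})\mathbf{E}_{\text{leaa}} \\
&+ (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{E}_{\text{leaina}} + \mathbf{\Phi}_4\mathbf{\Phi}_3(n, n_a, f).
\end{aligned} \tag{36}$$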
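Substitute (33) into (36) and get
$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \left(\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaa}} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{leaa}}\right)\sum_{k=1}^{n_a}\mathbf{T}_{a,k} \\
&+ f\left(\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{dyn}} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{dyn}}\right)\sum_{k=1}^{n_a}X_{a,k} \\
&+ f\mathbf{\Phi}_4\mathbf{R}\mathbf{G}_{\text{dyn}}X_{a,i} \\
&+ f\left(\mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q}) + n_a(n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\right)\mathbf{E}_{\text{dyn}} \\
&+ \left(\mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q}) + n_a(n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\right)\mathbf{E}_{\text{leaa}} \\
&+ \left((n - n_a)\mathbf{\Phi}_4\mathbf{Q} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)(\mathbf{R} + (n - n_a)\mathbf{Q})\right)\mathbf{E}_{\text{leaina}} \\
&+ (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{\Phi}_1(n, n_a, f) + \mathbf{\Phi}_4\mathbf{\Phi}_3(n, n_a, f).
\end{aligned} \tag{37}$$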
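Let
$$\begin{aligned}
\mathbf{\Phi}_5(n, n_a) &= \mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaa}} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{leaa}}, \\
\mathbf{\Phi}_6(n, n_a, f) &= f\left(\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{dyn}} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\mathbf{G}_{\text{dyn}}\right), \\
\mathbf{\Phi}_7(f) &= f\mathbf{\Phi}_4\mathbf{R}\mathbf{G}_{\text{dyn}}, \\
\mathbf{\Phi}_8(n, n_a, f) &= f\left(\mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q}) + n_a(n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}\right), \\
\mathbf{\Phi}_9(n, n_a) &= \mathbf{\Phi}_4(\mathbf{R} + n_a\mathbf{Q}) + n_a(n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{Q}, \\
\mathbf{\Phi}_{10}(n, n_a) &= (n - n_a)\mathbf{\Phi}_4\mathbf{Q} + (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)(\mathbf{R} + (n - n_a)\mathbf{Q}), \\
\mathbf{\Phi}_{11}(n, n_a, f) &= (n - n_a)\mathbf{\Phi}_4\mathbf{Q}\mathbf{G}_{\text{leaina}}\mathbf{\Phi}_2(n, n_a)\mathbf{\Phi}_1(n, n_a, f) + \mathbf{\Phi}_4\mathbf{\Phi}_3(n, n_a, f).
\end{aligned} \tag{38}$$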
Then (37) is transformed into
$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \mathbf{\Phi}_5(n, n_a)\sum_{k=1}^{n_a}\mathbf{T}_{a,k} + \mathbf{\Phi}_6(n, n_a, f)\sum_{k=1}^{n_a}X_{a,k} + \mathbf{\Phi}_7(f)X_{a,i} + \mathbf{\Phi}_8(n, n_a, f)\mathbf{E}_{\text{dyn}} \\
&+ \mathbf{\Phi}_9(n, n_a)\mathbf{E}_{\text{leaa}} + \mathbf{\Phi}_{10}(n, n_a)\mathbf{E}_{\text{leaina}} + \mathbf{\Phi}_{11}(n, n_a, f).
\end{aligned} \tag{39}$$
According to (39), summing over all $n_a$ active cores, get
$$\begin{aligned}
\sum_{k=1}^{n_a}\mathbf{T}_{a,k} ={}& (\mathbf{E} - n_a\mathbf{\Phi}_5(n, n_a))^{-1}\left(n_a\mathbf{\Phi}_6(n, n_a, f) + \mathbf{\Phi}_7(f)\right)\sum_{k=1}^{n_a}X_{a,k} \\
&+ n_a(\mathbf{E} - n_a\mathbf{\Phi}_5(n, n_a))^{-1}\mathbf{\Phi}_8(n, n_a, f)\mathbf{E}_{\text{dyn}} \\
&+ n_a(\mathbf{E} - n_a\mathbf{\Phi}_5(n, n_a))^{-1}\mathbf{\Phi}_9(n, n_a)\mathbf{E}_{\text{leaa}} \\
&+ n_a(\mathbf{E} - n_a\mathbf{\Phi}_5(n, n_a))^{-1}\mathbf{\Phi}_{10}(n, n_a)\mathbf{E}_{\text{leaina}} \\
&+ n_a(\mathbf{E} - n_a\mathbf{\Phi}_5(n, n_a))^{-1}\mathbf{\Phi}_{11}(n, n_a, f).
\end{aligned} \tag{40}$$
Substitute (40) into (39) and get
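\[
\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_7(f)X_{\mathrm{a}i} \\
&+ \left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{41}
\]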
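The elimination step that leads from (39) to (41) can be sanity-checked numerically. The following is a minimal sketch, not the authors' implementation. It assumes scalar stand-ins for the matrices $\boldsymbol{\Phi}_5$, $\boldsymbol{\Phi}_6$, and $\boldsymbol{\Phi}_7$, and it lumps the $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea,a}}$, $\mathbf{E}_{\mathrm{lea,ina}}$, and $\boldsymbol{\Phi}_{11}$ terms of (39) into one hypothetical constant `c`. All function names and numeric values are illustrative.

```python
def closed_form_temps(x, phi5, phi6, phi7, c):
    """Closed form in the style of (41), with scalars standing in for matrices."""
    n_a = len(x)
    sum_x = sum(x)
    inv = 1.0 / (1.0 - n_a * phi5)  # scalar analogue of (E - n_a * Phi5)^-1
    common = (phi5 * inv * (n_a * phi6 + phi7) + phi6) * sum_x
    const = (n_a * phi5 * inv + 1.0) * c
    return [phi7 * xi + common + const for xi in x]


def fixed_point_temps(x, phi5, phi6, phi7, c, iters=200):
    """Solve the coupled form (39) directly by fixed-point iteration."""
    sum_x = sum(x)
    temps = [0.0] * len(x)
    for _ in range(iters):
        sum_t = sum(temps)
        temps = [phi5 * sum_t + phi6 * sum_x + phi7 * xi + c for xi in x]
    return temps


# Hypothetical values, chosen so that |n_a * phi5| < 1 and the iteration converges.
X = [1.2, 0.8, 1.5, 1.0]
CLOSED = closed_form_temps(X, phi5=0.05, phi6=0.3, phi7=2.0, c=40.0)
ITERATED = fixed_point_temps(X, phi5=0.05, phi6=0.3, phi7=2.0, c=40.0)
MAX_GAP = max(abs(a - b) for a, b in zip(CLOSED, ITERATED))
print(MAX_GAP)
```

Under these assumptions the two routes agree to floating-point precision, which is what the substitution of (40) into (39) predicts.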
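The hotspot temperature $T_{\mathrm{hot,a}i}$ of the active core can be given by
\[
\begin{aligned}
T_{\mathrm{hot,a}i} = \mathbf{H}\mathbf{T}_{\mathrm{a}i} ={}& \mathbf{H}\boldsymbol{\Phi}_7(f)X_{\mathrm{a}i} \\
&+ \mathbf{H}\left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned}
\tag{42}
\]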
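Let
\[
\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H}\boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right), \\
\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right), \\
\psi_4(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right), \\
\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right), \\
\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned}
\tag{43}
\]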
Then (42) is transformed into
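\[
\begin{aligned}
T_{\mathrm{hot,a}i} ={}& \psi_1(\mathbf{H}, f)X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{44}
\]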
Substitute (40) into (33) and get
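\[
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \left(\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{45}
\]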
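The hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core is given by
\[
\begin{aligned}
T_{\mathrm{hot,ina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}} ={}& \mathbf{H}\left(\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f)\right).
\end{aligned}
\tag{46}
\]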
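Let
\[
\begin{aligned}
\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right), \\
\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\xi_3(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right), \\
\xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f)\right).
\end{aligned}
\tag{47}
\]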
Then (46) is transformed into
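\[
\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{48}
\]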
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a}i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{\mathrm{a}i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for all $1 \le i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ follow the same normal distribution. Then there exist functions $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,a}i}$ of the active core follows the normal distribution with mean $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,
\[
T_{\mathrm{hot,a}i} \sim N\left(\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T_{\mathrm{a}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right).
\tag{49}
\]
There exist functions $\mu_{T_{\mathrm{ina}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core follows the normal distribution with mean $\mu_{T_{\mathrm{ina}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,
\[
T_{\mathrm{hot,ina}} \sim N\left(\mu_{T_{\mathrm{ina}}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T_{\mathrm{ina}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right).
\tag{50}
\]
Proof. According to (29), we get
\[
\begin{aligned}
T_{\mathrm{hot,a}i} ={}& \left(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{\substack{k=1 \\ k \ne i}}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{51}
\]
The normally distributed random variables $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for all $1 \le i, j \le m$, so the linear combination $\left(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \ne i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$ still follows the normal distribution.
All elements $E_{\mathrm{dyn},k}$ in the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$ of all elements in $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$ and $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}}$ also follow the normal distribution. The terms $\left(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i}$, $\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \ne i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$, and $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot,a}i}$ follows the normal distribution, where the mean $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
\[
\begin{aligned}
\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \left(\psi_1(\mathbf{H}, f) + n_{\mathrm{a}}\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)\mu_X + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{E_d} \\
&+ \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mu_{E_{\mathrm{a}}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{E_{\mathrm{ina}}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f),
\end{aligned}
\tag{52}
\]
and the variance $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
\[
\begin{aligned}
\sigma_{T_{\mathrm{a}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \left(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)^2\sigma_X^2 + (n_{\mathrm{a}} - 1)\psi_2^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_X^2 \\
&+ \psi_3^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{E_d}^2 + \psi_4^2(\mathbf{H}, n, n_{\mathrm{a}})\sigma_{E_{\mathrm{a}}}^2 + \psi_5^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{E_{\mathrm{ina}}}^2.
\end{aligned}
\tag{53}
\]
In the same way, $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$, and $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot,ina}}$ follows the normal distribution, where the mean $\mu_{T_{\mathrm{ina}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
\[
\begin{aligned}
\mu_{T_{\mathrm{ina}}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_X + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{E_d} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}})\mu_{E_{\mathrm{a}}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{E_{\mathrm{ina}}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f),
\end{aligned}
\tag{54}
\]
and the variance $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
\[
\begin{aligned}
\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\xi_1^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_X^2 + \xi_2^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{E_d}^2 \\
&+ \xi_3^2(\mathbf{H}, n, n_{\mathrm{a}})\sigma_{E_{\mathrm{a}}}^2 + \xi_4^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{E_{\mathrm{ina}}}^2.
\end{aligned}
\tag{55}
\]
According to Theorem 4, the hotspot temperature of any core follows a normal probabilistic distribution.
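The moment formulas (52) and (53) can be checked against a simple Monte Carlo simulation of the linear form (44). The sketch below is illustrative only. It assumes that $\psi_1, \ldots, \psi_6$ and the residual terms are scalars (in the paper they are built from matrices and vectors), and every numeric value, as well as the function names, is hypothetical.

```python
import random
import statistics


def active_hotspot_moments(psi, n_a, mu_x, var_x, mu_e, var_e):
    """Mean and variance of the active-core hotspot temperature, after (52)-(53).

    psi   -- (psi1, ..., psi6), treated as scalars in this sketch
    mu_e  -- means of (E_dyn, E_lea_a, E_lea_ina)
    var_e -- variances of (E_dyn, E_lea_a, E_lea_ina)
    """
    psi1, psi2, psi3, psi4, psi5, psi6 = psi
    mean = ((psi1 + n_a * psi2) * mu_x
            + psi3 * mu_e[0] + psi4 * mu_e[1] + psi5 * mu_e[2] + psi6)
    var = ((psi1 + psi2) ** 2 * var_x
           + (n_a - 1) * psi2 ** 2 * var_x
           + psi3 ** 2 * var_e[0] + psi4 ** 2 * var_e[1] + psi5 ** 2 * var_e[2])
    return mean, var


def sample_active_hotspot(psi, n_a, mu_x, var_x, mu_e, var_e, rng):
    """Draw one hotspot temperature of active core 1 from the linear form (44)."""
    psi1, psi2, psi3, psi4, psi5, psi6 = psi
    x = [rng.gauss(mu_x, var_x ** 0.5) for _ in range(n_a)]
    e = [rng.gauss(m, v ** 0.5) for m, v in zip(mu_e, var_e)]
    return (psi1 * x[0] + psi2 * sum(x)
            + psi3 * e[0] + psi4 * e[1] + psi5 * e[2] + psi6)


# Hypothetical coefficients and distribution parameters.
PSI = (10.0, 2.0, 1.0, 1.0, 1.0, 40.0)
ARGS = dict(n_a=4, mu_x=1.0, var_x=0.04,
            mu_e=(0.0, 0.0, 0.0), var_e=(0.25, 0.01, 0.01))

MEAN, VAR = active_hotspot_moments(PSI, **ARGS)
RNG = random.Random(0)
SAMPLES = [sample_active_hotspot(PSI, rng=RNG, **ARGS) for _ in range(50000)]
SIM_MEAN = statistics.fmean(SAMPLES)
SIM_VAR = sum((s - SIM_MEAN) ** 2 for s in SAMPLES) / len(SAMPLES)
print(MEAN, VAR, SIM_MEAN, SIM_VAR)
```

The $(n_{\mathrm{a}} - 1)\psi_2^2\sigma_X^2$ term in (53) arises because core $i$ contributes through both $\psi_1$ and $\psi_2$, while each of the other $n_{\mathrm{a}} - 1$ cores contributes through $\psi_2$ only.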
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to a value that makes the temperature of the hotspot equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set of frequencies makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{fcount}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{fcount}$. Then the probabilistic distribution of the frequency $f$ follows
\[
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{fcount}, \\
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{fcount},
\end{cases}
\tag{56}
\]
where $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi)$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$ representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{fcount}$, according to the zero-slack DTM policy, obviously
\[
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}.
\tag{57}
\]
When $f_\pi < f_{fcount}$, the probability of $f = f_\pi$ can be broken into two cases:
(a) if $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,
\[
P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0;
\tag{58}
\]
(b) if $T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by
\[
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
\]
The right-hand side of (59) can be derived by
\[
\begin{aligned}
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
\]
According to (59) and (60), we get
\[
P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1})\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{61}
\]
Hence, when $f_\pi < f_{fcount}$, according to (58) and (61), we get
\[
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{62}
\]
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
Given the probabilistic distribution of the frequency and the mean of the hotspot temperature of the active core for a certain frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by
\[
T_{\mathrm{averhota}} = \sum_{f \in FSet}\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f) \cdot P(f),
\tag{63}
\]
where $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{fcount}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ denotes the mean of the hotspot temperature of the active core given the frequency $f$.
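As an illustration of (56) and (63), the sketch below evaluates both for eight active cores, using the means and standard deviations listed for that case in Table 1 and the 100°C threshold from Section 7.1. It is a minimal example rather than the authors' code. The function names are hypothetical, and the normal CDF is computed with the standard-library error function.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{T <= x} for T ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def frequency_distribution(mus, sigmas, t_valve):
    """P{f = f_pi} for frequencies sorted in ascending order, following (56)."""
    below = [normal_cdf(t_valve, m, s) for m, s in zip(mus, sigmas)]
    probs = []
    for idx in range(len(below)):
        if idx == len(below) - 1:
            probs.append(below[idx])                    # highest frequency
        else:
            probs.append(below[idx] - below[idx + 1])   # lower frequencies
    return probs


def average_hotspot_temperature(mus, probs):
    """Weighted mean of the per-frequency hotspot means, following (63)."""
    return sum(m * p for m, p in zip(mus, probs))


# Eight active cores, values taken from Table 1 (temperatures in degrees C).
FREQS = [0.5, 0.67, 0.83, 1.0]
MU_8 = [66.69, 78.46, 89.54, 101.31]
SIGMA_8 = [4.48, 6.00, 7.43, 8.95]
T_VALVE = 100.0

PROBS_8 = frequency_distribution(MU_8, SIGMA_8, T_VALVE)
T_AVER_8 = average_hotspot_temperature(MU_8, PROBS_8)
for freq, prob in zip(FREQS, PROBS_8):
    print(f"P(f = {freq}) = {prob:.4f}")
print(f"average hotspot temperature = {T_AVER_8:.2f}")
```

With these inputs the probability of running at the maximum frequency and the average hotspot temperature come out close to the 44.16% and 93.85°C reported in Section 7, up to rounding of the tabulated values.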
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, the active state and the inactive state, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. There are four discrete frequencies used by the processor, that is, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, corresponding to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100°C.
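The frequency normalization described above amounts to dividing each frequency by the maximum one. A short sketch, with variable names chosen here for illustration and values rounded to two decimals as in the text:

```python
# Discrete frequencies of the processor model, in GHz.
FREQS_GHZ = [1.5, 2.0, 2.5, 3.0]

# Normalize so that the maximum frequency maps to 1.
F_MAX = max(FREQS_GHZ)
NORMALIZED = [round(f / F_MAX, 2) for f in FREQS_GHZ]
print(NORMALIZED)
```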
HotSpot is used as the thermal model of the multicore processor, and the parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27] are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{E_d}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.
The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
\[
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta,
\tag{64}
\]
where $\alpha$, $\beta$, and $\gamma$ are parameters which depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{lea,a}}$ and $\mathbf{G}_{l0,\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0,\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E_{\mathrm{a}}}^2$ and $\sigma_{E_{\mathrm{ina}}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{lea,a}}$ and $\mathbf{E}_{\mathrm{lea,ina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and the lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed over the temperature interval between 60°C and 100°C, and the regression results are used to estimate the leakage power over the whole temperature interval.
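The maximum likelihood estimates of $\mu_X$ and $\sigma_X^2$ mentioned above have a simple closed form for a normal distribution: the sample mean and the population (divide-by-$N$) variance. The sketch below assumes the IPC profile is available as a plain list of samples. The sample values and the function name are hypothetical, not measurements from PTScalar.

```python
import statistics


def mle_normal(samples):
    """Maximum likelihood estimates (mean, variance) of a normal distribution.

    The variance estimate divides by N, not N - 1, which is what
    statistics.pvariance computes.
    """
    mu = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu)
    return mu, var


# Hypothetical IPC samples from one benchmark run.
IPC_PROFILE = [1.0, 2.0, 3.0, 4.0]
MU_X, VAR_X = mle_normal(IPC_PROFILE)
print(MU_X, VAR_X)
```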
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 shows the same comparison for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60°C and 100°C, at the cost of lower accuracy in the other intervals.
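The trade-off described above can be illustrated with a small regression experiment. The sketch below fits a straight line to a nonlinear leakage curve over 60°C to 100°C only, then compares the mean relative error inside and outside that interval. It rests on several assumptions that are not taken from the paper: the exponential term of (64) is read as $e^{\gamma T}$, $T$ is taken in degrees Celsius, and the parameter values are made up for illustration (the paper uses the PTScalar defaults).

```python
import math


def leakage_power(t, alpha=2e-5, beta=0.05, gamma=0.01):
    """Nonlinear leakage curve in the form of (64); parameters are hypothetical."""
    return alpha * t ** 2 * math.exp(gamma * t) + beta


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_relative_error(temps, slope, intercept):
    """Average of |estimate - actual| / actual over the given temperatures."""
    errors = [abs(slope * t + intercept - leakage_power(t)) / leakage_power(t)
              for t in temps]
    return sum(errors) / len(errors)


FIT_RANGE = list(range(60, 101))                          # regression interval
OUTSIDE = list(range(40, 60)) + list(range(101, 121))     # rest of the plotted range

SLOPE, INTERCEPT = fit_line(FIT_RANGE, [leakage_power(t) for t in FIT_RANGE])
ERR_INSIDE = mean_relative_error(FIT_RANGE, SLOPE, INTERCEPT)
ERR_OUTSIDE = mean_relative_error(OUTSIDE, SLOPE, INTERCEPT)
print(ERR_INSIDE, ERR_OUTSIDE)
```

Because the curve is convex, the fitted line falls further from it as the temperature moves away from the regression interval, so the error outside the interval is larger than the error inside it.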
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. It can be seen that the error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers differ across functional units, and a lower error rate results from a higher correlation. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and the dynamic power of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea,a}}$, and $\mathbf{E}_{\mathrm{lea,ina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores (actual and estimated leakage power in W against temperature in °C).

Figure 4: Leakage power estimation of hotspot for inactive cores (actual and estimated leakage power in W against temperature in °C).

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

| Number of active cores | f = 0.5: μ (°C) | f = 0.5: σ | f = 0.67: μ (°C) | f = 0.67: σ | f = 0.83: μ (°C) | f = 0.83: σ | f = 1: μ (°C) | f = 1: σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

| Number of active cores | f = 0.5: μ (°C) | f = 0.5: σ | f = 0.67: μ (°C) | f = 0.67: σ | f = 0.83: μ (°C) | f = 0.83: σ | f = 1: μ (°C) | f = 1: σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |

Figure 5: Error rate of the power estimation (error rate in % per functional unit for dynamic power, leakage power for active core, and leakage power for inactive core).
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and has significant variations for a certain number of active cores and a certain frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. For a normal distribution, the shape of the probability density curve reflects the spread of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data distribution, whereas a curve with a lower peak implies a bigger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
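The relation between peak height and standard deviation follows from the normal density, whose maximum is $1/(\sigma\sqrt{2\pi})$. The sketch below evaluates this peak for eight active cores using the standard deviations listed in Table 1. The names are illustrative only.

```python
import math


def normal_peak_density(sigma):
    """Maximum of the N(mu, sigma^2) density, reached at the mean."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Standard deviations for eight active cores, taken from Table 1.
SIGMA_BY_FREQ = {0.5: 4.48, 0.67: 6.00, 0.83: 7.43, 1.0: 8.95}
PEAKS = {f: normal_peak_density(s) for f, s in SIGMA_BY_FREQ.items()}
for freq, peak in PEAKS.items():
    print(f"frequency {freq}: peak density {peak:.4f}")
```

The peaks decrease as the frequency rises, which matches the ordering of the curves described for Figure 6.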
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so this case does not appear in Table 2 and Figure 7. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean value and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature surpasses the threshold and the DFS technique is triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores, plotted against temperature (°C) for frequencies 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight.
If a core is powered off and made inactive using the power-gating technique, it only dissipates leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the decrease in power is enough for the remaining active cores to execute at full speed, and the DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
|---|---|---|---|---|
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |

Figure 7: The probability density of the hotspot temperature for inactive cores, plotted against temperature (°C) for frequencies 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven.
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase to a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores makes it certain that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0%.
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without the DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with the DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without the DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with the DFS is lower than that without the DFS.

Figure 8: Probabilistic distribution of frequencies (probability of each frequency in % for 6, 7, and 8 active cores, for frequencies 1, 0.83, 0.67, and 0.5).

Figure 9: Probability for triggering DFS (probability in % for 6, 7, and 8 active cores).
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of the DFS on the processor weakens until it disappears. Therefore, the average hotspot temperatures of the processors with and without the DFS become closer as the number of active cores decreases, as shown in Table 3. Even if the DFS is not used, the probability that the hotspot temperature exceeds the threshold decreases to 0 as more cores are powered off.
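The "without DFS" exceedance probabilities in Table 3 can be recomputed from the full-speed (frequency 1) entries of Table 1, since without DFS the hotspot temperature is normally distributed with those parameters. The sketch below does this with the standard-library complementary error function. It is an illustrative check with hypothetical names, not the authors' code.

```python
import math


def prob_exceeds(threshold, mu, sigma):
    """P{T > threshold} for T ~ N(mu, sigma^2)."""
    return 0.5 * math.erfc((threshold - mu) / (sigma * math.sqrt(2.0)))


# (mean, standard deviation) at frequency 1 from Table 1, keyed by active cores.
FULL_SPEED = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}
T_VALVE = 100.0

EXCEED = {n: prob_exceeds(T_VALVE, mu, sigma)
          for n, (mu, sigma) in FULL_SPEED.items()}
for cores, prob in EXCEED.items():
    print(f"{cores} active cores: P(T > {T_VALVE}) = {100 * prob:.2f}%")
```

The results agree with the 0.52%, 12.37%, and 55.84% in Table 3 to within rounding of the tabulated means and standard deviations, and they shrink quickly as cores are powered off.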
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger the DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without the DFS are the same. Since there is no difference between using and not using the DFS in this situation, these comparisons are not given in Table 3.
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows a normal distribution, based on the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers for different functional units vary significantly, indicating that the linear correlations between the IPC and the dynamic powers differ across functional units; (2) the estimation error rates of leakage powers for both active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60°C and 100°C, at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of the 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
moment matching, and it is used to guide the migration processes of tasks. To account for the nondeterministic behavior of tasks in terms of execution times and decision branches, Das et al. [14] formulated thermal analysis as a hybrid automata reachability verification problem and provided an algorithm for constructing the automata.
In transient analysis, temporal variations of temperature and performance that depend on the workload are traced, so high estimation accuracy is obtained. However, transient analysis is time-consuming, and for multicore processors in particular its time complexity is unacceptable at the early design stage.
2.2. Steady-State Analysis. In order to speed up the estimation of temperature and performance at the early design stage of multicore processors, researchers have turned to steady-state analysis and carried out a considerable amount of work. Based on Amdahl's Law, Lee and Kim [15, 21] introduced variations of process and workload parallelism into the analysis model and optimized the throughput of thermal-aware multicore processors by exploiting DVFS and per-core power-gating (PCPG). Based on HotSpot, Rao et al. [16, 22] described an approximate thermal model for homogeneous multicore processors to quickly and accurately predict the maximum steady-state throughput under thermal constraints. In the context of a hard real-time system on a single-core processor, Mohaqeqi et al. [23] studied the stochastic behavior of the system, for example, performance, temperature, and reliability, based on a Markovian view.
To the best of our knowledge, all previous work on steady-state analysis is based on the assumption that every workload makes the same thermal contribution to the processor, which results in inaccurate temperature and performance estimation. This is the focus of our work. In this paper, the variation of workloads is taken into account to model temperature and frequency more accurately.
3. Thermal Model
In this paper, a microarchitecture-level thermal model for a multicore processor is created by replication of a single-core processor based on HotSpot [3]. The multicore processor is divided into four layers, that is, chip, thermal interface material (TIM), heat spreader, and heat sink. There are $m$ thermal blocks in the chip and TIM, and there are five and nine thermal blocks in the heat spreader and heat sink, respectively. In total, there are $N = nm + 14$ thermal blocks in the multicore processor, where $n$ is the number of cores.
The microarchitecture-level thermal model can be represented by the state-space differential equation as follows [16]:

$$\frac{d\mathbf{T}}{dt} = \mathbf{A}\mathbf{T} + \mathbf{B}\mathbf{P}, \tag{1}$$

where $\mathbf{T}$ and $\mathbf{P}$ are $N$-dimensional vectors denoting, respectively, the temperature and power of the multicore processor, and $\mathbf{A}$ and $\mathbf{B}$ are constant matrices of dimension $N \times N$ that depend on the thermal conduction and capacitance of the processor.
Figure 1: Thermal conduction matrix G of dual-core processor [16]. [Block matrix whose rows and columns are ordered as Die Core 1, Die Core 2, TIM Core 1, TIM Core 2, spreader center, and spreader peripheral and heat sink; the labeled blocks are G_die, G_die-int, G_int-die, G_int, G_int-spr, G_spr-int, and G_pkg, and the remaining blocks are 0.]
When the temperature of the processor is in the steady state, $d\mathbf{T}/dt = 0$; then

$$\mathbf{G}\mathbf{T} = \mathbf{P}, \tag{2}$$

where $\mathbf{G} = -\mathbf{B}^{-1}\mathbf{A}$ is the thermal conductance matrix of the HotSpot model.
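As a small numerical illustration of how (2) follows from (1), the sketch below integrates the state-space equation for a toy system with two thermal blocks and compares the result with the direct solution of GT = P. The matrices and power values are hypothetical (with B set to the identity for simplicity) and are not HotSpot outputs; temperatures are measured relative to ambient.

```python
def steady_state_2x2(G, P):
    """Solve G T = P for a 2x2 conductance matrix with Cramer's rule."""
    (a, b), (c, d) = G
    det = a * d - b * c
    return [(d * P[0] - b * P[1]) / det, (a * P[1] - c * P[0]) / det]


def integrate(A, B, P, dt=0.01, steps=5000):
    """Forward-Euler integration of dT/dt = A T + B P starting from T = 0."""
    T = [0.0, 0.0]
    BP = [B[i][0] * P[0] + B[i][1] * P[1] for i in range(2)]
    for _ in range(steps):
        dT = [A[i][0] * T[0] + A[i][1] * T[1] + BP[i] for i in range(2)]
        T = [T[i] + dt * dT[i] for i in range(2)]
    return T


# Hypothetical two-block system; G = -B^{-1} A with B the identity matrix.
A = [[-3.0, 1.0], [1.0, -2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
G = [[3.0, -1.0], [-1.0, 2.0]]
P = [5.0, 0.0]

T_direct = steady_state_2x2(G, P)
T_transient = integrate(A, B, P)
```

The transient solution settles at the same temperatures as the direct solve, which is why the steady-state analysis can skip the time integration altogether.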
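The thermal conductance matrix $\mathbf{G}$ can be obtained from the HotSpot simulation tool. The thermal conductance matrix of a dual-core processor is shown in Figure 1, where the submatrices $\mathbf{G}_{\mathrm{die}}$, $\mathbf{G}_{\mathrm{int}}$, and $\mathbf{G}_{\mathrm{pkg}}$ along the diagonal are the lateral thermal conductances of the chip, TIM, and package, respectively; the submatrices $\mathbf{G}_{\mathrm{die\text{-}int}}$ and $\mathbf{G}_{\mathrm{int\text{-}die}}$ are the vertical conductances between the die and the TIM; and the vectors $\mathbf{G}_{\mathrm{int\text{-}spr}}$ and $\mathbf{G}'_{\mathrm{spr\text{-}int}}$ are the vertical conductances between the TIM and the spreader. $\mathbf{G}_{\mathrm{die\text{-}int}}$ is equal to $\mathbf{G}_{\mathrm{int\text{-}die}}$, and $\mathbf{G}_{\mathrm{int\text{-}spr}}$ is equal to $\mathbf{G}'_{\mathrm{spr\text{-}int}}$. The detailed conductance matrix $\mathbf{G}_{\mathrm{pkg}}$ is shown in Figure 2.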
Lemma 1. According to the HotSpot thermal model, when the temperature of the multicore processor is in the steady state, there exist $m$-dimensional matrices $\mathbf{R}$, $\mathbf{Q}$, and $\mathbf{Z}$ such that

$$\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}, \tag{3}$$

where $\mathbf{T}_i$ and $\mathbf{P}_i$ are $m$-dimensional vectors representing, respectively, the temperature and power of the $i$th core of the processor.
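Proof. According to the arrangement of elements in the conductance matrix $\mathbf{G}$ of the HotSpot thermal model, (2) can be decomposed into the following equations:

$$\mathbf{G}_{\mathrm{die}}\mathbf{T}_i + \mathbf{G}_{\mathrm{die\text{-}int}}\mathbf{T}_{\mathrm{int},i} = \mathbf{P}_i, \tag{4}$$

$$\mathbf{G}_{\mathrm{die\text{-}int}}\mathbf{T}_i + \mathbf{G}_{\mathrm{int}}\mathbf{T}_{\mathrm{int},i} + \mathbf{G}_{\mathrm{int\text{-}spr}}T_{\mathrm{spr}} = 0, \tag{5}$$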
Figure 2: Thermal conduction matrix of the package [16]. [Block matrix whose rows and columns are ordered as spreader center, then spreader peripheral and heat sink; the blocks are G_spr, G_spr-other, G_other-spr, and G_other-other.]
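$$\mathbf{G}'_{\mathrm{int\text{-}spr}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + G_{\mathrm{spr}}T_{\mathrm{spr}} + \mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{T}_{\mathrm{other}} = 0, \tag{6}$$

$$\mathbf{G}_{\mathrm{spr\text{-}other}}T_{\mathrm{spr}} + \mathbf{G}_{\mathrm{other\text{-}other}}\mathbf{T}_{\mathrm{other}} = \mathbf{P}_{\mathrm{ambient}}, \tag{7}$$

where $\mathbf{T}_{\mathrm{int},i}$ is an $m$-dimensional vector representing the temperature of the TIM layer of the $i$th core, $T_{\mathrm{spr}}$ is a scalar representing the temperature of the central block of the spreader, and $\mathbf{T}_{\mathrm{other}}$ is a 13-dimensional vector representing the temperature of the other blocks of the spreader and the sink.

According to (6) and (7),

$$T_{\mathrm{spr}} = \left(\mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{G}^{-1}_{\mathrm{other\text{-}other}}\mathbf{G}_{\mathrm{spr\text{-}other}} - G_{\mathrm{spr}}\right)^{-1} \cdot \left(\mathbf{G}'_{\mathrm{int\text{-}spr}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + \mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{G}^{-1}_{\mathrm{other\text{-}other}}\mathbf{P}_{\mathrm{ambient}}\right). \tag{8}$$

Set

$$\begin{aligned}
\mathbf{G}_{\mathrm{SPRO}} &= \left(\mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{G}^{-1}_{\mathrm{other\text{-}other}}\mathbf{G}_{\mathrm{spr\text{-}other}} - G_{\mathrm{spr}}\right)^{-1}\mathbf{G}_{\mathrm{int\text{-}spr}}, \\
G_{\mathrm{spamb}} &= \left(\mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{G}^{-1}_{\mathrm{other\text{-}other}}\mathbf{G}_{\mathrm{spr\text{-}other}} - G_{\mathrm{spr}}\right)^{-1} \cdot \mathbf{G}'_{\mathrm{spr\text{-}other}}\mathbf{G}^{-1}_{\mathrm{other\text{-}other}}\mathbf{P}_{\mathrm{ambient}}.
\end{aligned} \tag{9}$$

Equation (8) can be converted into

$$T_{\mathrm{spr}} = \mathbf{G}'_{\mathrm{SPRO}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + G_{\mathrm{spamb}}. \tag{10}$$

Substituting (10) into (5) gives

$$\mathbf{G}_{\mathrm{die\text{-}int}}\mathbf{T}_i + \mathbf{G}_{\mathrm{int}}\mathbf{T}_{\mathrm{int},i} + \mathbf{G}_{\mathrm{int\text{-}spr}}\left(\mathbf{G}'_{\mathrm{SPRO}}\sum_{j=1}^{n}\mathbf{T}_{\mathrm{int},j} + G_{\mathrm{spamb}}\right) = 0. \tag{11}$$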
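According to (4) and (11),

$$\begin{aligned}
&\left(\mathbf{G}_{\mathrm{die\text{-}int}} - \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\mathbf{G}_{\mathrm{die}}\right)\mathbf{T}_i - \mathbf{G}_{\mathrm{int\text{-}spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\mathbf{G}_{\mathrm{die}}\sum_{j=1}^{n}\mathbf{T}_j \\
&\quad + \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\mathbf{P}_i + \mathbf{G}_{\mathrm{int\text{-}spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_{\mathrm{int\text{-}spr}}G_{\mathrm{spamb}} = 0.
\end{aligned} \tag{12}$$

Set

$$\begin{aligned}
\mathbf{G}_1 &= \mathbf{G}_{\mathrm{die\text{-}int}} - \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\mathbf{G}_{\mathrm{die}}, \\
\mathbf{G}_2 &= \mathbf{G}_{\mathrm{int\text{-}spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}\mathbf{G}_{\mathrm{die}}, \\
\mathbf{G}_3 &= \mathbf{G}_{\mathrm{int}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}, \\
\mathbf{G}_4 &= \mathbf{G}_{\mathrm{int\text{-}spr}}\mathbf{G}'_{\mathrm{SPRO}}\mathbf{G}^{-1}_{\mathrm{die\text{-}int}}, \\
\mathbf{G}_5 &= \mathbf{G}_{\mathrm{int\text{-}spr}}G_{\mathrm{spamb}}.
\end{aligned} \tag{13}$$

Then (12) can be transformed into

$$\mathbf{G}_1\mathbf{T}_i - \mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\mathbf{P}_i + \mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{G}_5 = 0. \tag{14}$$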
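Summing (14) over all $n$ cores gives

$$\mathbf{G}_1\sum_{j=1}^{n}\mathbf{T}_j - n\mathbf{G}_2\sum_{j=1}^{n}\mathbf{T}_j + \mathbf{G}_3\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_4\sum_{j=1}^{n}\mathbf{P}_j + n\mathbf{G}_5 = 0. \tag{15}$$

According to (15),

$$\sum_{j=1}^{n}\mathbf{T}_j = \left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1}\left(\mathbf{G}_3 + n\mathbf{G}_4\right)\sum_{j=1}^{n}\mathbf{P}_j + n\left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1}\mathbf{G}_5. \tag{16}$$

Substituting (16) into (14) gives

$$\begin{aligned}
\mathbf{T}_i ={}& \mathbf{G}_1^{-1}\left(\mathbf{G}_2\left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1}\left(\mathbf{G}_3 + n\mathbf{G}_4\right) - \mathbf{G}_4\right)\sum_{j=1}^{n}\mathbf{P}_j - \mathbf{G}_1^{-1}\mathbf{G}_3\mathbf{P}_i \\
&+ \mathbf{G}_1^{-1}\left(n\mathbf{G}_2\left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1} - \mathbf{E}\right)\mathbf{G}_5,
\end{aligned} \tag{17}$$

where $\mathbf{E}$ is the identity matrix. Set

$$\begin{aligned}
\mathbf{R} &= -\mathbf{G}_1^{-1}\mathbf{G}_3, \\
\mathbf{Q} &= \mathbf{G}_1^{-1}\left(\mathbf{G}_2\left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1}\left(\mathbf{G}_3 + n\mathbf{G}_4\right) - \mathbf{G}_4\right), \\
\mathbf{Z} &= \mathbf{G}_1^{-1}\left(n\mathbf{G}_2\left(n\mathbf{G}_2 - \mathbf{G}_1\right)^{-1} - \mathbf{E}\right)\mathbf{G}_5.
\end{aligned} \tag{18}$$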
Then

$$\mathbf{T}_i = \mathbf{R}\mathbf{P}_i + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_j + \mathbf{Z}. \tag{19}$$

This means that, in the steady state of temperature of a multicore processor, given the powers of all cores, the temperature of any core can be calculated at the microarchitecture level according to Lemma 1.
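To make the structure of Lemma 1 concrete, the following sketch evaluates it for the simplest case of one thermal block per core (m = 1), so that R, Q, and Z reduce to scalars. The coefficient values and the function name `core_temperatures` are hypothetical; in the paper these quantities are matrices derived from the HotSpot conductance matrix.

```python
def core_temperatures(powers, r, q, z):
    """Scalar form of Lemma 1: T_i = r * P_i + q * sum_j P_j + z.

    r weights a core's own power, q weights the total chip power
    (coupling through the shared package), and z is a constant offset.
    """
    total_power = sum(powers)
    return [r * p + q * total_power + z for p in powers]


# Hypothetical coefficients and per-core powers in watts.
temps = core_temperatures([10.0, 20.0], r=0.5, q=0.1, z=45.0)
```

Each core's temperature depends on its own power through r and on every other core only through the total power, which is what later allows the IPC sum to appear as a single term in Theorem 3.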
4. Power Model
It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. The global dynamic frequency scaling (DFS) technique is used, in which the frequencies of all cores are scaled uniformly.
4.1. Active Mode. When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as

$$\mathbf{P}_{\mathrm{a},i} = \mathbf{P}_{\mathrm{dyn,a},i} + \mathbf{P}_{\mathrm{lea,a},i}, \tag{20}$$

where $\mathbf{P}_{\mathrm{a},i}$, $\mathbf{P}_{\mathrm{dyn,a},i}$, and $\mathbf{P}_{\mathrm{lea,a},i}$ are, respectively, the total power, dynamic power, and leakage power of the $i$th active core, each of size $m \times 1$.
For a processor using the DVFS technique, the dynamic power is proportional to the product of the square of the voltage and the frequency [7], that is, $P_{\mathrm{dyn}} \propto \mathrm{voltage}^2 \times \mathrm{frequency}$. In this paper, the primary purpose is to analyze the impact of workload variation on the temperatures of processors. To simplify the analysis, the DFS technique rather than the DVFS technique is used to manage the temperature of the processor; the voltage is constant and only the frequency is adjusted. Hence, the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with the IPC.
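The linear relationship between IPC and dynamic power can be estimated by ordinary least squares. The sketch below is a minimal illustration for a single functional unit; the function `fit_line` and the sample measurements are hypothetical and do not reproduce the paper's regression data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements at maximum frequency (f = 1):
# IPC of a workload versus dynamic power of one functional unit in watts.
ipcs = [0.5, 1.0, 1.5, 2.0]
powers = [7.1, 9.0, 10.9, 13.0]
g_dyn, g_d0 = fit_line(ipcs, powers)

# Residuals of the fit; the paper models them as zero-mean normal noise.
residuals = [p - (g_dyn * x + g_d0) for x, p in zip(ipcs, powers)]
```

The slope and intercept play the roles of the regression coefficients in the dynamic power model, and the spread of the residuals corresponds to the standard deviation of the regression residual.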
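Let $X_{\mathrm{a},i}$ be the IPC of the $i$th core when workloads are running on it, and let $f$ be the normalized frequency between 0 and 1. The microarchitecture-level dynamic power $\mathbf{P}_{\mathrm{dyn,a},i}$ of the $i$th active core is then defined as

$$\mathbf{P}_{\mathrm{dyn,a},i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f, \tag{21}$$

where $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ are the linear regression coefficients when $f$ is set to the maximum, 1, and $\mathbf{E}_{\mathrm{dyn}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{dyn}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{Ed})$. $\mathbf{G}_{\mathrm{dyn}}$, $\mathbf{G}_{d0}$, $\mathbf{E}_{\mathrm{dyn}}$, and $\boldsymbol{\sigma}_{Ed}$ are vectors of size $m \times 1$, and $\boldsymbol{\sigma}^2_{Ed}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ed}$.

At the microarchitecture level, in order to simplify the analysis, the relationship between the leakage power $\mathbf{P}_{\mathrm{lea,a},i}$ and the temperature $\mathbf{T}_{\mathrm{a},i}$ of the $i$th active core is approximated by the linear model

$$\mathbf{P}_{\mathrm{lea,a},i} = \mathbf{G}_{\mathrm{lea,a}}\mathbf{T}_{\mathrm{a},i} + \mathbf{G}_{l0,\mathrm{a}} + \mathbf{E}_{\mathrm{lea,a}}, \tag{22}$$

where $\mathbf{G}_{\mathrm{lea,a}}$ and $\mathbf{G}_{l0,\mathrm{a}}$ are the regression coefficients and $\mathbf{E}_{\mathrm{lea,a}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{lea,a}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\mathrm{a}})$. $\mathbf{G}_{\mathrm{lea,a}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0,\mathrm{a}}$, $\mathbf{E}_{\mathrm{lea,a}}$, $\boldsymbol{\mu}_{E\mathrm{a}}$, and $\boldsymbol{\sigma}_{E\mathrm{a}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\mathrm{a}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{a}}$.

According to (20), (21), and (22), the power of the active core $\mathbf{P}_{\mathrm{a},i}$ is represented by

$$\mathbf{P}_{\mathrm{a},i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f + \mathbf{G}_{\mathrm{lea,a}}\mathbf{T}_{\mathrm{a},i} + \mathbf{G}_{l0,\mathrm{a}} + \mathbf{E}_{\mathrm{lea,a}}. \tag{23}$$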
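Because the leakage term of the active-core power depends on temperature while temperature depends on power through Lemma 1, the two relations form a loop. The sketch below resolves that loop for a hypothetical single-core processor with one thermal block, so that every matrix becomes a scalar, and with the regression residuals set to zero. The parameter `r_total` stands for the scalar R + Q of Lemma 1 when n = 1; all numeric values are made up for illustration.

```python
def active_core_power(ipc, f, temp, g_dyn, g_d0, g_lea, g_l0):
    """Active-core power without residuals: frequency-scaled dynamic
    power plus leakage that grows linearly with temperature."""
    return (g_dyn * ipc + g_d0) * f + g_lea * temp + g_l0


def steady_temperature(ipc, f, r_total, z, g_dyn, g_d0, g_lea, g_l0,
                       iterations=200):
    """Iterate T = r_total * P(T) + z until the power/temperature loop settles."""
    temp = z
    for _ in range(iterations):
        power = active_core_power(ipc, f, temp, g_dyn, g_d0, g_lea, g_l0)
        temp = r_total * power + z
    return temp


def closed_form_temperature(ipc, f, r_total, z, g_dyn, g_d0, g_lea, g_l0):
    """Solve the same loop algebraically; the result is linear in the IPC."""
    numerator = r_total * ((g_dyn * ipc + g_d0) * f + g_l0) + z
    return numerator / (1.0 - r_total * g_lea)


# Hypothetical parameters (r_total * g_lea < 1, so the iteration converges).
params = dict(r_total=0.8, z=45.0, g_dyn=10.0, g_d0=5.0, g_lea=0.05, g_l0=1.0)
t_iter = steady_temperature(1.5, 1.0, **params)
t_exact = closed_form_temperature(1.5, 1.0, **params)
```

The closed form makes the key point of the following section visible in miniature: once leakage is linear in temperature, the steady-state temperature is a linear function of the IPC.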
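4.2. Inactive Mode. When a core is in the inactive mode, it is powered off using the power-gating technique. The inactive core dissipates only leakage power, which is much lower than that of the active core. Hence, the power of the $i$th inactive core $\mathbf{P}_{\mathrm{ina},i}$ is the leakage power $\mathbf{P}_{\mathrm{lea,ina},i}$, which depends on the core's temperature $\mathbf{T}_{\mathrm{ina},i}$, and it can be approximated by the linear model at the microarchitecture level as follows:

$$\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{lea,ina},i} = \mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina},i} + \mathbf{G}_{l0,\mathrm{ina}} + \mathbf{E}_{\mathrm{lea,ina}}, \tag{24}$$

where $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0,\mathrm{ina}}$ are regression coefficients and $\mathbf{E}_{\mathrm{lea,ina}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{lea,ina}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\mathrm{ina}})$. $\mathbf{G}_{\mathrm{lea,ina}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0,\mathrm{ina}}$, $\mathbf{E}_{\mathrm{lea,ina}}$, and $\boldsymbol{\sigma}_{E\mathrm{ina}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\mathrm{ina}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{ina}}$.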
5. Thermal Analysis
Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same, that is, $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina},j}$ and $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina},j}$ for $\forall i, j$ $(i \neq j)$, where $\mathbf{T}_{\mathrm{ina},i}$ and $\mathbf{P}_{\mathrm{ina},i}$ are the temperatures and the powers of the $i$th inactive core, respectively.
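Proof. According to (3), for any inactive cores $i$ and $j$,

$$\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = \mathbf{R}\left(\mathbf{P}_{\mathrm{ina},i} - \mathbf{P}_{\mathrm{ina},j}\right). \tag{25}$$

Substituting (24) into (25) gives

$$\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}\left(\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j}\right). \tag{26}$$

According to (26),

$$\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}\right)\left(\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j}\right) = 0. \tag{27}$$

The parameters $\mathbf{R}$ and $\mathbf{G}_{\mathrm{lea,ina}}$ are invertible matrices, so $\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,ina}}$ is an invertible matrix. Therefore, $\mathbf{T}_{\mathrm{ina},i} - \mathbf{T}_{\mathrm{ina},j} = 0$, that is, $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina},j}$. $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina},j}$ can also be obtained according to (24).

For convenience, set $\mathbf{T}_{\mathrm{ina},i} = \mathbf{T}_{\mathrm{ina}}$ and $\mathbf{P}_{\mathrm{ina},i} = \mathbf{P}_{\mathrm{ina}}$; (24) can then be simplified into

$$\mathbf{P}_{\mathrm{ina}} = \mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} + \mathbf{G}_{l0,\mathrm{ina}} + \mathbf{E}_{\mathrm{lea,ina}}. \tag{28}$$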
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_1, h_2, \ldots, h_m]$ be the selection vector of the hotspot, where only one element of $\mathbf{H}$ is set to 1 and the others are 0, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,a},i}$ of the $i$th active core can be formulated by

$$\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{29}$$

There exist functions $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core can be formulated by

$$\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{30}$$
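Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\mathrm{ina}}$ can be derived as

$$\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} \\
&+ f\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ f n_{\mathrm{a}}\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1} \cdot \left((\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + f n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}\right).
\end{aligned} \tag{31}$$

Let

$$\begin{aligned}
\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) &= (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + f n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_2(n, n_{\mathrm{a}}) &= \left(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}}\right)^{-1}.
\end{aligned} \tag{32}$$

Then (31) can be transformed into

$$\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
&+ \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{33}$$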
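According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{\mathrm{a},i}$ can be derived as

$$\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} + (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
&+ (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} \\
&+ (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\left(f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}\right).
\end{aligned} \tag{34}$$

Let

$$\begin{aligned}
\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f) &= f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_4 &= (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}.
\end{aligned} \tag{35}$$

Then (34) can be transformed into

$$\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + f\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} + f\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + \boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{36}$$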
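Substituting (33) into (36) gives

$$\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \left(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\right)\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} \\
&+ f\left(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a},i} \\
&+ f\left(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \left((n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\right) \cdot \mathbf{E}_{\mathrm{lea,ina}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}}) \cdot \boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{37}$$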
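Let

$$\begin{aligned}
\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}, \\
\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) &= f\left(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right), \\
\boldsymbol{\Phi}_7(f) &= f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}, \\
\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) &= f\left(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}, \\
\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}), \\
\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}}) \cdot \boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{38}$$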
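Then (37) is transformed into

$$\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_5(n, n_{\mathrm{a}})\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \boldsymbol{\Phi}_7(f)X_{\mathrm{a},i} + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{39}$$

Summing (39) over the $n_{\mathrm{a}}$ active cores gives

$$\begin{aligned}
\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a},k} ={}& (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1} \cdot \left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{40}$$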
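Substituting (40) into (39) gives

$$\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_7(f)X_{\mathrm{a},i} + \left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right) \cdot \mathbf{E}_{\mathrm{dyn}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1} \cdot \boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{41}$$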
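The elimination of the temperature sum in the last two steps can be checked numerically in the scalar case. The sketch below assumes one thermal block per core, so every coefficient is a plain number, and lumps the residual and constant terms into a single constant `c`; all values are hypothetical. It builds the explicit solution and then verifies that it satisfies the implicit relation in which each temperature depends on the sum of all active-core temperatures.

```python
def explicit_temperatures(xs, phi5, phi6, phi7, c):
    """Explicit active-core temperatures after the temperature sum has
    been eliminated (scalar analogue of the final substitution)."""
    n_a = len(xs)
    sum_x = sum(xs)
    inv = 1.0 / (1.0 - n_a * phi5)
    coef_sum = phi5 * inv * (n_a * phi6 + phi7) + phi6
    const = (n_a * phi5 * inv + 1.0) * c
    return [phi7 * x + coef_sum * sum_x + const for x in xs]


def implicit_residual(temps, xs, phi5, phi6, phi7, c):
    """Largest violation of T_i = phi5*sum(T) + phi6*sum(X) + phi7*X_i + c."""
    sum_t = sum(temps)
    sum_x = sum(xs)
    return max(abs(t - (phi5 * sum_t + phi6 * sum_x + phi7 * x + c))
               for t, x in zip(temps, xs))


# Hypothetical IPCs of three active cores and scalar coefficients.
ipcs = [0.8, 1.2, 1.5]
temps = explicit_temperatures(ipcs, phi5=0.1, phi6=0.5, phi7=4.0, c=50.0)
error = implicit_residual(temps, ipcs, phi5=0.1, phi6=0.5, phi7=4.0, c=50.0)
```

The residual is zero up to rounding, and the only term that differs between cores is the one proportional to the core's own IPC.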
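The hotspot temperature $T_{\mathrm{hot,a},i}$ of the active core can be given by

$$\begin{aligned}
T_{\mathrm{hot,a},i} = \mathbf{H}\mathbf{T}_{\mathrm{a},i} ={}& \mathbf{H}\boldsymbol{\Phi}_7(f)X_{\mathrm{a},i} \\
&+ \mathbf{H}\left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{42}$$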
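Let

$$\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H}\boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1} \cdot \left(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\right), \\
\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\right), \\
\psi_4(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1} \cdot \boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\right), \\
\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right), \\
\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \cdot (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{43}$$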
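Then (42) is transformed into

$$\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, f)X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{44}$$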
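Substituting (40) into (33), we get

$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \bigl(\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,(n_{\mathrm{a}}\Phi_6(n,n_{\mathrm{a}},f)+\Phi_7(f)) + f\,\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k} \\
&+ \bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_8(n,n_{\mathrm{a}},f) + f\,n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{leaa}} \\
&+ \bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{10}(n,n_{\mathrm{a}}) + \Phi_2(n,n_{\mathrm{a}})\,(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{leaina}} \\
&+ n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{11}(n,n_{\mathrm{a}},f) + \Phi_2(n,n_{\mathrm{a}})\,\Phi_1(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{45}
$$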
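The hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core is given by

$$
\begin{aligned}
T_{\mathrm{hotina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}}
={}& \mathbf{H}\bigl(\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,(n_{\mathrm{a}}\Phi_6(n,n_{\mathrm{a}},f)+\Phi_7(f)) + f\,\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_8(n,n_{\mathrm{a}},f) + f\,n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{leaa}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{10}(n,n_{\mathrm{a}}) + \Phi_2(n,n_{\mathrm{a}})\,(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{leaina}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{11}(n,n_{\mathrm{a}},f) + \Phi_2(n,n_{\mathrm{a}})\,\Phi_1(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{46}
$$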
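Let

$$
\begin{aligned}
\xi_1(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,(n_{\mathrm{a}}\Phi_6(n,n_{\mathrm{a}},f)+\Phi_7(f)) + f\,\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\xi_2(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_8(n,n_{\mathrm{a}},f) + f\,n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr), \\
\xi_3(\mathbf{H},n,n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\bigr), \\
\xi_4(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{10}(n,n_{\mathrm{a}}) + \Phi_2(n,n_{\mathrm{a}})\,(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr), \\
\xi_5(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\Phi_2(n,n_{\mathrm{a}})\,\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\,(\mathbf{E}-n_{\mathrm{a}}\Phi_5(n,n_{\mathrm{a}}))^{-1}\,\Phi_{11}(n,n_{\mathrm{a}},f) + \Phi_2(n,n_{\mathrm{a}})\,\Phi_1(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{47}
$$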
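Then (46) is transformed into

$$
\begin{aligned}
T_{\mathrm{hot}s} ={}& \xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k} + \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}} \\
&+ \xi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaa}} + \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaina}} + \xi_5(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{48}
$$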
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a}i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{\mathrm{a}i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ follow the same normal distribution. Then there exist functions $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hota}i}$ of the active core follows the normal distribution with mean $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

$$
T_{\mathrm{hota}i} \sim N\bigl(\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{49}
$$

There exist functions $\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

$$
T_{\mathrm{hotina}} \sim N\bigl(\mu_{T\mathrm{ina}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T\mathrm{ina}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{50}
$$
Proof. According to (29), we get

$$
\begin{aligned}
T_{\mathrm{hota}i} ={}& \bigl(\psi_1(\mathbf{H},n,n_{\mathrm{a}},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a}i}
+ \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{\substack{k=1 \\ k \ne i}}^{n_{\mathrm{a}}} X_{\mathrm{a}k} \\
&+ \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaa}} \\
&+ \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{51}
$$
The normally distributed random variables $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $\bigl(\psi_1(\mathbf{H},n,n_{\mathrm{a}},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a}i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,\,k \ne i}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$ still follows the normal distribution.
All elements $E_{\mathrm{dyn}k}$ in the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$ of all elements in $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaa}}$ and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaina}}$ also follow the normal distribution. The terms $\bigl(\psi_1(\mathbf{H},n,n_{\mathrm{a}},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a}i}$, $\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,\,k \ne i}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$, $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaa}}$, and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaina}}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hota}i}$ follows the normal distribution, where the mean $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f) + n_{\mathrm{a}}\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)\mu_X
+ \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{d}} \\
&+ \psi_4(\mathbf{H},n,n_{\mathrm{a}})\,\mu_{E\mathrm{a}}
+ \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{ina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{52}
$$
and the variance $\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{a}}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)^2\sigma_X^2
+ (n_{\mathrm{a}}-1)\,\psi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_X^2 \\
&+ \psi_3^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{d}}^2
+ \psi_4^2(\mathbf{H},n,n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2
+ \psi_5^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{53}
$$
In the same way, $\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$, $\xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaa}}$, and $\xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mathbf{E}_{\mathrm{leaina}}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot}s}$, the hotspot temperature of the inactive core (written $T_{\mathrm{hotina}}$ in the statement of Theorem 4), follows the normal distribution, where the mean $\mu_{Ts}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\mu_{Ts}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\,\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_X
+ \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{d}}
+ \xi_3(\mathbf{H},n,n_{\mathrm{a}})\,\mu_{E\mathrm{a}} \\
&+ \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\,\mu_{E\mathrm{ina}}
+ \xi_5(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{54}
$$
and the variance $\sigma_{Ts}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

$$
\begin{aligned}
\sigma_{Ts}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\,\xi_1^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_X^2
+ \xi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{d}}^2 \\
&+ \xi_3^2(\mathbf{H},n,n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2
+ \xi_4^2(\mathbf{H},n,n_{\mathrm{a}},f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{55}
$$
According to Theorem 4, the hotspot temperature of any core follows a normal probability distribution.
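As a numerical sanity check on the structure of (52) and (53), the following minimal sketch samples the linear form in (51) and compares the sample moments with the closed-form ones. Everything numeric here is an assumption made for illustration: the coefficients `PSI`, the IPC parameters, and the error-term parameters `ERR` are hypothetical scalars (in the paper the error terms are vectors), and the function names `analytic_moments` and `simulate` are not from the source.

```python
import random
import statistics


def analytic_moments(psi, n_a, mu_x, sd_x, err):
    """Closed-form mean and variance following the structure of (52)-(53).

    psi : (psi1, ..., psi6), hypothetical scalar coefficients
    err : three (mean, sd) pairs for the dynamic, active-leakage and
          inactive-leakage error terms, treated as scalars in this sketch
    """
    p1, p2, p3, p4, p5, p6 = psi
    mean = (p1 + n_a * p2) * mu_x + p6
    var = ((p1 + p2) ** 2 + (n_a - 1) * p2 ** 2) * sd_x ** 2
    for coeff, (mu_e, sd_e) in zip((p3, p4, p5), err):
        mean += coeff * mu_e
        var += coeff ** 2 * sd_e ** 2
    return mean, var


def simulate(psi, n_a, mu_x, sd_x, err, samples=50_000, seed=1):
    """Draw samples of the linear form in (51) for active core i = 0."""
    p1, p2, p3, p4, p5, p6 = psi
    rng = random.Random(seed)
    temps = []
    for _ in range(samples):
        # Independent, identically distributed IPCs of the active cores.
        ipc = [rng.gauss(mu_x, sd_x) for _ in range(n_a)]
        t = p1 * ipc[0] + p2 * sum(ipc) + p6
        for coeff, (mu_e, sd_e) in zip((p3, p4, p5), err):
            t += coeff * rng.gauss(mu_e, sd_e)
        temps.append(t)
    return statistics.fmean(temps), statistics.pvariance(temps)


# Hypothetical values, not taken from the paper.
PSI = (20.0, 2.0, 1.0, 1.0, 1.0, 30.0)
ERR = ((0.0, 0.5), (0.0, 0.3), (0.0, 0.1))
exact_mean, exact_var = analytic_moments(PSI, 8, 1.5, 0.2, ERR)
sim_mean, sim_var = simulate(PSI, 8, 1.5, 0.2, ERR)
print("closed form:", exact_mean, exact_var)
print("simulation: ", sim_mean, sim_var)
```

With independent normal inputs the two pairs of numbers should agree up to sampling noise, which is the content of the proof above.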
6 Frequency Analysis
The zero-slack policy is used as the DTM strategy of the processor; that is, the speed of the processor is set to the value that brings the hotspot temperature exactly to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{f_{\mathrm{count}}}$; then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) < T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{count}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{count}},
\end{cases}
\tag{56}
$$

where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi)$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$ representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{\mathrm{count}}$, according to the zero-slack DTM policy, obviously

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}.
\tag{57}
$$
When $f_\pi < f_{\mathrm{count}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0;
\tag{58}
$$

(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
$$
The right-hand side of (59) can be derived by

$$
\begin{aligned}
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
$$
According to (59) and (60), we get

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{61}
$$
Hence, when $f_\pi < f_{\mathrm{count}}$, according to (58) and (61), we get

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{62}
$$
Therefore, according to Theorem 5, the probabilistic distribution over the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
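The distribution in (56) can be evaluated directly once the hotspot temperature at each frequency is taken to be normal, as Theorem 4 states. The sketch below is one possible way to do this, not the authors' implementation: the helper names `normal_cdf` and `frequency_distribution` are illustrative, and the input means and standard deviations are the values this paper reports in Table 1 for eight active cores, with the 100 °C threshold from Section 7.1.

```python
import math


def normal_cdf(x, mean, std):
    """P(X <= x) for X ~ N(mean, std^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def frequency_distribution(temp_params, t_valve):
    """Evaluate (56).

    temp_params maps each frequency to the (mean, std) of the active-core
    hotspot temperature at that frequency; t_valve is the threshold.
    """
    freqs = sorted(temp_params)
    # P{T_hota(n_a, f) <= T_valve} for every frequency in the set.
    below = {f: normal_cdf(t_valve, *temp_params[f]) for f in freqs}
    dist = {}
    for idx, f in enumerate(freqs):
        if idx == len(freqs) - 1:
            dist[f] = below[f]  # highest frequency
        else:
            dist[f] = below[f] - below[freqs[idx + 1]]
    return dist


# Table 1 of this paper, eight active cores: frequency -> (mean, std) in deg C.
params_8 = {
    0.5: (66.69, 4.48),
    0.67: (78.46, 6.00),
    0.83: (89.54, 7.43),
    1.0: (101.31, 8.95),
}
dist_8 = frequency_distribution(params_8, t_valve=100.0)
p_dfs = 1.0 - dist_8[1.0]  # probability that DFS lowers the frequency
for f, p in dist_8.items():
    print(f"P(f = {f}) = {p:.4f}")
print(f"P(DFS triggered) = {p_dfs:.4f}")
```

The probability of running at full speed that this produces is close to the 44.16% reported in Section 7.4; small differences are expected because the table values are rounded.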
Given the probabilistic distribution of the frequency and the mean hotspot temperature of the active core at each frequency, the average hotspot temperature of the active core, $T_{\mathrm{averhota}}$, can be obtained by

$$
T_{\mathrm{averhota}} = \sum_{f \in F_{\mathrm{Set}}} \mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f) \cdot P(f),
\tag{63}
$$

where $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T\mathrm{a}}(\mathbf{H},n,n_{\mathrm{a}},f)$ denotes the mean hotspot temperature of the active core given the frequency $f$.
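As a worked instance of (63), take the eight-active-core means from Table 1 of this paper and the frequency probabilities that (56) yields for those values at a 100 °C threshold. The probabilities below are rounded to four digits and are computed here for illustration, so the last digit of the result is approximate.

```latex
\[
\begin{aligned}
T_{\mathrm{averhota}}
&\approx 66.69 \times 0.0002 + 78.46 \times 0.0794 + 89.54 \times 0.4786 + 101.31 \times 0.4418 \\
&\approx 0.01 + 6.23 + 42.85 + 44.76 \\
&\approx 93.86\ {}^{\circ}\mathrm{C}
\end{aligned}
\]
```

This is in line with the 93.85 °C listed for eight active cores with DFS in Table 3.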
7 Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. Each core has two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies: 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector H, which corresponds to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C.
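The normalization and the role of the selection vector H can be sketched in a few lines. The frequencies are those stated above; the unit list and the temperatures are hypothetical placeholders (the source only says that the branch predictor is the second element of H), and the variable names are illustrative.

```python
# Discrete frequencies of the processor, in GHz (from the text above).
freqs_ghz = [1.5, 2.0, 2.5, 3.0]

# Normalize so that the maximum frequency maps to 1.
f_max = max(freqs_ghz)
normalized = [round(f / f_max, 2) for f in freqs_ghz]
print(normalized)

# Hypothetical functional-unit ordering; only the position of the branch
# predictor (second element) is taken from the text.
units = ["unit_0", "branch_predictor", "unit_2"]
H = [1 if name == "branch_predictor" else 0 for name in units]

# Hypothetical per-unit temperatures of one core, in deg C.
unit_temps = [70.0, 85.5, 72.3]

# H picks the hotspot temperature out of the temperature vector (H times T).
hotspot_temp = sum(h * t for h, t in zip(H, unit_temps))
print(hotspot_temp)
```

Because H has a single nonzero entry, the product simply selects the temperature of the branch predictor.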
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor: a task is not fixed on a core, and tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{E\mathrm{d}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta,
\tag{64}
$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of the processor. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E\mathrm{a}}^2$ and $\sigma_{E\mathrm{ina}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{leaa}}$ and $\mathbf{E}_{\mathrm{leaina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage power is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature interval.
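The effect of fitting a line only over 60 °C to 100 °C can be illustrated with an ordinary least-squares fit. This is a sketch under stated assumptions: `leakage_curve` is a hypothetical convex, increasing stand-in and not the PTScalar model in (64), whose default parameters are not given here; `fit_line` is an illustrative helper that returns the slope, the intercept, and the residual variance (the counterparts of the regression coefficients and of the residual variance described above).

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Returns (slope, intercept, residual_variance).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope, intercept, sum(r * r for r in residuals) / n


def leakage_curve(temp_c):
    """Hypothetical stand-in leakage curve in watts (not the paper's model)."""
    return 0.05 * math.exp(0.025 * temp_c)


# Fit only over the interval used in the paper, 60 to 100 deg C.
fit_temps = [float(t) for t in range(60, 101)]
slope, intercept, res_var = fit_line(fit_temps, [leakage_curve(t) for t in fit_temps])


def abs_error(temp_c):
    """Absolute error of the linear estimate at a given temperature."""
    return abs(leakage_curve(temp_c) - (slope * temp_c + intercept))


max_err_in = max(abs_error(t) for t in fit_temps)
err_out_low = abs_error(40.0)
err_out_high = abs_error(120.0)
print(f"slope={slope:.5f} W/degC, intercept={intercept:.5f} W, residual variance={res_var:.2e}")
print(f"max error inside 60-100: {max_err_in:.4f} W")
print(f"error at 40: {err_out_low:.4f} W, error at 120: {err_out_high:.4f} W")
```

For a convex curve such as this stand-in, the linear estimate is noticeably worse outside the fitted interval than inside it, which matches the trade-off the text describes.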
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 shows the comparison between the actual value and the estimated value of leakage power for active cores, and Figure 4 shows that for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60 °C and 100 °C, at the cost of lower accuracy in the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. The error rates of the dynamic power vary significantly across functional units: for the decoder the estimation error rate of dynamic power is only 2.62%, but for the floating point register (FPReg) it is 10.14%. The reason is that the linear correlation between the IPC and the dynamic power differs among functional units, and a lower error rate results from a higher correlation. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar: those for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. To account for the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{leaa}}$, and $\mathbf{E}_{\mathrm{leaina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores. [Plot of actual and estimated leakage power (W) against temperature (°C), 40 to 120 °C.]

Figure 4: Leakage power estimation of hotspot for inactive cores. [Plot of actual and estimated leakage power (×10⁻⁴ W) against temperature (°C), 40 to 120 °C.]

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

| Number of active cores | f = 0.5: μ (°C) | σ | f = 0.67: μ (°C) | σ | f = 0.83: μ (°C) | σ | f = 1: μ (°C) | σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

| Number of active cores | f = 0.5: μ (°C) | σ | f = 0.67: μ (°C) | σ | f = 0.83: μ (°C) | σ | f = 1: μ (°C) | σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |

Figure 5: Error rate of the power estimation. [Bar chart of error rate (%) for dynamic power, leakage power for active core, and leakage power for inactive core, over the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg.]
7.3. Probabilistic Distribution of Temperature. Under the assumption that no DTM techniques are used to control the temperature of the processor, the probabilistic distribution of the hotspot temperatures of both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. From the characteristics of the normal distribution curve, the shape of the probability density curve reflects the spread of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
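The link between peak height and standard deviation follows from the normal density evaluated at its mean. As a worked check, the two values below use the standard deviations that Table 1 of this paper reports for eight active cores at the lowest and highest frequencies; the arithmetic is added here for illustration.

```latex
\[
p(\mu) = \frac{1}{\sigma\sqrt{2\pi}}, \qquad
\left.p(\mu)\right|_{\sigma = 4.48} \approx 0.089, \qquad
\left.p(\mu)\right|_{\sigma = 8.95} \approx 0.045
\]
```

Both peaks fall within the 0 to 0.1 density range of the Figure 6 axes, and the curve for the higher frequency is the flatter one.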
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and the variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency lower than 1 implies that the hotspot temperature has surpassed the threshold and the DFS technique has been triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Each panel plots probability density (0 to 0.1) against temperature (20 to 140 °C), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and the DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
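Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
|---|---|---|---|---|
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |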
Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. [Each panel plots probability density (0 to 0.25) against temperature (20 to 140 °C), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0.
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed; that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without the DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with the DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 presents the comparative results between the active cores with and without the DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature will not exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with the DFS is lower than that without the DFS.

Figure 8: Probabilistic distribution of frequencies. [Bar chart of the probability of each frequency (%), for frequencies 1, 0.83, 0.67, and 0.5, against the number of active cores (6, 7, 8).]

Figure 9: Probability for triggering DFS. [Bar chart of the probability for triggering DFS (%), 0 to 60, against the number of active cores (6, 7, 8).]
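The "without DFS" exceedance probabilities can be reproduced from the normal model alone. The sketch below is an illustration rather than the authors' code: `exceed_probability` is a hypothetical helper, and the inputs are the full-speed (frequency 1) means and standard deviations that Table 1 of this paper reports for six, seven, and eight active cores, with the 100 °C threshold.

```python
import math


def exceed_probability(mean, std, threshold):
    """P(T > threshold) for T ~ N(mean, std^2), via the complementary error function."""
    return 0.5 * math.erfc((threshold - mean) / (std * math.sqrt(2.0)))


# Table 1 of this paper, frequency 1: active cores -> (mean, std) in deg C.
full_speed = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}

probs = {n: exceed_probability(m, s, 100.0) for n, (m, s) in full_speed.items()}
for n, p in probs.items():
    print(f"{n} active cores: P(T_hot > 100 degC) = {100.0 * p:.2f}%")
```

The results are close to the 0.52%, 12.37%, and 55.84% listed in Table 3; small deviations come from the rounding of the table entries.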
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of the DFS on the processor weakens until it disappears. Therefore, for the processors with and without the DFS, the average hotspot temperatures become closer as the number of active cores decreases, as shown in Table 3. Even if the DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature will not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger the DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without the DFS are the same. Since there is no difference between the cases with and without the DFS when the number of active cores is lower than 6, the comparisons for this situation are not given in Table 3.
8 Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows a normal probability distribution, based on the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic power vary significantly across functional units, indicating that the linear correlation between the IPC and the dynamic power differs among functional units; (2) the estimation error rates of leakage power for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage power can be obtained in the temperature interval between 60 °C and 100 °C, at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current Ph.D. work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, “Recent thermal management techniques for microprocessors,” ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, “Temperature-aware microarchitecture: modeling and implementation,” ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, “HotSpot: a compact thermal modeling methodology for early-stage VLSI design,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, “Parameterized architecture-level dynamic thermal models for multicore microprocessors,” ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, “Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors,” ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article no. 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, “Temperature-aware DVFS for hard real-time applications on multicore processors,” IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, “Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, “Improving the performance of thermally constrained multi core processors using DTM techniques,” in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES ’13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, “RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs,” IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, “Fast and accurate thermal modeling and simulation of many-core processors and workloads,” Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., “Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation,” IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, “Dynamic thermal management under soft thermal constraints,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, “Dynamic thermal management for multi-core microprocessors considering transient thermal effects,” in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC ’13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, “Thermal analysis of multiprocessor SoC applications by simulation and verification,” ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, “Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, “Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, “Exploring the effects of on-chip thermal variation on high-performance multicore architectures,” ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, “Power estimation for alpha 21264 using performance events and impact of ambient temperature,” International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, “Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, “Fast and accurate thermal simulation and modelling of workloads of many-core processors,” in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC ’11), September 2011.
[21] J. Lee and N. S. Kim, “Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating,” in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC ’09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, “Throughput of multi-core processors under thermal constraints,” in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED ’07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, “Analytical leakage-aware thermal modeling of a real-time system,” IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, “The alpha 21264 microprocessor,” IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, “Temperature and supply voltage aware performance and power modeling at microarchitecture level,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, “MiBench: a free, commercially representative embedded benchmark suite,” in Proceedings of the IEEE International Workshop of Workload Characterization (WWC ’01), Washington, DC, USA, 2001.
[27] J. L. Henning, “SPEC CPU2000: measuring CPU performance in the new millennium,” Computer, vol. 33, no. 7, pp. 28–35, 2000.
Figure 2: Thermal conduction matrix of the package [16]. (The matrix is partitioned into the blocks G_spr, G_spr-other, G_other-spr, and G_other-other, where "spr" refers to the spreader center and "other" to the spreader periphery and the heat sink.)
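$$\mathbf{G}'_{\text{int-spr}} \sum_{j=1}^{n} \mathbf{T}_{\text{int},j} + G_{\text{spr}} T_{\text{spr}} + \mathbf{G}'_{\text{spr-other}} \mathbf{T}_{\text{other}} = 0 \tag{6}$$

$$\mathbf{G}_{\text{spr-other}} T_{\text{spr}} + \mathbf{G}_{\text{other-other}} \mathbf{T}_{\text{other}} = \mathbf{P}_{\text{ambient}} \tag{7}$$

where $\mathbf{T}_{\text{int},i}$ is an $m$-dimensional vector representing the temperature of the TIM layer of the $i$th core, $T_{\text{spr}}$ is a scalar representing the temperature of the central block of the spreader, and $\mathbf{T}_{\text{other}}$ is a 13-dimensional vector representing the temperature of the other blocks of the spreader and the sink.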
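According to (6) and (7),

$$T_{\text{spr}} = \left(\mathbf{G}'_{\text{spr-other}} \mathbf{G}^{-1}_{\text{other-other}} \mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1} \cdot \left(\mathbf{G}'_{\text{int-spr}} \sum_{i=1}^{n} \mathbf{T}_{\text{int},i} + \mathbf{G}'_{\text{spr-other}} \mathbf{G}^{-1}_{\text{other-other}} \mathbf{P}_{\text{ambient}}\right) \tag{8}$$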
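Set

$$\begin{aligned}
\mathbf{G}_{\text{SPRO}} &= \left(\mathbf{G}'_{\text{spr-other}} \mathbf{G}^{-1}_{\text{other-other}} \mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1} \mathbf{G}_{\text{int-spr}}, \\
G_{\text{spamb}} &= \left(\mathbf{G}'_{\text{spr-other}} \mathbf{G}^{-1}_{\text{other-other}} \mathbf{G}_{\text{spr-other}} - G_{\text{spr}}\right)^{-1} \cdot \mathbf{G}'_{\text{spr-other}} \mathbf{G}^{-1}_{\text{other-other}} \mathbf{P}_{\text{ambient}}.
\end{aligned} \tag{9}$$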
Equation (8) can be converted into

$$T_{\text{spr}} = \mathbf{G}'_{\text{SPRO}} \sum_{i=1}^{n} \mathbf{T}_{\text{int},i} + G_{\text{spamb}} \tag{10}$$
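Substituting (10) into (5) gives

$$\mathbf{G}_{\text{die-int}} \mathbf{T}_i + \mathbf{G}_{\text{int}} \mathbf{T}_{\text{int},i} + \mathbf{G}_{\text{int-spr}} \left(\mathbf{G}'_{\text{SPRO}} \sum_{j=1}^{n} \mathbf{T}_{\text{int},j} + G_{\text{spamb}}\right) = 0 \tag{11}$$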
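According to (4) and (11),

$$\begin{aligned}
&\left(\mathbf{G}_{\text{die-int}} - \mathbf{G}_{\text{int}} \mathbf{G}^{-1}_{\text{die-int}} \mathbf{G}_{\text{die}}\right) \mathbf{T}_i - \mathbf{G}_{\text{int-spr}} \mathbf{G}'_{\text{SPRO}} \mathbf{G}^{-1}_{\text{die-int}} \mathbf{G}_{\text{die}} \sum_{j=1}^{n} \mathbf{T}_j \\
&\quad + \mathbf{G}_{\text{int}} \mathbf{G}^{-1}_{\text{die-int}} \mathbf{P}_i + \mathbf{G}_{\text{int-spr}} \mathbf{G}'_{\text{SPRO}} \mathbf{G}^{-1}_{\text{die-int}} \sum_{j=1}^{n} \mathbf{P}_j + \mathbf{G}_{\text{int-spr}} G_{\text{spamb}} = 0
\end{aligned} \tag{12}$$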
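Set

$$\begin{aligned}
\mathbf{G}_1 &= \mathbf{G}_{\text{die-int}} - \mathbf{G}_{\text{int}} \mathbf{G}^{-1}_{\text{die-int}} \mathbf{G}_{\text{die}}, \\
\mathbf{G}_2 &= \mathbf{G}_{\text{int-spr}} \mathbf{G}'_{\text{SPRO}} \mathbf{G}^{-1}_{\text{die-int}} \mathbf{G}_{\text{die}}, \\
\mathbf{G}_3 &= \mathbf{G}_{\text{int}} \mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_4 &= \mathbf{G}_{\text{int-spr}} \mathbf{G}'_{\text{SPRO}} \mathbf{G}^{-1}_{\text{die-int}}, \\
\mathbf{G}_5 &= \mathbf{G}_{\text{int-spr}} G_{\text{spamb}}.
\end{aligned} \tag{13}$$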
Then (12) can be transformed into

$$\mathbf{G}_1 \mathbf{T}_i - \mathbf{G}_2 \sum_{j=1}^{n} \mathbf{T}_j + \mathbf{G}_3 \mathbf{P}_i + \mathbf{G}_4 \sum_{j=1}^{n} \mathbf{P}_j + \mathbf{G}_5 = 0 \tag{14}$$
Summing (14) over all $n$ cores gives

$$\mathbf{G}_1 \sum_{j=1}^{n} \mathbf{T}_j - n\mathbf{G}_2 \sum_{j=1}^{n} \mathbf{T}_j + \mathbf{G}_3 \sum_{j=1}^{n} \mathbf{P}_j + n\mathbf{G}_4 \sum_{j=1}^{n} \mathbf{P}_j + n\mathbf{G}_5 = 0 \tag{15}$$
Solving (15) for the sum of temperatures gives

$$\sum_{j=1}^{n} \mathbf{T}_j = (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} (\mathbf{G}_3 + n\mathbf{G}_4) \sum_{j=1}^{n} \mathbf{P}_j + n (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} \mathbf{G}_5 \tag{16}$$
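Substituting (16) into (14) gives

$$\begin{aligned}
\mathbf{T}_{\text{die},i} ={}& \mathbf{G}_1^{-1} \left(\mathbf{G}_2 (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} (\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right) \sum_{j=1}^{n} \mathbf{P}_j - \mathbf{G}_1^{-1} \mathbf{G}_3 \mathbf{P}_i \\
&+ \mathbf{G}_1^{-1} \left(n\mathbf{G}_2 (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right) \mathbf{G}_5
\end{aligned} \tag{17}$$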
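Set

$$\begin{aligned}
\mathbf{R} &= -\mathbf{G}_1^{-1} \mathbf{G}_3, \\
\mathbf{Q} &= \mathbf{G}_1^{-1} \left(\mathbf{G}_2 (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} (\mathbf{G}_3 + n\mathbf{G}_4) - \mathbf{G}_4\right), \\
\mathbf{Z} &= \mathbf{G}_1^{-1} \left(n\mathbf{G}_2 (n\mathbf{G}_2 - \mathbf{G}_1)^{-1} - \mathbf{E}\right) \mathbf{G}_5.
\end{aligned} \tag{18}$$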
Then

$$\mathbf{T}_i = \mathbf{R} \mathbf{P}_i + \mathbf{Q} \sum_{j=1}^{n} \mathbf{P}_j + \mathbf{Z} \tag{19}$$
This means that, when the temperature of a multicore processor is in the steady state and the powers of all cores are given, the temperature of any core can be calculated at the microarchitecture level according to Lemma 1.
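The linear form (19) can be evaluated directly once the matrices are known. The following is a minimal sketch in plain Python, assuming small hypothetical values for R, Q, Z and the per-core power vectors (two functional units, three cores); the helper names `mat_vec`, `vec_add`, and `core_temperatures` are illustrative and do not come from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(a: Matrix, x: Vector) -> Vector:
    """Multiply an m x m matrix by an m-vector."""
    return [sum(a_rc * x_c for a_rc, x_c in zip(row, x)) for row in a]


def vec_add(*vs: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vs)]


def core_temperatures(r: Matrix, q: Matrix, z: Vector,
                      powers: List[Vector]) -> List[Vector]:
    """Evaluate T_i = R P_i + Q * sum_j P_j + Z for every core i, as in (19)."""
    total_power = vec_add(*powers)       # sum_j P_j, shared by all cores
    coupling = mat_vec(q, total_power)   # Q * sum_j P_j
    return [vec_add(mat_vec(r, p_i), coupling, z) for p_i in powers]


# Hypothetical 2-unit, 3-core example (values chosen only for illustration).
R = [[2.0, 0.5], [0.5, 2.0]]
Q = [[0.1, 0.0], [0.0, 0.1]]
Z = [45.0, 45.0]
P = [[10.0, 5.0], [0.0, 0.0], [4.0, 6.0]]
T = core_temperatures(R, Q, Z, P)
```

Note that the second core dissipates no power in this example, yet its temperature is above Z because of the coupling term Q Σ P_j contributed by the other cores.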
4. Power Model

It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. Global dynamic frequency scaling (DFS) is used, in which the frequencies of all cores are scaled uniformly.
4.1. Active Mode. When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as

$$\mathbf{P}_{a,i} = \mathbf{P}_{\text{dyna},i} + \mathbf{P}_{\text{leaa},i} \tag{20}$$

where $\mathbf{P}_{a,i}$, $\mathbf{P}_{\text{dyna},i}$, and $\mathbf{P}_{\text{leaa},i}$ are, respectively, the total power, dynamic power, and leakage power of the $i$th active core, each of size $m \times 1$.
For a processor using DVFS, the dynamic power is proportional to the product of the square of the voltage and the frequency [7], that is, $P_{\text{dyn}} \propto \text{voltage}^2 \times \text{frequency}$. The primary purpose of this paper is to analyze the impact of workload variation on processor temperatures. To simplify the analysis, DFS rather than DVFS is used to manage the processor temperature, so the voltage is constant and only the frequency is adjusted. Hence the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with the IPC.
Let $X_{a,i}$ be the IPC of the $i$th core when workloads are running on it, and let $f$ be the normalized frequency between 0 and 1. Then the microarchitecture-level dynamic power $\mathbf{P}_{\text{dyna},i}$ of the $i$th active core is defined as

$$\mathbf{P}_{\text{dyna},i} = \left(\mathbf{G}_{\text{dyn}} X_{a,i} + \mathbf{G}_{d0} + \mathbf{E}_{\text{dyn}}\right) f \tag{21}$$

where $\mathbf{G}_{\text{dyn}}$ and $\mathbf{G}_{d0}$ are the linear regression coefficients obtained when $f$ is set to its maximum of 1, and $\mathbf{E}_{\text{dyn}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{dyn}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{Ed})$. $\mathbf{G}_{\text{dyn}}$, $\mathbf{G}_{d0}$, $\mathbf{E}_{\text{dyn}}$, and $\boldsymbol{\sigma}_{Ed}$ are vectors of size $m \times 1$, and $\boldsymbol{\sigma}^2_{Ed}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ed}$.
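As an illustration of (21), the sketch below evaluates the per-unit dynamic power for a given IPC and normalized frequency. The coefficient values are hypothetical (three functional units), the residual is passed in explicitly so that it can be set to zero or sampled, and the function name `dynamic_power` is an assumption made for this example.

```python
from typing import List


def dynamic_power(g_dyn: List[float], g_d0: List[float], ipc: float,
                  f: float, residual: List[float]) -> List[float]:
    """Per-unit dynamic power (G_dyn * X + G_d0 + E_dyn) * f, with 0 <= f <= 1."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f is a normalized frequency and must lie in [0, 1]")
    return [(g * ipc + g0 + e) * f for g, g0, e in zip(g_dyn, g_d0, residual)]


# Hypothetical coefficients for three functional units (illustration only).
G_DYN = [2.0, 1.0, 0.5]
G_D0 = [1.0, 0.5, 0.25]
NO_RESIDUAL = [0.0, 0.0, 0.0]

full_speed = dynamic_power(G_DYN, G_D0, ipc=1.5, f=1.0, residual=NO_RESIDUAL)
half_speed = dynamic_power(G_DYN, G_D0, ipc=1.5, f=0.5, residual=NO_RESIDUAL)
```

Because the voltage is held constant under DFS, halving `f` halves every component of the dynamic power in this model.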
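At the microarchitecture level, in order to simplify the analysis, the relationship between the leakage power $\mathbf{P}_{\text{leaa},i}$ and the temperature $\mathbf{T}_{a,i}$ of the $i$th active core is approximated by the linear model

$$\mathbf{P}_{\text{leaa},i} = \mathbf{G}_{\text{leaa}} \mathbf{T}_{a,i} + \mathbf{G}_{l0\text{a}} + \mathbf{E}_{\text{leaa}} \tag{22}$$

where $\mathbf{G}_{\text{leaa}}$ and $\mathbf{G}_{l0\text{a}}$ are the regression coefficients and $\mathbf{E}_{\text{leaa}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{leaa}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\text{a}})$. $\mathbf{G}_{\text{leaa}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0\text{a}}$, $\mathbf{E}_{\text{leaa}}$, $\boldsymbol{\mu}_{E\text{a}}$, and $\boldsymbol{\sigma}_{E\text{a}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\text{a}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\text{a}}$.

According to (20), (21), and (22), the power of the active core $\mathbf{P}_{a,i}$ is represented by

$$\mathbf{P}_{a,i} = \left(\mathbf{G}_{\text{dyn}} X_{a,i} + \mathbf{G}_{d0} + \mathbf{E}_{\text{dyn}}\right) f + \mathbf{G}_{\text{leaa}} \mathbf{T}_{a,i} + \mathbf{G}_{l0\text{a}} + \mathbf{E}_{\text{leaa}} \tag{23}$$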
4.2. Inactive Mode. When a core is in the inactive mode, it is powered off using the power-gating technique. The inactive core dissipates only leakage power, which is much lower than that of the active core. Hence the power of the $i$th inactive core $\mathbf{P}_{\text{ina},i}$ is the leakage power $\mathbf{P}_{\text{leaina},i}$, which depends on the core’s temperature $\mathbf{T}_{\text{ina},i}$, and it can be approximated by the following linear model at the microarchitecture level:

$$\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{leaina},i} = \mathbf{G}_{\text{leaina}} \mathbf{T}_{\text{ina},i} + \mathbf{G}_{l0\text{ina}} + \mathbf{E}_{\text{leaina}} \tag{24}$$

where $\mathbf{G}_{\text{leaina}}$ and $\mathbf{G}_{l0\text{ina}}$ are regression coefficients and $\mathbf{E}_{\text{leaina}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\text{leaina}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^2_{E\text{ina}})$. $\mathbf{G}_{\text{leaina}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0\text{ina}}$, $\mathbf{E}_{\text{leaina}}$, and $\boldsymbol{\sigma}_{E\text{ina}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^2_{E\text{ina}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\text{ina}}$.
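The two power states can be sketched side by side. The example below is a simplified illustration that sets all regression residuals to zero and stores each diagonal leakage matrix as a list of its diagonal entries; the numeric coefficients and the function names `leakage_power` and `core_power` are hypothetical.

```python
from typing import List, Optional


def leakage_power(g_lea_diag: List[float], g_l0: List[float],
                  temps: List[float]) -> List[float]:
    """Linearized leakage G_lea * T + G_l0, with G_lea stored as its diagonal."""
    return [g * t + g0 for g, g0, t in zip(g_lea_diag, g_l0, temps)]


def core_power(temps: List[float], lea_diag: List[float], lea_l0: List[float],
               dyn: Optional[List[float]] = None) -> List[float]:
    """Per-unit power: leakage only when inactive, dynamic plus leakage when active."""
    lea = leakage_power(lea_diag, lea_l0, temps)
    if dyn is None:                       # inactive (power-gated) core, as in (24)
        return lea
    return [d + l for d, l in zip(dyn, lea)]   # active core, as in (20)


# Hypothetical two-unit example with residuals set to zero.
active = core_power([80.0, 70.0], [0.02, 0.01], [0.5, 0.2], dyn=[4.0, 2.0])
inactive = core_power([50.0, 50.0], [0.002, 0.001], [0.05, 0.02])
```

The separate coefficient sets for the two calls mirror the separate regressions for active and inactive cores in (22) and (24).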
5. Thermal Analysis

Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same, that is, $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina},j}$ and $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina},j}$ for $\forall i, j$ $(i \neq j)$, where $\mathbf{T}_{\text{ina},i}$ and $\mathbf{P}_{\text{ina},i}$ are the temperatures and the powers of the $i$th inactive core, respectively.
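Proof. According to (3), for any inactive cores $i$ and $j$,

$$\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = \mathbf{R} \left(\mathbf{P}_{\text{ina},i} - \mathbf{P}_{\text{ina},j}\right) \tag{25}$$

Substituting (24) into (25) gives

$$\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = \mathbf{R} \mathbf{G}_{\text{leaina}} \left(\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j}\right) \tag{26}$$

According to (26),

$$\left(\mathbf{E} - \mathbf{R} \mathbf{G}_{\text{leaina}}\right) \left(\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j}\right) = 0 \tag{27}$$

The parameters $\mathbf{R}$ and $\mathbf{G}_{\text{leaina}}$ are invertible matrices, so $\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaina}}$ is an invertible matrix. Therefore $\mathbf{T}_{\text{ina},i} - \mathbf{T}_{\text{ina},j} = 0$, that is, $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina},j}$. $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina},j}$ can then also be obtained according to (24).

For convenience, set $\mathbf{T}_{\text{ina},i} = \mathbf{T}_{\text{ina}}$ and $\mathbf{P}_{\text{ina},i} = \mathbf{P}_{\text{ina}}$. Then (24) can be simplified into

$$\mathbf{P}_{\text{ina}} = \mathbf{G}_{\text{leaina}} \mathbf{T}_{\text{ina}} + \mathbf{G}_{l0\text{ina}} + \mathbf{E}_{\text{leaina}} \tag{28}$$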
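Lemma 2 can be checked numerically on a scalar analogue of the model (one functional unit per core, residuals set to zero). The sketch below is only an illustration under these assumptions: it finds the steady state by fixed-point iteration rather than by the closed-form derivation used in the paper, and all coefficient values and the function name `steady_state` are hypothetical.

```python
from typing import List


def steady_state(active: List[bool], p_dyn: List[float], r: float, q: float,
                 z: float, g_a: float, g0_a: float, g_ina: float,
                 g0_ina: float, t_init: List[float],
                 iters: int = 200) -> List[float]:
    """Iterate T_i = r*P_i + q*sum_j P_j + z with temperature-dependent leakage."""
    temps = list(t_init)
    for _ in range(iters):
        powers = [
            (p_dyn[i] + g_a * temps[i] + g0_a) if active[i]
            else (g_ina * temps[i] + g0_ina)
            for i in range(len(temps))
        ]
        total = sum(powers)
        temps = [r * p + q * total + z for p in powers]
    return temps


# Two active and two inactive cores, started from deliberately different temperatures.
T_SS = steady_state(
    active=[True, True, False, False], p_dyn=[10.0, 6.0, 0.0, 0.0],
    r=2.0, q=0.1, z=45.0, g_a=0.02, g0_a=0.5, g_ina=0.005, g0_ina=0.1,
    t_init=[90.0, 60.0, 30.0, 100.0],
)
```

With these values the iteration is a contraction, so the two inactive cores end at the same temperature regardless of their starting points, while the two active cores differ because their dynamic powers differ.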
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_1, h_2, \ldots, h_m]$ be the selection vector of the hotspot, in which exactly one element is set to 1 and the others are 0, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_1(\mathbf{H}, n, n_a, f)$, $\psi_2(\mathbf{H}, n, n_a, f)$, $\psi_3(\mathbf{H}, n, n_a, f)$, $\psi_4(\mathbf{H}, n, n_a, f)$, $\psi_5(\mathbf{H}, n, n_a, f)$, and $\psi_6(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hota},i}$ of the $i$th active core can be formulated by

$$\begin{aligned}
T_{\text{hota},i} ={}& \psi_1(\mathbf{H}, n, n_a, f) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1}^{n_a} X_{a,k} + \psi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaa}} + \psi_5(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{29}$$

There exist functions $\xi_1(\mathbf{H}, n, n_a, f)$, $\xi_2(\mathbf{H}, n, n_a, f)$, $\xi_3(\mathbf{H}, n, n_a, f)$, $\xi_4(\mathbf{H}, n, n_a, f)$, and $\xi_5(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hotina}}$ of the inactive core can be formulated by

$$\begin{aligned}
T_{\text{hotina}} ={}& \xi_1(\mathbf{H}, n, n_a, f) \sum_{k=1}^{n_a} X_{a,k} + \xi_2(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}} + \xi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaa}} \\
&+ \xi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}} + \xi_5(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{30}$$
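Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\text{ina}}$ can be derived as

$$\begin{aligned}
\mathbf{T}_{\text{ina}} ={}& \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} \mathbf{Q} \mathbf{G}_{\text{leaa}} \sum_{k=1}^{n_a} \mathbf{T}_{a,k} \\
&+ f \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} \mathbf{Q} \mathbf{G}_{\text{dyn}} \sum_{k=1}^{n_a} X_{a,k} \\
&+ f n_a \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} \mathbf{Q} \mathbf{E}_{\text{dyn}} \\
&+ n_a \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} \mathbf{Q} \mathbf{E}_{\text{leaa}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{E}_{\text{leaina}} \\
&+ \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1} \left((\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{l0\text{ina}} + f n_a \mathbf{Q} \mathbf{G}_{d0} + n_a \mathbf{Q} \mathbf{G}_{l0\text{a}} + \mathbf{Z}\right)
\end{aligned} \tag{31}$$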
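Let

$$\begin{aligned}
\boldsymbol{\Phi}_1(n, n_a, f) &= (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{l0\text{ina}} + f n_a \mathbf{Q} \mathbf{G}_{d0} + n_a \mathbf{Q} \mathbf{G}_{l0\text{a}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_2(n, n_a) &= \left(\mathbf{E} - (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{G}_{\text{leaina}}\right)^{-1}.
\end{aligned} \tag{32}$$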
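Then (31) can be transformed into

$$\begin{aligned}
\mathbf{T}_{\text{ina}} ={}& \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} \sum_{k=1}^{n_a} \mathbf{T}_{a,k} + f \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}} \sum_{k=1}^{n_a} X_{a,k} \\
&+ f n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{E}_{\text{dyn}} + n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{E}_{\text{leaa}} \\
&+ \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q}) \mathbf{E}_{\text{leaina}} + \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f)
\end{aligned} \tag{33}$$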
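According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{a,i}$ can be derived as

$$\begin{aligned}
\mathbf{T}_{a,i} ={}& (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \mathbf{Q} \mathbf{G}_{\text{leaa}} \sum_{k=1}^{n_a} \mathbf{T}_{a,k} + f (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \mathbf{Q} \mathbf{G}_{\text{dyn}} \sum_{k=1}^{n_a} X_{a,k} \\
&+ f (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \mathbf{R} \mathbf{G}_{\text{dyn}} X_{a,i} + (n - n_a) (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \mathbf{Q} \mathbf{G}_{\text{leaina}} \mathbf{T}_{\text{ina}} \\
&+ f (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} (\mathbf{R} + n_a \mathbf{Q}) \mathbf{E}_{\text{dyn}} + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} (\mathbf{R} + n_a \mathbf{Q}) \mathbf{E}_{\text{leaa}} \\
&+ (n - n_a) (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \mathbf{Q} \mathbf{E}_{\text{leaina}} \\
&+ (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1} \left(f \mathbf{R} \mathbf{G}_{d0} + n_a f \mathbf{Q} \mathbf{G}_{d0} + \mathbf{R} \mathbf{G}_{l0\text{a}} + n_a \mathbf{Q} \mathbf{G}_{l0\text{a}} + (n - n_a) \mathbf{Q} \mathbf{G}_{l0\text{ina}} + \mathbf{Z}\right)
\end{aligned} \tag{34}$$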
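Let

$$\begin{aligned}
\boldsymbol{\Phi}_3(n, n_a, f) &= f \mathbf{R} \mathbf{G}_{d0} + n_a f \mathbf{Q} \mathbf{G}_{d0} + \mathbf{R} \mathbf{G}_{l0\text{a}} + n_a \mathbf{Q} \mathbf{G}_{l0\text{a}} + (n - n_a) \mathbf{Q} \mathbf{G}_{l0\text{ina}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_4 &= (\mathbf{E} - \mathbf{R}\mathbf{G}_{\text{leaa}})^{-1}.
\end{aligned} \tag{35}$$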
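Then (34) can be transformed into

$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaa}} \sum_{k=1}^{n_a} \mathbf{T}_{a,k} + f \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{dyn}} \sum_{k=1}^{n_a} X_{a,k} + f \boldsymbol{\Phi}_4 \mathbf{R} \mathbf{G}_{\text{dyn}} X_{a,i} \\
&+ (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \mathbf{T}_{\text{ina}} + f \boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) \mathbf{E}_{\text{dyn}} + \boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) \mathbf{E}_{\text{leaa}} \\
&+ (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{E}_{\text{leaina}} + \boldsymbol{\Phi}_4 \boldsymbol{\Phi}_3(n, n_a, f)
\end{aligned} \tag{36}$$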
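Substituting (33) into (36) gives

$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \left(\boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaa}} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}}\right) \sum_{k=1}^{n_a} \mathbf{T}_{a,k} \\
&+ f \left(\boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{dyn}} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}}\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ f \boldsymbol{\Phi}_4 \mathbf{R} \mathbf{G}_{\text{dyn}} X_{a,i} \\
&+ f \left(\boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) + n_a (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{dyn}} \\
&+ \left(\boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) + n_a (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{leaa}} \\
&+ \left((n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q})\right) \mathbf{E}_{\text{leaina}} \\
&+ (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f) + \boldsymbol{\Phi}_4 \boldsymbol{\Phi}_3(n, n_a, f)
\end{aligned} \tag{37}$$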
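Let

$$\begin{aligned}
\boldsymbol{\Phi}_5(n, n_a) &= \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaa}} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}}, \\
\boldsymbol{\Phi}_6(n, n_a, f) &= f \left(\boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{dyn}} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}}\right), \\
\boldsymbol{\Phi}_7(f) &= f \boldsymbol{\Phi}_4 \mathbf{R} \mathbf{G}_{\text{dyn}}, \\
\boldsymbol{\Phi}_8(n, n_a, f) &= f \left(\boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) + n_a (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right), \\
\boldsymbol{\Phi}_9(n, n_a) &= \boldsymbol{\Phi}_4 (\mathbf{R} + n_a \mathbf{Q}) + n_a (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}, \\
\boldsymbol{\Phi}_{10}(n, n_a) &= (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} + (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q}), \\
\boldsymbol{\Phi}_{11}(n, n_a, f) &= (n - n_a) \boldsymbol{\Phi}_4 \mathbf{Q} \mathbf{G}_{\text{leaina}} \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f) + \boldsymbol{\Phi}_4 \boldsymbol{\Phi}_3(n, n_a, f).
\end{aligned} \tag{38}$$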
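Then (37) is transformed into

$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \boldsymbol{\Phi}_5(n, n_a) \sum_{k=1}^{n_a} \mathbf{T}_{a,k} + \boldsymbol{\Phi}_6(n, n_a, f) \sum_{k=1}^{n_a} X_{a,k} + \boldsymbol{\Phi}_7(f) X_{a,i} + \boldsymbol{\Phi}_8(n, n_a, f) \mathbf{E}_{\text{dyn}} \\
&+ \boldsymbol{\Phi}_9(n, n_a) \mathbf{E}_{\text{leaa}} + \boldsymbol{\Phi}_{10}(n, n_a) \mathbf{E}_{\text{leaina}} + \boldsymbol{\Phi}_{11}(n, n_a, f)
\end{aligned} \tag{39}$$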
Summing (39) over the $n_a$ active cores and solving for the sum of temperatures gives

$$\begin{aligned}
\sum_{k=1}^{n_a} \mathbf{T}_{a,k} ={}& (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ n_a (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) \mathbf{E}_{\text{dyn}} \\
&+ n_a (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) \mathbf{E}_{\text{leaa}} \\
&+ n_a (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) \mathbf{E}_{\text{leaina}} \\
&+ n_a (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f)
\end{aligned} \tag{40}$$
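The elimination used to pass from (39) to (40) and back can be illustrated on a scalar analogue, in which each temperature depends on the sum of all temperatures plus a core-specific constant. The sketch below assumes a hypothetical scalar coupling `phi5` and constants `c`, which stand in for the matrix Φ5 and for the remaining terms of (39); it is an illustration of the algebra only.

```python
from typing import List


def solve_coupled(phi5: float, c: List[float]) -> List[float]:
    """Solve T_i = phi5 * sum_k T_k + c_i for all i (scalar analogue of (39)-(41))."""
    n_a = len(c)
    denom = 1.0 - n_a * phi5
    if denom == 0.0:
        raise ValueError("1 - n_a * phi5 must be nonzero")
    total = sum(c) / denom                     # scalar analogue of (40)
    return [phi5 * total + c_i for c_i in c]   # back-substitution into (39)


# Hypothetical coupling and constants for three active cores.
PHI5 = 0.05
C = [60.0, 50.0, 40.0]
T_COUPLED = solve_coupled(PHI5, C)
```

Summing the defining relation over all cores gives a single equation in the total, which is why only the factor (1 − n_a·phi5) has to be inverted; in the matrix case this factor becomes E − n_a Φ5.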
Substituting (40) into (39) gives

$$\begin{aligned}
\mathbf{T}_{a,i} ={}& \boldsymbol{\Phi}_7(f) X_{a,i} + \left(\boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_a, f)\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + \boldsymbol{\Phi}_8(n, n_a, f)\right) \mathbf{E}_{\text{dyn}} \\
&+ \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + \boldsymbol{\Phi}_9(n, n_a)\right) \mathbf{E}_{\text{leaa}} \\
&+ \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_{10}(n, n_a)\right) \mathbf{E}_{\text{leaina}} \\
&+ n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_{11}(n, n_a, f)
\end{aligned} \tag{41}$$
The hotspot temperature $T_{\text{hota},i}$ of the active core can be given by

$$\begin{aligned}
T_{\text{hota},i} = \mathbf{H} \mathbf{T}_{a,i} ={}& \mathbf{H} \boldsymbol{\Phi}_7(f) X_{a,i} \\
&+ \mathbf{H} \left(\boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_a, f)\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + \boldsymbol{\Phi}_8(n, n_a, f)\right) \mathbf{E}_{\text{dyn}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + \boldsymbol{\Phi}_9(n, n_a)\right) \mathbf{E}_{\text{leaa}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_{10}(n, n_a)\right) \mathbf{E}_{\text{leaina}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_{11}(n, n_a, f)\right)
\end{aligned} \tag{42}$$
Let

$$\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H} \boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(\boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + \boldsymbol{\Phi}_6(n, n_a, f)\right), \\
\psi_3(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + \boldsymbol{\Phi}_8(n, n_a, f)\right), \\
\psi_4(\mathbf{H}, n, n_a) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + \boldsymbol{\Phi}_9(n, n_a)\right), \\
\psi_5(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_{10}(n, n_a)\right), \\
\psi_6(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_5(n, n_a) (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_{11}(n, n_a, f)\right).
\end{aligned} \tag{43}$$
Then (42) is transformed into

$$\begin{aligned}
T_{\text{hota},i} ={}& \psi_1(\mathbf{H}, f) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1}^{n_a} X_{a,k} + \psi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}} + \psi_4(\mathbf{H}, n, n_a) \mathbf{E}_{\text{leaa}} \\
&+ \psi_5(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{44}$$
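Substituting (40) into (33) gives

$$\begin{aligned}
\mathbf{T}_{\text{ina}} ={}& \left(\boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + f \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}}\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + f n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{dyn}} \\
&+ \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{leaa}} \\
&+ \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q})\right) \mathbf{E}_{\text{leaina}} \\
&+ n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f)
\end{aligned} \tag{45}$$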
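The hotspot temperature $T_{\text{hotina}}$ of the inactive core is given by

$$\begin{aligned}
T_{\text{hotina}} = \mathbf{H} \mathbf{T}_{\text{ina}} ={}& \mathbf{H} \left(\boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + f \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}}\right) \sum_{k=1}^{n_a} X_{a,k} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + f n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{dyn}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right) \mathbf{E}_{\text{leaa}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q})\right) \mathbf{E}_{\text{leaina}} \\
&+ \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f)\right)
\end{aligned} \tag{46}$$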
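Let

$$\begin{aligned}
\xi_1(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(\boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \left(n_a \boldsymbol{\Phi}_6(n, n_a, f) + \boldsymbol{\Phi}_7(f)\right) + f \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{dyn}}\right), \\
\xi_2(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_8(n, n_a, f) + f n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right), \\
\xi_3(\mathbf{H}, n, n_a) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_9(n, n_a) + n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q}\right), \\
\xi_4(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{10}(n, n_a) + \boldsymbol{\Phi}_2(n, n_a) (\mathbf{R} + (n - n_a)\mathbf{Q})\right), \\
\xi_5(\mathbf{H}, n, n_a, f) &= \mathbf{H} \left(n_a \boldsymbol{\Phi}_2(n, n_a) \mathbf{Q} \mathbf{G}_{\text{leaa}} (\mathbf{E} - n_a \boldsymbol{\Phi}_5(n, n_a))^{-1} \boldsymbol{\Phi}_{11}(n, n_a, f) + \boldsymbol{\Phi}_2(n, n_a) \boldsymbol{\Phi}_1(n, n_a, f)\right).
\end{aligned} \tag{47}$$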
Then (46) is transformed into

$$\begin{aligned}
T_{\text{hotina}} ={}& \xi_1(\mathbf{H}, n, n_a, f) \sum_{k=1}^{n_a} X_{a,k} + \xi_2(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}} + \xi_3(\mathbf{H}, n, n_a) \mathbf{E}_{\text{leaa}} \\
&+ \xi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}} + \xi_5(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{48}$$
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all active cores.
Theorem 4. Suppose that (a) the random variable $X_{a,i}$ for the IPC follows the normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{a,i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{a,i}$ and $X_{a,j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{a,i}$ and $X_{a,j}$ follow the same normal distribution. Then there exist functions $\mu_{T\text{a}}(\mathbf{H}, n, n_a, f)$ and $\sigma^2_{T\text{a}}(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hota},i}$ of the active core follows the normal distribution with mean $\mu_{T\text{a}}(\mathbf{H}, n, n_a, f)$ and variance $\sigma^2_{T\text{a}}(\mathbf{H}, n, n_a, f)$, that is,

$$T_{\text{hota},i} \sim N\left(\mu_{T\text{a}}(\mathbf{H}, n, n_a, f),\ \sigma^2_{T\text{a}}(\mathbf{H}, n, n_a, f)\right) \tag{49}$$

There exist functions $\mu_{T\text{ina}}(\mathbf{H}, n, n_a, f)$ and $\sigma^2_{T\text{ina}}(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\text{hotina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\text{ina}}(\mathbf{H}, n, n_a, f)$ and variance $\sigma^2_{T\text{ina}}(\mathbf{H}, n, n_a, f)$, that is,

$$T_{\text{hotina}} \sim N\left(\mu_{T\text{ina}}(\mathbf{H}, n, n_a, f),\ \sigma^2_{T\text{ina}}(\mathbf{H}, n, n_a, f)\right) \tag{50}$$
Proof. According to (29),

$$\begin{aligned}
T_{\text{hota},i} ={}& \left(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)\right) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{\substack{k=1 \\ k \neq i}}^{n_a} X_{a,k} \\
&+ \psi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}} + \psi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaa}} + \psi_5(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{51}$$
The normally distributed random variables $X_{a,i}$ and $X_{a,j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $\left(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)\right) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1, k \neq i}^{n_a} X_{a,k}$ still follows the normal distribution.

All elements $E_{\text{dyn},k}$ of the random vector $\mathbf{E}_{\text{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}}$ of all elements of $\mathbf{E}_{\text{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaa}}$ and $\psi_5(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}}$ also follow the normal distribution. The terms $\left(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)\right) X_{a,i}$, $\psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1, k \neq i}^{n_a} X_{a,k}$, $\psi_3(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{dyn}}$, $\psi_4(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaa}}$, and $\psi_5(\mathbf{H}, n, n_a, f) \mathbf{E}_{\text{leaina}}$ all follow the normal distribution and are mutually independent, so $T_{\text{hota},i}$ follows the normal distribution, where the mean $\mu_{T\text{a}}(\mathbf{H}, n, n_a, f)$ is calculated by

$$\begin{aligned}
\mu_{T\text{a}}(\mathbf{H}, n, n_a, f) ={}& \left(\psi_1(\mathbf{H}, f) + n_a \psi_2(\mathbf{H}, n, n_a, f)\right) \mu_X + \psi_3(\mathbf{H}, n, n_a, f) \boldsymbol{\mu}_{Ed} + \psi_4(\mathbf{H}, n, n_a) \boldsymbol{\mu}_{E\text{a}} \\
&+ \psi_5(\mathbf{H}, n, n_a, f) \boldsymbol{\mu}_{E\text{ina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{52}$$
and the variance $\sigma^2_{T\text{a}}(\mathbf{H}, n, n_a, f)$ is calculated by

$$\begin{aligned}
\sigma^2_{T\text{a}}(\mathbf{H}, n, n_a, f) ={}& \left(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_a, f)\right)^2 \sigma_X^2 + (n_a - 1) \psi_2^2(\mathbf{H}, n, n_a, f) \sigma_X^2 \\
&+ \psi_3^2(\mathbf{H}, n, n_a, f) \boldsymbol{\sigma}^2_{Ed} + \psi_4^2(\mathbf{H}, n, n_a) \boldsymbol{\sigma}^2_{E\text{a}} + \psi_5^2(\mathbf{H}, n, n_a, f) \boldsymbol{\sigma}^2_{E\text{ina}}
\end{aligned} \tag{53}$$
In the same way, $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}}$, and $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot}s}$ follows the normal distribution, where the mean $\mu_{Ts}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\mu_{Ts}(\mathbf{H}, n, n_{\mathrm{a}}, f) = {} & n_{\mathrm{a}}\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_X + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{Ed} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}})\mu_{E\mathrm{a}} \\
& + \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{E\mathrm{ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)
\end{aligned} \tag{54}
$$

and the variance $\sigma_{Ts}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\sigma_{Ts}^2(\mathbf{H}, n, n_{\mathrm{a}}, f) = {} & n_{\mathrm{a}}\xi_1^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_X^2 + \xi_2^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{Ed}^2 \\
& + \xi_3^2(\mathbf{H}, n, n_{\mathrm{a}})\sigma_{E\mathrm{a}}^2 + \xi_4^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma_{E\mathrm{ina}}^2
\end{aligned} \tag{55}
$$

According to Theorem 4, the hotspot temperature of any core follows the normal probabilistic distribution.
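The mean and variance in (52) and (53) can be sketched in a few lines of Python. This is only an illustration: the function name `active_hotspot_stats` is hypothetical, every coefficient is treated as a plain scalar (in the paper the $\psi$ terms depend on $\mathbf{H}$, $n$, $n_{\mathrm{a}}$, and $f$, and the residuals are vectors), and the numbers in the usage lines are made up rather than taken from the experiments.

```python
def active_hotspot_stats(psi, n_a, mu_x, var_x, mu_e, var_e):
    """Mean and variance of the active-core hotspot temperature.

    psi   : (psi1, ..., psi6), assumed here to be scalars already
            evaluated for a fixed H, n, n_a and f (an assumption).
    mu_e  : (mu_Ed, mu_Ea, mu_Eina), means of the residual terms.
    var_e : (var_Ed, var_Ea, var_Eina), variances of the residual terms.
    """
    psi1, psi2, psi3, psi4, psi5, psi6 = psi
    mu_ed, mu_ea, mu_eina = mu_e
    var_ed, var_ea, var_eina = var_e

    # Equation (52): the mean of a linear combination is the same
    # linear combination of the means, plus the constant psi6.
    mean = ((psi1 + n_a * psi2) * mu_x
            + psi3 * mu_ed + psi4 * mu_ea + psi5 * mu_eina + psi6)

    # Equation (53): for independent terms the variances add, each
    # weighted by the square of its coefficient.
    variance = ((psi1 + psi2) ** 2 * var_x
                + (n_a - 1) * psi2 ** 2 * var_x
                + psi3 ** 2 * var_ed
                + psi4 ** 2 * var_ea
                + psi5 ** 2 * var_eina)
    return mean, variance


# Hypothetical values, chosen only so the arithmetic is easy to follow.
mean, variance = active_hotspot_stats(
    psi=(2.0, 1.0, 1.0, 1.0, 1.0, 10.0), n_a=3,
    mu_x=1.0, var_x=0.25,
    mu_e=(0.0, 0.0, 0.0), var_e=(1.0, 1.0, 1.0))
print(mean, variance)
```

The same pattern applies to the inactive-core expressions (54) and (55), with the $\xi$ coefficients in place of the $\psi$ coefficients.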
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of the processor; that is, the speed of the processor is set to the value that makes the hotspot temperature equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases, no frequency in the set of frequencies makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{count}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{\mathrm{count}}$; then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{count}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{count}},
\end{cases} \tag{56}
$$

where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi)$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$, representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{\mathrm{count}}$, according to the zero-slack DTM policy, obviously

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}. \tag{57}
$$

When $f_\pi < f_{\mathrm{count}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0; \tag{58}
$$

(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
& P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned} \tag{59}
$$

The right-hand side of (59) can be derived by

$$
\begin{aligned}
& P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
& \qquad - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
& P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned} \tag{60}
$$

According to (59) and (60), we get

$$
\begin{aligned}
& P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned} \tag{61}
$$

Hence, when $f_\pi < f_{\mathrm{count}}$, according to (58) and (61), we get

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{62}
$$

Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained based on the assumption that the zero-slack policy is used.
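As an illustration of Theorem 5, the sketch below evaluates (56) when the hotspot temperature at each frequency is normally distributed. The helper names (`normal_cdf`, `frequency_distribution`) are hypothetical, and the means and standard deviations are the values reported in Table 1 for eight active cores, with the 100°C threshold used in the experiments; treat it as a minimal sketch rather than the authors' implementation.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{T <= x} for T ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def frequency_distribution(temp_stats, t_valve):
    """Evaluate (56) for a list of (frequency, mu, sigma) tuples.

    temp_stats must be sorted by increasing frequency; mu and sigma
    describe the active-core hotspot temperature at that frequency.
    """
    # P{T_hota(n_a, f) <= T_valve} for every frequency in the set.
    below = [normal_cdf(t_valve, mu, sigma) for _, mu, sigma in temp_stats]
    dist = {}
    last = len(temp_stats) - 1
    for idx, (freq, _, _) in enumerate(temp_stats):
        if idx == last:
            # Highest frequency: first case of (56).
            dist[freq] = below[idx]
        else:
            # Lower frequencies: second case of (56).
            dist[freq] = below[idx] - below[idx + 1]
    return dist


# (normalized frequency, mean in deg C, standard deviation) from Table 1,
# eight active cores.
EIGHT_ACTIVE_CORES = [
    (0.5, 66.69, 4.48),
    (0.67, 78.46, 6.00),
    (0.83, 89.54, 7.43),
    (1.0, 101.31, 8.95),
]

dist = frequency_distribution(EIGHT_ACTIVE_CORES, t_valve=100.0)
for freq, prob in dist.items():
    print(f"P(f = {freq}) = {prob:.4f}")
```

With these inputs the probability of running at the maximum frequency should come out close to the 44.16% reported in Section 7.4; small differences are expected because the table values are rounded.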
Given the probabilistic distribution of the frequency and the mean of the hotspot temperature of the active core for a certain frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by

$$
T_{\mathrm{averhota}} = \sum_{f \in FSet} \mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f) \cdot P(f), \tag{63}
$$

where $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{count}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ denotes the mean of the hotspot temperature of the active core given the frequency $f$.
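Equation (63) is a probability-weighted average, so it can be sketched by combining the frequency probabilities of (56) with the per-frequency means. The code below is a hedged illustration: the function names are hypothetical, the inputs are the rounded means and standard deviations reported in Table 1, and the threshold is the 100°C value used in the experiments.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{T <= x} for T ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def average_hotspot_temperature(temp_stats, t_valve):
    """Evaluate (63): sum of mu_Ta(f) * P(f) over the frequency set.

    temp_stats is a list of (frequency, mu, sigma) sorted by increasing
    frequency; P(f) follows (56).
    """
    below = [normal_cdf(t_valve, mu, sigma) for _, mu, sigma in temp_stats]
    average = 0.0
    for idx, (_, mu, _) in enumerate(temp_stats):
        if idx == len(temp_stats) - 1:
            prob = below[idx]                   # highest frequency
        else:
            prob = below[idx] - below[idx + 1]  # lower frequencies
        average += mu * prob
    return average


# Rows of Table 1: (normalized frequency, mean in deg C, std. deviation).
TABLE_1 = {
    6: [(0.5, 51.93, 4.03), (0.67, 61.25, 5.40),
        (0.83, 70.02, 6.69), (1.0, 79.35, 8.06)],
    7: [(0.5, 59.20, 4.25), (0.67, 69.73, 5.69),
        (0.83, 79.64, 7.05), (1.0, 90.17, 8.50)],
    8: [(0.5, 66.69, 4.48), (0.67, 78.46, 6.00),
        (0.83, 89.54, 7.43), (1.0, 101.31, 8.95)],
}

for cores, stats in TABLE_1.items():
    avg = average_hotspot_temperature(stats, t_valve=100.0)
    print(f"{cores} active cores: average hotspot temperature {avg:.2f} deg C")
```

The printed averages should land near the "With DFS" column of Table 3 (79.30, 88.85, and 93.85°C), up to the rounding of the tabulated inputs.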
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, the active state and the inactive state, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. There are four discrete frequencies used by the processor, that is, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, corresponding to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100°C.

HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27] are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{Ed}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]

$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta, \tag{64}
$$

where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E\mathrm{a}}^2$ and $\sigma_{E\mathrm{ina}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{leaa}}$ and $\mathbf{E}_{\mathrm{leaina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage power is performed over the temperature interval between 60°C and 100°C, and the regression results are used to estimate the leakage power over the whole temperature interval.
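The regression step described above can be sketched as an ordinary least-squares line fitted to (64) over 60°C to 100°C. The values of `ALPHA`, `BETA`, and `GAMMA` below are hypothetical placeholders (the paper uses the PTScalar defaults, which are not reproduced here), the temperature is assumed to be in degrees Celsius, and the helper names are illustrative only.

```python
import math

# Hypothetical parameters for (64); NOT the PTScalar defaults.
ALPHA, BETA, GAMMA = 1e-5, 0.1, 0.01


def leakage_power(temp):
    """Nonlinear leakage model of (64): alpha * T^2 * exp(gamma * T) + beta."""
    return ALPHA * temp ** 2 * math.exp(GAMMA * temp) + BETA


def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit only on the interval that matters for thermal management.
fit_temps = list(range(60, 101))
slope, intercept = fit_line(fit_temps, [leakage_power(t) for t in fit_temps])


def abs_error(temp):
    """Absolute error of the linear estimate at a given temperature."""
    return abs(leakage_power(temp) - (slope * temp + intercept))


max_error_inside = max(abs_error(t) for t in fit_temps)
print(f"slope = {slope:.5f} W per deg C, intercept = {intercept:.5f} W")
print(f"max |error| in [60, 100]: {max_error_inside:.4f} W")
print(f"|error| at 40 deg C:      {abs_error(40):.4f} W")
```

Because the fitted line is only tuned to the 60°C to 100°C range, the error at 40°C is larger than anywhere inside the fitting interval, which mirrors the accuracy trade-off discussed in Section 7.2.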
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained over the temperature interval between 60°C and 100°C at the cost of lower accuracy over the other intervals.

After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. It can be seen that the error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers differ across functional units. A lower error rate results from a higher correlation. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for both active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{leaa}}$, and $\mathbf{E}_{\mathrm{leaina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores. [Plot of actual and estimated leakage power (W) versus temperature (°C).]

Figure 4: Leakage power estimation of hotspot for inactive cores. [Plot of actual and estimated leakage power (×10^-4 W) versus temperature (°C).]

Figure 5: Error rate of the power estimation. [Bar chart of the error rate (%) of the dynamic power, the leakage power for active core, and the leakage power for inactive core for each functional unit: FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg.]

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

| Number of active cores | Frequency of 0.5: μ (°C) | σ | Frequency of 0.67: μ (°C) | σ | Frequency of 0.83: μ (°C) | σ | Frequency of 1: μ (°C) | σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

| Number of active cores | Frequency of 0.5: μ (°C) | σ | Frequency of 0.67: μ (°C) | σ | Frequency of 0.83: μ (°C) | σ | Frequency of 1: μ (°C) | σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, according to Theorem 4, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. From the characteristics of the normal distribution curve, the shape of the probability density curve reflects the variation of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.

Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature for inactive cores is the same as that for active cores, except that the mean and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
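The link between peak height and standard deviation follows from the normal density, whose maximum value is $1/(\sigma\sqrt{2\pi})$. The short sketch below evaluates that peak for the six-active-core standard deviations reported in Table 1; the function name `peak_density` is illustrative.

```python
import math


def peak_density(sigma):
    """Maximum of the N(mu, sigma^2) density, attained at x = mu."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Standard deviations from Table 1 for six active cores,
# keyed by normalized frequency.
SIGMA_SIX_CORES = {0.5: 4.03, 0.67: 5.40, 0.83: 6.69, 1.0: 8.06}

peaks = {freq: peak_density(sigma) for freq, sigma in SIGMA_SIX_CORES.items()}
for freq, peak in peaks.items():
    print(f"frequency {freq}: peak density {peak:.4f}")
```

The peak shrinks as the frequency (and with it the standard deviation) grows, which is the pattern described for Figure 6.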
7.4. Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, according to Theorem 5, the probabilistic distribution of working frequencies can be determined. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature has surpassed the threshold and the DFS technique has been triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Each panel plots probability density versus temperature (°C), with one curve for each frequency: 0.5, 0.67, 0.83, and 1.]
If a core is powered off and made inactive using the power-gating technique, it only dissipates leakage power, which is much less than that of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the decreased power allows the remaining active cores of the processor to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1, and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
|---|---|---|---|---|
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |

Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. [Each panel plots probability density versus temperature (°C), with one curve for each frequency: 0.5, 0.67, 0.83, and 1.]
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores will run at full speed only when the IPCs of tasks are low. If the IPCs increase beyond a certain extent, the running frequency will be scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases, and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores can run at full speed, so the probability that the processor runs at the maximum frequency is 100%, and the probability for triggering DFS is 0.
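Since DFS is triggered whenever the hotspot temperature at the maximum frequency would exceed the threshold, the trigger probability is the upper tail of the normal distribution at frequency 1. The sketch below computes it from the Table 1 values for a normalized frequency of 1; the function name `dfs_trigger_probability` is hypothetical, and the results are approximate because the tabulated inputs are rounded.

```python
import math


def dfs_trigger_probability(mu, sigma, t_valve=100.0):
    """P{T_hota > T_valve} at the maximum frequency, T ~ N(mu, sigma^2)."""
    cdf = 0.5 * (1.0 + math.erf((t_valve - mu) / (sigma * math.sqrt(2.0))))
    return 1.0 - cdf


# (mean in deg C, standard deviation) at frequency 1, from Table 1,
# keyed by the number of active cores.
MAX_FREQUENCY_STATS = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}

trigger = {cores: dfs_trigger_probability(mu, sigma)
           for cores, (mu, sigma) in MAX_FREQUENCY_STATS.items()}
for cores, prob in trigger.items():
    print(f"{cores} active cores: P(trigger DFS) = {100.0 * prob:.2f}%")
```

The values should fall close to 0.52%, 12.37%, and 55.84%, the probabilities of exceeding the threshold without DFS listed in Table 3.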
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, then the processor always runs at full speed; that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, then the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency will be scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature will not exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum, and the hotspot temperature may exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.

Figure 8: Probabilistic distribution of frequencies. [Bar chart of the probability of each frequency (%), for frequencies 1, 0.83, 0.67, and 0.5, versus the number of active cores (6, 7, 8).]

Figure 9: Probability for triggering DFS. [Bar chart of the probability for triggering DFS (%) versus the number of active cores (6, 7, 8).]
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores can execute at full speed, and the cooling effect of DFS on the processor weakens until it disappears. Therefore, for processors with and without DFS, the average hotspot temperatures become closer as the number of active cores decreases, as shown in Table 3. Even if DFS is not used, the probability that the hotspot temperature exceeds the threshold decreases to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature will not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. There is no difference between using and not using DFS when the number of active cores is lower than 6, so the comparisons for this situation are not given in Table 3.
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows the normal probabilistic distribution based on the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.

From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers differ across functional units; (2) the estimation error rates of leakage powers for both active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained over the temperature interval between 60°C and 100°C at the cost of lower accuracy over other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article no. 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
Get
$$\mathbf{T}_{i} = \mathbf{R}\mathbf{P}_{i} + \mathbf{Q}\sum_{j=1}^{n}\mathbf{P}_{j} + \mathbf{Z}. \tag{19}$$
This means that, once the temperature of a multicore processor has reached the steady state, the temperature of any core can be calculated at the microarchitecture level from the powers of all cores, according to Lemma 1.
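As an illustration of (19), the following minimal Python sketch evaluates the steady-state temperature vector of one core from the power vectors of all cores. The helper names and every numeric value are hypothetical; R, Q, and Z stand for the matrices and vector of Lemma 1, here filled with made-up entries for two functional units and three cores.

```python
def mat_vec(matrix, vector):
    """Multiply an m x m matrix (list of rows) by a length-m vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def vec_add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vectors)]


def steady_state_temperature(i, powers, R, Q, Z):
    """Evaluate (19): T_i = R P_i + Q * sum_j P_j + Z for core i."""
    total_power = vec_add(*powers)  # sum of the power vectors of all n cores
    return vec_add(mat_vec(R, powers[i]), mat_vec(Q, total_power), Z)


# Hypothetical data: m = 2 functional units, n = 3 cores (all values made up).
R = [[2.0, 0.5], [0.5, 1.5]]
Q = [[0.2, 0.1], [0.1, 0.2]]
Z = [45.0, 45.0]
powers = [[3.0, 1.0], [2.0, 2.0], [0.5, 0.1]]

T0 = steady_state_temperature(0, powers, R, Q, Z)
```

The sketch only covers the case in which all powers are already known; the coupling between leakage power and temperature is treated in Sections 4 and 5.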
4. Power Model
It is assumed that multicore processors have two power states, active mode and inactive mode, and that the power state of each core can be set separately. A global dynamic frequency scaling (DFS) technique is used, in which the frequencies of all cores are scaled uniformly.
4.1. Active Mode. When a core is in the active mode, workloads are executed and the core dissipates both dynamic power and leakage power. At the microarchitecture level, the power of the active core is defined as
$$\mathbf{P}_{\mathrm{a}i} = \mathbf{P}_{\mathrm{dyna}i} + \mathbf{P}_{\mathrm{leaa}i}, \tag{20}$$
where $\mathbf{P}_{\mathrm{a}i}$, $\mathbf{P}_{\mathrm{dyna}i}$, and $\mathbf{P}_{\mathrm{leaa}i}$ are, respectively, the total power, dynamic power, and leakage power of the $i$th active core, each of size $m \times 1$.
For a processor using the DVFS technique, the dynamic power is proportional to the product of the square of the voltage and the frequency [7], that is, $\mathbf{P}_{\mathrm{dyn}} \propto \mathrm{voltage}^{2} \times \mathrm{frequency}$. The primary purpose of this paper is to analyze the impact of workload variation on processor temperatures. To simplify the analysis, the DFS technique rather than the DVFS technique is used to manage the processor temperature, so the voltage is constant and only the frequency is adjusted. Hence, the dynamic power can be modeled as a linear function of the frequency [12, 16]. In addition, the dynamic power caused by workload execution has a close linear relationship with IPCs.
Let $X_{\mathrm{a}i}$ be the IPC of the $i$th core when workloads are running on it, and let $f$ be the normalized frequency between 0 and 1. Then the microarchitecture-level dynamic power $\mathbf{P}_{\mathrm{dyna}i}$ of the $i$th active core is defined as
$$\mathbf{P}_{\mathrm{dyna}i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f, \tag{21}$$
where $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ are the linear regression coefficients when $f$ is set to its maximum of 1, and $\mathbf{E}_{\mathrm{dyn}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{dyn}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^{2}_{Ed})$. $\mathbf{G}_{\mathrm{dyn}}$, $\mathbf{G}_{d0}$, $\mathbf{E}_{\mathrm{dyn}}$, and $\boldsymbol{\sigma}_{Ed}$ are vectors of size $m \times 1$, and $\boldsymbol{\sigma}^{2}_{Ed}$ is the element-by-element square of $\boldsymbol{\sigma}_{Ed}$.
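At the microarchitecture level, in order to simplify the analysis, the relationship between the leakage power $\mathbf{P}_{\mathrm{leaa}i}$ and the temperature $\mathbf{T}_{\mathrm{a}i}$ of the $i$th active core is approximated by the linear model
$$\mathbf{P}_{\mathrm{leaa}i} = \mathbf{G}_{\mathrm{leaa}}\mathbf{T}_{\mathrm{a}i} + \mathbf{G}_{l0\mathrm{a}} + \mathbf{E}_{\mathrm{leaa}}, \tag{22}$$
where $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ are the regression coefficients and $\mathbf{E}_{\mathrm{leaa}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{leaa}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^{2}_{E\mathrm{a}})$. $\mathbf{G}_{\mathrm{leaa}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0\mathrm{a}}$, $\mathbf{E}_{\mathrm{leaa}}$, $\boldsymbol{\mu}_{E\mathrm{a}}$, and $\boldsymbol{\sigma}_{E\mathrm{a}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^{2}_{E\mathrm{a}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{a}}$.
According to (20), (21), and (22), the power of the active core $\mathbf{P}_{\mathrm{a}i}$ is represented by
$$\mathbf{P}_{\mathrm{a}i} = \left(\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} + \mathbf{G}_{d0} + \mathbf{E}_{\mathrm{dyn}}\right)f + \mathbf{G}_{\mathrm{leaa}}\mathbf{T}_{\mathrm{a}i} + \mathbf{G}_{l0\mathrm{a}} + \mathbf{E}_{\mathrm{leaa}}. \tag{23}$$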
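The active-mode power model of (20), (21), and (22) can be sketched in a few lines of Python. This is a simplified illustration for a single functional unit (so every vector and matrix collapses to a scalar); the function names and all coefficient values are hypothetical and are not the regression results of the paper.

```python
def dynamic_power(ipc, f, g_dyn, g_d0, e_dyn=0.0):
    """Equation (21) for one functional unit: (G_dyn * X + G_d0 + E_dyn) * f."""
    return (g_dyn * ipc + g_d0 + e_dyn) * f


def leakage_power(temp, g_lea, g_l0, e_lea=0.0):
    """Equation (22) for one functional unit: G_lea * T + G_l0 + E_lea."""
    return g_lea * temp + g_l0 + e_lea


def active_power(ipc, f, temp, coeffs, e_dyn=0.0, e_lea=0.0):
    """Equation (20): total active power = dynamic power + leakage power."""
    return (dynamic_power(ipc, f, coeffs["g_dyn"], coeffs["g_d0"], e_dyn)
            + leakage_power(temp, coeffs["g_leaa"], coeffs["g_l0a"], e_lea))


# Made-up coefficients for illustration only.
coeffs = {"g_dyn": 1.2, "g_d0": 0.4, "g_leaa": 0.01, "g_l0a": -0.3}

p_full = active_power(ipc=1.5, f=1.0, temp=80.0, coeffs=coeffs)
p_half = active_power(ipc=1.5, f=0.5, temp=80.0, coeffs=coeffs)
```

With the residuals left at zero, halving the normalized frequency halves only the dynamic part, while the leakage part depends on temperature alone.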
4.2. Inactive Mode. When a core is in the inactive mode, it is powered off using the power-gating technique. The inactive core dissipates only leakage power, which is much lower than that of the active core. Hence, the power of the $i$th inactive core $\mathbf{P}_{\mathrm{ina}i}$ is the leakage power $\mathbf{P}_{\mathrm{leaina}i}$, which depends on the core's temperature $\mathbf{T}_{\mathrm{ina}i}$ and can be approximated at the microarchitecture level by the linear model
$$\mathbf{P}_{\mathrm{ina}i} = \mathbf{P}_{\mathrm{leaina}i} = \mathbf{G}_{\mathrm{leaina}}\mathbf{T}_{\mathrm{ina}i} + \mathbf{G}_{l0\mathrm{ina}} + \mathbf{E}_{\mathrm{leaina}}, \tag{24}$$
where $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ are regression coefficients and $\mathbf{E}_{\mathrm{leaina}}$ is the regression residual, which follows the normal distribution with mean zero, namely, $\mathbf{E}_{\mathrm{leaina}} \sim N(\mathbf{0}, \boldsymbol{\sigma}^{2}_{E\mathrm{ina}})$. $\mathbf{G}_{\mathrm{leaina}}$ is a diagonal matrix of size $m \times m$; $\mathbf{G}_{l0\mathrm{ina}}$, $\mathbf{E}_{\mathrm{leaina}}$, and $\boldsymbol{\sigma}_{E\mathrm{ina}}$ are vectors of size $m \times 1$; and $\boldsymbol{\sigma}^{2}_{E\mathrm{ina}}$ is the element-by-element square of $\boldsymbol{\sigma}_{E\mathrm{ina}}$.
5. Thermal Analysis
Lemma 2. When the temperature of a multicore processor is in the steady state, the temperatures and the powers of different inactive cores are the same, that is, $\mathbf{T}_{\mathrm{ina}i} = \mathbf{T}_{\mathrm{ina}j}$ and $\mathbf{P}_{\mathrm{ina}i} = \mathbf{P}_{\mathrm{ina}j}$ for $\forall i, j$ ($i \neq j$), where $\mathbf{T}_{\mathrm{ina}i}$ and $\mathbf{P}_{\mathrm{ina}i}$ are the temperatures and the powers of the $i$th inactive core, respectively.
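Proof. According to (3), for any inactive cores $i$ and $j$,
$$\mathbf{T}_{\mathrm{ina}i} - \mathbf{T}_{\mathrm{ina}j} = \mathbf{R}\left(\mathbf{P}_{\mathrm{ina}i} - \mathbf{P}_{\mathrm{ina}j}\right). \tag{25}$$
Substitute (24) into (25) and get
$$\mathbf{T}_{\mathrm{ina}i} - \mathbf{T}_{\mathrm{ina}j} = \mathbf{R}\mathbf{G}_{\mathrm{leaina}}\left(\mathbf{T}_{\mathrm{ina}i} - \mathbf{T}_{\mathrm{ina}j}\right). \tag{26}$$
According to (26), get
$$\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaina}}\right)\left(\mathbf{T}_{\mathrm{ina}i} - \mathbf{T}_{\mathrm{ina}j}\right) = \mathbf{0}. \tag{27}$$
The parameters $\mathbf{R}$ and $\mathbf{G}_{\mathrm{leaina}}$ are invertible matrices, so $\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaina}}$ is an invertible matrix. Therefore, $\mathbf{T}_{\mathrm{ina}i} - \mathbf{T}_{\mathrm{ina}j} = \mathbf{0}$, that is, $\mathbf{T}_{\mathrm{ina}i} = \mathbf{T}_{\mathrm{ina}j}$. $\mathbf{P}_{\mathrm{ina}i} = \mathbf{P}_{\mathrm{ina}j}$ then follows from (24).
For convenience, set $\mathbf{T}_{\mathrm{ina}i} = \mathbf{T}_{\mathrm{ina}}$ and $\mathbf{P}_{\mathrm{ina}i} = \mathbf{P}_{\mathrm{ina}}$; then (24) can be simplified into
$$\mathbf{P}_{\mathrm{ina}} = \mathbf{G}_{\mathrm{leaina}}\mathbf{T}_{\mathrm{ina}} + \mathbf{G}_{l0\mathrm{ina}} + \mathbf{E}_{\mathrm{leaina}}. \tag{28}$$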
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_{1}, h_{2}, \ldots, h_{m}]$ be the selection vector of the hotspot, where only one element of $\mathbf{H}$ is set to 1 and the others are 0s, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hota}i}$ of the $i$th active core can be formulated by
$$\begin{aligned}
T_{\mathrm{hota}i} ={}& \psi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)X_{\mathrm{a}i} + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}} + \psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}} + \psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{29}$$
There exist functions $\xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\xi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core can be formulated by
$$\begin{aligned}
T_{\mathrm{hotina}} ={}& \xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \xi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}} \\
&+ \xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}} + \xi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{30}$$
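Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\mathrm{ina}}$ can be derived as
$$\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} \\
&+ f\left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ fn_{\mathrm{a}}\left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{leaa}} \\
&+ \left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ \left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}\left(\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{l0\mathrm{ina}} + fn_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0\mathrm{a}} + \mathbf{Z}\right).
\end{aligned} \tag{31}$$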
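Let
$$\begin{aligned}
\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f) &= \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{l0\mathrm{ina}} + fn_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0\mathrm{a}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}}) &= \left(\mathbf{E} - \left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{G}_{\mathrm{leaina}}\right)^{-1}.
\end{aligned} \tag{32}$$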
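Then (31) can be transformed into
$$\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} + f\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ fn_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} + n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{leaa}} \\
&+ \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaina}} + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{33}$$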
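According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{\mathrm{a}i}$ can be derived as
$$\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} + f\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ f\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} + (n - n_{\mathrm{a}})\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\mathbf{T}_{\mathrm{ina}} \\
&+ f\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} + \left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ (n - n_{\mathrm{a}})\left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{leaina}} \\
&+ \left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}\left(f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0\mathrm{ina}} + \mathbf{Z}\right).
\end{aligned} \tag{34}$$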
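Let
$$\begin{aligned}
\boldsymbol{\Phi}_{3}(n, n_{\mathrm{a}}, f) &= f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}}f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0\mathrm{ina}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_{4} &= \left(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{leaa}}\right)^{-1}.
\end{aligned} \tag{35}$$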
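Then (34) can be transformed into
$$\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} + f\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ f\boldsymbol{\Phi}_{4}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\mathbf{T}_{\mathrm{ina}} \\
&+ f\boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} + \boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{E}_{\mathrm{leaina}} + \boldsymbol{\Phi}_{4}\boldsymbol{\Phi}_{3}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{36}$$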
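Substitute (33) into (36) and get
$$\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \left(\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaa}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\right)\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} \\
&+ f\left(\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ f\boldsymbol{\Phi}_{4}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} \\
&+ f\left(\boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(\boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ \left((n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{4}\boldsymbol{\Phi}_{3}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{37}$$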
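Let
$$\begin{aligned}
\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaa}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}, \\
\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) &= f\left(\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right), \\
\boldsymbol{\Phi}_{7}(f) &= f\boldsymbol{\Phi}_{4}\mathbf{R}\mathbf{G}_{\mathrm{dyn}}, \\
\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) &= f\left(\boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_{4}\left(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}\right) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}, \\
\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right), \\
\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_{4}\mathbf{Q}\mathbf{G}_{\mathrm{leaina}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{4}\boldsymbol{\Phi}_{3}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{38}$$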
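Then (37) is transformed into
$$\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} + \boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \boldsymbol{\Phi}_{7}(f)X_{\mathrm{a}i} + \boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}} + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaina}} + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{39}$$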
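According to (39), get
$$\begin{aligned}
\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} ={}& \left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaina}} \\
&+ n_{\mathrm{a}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{40}$$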
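Substitute (40) into (39) and get
$$\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_{7}(f)X_{\mathrm{a}i} \\
&+ \left(\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + \boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{41}$$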
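The hotspot temperature $T_{\mathrm{hota}i}$ of the active core can be given by
$$\begin{aligned}
T_{\mathrm{hota}i} = \mathbf{H}\mathbf{T}_{\mathrm{a}i} ={}& \mathbf{H}\boldsymbol{\Phi}_{7}(f)X_{\mathrm{a}i} \\
&+ \mathbf{H}\left(\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + \boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f)\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f)\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{42}$$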
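Let
$$\begin{aligned}
\psi_{1}(\mathbf{H}, f) &= \mathbf{H}\boldsymbol{\Phi}_{7}(f), \\
\psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + \boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f)\right), \\
\psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f)\right), \\
\psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}})\right), \\
\psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\right), \\
\psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{43}$$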
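Then (42) is transformed into
$$\begin{aligned}
T_{\mathrm{hota}i} ={}& \psi_{1}(\mathbf{H}, f)X_{\mathrm{a}i} + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_{4}(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}} + \psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}} + \psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{44}$$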
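Substitute (40) into (33) and get
$$\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \left(\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + f\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ \left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f).
\end{aligned} \tag{45}$$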
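The hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core is given by
$$\begin{aligned}
T_{\mathrm{hotina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}} ={}& \mathbf{H}\left(\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + f\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right)\mathbf{E}_{\mathrm{leaa}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right)\mathbf{E}_{\mathrm{leaina}} \\
&+ \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{46}$$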
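Let
$$\begin{aligned}
\xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{6}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{7}(f)\right) + f\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\right), \\
\xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{8}(n, n_{\mathrm{a}}, f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\xi_{3}(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{9}(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\right), \\
\xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\left(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}\right)\right), \\
\xi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\left(n_{\mathrm{a}}\boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}\left(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_{5}(n, n_{\mathrm{a}})\right)^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{2}(n, n_{\mathrm{a}})\boldsymbol{\Phi}_{1}(n, n_{\mathrm{a}}, f)\right).
\end{aligned} \tag{47}$$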
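Then (46) is transformed into
$$\begin{aligned}
T_{\mathrm{hotina}} ={}& \xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \xi_{3}(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}} \\
&+ \xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}} + \xi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{48}$$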
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all active cores.
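The linearity stated by Theorem 3 can be checked numerically with a small Python sketch. It is a deliberately simplified stand-in for the derivation: each core has a single functional unit (so R, Q, Z and all G coefficients are scalars and the hotspot is that unit), the residuals are set to zero, and the coupled equations (19), (23), and (28) are solved by fixed-point iteration instead of the closed-form matrix expressions. All coefficient values are made up.

```python
def solve_steady_state(ipcs, n, f, c, iters=500):
    """Iterate (19) with the linear power models (23) and (28), residuals = 0.

    ipcs: IPC of each active core; n: total number of cores;
    f: normalized frequency; c: dict of scalar model coefficients.
    Returns (list of active-core temperatures, inactive-core temperature).
    """
    n_a = len(ipcs)
    t_act = [c["Z"]] * n_a
    t_ina = c["Z"]
    for _ in range(iters):
        p_act = [(c["g_dyn"] * x + c["g_d0"]) * f + c["g_leaa"] * t + c["g_l0a"]
                 for x, t in zip(ipcs, t_act)]
        p_ina = c["g_leaina"] * t_ina + c["g_l0ina"]
        total = sum(p_act) + (n - n_a) * p_ina  # power of all n cores
        t_act = [c["R"] * p + c["Q"] * total + c["Z"] for p in p_act]
        t_ina = c["R"] * p_ina + c["Q"] * total + c["Z"]
    return t_act, t_ina


# Hypothetical scalar coefficients, chosen so that the iteration contracts.
coeffs = {"R": 2.0, "Q": 0.2, "Z": 45.0,
          "g_dyn": 3.0, "g_d0": 1.0,
          "g_leaa": 0.01, "g_l0a": -0.2,
          "g_leaina": 0.001, "g_l0ina": -0.02}


def hotspot_of_core0(x0):
    """Temperature of active core 0 when its IPC is x0 and five others run at IPC 1."""
    return solve_steady_state([x0, 1.0, 1.0, 1.0, 1.0, 1.0], 8, 1.0, coeffs)[0][0]


step_low = hotspot_of_core0(2.0) - hotspot_of_core0(1.0)
step_high = hotspot_of_core0(3.0) - hotspot_of_core0(2.0)
```

Under these assumptions, equal IPC increments produce equal temperature increments, which is the behavior (29) predicts for a fixed number of active cores and a fixed frequency.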
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a}i}$ for the IPC follows the normal distribution with mean $\mu_{X}$ and variance $\sigma^{2}_{X}$, that is, $X_{\mathrm{a}i} \sim N(\mu_{X}, \sigma^{2}_{X})$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ follow the same normal distribution. Then there exist functions $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T\mathrm{a}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hota}i}$ of the active core follows the normal distribution with mean $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T\mathrm{a}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,
$$T_{\mathrm{hota}i} \sim N\left(\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T\mathrm{a}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right). \tag{49}$$
There exist functions $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T\mathrm{ina}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T\mathrm{ina}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,
$$T_{\mathrm{hotina}} \sim N\left(\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T\mathrm{ina}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right). \tag{50}$$
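Proof. According to (29), get
$$\begin{aligned}
T_{\mathrm{hota}i} ={}& \left(\psi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f) + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i} + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{\substack{k=1 \\ k \neq i}}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}} \\
&+ \psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}} + \psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned} \tag{51}$$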
The normally distributed random variables $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $\left(\psi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f) + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i} + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \neq i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$ of the IPCs still follows the normal distribution.
All elements $E_{\mathrm{dyn}k}$ of the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$ of all elements of $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}}$ and $\psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ also follow the normal distribution.
The terms $\left(\psi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f) + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)X_{\mathrm{a}i}$, $\psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \neq i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\psi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}}$, and $\psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hota}i}$ follows the normal distribution, where the mean $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
$$\begin{aligned}
\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \left(\psi_{1}(\mathbf{H}, f) + n_{\mathrm{a}}\psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)\mu_{X} + \psi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\mu}_{Ed} + \psi_{4}(\mathbf{H}, n, n_{\mathrm{a}})\boldsymbol{\mu}_{E\mathrm{a}} \\
&+ \psi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\mu}_{E\mathrm{ina}} + \psi_{6}(\mathbf{H}, n, n_{\mathrm{a}}, f)
\end{aligned} \tag{52}$$
and the variance $\sigma_{T\mathrm{a}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
$$\begin{aligned}
\sigma_{T\mathrm{a}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \left(\psi_{1}(\mathbf{H}, f) + \psi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\right)^{2}\sigma^{2}_{X} + (n_{\mathrm{a}} - 1)\psi_{2}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma^{2}_{X} \\
&+ \psi_{3}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\sigma}^{2}_{Ed} + \psi_{4}^{2}(\mathbf{H}, n, n_{\mathrm{a}})\boldsymbol{\sigma}^{2}_{E\mathrm{a}} + \psi_{5}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\sigma}^{2}_{E\mathrm{ina}}.
\end{aligned} \tag{53}$$
In the same way, $\xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_{3}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaa}}$, and $\xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hotina}}$ follows the normal distribution, where the mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
$$\begin{aligned}
\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\xi_{1}(\mathbf{H}, n, n_{\mathrm{a}}, f)\mu_{X} + \xi_{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\mu}_{Ed} + \xi_{3}(\mathbf{H}, n, n_{\mathrm{a}})\boldsymbol{\mu}_{E\mathrm{a}} \\
&+ \xi_{4}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\mu}_{E\mathrm{ina}} + \xi_{5}(\mathbf{H}, n, n_{\mathrm{a}}, f)
\end{aligned} \tag{54}$$
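and the variance $\sigma_{T\mathrm{ina}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by
$$\begin{aligned}
\sigma_{T\mathrm{ina}}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\xi_{1}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\sigma^{2}_{X} + \xi_{2}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\sigma}^{2}_{Ed} + \xi_{3}^{2}(\mathbf{H}, n, n_{\mathrm{a}})\boldsymbol{\sigma}^{2}_{E\mathrm{a}} \\
&+ \xi_{4}^{2}(\mathbf{H}, n, n_{\mathrm{a}}, f)\boldsymbol{\sigma}^{2}_{E\mathrm{ina}}.
\end{aligned} \tag{55}$$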
According to Theorem 4, the hotspot temperature of any core follows a normal probability distribution.
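The moment formulas (52) and (53) can be cross-checked against random sampling of (29). The Python sketch below does this for the scalar case, where each $\psi_{k}$ is a single number; the $\psi$ values, the IPC parameters, and the residual variances are all hypothetical, and the residual means are taken as zero as in (21), (22), and (24).

```python
import random
from statistics import mean, pvariance


def active_hotspot_moments(psi, n_a, mu_x, var_x, var_ed, var_ea, var_eina):
    """Mean and variance of T_hot,a,i from (52) and (53), scalar case.

    psi maps 1..6 to the scalar coefficients psi_1..psi_6. The residual means
    are zero, so only the IPC mean and psi_6 contribute to the mean.
    """
    mu = (psi[1] + n_a * psi[2]) * mu_x + psi[6]
    var = ((psi[1] + psi[2]) ** 2 * var_x
           + (n_a - 1) * psi[2] ** 2 * var_x
           + psi[3] ** 2 * var_ed
           + psi[4] ** 2 * var_ea
           + psi[5] ** 2 * var_eina)
    return mu, var


def sample_hotspot(psi, n_a, mu_x, var_x, var_ed, var_ea, var_eina, rng):
    """Draw one T_hot,a,0 from (29) with independent normal inputs."""
    xs = [rng.gauss(mu_x, var_x ** 0.5) for _ in range(n_a)]
    return (psi[1] * xs[0] + psi[2] * sum(xs)
            + psi[3] * rng.gauss(0.0, var_ed ** 0.5)
            + psi[4] * rng.gauss(0.0, var_ea ** 0.5)
            + psi[5] * rng.gauss(0.0, var_eina ** 0.5)
            + psi[6])


# Hypothetical coefficients and distribution parameters.
psi = {1: 6.0, 2: 1.5, 3: 2.0, 4: 2.0, 5: 0.5, 6: 50.0}
args = dict(n_a=6, mu_x=1.2, var_x=0.09, var_ed=0.04, var_ea=0.01, var_eina=0.01)

mu, var = active_hotspot_moments(psi, **args)

rng = random.Random(0)  # fixed seed so the comparison is repeatable
draws = [sample_hotspot(psi, rng=rng, **args) for _ in range(20000)]
mc_mean, mc_var = mean(draws), pvariance(draws)
```

The sampled mean and variance should agree with `mu` and `var` up to Monte Carlo noise, which is what the independence argument in the proof implies.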
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to the value that makes the hotspot temperature equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $F_{\mathrm{Set}} = \{f_{1}, \ldots, f_{\pi}, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ be the set of frequencies of multicore processors, where $f_{1} < \cdots < f_{\pi} < f_{\pi+1} < \cdots < f_{f_{\mathrm{count}}}$. Then the probabilistic distribution of the frequency $f$ follows
$$P\{f = f_{\pi}\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) < T_{\mathrm{valve}}\}, & f_{\pi} = f_{f_{\mathrm{count}}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_{\pi} < f_{f_{\mathrm{count}}},
\end{cases} \tag{56}$$
where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi})$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_{\pi}$ representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_{\pi} = f_{f_{\mathrm{count}}}$, according to the zero-slack DTM policy, obviously
$$P\{f = f_{\pi}\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}}\}. \tag{57}$$
When $f_{\pi} < f_{f_{\mathrm{count}}}$, the probability of $f = f_{\pi}$ can be broken into two cases:
(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_{\pi}$ is 0, that is,
$$P\{f = f_{\pi} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0; \tag{58}$$
(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_{\pi}$ is given by
$$\begin{aligned}
&P\{f = f_{\pi} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned} \tag{59}$$
The right-hand side of (59) can be derived by
$$\begin{aligned}
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned} \tag{60}$$
According to (59) and (60), get
$$\begin{aligned}
&P\{f = f_{\pi} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned} \tag{61}$$
Hence, when $f_{\pi} < f_{f_{\mathrm{count}}}$, according to (58) and (61), get
$$P\{f = f_{\pi}\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi}) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{62}$$
Therefore, according to Theorem 5, the probabilistic distribution over the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
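A minimal Python sketch of (56) follows, using the normal hotspot distributions of Theorem 4. The function name is hypothetical; the means and standard deviations are read from the Table 1 row for eight active cores, and the threshold is the 100°C value used in the experiments.

```python
from statistics import NormalDist


def frequency_distribution(freqs, temp_dists, t_valve):
    """Apply (56). freqs must be ascending; temp_dists[k] is the hotspot
    temperature distribution of the active core at freqs[k]."""
    below = [d.cdf(t_valve) for d in temp_dists]  # P{T_hot(f_k) <= T_valve}
    probs = {}
    for k, f in enumerate(freqs):
        if k == len(freqs) - 1:
            probs[f] = below[k]                 # highest frequency
        else:
            probs[f] = below[k] - below[k + 1]  # f_k is safe, f_{k+1} is not
    return probs


freqs = [0.5, 0.67, 0.83, 1.0]
# (mean, standard deviation) in degrees Celsius for 8 active cores, per Table 1.
dists = [NormalDist(66.69, 4.48), NormalDist(78.46, 6.00),
         NormalDist(89.54, 7.43), NormalDist(101.31, 8.95)]

probs = frequency_distribution(freqs, dists, t_valve=100.0)
```

Because the sum in (56) telescopes, the probabilities add up to $P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{1}) \le T_{\mathrm{valve}}\}$; any remaining mass corresponds to the case in which even the lowest frequency exceeds the threshold.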
Given the probabilistic distribution of the frequency and the mean of the hotspot temperature of the active core for a certain frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by
$$T_{\mathrm{averhota}} = \sum_{f \in F_{\mathrm{Set}}}\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f) \cdot P(f), \tag{63}$$
where $F_{\mathrm{Set}} = \{f_{1}, \ldots, f_{\pi}, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ denotes the mean of the hotspot temperature of the active core given the frequency $f$.
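Equation (63) is a probability-weighted sum, which the following short Python sketch makes explicit. The means are the Table 1 entries for eight active cores; the probabilities $P(f)$ are hypothetical placeholders, not results from the paper.

```python
def average_hotspot_temperature(mean_by_freq, prob_by_freq):
    """Equation (63): sum over f of mu_Ta(f) * P(f)."""
    return sum(mean_by_freq[f] * prob_by_freq[f] for f in prob_by_freq)


# Mean hotspot temperature (degrees Celsius) per frequency, 8 active cores, Table 1.
means = {0.5: 66.69, 0.67: 78.46, 0.83: 89.54, 1.0: 101.31}
# Hypothetical frequency probabilities, for illustration only.
probs = {0.5: 0.0, 0.67: 0.1, 0.83: 0.5, 1.0: 0.4}

t_avg = average_hotspot_temperature(means, probs)
```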
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies: 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, which corresponds to the branch predictor, is set to 1 and the other elements are set to 0s. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100°C.
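The normalization and the selection vector described above amount to the following Python sketch. The helper name and the number of functional units (`m=5`) are placeholders; the text only fixes that the second element of $\mathbf{H}$ is the branch predictor.

```python
freqs_ghz = [1.5, 2.0, 2.5, 3.0]
f_max = max(freqs_ghz)
# Divide by the maximum frequency and round to two decimals, as in the text.
normalized = [round(f / f_max, 2) for f in freqs_ghz]


def selection_vector(m, hotspot_index):
    """Build H with a single 1 at hotspot_index (0-based) and 0s elsewhere."""
    return [1 if k == hotspot_index else 0 for k in range(m)]


# m = 5 is a placeholder unit count; index 1 is the second element.
H = selection_vector(m=5, hotspot_index=1)
```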
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_{X}$ and variance $\sigma^{2}_{X}$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\boldsymbol{\sigma}^{2}_{Ed}$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.
The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
$$P_{l} = \alpha \cdot T^{2} \cdot e^{\gamma T} + \beta, \tag{64}$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\boldsymbol{\sigma}^{2}_{E\mathrm{a}}$ and $\boldsymbol{\sigma}^{2}_{E\mathrm{ina}}$ of the regression residuals $\mathbf{E}_{\mathrm{leaa}}$ and $\mathbf{E}_{\mathrm{leaina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage power is performed over the temperature interval between 60°C and 100°C, and the regression results are used to estimate the leakage power over the whole temperature interval.
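The linearization step can be sketched in Python as an ordinary least-squares fit over the 60°C to 100°C interval. This is an illustration only: the exponent in (64) is read as $\gamma T$, the temperature is taken in degrees Celsius, and the values of `alpha`, `beta`, and `gamma` are made up rather than the PTScalar defaults.

```python
import math


def leakage(temp, alpha, beta, gamma):
    """Leakage curve in the form of (64), reading the exponent as gamma * T."""
    return alpha * temp ** 2 * math.exp(gamma * temp) + beta


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up parameters; NOT the PTScalar default values.
alpha, beta, gamma = 1e-5, 0.05, 0.02

temps = [60.0 + 0.5 * k for k in range(81)]  # 60 C to 100 C in 0.5 C steps
powers = [leakage(t, alpha, beta, gamma) for t in temps]

# slope and intercept play the roles of one diagonal entry of G_lea and one
# entry of G_l0; sigma2 is the corresponding residual variance.
slope, intercept = fit_line(temps, powers)
residuals = [p - (slope * t + intercept) for t, p in zip(temps, powers)]
sigma2 = sum(r * r for r in residuals) / len(residuals)
```

Because the stand-in curve is convex, the fitted line lies below the curve outside the fitting interval, so the linear model underestimates leakage at lower temperatures in this sketch.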
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60°C and 100°C at the cost of lower accuracy in the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. The error rates of the dynamic power vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, whereas it is 10.14% for the floating point register (FPReg). The reason is that the linear correlation between the IPC and the dynamic power differs among functional units: a lower error rate results from a higher correlation. Obviously, the IPC and the dynamic power of
Figure 3: Leakage power estimation of hotspot for active cores. [Plot of leakage power (W) against temperature (°C, 40 to 120), comparing actual and estimated leakage power.]
Figure 4: Leakage power estimation of hotspot for inactive cores. [Plot of leakage power (×10⁻⁴ W) against temperature (°C, 40 to 120), comparing actual and estimated leakage power.]
the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the
Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.
Number of       Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores    μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6               51.93     4.03      61.25     5.40      70.02     6.69      79.35     8.06
7               59.20     4.25      69.73     5.69      79.64     7.05      90.17     8.50
8               66.69     4.48      78.46     6.00      89.54     7.43      101.31    8.95
Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.
Number of       Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores    μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6               40.59     1.71      47.08     2.29      53.19     2.84      59.68     3.42
7               47.78     1.95      55.46     2.61      62.69     3.23      70.37     3.89
Figure 5: Error rate of the power estimation. [Bar chart of error rate (%), 0 to 12, for the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg; series: dynamic power, leakage power for active core, and leakage power for inactive core.]
impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea,a}}$, and $\mathbf{E}_{\mathrm{lea,ina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. For a normal distribution, the shape of the probability density curve reflects the spread of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency: a higher frequency yields a higher variation of temperature.
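The relation between peak height and standard deviation can be checked numerically. The following Python sketch evaluates the peak of the normal density, 1/(σ√(2π)), for the eight-core row of Table 1. The decimal placement of the table values is reconstructed from the extracted text, and the names `TABLE1_EIGHT_CORES` and `peak_density` are illustrative assumptions.

```python
from statistics import NormalDist

# (mean in degrees C, standard deviation) per relative frequency,
# taken from the eight-active-core row of Table 1.
TABLE1_EIGHT_CORES = {
    0.5: (66.69, 4.48),
    0.67: (78.46, 6.00),
    0.83: (89.54, 7.43),
    1.0: (101.31, 8.95),
}

def peak_density(mean_c, sigma):
    """Height of the normal density at its mean: 1 / (sigma * sqrt(2 * pi))."""
    return NormalDist(mean_c, sigma).pdf(mean_c)

peaks = {freq: peak_density(mu, sd) for freq, (mu, sd) in TABLE1_EIGHT_CORES.items()}
for freq, peak in peaks.items():
    print(f"frequency {freq}: peak density {peak:.4f}")
```

The peak falls as the frequency rises, because the standard deviation grows with frequency; all peaks stay below 0.1, in line with the vertical axis range of Figure 6.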
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and the variation of the hotspot temperature of inactive cores are lower than those of active cores at the same frequency and number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature surpasses the threshold and that the DFS technique is triggered to reduce the frequency of the processor. The probability of triggering DFS can therefore be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Each panel plots probability density (0 to 0.1) against temperature (°C, 20 to 140), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduction in power is enough for the remaining active cores to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.
Number of       Average hotspot temperature (°C)    Probability for exceeding temperature threshold (%)
active cores    With DFS      Without DFS           With DFS      Without DFS
6               79.30         79.35                 0             0.52
7               88.85         90.17                 0             12.37
8               93.85         101.31                0             55.84
Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. [Each panel plots probability density (0 to 0.25) against temperature (°C, 20 to 140), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16% and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability of triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability of triggering DFS is 0.
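One way to see how such a distribution over discrete frequencies can arise is the following Python sketch. It is a simplified illustration, not Theorem 5 itself. It assumes a thermal threshold of 100°C (a value not stated in this section), takes the eight-core means and standard deviations from Table 1, and assumes that DFS selects the highest frequency whose hotspot temperature stays at or below the threshold, with the temperatures at all frequencies rising monotonically with the same workload. The function name `frequency_distribution` is hypothetical.

```python
from statistics import NormalDist

THRESHOLD_C = 100.0  # assumed thermal threshold, not given in this section

# (mean in degrees C, standard deviation) per relative frequency,
# eight active cores, from Table 1.
TABLE1_EIGHT_CORES = {
    0.5: (66.69, 4.48),
    0.67: (78.46, 6.00),
    0.83: (89.54, 7.43),
    1.0: (101.31, 8.95),
}

def frequency_distribution(table, threshold_c):
    """Probability that each frequency is the highest thermally safe one."""
    freqs = sorted(table, reverse=True)  # highest frequency first
    safe = {f: NormalDist(*table[f]).cdf(threshold_c) for f in freqs}
    dist = {}
    prob_higher_safe = 0.0
    for f in freqs:
        # Safe at f, but not safe at the next higher frequency.
        dist[f] = safe[f] - prob_higher_safe
        prob_higher_safe = safe[f]
    # Any residual mass (no listed frequency safe) goes to the lowest one.
    dist[freqs[-1]] += 1.0 - prob_higher_safe
    return dist

dist = frequency_distribution(TABLE1_EIGHT_CORES, THRESHOLD_C)
prob_dfs = 1.0 - dist[1.0]
for freq in sorted(dist, reverse=True):
    print(f"frequency {freq}: {100 * dist[freq]:.2f}%")
print(f"probability of triggering DFS: {100 * prob_dfs:.2f}%")
```

Under these assumptions the probability of running at frequency 1 comes out close to the 44.16% reported above; the split among the lower frequencies depends on the monotonicity assumption and should be read as indicative only.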
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1. The average hotspot temperature of the active core without DFS can then be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. The hotspot temperature therefore does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and
Figure 8: Probabilistic distribution of frequencies. [Bar chart of the probability of frequency (%), 0 to 100, for six, seven, and eight active cores; series: frequency is 1, 0.83, 0.67, and 0.5.]
Figure 9: Probability for triggering DFS. [Bar chart of the probability for triggering DFS (%), 0 to 60, for six, seven, and eight active cores.]
the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.
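The probability of exceeding the threshold without DFS can be illustrated with a short Python sketch based on the normal distributions of Table 1 at frequency 1. The threshold of 100°C is an assumption (it is not stated in this section), and the names `FULL_SPEED` and `prob_exceed` are illustrative.

```python
from statistics import NormalDist

THRESHOLD_C = 100.0  # assumed thermal threshold, not given in this section

# (mean in degrees C, standard deviation) at frequency 1, from Table 1,
# keyed by the number of active cores.
FULL_SPEED = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}

def prob_exceed(mean_c, sigma, threshold_c=THRESHOLD_C):
    """P(T > threshold) for a normally distributed hotspot temperature."""
    return 1.0 - NormalDist(mean_c, sigma).cdf(threshold_c)

exceed = {cores: prob_exceed(mu, sd) for cores, (mu, sd) in FULL_SPEED.items()}
for cores, prob in exceed.items():
    print(f"{cores} active cores: P(T > {THRESHOLD_C:.0f} C) = {100 * prob:.2f}%")
```

With the assumed threshold, the results land close to the "Without DFS" probabilities listed in Table 3, which suggests that the tabulated values follow directly from the normal model.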
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of DFS weakens until it disappears. Therefore, the average hotspot temperatures of the processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even if DFS is not used, the probability that the hotspot temperature exceeds the threshold decreases to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows a normal probabilistic distribution, based on the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers differ across functional units; (2) the estimation error rates of leakage powers for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained over the temperature interval between 60°C and 100°C at the cost of lower accuracy over other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current Ph.D. work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of the 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
Theorem 3. Assume that all cores of a processor have the same hotspot. Let $\mathbf{H} = [h_1, h_2, \ldots, h_m]$ be the selection vector of the hotspot, where only one element of $\mathbf{H}$ is set to 1 and the others are 0s, indicating that the corresponding functional unit is the hotspot. There exist functions $\psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,a},i}$ of the $i$th active core can be formulated by
$$
\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\, X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{29}
$$
There exist functions $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)$, $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)$, and $\xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core can be formulated by
$$
\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{lea,a}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\, \mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{30}
$$
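Proof. According to (3), (23), and (28), the temperature of the inactive core $\mathbf{T}_{\mathrm{ina}}$ can be derived as
$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& (\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} \\
&+ f(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ f n_{\mathrm{a}}(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
&+ (\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} \\
&+ (\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}\bigl((\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + f n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}\bigr).
\end{aligned}
\tag{31}
$$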
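Let
$$
\begin{aligned}
\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) &= (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{l0,\mathrm{ina}} + f n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{d0} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_2(n, n_{\mathrm{a}}) &= (\mathbf{E} - (\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{G}_{\mathrm{lea,ina}})^{-1}.
\end{aligned}
\tag{32}
$$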
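Then (31) can be transformed into
$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{dyn}} + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{E}_{\mathrm{lea,a}} \\
&+ \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{33}
$$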
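According to (3), (23), and (28), the temperature of the $i$th active core $\mathbf{T}_{\mathrm{a},i}$ can be derived as
$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} + f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{R}\mathbf{G}_{\mathrm{dyn}} X_{\mathrm{a},i} + (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} \\
&+ f(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
&+ (n - n_{\mathrm{a}})(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} \\
&+ (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}\bigl(f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}} f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}\bigr).
\end{aligned}
\tag{34}
$$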
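Let
$$
\begin{aligned}
\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f) &= f\mathbf{R}\mathbf{G}_{d0} + n_{\mathrm{a}} f\mathbf{Q}\mathbf{G}_{d0} + \mathbf{R}\mathbf{G}_{l0,\mathrm{a}} + n_{\mathrm{a}}\mathbf{Q}\mathbf{G}_{l0,\mathrm{a}} + (n - n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{l0,\mathrm{ina}} + \mathbf{Z}, \\
\boldsymbol{\Phi}_4 &= (\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}.
\end{aligned}
\tag{35}
$$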
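The role of the inverse $(\mathbf{E} - \mathbf{R}\mathbf{G}_{\mathrm{lea,a}})^{-1}$ can be illustrated with a scalar Python sketch: when leakage power grows linearly with temperature, the steady-state temperature is the fixed point of a feedback loop, and solving for it produces the factor 1/(1 − r·g_lea). The code assumes a single functional unit on a single core, drops the inter-core coupling $\mathbf{Q}$ and the error terms, and uses made-up coefficients; all names and numbers are illustrative and not taken from the paper.

```python
def steady_state_closed_form(f, x, r, g_dyn, g_d0, g_lea, g_l0, z):
    """Scalar analogue of T = (E - R*G_lea)^-1 (f*R*(G_dyn*X + G_d0) + R*G_l0 + Z)."""
    return (f * r * (g_dyn * x + g_d0) + r * g_l0 + z) / (1.0 - r * g_lea)

def steady_state_iterative(f, x, r, g_dyn, g_d0, g_lea, g_l0, z, steps=200):
    """Iterate T <- r * (P_dyn + P_lea(T)) + z until the temperature settles."""
    temp = z  # start from the constant (ambient-like) term
    for _ in range(steps):
        p_dyn = f * (g_dyn * x + g_d0)   # dynamic power, linear in the IPC x
        p_lea = g_lea * temp + g_l0      # leakage power, linear in temperature
        temp = r * (p_dyn + p_lea) + z
    return temp

# Hypothetical coefficients chosen so that r * g_lea < 1 (the loop converges).
params = dict(f=1.0, x=1.2, r=0.8, g_dyn=20.0, g_d0=5.0,
              g_lea=0.02, g_l0=1.0, z=45.0)
t_closed = steady_state_closed_form(**params)
t_iter = steady_state_iterative(**params)
print(f"closed form: {t_closed:.4f} C, fixed-point iteration: {t_iter:.4f} C")
```

Both routes agree, which is why the derivation can replace the temperature-leakage feedback by a single matrix inverse.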
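Then (34) can be transformed into
$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} + f\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}} X_{\mathrm{a},i} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\mathbf{T}_{\mathrm{ina}} + f\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{dyn}} + \boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q})\mathbf{E}_{\mathrm{lea,a}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{36}
$$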
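Substituting (33) into (36) gives
$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigr) \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} \\
&+ f\bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}} X_{\mathrm{a},i} \\
&+ f\bigl(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl((n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{37}
$$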
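Let
$$
\begin{aligned}
\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}, \\
\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) &= f\bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\boldsymbol{\Phi}_7(f) &= f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}, \\
\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) &= f\bigl(\boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4(\mathbf{R} + n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}, \\
\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q}), \\
\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) &= (n - n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{38}
$$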
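Then (37) is transformed into
$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + \boldsymbol{\Phi}_7(f) X_{\mathrm{a},i} + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{39}
$$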
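According to (39), we get
$$
\begin{aligned}
\sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k} ={}& (\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{40}
$$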
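The step from (39) to (40) can be made explicit. Summing (39) over the $n_{\mathrm{a}}$ active cores, and noting that only the $X_{\mathrm{a},i}$ term depends on the core index $i$, gives a linear equation in the sum of the active-core temperatures. The sketch below assumes that $\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})$ is invertible, which (40) uses implicitly.

```latex
\begin{aligned}
\sum_{i=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},i}
  &= n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}) \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k}
   + \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
  &\quad + n_{\mathrm{a}}\bigl(\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}
   + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}
   + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}}
   + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\bigr), \\
\bigl(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})\bigr) \sum_{k=1}^{n_{\mathrm{a}}} \mathbf{T}_{\mathrm{a},k}
  &= \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
  &\quad + n_{\mathrm{a}}\bigl(\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}
   + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}
   + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}}
   + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\bigr).
\end{aligned}
```

Left-multiplying the second line by $(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}$ yields (40) term by term.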
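Substituting (40) into (39) gives
$$
\begin{aligned}
\mathbf{T}_{\mathrm{a},i} ={}& \boldsymbol{\Phi}_7(f) X_{\mathrm{a},i} + \bigl(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{41}
$$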
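The hotspot temperature $T_{\mathrm{hot,a},i}$ of the active core can be given by
$$
\begin{aligned}
T_{\mathrm{hot,a},i} = \mathbf{H}\mathbf{T}_{\mathrm{a},i} ={}& \mathbf{H}\boldsymbol{\Phi}_7(f) X_{\mathrm{a},i} \\
&+ \mathbf{H}\bigl(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\bigr).
\end{aligned}
\tag{42}
$$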
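Let
$$
\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H}\boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\bigr), \\
\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\bigr), \\
\psi_4(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\bigr), \\
\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\bigr), \\
\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\bigr).
\end{aligned}
\tag{43}
$$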
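Then (42) is transformed into
$$
\begin{aligned}
T_{\mathrm{hot,a},i} ={}& \psi_1(\mathbf{H}, f) X_{\mathrm{a},i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{44}
$$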
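Because (44) is linear in the IPCs, a normal IPC distribution carries over to the hotspot temperature. The Python sketch below checks this numerically for scalar coefficients. It assumes that the IPCs of the active cores are independent draws from the same normal distribution, sets the error terms to zero, and uses made-up values for ψ1, ψ2, ψ6, and the IPC distribution; none of these numbers come from the paper, and the function names are hypothetical.

```python
import random
import statistics

def simulate_hotspot(psi1, psi2, psi6, mu_x, sigma_x, n_active, samples, seed=0):
    """Sample T_hot = psi1 * X_i + psi2 * sum_k X_k + psi6 for core i = 0."""
    rng = random.Random(seed)
    temps = []
    for _ in range(samples):
        ipcs = [rng.gauss(mu_x, sigma_x) for _ in range(n_active)]
        temps.append(psi1 * ipcs[0] + psi2 * sum(ipcs) + psi6)
    return temps

def closed_form(psi1, psi2, psi6, mu_x, sigma_x, n_active):
    """Mean and variance of T_hot for independent, identically normal IPCs."""
    mean = (psi1 + n_active * psi2) * mu_x + psi6
    # X_i carries weight (psi1 + psi2); each other core carries weight psi2.
    var = ((psi1 + psi2) ** 2 + (n_active - 1) * psi2 ** 2) * sigma_x ** 2
    return mean, var

# Hypothetical coefficients and IPC distribution (illustration only).
PSI1, PSI2, PSI6 = 20.0, 3.0, 40.0
MU_X, SIGMA_X, N_ACTIVE = 1.0, 0.2, 8

temps = simulate_hotspot(PSI1, PSI2, PSI6, MU_X, SIGMA_X, N_ACTIVE, samples=50_000)
mean_cf, var_cf = closed_form(PSI1, PSI2, PSI6, MU_X, SIGMA_X, N_ACTIVE)
print(f"closed form: mean {mean_cf:.2f}, std {var_cf ** 0.5:.2f}")
print(f"simulated:   mean {statistics.fmean(temps):.2f}, std {statistics.pstdev(temps):.2f}")
```

The simulated mean and standard deviation agree with the closed form to within sampling noise, which is the behavior a linear map of normal variables should show.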
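Substituting (40) into (33) gives
$$
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \bigl(\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr) \sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a},k} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{45}
$$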
The hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core is given by

$$
\begin{aligned}
T_{\mathrm{hot,ina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}} ={}& \mathbf{H}\bigl(\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,(n_a\Phi_6(n, n_a, f) + \Phi_7(f)) + f\,\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_a} X_{a,k} \\
&+ \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_8(n, n_a, f) + f\,n_a\Phi_2(n, n_a)\,\mathbf{Q}\bigr)\,\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_9(n, n_a) + n_a\Phi_2(n, n_a)\,\mathbf{Q}\bigr)\,\mathbf{E}_{\mathrm{lea},a} \\
&+ \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_{10}(n, n_a) + \Phi_2(n, n_a)\,(\mathbf{R} + (n - n_a)\mathbf{Q})\bigr)\,\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_{11}(n, n_a, f) + \Phi_2(n, n_a)\,\Phi_1(n, n_a, f)\bigr).
\end{aligned}
\tag{46}
$$
Let

$$
\begin{aligned}
\xi_1(\mathbf{H}, n, n_a, f) &= \mathbf{H}\bigl(\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,(n_a\Phi_6(n, n_a, f) + \Phi_7(f)) + f\,\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\xi_2(\mathbf{H}, n, n_a, f) &= \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_8(n, n_a, f) + f\,n_a\Phi_2(n, n_a)\,\mathbf{Q}\bigr), \\
\xi_3(\mathbf{H}, n, n_a) &= \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_9(n, n_a) + n_a\Phi_2(n, n_a)\,\mathbf{Q}\bigr), \\
\xi_4(\mathbf{H}, n, n_a, f) &= \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_{10}(n, n_a) + \Phi_2(n, n_a)\,(\mathbf{R} + (n - n_a)\mathbf{Q})\bigr), \\
\xi_5(\mathbf{H}, n, n_a, f) &= \mathbf{H}\bigl(n_a\Phi_2(n, n_a)\,\mathbf{Q}\mathbf{G}_{\mathrm{lea},a}\,(\mathbf{E} - n_a\Phi_5(n, n_a))^{-1}\,\Phi_{11}(n, n_a, f) + \Phi_2(n, n_a)\,\Phi_1(n, n_a, f)\bigr).
\end{aligned}
\tag{47}
$$
Then (46) is transformed into
$$
\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H}, n, n_a, f)\sum_{k=1}^{n_a} X_{a,k} + \xi_2(\mathbf{H}, n, n_a, f)\,\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_a)\,\mathbf{E}_{\mathrm{lea},a} \\
&+ \xi_4(\mathbf{H}, n, n_a, f)\,\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H}, n, n_a, f).
\end{aligned}
\tag{48}
$$
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{a,i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{a,i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{a,i}$ and $X_{a,j}$ are independent for all $1 \le i, j \le m$ with $i \ne j$; and (c) workload balancing techniques are used in the processor such that $X_{a,i}$ and $X_{a,j}$ follow the same normal distribution. Then there exist functions $\mu_{Ta}(\mathbf{H}, n, n_a, f)$ and $\sigma_{Ta}^2(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\mathrm{hot},a,i}$ of the active core follows the normal distribution with mean $\mu_{Ta}(\mathbf{H}, n, n_a, f)$ and variance $\sigma_{Ta}^2(\mathbf{H}, n, n_a, f)$, that is,

$$
T_{\mathrm{hot},a,i} \sim N\bigl(\mu_{Ta}(\mathbf{H}, n, n_a, f),\ \sigma_{Ta}^2(\mathbf{H}, n, n_a, f)\bigr). \tag{49}
$$

There also exist functions $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_a, f)$ and $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_a, f)$ and variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_a, f)$, that is,

$$
T_{\mathrm{hot,ina}} \sim N\bigl(\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_a, f),\ \sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_a, f)\bigr). \tag{50}
$$
Proof. According to (29),

$$
\begin{aligned}
T_{\mathrm{hot},a,i} ={}& \bigl(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_a, f)\bigr)\,X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f)\sum_{\substack{k=1 \\ k \ne i}}^{n_a} X_{a,k} + \psi_3(\mathbf{H}, n, n_a, f)\,\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_a)\,\mathbf{E}_{\mathrm{lea},a} + \psi_5(\mathbf{H}, n, n_a, f)\,\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H}, n, n_a, f).
\end{aligned}
\tag{51}
$$
The normally distributed random variables $X_{a,i}$ and $X_{a,j}$ are independent for all $1 \le i, j \le m$ with $i \ne j$, so the linear combination $(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_a, f))X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f)\sum_{k=1, k \ne i}^{n_a} X_{a,k}$ still follows a normal distribution.

All elements $E_{\mathrm{dyn},k}$ of the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow normal distributions and are independent, so the linear combination $\psi_3(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{dyn}}$ of all elements of $\mathbf{E}_{\mathrm{dyn}}$ follows a normal distribution. In the same way, $\psi_4(\mathbf{H}, n, n_a)\mathbf{E}_{\mathrm{lea},a}$ and $\psi_5(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{lea,ina}}$ also follow normal distributions. The terms $(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_a, f))X_{a,i}$, $\psi_2(\mathbf{H}, n, n_a, f)\sum_{k=1, k \ne i}^{n_a} X_{a,k}$, $\psi_3(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H}, n, n_a)\mathbf{E}_{\mathrm{lea},a}$, and $\psi_5(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{lea,ina}}$ all follow normal distributions and are mutually independent, so $T_{\mathrm{hot},a,i}$ follows a normal distribution, where the mean $\mu_{Ta}(\mathbf{H}, n, n_a, f)$ is calculated by

$$
\begin{aligned}
\mu_{Ta}(\mathbf{H}, n, n_a, f) ={}& \bigl(\psi_1(\mathbf{H}, f) + n_a\psi_2(\mathbf{H}, n, n_a, f)\bigr)\,\mu_X + \psi_3(\mathbf{H}, n, n_a, f)\,\mu_{Ed} + \psi_4(\mathbf{H}, n, n_a)\,\mu_{Ea} \\
&+ \psi_5(\mathbf{H}, n, n_a, f)\,\mu_{E\mathrm{ina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned}
\tag{52}
$$
and the variance $\sigma_{Ta}^2(\mathbf{H}, n, n_a, f)$ is calculated by

$$
\begin{aligned}
\sigma_{Ta}^2(\mathbf{H}, n, n_a, f) ={}& \bigl(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_a, f)\bigr)^2\,\sigma_X^2 + (n_a - 1)\,\psi_2^2(\mathbf{H}, n, n_a, f)\,\sigma_X^2 \\
&+ \psi_3^2(\mathbf{H}, n, n_a, f)\,\sigma_{Ed}^2 + \psi_4^2(\mathbf{H}, n, n_a)\,\sigma_{Ea}^2 + \psi_5^2(\mathbf{H}, n, n_a, f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{53}
$$
In the same way, $\xi_1(\mathbf{H}, n, n_a, f)\sum_{k=1}^{n_a} X_{a,k}$, $\xi_2(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H}, n, n_a)\mathbf{E}_{\mathrm{lea},a}$, and $\xi_4(\mathbf{H}, n, n_a, f)\mathbf{E}_{\mathrm{lea,ina}}$ in (30) all follow normal distributions and are mutually independent, so $T_{\mathrm{hot,ina}}$ follows a normal distribution, where the mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_a, f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_a, f) ={}& n_a\,\xi_1(\mathbf{H}, n, n_a, f)\,\mu_X + \xi_2(\mathbf{H}, n, n_a, f)\,\mu_{Ed} + \xi_3(\mathbf{H}, n, n_a)\,\mu_{Ea} \\
&+ \xi_4(\mathbf{H}, n, n_a, f)\,\mu_{E\mathrm{ina}} + \xi_5(\mathbf{H}, n, n_a, f)
\end{aligned}
\tag{54}
$$
and the variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_a, f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_a, f) ={}& n_a\,\xi_1^2(\mathbf{H}, n, n_a, f)\,\sigma_X^2 + \xi_2^2(\mathbf{H}, n, n_a, f)\,\sigma_{Ed}^2 + \xi_3^2(\mathbf{H}, n, n_a)\,\sigma_{Ea}^2 \\
&+ \xi_4^2(\mathbf{H}, n, n_a, f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{55}
$$
According to Theorem 4, the hotspot temperature of any core follows a normal probabilistic distribution.
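The moment formulas (52) and (53) can be checked numerically. The following is a minimal Python sketch, assuming that every coefficient and every regression residual is a scalar (in the paper they are vector quantities); the function names and all numeric values are hypothetical and serve only to exercise the formulas. It computes the mean and variance of the active-core hotspot temperature and compares them with samples drawn from (51) with the residual terms set to zero.

```python
import random
import statistics


def active_hotspot_moments(psi1, psi2, psi6, n_a, mu_x, var_x, residual_terms=()):
    """Mean and variance of the active-core hotspot temperature, after (52) and (53).

    residual_terms holds (coefficient, mean, variance) triples for the
    regression residuals; each residual is treated as a scalar here.
    """
    mean = (psi1 + n_a * psi2) * mu_x + psi6
    var = (psi1 + psi2) ** 2 * var_x + (n_a - 1) * psi2 ** 2 * var_x
    for coef, res_mean, res_var in residual_terms:
        mean += coef * res_mean
        var += coef ** 2 * res_var
    return mean, var


def sample_active_hotspot(psi1, psi2, psi6, n_a, mu_x, var_x, trials, seed=0):
    """Draw samples of (51) with all residual terms set to zero."""
    rng = random.Random(seed)
    sigma_x = var_x ** 0.5
    samples = []
    for _ in range(trials):
        ipcs = [rng.gauss(mu_x, sigma_x) for _ in range(n_a)]
        # ipcs[0] plays the role of X_{a,i}; the sum runs over all active cores.
        samples.append(psi1 * ipcs[0] + psi2 * sum(ipcs) + psi6)
    return samples


# Hypothetical coefficients, chosen only to exercise the formulas.
mean, var = active_hotspot_moments(10.0, 2.0, 30.0, n_a=4, mu_x=1.5, var_x=0.04)
samples = sample_active_hotspot(10.0, 2.0, 30.0, n_a=4, mu_x=1.5, var_x=0.04,
                                trials=20000)
print(mean, var)
print(statistics.fmean(samples), statistics.pvariance(samples))
```

With these made-up inputs the analytic mean is 57 and the analytic variance is 6.24, and the sample statistics should land close to both values.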
6 Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to the value at which the hotspot temperature equals the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{fcount}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{\mathrm{fcount}}$. Then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{fcount}}, \\
P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{fcount}},
\end{cases}
\tag{56}
$$

where $T_{\mathrm{hot},a}(n_a, f_\pi)$ is a function of the number of active cores $n_a$ and the frequency $f_\pi$, representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{\mathrm{fcount}}$, according to the zero-slack DTM policy,

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\}. \tag{57}
$$

When $f_\pi < f_{\mathrm{fcount}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) If $T_{\mathrm{hot},a}(n_a, f_\pi) > T_{\mathrm{hot},a}(n_a, f_{\pi+1})$, then according to the zero-slack DTM strategy the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hot},a}(n_a, f_\pi) > T_{\mathrm{hot},a}(n_a, f_{\pi+1})\} = 0. \tag{58}
$$

(b) If $T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot},a}(n_a, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\}.
\end{aligned}
\tag{59}
$$

The right-hand side of (59) can be derived from

$$
\begin{aligned}
&P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot},a}(n_a, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
$$

According to (59) and (60),

$$
P\{f = f_\pi \mid T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{hot},a}(n_a, f_{\pi+1})\} = P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{61}
$$

Hence, when $f_\pi < f_{\mathrm{fcount}}$, according to (58) and (61),

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hot},a}(n_a, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot},a}(n_a, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{62}
$$
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
Given the probabilistic distribution of the frequency and the mean hotspot temperature of the active core at each frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by

$$
T_{\mathrm{averhota}} = \sum_{f \in FSet} \mu_{Ta}(\mathbf{H}, n, n_a, f) \cdot P(f), \tag{63}
$$

where $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{fcount}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{Ta}(\mathbf{H}, n, n_a, f)$ denotes the mean hotspot temperature of the active core given the frequency $f$.
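Theorem 5 and (63) lend themselves to a short numerical sketch. The Python code below is an illustration only: the function names are hypothetical, and the inputs are the means and standard deviations for eight active cores as read from Table 1, together with the 100 °C threshold used in the experiments. It evaluates (56) with the normal CDF and then applies the weighted sum in (63).

```python
from statistics import NormalDist


def frequency_distribution(levels, threshold):
    """Probability of each frequency under the zero-slack policy, after (56).

    levels: (frequency, mean, std) triples sorted by increasing frequency,
    where mean and std describe the hotspot temperature at that frequency.
    """
    below = [NormalDist(mu, sigma).cdf(threshold) for _, mu, sigma in levels]
    probs = {}
    for idx, (freq, _, _) in enumerate(levels):
        if idx == len(levels) - 1:
            probs[freq] = below[idx]                    # highest frequency
        else:
            probs[freq] = below[idx] - below[idx + 1]   # any lower frequency
    return probs


def average_hotspot_temperature(levels, probs):
    """Probability-weighted mean hotspot temperature, after (63)."""
    return sum(mu * probs[freq] for freq, mu, _ in levels)


# Means and standard deviations for eight active cores, read from Table 1.
EIGHT_CORES = [(0.5, 66.69, 4.48), (0.67, 78.46, 6.00),
               (0.83, 89.54, 7.43), (1.0, 101.31, 8.95)]
probs = frequency_distribution(EIGHT_CORES, threshold=100.0)
avg_temp = average_hotspot_temperature(EIGHT_CORES, probs)
dfs_trigger = 1.0 - probs[1.0]  # probability that the frequency is below 1
print(probs)
print(avg_temp, dfs_trigger)
```

Under these inputs the probability of running at the maximum frequency and the average temperature should come out close to the 44.16% and 93.85 °C reported for eight active cores, up to rounding of the tabulated values.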
7 Experimental Results
7.1 Experimental Methodology. A multicore version of the Alpha 21264 processor with eight cores is used as the processor model in our experiment [24]. Each core has two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies: 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector H, which corresponds to the branch predictor, is set to 1 and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C.
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{a,i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{Ed}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta, \tag{64}
$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of the processor. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{lea},a}$ and $\mathbf{G}_{l0,a}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0,\mathrm{ina}}$ in (24), as well as the variances $\sigma_{Ea}^2$ and $\sigma_{E\mathrm{ina}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{lea},a}$ and $\mathbf{E}_{\mathrm{lea,ina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the temperature range, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature range.
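The linearization step can be illustrated with a small Python sketch. It is not the authors' regression code: the values of `ALPHA`, `BETA`, and `GAMMA` are hypothetical (the PTScalar defaults are not given in the text), the temperature in (64) is assumed to be expressed in °C, and the helper names are invented for this example. The sketch fits a straight line to (64) over 60 °C to 100 °C by ordinary least squares and compares the fitting error inside that interval with the error at 40 °C.

```python
import math


def leakage_power(temp, alpha, beta, gamma):
    """Nonlinear leakage model of (64)."""
    return alpha * temp ** 2 * math.exp(gamma * temp) + beta


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variance of the regression residuals, the role played by sigma_E^2.
    residual_var = sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, ys)) / n
    return slope, intercept, residual_var


# Hypothetical parameters; the paper uses the PTScalar defaults instead.
ALPHA, BETA, GAMMA = 1e-5, 0.05, 0.02
temps = [float(t) for t in range(60, 101)]
powers = [leakage_power(t, ALPHA, BETA, GAMMA) for t in temps]
slope, intercept, residual_var = fit_line(temps, powers)


def abs_error(temp):
    """Absolute gap between the linear estimate and (64) at one temperature."""
    return abs(slope * temp + intercept - leakage_power(temp, ALPHA, BETA, GAMMA))


print(slope, intercept, residual_var)
print(max(abs_error(t) for t in temps), abs_error(40.0))
```

Because (64) is convex in temperature, the fitted line tracks it best inside the regression interval and drifts away outside it, which is the trade-off described above.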
7.2 Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. The error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers differ among functional units, and a higher correlation results in a lower error rate. Accordingly, the IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea},a}$, and $\mathbf{E}_{\mathrm{lea,ina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores. (Actual and estimated leakage power in W plotted against temperature in °C.)

Figure 4: Leakage power estimation of hotspot for inactive cores. (Actual and estimated leakage power in units of 10^-4 W plotted against temperature in °C.)

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

Number of      Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores   μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6              51.93     4.03      61.25     5.40      70.02     6.69      79.35     8.06
7              59.20     4.25      69.73     5.69      79.64     7.05      90.17     8.50
8              66.69     4.48      78.46     6.00      89.54     7.43      101.31    8.95

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

Number of      Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores   μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6              40.59     1.71      47.08     2.29      53.19     2.84      59.68     3.42
7              47.78     1.95      55.46     2.61      62.69     3.23      70.37     3.89

Figure 5: Error rate of the power estimation. (Error rate in % per functional unit for dynamic power, leakage power for active core, and leakage power for inactive core.)
7.3 Probabilistic Distribution of Temperature. Under the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. The hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. From the characteristics of the normal distribution curve, the shape of the probability density curve reflects the spread of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
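The link between peak height and standard deviation follows from the normal density, whose maximum value is $1/(\sigma\sqrt{2\pi})$. The brief Python sketch below illustrates this with the standard deviations for six active cores read from Table 1; the function name is an assumption made for this example.

```python
import math


def normal_peak_density(sigma):
    """Maximum of a normal density, reached at the mean."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Standard deviations for six active cores at frequencies 0.5, 0.67, 0.83, and 1.
sigmas = [4.03, 5.40, 6.69, 8.06]
peaks = [normal_peak_density(s) for s in sigmas]
print(peaks)
```

The peaks shrink as the frequency rises: doubling the standard deviation from 4.03 to 8.06 halves the peak height.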
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so this case does not appear in Table 2 and Figure 7. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean value and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4 Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency below 1 implies that the hotspot temperature has surpassed the threshold and the DFS technique has been triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. (Each panel plots probability density against temperature in °C for frequencies 0.5, 0.67, 0.83, and 1.)
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and the DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

Number of      Average hotspot temperature (°C)    Probability for exceeding temperature threshold (%)
active cores   With DFS    Without DFS             With DFS    Without DFS
6              79.30       79.35                   0           0.52
7              88.85       90.17                   0           12.37
8              93.85       101.31                  0           55.84
Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. (Each panel plots probability density against temperature in °C for frequencies 0.5, 0.67, 0.83, and 1.)
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16% and the probability for triggering DFS is 55.84%. This means that all cores will run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency will be scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0%.
7.5 Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without the DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with the DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without the DFS when the number of active cores is 6, 7, and 8.
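The probability of exceeding the threshold without DFS is the upper tail of the normal distribution at frequency 1. The Python sketch below illustrates that calculation under stated assumptions: the function name is hypothetical, and the means and standard deviations are the frequency-1 entries read from Table 1 for six, seven, and eight active cores.

```python
from statistics import NormalDist


def exceed_probability(mean, std, threshold=100.0):
    """Probability that a normally distributed temperature exceeds the threshold."""
    return 1.0 - NormalDist(mean, std).cdf(threshold)


# (mean, std) of the active-core hotspot temperature at normalized frequency 1.
FULL_SPEED = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}
exceed = {cores: exceed_probability(mu, sigma)
          for cores, (mu, sigma) in FULL_SPEED.items()}
for cores, prob in exceed.items():
    print(cores, round(100.0 * prob, 2))
```

The resulting percentages should agree with the "Without DFS" column of Table 3 to within the rounding of the tabulated means and standard deviations.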
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature will not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with the DFS is lower than that without the DFS.

Figure 8: Probabilistic distribution of frequencies. (Probability of each frequency in % for frequencies 1, 0.83, 0.67, and 0.5 when the number of active cores is 6, 7, and 8.)

Figure 9: Probability for triggering DFS. (Probability in % when the number of active cores is 6, 7, and 8.)
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of the DFS weakens until it disappears. Therefore, for the processors with and without the DFS, the average hotspot temperatures become closer as the number of active cores decreases, as shown in Table 3. Even if the DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature will not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger the DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0 and the average hotspot temperatures of the processor with and without the DFS are the same. Because the DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
8 Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows a normal probabilistic distribution under the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.

From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers differ among functional units; (2) the estimation error rates of leakage powers for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests
Authors' Contributions

Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon, et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
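Substituting (33) into (36) gives

\[
\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} \\
&+ f\bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}X_{\mathrm{a}i} \\
&+ f\bigl(\boldsymbol{\Phi}_4(\mathbf{R}+n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(\boldsymbol{\Phi}_4(\mathbf{R}+n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl((n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{37}
\]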
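Let

\[
\begin{aligned}
\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}, \\
\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f) &= f\bigl(\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{dyn}} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\boldsymbol{\Phi}_7(f) &= f\boldsymbol{\Phi}_4\mathbf{R}\mathbf{G}_{\mathrm{dyn}}, \\
\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) &= f\bigl(\boldsymbol{\Phi}_4(\mathbf{R}+n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) &= \boldsymbol{\Phi}_4(\mathbf{R}+n_{\mathrm{a}}\mathbf{Q}) + n_{\mathrm{a}}(n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}, \\
\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) &= (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q} + (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q}), \\
\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) &= (n-n_{\mathrm{a}})\boldsymbol{\Phi}_4\mathbf{Q}\mathbf{G}_{\mathrm{lea,ina}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_4\boldsymbol{\Phi}_3(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{38}
\]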
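Then (37) is transformed into

\[
\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_5(n,n_{\mathrm{a}})\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} + \boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \boldsymbol{\Phi}_7(f)X_{\mathrm{a}i} + \boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}} \\
&+ \boldsymbol{\Phi}_9(n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} + \boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{39}
\]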
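According to (39), we get

\[
\begin{aligned}
\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k} ={}& (\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ n_{\mathrm{a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}} \\
&+ n_{\mathrm{a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ n_{\mathrm{a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{40}
\]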
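The step from (39) to (40) can be spelled out as follows. This is only a sketch: it assumes, as the form of (40) suggests, that $\mathbf{E}$ without a subscript denotes the identity matrix and that $\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})$ is invertible. Summing (39) over the $n_{\mathrm{a}}$ active cores and collecting the temperature sums on the left-hand side gives

```latex
\begin{aligned}
\sum_{i=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}i}
  &= n_{\mathrm{a}}\boldsymbol{\Phi}_5\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k}
   + \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6+\boldsymbol{\Phi}_7\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k}
   + n_{\mathrm{a}}\bigl(\boldsymbol{\Phi}_8\mathbf{E}_{\mathrm{dyn}}
   + \boldsymbol{\Phi}_9\mathbf{E}_{\mathrm{lea,a}}
   + \boldsymbol{\Phi}_{10}\mathbf{E}_{\mathrm{lea,ina}}
   + \boldsymbol{\Phi}_{11}\bigr), \\
\bigl(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5\bigr)\sum_{k=1}^{n_{\mathrm{a}}}\mathbf{T}_{\mathrm{a}k}
  &= \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_6+\boldsymbol{\Phi}_7\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k}
   + n_{\mathrm{a}}\bigl(\boldsymbol{\Phi}_8\mathbf{E}_{\mathrm{dyn}}
   + \boldsymbol{\Phi}_9\mathbf{E}_{\mathrm{lea,a}}
   + \boldsymbol{\Phi}_{10}\mathbf{E}_{\mathrm{lea,ina}}
   + \boldsymbol{\Phi}_{11}\bigr).
\end{aligned}
```

Here the arguments $(n,n_{\mathrm{a}})$ and $(n,n_{\mathrm{a}},f)$ of the $\boldsymbol{\Phi}$ terms are omitted for brevity. Left-multiplying the second line by $(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5)^{-1}$ yields (40).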
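Substituting (40) into (39) gives

\[
\begin{aligned}
\mathbf{T}_{\mathrm{a}i} ={}& \boldsymbol{\Phi}_7(f)X_{\mathrm{a}i} \\
&+ \bigl(\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n,n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{41}
\]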
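The hotspot temperature $T_{\mathrm{hot,a}i}$ of the active core can be given by

\[
\begin{aligned}
T_{\mathrm{hot,a}i} = \mathbf{H}\mathbf{T}_{\mathrm{a}i} ={}& \mathbf{H}\boldsymbol{\Phi}_7(f)X_{\mathrm{a}i} \\
&+ \mathbf{H}\bigl(\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n,n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{42}
\]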
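Let

\[
\begin{aligned}
\psi_1(\mathbf{H},f) &= \mathbf{H}\boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)\bigr), \\
\psi_3(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f)\bigr), \\
\psi_4(\mathbf{H},n,n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n,n_{\mathrm{a}})\bigr), \\
\psi_5(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}})\bigr), \\
\psi_6(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}})(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{43}
\]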
Then (42) is transformed into

\[
\begin{aligned}
T_{\mathrm{hot,a}i} ={}& \psi_1(\mathbf{H},f)X_{\mathrm{a}i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{44}
\]
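Substituting (40) into (33) gives

\[
\begin{aligned}
\mathbf{T}_{\mathrm{ina}} ={}& \bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f).
\end{aligned}
\tag{45}
\]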
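The hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core is given by

\[
\begin{aligned}
T_{\mathrm{hot,ina}} = \mathbf{H}\mathbf{T}_{\mathrm{ina}} ={}& \mathbf{H}\bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{dyn}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr)\mathbf{E}_{\mathrm{lea,a}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr)\mathbf{E}_{\mathrm{lea,ina}} \\
&+ \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{46}
\]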
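Let

\[
\begin{aligned}
\xi_1(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n,n_{\mathrm{a}},f)+\boldsymbol{\Phi}_7(f)) + f\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\xi_2(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n,n_{\mathrm{a}},f) + fn_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\xi_3(\mathbf{H},n,n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n,n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\xi_4(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n,n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})(\mathbf{R}+(n-n_{\mathrm{a}})\mathbf{Q})\bigr), \\
\xi_5(\mathbf{H},n,n_{\mathrm{a}},f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{lea,a}}(\mathbf{E}-n_{\mathrm{a}}\boldsymbol{\Phi}_5(n,n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n,n_{\mathrm{a}},f) + \boldsymbol{\Phi}_2(n,n_{\mathrm{a}})\boldsymbol{\Phi}_1(n,n_{\mathrm{a}},f)\bigr).
\end{aligned}
\tag{47}
\]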
Then (46) is transformed into

\[
\begin{aligned}
T_{\mathrm{hot,ina}} ={}& \xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k} + \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} \\
&+ \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}} + \xi_5(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{48}
\]
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a}i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{\mathrm{a}i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ follow the same normal distribution. Then there exist functions $\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hot,a}i}$ of the active core follows the normal distribution with mean $\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

\[
T_{\mathrm{hot,a}i} \sim N\bigl(\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T_{\mathrm{a}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{49}
\]

There exist functions $\mu_{T_{\mathrm{ina}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ such that the hotspot temperature $T_{\mathrm{hot,ina}}$ of the inactive core follows the normal distribution with mean $\mu_{T_{\mathrm{ina}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ and variance $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$, that is,

\[
T_{\mathrm{hot,ina}} \sim N\bigl(\mu_{T_{\mathrm{ina}}}(\mathbf{H},n,n_{\mathrm{a}},f),\ \sigma_{T_{\mathrm{ina}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr).
\tag{50}
\]
Proof. According to (29), we get

\[
\begin{aligned}
T_{\mathrm{hot,a}i} ={}& \bigl(\psi_1(\mathbf{H},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)X_{\mathrm{a}i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{\substack{k=1 \\ k\ne i}}^{n_{\mathrm{a}}}X_{\mathrm{a}k} \\
&+ \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}} + \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f).
\end{aligned}
\tag{51}
\]
The normally distributed random variables $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $(\psi_1(\mathbf{H},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f))X_{\mathrm{a}i} + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,k\ne i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$ of the $X_{\mathrm{a}i}$ still follows the normal distribution.
All elements $E_{\mathrm{dyn},k}$ in the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow normal distributions and are independent, so the linear combination $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}}$ of all elements in $\mathbf{E}_{\mathrm{dyn}}$ follows a normal distribution. In the same way, $\psi_4(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$ and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}}$ also follow normal distributions. The terms $(\psi_1(\mathbf{H},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f))X_{\mathrm{a}i}$, $\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1,k\ne i}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$, and $\psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}}$ all follow normal distributions and are mutually independent, so $T_{\mathrm{hot,a}i}$ follows a normal distribution, where the mean $\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

\[
\begin{aligned}
\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f) + n_{\mathrm{a}}\psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)\mu_X + \psi_3(\mathbf{H},n,n_{\mathrm{a}},f)\mu_{E_d} + \psi_4(\mathbf{H},n,n_{\mathrm{a}})\mu_{E_{\mathrm{a}}} \\
&+ \psi_5(\mathbf{H},n,n_{\mathrm{a}},f)\mu_{E_{\mathrm{ina}}} + \psi_6(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{52}
\]
and the variance $\sigma_{T_{\mathrm{a}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

\[
\begin{aligned}
\sigma_{T_{\mathrm{a}}}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& \bigl(\psi_1(\mathbf{H},f) + \psi_2(\mathbf{H},n,n_{\mathrm{a}},f)\bigr)^2\sigma_X^2 + (n_{\mathrm{a}}-1)\psi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_X^2 \\
&+ \psi_3^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_{E_d}^2 + \psi_4^2(\mathbf{H},n,n_{\mathrm{a}})\sigma_{E_{\mathrm{a}}}^2 + \psi_5^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_{E_{\mathrm{ina}}}^2.
\end{aligned}
\tag{53}
\]
In the same way, $\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\sum_{k=1}^{n_{\mathrm{a}}}X_{\mathrm{a}k}$, $\xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H},n,n_{\mathrm{a}})\mathbf{E}_{\mathrm{lea,a}}$, and $\xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\mathbf{E}_{\mathrm{lea,ina}}$ in (30) all follow normal distributions and are mutually independent, so $T_{\mathrm{hot,ina}}$ follows a normal distribution, where the mean $\mu_{T_{\mathrm{ina}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

\[
\begin{aligned}
\mu_{T_{\mathrm{ina}}}(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\xi_1(\mathbf{H},n,n_{\mathrm{a}},f)\mu_X + \xi_2(\mathbf{H},n,n_{\mathrm{a}},f)\mu_{E_d} + \xi_3(\mathbf{H},n,n_{\mathrm{a}})\mu_{E_{\mathrm{a}}} \\
&+ \xi_4(\mathbf{H},n,n_{\mathrm{a}},f)\mu_{E_{\mathrm{ina}}} + \xi_5(\mathbf{H},n,n_{\mathrm{a}},f)
\end{aligned}
\tag{54}
\]
and the variance $\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H},n,n_{\mathrm{a}},f)$ is calculated by

\[
\begin{aligned}
\sigma_{T_{\mathrm{ina}}}^2(\mathbf{H},n,n_{\mathrm{a}},f) ={}& n_{\mathrm{a}}\xi_1^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_X^2 + \xi_2^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_{E_d}^2 + \xi_3^2(\mathbf{H},n,n_{\mathrm{a}})\sigma_{E_{\mathrm{a}}}^2 \\
&+ \xi_4^2(\mathbf{H},n,n_{\mathrm{a}},f)\sigma_{E_{\mathrm{ina}}}^2.
\end{aligned}
\tag{55}
\]
According to Theorem 4, the hotspot temperature of any core follows a normal distribution.
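The moment formulas of Theorem 4 can be evaluated numerically once the coefficients are known. The following is a minimal sketch, assuming for illustration that every coefficient $\psi_1,\ldots,\psi_6$ and $\xi_1,\ldots,\xi_5$ and every residual has already been reduced to a scalar; the function names and all numeric inputs are hypothetical and are not taken from the experiments.

```python
import math
from statistics import NormalDist


def active_hotspot_distribution(psi, n_a, mu_x, var_x, mu_e, var_e):
    """Normal distribution of the active-core hotspot temperature.

    psi   : (psi1, ..., psi6), scalar coefficients (hypothetical values)
    mu_e  : means of the (dynamic, active-leakage, inactive-leakage) residuals
    var_e : variances of the same three residuals
    """
    p1, p2, p3, p4, p5, p6 = psi
    # Mean: (psi1 + n_a*psi2)*mu_X plus the residual means and the constant psi6.
    mean = ((p1 + n_a * p2) * mu_x
            + p3 * mu_e[0] + p4 * mu_e[1] + p5 * mu_e[2] + p6)
    # Variance: core i carries weight (psi1 + psi2); the other n_a - 1 cores carry psi2.
    var = (((p1 + p2) ** 2 + (n_a - 1) * p2 ** 2) * var_x
           + p3 ** 2 * var_e[0] + p4 ** 2 * var_e[1] + p5 ** 2 * var_e[2])
    return NormalDist(mean, math.sqrt(var))


def inactive_hotspot_distribution(xi, n_a, mu_x, var_x, mu_e, var_e):
    """Normal distribution of the inactive-core hotspot temperature."""
    x1, x2, x3, x4, x5 = xi
    # All n_a active cores enter with the same weight xi1.
    mean = (n_a * x1 * mu_x
            + x2 * mu_e[0] + x3 * mu_e[1] + x4 * mu_e[2] + x5)
    var = (n_a * x1 ** 2 * var_x
           + x2 ** 2 * var_e[0] + x3 ** 2 * var_e[1] + x4 ** 2 * var_e[2])
    return NormalDist(mean, math.sqrt(var))


# Hypothetical inputs, chosen only to exercise the formulas.
active = active_hotspot_distribution(
    psi=(10.0, 1.0, 1.0, 1.0, 1.0, 30.0), n_a=8,
    mu_x=1.5, var_x=0.04, mu_e=(0.0, 0.0, 0.0), var_e=(0.01, 0.01, 0.01))
inactive = inactive_hotspot_distribution(
    xi=(2.0, 1.0, 1.0, 1.0, 20.0), n_a=6,
    mu_x=1.5, var_x=0.04, mu_e=(0.0, 0.0, 0.0), var_e=(0.01, 0.01, 0.01))
```

Returning a `NormalDist` object keeps the mean and the standard deviation together, so quantities such as the probability of staying below a threshold can be read off with its `cdf` method.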
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to a value that brings the hotspot temperature to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases no frequency in the set makes the hotspot temperature exactly equal to the threshold.
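With discrete frequencies, one natural reading of the zero-slack policy is to run at the highest frequency whose hotspot temperature does not exceed the threshold. The sketch below illustrates that reading only; the function name is hypothetical, and the temperatures are the mean values listed in Table 1 for eight active cores, treated here as if they were deterministic.

```python
def zero_slack_frequency(hotspot_temperature, threshold):
    """Return the highest frequency whose hotspot temperature stays at or
    below the threshold, or None if no frequency in the set qualifies.

    hotspot_temperature: dict mapping normalized frequency -> temperature (deg C)
    """
    feasible = [f for f, t in hotspot_temperature.items() if t <= threshold]
    return max(feasible) if feasible else None


# Mean hotspot temperatures for eight active cores (Table 1), used as a
# deterministic stand-in for illustration only.
MEAN_TEMPERATURE_8_CORES = {0.5: 66.69, 0.67: 78.46, 0.83: 89.54, 1.0: 101.31}

selected = zero_slack_frequency(MEAN_TEMPERATURE_8_CORES, threshold=100.0)
```

In this example the full frequency is rejected because 101.31 °C exceeds the 100 °C threshold, so the next lower frequency is chosen even though it leaves some slack below the threshold.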
Theorem 5. Let $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{f_{\mathrm{count}}}$; then the probabilistic distribution of the frequency $f$ follows

\[
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{f_{\mathrm{count}}}, \\
P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{f_{\mathrm{count}}},
\end{cases}
\tag{56}
\]

where $T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi)$ is the function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$ representing the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{f_{\mathrm{count}}}$, according to the zero-slack DTM policy, obviously

\[
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\}.
\tag{57}
\]
When $f_\pi < f_{f_{\mathrm{count}}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) If $T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,

\[
P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) > T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\} = 0.
\tag{58}
\]

(b) If $T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})$, the probability of $f = f_\pi$ is given by

\[
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\}.
\end{aligned}
\tag{59}
\]
The right-hand side of (59) can be derived by

\[
\begin{aligned}
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\}, \\
&P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
\]
According to (59) and (60), we get

\[
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{61}
\]

Hence, when $f_\pi < f_{f_{\mathrm{count}}}$, according to (58) and (61), we get

\[
P\{f = f_\pi\} = P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hot,a}}(n_{\mathrm{a}},f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{62}
\]
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
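The distribution in (56) can be evaluated directly from the normal distributions of the hotspot temperature at each frequency. The sketch below is illustrative rather than a reproduction of the experiments: the function name is hypothetical, and the means and standard deviations are the Table 1 values for eight active cores with the 100 °C threshold.

```python
from statistics import NormalDist


def frequency_distribution(temperature_by_frequency, threshold):
    """Evaluate (56): probability of each discrete frequency.

    temperature_by_frequency: dict mapping frequency -> NormalDist of the
    active-core hotspot temperature at that frequency.
    """
    freqs = sorted(temperature_by_frequency)
    # P{T_hot(f) <= threshold} for every frequency, lowest frequency first.
    p_below = [temperature_by_frequency[f].cdf(threshold) for f in freqs]
    probabilities = {}
    for idx, f in enumerate(freqs):
        if idx == len(freqs) - 1:
            # Highest frequency: first case of (56).
            probabilities[f] = p_below[idx]
        else:
            # Lower frequencies: second case of (56).
            probabilities[f] = p_below[idx] - p_below[idx + 1]
    return probabilities


# Table 1, eight active cores: (mean in deg C, standard deviation).
EIGHT_CORES = {
    0.5: NormalDist(66.69, 4.48),
    0.67: NormalDist(78.46, 6.00),
    0.83: NormalDist(89.54, 7.43),
    1.0: NormalDist(101.31, 8.95),
}

distribution = frequency_distribution(EIGHT_CORES, threshold=100.0)
```

Because the terms in (56) telescope, the probabilities add up to the probability that the lowest frequency keeps the hotspot at or below the threshold, which is slightly less than 1 in general.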
Given the probabilistic distribution of the frequency and the mean of the hotspot temperature of the active core at each frequency, the average hotspot temperature of the active core $T_{\mathrm{averhot,a}}$ can be obtained by

\[
T_{\mathrm{averhot,a}} = \sum_{f \in F_{\mathrm{Set}}} \mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f) \cdot P(f),
\tag{63}
\]

where $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T_{\mathrm{a}}}(\mathbf{H},n,n_{\mathrm{a}},f)$ denotes the mean of the hotspot temperature of the active core given the frequency $f$.
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, the active state and the inactive state, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies, that is, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector H, corresponding to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C.
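The frequency normalization and the hotspot selection vector described above can be sketched as follows. The helper names are hypothetical, and the number of functional units (five) is an arbitrary placeholder, since only the position of the branch predictor matters here.

```python
FREQUENCIES_GHZ = [1.5, 2.0, 2.5, 3.0]


def normalize_frequencies(frequencies_ghz):
    """Scale frequencies so that the maximum becomes 1 (two decimals)."""
    f_max = max(frequencies_ghz)
    return [round(f / f_max, 2) for f in frequencies_ghz]


def hotspot_selection_vector(num_units, hotspot_index):
    """Build H: 1 at the hotspot unit, 0 everywhere else."""
    h = [0] * num_units
    h[hotspot_index] = 1
    return h


def hotspot_temperature(h, unit_temperatures):
    """Apply H to a vector of per-unit temperatures (dot product)."""
    return sum(hi * ti for hi, ti in zip(h, unit_temperatures))


normalized = normalize_frequencies(FREQUENCIES_GHZ)
# Second element (index 1) corresponds to the branch predictor;
# the vector length of 5 is a placeholder, not the real unit count.
H = hotspot_selection_vector(num_units=5, hotspot_index=1)
```

Applying `hotspot_temperature(H, temps)` to any per-unit temperature list then simply returns the branch predictor entry.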
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{E_d}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]

\[
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta,
\tag{64}
\]
where $\alpha$, $\beta$, and $\gamma$ are parameters which depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{lea,a}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{lea,ina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E_{\mathrm{a}}}^2$ and $\sigma_{E_{\mathrm{ina}}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{lea,a}}$ and $\mathbf{E}_{\mathrm{lea,ina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature range.
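The effect of fitting the linear leakage model only over 60 °C to 100 °C can be illustrated with a small ordinary least-squares fit of (64). This is a sketch under stated assumptions: the values of $\alpha$, $\beta$, and $\gamma$ below are hypothetical placeholders, not the PTScalar defaults, and temperature is taken in °C purely for illustration.

```python
import math

# Hypothetical parameters of (64); NOT the PTScalar defaults.
ALPHA, BETA, GAMMA = 1e-6, 0.1, 0.03


def leakage_power(temperature):
    """Nonlinear leakage model of (64)."""
    return ALPHA * temperature ** 2 * math.exp(GAMMA * temperature) + BETA


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Regress only over the interval the DTM-controlled processor operates in.
fit_temperatures = list(range(60, 101))
slope, intercept = fit_line(fit_temperatures,
                            [leakage_power(t) for t in fit_temperatures])


def absolute_error(temperature):
    """Gap between the nonlinear model and its linear approximation."""
    return abs(leakage_power(temperature) - (slope * temperature + intercept))


max_error_in_range = max(absolute_error(t) for t in fit_temperatures)
error_at_40 = absolute_error(40)
```

Because (64) is convex in temperature, the fitted line tracks it best inside the regression interval, and the error grows when the line is extrapolated to lower temperatures such as 40 °C.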
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and the estimated values of leakage power for active cores, and Figure 4 shows the same comparison for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in the other intervals.

After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. It can be seen that the error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers of the various functional units are different. A lower error rate results from a higher correlation. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea,a}}$, and $\mathbf{E}_{\mathrm{lea,ina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores. (Plot of actual and estimated leakage power (W) versus temperature (°C).)

Figure 4: Leakage power estimation of hotspot for inactive cores. (Plot of actual and estimated leakage power (W, scale ×10^-4) versus temperature (°C).)

Figure 5: Error rate of the power estimation. (Bar chart of error rate (%) per functional unit for dynamic power, leakage power for active core, and leakage power for inactive core.)

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

Number of        Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores     μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6                51.93     4.03      61.25     5.40      70.02     6.69      79.35     8.06
7                59.20     4.25      69.73     5.69      79.64     7.05      90.17     8.50
8                66.69     4.48      78.46     6.00      89.54     7.43      101.31    8.95

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

Number of        Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores     μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6                40.59     1.71      47.08     2.29      53.19     2.84      59.68     3.42
7                47.78     1.95      55.46     2.61      62.69     3.23      70.37     3.89
7.3. Probabilistic Distribution of Temperature. Assuming that no DTM techniques are used to control the temperature of the processor, the probabilistic distribution of the hotspot temperatures of both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and standard deviations of the hotspot temperature distribution for active cores, and the corresponding probability density curves are shown in Figure 6. The hotspot temperature of the processor is not deterministic: it varies significantly even for a fixed number of active cores and a fixed frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency yields a higher temperature, and vice versa. The shape of a normal probability density curve depends on the standard deviation of the random variable: a curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. Figure 6 shows that the probability density curves for the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency and that a higher frequency yields a higher variation of temperature.
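The relation between peak height and standard deviation can be checked numerically. The following is a minimal sketch, assuming only the standard formula for the maximum of a normal density, 1/(σ√(2π)), and the standard deviations listed in Table 1 for six active cores; the function and variable names are illustrative and do not come from the paper.

```python
import math

def normal_peak_density(sigma: float) -> float:
    """Maximum of a normal pdf, attained at the mean: 1 / (sigma * sqrt(2*pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# Standard deviations from Table 1 (six active cores), keyed by normalized frequency.
SIGMA_SIX_ACTIVE = {0.5: 4.03, 0.67: 5.40, 0.83: 6.69, 1.0: 8.06}

# Peak density of each curve; a larger sigma must give a lower peak.
peaks = {f: normal_peak_density(s) for f, s in SIGMA_SIX_ACTIVE.items()}

for f, p in peaks.items():
    print(f"frequency {f}: sigma = {SIGMA_SIX_ACTIVE[f]:.2f}, peak density = {p:.4f}")
```

Under these inputs the peak falls from roughly 0.099 at frequency 0.5 to roughly 0.050 at frequency 1, which agrees with the 0 to 0.1 density range of the Figure 6 panels.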
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The frequency and the number of active cores affect the hotspot temperature of inactive cores in the same way as they affect that of active cores, except that the mean and the variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature surpassed the threshold and that the DFS technique was triggered to reduce the frequency of the processor. So the probability of triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Each panel plots probability density (0 to 0.1) against temperature (20 to 140 °C), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, when more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and DFS need not be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

Number of        Average hotspot temperature (°C)     Probability of exceeding temperature threshold (%)
active cores     With DFS     Without DFS             With DFS     Without DFS
6                79.30        79.35                   0            0.52
7                88.85        90.17                   0            12.37
8                93.85        101.31                  0            55.84
Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. [Each panel plots probability density (0 to 0.25) against temperature (20 to 140 °C), with one curve for each of the frequencies 0.5, 0.67, 0.83, and 1.]
Different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability of triggering DFS decreases. When the number of active cores is less than six, the power saved by shutting off more than two cores guarantees that the active cores run at full speed regardless of the IPCs, so the probability that the processor runs at the maximum frequency is 100% and the probability of triggering DFS is 0%.
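These probabilities can be approximately reproduced from Table 1. The sketch below assumes that DFS is triggered exactly when the full-speed hotspot temperature of an active core, taken as normally distributed with the Table 1 mean and standard deviation at frequency 1, exceeds the 100 °C threshold. The names `FULL_SPEED_STATS` and `prob_trigger_dfs` are illustrative, and small differences from the reported percentages are expected because the table values are rounded.

```python
from statistics import NormalDist

T_THRESHOLD = 100.0  # thermal threshold in degrees Celsius (Section 7.1)

# (mean, standard deviation) of the active-core hotspot temperature at
# normalized frequency 1, taken from Table 1, keyed by number of active cores.
FULL_SPEED_STATS = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}

def prob_trigger_dfs(mean: float, std: float, threshold: float = T_THRESHOLD) -> float:
    """P(T > threshold) for T ~ N(mean, std^2)."""
    return 1.0 - NormalDist(mean, std).cdf(threshold)

trigger = {n: prob_trigger_dfs(m, s) for n, (m, s) in FULL_SPEED_STATS.items()}

for n, p in trigger.items():
    print(f"{n} active cores: P(trigger DFS) = {100 * p:.2f}%, P(full speed) = {100 * (1 - p):.2f}%")
```

The same quantity is the probability of exceeding the threshold without DFS, so the results can also be compared with the last column of Table 3.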
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed; that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. The average hotspot temperature of the active core with DFS can then be obtained according to (63). Table 3 compares the active cores with and without DFS, in terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. The hotspot temperature therefore does not exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used, the running frequency is always the maximum and the hotspot temperature may exceed the threshold; that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.

Figure 8: Probabilistic distribution of frequencies. [Bar chart. Horizontal axis: number of active cores (6, 7, 8); vertical axis: probability of frequency (%), 0 to 100; series: frequency is 1, 0.83, 0.67, and 0.5.]

Figure 9: Probability for triggering DFS. [Bar chart. Horizontal axis: number of active cores (6, 7, 8); vertical axis: probability for triggering DFS (%), 0 to 60.]
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of DFS on the processor weakens until it disappears. Therefore, the average hotspot temperatures of the processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even if DFS is not used, the probability that the hotspot temperature exceeds the threshold decreases to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, when more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores need not trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
8 Conclusions
In this paper, a probabilistic method for analyzing the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows a normal distribution under the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.

The experimental results show that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlation between the IPC and the dynamic power differs from one functional unit to another; (2) the estimation error rates of the leakage powers for both active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60 °C and 100 °C at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
Competing Interests
The authors declare no competing interests.

Authors' Contributions

Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, the Beijing Natural Science Foundation (4142034), and the Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, “Recent thermal management techniques for microprocessors,” ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, “Temperature-aware microarchitecture: modeling and implementation,” ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, “HotSpot: a compact thermal modeling methodology for early-stage VLSI design,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, “Parameterized architecture-level dynamic thermal models for multicore microprocessors,” ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, “Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors,” ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, “Temperature-aware DVFS for hard real-time applications on multicore processors,” IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, “Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, “Improving the performance of thermally constrained multi core processors using DTM techniques,” in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES ’13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, “RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs,” IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, “Fast and accurate thermal modeling and simulation of many-core processors and workloads,” Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon, et al., “Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation,” IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, “Dynamic thermal management under soft thermal constraints,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, “Dynamic thermal management for multi-core microprocessors considering transient thermal effects,” in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC ’13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, “Thermal analysis of multiprocessor SoC applications by simulation and verification,” ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, “Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, “Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, “Exploring the effects of on-chip thermal variation on high-performance multicore architectures,” ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, “Power estimation for alpha 21264 using performance events and impact of ambient temperature,” International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, “Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control,” in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD ’09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, “Fast and accurate thermal simulation and modelling of workloads of many-core processors,” in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC ’11), September 2011.
[21] J. Lee and N. S. Kim, “Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating,” in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC ’09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, “Throughput of multi-core processors under thermal constraints,” in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED ’07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, “Analytical leakage-aware thermal modeling of a real-time system,” IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, “The alpha 21264 microprocessor,” IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, “Temperature and supply voltage aware performance and power modeling at microarchitecture level,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, “MiBench: a free, commercially representative embedded benchmark suite,” in Proceedings of the IEEE International Workshop of Workload Characterization (WWC ’01), Washington, DC, USA, 2001.
[27] J. L. Henning, “SPEC CPU2000: measuring CPU performance in the new millennium,” Computer, vol. 33, no. 7, pp. 28–35, 2000.
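Let

$$
\begin{aligned}
\psi_1(\mathbf{H}, f) &= \mathbf{H}\boldsymbol{\Phi}_7(f), \\
\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + \boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f)\bigr), \\
\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f)\bigr), \\
\psi_4(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_9(n, n_{\mathrm{a}})\bigr), \\
\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}})\bigr), \\
\psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}})(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f)\bigr).
\end{aligned}
\tag{43}
$$

Then (42) is transformed into

$$
\begin{aligned}
T_{\mathrm{hota}i} ={}& \psi_1(\mathbf{H}, f)\,X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{leaa}} \\
&+ \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{leaina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{44}
$$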
Substitute (40) into (33) and get
Tina = (Φ2
(119899 119899a)QGleaa (E minus 119899aΦ5 (119899 119899a))minus1
sdot (119899aΦ6 (119899 119899a 119891) +Φ7
(119891)) + 119891Φ2
(119899 119899a)QGdyn)
sdot
119899a
sum
119896=1
119883a119896 + (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ8
(119899 119899a 119891)
+ 119891119899aΦ2 (119899 119899a)Q)Edyn + (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ9
(119899 119899a)
+ 119899aΦ2 (119899 119899a)Q)Eleaa + (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ10
(119899 119899a)
+Φ2
(119899 119899a) (R + (119899 minus 119899a)Q))Eleaina
+ 119899aΦ2 (119899 119899a)QGleaa (E minus 119899aΦ5 (119899 119899a))minus1
sdotΦ11
(119899 119899a 119891) +Φ2
(119899 119899a)Φ1 (119899 119899a 119891)
(45)
The hotspot temperature 119879hotina of the inactive core isgiven by
119879hotina = HTina = H (Φ2
(119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
sdot (119899aΦ6 (119899 119899a 119891) +Φ7
(119891)) + 119891Φ2
(119899 119899a)QGdyn)
sdot
119899a
sum
119896=1
119883a119896 +H (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ8
(119899 119899a 119891)
+ 119891119899aΦ2 (119899 119899a)Q)Edyn +H (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ9
(119899 119899a)
+ 119899aΦ2 (119899 119899a)Q)Eleaa +H (119899aΦ2 (119899 119899a)
sdotQGleaa (E minus 119899aΦ5 (119899 119899a))minus1
Φ10
(119899 119899a)
+Φ2
(119899 119899a) (R + (119899 minus 119899a)Q))Eleaina
+H (119899aΦ2 (119899 119899a)QGleaa (E minus 119899aΦ5 (119899 119899a))minus1
sdotΦ11
(119899 119899a 119891) +Φ2
(119899 119899a)Φ1 (119899 119899a 119891))
(46)
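Let

$$
\begin{aligned}
\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}(n_{\mathrm{a}}\boldsymbol{\Phi}_6(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_7(f)) + f\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{dyn}}\bigr), \\
\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_8(n, n_{\mathrm{a}}, f) + f n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\xi_3(\mathbf{H}, n, n_{\mathrm{a}}) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_9(n, n_{\mathrm{a}}) + n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\bigr), \\
\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{10}(n, n_{\mathrm{a}}) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})(\mathbf{R} + (n - n_{\mathrm{a}})\mathbf{Q})\bigr), \\
\xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f) &= \mathbf{H}\bigl(n_{\mathrm{a}}\boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\mathbf{Q}\mathbf{G}_{\mathrm{leaa}}(\mathbf{E} - n_{\mathrm{a}}\boldsymbol{\Phi}_5(n, n_{\mathrm{a}}))^{-1}\boldsymbol{\Phi}_{11}(n, n_{\mathrm{a}}, f) + \boldsymbol{\Phi}_2(n, n_{\mathrm{a}})\boldsymbol{\Phi}_1(n, n_{\mathrm{a}}, f)\bigr).
\end{aligned}
\tag{47}
$$

Then (46) is transformed into

$$
\begin{aligned}
T_{\mathrm{hotina}} ={}& \xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k} + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{dyn}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{leaa}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{leaina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{48}
$$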
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{\mathrm{a}i}$ for the IPC follows the normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{\mathrm{a}i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ follow the same normal distribution. Then there exist functions $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T\mathrm{a}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hota}i}$ of the active core follows the normal distribution with mean $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T\mathrm{a}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,

$$
T_{\mathrm{hota}i} \sim N\bigl(\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T\mathrm{a}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\bigr).
\tag{49}
$$

There also exist functions $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ such that the hotspot temperature $T_{\mathrm{hotina}}$ of the inactive core follows the normal distribution with mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ and variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$, that is,

$$
T_{\mathrm{hotina}} \sim N\bigl(\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f),\ \sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\bigr).
\tag{50}
$$
Proof. According to (29),

$$
\begin{aligned}
T_{\mathrm{hota}i} ={}& \bigl(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\bigr)X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{\substack{k=1 \\ k \ne i}}^{n_{\mathrm{a}}} X_{\mathrm{a}k} + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{dyn}} \\
&+ \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\,\mathbf{E}_{\mathrm{leaa}} + \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mathbf{E}_{\mathrm{leaina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f).
\end{aligned}
\tag{51}
$$

The normally distributed random variables $X_{\mathrm{a}i}$ and $X_{\mathrm{a}j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f))X_{\mathrm{a}i} + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \ne i}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$ still follows the normal distribution.

All elements $E_{\mathrm{dyn}k}$ of the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$ of all elements of $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}}$ and $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ also follow the normal distribution. The terms $(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f))X_{\mathrm{a}i}$, $\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1, k \ne i}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$, $\psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}}$, and $\psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hota}i}$ follows the normal distribution, where the mean $\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{a}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \bigl(\psi_1(\mathbf{H}, f) + n_{\mathrm{a}}\psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\bigr)\mu_X + \psi_3(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mu_{E\mathrm{d}} + \psi_4(\mathbf{H}, n, n_{\mathrm{a}})\,\mu_{E\mathrm{a}} \\
&+ \psi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mu_{E\mathrm{ina}} + \psi_6(\mathbf{H}, n, n_{\mathrm{a}}, f)
\end{aligned}
\tag{52}
$$

and the variance $\sigma_{T\mathrm{a}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{a}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& \bigl(\psi_1(\mathbf{H}, f) + \psi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\bigr)^2\sigma_X^2 + (n_{\mathrm{a}} - 1)\,\psi_2^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_X^2 + \psi_3^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_{E\mathrm{d}}^2 \\
&+ \psi_4^2(\mathbf{H}, n, n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2 + \psi_5^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{53}
$$

In the same way, $\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\sum_{k=1}^{n_{\mathrm{a}}} X_{\mathrm{a}k}$, $\xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H}, n, n_{\mathrm{a}})\mathbf{E}_{\mathrm{leaa}}$, and $\xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\mathbf{E}_{\mathrm{leaina}}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hotina}}$ follows the normal distribution, where the mean $\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\mu_{T\mathrm{ina}}(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\,\xi_1(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mu_X + \xi_2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mu_{E\mathrm{d}} + \xi_3(\mathbf{H}, n, n_{\mathrm{a}})\,\mu_{E\mathrm{a}} \\
&+ \xi_4(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\mu_{E\mathrm{ina}} + \xi_5(\mathbf{H}, n, n_{\mathrm{a}}, f)
\end{aligned}
\tag{54}
$$

and the variance $\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f)$ is calculated by

$$
\begin{aligned}
\sigma_{T\mathrm{ina}}^2(\mathbf{H}, n, n_{\mathrm{a}}, f) ={}& n_{\mathrm{a}}\,\xi_1^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_X^2 + \xi_2^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_{E\mathrm{d}}^2 + \xi_3^2(\mathbf{H}, n, n_{\mathrm{a}})\,\sigma_{E\mathrm{a}}^2 \\
&+ \xi_4^2(\mathbf{H}, n, n_{\mathrm{a}}, f)\,\sigma_{E\mathrm{ina}}^2.
\end{aligned}
\tag{55}
$$
According to Theorem 4, the hotspot temperature of any core follows a normal distribution.
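The IPC-dependent part of the mean and variance of the active-core hotspot temperature can be illustrated with scalar coefficients. The sketch below is a simplified illustration, not the paper's implementation: it drops the error terms, treats the coefficients `a`, `b`, and `c` (standing in for the first, second, and last coefficient of the linear model) as hypothetical scalars, and compares the closed-form mean and variance with a seeded Monte Carlo estimate.

```python
import random

def hotspot_mean_var(a, b, c, n_active, mu_x, sigma_x):
    """Mean and variance of T = a*X_i + b*sum_k X_k + c for i.i.d. X_k ~ N(mu_x, sigma_x^2).

    X_i carries the total weight (a + b); each of the other n_active - 1 cores carries b.
    """
    mean = (a + n_active * b) * mu_x + c
    var = ((a + b) ** 2 + (n_active - 1) * b ** 2) * sigma_x ** 2
    return mean, var

def simulate(a, b, c, n_active, mu_x, sigma_x, samples=50_000, seed=0):
    """Monte Carlo estimate of the same mean and variance (seeded for repeatability)."""
    rng = random.Random(seed)
    temps = []
    for _ in range(samples):
        xs = [rng.gauss(mu_x, sigma_x) for _ in range(n_active)]
        temps.append(a * xs[0] + b * sum(xs) + c)
    m = sum(temps) / samples
    v = sum((t - m) ** 2 for t in temps) / samples
    return m, v

# Hypothetical coefficient values chosen only for illustration.
exact = hotspot_mean_var(a=20.0, b=2.0, c=45.0, n_active=8, mu_x=1.0, sigma_x=0.3)
approx = simulate(a=20.0, b=2.0, c=45.0, n_active=8, mu_x=1.0, sigma_x=0.3)
print("closed form:", exact)
print("simulation: ", approx)
```

With these illustrative values the closed form gives a mean of 81 and a variance of about 46.08, and the simulated estimates should land close to both.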
6 Frequency Analysis
The zero-slack policy is used as the DTM strategy of the processor; that is, the speed of the processor is set to the value that makes the hotspot temperature equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases, no frequency in the set makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $FSet = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{\mathrm{fcount}}\}$ be the set of frequencies of the multicore processor, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{\mathrm{fcount}}$. Then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{fcount}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{fcount}},
\end{cases}
\tag{56}
$$

where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi)$, a function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$, represents the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{\mathrm{fcount}}$, according to the zero-slack DTM policy, obviously

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}.
\tag{57}
$$

When $f_\pi < f_{\mathrm{fcount}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, then according to the zero-slack DTM strategy the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0;
\tag{58}
$$

(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
$$

The right-hand side of (59) can be derived by

$$
\begin{aligned}
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
$$

According to (59) and (60),

$$
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{61}
$$

Hence, when $f_\pi < f_{\mathrm{fcount}}$, according to (58) and (61),

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\tag{62}
$$

Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained under the assumption that the zero-slack policy is used.
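Theorem 5 can be exercised numerically with the values reported in Table 1. The following is a minimal sketch, assuming that the hotspot temperature at each frequency is normal with the Table 1 mean and standard deviation for eight active cores and that the threshold is 100 °C; the names `STATS_EIGHT` and `frequency_distribution` are illustrative and not part of the paper.

```python
from statistics import NormalDist

T_VALVE = 100.0  # temperature threshold in degrees Celsius

# (mean, standard deviation) of the active-core hotspot temperature for
# eight active cores, from Table 1, keyed by normalized frequency.
STATS_EIGHT = {
    0.5: (66.69, 4.48),
    0.67: (78.46, 6.00),
    0.83: (89.54, 7.43),
    1.0: (101.31, 8.95),
}

def frequency_distribution(stats, threshold=T_VALVE):
    """P(f = f_pi) for each discrete frequency, following the two cases of (56)."""
    freqs = sorted(stats)
    # P(T_hota(n_a, f) <= T_valve) for every frequency in the set.
    p_below = {f: NormalDist(*stats[f]).cdf(threshold) for f in freqs}
    dist = {}
    # Non-maximum frequencies: difference of two consecutive probabilities.
    for f, f_next in zip(freqs, freqs[1:]):
        dist[f] = p_below[f] - p_below[f_next]
    # Maximum frequency: probability that the threshold is not exceeded.
    dist[freqs[-1]] = p_below[freqs[-1]]
    return dist

dist_eight = frequency_distribution(STATS_EIGHT)
for f, p in sorted(dist_eight.items()):
    print(f"P(f = {f}) = {100 * p:.2f}%")
```

Under these assumptions the probability of running at the maximum frequency comes out close to the 44.16% reported in Section 7.4 for eight active cores; the small residual is attributable to rounding in the tabulated values.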
Given the probabilistic distribution of the frequency andthe mean of hotspot temperature of the active core for acertain frequency the average hotspot temperature of theactive core 119879
averhota can be obtained by
119879averhota = sum
119891isin119865Set120583119879a (H 119899 119899a 119891) sdot 119875 (119891) (63)
where 119865Set = 1198911
119891120587
119891120587+1
119891119891count denotes the set
of frequencies of multicore processors and 119875(119891) denotes theprobability that the frequency is 119891 120583
119879a(H 119899 119899a 119891) denotesthe mean of hotspot temperature of the active core given thefrequency 119891
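As an illustration of how Theorem 5 and (63) fit together, the following minimal sketch evaluates the frequency distribution and the average hotspot temperature for a single configuration. The function names are hypothetical; the means and standard deviations are the Table 1 entries for eight active cores, and the threshold is the 100°C value used in the experiments. It assumes, as in Theorem 4, that the hotspot temperature at each frequency is normally distributed.

```python
from statistics import NormalDist


def frequency_distribution(freqs, means, stds, t_valve):
    """Return {f: P(f)} following the two cases of Theorem 5.

    freqs must be sorted in ascending order; means[i] and stds[i] describe the
    normal hotspot-temperature distribution of the active core at freqs[i].
    """
    # P{T_hot,a(n_a, f) <= T_valve} for every candidate frequency
    below = [NormalDist(m, s).cdf(t_valve) for m, s in zip(means, stds)]
    probs = {}
    for i, f in enumerate(freqs):
        if i == len(freqs) - 1:
            # Highest frequency: it is used whenever it keeps T below the threshold
            probs[f] = below[i]
        else:
            # Lower frequency: f is admissible but the next higher one is not
            probs[f] = below[i] - below[i + 1]
    return probs


def average_hotspot_temperature(probs, mean_by_freq):
    """Weighted mean of the per-frequency mean temperatures, as in (63)."""
    return sum(mean_by_freq[f] * p for f, p in probs.items())


# Table 1 values for eight active cores (normalized frequencies)
freqs = [0.5, 0.67, 0.83, 1.0]
means = [66.69, 78.46, 89.54, 101.31]
stds = [4.48, 6.00, 7.43, 8.95]

probs = frequency_distribution(freqs, means, stds, t_valve=100.0)
t_aver = average_hotspot_temperature(probs, dict(zip(freqs, means)))
print(probs, t_aver)
```

With these inputs the probability of running at the maximum frequency comes out at roughly 44% and the average temperature at roughly 93.9°C, close to the values reported in Section 7.4 and Table 3; small differences are expected because the table entries are rounded.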
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies, namely, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, corresponding to the branch predictor, is set to 1 and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100°C.
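The normalization step can be reproduced in a few lines. This is only a sketch of the arithmetic described above; the variable names are illustrative, and rounding to two decimals is an assumption made to match the values quoted in the text.

```python
# Discrete frequencies of the processor model, in GHz
freqs_ghz = [1.5, 2.0, 2.5, 3.0]

# Divide by the maximum so that the highest frequency maps to 1
f_max = max(freqs_ghz)
normalized = [round(f / f_max, 2) for f in freqs_ghz]

print(normalized)  # [0.5, 0.67, 0.83, 1.0]
```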
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{a,i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma^2_{E_d}$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.
The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]
$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma / T} + \beta \tag{64}
$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{lea},a}$ and $\mathbf{G}_{l0,a}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{lea},ina}$ and $\mathbf{G}_{l0,ina}$ in (24), as well as the variances $\sigma^2_{E_a}$ and $\sigma^2_{E_{ina}}$ of the regression residuals $\mathbf{E}_{\mathrm{lea},a}$ and $\mathbf{E}_{\mathrm{lea},ina}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed over the temperature interval between 60°C and 100°C, and the regression results are used to estimate the leakage power over the whole temperature interval.
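The effect of restricting the regression to the interval between 60°C and 100°C can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the functional form of the `leakage` curve, its parameter values, and the use of kelvin inside the exponential are hypothetical and are not the PTScalar defaults. Only the procedure (fit a straight line on the interval, then reuse it everywhere) follows the text.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leakage(temp_c, alpha=0.01, beta=0.01, gamma=-3000.0):
    """Illustrative nonlinear leakage curve; form and parameters are hypothetical."""
    t_k = temp_c + 273.15  # assumed: temperature enters the model in kelvin
    return alpha * t_k ** 2 * math.exp(gamma / t_k) + beta


# Regress only on the interval of interest, 60..100 degrees Celsius
fit_temps = [60.0 + 0.5 * k for k in range(81)]
slope, intercept = fit_line(fit_temps, [leakage(t) for t in fit_temps])


def abs_error(temp_c):
    """Absolute error of the linear estimate at one temperature."""
    return abs(leakage(temp_c) - (slope * temp_c + intercept))


# Residual variance on the fitting interval, and errors inside vs outside it
residual_var = sum(abs_error(t) ** 2 for t in fit_temps) / len(fit_temps)
max_error_inside = max(abs_error(t) for t in fit_temps)
error_at_40 = abs_error(40.0)
print(slope, intercept, residual_var, max_error_inside, error_at_40)
```

For a convex curve such as this one, the linear estimate is noticeably worse at 40°C than anywhere inside the fitting interval, which is the accuracy trade-off the regression accepts.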
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 shows the comparison between the actual value and the estimated value of leakage power for active cores, and Figure 4 shows that for inactive cores. A higher estimation accuracy of leakage power is obtained in the temperature interval between 60°C and 100°C at the cost of lower accuracy in the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. It can be seen that the error rates of the dynamic powers for different functional units vary significantly. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers of the various functional units are different. A lower error rate results from a higher correlation. Obviously, the IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest linear correlation. It can also be seen from Figure 5 that the estimation error rates of leakage power for both active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{lea},a}$, and $\mathbf{E}_{\mathrm{lea},ina}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).
Figure 3: Leakage power estimation of hotspot for active cores (actual and estimated leakage power (W) versus temperature (°C)).
Figure 4: Leakage power estimation of hotspot for inactive cores (actual and estimated leakage power (×10^-4 W) versus temperature (°C)).
Figure 5: Error rate of the power estimation (error rate (%) of dynamic power, leakage power for active core, and leakage power for inactive core, for each functional unit).
Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.
| Number of active cores | f = 0.5: μ (°C) | f = 0.5: σ | f = 0.67: μ (°C) | f = 0.67: σ | f = 0.83: μ (°C) | f = 0.83: σ | f = 1: μ (°C) | f = 1: σ |
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |
Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.
| Number of active cores | f = 0.5: μ (°C) | f = 0.5: σ | f = 0.67: μ (°C) | f = 0.67: σ | f = 0.83: μ (°C) | f = 0.83: σ | f = 1: μ (°C) | f = 1: σ |
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. According to the characteristics of the normal distribution curve, the shape of the probability density curve reflects the variation of the data distribution, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data distribution, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
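The link between peak height and standard deviation follows from the normal density, whose maximum at the mean equals $1/(\sigma\sqrt{2\pi})$. The short sketch below evaluates that peak for the standard deviations listed in Table 1 for eight active cores; the helper name is illustrative.

```python
import math


def normal_peak_density(sigma):
    """Maximum of the normal density, attained at the mean."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Standard deviations from Table 1, eight active cores, keyed by frequency
sigmas_eight_cores = {0.5: 4.48, 0.67: 6.00, 0.83: 7.43, 1.0: 8.95}
peaks = {f: normal_peak_density(s) for f, s in sigmas_eight_cores.items()}

for f, peak in peaks.items():
    print(f"frequency {f}: peak density {peak:.4f}")
```

The peak drops from about 0.089 at frequency 0.5 to about 0.045 at frequency 1, in line with the observation that the curves flatten as the frequency rises.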
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so the case of eight active cores does not appear in Table 2 and Figure 7. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency less than 1 implies that the hotspot temperature has surpassed the threshold and the DFS technique has been triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores (probability density versus temperature (°C), one curve for each of the frequencies 0.5, 0.67, 0.83, and 1). (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight.
If a core is powered off and made inactive using the power-gating technique, it only dissipates leakage power, which is much less than that of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores of the processor to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.
| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |
Figure 7: The probability density of the hotspot temperature for inactive cores (probability density versus temperature (°C), one curve for each of the frequencies 0.5, 0.67, 0.83, and 1). (a) The number of active cores is six. (b) The number of active cores is seven.
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0.
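Because DFS is triggered exactly when the hotspot temperature at the maximum frequency would exceed the threshold, the trigger probability is the upper tail of the normal distribution at frequency 1. The sketch below is a hedged illustration of that calculation using the Table 1 entries for frequency 1; the function name is hypothetical.

```python
from statistics import NormalDist

T_VALVE = 100.0  # thermal threshold in degrees Celsius

# (mean, standard deviation) of the active-core hotspot temperature at f = 1,
# taken from Table 1 and keyed by the number of active cores
full_speed_stats = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}


def dfs_trigger_probability(mean, std, t_valve=T_VALVE):
    """P{T_hot,a(n_a, f = 1) > T_valve} for a normally distributed temperature."""
    return 1.0 - NormalDist(mean, std).cdf(t_valve)


trigger = {n: dfs_trigger_probability(m, s) for n, (m, s) in full_speed_stats.items()}
for n, p in trigger.items():
    print(f"{n} active cores: {100.0 * p:.2f}%")
```

The results are approximately 0.5%, 12.4%, and 55.8% for six, seven, and eight active cores, which agrees with the reported values up to rounding of the table entries.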
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, then the processor always runs at full speed, that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, then the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 presents the comparative results between the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the effect of DFS on cooling down the processor weakens until it disappears. Therefore, for the processors with and without DFS, the average hotspot temperatures become closer as the number of active cores decreases, as shown in Table 3. Even if DFS is not used, the probability that the hotspot temperature exceeds the threshold decreases to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Since there is no difference between the cases with and without DFS when the number of active cores is lower than 6, the comparisons for this situation are not given in Table 3.
Figure 8: Probabilistic distribution of frequencies (probability of frequency (%) versus number of active cores, for the frequencies 1, 0.83, 0.67, and 0.5).
Figure 9: Probability for triggering DFS (probability (%) versus number of active cores).
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows a normal probabilistic distribution under the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers for different functional units vary significantly, indicating that the linear correlations between the IPC and the dynamic powers of the various functional units are different; (2) the estimation error rates of leakage powers for both active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across the various functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60°C and 100°C at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current Ph.D. work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of the 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
According to Theorem 3, the hotspot temperature of any core can be expressed as a linear function of the IPCs of all cores.
Theorem 4. Suppose that (a) the random variable $X_{a,i}$ for the IPC follows a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$, that is, $X_{a,i} \sim N(\mu_X, \sigma_X^2)$; (b) the tasks running on different cores are mutually independent, that is, $X_{a,i}$ and $X_{a,j}$ are independent for $1 \le \forall i, j \le m$; and (c) workload balancing techniques are used in the processor such that $X_{a,i}$ and $X_{a,j}$ follow the same normal distribution. Then there exist functions $\mu_{T_a}(\mathbf{H}, n, n_a, f)$ and $\sigma_{T_a}^2(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\mathrm{hot},a,i}$ of the active core follows the normal distribution with mean $\mu_{T_a}(\mathbf{H}, n, n_a, f)$ and variance $\sigma_{T_a}^2(\mathbf{H}, n, n_a, f)$, that is,
$$
T_{\mathrm{hot},a,i} \sim N\left(\mu_{T_a}(\mathbf{H}, n, n_a, f),\ \sigma_{T_a}^2(\mathbf{H}, n, n_a, f)\right) \tag{49}
$$
There exist functions $\mu_{T_{ina}}(\mathbf{H}, n, n_a, f)$ and $\sigma_{T_{ina}}^2(\mathbf{H}, n, n_a, f)$ such that the hotspot temperature $T_{\mathrm{hot},ina}$ of the inactive core follows the normal distribution with mean $\mu_{T_{ina}}(\mathbf{H}, n, n_a, f)$ and variance $\sigma_{T_{ina}}^2(\mathbf{H}, n, n_a, f)$, that is,
$$
T_{\mathrm{hot},ina} \sim N\left(\mu_{T_{ina}}(\mathbf{H}, n, n_a, f),\ \sigma_{T_{ina}}^2(\mathbf{H}, n, n_a, f)\right) \tag{50}
$$
Proof. According to (29), we get
$$
\begin{aligned}
T_{\mathrm{hot},a,i} ={}& \left(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)\right) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1,\, k \ne i}^{n_a} X_{a,k} \\
&+ \psi_3(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{dyn}} + \psi_4(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},a} \\
&+ \psi_5(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},ina} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{51}
$$
The normally distributed random variables $X_{a,i}$ and $X_{a,j}$ are independent for $1 \le \forall i, j \le m$, so the linear combination $(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)) X_{a,i} + \psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1,\, k \ne i}^{n_a} X_{a,k}$ of the $X_{a,i}$ still follows the normal distribution.
All elements $E_{\mathrm{dyn},k}$ of the random vector $\mathbf{E}_{\mathrm{dyn}}$ follow the normal distribution and are independent, so the linear combination $\psi_3(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{dyn}}$ of all elements of $\mathbf{E}_{\mathrm{dyn}}$ follows the normal distribution. In the same way, $\psi_4(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},a}$ and $\psi_5(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},ina}$ also follow the normal distribution. The terms $(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)) X_{a,i}$, $\psi_2(\mathbf{H}, n, n_a, f) \sum_{k=1,\, k \ne i}^{n_a} X_{a,k}$, $\psi_3(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{dyn}}$, $\psi_4(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},a}$, and $\psi_5(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},ina}$ all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot},a,i}$ follows the normal distribution, where the mean $\mu_{T_a}(\mathbf{H}, n, n_a, f)$ is calculated by
$$
\begin{aligned}
\mu_{T_a}(\mathbf{H}, n, n_a, f) ={}& \left(\psi_1(\mathbf{H}, n, n_a, f) + n_a \psi_2(\mathbf{H}, n, n_a, f)\right) \mu_X + \psi_3(\mathbf{H}, n, n_a, f)\, \mu_{E_d} \\
&+ \psi_4(\mathbf{H}, n, n_a, f)\, \mu_{E_a} + \psi_5(\mathbf{H}, n, n_a, f)\, \mu_{E_{ina}} + \psi_6(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{52}
$$
and the variance $\sigma_{T_a}^2(\mathbf{H}, n, n_a, f)$ is calculated by
$$
\begin{aligned}
\sigma_{T_a}^2(\mathbf{H}, n, n_a, f) ={}& \left(\psi_1(\mathbf{H}, n, n_a, f) + \psi_2(\mathbf{H}, n, n_a, f)\right)^2 \sigma_X^2 + (n_a - 1)\, \psi_2^2(\mathbf{H}, n, n_a, f)\, \sigma_X^2 \\
&+ \psi_3^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_d}^2 + \psi_4^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_a}^2 + \psi_5^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_{ina}}^2
\end{aligned} \tag{53}
$$
In the same way, $\xi_1(\mathbf{H}, n, n_a, f) \sum_{k=1}^{n_a} X_{a,k}$, $\xi_2(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{dyn}}$, $\xi_3(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},a}$, and $\xi_4(\mathbf{H}, n, n_a, f)\, \mathbf{E}_{\mathrm{lea},ina}$ in (30) all follow the normal distribution and are mutually independent, so $T_{\mathrm{hot},s}$ (the hotspot temperature of the inactive core, written $T_{\mathrm{hot},ina}$ in the statement of the theorem) follows the normal distribution, where the mean $\mu_{T_s}(\mathbf{H}, n, n_a, f)$ is calculated by
$$
\begin{aligned}
\mu_{T_s}(\mathbf{H}, n, n_a, f) ={}& n_a\, \xi_1(\mathbf{H}, n, n_a, f)\, \mu_X + \xi_2(\mathbf{H}, n, n_a, f)\, \mu_{E_d} + \xi_3(\mathbf{H}, n, n_a, f)\, \mu_{E_a} \\
&+ \xi_4(\mathbf{H}, n, n_a, f)\, \mu_{E_{ina}} + \xi_5(\mathbf{H}, n, n_a, f)
\end{aligned} \tag{54}
$$
and the variance $\sigma_{T_s}^2(\mathbf{H}, n, n_a, f)$ is calculated by
$$
\begin{aligned}
\sigma_{T_s}^2(\mathbf{H}, n, n_a, f) ={}& n_a\, \xi_1^2(\mathbf{H}, n, n_a, f)\, \sigma_X^2 + \xi_2^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_d}^2 \\
&+ \xi_3^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_a}^2 + \xi_4^2(\mathbf{H}, n, n_a, f)\, \sigma_{E_{ina}}^2
\end{aligned} \tag{55}
$$
According to Theorem 4, the hotspot temperature of any core follows the normal probabilistic distribution.
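As a reading aid, the following Python sketch evaluates (52) and (53) for one active core. It treats each coefficient $\psi_1, \ldots, \psi_6$ as an already evaluated scalar for fixed $\mathbf{H}$, $n$, $n_{\mathrm{a}}$, and $f$, which is a simplifying assumption. The class name, the function names, and all numeric values are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass
class ActiveCoreCoefficients:
    """Hypothetical container for psi_1..psi_6, evaluated for fixed H, n, n_a, f."""
    psi1: float
    psi2: float
    psi3: float
    psi4: float
    psi5: float
    psi6: float


def active_hotspot_mean(c, n_a, mu_x, mu_ed, mu_ea, mu_eina):
    """Mean of the active-core hotspot temperature, following (52)."""
    return ((c.psi1 + n_a * c.psi2) * mu_x
            + c.psi3 * mu_ed + c.psi4 * mu_ea + c.psi5 * mu_eina + c.psi6)


def active_hotspot_variance(c, n_a, var_x, var_ed, var_ea, var_eina):
    """Variance of the active-core hotspot temperature, following (53)."""
    return ((c.psi1 + c.psi2) ** 2 * var_x       # the core's own IPC
            + (n_a - 1) * c.psi2 ** 2 * var_x    # IPCs of the other active cores
            + c.psi3 ** 2 * var_ed
            + c.psi4 ** 2 * var_ea
            + c.psi5 ** 2 * var_eina)


# Made-up coefficients and moments, for illustration only.
coeffs = ActiveCoreCoefficients(psi1=10.0, psi2=2.0, psi3=1.0,
                                psi4=1.0, psi5=1.0, psi6=30.0)
mean = active_hotspot_mean(coeffs, n_a=4, mu_x=1.5,
                           mu_ed=0.0, mu_ea=0.0, mu_eina=0.0)
variance = active_hotspot_variance(coeffs, n_a=4, var_x=0.04,
                                   var_ed=0.25, var_ea=0.01, var_eina=0.01)
print(mean, variance)
```

The split of the $\sigma_X^2$ contribution into an own-core term and an other-cores term mirrors the first two terms of (53).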
6. Frequency Analysis
The zero-slack policy is used as the DTM strategy of processors; that is, the speed of the processor is set to a value that makes the hotspot temperature equal to the threshold [7]. However, the frequencies are discrete in this work. Therefore, in most cases, no frequency in the set of frequencies makes the hotspot temperature exactly equal to the threshold.
Theorem 5. Let $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{f_{\mathrm{count}}}$; then the probabilistic distribution of the frequency $f$ follows

$$
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{\mathrm{count}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{\mathrm{count}},
\end{cases}
\tag{56}
$$

where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi)$ is a function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$, representing the hotspot temperature of the active core; $T_{\mathrm{valve}}$ is the temperature threshold of the processor; and $f_{\mathrm{count}}$ in the case conditions denotes the highest frequency $f_{f_{\mathrm{count}}}$ in $F_{\mathrm{Set}}$.
Proof. When $f_\pi = f_{\mathrm{count}}$, according to the zero-slack DTM policy, obviously

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}. \tag{57}
$$

When $f_\pi < f_{\mathrm{count}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,

$$
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0; \tag{58}
$$

(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
$$

The right-hand side of (59) can be derived by

$$
\begin{aligned}
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
&\qquad - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
&P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
$$

According to (59) and (60), we get

$$
\begin{aligned}
&P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
&\quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{61}
$$

Hence, when $f_\pi < f_{\mathrm{count}}$, according to (58) and (61), we get

$$
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{62}
$$
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained based on the assumption that the zero-slack policy is used.
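The case distinction in Theorem 5 can be sketched in a few lines of Python. The sketch assumes that the hotspot temperature at each frequency is normally distributed (Theorem 4) and uses the means and standard deviations reported for seven active cores in Table 1 together with the 100 °C threshold; the function and variable names are hypothetical.

```python
from statistics import NormalDist


def frequency_distribution(temps, t_valve):
    """Return {frequency: probability} following (56).

    temps: list of (frequency, mean, std) tuples sorted by ascending frequency,
    describing the normally distributed hotspot temperature at each frequency.
    """
    # P{T_hota(n_a, f) <= T_valve} for every frequency in the set
    below = [NormalDist(mu, sigma).cdf(t_valve) for _, mu, sigma in temps]
    dist = {}
    for idx, (freq, _, _) in enumerate(temps):
        if idx == len(temps) - 1:
            # highest frequency: first case of (56)
            dist[freq] = below[idx]
        else:
            # lower frequency: second case of (56)
            dist[freq] = below[idx] - below[idx + 1]
    return dist


# Seven active cores: (normalized frequency, mean in deg C, std) from Table 1.
seven_cores = [(0.5, 59.20, 4.25), (0.67, 69.73, 5.69),
               (0.83, 79.64, 7.05), (1.0, 90.17, 8.50)]
dist = frequency_distribution(seven_cores, t_valve=100.0)
for freq, prob in dist.items():
    print(f"f = {freq}: {prob:.4f}")
```

The probabilities sum to the probability that the lowest frequency keeps the hotspot below the threshold, which is practically 1 for these values.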
Given the probabilistic distribution of the frequency and the mean hotspot temperature of the active core at each frequency, the average hotspot temperature of the active core, $T_{\mathrm{averhota}}$, can be obtained by

$$
T_{\mathrm{averhota}} = \sum_{f \in F_{\mathrm{Set}}} \mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f) \cdot P(f), \tag{63}
$$

where $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ denotes the mean hotspot temperature of the active core given the frequency $f$.
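Equation (63) is a probability-weighted sum, which the following Python sketch spells out. It assumes normally distributed hotspot temperatures and takes the means and standard deviations for eight active cores from Table 1 and the 100 °C threshold from Section 7.1; the names are hypothetical.

```python
from statistics import NormalDist

T_VALVE = 100.0  # thermal threshold in deg C

# Eight active cores: (normalized frequency, mean in deg C, std) from Table 1,
# sorted by ascending frequency.
EIGHT_CORES = [(0.5, 66.69, 4.48), (0.67, 78.46, 6.00),
               (0.83, 89.54, 7.43), (1.0, 101.31, 8.95)]


def average_hotspot_temperature(temps, t_valve):
    """Weight the mean temperature at each frequency by P(f), as in (63)."""
    below = [NormalDist(mu, sigma).cdf(t_valve) for _, mu, sigma in temps]
    below.append(0.0)  # sentinel: the highest frequency keeps its full probability
    return sum(mu * (below[i] - below[i + 1])
               for i, (_, mu, _) in enumerate(temps))


t_aver = average_hotspot_temperature(EIGHT_CORES, T_VALVE)
print(f"average hotspot temperature with DFS: {t_aver:.2f} deg C")
```

For these inputs the result is close to the 93.85 °C listed for eight active cores with DFS in Table 3.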
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24], and there are eight cores in the processor. The cores have two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies: 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, corresponding to the branch predictor, is set to 1, and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100 °C.
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{E_d}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing
function of temperature, which is given by [25]

$$
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta, \tag{64}
$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E_{\mathrm{a}}}^2$ and $\sigma_{E_{\mathrm{ina}}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{leaa}}$ and $\mathbf{E}_{\mathrm{leaina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the temperature range, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage power is performed over the temperature interval between 60 °C and 100 °C, and the regression results are used to estimate the leakage power over the whole temperature range.
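The regression step can be illustrated with a short Python sketch: a least-squares line is fitted to a leakage curve over 60 °C to 100 °C and then evaluated outside that interval. The model function follows the form of (64) as printed here, with the temperature taken in °C. The values of ALPHA, BETA, and GAMMA are made up for illustration and are not the PTScalar defaults, and dividing the squared residuals by the number of samples is an assumption about the variance estimator.

```python
import math

# Hypothetical parameters, chosen only to give a curve of plausible shape.
ALPHA, BETA, GAMMA = 1e-5, 0.1, 0.02


def leakage_power(temp_c):
    """Nonlinear leakage model in the form of (64), temperature in deg C."""
    return ALPHA * temp_c ** 2 * math.exp(GAMMA * temp_c) + BETA


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit only over the interval that matters under DTM: 60..100 deg C.
temps = list(range(60, 101))
slope, intercept = fit_line(temps, [leakage_power(t) for t in temps])


def abs_error(temp_c):
    """Absolute gap between the nonlinear model and the fitted line."""
    return abs(leakage_power(temp_c) - (slope * temp_c + intercept))


residuals = [leakage_power(t) - (slope * t + intercept) for t in temps]
residual_variance = sum(r * r for r in residuals) / len(residuals)
max_error_inside = max(abs_error(t) for t in temps)

print(f"slope = {slope:.5f} W/degC, intercept = {intercept:.5f} W")
print(f"max error inside 60..100: {max_error_inside:.4f} W")
print(f"error at 40: {abs_error(40):.4f} W, at 120: {abs_error(120):.4f} W")
```

With these made-up parameters, the absolute error at 40 °C and at 120 °C is larger than anywhere inside the fitted interval, which is the trade-off the paragraph above describes.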
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained over the temperature interval between 60 °C and 100 °C at the cost of lower accuracy over the other intervals.
After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60 °C and 100 °C. The error rates of the dynamic power vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers of the various functional units are different, and a higher correlation results in a lower error rate. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and the dynamic power of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{leaa}}$, and $\mathbf{E}_{\mathrm{leaina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores (actual and estimated leakage power in W versus temperature in °C).

Figure 4: Leakage power estimation of hotspot for inactive cores (actual and estimated leakage power in W versus temperature in °C).

Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

Number of       Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores    μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6               51.93     4.03      61.25     5.40      70.02     6.69      79.35     8.06
7               59.20     4.25      69.73     5.69      79.64     7.05      90.17     8.50
8               66.69     4.48      78.46     6.00      89.54     7.43      101.31    8.95

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

Number of       Frequency of 0.5    Frequency of 0.67   Frequency of 0.83   Frequency of 1
active cores    μ (°C)    σ         μ (°C)    σ         μ (°C)    σ         μ (°C)    σ
6               40.59     1.71      47.08     2.29      53.19     2.84      59.68     3.42
7               47.78     1.95      55.46     2.61      62.69     3.23      70.37     3.89

Figure 5: Error rate of the power estimation (error rate in % of the dynamic power, the leakage power for the active core, and the leakage power for the inactive core, for the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg).
7.3. Probabilistic Distribution of Temperature. Based on the assumption that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. For a normal distribution, the shape of the probability density curve reflects the spread of the data, which depends on the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
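The link between peak height and standard deviation can be checked numerically: the density of a normal distribution peaks at its mean with height $1/(\sigma\sqrt{2\pi})$. The Python sketch below evaluates that peak for the eight-core entries of Table 1; the variable and function names are hypothetical.

```python
from statistics import NormalDist

# Eight active cores: normalized frequency -> (mean in deg C, std) from Table 1.
EIGHT_CORES = {0.5: (66.69, 4.48), 0.67: (78.46, 6.00),
               0.83: (89.54, 7.43), 1.0: (101.31, 8.95)}


def peak_density(mu, sigma):
    """Height of the normal density at its mean, equal to 1 / (sigma * sqrt(2*pi))."""
    return NormalDist(mu, sigma).pdf(mu)


peaks = {freq: peak_density(mu, sigma) for freq, (mu, sigma) in EIGHT_CORES.items()}
for freq, height in peaks.items():
    print(f"f = {freq}: peak density {height:.4f}")
```

The peak falls as the frequency rises, consistent with the larger standard deviations at higher frequencies.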
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so Table 2 and Figure 7 contain no case with eight active cores. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency less than 1 implies that the hotspot temperature surpassed the threshold and the DFS technique was triggered to reduce the frequency of the processor. So the probability for triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability for triggering DFS when the number of active cores is six, seven, and eight, respectively.
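Because DFS is triggered exactly when the processor cannot stay at the normalized frequency 1, the trigger probability equals the probability that the hotspot temperature at full speed exceeds the threshold. The Python sketch below computes it from the full-speed entries of Table 1, assuming normally distributed temperatures and the 100 °C threshold; the names are hypothetical.

```python
from statistics import NormalDist

T_VALVE = 100.0  # thermal threshold in deg C

# Number of active cores -> (mean in deg C, std) at normalized frequency 1 (Table 1).
FULL_SPEED = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}


def dfs_trigger_probability(mu, sigma, t_valve=T_VALVE):
    """1 - P{f = 1}, i.e. P{hotspot temperature at full speed > threshold}."""
    return 1.0 - NormalDist(mu, sigma).cdf(t_valve)


trigger = {cores: dfs_trigger_probability(mu, sigma)
           for cores, (mu, sigma) in FULL_SPEED.items()}
for cores, prob in trigger.items():
    print(f"{cores} active cores: {100 * prob:.2f}%")
```

The results agree, up to rounding of the tabulated means and standard deviations, with the exceedance probabilities without DFS listed in Table 3.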
Figure 6: The probability density of the hotspot temperature for active cores at frequencies 0.5, 0.67, 0.83, and 1 (probability density versus temperature in °C). (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight.
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the decreased power allows the remaining active cores to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

Number of       Average hotspot temperature (°C)    Probability for exceeding temperature threshold (%)
active cores    With DFS     Without DFS            With DFS     Without DFS
6               79.30        79.35                  0            0.52
7               88.85        90.17                  0            12.37
8               93.85        101.31                 0            55.84

Figure 7: The probability density of the hotspot temperature for inactive cores at frequencies 0.5, 0.67, 0.83, and 1 (probability density versus temperature in °C). (a) The number of active cores is six. (b) The number of active cores is seven.
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of tasks are low. If the IPCs increase to a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0.
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.
If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.

Figure 8: Probabilistic distribution of frequencies (probability in % of the frequencies 1, 0.83, 0.67, and 0.5 for six, seven, and eight active cores).

Figure 9: Probability for triggering DFS (probability in % for six, seven, and eight active cores).
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of DFS on the processor weakens until it disappears. Therefore, for the processors with and without DFS, the average hotspot temperatures become closer as the number of active cores decreases, as shown in Table 3. Even when DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
8. Conclusions
In this paper, a probabilistic analysis method for the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows the normal probabilistic distribution based on the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.
From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers of the various functional units are different; (2) the estimation error rates of leakage powers for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across the functional units; (3) a higher estimation accuracy of leakage powers can be obtained over the temperature interval between 60 °C and 100 °C at the cost of lower accuracy over other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests.

Authors' Contributions

Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current Ph.D. work. Hongsong Chen performed the experiments.

Acknowledgments

This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
10 Mathematical Problems in Engineering
Theorem 5. Let $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ be the set of frequencies of multicore processors, where $f_1 < \cdots < f_\pi < f_{\pi+1} < \cdots < f_{f_{\mathrm{count}}}$; then the probabilistic distribution of the frequency $f$ follows

\[
P\{f = f_\pi\} =
\begin{cases}
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}, & f_\pi = f_{f_{\mathrm{count}}}, \\
P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}, & f_\pi < f_{f_{\mathrm{count}}},
\end{cases}
\tag{56}
\]

where $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi)$, a function of the number of active cores $n_{\mathrm{a}}$ and the frequency $f_\pi$, represents the hotspot temperature of the active core, and $T_{\mathrm{valve}}$ is the temperature threshold of the processor.
Proof. When $f_\pi = f_{f_{\mathrm{count}}}$, according to the zero-slack DTM policy, obviously

\[
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\}. \tag{57}
\]

When $f_\pi < f_{f_{\mathrm{count}}}$, the probability of $f = f_\pi$ can be broken into two cases:

(a) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, according to the zero-slack DTM strategy, the probability of $f = f_\pi$ is 0, that is,

\[
P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) > T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} = 0; \tag{58}
\]

(b) if $T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})$, the probability of $f = f_\pi$ is given by

\[
\begin{aligned}
& P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}.
\end{aligned}
\tag{59}
\]

The right-hand side of (59) can be derived by

\[
\begin{aligned}
& P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) > T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} \\
& \qquad - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\}, \\
& P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}},\ T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}} \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{60}
\]

According to (59) and (60), we get

\[
\begin{aligned}
& P\{f = f_\pi \mid T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1})\} \\
& \quad = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}.
\end{aligned}
\tag{61}
\]

Hence, when $f_\pi < f_{f_{\mathrm{count}}}$, according to (58) and (61), we get

\[
P\{f = f_\pi\} = P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_\pi) \le T_{\mathrm{valve}}\} - P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_{\pi+1}) \le T_{\mathrm{valve}}\}. \tag{62}
\]
Therefore, according to Theorem 5, the probabilistic distribution of the set of frequencies can be obtained, provided that the zero-slack policy is used.
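As an illustrative sketch only (not the authors' implementation), Theorem 5 can be evaluated numerically once the hotspot temperature at each frequency is described by a normal distribution. The names `normal_cdf`, `frequency_distribution`, and `EIGHT_ACTIVE_CORES` are hypothetical; the means and standard deviations are the values reported in Table 1 for eight active cores, and the threshold is the 100°C value from the experimental setup.

```python
import math


def normal_cdf(x, mu, sigma):
    """P{X <= x} for X ~ N(mu, sigma^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def frequency_distribution(temp_params, t_valve):
    """Evaluate the two cases of Theorem 5.

    temp_params: list of (frequency, mu, sigma), sorted by increasing
                 frequency, where (mu, sigma) describe the hotspot
                 temperature of an active core at that frequency.
    Returns a dict mapping each frequency to its probability.
    """
    # P{T_hota(n_a, f) <= T_valve} for every frequency in the set
    below = [normal_cdf(t_valve, mu, sigma) for _, mu, sigma in temp_params]
    dist = {}
    for i, (freq, _, _) in enumerate(temp_params):
        if i == len(temp_params) - 1:
            # Highest frequency: first case of the theorem
            dist[freq] = below[i]
        else:
            # Lower frequency: second case of the theorem
            dist[freq] = below[i] - below[i + 1]
    return dist


# (normalized frequency, mean in deg C, standard deviation) from Table 1
EIGHT_ACTIVE_CORES = [
    (0.5, 66.69, 4.48),
    (0.67, 78.46, 6.00),
    (0.83, 89.54, 7.43),
    (1.0, 101.31, 8.95),
]

if __name__ == "__main__":
    for freq, prob in frequency_distribution(EIGHT_ACTIVE_CORES, 100.0).items():
        print(f"P(f = {freq}) = {prob:.4f}")
```

With these inputs the probability of running at the maximum frequency comes out at roughly 0.44, in line with the value reported for eight active cores in Section 7.4. The probabilities sum to $P\{T_{\mathrm{hota}}(n_{\mathrm{a}}, f_1) \le T_{\mathrm{valve}}\}$, which is effectively 1 for these parameters.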
Given the probabilistic distribution of the frequency and the mean hotspot temperature of the active core at each frequency, the average hotspot temperature of the active core $T_{\mathrm{averhota}}$ can be obtained by

\[
T_{\mathrm{averhota}} = \sum_{f \in F_{\mathrm{Set}}} \mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f) \cdot P(f), \tag{63}
\]

where $F_{\mathrm{Set}} = \{f_1, \ldots, f_\pi, f_{\pi+1}, \ldots, f_{f_{\mathrm{count}}}\}$ denotes the set of frequencies of multicore processors, $P(f)$ denotes the probability that the frequency is $f$, and $\mu_{T_{\mathrm{a}}}(\mathbf{H}, n, n_{\mathrm{a}}, f)$ denotes the mean hotspot temperature of the active core given the frequency $f$.
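The weighted sum in (63) can be checked numerically. The following is a minimal sketch, assuming the per-frequency means and standard deviations of Table 1 and the 100°C threshold; the names `TABLE_1`, `prob_below_threshold`, and `average_hotspot_temperature` are hypothetical and are not taken from the paper.

```python
import math

T_VALVE = 100.0  # thermal threshold in deg C

# (mean in deg C, standard deviation) at normalized frequencies
# 0.5, 0.67, 0.83, 1, keyed by the number of active cores (Table 1)
TABLE_1 = {
    6: [(51.93, 4.03), (61.25, 5.40), (70.02, 6.69), (79.35, 8.06)],
    7: [(59.20, 4.25), (69.73, 5.69), (79.64, 7.05), (90.17, 8.50)],
    8: [(66.69, 4.48), (78.46, 6.00), (89.54, 7.43), (101.31, 8.95)],
}


def prob_below_threshold(mu, sigma, t_valve=T_VALVE):
    """P{T <= t_valve} for T ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((t_valve - mu) / (sigma * math.sqrt(2.0))))


def average_hotspot_temperature(params, t_valve=T_VALVE):
    """Evaluate (63): sum over frequencies of mean temperature times P(f).

    params: list of (mu, sigma) ordered by increasing frequency.
    """
    below = [prob_below_threshold(mu, sigma, t_valve) for mu, sigma in params]
    # P(f) from Theorem 5: differences for lower frequencies,
    # plain probability for the highest frequency
    probs = [below[i] - below[i + 1] for i in range(len(below) - 1)]
    probs.append(below[-1])
    return sum(mu * p for (mu, _), p in zip(params, probs))


if __name__ == "__main__":
    for cores, params in TABLE_1.items():
        avg = average_hotspot_temperature(params)
        print(f"{cores} active cores: {avg:.2f} deg C")
```

Under these assumptions the results land close to the "with DFS" averages listed in Table 3 for six, seven, and eight active cores.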
7. Experimental Results
7.1. Experimental Methodology. A multicore version of the Alpha 21264 processor is used as the processor model in our experiment [24]; the processor has eight cores. Each core has two working states, active and inactive, and the working state of each core can be set separately. The processor employs a global DFS technique, which means that the frequencies of all cores in the processor are scaled uniformly. The processor uses four discrete frequencies, that is, 1.5 GHz, 2 GHz, 2.5 GHz, and 3 GHz. To facilitate analysis, the frequencies are normalized into the interval [0, 1] so that the maximum frequency is normalized to 1. After normalization, the set of frequencies is {0.5, 0.67, 0.83, 1}. According to our previous work, the hotspot of the Alpha 21264 processor is the branch predictor [18]. So the second element of the hotspot selection vector $\mathbf{H}$, which corresponds to the branch predictor, is set to 1 and the other elements are set to 0. The thermal threshold, that is, the maximum temperature allowed by the processor, is set to 100°C.
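The normalization step amounts to dividing each frequency by the maximum. A minimal sketch follows; the function name `normalize_frequencies` is hypothetical.

```python
def normalize_frequencies(freqs_ghz):
    """Scale frequencies so that the maximum frequency maps to 1."""
    f_max = max(freqs_ghz)
    return [f / f_max for f in freqs_ghz]


# The four discrete frequencies of the processor model, in GHz
FREQS_GHZ = [1.5, 2.0, 2.5, 3.0]

if __name__ == "__main__":
    normalized = normalize_frequencies(FREQS_GHZ)
    # Rounded to two decimals, as in the text
    print([round(f, 2) for f in normalized])  # [0.5, 0.67, 0.83, 1.0]
```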
HotSpot is used as the thermal model of the multicore processor, and parameters such as thermal conductance and capacitance are set to the default values of the HotSpot simulation tool [3]. In order to construct the linear model of dynamic power, PTScalar is modified to obtain both the dynamic power profiles of each functional unit in the Alpha 21264 processor and the IPC profiles [25]. The parameters of PTScalar are set to default values as well. The mean $\mu_X$ and variance $\sigma_X^2$ of the normally distributed $X_{\mathrm{a}i}$ are determined from the IPC profiles by maximum likelihood estimation. Some representative tasks, such as mesa, ammp, quake, bzip, mcf, math, and qsort from MiBench [26] and SPEC CPU2000 [27], are selected as the benchmarks. These selected tasks are mutually independent, that is, no task takes precedence over the others, so the tasks can be executed in parallel at the task level. In addition, workload balancing techniques are used in the processor. A task is not fixed on a core, and the tasks can be migrated among all cores such that the IPCs of different cores are equal. Simple linear regression analysis is used to determine the coefficients $\mathbf{G}_{\mathrm{dyn}}$ and $\mathbf{G}_{d0}$ in (21), and the variance $\sigma_{Ed}^2$ of the regression residuals $\mathbf{E}_{\mathrm{dyn}}$ is obtained.

The leakage power is a nonlinear, monotonically increasing function of temperature, which is given by [25]

\[
P_l = \alpha \cdot T^2 \cdot e^{\gamma T} + \beta, \tag{64}
\]

where $\alpha$, $\beta$, and $\gamma$ are parameters which depend on the topology, size, technology, and design of processors. In order to construct the linear model of leakage power, (64) is regressed linearly to determine the coefficients $\mathbf{G}_{\mathrm{leaa}}$ and $\mathbf{G}_{l0\mathrm{a}}$ in (22) and the coefficients $\mathbf{G}_{\mathrm{leaina}}$ and $\mathbf{G}_{l0\mathrm{ina}}$ in (24), as well as the variances $\sigma_{E\mathrm{a}}^2$ and $\sigma_{E\mathrm{ina}}^2$ of the regression residuals $\mathbf{E}_{\mathrm{leaa}}$ and $\mathbf{E}_{\mathrm{leaina}}$. The parameters $\alpha$, $\beta$, and $\gamma$ are set to the default values of PTScalar. The smaller the range of temperature, the higher the linear correlation between leakage power and temperature [7]. The temperature of a processor using DTM techniques does not exceed the maximum value, and lower temperatures have no impact on the design optimization of thermal-aware processors. Therefore, the linear regression analysis of leakage powers is performed on the temperature interval between 60°C and 100°C, and the regression results are used to estimate the leakage power over the whole temperature interval.
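The linearization of (64) over a restricted temperature interval can be sketched as follows. The parameter values `ALPHA`, `BETA`, and `GAMMA` below are placeholders chosen purely for illustration (the paper uses the PTScalar defaults, which are not listed here), and `fit_line` is a plain ordinary least squares fit written out by hand rather than any particular tool's routine.

```python
import math

# Hypothetical parameters for illustration only; NOT the PTScalar defaults
ALPHA, BETA, GAMMA = 1.0e-5, 0.05, 0.02


def leakage_power(t):
    """Nonlinear leakage model of (64): alpha * T^2 * exp(gamma * T) + beta."""
    return ALPHA * t * t * math.exp(GAMMA * t) + BETA


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Regress only on the interval between 60 and 100 deg C (0.5 deg C steps)
TEMPS = [60.0 + 0.5 * k for k in range(81)]
SLOPE, INTERCEPT = fit_line(TEMPS, [leakage_power(t) for t in TEMPS])


def abs_error(t):
    """Absolute error of the linear estimate at temperature t."""
    return abs(SLOPE * t + INTERCEPT - leakage_power(t))


if __name__ == "__main__":
    print(f"slope = {SLOPE:.5f} W/deg C, intercept = {INTERCEPT:.5f} W")
    print(f"max error inside [60, 100]: {max(abs_error(t) for t in TEMPS):.4f} W")
    print(f"error at 40 deg C: {abs_error(40.0):.4f} W")
    print(f"error at 120 deg C: {abs_error(120.0):.4f} W")
```

Because the leakage curve is convex, the fitted line is most accurate inside the regression interval and drifts away from the curve outside it, which is the trade-off described in the text.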
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and estimated values of leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained on the temperature interval between 60°C and 100°C at the cost of lower accuracy on the other intervals.

After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. It can be seen that the error rates of the dynamic powers vary significantly across functional units. For the decoder, the estimation error rate of dynamic power is only 2.62%, whereas it is 10.14% for the floating point register (FPReg). The reason is that the linear correlations between the IPC and the dynamic powers differ among functional units, and a higher correlation results in a lower error rate. The IPC and the dynamic power of the decoder have the highest linear correlation, while the IPC and the dynamic power of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power for active cores and inactive cores are similar. The estimation error rates of leakage power for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the impact of estimation errors on the analysis of temperature and frequency, the error terms $\mathbf{E}_{\mathrm{dyn}}$, $\mathbf{E}_{\mathrm{leaa}}$, and $\mathbf{E}_{\mathrm{leaina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).

Figure 3: Leakage power estimation of hotspot for active cores. [Plot of actual and estimated leakage power (W) versus temperature (°C).]

Figure 4: Leakage power estimation of hotspot for inactive cores. [Plot of actual and estimated leakage power (×10⁻⁴ W) versus temperature (°C).]

Figure 5: Error rate of the power estimation. [Bar chart of the error rate (%) of dynamic power, leakage power for active core, and leakage power for inactive core for the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg.]

Table 1: Means ($\mu$) and standard deviations ($\sigma$) of hotspot temperature distribution for active cores.

| Number of active cores | Frequency of 0.5: $\mu$ (°C) | Frequency of 0.5: $\sigma$ | Frequency of 0.67: $\mu$ (°C) | Frequency of 0.67: $\sigma$ | Frequency of 0.83: $\mu$ (°C) | Frequency of 0.83: $\sigma$ | Frequency of 1: $\mu$ (°C) | Frequency of 1: $\sigma$ |
|---|---|---|---|---|---|---|---|---|
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means ($\mu$) and standard deviations ($\sigma$) of hotspot temperature distribution for inactive cores.

| Number of active cores | Frequency of 0.5: $\mu$ (°C) | Frequency of 0.5: $\sigma$ | Frequency of 0.67: $\mu$ (°C) | Frequency of 0.67: $\sigma$ | Frequency of 0.83: $\mu$ (°C) | Frequency of 0.83: $\sigma$ | Frequency of 1: $\mu$ (°C) | Frequency of 1: $\sigma$ |
|---|---|---|---|---|---|---|---|---|
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |
7.3. Probabilistic Distribution of Temperature. Assuming that no DTM techniques are used to control the temperature of processors, the probabilistic distribution of the hotspot temperatures for both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the probabilistic distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. It can be seen that the hotspot temperature of processors is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency jointly determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency brings a higher temperature, and vice versa. For a normal distribution, the shape of the probability density curve reflects the spread of the data, which is governed by the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation of the data, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. It can be seen from Figure 6 that the probability density curves corresponding to the various frequencies have different peaks. This observation implies that the degree of temperature variation is closely correlated with the working frequency, and a higher frequency yields a higher variation of temperature.
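The link between peak height and standard deviation follows from the normal density, whose maximum is $1 / (\sigma \sqrt{2\pi})$. As a small illustrative sketch (the names `normal_peak_density` and `SIGMA_SIX_CORES` are hypothetical), the peaks for six active cores can be computed from the standard deviations in Table 1:

```python
import math


def normal_peak_density(sigma):
    """Maximum of the N(mu, sigma^2) density, attained at x = mu."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


# Standard deviations for six active cores, keyed by normalized frequency
SIGMA_SIX_CORES = {0.5: 4.03, 0.67: 5.40, 0.83: 6.69, 1.0: 8.06}

PEAKS = {f: normal_peak_density(s) for f, s in SIGMA_SIX_CORES.items()}

if __name__ == "__main__":
    for freq, peak in PEAKS.items():
        print(f"frequency {freq}: peak density {peak:.4f}")
```

The peak falls as the frequency rises, so the highest frequency has the flattest curve and the widest spread of temperature.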
Table 2 presents the means and standard deviations of the probabilistic distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so the case of eight active cores does not appear in Table 2 and Figure 7. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature has surpassed the threshold and the DFS technique has been triggered to reduce the frequency of the processor. So the probability of triggering DFS can be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively.
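Since DFS is triggered exactly when the processor cannot stay at the maximum frequency, the trigger probability equals $1 - P\{f = 1\}$, that is, the probability that the full-speed hotspot temperature exceeds the threshold. A minimal sketch, assuming the full-speed means and standard deviations from Table 1 and the 100°C threshold (the names `FULL_SPEED_TEMPERATURE` and `dfs_trigger_probability` are hypothetical):

```python
import math

T_VALVE = 100.0  # thermal threshold in deg C

# (mean in deg C, standard deviation) of the active-core hotspot temperature
# at normalized frequency 1, keyed by the number of active cores (Table 1)
FULL_SPEED_TEMPERATURE = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}


def dfs_trigger_probability(mu, sigma, t_valve=T_VALVE):
    """1 - P{f = 1} = P{T > t_valve} for T ~ N(mu, sigma^2)."""
    z = (t_valve - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


if __name__ == "__main__":
    for cores, (mu, sigma) in FULL_SPEED_TEMPERATURE.items():
        prob = dfs_trigger_probability(mu, sigma)
        print(f"{cores} active cores: {100.0 * prob:.2f}%")
```

Under these assumptions the probability grows from well under 1% for six active cores to more than half for eight active cores.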
Figure 6: The probability density of the hotspot temperature for active cores at frequencies 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Plots of probability density versus temperature (°C).]
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability for exceeding temperature threshold (%), with DFS | Probability for exceeding temperature threshold (%), without DFS |
|---|---|---|---|---|
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |

Figure 7: The probability density of the hotspot temperature for inactive cores at frequencies 0.5, 0.67, 0.83, and 1. (a) The number of active cores is six. (b) The number of active cores is seven. [Plots of probability density versus temperature (°C).]
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain extent, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability of triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores can run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability of triggering DFS is 0.
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1. So the average hotspot temperature of the active core without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used simultaneously for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold. According to (63), the average hotspot temperature of the active core with DFS can be obtained. In terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, Table 3 compares the active cores with and without DFS when the number of active cores is 6, 7, and 8.

If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. So the hotspot temperature does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.

As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores execute at full speed, and the cooling effect of DFS weakens until it disappears. Therefore, the average hotspot temperatures of processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even when DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.

Figure 8: Probabilistic distribution of frequencies. [Bar chart of the probability (%) of each frequency (1, 0.83, 0.67, and 0.5) for six, seven, and eight active cores.]

Figure 9: Probability for triggering DFS. [Bar chart of the probability (%) for triggering DFS for six, seven, and eight active cores.]
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0 and the average hotspot temperatures of the processor with and without DFS are the same. Since DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
8. Conclusions

In this paper, a probabilistic method for analyzing the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows a normal distribution under the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.

From the experimental results, it can be seen that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlations between the IPC and the dynamic powers differ among functional units, (2) the estimation error rates of leakage powers for active cores and inactive cores are similar, showing similar linear correlations between temperature and leakage power across the various functional units, (3) a higher estimation accuracy of leakage powers can be obtained on the temperature interval between 60°C and 100°C at the cost of lower accuracy on other intervals, (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature, and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
Competing Interests
The authors declare no competing interests.

Authors' Contributions

Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.

Acknowledgments

This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.
[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.
[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.
[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.
[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.
[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.
[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.
[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.
[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.
[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.
[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.
[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.
[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.
[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.
[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.
[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.
[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.
[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.
[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.
[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.
[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.
[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.
[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.
[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.
[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.
[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.
[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
Mathematical Problems in Engineering 11
dynamic power profiles of each functional unit in Alpha21264 processor and the IPC profiles [25] The parameters ofPTScalar are set to default values as well The mean 120583
119883
andvariance 120590
2
119883
of norm-distributed 119883a119894 are determined usingIPC profiles based on the maximum likelihood estimationSome representative tasks such as mesa ammp quake bzipmcfmath and qsort fromMiBench [26] and SPECCPU2000[27] are selected as the benchmarks These selected tasks aremutually independent that is no task takes precedence overthe others so the tasks can be parallel executed at the tasklevel In addition workload balancing techniques are used inthe processor A task is not fixed on a core and the tasks canbe migrated among all cores such that the IPCs of differentcores are equal The simple linear regression analysis is usedto determine the coefficients Gdyn and G
1198890
in (21) and thevariance 1205902
119864119889
of regression residuals Edyn is obtainedThe leakage power is the nonlinear monotonic increasing
function of temperature which is given by [25]
119875119897
= 120572 sdot 1198792
sdot 119890120574119879
+ 120573 (64)
where 120572 120573 and 120574 are parameters which depend on topologysize technology and design of processors In order toconstruct the linear model of leakage power (64) is regressedlinearly to determine the coefficients Gleaa and G
1198970a in (22)and the coefficients Gleaina and G
1198970ina in (24) as well asthe variance 1205902
119864a and 1205902
119864ina of regression residuals Eleaa andEleaina The parameters 120572 120573 and 120574 are set to the defaultvalues of PTScalar The smaller the range of temperaturethe higher the linear correlation between leakage power andtemperature [7] The temperature of a processor using DTMtechniques does not exceed the maximum value and thelower temperature has no impact on the design optimizationof thermal-aware processors Therefore the linear regressionanalysis of leakage powers is performed at the temperatureinterval between 60∘C and 100∘C and the regression resultsare used to estimate the leakage power at the whole tempera-ture interval
7.2. Estimated Accuracy. For the hotspot of the processor, that is, the branch predictor, Figure 3 compares the actual and the estimated leakage power for active cores, and Figure 4 does the same for inactive cores. A higher estimation accuracy of leakage power is obtained over the temperature interval between 60°C and 100°C, at the cost of lower accuracy over the other intervals.

After regression analysis, the dynamic and leakage power can be estimated with the linear models in (21), (22), and (24). Figure 5 shows the estimation error rate of the dynamic power and that of the leakage power for active cores and inactive cores in the thermal range between 60°C and 100°C. The error rates of the dynamic power vary significantly across functional units: for the decoder, the estimation error rate of dynamic power is only 2.62%, but it is 10.14% for the floating point register (FPReg). The reason is that the linear correlation between the IPC and the dynamic power differs among functional units, and a higher correlation yields a lower error rate. Evidently, the IPC and the dynamic power of
Figure 3: Leakage power estimation of hotspot for active cores (actual and estimated leakage power (W) against temperature (°C)).

Figure 4: Leakage power estimation of hotspot for inactive cores (actual and estimated leakage power (×10⁻⁴ W) against temperature (°C)).
the decoder have the highest linear correlation, while the IPC and that of FPReg have the lowest. It can also be seen from Figure 5 that the estimation error rates of leakage power are similar for active cores and inactive cores: those for active cores are between 3.15% and 3.26%, and those for inactive cores are between 3.06% and 5.91%. This is because the temperature and the leakage power of the various functional units have similar linear correlations. In order to consider the
Table 1: Means (μ) and standard deviations (σ) of hotspot temperature distribution for active cores.

| Number of active cores | Frequency 0.5: μ (°C) | Frequency 0.5: σ | Frequency 0.67: μ (°C) | Frequency 0.67: σ | Frequency 0.83: μ (°C) | Frequency 0.83: σ | Frequency 1: μ (°C) | Frequency 1: σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 51.93 | 4.03 | 61.25 | 5.40 | 70.02 | 6.69 | 79.35 | 8.06 |
| 7 | 59.20 | 4.25 | 69.73 | 5.69 | 79.64 | 7.05 | 90.17 | 8.50 |
| 8 | 66.69 | 4.48 | 78.46 | 6.00 | 89.54 | 7.43 | 101.31 | 8.95 |

Table 2: Means (μ) and standard deviations (σ) of hotspot temperature distribution for inactive cores.

| Number of active cores | Frequency 0.5: μ (°C) | Frequency 0.5: σ | Frequency 0.67: μ (°C) | Frequency 0.67: σ | Frequency 0.83: μ (°C) | Frequency 0.83: σ | Frequency 1: μ (°C) | Frequency 1: σ |
|---|---|---|---|---|---|---|---|---|
| 6 | 40.59 | 1.71 | 47.08 | 2.29 | 53.19 | 2.84 | 59.68 | 3.42 |
| 7 | 47.78 | 1.95 | 55.46 | 2.61 | 62.69 | 3.23 | 70.37 | 3.89 |
Figure 5: Error rate of the power estimation (error rate (%) of dynamic power, leakage power for active cores, and leakage power for inactive cores, for the functional units FPReg, ITLB, IL1, DTB, DL1, L2, Decode, Branch, RAT, RUU, LSQ, IALU1, IALU2, IALU3, IALU4, FPAdd, FPMul, and IntReg).
impact of estimation errors on the analysis of temperature and frequency, the error terms $E_{\mathrm{dyn}}$, $E_{\mathrm{leaa}}$, and $E_{\mathrm{leaina}}$ are introduced into the linear models of dynamic power and leakage power, as expressed in (21), (22), and (24).
7.3. Probabilistic Distribution of Temperature. Assuming that no DTM techniques are used to control the temperature of the processor, the probabilistic distribution of the hotspot temperatures of both active cores and inactive cores can be obtained according to Theorem 4. Table 1 presents the means and the standard deviations of the distribution of the hotspot temperature for active cores, and the corresponding probability density curves are shown in Figure 6. The hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency. The number of active cores and the running frequency together determine the range in which the temperature lies and the probabilistic distribution of the hotspot temperature. For the same running frequency, more active cores yield a higher temperature, and vice versa. For the same number of active cores, a higher frequency yields a higher temperature, and vice versa.

For a normal distribution, the shape of the probability density curve reflects the spread of the data, which is governed by the standard deviation of the random variable. A curve with a higher peak implies a smaller standard deviation, that is, a lower variation, whereas a curve with a lower peak implies a larger standard deviation, that is, a larger variation. Figure 6 shows that the probability density curves for the various frequencies have different peaks. This implies that the degree of temperature variation is closely related to the working frequency, and a higher frequency yields a higher variation of temperature.
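The relation between peak height and standard deviation can be made concrete: the maximum of a normal density is $1/(\sigma\sqrt{2\pi})$, attained at the mean. The short sketch below evaluates it for the six-core row of Table 1, reading the entries as 4.03, 5.40, 6.69, and 8.06 (the decimal points are an assumption about the extracted table); the helper name `normal_peak` is introduced only for illustration.

```python
import math

def normal_peak(sigma):
    """Maximum of a normal density, attained at the mean: 1 / (sigma * sqrt(2*pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# Standard deviations for six active cores (Table 1, decimal points assumed).
sigma_by_frequency = {0.5: 4.03, 0.67: 5.40, 0.83: 6.69, 1.0: 8.06}

peaks = {f: normal_peak(s) for f, s in sigma_by_frequency.items()}
for f, p in peaks.items():
    print(f"frequency {f}: peak density {p:.4f}")
```

The peaks decrease monotonically as the frequency increases, and all of them stay below 0.1, which matches the vertical axis range of Figure 6.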
Table 2 presents the means and standard deviations of the distribution of the hotspot temperature for inactive cores, and the corresponding probability density curves are shown in Figure 7. When the number of active cores is eight, no inactive core exists, so this case does not appear in Table 2 and Figure 7. The effect of the frequency and the number of active cores on the hotspot temperature of inactive cores is the same as that for active cores, except that the mean and the variation of the hotspot temperature of inactive cores are lower than those of active cores under the same frequency and the same number of active cores.
7.4. Probabilistic Distribution of Frequencies. If the power-gating and the DFS techniques are used simultaneously to manage the temperature of a processor, the probabilistic distribution of working frequencies can be determined according to Theorem 5. Figure 8 presents the probabilistic distribution of frequencies when the number of active cores is six, seven, and eight, respectively. A frequency of less than 1 implies that the hotspot temperature has reached the threshold and the DFS technique has been triggered to reduce the frequency of the processor. The probability of triggering DFS can therefore be obtained from the probabilistic distribution of frequencies. Figure 9 presents the probability of triggering DFS when the number of active cores is six, seven, and eight, respectively.
Figure 6: The probability density of the hotspot temperature for active cores at frequencies 0.5, 0.67, 0.83, and 1 (probability density against temperature (°C)). (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight.
If a core is powered off and made inactive using the power-gating technique, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power allows the remaining active cores to execute at full speed, and DFS need not be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability of triggering DFS is constantly 0. This situation is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.

| Number of active cores | Average hotspot temperature (°C), with DFS | Average hotspot temperature (°C), without DFS | Probability of exceeding temperature threshold (%), with DFS | Probability of exceeding temperature threshold (%), without DFS |
|---|---|---|---|---|
| 6 | 79.30 | 79.35 | 0 | 0.52 |
| 7 | 88.85 | 90.17 | 0 | 12.37 |
| 8 | 93.85 | 101.31 | 0 | 55.84 |

Figure 7: The probability density of the hotspot temperature for inactive cores at frequencies 0.5, 0.67, 0.83, and 1 (probability density against temperature (°C)). (a) The number of active cores is six. (b) The number of active cores is seven.
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability of triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs increase beyond a certain level, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability of triggering DFS decreases. When the number of active cores is less than six, no matter what the IPCs are, the power saved by shutting off more than two cores guarantees that the active cores can run at full speed, so the probability that the processor runs at the maximum frequency is 100% and the probability of triggering DFS is 0.
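A rough sketch of how such a frequency distribution could be derived from the normal temperature distributions is given below. It is not a reproduction of Theorem 5. It assumes a thermal threshold of 100°C (not stated in this section), assumes that for a given workload the hotspot temperature rises with frequency so that the "safe at frequency f" events are nested, and assumes that DFS selects the highest frequency whose temperature stays at or below the threshold, with the lowest level absorbing the remainder. The input values are the eight-core row of Table 1 with decimal points assumed, and all names are hypothetical.

```python
import math

THRESHOLD_C = 100.0  # assumed thermal threshold, not stated in this section

# (mean, standard deviation) of the hotspot temperature for eight active cores,
# keyed by normalized frequency (Table 1, decimal points assumed).
TABLE1_EIGHT_CORES = {
    0.5: (66.69, 4.48),
    0.67: (78.46, 6.00),
    0.83: (89.54, 7.43),
    1.0: (101.31, 8.95),
}

def normal_cdf(x, mean, std):
    """P(T <= x) for T ~ N(mean, std^2)."""
    return 0.5 * math.erfc(-(x - mean) / (std * math.sqrt(2.0)))

def frequency_distribution(stats, threshold=THRESHOLD_C):
    """P(highest sustainable frequency == f) for each discrete level f."""
    levels = sorted(stats)  # ascending frequencies
    safe = [normal_cdf(threshold, *stats[f]) for f in levels]  # P(T_f <= threshold)
    dist = {}
    for k, f in enumerate(levels):
        lower = safe[k] if k > 0 else 1.0  # lowest level absorbs the remainder
        upper = safe[k + 1] if k + 1 < len(levels) else 0.0
        dist[f] = lower - upper
    return dist

dist = frequency_distribution(TABLE1_EIGHT_CORES)
trigger_probability = 1.0 - dist[1.0]
for f in sorted(dist, reverse=True):
    print(f"frequency {f}: {100 * dist[f]:.2f}%")
print(f"probability of triggering DFS: {100 * trigger_probability:.2f}%")
```

Under these assumptions the full-speed probability for eight active cores comes out close to the 44.16% reported in the text. The probabilities for the lower levels depend on the nesting assumption and should be read as illustrative only.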
7.5. Comparisons of Temperatures with and without DFS. If the DFS technique is not used, the processor always runs at full speed, that is, the running frequency is constantly 1, so the average hotspot temperature of the active cores without DFS can be obtained according to (52). If both the power-gating and DFS techniques are used for dynamic thermal management, the running frequency can be scaled to keep the temperature of the processor under the thermal threshold, and the average hotspot temperature of the active cores with DFS can be obtained according to (63). Table 3 compares the active cores with and without DFS, in terms of the average hotspot temperature and the probability that the hotspot temperature exceeds the threshold, when the number of active cores is 6, 7, and 8.

If the DFS technique is used by the processor, the running frequency is scaled down to reduce the temperature once the hotspot temperature reaches the threshold. The hotspot temperature therefore does not exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is 0. If the DFS technique is not used by the processor, the running frequency is always the maximum and
Figure 8: Probabilistic distribution of frequencies (probability (%) of frequencies 1, 0.83, 0.67, and 0.5 against the number of active cores: 6, 7, and 8).

Figure 9: Probability for triggering DFS (probability (%) against the number of active cores: 6, 7, and 8).
the hotspot temperature may exceed the threshold, that is, the probability that the hotspot temperature exceeds the threshold is larger than 0. Therefore, the average hotspot temperature of the processor with DFS is lower than that without DFS.
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores can execute at full speed, and the cooling effect of DFS weakens until it disappears. Therefore, the average hotspot temperatures of the processors with and without DFS become closer as the number of active cores decreases, as shown in Table 3. Even when DFS is not used, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.

When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores need not trigger DFS to manage the temperature of the processor and can run at the maximum frequency. Therefore, when the number of active cores is lower than 6, the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference in this situation, the corresponding comparisons are not given in Table 3.
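The "without DFS" exceedance probabilities can be cross-checked against the normal distributions at full speed. The sketch below assumes a thermal threshold of 100°C (an assumption, since the threshold value is not stated in this section) and uses the frequency-1 means and standard deviations from Table 1 with decimal points assumed; the names are hypothetical.

```python
import math

THRESHOLD_C = 100.0  # assumed thermal threshold

# (mean, standard deviation) of the hotspot temperature at frequency 1,
# keyed by the number of active cores (Table 1, decimal points assumed).
FULL_SPEED_STATS = {6: (79.35, 8.06), 7: (90.17, 8.50), 8: (101.31, 8.95)}

def exceed_probability(mean, std, threshold=THRESHOLD_C):
    """P(T > threshold) for T ~ N(mean, std^2)."""
    return 0.5 * math.erfc((threshold - mean) / (std * math.sqrt(2.0)))

exceed_percent = {
    cores: 100.0 * exceed_probability(mean, std)
    for cores, (mean, std) in FULL_SPEED_STATS.items()
}
for cores, percent in exceed_percent.items():
    print(f"{cores} active cores: P(T > {THRESHOLD_C:.0f} degC) = {percent:.2f}%")
```

With these inputs the computed probabilities land close to the 0.52%, 12.37%, and 55.84% listed in Table 3, which suggests that the assumed threshold and the restored decimal points are mutually consistent.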
8. Conclusions

In this paper, a probabilistic method for analyzing the temperature and frequency of multicore processors is presented, taking the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC, (2) the hotspot temperature follows a normal distribution under the assumption that the IPCs of all cores follow the same normal distribution, and (3) the running frequency follows a probabilistic distribution.

The experimental results show that (1) the estimation error rates of the dynamic power vary significantly across functional units, indicating that the linear correlation between the IPC and the dynamic power differs among functional units; (2) the estimation error rates of leakage power are similar for active cores and inactive cores, showing similar linear correlations between temperature and leakage power across the functional units; (3) a higher estimation accuracy of leakage power can be obtained over the temperature interval between 60°C and 100°C, at the cost of lower accuracy over other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
Competing Interests

The authors declare no competing interests.

Authors' Contributions

Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current Ph.D. work. Hongsong Chen performed the experiments.

Acknowledgments

This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, Beijing Natural Science Foundation (4142034), and Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J. Kong, S. W. Chung, and K. Skadron, "Recent thermal management techniques for microprocessors," ACM Computing Surveys, vol. 44, no. 3, article 13, 2012.

[2] K. Skadron, M. R. Stan, K. Sankaranarayanan, W. Huang, S. Velusamy, and D. Tarjan, "Temperature-aware microarchitecture: modeling and implementation," ACM Transactions on Architecture and Code Optimization, vol. 1, no. 1, pp. 94–125, 2004.

[3] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. R. Stan, "HotSpot: a compact thermal modeling methodology for early-stage VLSI design," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 5, pp. 501–513, 2006.

[4] D. Li, S. X.-D. Tan, E. H. Pacheco, and M. Tirumala, "Parameterized architecture-level dynamic thermal models for multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 16, 2010.

[5] H. Wang, S. X.-D. Tan, D. Li, A. Gupta, and Y. Yuan, "Composable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessors," ACM Transactions on Design Automation of Electronic Systems, vol. 18, no. 2, article 28, 2013.

[6] V. Hanumaiah and S. Vrudhula, "Temperature-aware DVFS for hard real-time applications on multicore processors," IEEE Transactions on Computers, vol. 61, no. 10, pp. 1484–1494, 2012.

[7] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Performance optimal online DVFS and task migration techniques for thermally constrained multi-core processors," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 11, pp. 1677–1690, 2011.

[8] W. S. Lawrence and P. R. Kumar, "Improving the performance of thermally constrained multi core processors using DTM techniques," in Proceedings of the International Conference on Information Communication and Embedded Systems (ICICES '13), pp. 1035–1040, IEEE, Chennai, India, February 2013.

[9] K. Chen, E. Chang, H. Li, and A. Wu, "RC-based temperature prediction scheme for proactive dynamic thermal management in throttle-based 3D NoCs," IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 1, pp. 206–218, 2015.

[10] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal modeling and simulation of many-core processors and workloads," Microelectronics Journal, vol. 44, no. 11, pp. 986–993, 2013.

[11] H. B. Jang, J. Choi, I. Yoon et al., "Exploiting application/system-dependent ambient temperature for accurate microarchitectural simulation," IEEE Transactions on Computers, vol. 62, no. 4, pp. 705–715, 2013.

[12] B. Shi, Y. Zhang, and A. Srivastava, "Dynamic thermal management under soft thermal constraints," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 11, pp. 2045–2054, 2013.

[13] Z. Liu, T. Xu, S. X.-D. Tan, and H. Wang, "Dynamic thermal management for multi-core microprocessors considering transient thermal effects," in Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC '13), pp. 473–478, Yokohama, Japan, January 2013.

[14] D. Das, P. P. Chakrabarti, and R. Kumar, "Thermal analysis of multiprocessor SoC applications by simulation and verification," ACM Transactions on Design Automation of Electronic Systems, vol. 15, no. 2, article 15, 2010.

[15] J. Lee and N. S. Kim, "Analyzing potential throughput improvement of power- and thermal-constrained multicore processors by exploiting DVFS and PCPG," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 225–235, 2012.

[16] R. Rao and S. Vrudhula, "Fast and accurate prediction of the steady-state throughput of multicore processors under thermal constraints," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 28, no. 10, pp. 1559–1572, 2009.

[17] C.-Y. Cher and E. Kursun, "Exploring the effects of on-chip thermal variation on high-performance multicore architectures," ACM Transactions on Architecture and Code Optimization, vol. 8, no. 1, article 2, 2011.

[18] B. Zhang and G. Cui, "Power estimation for alpha 21264 using performance events and impact of ambient temperature," International Journal of Control and Automation, vol. 7, no. 6, pp. 159–168, 2014.

[19] V. Hanumaiah, S. Vrudhula, and K. S. Chatha, "Maximizing performance of thermally constrained multi-core processors by dynamic voltage and frequency control," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD '09), pp. 310–313, San Jose, Calif, USA, November 2009.

[20] B. Wojciechowski, K. S. Berezowski, P. Patronik, and J. Biernat, "Fast and accurate thermal simulation and modelling of workloads of many-core processors," in Proceedings of 17th International Workshop on Thermal Investigations of ICs and Systems (THERMINIC '11), September 2011.

[21] J. Lee and N. S. Kim, "Optimizing throughput of power- and thermal-constrained multicore processors using DVFS and per-core power-gating," in Proceedings of the 46th ACM/IEEE Design Automation Conference (DAC '09), pp. 47–50, San Francisco, Calif, USA, July 2009.

[22] R. Rao, S. Vrudhula, and C. Chakrabarti, "Throughput of multi-core processors under thermal constraints," in Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED '07), pp. 201–206, Portland, Ore, USA, August 2007.

[23] M. Mohaqeqi, M. Kargahi, and A. Movaghar, "Analytical leakage-aware thermal modeling of a real-time system," IEEE Transactions on Computers, vol. 63, no. 6, pp. 1377–1391, 2014.

[24] R. E. Kessler, "The alpha 21264 microprocessor," IEEE Micro, vol. 19, no. 2, pp. 24–36, 1999.

[25] W. Liao, L. He, and K. M. Lepak, "Temperature and supply voltage aware performance and power modeling at microarchitecture level," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 1042–1053, 2005.

[26] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown, "MiBench: a free, commercially representative embedded benchmark suite," in Proceedings of the IEEE International Workshop of Workload Characterization (WWC '01), Washington, DC, USA, 2001.

[27] J. L. Henning, "SPEC CPU2000: measuring CPU performance in the new millennium," Computer, vol. 33, no. 7, pp. 28–35, 2000.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
12 Mathematical Problems in Engineering
Table 1 Means (120583) and standard deviations (120590) of hotspot temperature distribution for active cores
Number of active cores Frequency of 05 Frequency of 067 Frequency of 083 Frequency of 1120583 (∘C) 120590 120583 (∘C) 120590 120583 (∘C) 120590 120583 (∘C) 120590
6 5193 403 6125 540 7002 669 7935 8067 5920 425 6973 569 7964 705 9017 8508 6669 448 7846 600 8954 743 10131 895
Table 2 Means (120583) and standard deviations (120590) of hotspot temperature distribution for inactive cores
Number of active cores Frequency of 05 Frequency of 067 Frequency of 083 Frequency of 1120583 (∘C) 120590 120583 (∘C) 120590 120583 (∘C) 120590 120583 (∘C) 120590
6 4059 171 4708 229 5319 284 5968 3427 4778 195 5546 261 6269 323 7037 389
Dynamic power
FPRe
gIT
LB IL1
DTB DL1 L2
Dec
ode
Bran
chRA
TRU
ULS
QIA
LU1
IALU
2IA
LU3
IALU
4FP
Add
FPM
ulIn
tReg
Leakage power for active coreLeakage power for inactive core
02468
1012
Erro
r rat
e (
)
Figure 5 Error rate of the power estimation
impact of estimation errors on the analysis of temperatureand frequency the error terms Edyn Eleaa and Eleaina areintroduced into the linear models of dynamic power andleakage power as expressed in (21) (22) and (24)
73 Probabilistic Distribution of Temperature Based on theassumption that no DTM techniques are used to controlthe temperature of processors according to Theorem 4the probabilistic distribution of the hotspot temperaturesfor both active cores and inactive cores can be obtainedTable 1 presents the means and the standard deviations ofthe probabilistic distribution of the hotspot temperature foractive cores and the corresponding probabilistic densitycurves are as shown in Figure 6 It can be seen that thehotspot temperature of processors is not deterministic andhas significant variations for a certain number of active coresand a certain frequency The number of active cores and therunning frequency simultaneously determine the range inwhich the temperature lies and the probabilistic distributionof the hotspot temperature For the same running frequencymore active cores will yield higher temperature and viceversa For the same number of active cores higher frequencywill bring higher temperature and vice versa Accordingto the characteristics of normal distribution curve it can
be known that the shape of probabilistic density curvecorresponds to the variations of data distribution dependingon the standard deviation of random variables The curvewith a higher peak implies a smaller standard deviationthat is a lower variation of data distribution whereas thecurve with a lower peak implies a bigger standard deviationthat is a larger variation it can be seen from Figure 6 thatthe probabilistic density curves corresponding to variousfrequencies have different peaks This observation impliesthat the degree of temperature variation has close correlationwith working frequency and higher frequency will yieldhigher variation of temperature
Table 2 presents the means and standard deviations ofthe probabilistic distribution of the hotspot temperature forinactive cores and the corresponding probabilistic densitycurves are as shown in Figure 7 When the number of activecores is eight inactive core does not exist so there is nocase where the number of active cores is eight in Table 2and Figure 7 The effect of the frequency and the number ofactive cores at the hotspot temperature for inactive cores is thesame as that for active cores except that the mean value andvariation of hotspot temperature of inactive cores are lowerthan those of active cores under the same frequency and thenumber of active cores
74 Probabilistic Distribution of Frequencies If the power-gating and the DFS techniques are simultaneously usedto manage the temperature of a processor according toTheorem 5 the probabilistic distribution of working frequen-cies can be determined Figure 8 presents the probabilisticdistribution of frequencies when the number of active cores issix seven and eight respectively When the frequency is lessthan 1 it is implied that the hotspot temperature surpasses thethreshold and the DFS technique is triggered to reduce thefrequency of the processor So the probability for triggeringDFS can be obtained from the probabilistic distribution offrequencies Figure 9 presents the probability for triggeringDFS when the number of active cores is six seven and eightrespectively
Mathematical Problems in Engineering 13
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
0
002
004
006
008
01Pr
obab
ility
den
sity
40 60 80 100 120 14020Temperature (∘C)
(a)
40 60 80 100 120 14020Temperature (∘C)
0
002
004
006
008
01
Prob
abili
ty d
ensit
yFrequency is 05Frequency is 067
Frequency is 083Frequency is 1
(b)
0
002
004
006
008
01
Prob
abili
ty d
ensit
y
40 60 80 100 120 14020Temperature (∘C)
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
(c)
Figure 6 The probability density of the hotspot temperature for active cores (a) The number of active cores is six (b) The number of activecores is seven (c) The number of active cores is eight
If a core is powered off and made inactive using thepower-gating technique it only dissipates the leakage powerwhich is much less than that of an active core so the powerdissipated by the processor is reduced significantlyWhen thenumber of active cores is less than six that is more thantwo cores are powered off the decreased power makes it
enough for the rest of active cores of a processor to executeat the full speed and the DFS is not necessary to be triggeredTherefore when the number of active cores is less than six thefrequency of the processor is constantly 1 and the probabilityfor triggering DFS is constantly 0 This situation is not givenin Figures 8 and 9
14 Mathematical Problems in Engineering
Table 3 Comparisons of temperatures between active cores with and without DFS
Number of active cores Average hotspot temperature (∘C) Probability for exceedingtemperature threshold ()
With DFS Without DFS With DFS Without DFS6 7930 7935 0 0527 8885 9017 0 12378 9385 10131 0 5584
0
005
01
015
02
025
Prob
abili
ty d
ensit
y
40 60 80 100 120 14020Temperature (∘C)
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
(a)
40 60 80 100 120 14020Temperature (∘C)
0
005
01
015
02
025
Prob
abili
ty d
ensit
y
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
(b)
Figure 7The probability density of the hotspot temperature for inactive cores (a)The number of active cores is six (b)The number of activecores is seven
It can be seen that various numbers of active cores resultin different probabilistic distributions of frequencies anddifferent probabilities for triggering DFS When all cores arepowered on the probability that the processor runs at themaximum frequency is only 4416 and the probability fortriggering DFS is 5584This means that all cores will run atthe full speed only when the IPCs of tasks are less If the IPCsincrease to an extent the running frequency will be scaleddown to control the temperature under the threshold Asthe number of active cores decreases the probability that theprocessor runs at the maximum frequency increases and theprobability for triggering DFS decreases When the numberof active cores is less than six no matter what the IPCs arethe saved power by shutting off more than two cores makes itdeterministic for the active cores to run at the full speed sothe probability that the processor runs at the maximum fre-quency is 100 and the probability for triggering DFS is 0
75 Comparisons of Temperatures with and without DFS IftheDFS technique is not used then the processor always runs
at the full speed that is the running frequency is constantly 1So the average hotspot temperature of the active core withoutthe DFS can be obtained according to (52) If both the power-gating and DFS techniques are used simultaneously for thedynamic thermal management then the running frequencycan be scaled to control the temperature of the processorunder the thermal threshold According to (63) the averagehotspot temperature of the active core with the DFS canbe obtained In terms of the average hotspot temperatureand the probability that the hotspot temperature exceeds thethreshold Table 3 presents the comparative results betweenthe active cores with and without the DFS when the numberof active cores is 6 7 and 8
If the DFS technique is used by the processor the runningfrequency will be scaled down to reduce the temperatureonce the hotspot temperature reaches the threshold So thehotspot temperature will not exceed the threshold that isthe probability that the hotspot temperature exceeds thethreshold is 0 If the DFS technique is not used by the pro-cessor the running frequency is always the maximum and
Mathematical Problems in Engineering 15
Number of active cores
Frequency is 1 Frequency is 083Frequency is 067 Frequency is 05
0
20
40
60
80
100
Prob
abili
ty o
f fre
quen
cy (
)
6 7 8
Figure 8 Probabilistic distribution of frequencies
6 7 8Number of active cores
0
10
20
30
40
50
60
Prob
abili
ty fo
r trig
gerin
g D
FS (
)
Figure 9 Probability for triggering DFS
the hotspot temperature is possible to exceed the thresholdthat is the probability that the hotspot temperature exceedsthe threshold is larger than 0Therefore the average hotspottemperature of the processor with the DFS is lower than thatwithout the DFS
As the number of active cores decreases, the power saved by shutting off cores makes it more likely that the active cores can execute at full speed, and the cooling effect of DFS weakens until it disappears. Therefore, the average hotspot temperatures of processors with and without DFS move closer together as the number of active cores decreases, as shown in Table 3. Even without DFS, the probability that the hotspot temperature exceeds the threshold falls to 0 as more cores are powered off.
When the number of active cores is lower than 6, that is, more than 2 cores are powered off, the hotspot temperature does not exceed the threshold no matter what speed the processor runs at, so the active cores do not need to trigger DFS to manage the temperature of the processor and can run at the maximum frequency. In this case the probability that the hotspot temperature exceeds the threshold is 0, and the average hotspot temperatures of the processor with and without DFS are the same. Because DFS makes no difference when fewer than 6 cores are active, the comparisons for this case are not given in Table 3.
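The argument above can be illustrated with a minimal sketch. It assumes, as the paper does, that the hotspot temperature without DFS is normally distributed. The threshold, standard deviation, and mean temperatures below are hypothetical values chosen for illustration, not results from the paper, and the function names are not the authors'.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def exceed_probability(mean, std, threshold):
    """Probability that a normally distributed temperature exceeds the threshold."""
    return 1.0 - normal_cdf(threshold, mean, std)


# Hypothetical values (assumptions, not taken from the paper).
THRESHOLD_C = 100.0
STD_C = 6.0

# A lower mean stands in for fewer active cores: the exceedance
# probability without DFS shrinks toward 0 as the mean drops.
for mean_c in (101.0, 90.0, 79.0, 70.0):
    p = exceed_probability(mean_c, STD_C, THRESHOLD_C)
    print(f"mean {mean_c:5.1f} C -> P(T > threshold) = {100.0 * p:6.2f}%")
```

With DFS the corresponding probability is 0 by construction, because the frequency is lowered before the threshold is crossed; the sketch only covers the case without DFS.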
8. Conclusions
This paper presents a probabilistic method for analyzing the temperature and frequency of multicore processors that takes the variation of workloads into account. It is proved theoretically that (1) the hotspot temperatures of both active cores and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows a normal distribution, under the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
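Results (1) and (2) fit together through a standard property of the normal distribution: a linear function of a normal variable is again normal. The sketch below shows this for a single core with a single linear relation T = slope * IPC + intercept. The coefficient values and names are hypothetical and are not taken from the paper; the sampled values serve only as a sanity check on the closed form.

```python
import random
import statistics


def hotspot_distribution(slope, intercept, ipc_mean, ipc_std):
    """Mean and standard deviation of T = slope * IPC + intercept
    when IPC is normally distributed."""
    return slope * ipc_mean + intercept, abs(slope) * ipc_std


# Hypothetical coefficients and IPC statistics (assumptions).
SLOPE, INTERCEPT = 20.0, 60.0
IPC_MEAN, IPC_STD = 1.5, 0.4

t_mean, t_std = hotspot_distribution(SLOPE, INTERCEPT, IPC_MEAN, IPC_STD)

# Empirical check: sample IPCs, map them to temperatures, compare moments.
rng = random.Random(0)
samples = [SLOPE * rng.gauss(IPC_MEAN, IPC_STD) + INTERCEPT for _ in range(100_000)]
print("analytic mean/std:", t_mean, t_std)
print("sampled  mean/std:", statistics.fmean(samples), statistics.stdev(samples))
```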
The experimental results show that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlation between the IPC and the dynamic power differs from one functional unit to another; (2) the estimation error rates of leakage powers are similar for active cores and inactive cores, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60°C and 100°C, at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency together determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS.
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, the Beijing Natural Science Foundation (4142034), and the Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J Kong S W Chung and K Skadron ldquoRecent thermalmanagement techniques formicroprocessorsrdquoACMComputingSurveys vol 44 no 3 article 13 2012
[2] K Skadron M R Stan K Sankaranarayanan W Huang SVelusamy and D Tarjan ldquoTemperature-aware microarchitec-ture modeling and implementationrdquo ACM Transactions onArchitecture and Code Optimization vol 1 no 1 pp 94ndash1252004
[3] W Huang S Ghosh S Velusamy K Sankaranarayanan KSkadron and M R Stan ldquoHotSpot A compact thermalmodeling methodology for early-stage VLSI designrdquo IEEETransactions on Very Large Scale Integration (VLSI) Systems vol14 no 5 pp 501ndash513 2006
[4] D Li S X-D Tan EH Pacheco andMTirumala ldquoParameter-ized architecture-level dynamic thermal models for multicoremicroprocessorsrdquo ACM Transactions on Design Automation ofElectronic Systems vol 15 no 2 article 16 2010
[5] H Wang S X-D Tan D Li A Gupta and Y Yuan ldquoCom-posable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessorsrdquo ACMTransactions on Design Automation of Electronic Systems vol18 no 2 article no 28 2013
[6] V Hanumaiah and S Vrudhula ldquoTemperature-aware DVFSfor hard real-time applications on multicore processorsrdquo IEEETransactions on Computers vol 61 no 10 pp 1484ndash1494 2012
[7] V Hanumaiah S Vrudhula and K S Chatha ldquoPerformanceoptimal online DVFS and task migration techniques for ther-mally constrained multi-core processorsrdquo IEEE Transactions onComputer-Aided Design of Integrated Circuits and Systems vol30 no 11 pp 1677ndash1690 2011
[8] W S Lawrence and P R Kumar ldquoImproving the performanceof thermally constrained multi core processors using DTMtechniquesrdquo in Proceedings of the International Conference onInformation Communication and Embedded Systems (ICICESrsquo13) pp 1035ndash1040 IEEE Chennai India February 2013
[9] K Chen E Chang H Li and A Wu ldquoRC-based temperatureprediction scheme for proactive dynamic thermal managementin throttle-based 3D NoCsrdquo IEEE Transactions on Parallel andDistributed Systems vol 26 no 1 pp 206ndash218 2015
[10] B Wojciechowski K S Berezowski P Patronik and J BiernatldquoFast and accurate thermal modeling and simulation of many-core processors and workloadsrdquo Microelectronics Journal vol44 no 11 pp 986ndash993 2013
[11] H B Jang J Choi I Yoon et al ldquoExploiting applicationsystem-dependent ambient temperature for accurate microarchitec-tural simulationrdquo IEEE Transactions on Computers vol 62 no4 pp 705ndash715 2013
[12] B Shi Y Zhang and A Srivastava ldquoDynamic thermal manage-ment under soft thermal constraintsrdquo IEEETransactions onVeryLarge Scale Integration (VLSI) Systems vol 21 no 11 pp 2045ndash2054 2013
[13] Z Liu T Xu S X-D Tan and H Wang ldquoDynamic thermalmanagement for multi-core microprocessors considering tran-sient thermal effectsrdquo in Proceedings of the 18th Asia and SouthPacific Design Automation Conference (ASP-DAC rsquo13) pp 473ndash478 Yokohama Japan January 2013
[14] D Das P P Chakrabarti and R Kumar ldquoThermal analysisof multiprocessor SoC applications by simulation and verifi-cationrdquo ACM Transactions on Design Automation of ElectronicSystems vol 15 no 2 article 15 2010
[15] J Lee and N S Kim ldquoAnalyzing potential throughput improve-ment of power- and thermal-constrained multicore processorsby exploiting DVFS and PCPGrdquo IEEE Transactions on VeryLarge Scale Integration (VLSI) Systems vol 20 no 2 pp 225ndash235 2012
[16] R Rao and S Vrudhula ldquoFast and accurate prediction of thesteady-state throughput of multicore processors under thermalconstraintsrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 28 no 10 pp 1559ndash15722009
[17] C-Y Cher and E Kursun ldquoExploring the effects of on-chipthermal variation on high-performance multicore architec-turesrdquo ACM Transactions on Architecture and Code Optimiza-tion vol 8 no 1 article 2 2011
[18] B Zhang and G Cui ldquoPower estimation for alpha 21264using performance events and impact of ambient temperaturerdquoInternational Journal of Control and Automation vol 7 no 6pp 159ndash168 2014
[19] V Hanumaiah S Vrudhula and K S Chatha ldquoMaximizingperformance of thermally constrainedmulti-core processors bydynamic voltage and frequency controlrdquo in Proceedings of theIEEEACM International Conference on Computer-Aided Design(ICCAD rsquo09) pp 310ndash313 San Jose Calif USANovember 2009
[20] B Wojciechowski K S Berezowski P Patronik and J Bier-nat ldquoFast and accurate thermal simulation and modelling ofworkloads of many-core processorsrdquo in Proceedings of 17thInternational Workshop on Thermal Investigations of ICs andSystems (THERMINIC rsquo11) September 2011
[21] J Lee and N S Kim ldquoOptimizing throughput of power- andthermal-constrainedmulticore processors usingDVFS and per-core power-gatingrdquo inProceedings of the 46thACMIEEEDesignAutomation Conference (DAC rsquo09) pp 47ndash50 San FranciscoCalif USA July 2009
[22] R Rao S Vrudhula and C Chakrabarti ldquoThroughput of multi-core processors under thermal constraintsrdquo inProceedings of theInternational Symposium on Low Power Electronics and Design(ISLPED rsquo07) pp 201ndash206 Portland Ore USA August 2007
[23] M Mohaqeqi M Kargahi and A Movaghar ldquoAnalyticalleakage-aware thermal modeling of a real-time systemrdquo IEEETransactions on Computers vol 63 no 6 pp 1377ndash1391 2014
[24] R E Kessler ldquoThe alpha 21264 microprocessorrdquo IEEE Microvol 19 no 2 pp 24ndash36 1999
[25] W Liao L He and K M Lepak ldquoTemperature and supplyvoltage aware performance and power modeling at microarchi-tecture levelrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 24 no 7 pp 1042ndash10532005
[26] M R Guthaus J S Ringenberg D Ernst T M Austin TMudge and R B Brown ldquoMiBench a free commercially
Mathematical Problems in Engineering 17
representative embedded benchmark suiterdquo in Proceedings ofthe IEEE International Workshop of Workload Characterization(WWC rsquo01) Washington DC USA 2001
[27] J L Henning ldquoSPEC CPU2000 measuring CPU performancein the new millenniumrdquo Computer vol 33 no 7 pp 28ndash352000
Figure 6: The probability density of the hotspot temperature for active cores. (a) The number of active cores is six. (b) The number of active cores is seven. (c) The number of active cores is eight. [Each panel plots probability density (0 to 0.1) against temperature (20 to 140°C) for frequencies 0.5, 0.67, 0.83, and 1.]
If a core is powered off and made inactive by power gating, it dissipates only leakage power, which is much less than the power of an active core, so the power dissipated by the processor is reduced significantly. When the number of active cores is less than six, that is, more than two cores are powered off, the reduced power is enough for the remaining active cores to execute at full speed, and DFS does not need to be triggered. Therefore, when the number of active cores is less than six, the frequency of the processor is constantly 1 and the probability for triggering DFS is constantly 0. This case is not shown in Figures 8 and 9.
Table 3: Comparisons of temperatures between active cores with and without DFS.
Number of       Average hotspot temperature (°C)    Probability of exceeding temperature threshold (%)
active cores    With DFS      Without DFS           With DFS      Without DFS
6               79.30         79.35                 0             0.52
7               88.85         90.17                 0             12.37
8               93.85         101.31                0             55.84
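As a small worked example, the effect of DFS can be read off Table 3 by subtracting the two average temperatures in each row. The sketch below reads the table entries as two-decimal values (for example 79.30°C and 0.52%), which is an interpretation of the extracted digits. The dictionary layout and function name are illustrative only.

```python
# Table 3 entries, read as two-decimal values (an interpretation of the digits):
# active cores -> (avg temp with DFS, avg temp without DFS,
#                  probability of exceeding the threshold without DFS, in %)
TABLE_3 = {
    6: (79.30, 79.35, 0.52),
    7: (88.85, 90.17, 12.37),
    8: (93.85, 101.31, 55.84),
}


def temperature_reduction(cores):
    """Average hotspot temperature saved by DFS, in degrees Celsius."""
    with_dfs, without_dfs, _ = TABLE_3[cores]
    return round(without_dfs - with_dfs, 2)


for cores in sorted(TABLE_3):
    print(cores, "active cores: DFS lowers the average by",
          temperature_reduction(cores), "C; exceedance without DFS is",
          TABLE_3[cores][2], "%")
```

Under this reading, the gap between the two averages grows with the number of active cores, as does the exceedance probability without DFS.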
Figure 7: The probability density of the hotspot temperature for inactive cores. (a) The number of active cores is six. (b) The number of active cores is seven. [Each panel plots probability density (0 to 0.25) against temperature (20 to 140°C) for frequencies 0.5, 0.67, 0.83, and 1.]
It can be seen that different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities for triggering DFS. When all cores are powered on, the probability that the processor runs at the maximum frequency is only 44.16%, and the probability for triggering DFS is 55.84%. This means that all cores run at full speed only when the IPCs of the tasks are low. If the IPCs rise far enough, the running frequency is scaled down to keep the temperature under the threshold. As the number of active cores decreases, the probability that the processor runs at the maximum frequency increases and the probability for triggering DFS decreases. When the number of active cores is less than six, the power saved by shutting off more than two cores guarantees that the active cores run at full speed whatever the IPCs are, so the probability that the processor runs at the maximum frequency is 100% and the probability for triggering DFS is 0.
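The way a normal IPC distribution turns into a discrete frequency distribution can be sketched as follows. This is a simplified illustration, not the paper's derivation. It assumes one linear temperature model T = slope * IPC + intercept per frequency level, a positive slope, and a policy that picks the highest frequency whose temperature stays at or below the threshold; any remaining probability is assigned to the lowest frequency. All coefficient values are hypothetical.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def frequency_distribution(levels, ipc_mean, ipc_std, threshold):
    """Probability of each frequency being selected.

    levels: list of (frequency, slope, intercept), sorted from the highest
    frequency to the lowest, with slope > 0 (hypothetical linear models).
    """
    probs = {}
    covered = 0.0  # probability already assigned to higher frequencies
    for i, (freq, slope, intercept) in enumerate(levels):
        if i == len(levels) - 1:
            p_ok = 1.0  # the lowest frequency absorbs the remainder
        else:
            ipc_limit = (threshold - intercept) / slope  # largest safe IPC
            p_ok = normal_cdf(ipc_limit, ipc_mean, ipc_std)
        p_ok = max(p_ok, covered)  # guard against non-monotonic limits
        probs[freq] = p_ok - covered
        covered = p_ok
    return probs


# Hypothetical models for the four frequency levels (assumptions).
LEVELS = [(1.0, 30.0, 55.0), (0.83, 25.0, 50.0), (0.67, 20.0, 45.0), (0.5, 15.0, 40.0)]
probs = frequency_distribution(LEVELS, ipc_mean=1.5, ipc_std=0.5, threshold=100.0)
p_trigger_dfs = 1.0 - probs[1.0]
print(probs, p_trigger_dfs)
```

In this sketch the probability for triggering DFS is simply the complement of the probability of running at the maximum frequency, which matches the relation between 44.16% and 55.84% reported above.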
Mathematical Problems in Engineering 15
Number of active cores
Frequency is 1 Frequency is 083Frequency is 067 Frequency is 05
0
20
40
60
80
100
Prob
abili
ty o
f fre
quen
cy (
)
6 7 8
Figure 8 Probabilistic distribution of frequencies
6 7 8Number of active cores
0
10
20
30
40
50
60
Prob
abili
ty fo
r trig
gerin
g D
FS (
)
Figure 9 Probability for triggering DFS
the hotspot temperature is possible to exceed the thresholdthat is the probability that the hotspot temperature exceedsthe threshold is larger than 0Therefore the average hotspottemperature of the processor with the DFS is lower than thatwithout the DFS
As the number of active cores decreases the saved powerby shutting off cores makes it more possible for the activecores to execute at the full speed and the effect of the DFSon cooling down the processor weakens until it disappearsTherefore for the processors with and without the DFS the
average hotspot temperatures become closer as the number ofactive cores decreases as shown in Table 3 Even though theDFS is not used the probability that the hotspot temperatureexceeds the threshold will reduce until 0 when more coresare powered off
When the number of active cores is lower than 6 that ismore than 2 cores are powered off no matter what speed theprocessor runs at the hotspot temperature will not exceedthe threshold so all active cores are not necessary to triggerthe DFS for managing the temperature of the processor andcan run at the maximum frequency Therefore when thenumber of active cores is lower than 6 the probability thatthe hotspot temperature exceeds the threshold is 0 andthe average hotspot temperatures of the processor with andwithout theDFS are sameThere is no differencewith theDFSandwithout theDFSwhen the number of active cores is lowerthan 6 so the comparisons in this situation are not given inTable 3
8 Conclusions
In this paper a probabilistic analysis method of the tem-perature and frequency of multicore processors is presentedtaking the variation of workloads into account It is provedtheoretically in this paper that (1) the hotspot temperaturesof both active cores and inactive cores are the linear functionsof the IPC (2) the hotspot temperature follows the normalprobabilistic distribution based on the assumption that IPCsof all cores follow the same normal distribution and (3) therunning frequency follows a probabilistic distribution
From the experimental results it can be seen that (1) theestimation error rates of the dynamic powers for differentfunctional units have significant variations indicating thatthe linear correlations between the IPC and the dynamicpowers for various functional units are different (2) theestimation error rates of leakage powers for both activecores and inactive cores are similar showing similar linearcorrelations between temperature and leakage power acrossvarious functional units (3) a higher estimation accuracy ofleakage powers can be obtained at the temperature intervalbetween 60∘C and 100∘C at the cost of lower accuracy atother intervals (4) the hotspot temperature of the proces-sor is not deterministic and has significant variation fora certain number of active cores and a certain frequencyand the number of active cores and the running frequencydetermine simultaneously the probabilistic distribution ofhotspot temperature and (5) various numbers of active coresresult in different probabilistic distributions of frequenciesand different probabilities for triggering DFS
Competing Interests
The authors declare no competing interests
Authorsrsquo Contributions
Biying Zhang performed the theoretical analysis and wrotethe major part of this paper Gang Cui and Zhongchuan
16 Mathematical Problems in Engineering
Fu are the supervisor and the cosupervisor of Biying Zhangand his current PhD work Hongsong Chen performed theexperiments
Acknowledgments
This work is supported by the project under Grant no SGS-DWH00YXJS1500229 Beijing Natural Science Foundation(4142034) and Fundamental Research Funds for the CentralUniversities (no FRF-BR-15-056A)
References
[1] J Kong S W Chung and K Skadron ldquoRecent thermalmanagement techniques formicroprocessorsrdquoACMComputingSurveys vol 44 no 3 article 13 2012
[2] K Skadron M R Stan K Sankaranarayanan W Huang SVelusamy and D Tarjan ldquoTemperature-aware microarchitec-ture modeling and implementationrdquo ACM Transactions onArchitecture and Code Optimization vol 1 no 1 pp 94ndash1252004
[3] W Huang S Ghosh S Velusamy K Sankaranarayanan KSkadron and M R Stan ldquoHotSpot A compact thermalmodeling methodology for early-stage VLSI designrdquo IEEETransactions on Very Large Scale Integration (VLSI) Systems vol14 no 5 pp 501ndash513 2006
[4] D Li S X-D Tan EH Pacheco andMTirumala ldquoParameter-ized architecture-level dynamic thermal models for multicoremicroprocessorsrdquo ACM Transactions on Design Automation ofElectronic Systems vol 15 no 2 article 16 2010
[5] H Wang S X-D Tan D Li A Gupta and Y Yuan ldquoCom-posable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessorsrdquo ACMTransactions on Design Automation of Electronic Systems vol18 no 2 article no 28 2013
[6] V Hanumaiah and S Vrudhula ldquoTemperature-aware DVFSfor hard real-time applications on multicore processorsrdquo IEEETransactions on Computers vol 61 no 10 pp 1484ndash1494 2012
[7] V Hanumaiah S Vrudhula and K S Chatha ldquoPerformanceoptimal online DVFS and task migration techniques for ther-mally constrained multi-core processorsrdquo IEEE Transactions onComputer-Aided Design of Integrated Circuits and Systems vol30 no 11 pp 1677ndash1690 2011
[8] W S Lawrence and P R Kumar ldquoImproving the performanceof thermally constrained multi core processors using DTMtechniquesrdquo in Proceedings of the International Conference onInformation Communication and Embedded Systems (ICICESrsquo13) pp 1035ndash1040 IEEE Chennai India February 2013
[9] K Chen E Chang H Li and A Wu ldquoRC-based temperatureprediction scheme for proactive dynamic thermal managementin throttle-based 3D NoCsrdquo IEEE Transactions on Parallel andDistributed Systems vol 26 no 1 pp 206ndash218 2015
[10] B Wojciechowski K S Berezowski P Patronik and J BiernatldquoFast and accurate thermal modeling and simulation of many-core processors and workloadsrdquo Microelectronics Journal vol44 no 11 pp 986ndash993 2013
[11] H B Jang J Choi I Yoon et al ldquoExploiting applicationsystem-dependent ambient temperature for accurate microarchitec-tural simulationrdquo IEEE Transactions on Computers vol 62 no4 pp 705ndash715 2013
[12] B Shi Y Zhang and A Srivastava ldquoDynamic thermal manage-ment under soft thermal constraintsrdquo IEEETransactions onVeryLarge Scale Integration (VLSI) Systems vol 21 no 11 pp 2045ndash2054 2013
[13] Z Liu T Xu S X-D Tan and H Wang ldquoDynamic thermalmanagement for multi-core microprocessors considering tran-sient thermal effectsrdquo in Proceedings of the 18th Asia and SouthPacific Design Automation Conference (ASP-DAC rsquo13) pp 473ndash478 Yokohama Japan January 2013
[14] D Das P P Chakrabarti and R Kumar ldquoThermal analysisof multiprocessor SoC applications by simulation and verifi-cationrdquo ACM Transactions on Design Automation of ElectronicSystems vol 15 no 2 article 15 2010
[15] J Lee and N S Kim ldquoAnalyzing potential throughput improve-ment of power- and thermal-constrained multicore processorsby exploiting DVFS and PCPGrdquo IEEE Transactions on VeryLarge Scale Integration (VLSI) Systems vol 20 no 2 pp 225ndash235 2012
[16] R Rao and S Vrudhula ldquoFast and accurate prediction of thesteady-state throughput of multicore processors under thermalconstraintsrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 28 no 10 pp 1559ndash15722009
[17] C-Y Cher and E Kursun ldquoExploring the effects of on-chipthermal variation on high-performance multicore architec-turesrdquo ACM Transactions on Architecture and Code Optimiza-tion vol 8 no 1 article 2 2011
[18] B Zhang and G Cui ldquoPower estimation for alpha 21264using performance events and impact of ambient temperaturerdquoInternational Journal of Control and Automation vol 7 no 6pp 159ndash168 2014
[19] V Hanumaiah S Vrudhula and K S Chatha ldquoMaximizingperformance of thermally constrainedmulti-core processors bydynamic voltage and frequency controlrdquo in Proceedings of theIEEEACM International Conference on Computer-Aided Design(ICCAD rsquo09) pp 310ndash313 San Jose Calif USANovember 2009
[20] B Wojciechowski K S Berezowski P Patronik and J Bier-nat ldquoFast and accurate thermal simulation and modelling ofworkloads of many-core processorsrdquo in Proceedings of 17thInternational Workshop on Thermal Investigations of ICs andSystems (THERMINIC rsquo11) September 2011
[21] J Lee and N S Kim ldquoOptimizing throughput of power- andthermal-constrainedmulticore processors usingDVFS and per-core power-gatingrdquo inProceedings of the 46thACMIEEEDesignAutomation Conference (DAC rsquo09) pp 47ndash50 San FranciscoCalif USA July 2009
[22] R Rao S Vrudhula and C Chakrabarti ldquoThroughput of multi-core processors under thermal constraintsrdquo inProceedings of theInternational Symposium on Low Power Electronics and Design(ISLPED rsquo07) pp 201ndash206 Portland Ore USA August 2007
[23] M Mohaqeqi M Kargahi and A Movaghar ldquoAnalyticalleakage-aware thermal modeling of a real-time systemrdquo IEEETransactions on Computers vol 63 no 6 pp 1377ndash1391 2014
[24] R E Kessler ldquoThe alpha 21264 microprocessorrdquo IEEE Microvol 19 no 2 pp 24ndash36 1999
[25] W Liao L He and K M Lepak ldquoTemperature and supplyvoltage aware performance and power modeling at microarchi-tecture levelrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 24 no 7 pp 1042ndash10532005
[26] M R Guthaus J S Ringenberg D Ernst T M Austin TMudge and R B Brown ldquoMiBench a free commercially
Mathematical Problems in Engineering 17
representative embedded benchmark suiterdquo in Proceedings ofthe IEEE International Workshop of Workload Characterization(WWC rsquo01) Washington DC USA 2001
[27] J L Henning ldquoSPEC CPU2000 measuring CPU performancein the new millenniumrdquo Computer vol 33 no 7 pp 28ndash352000
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
14 Mathematical Problems in Engineering
Table 3 Comparisons of temperatures between active cores with and without DFS
Number of active cores Average hotspot temperature (∘C) Probability for exceedingtemperature threshold ()
With DFS Without DFS With DFS Without DFS6 7930 7935 0 0527 8885 9017 0 12378 9385 10131 0 5584
0
005
01
015
02
025
Prob
abili
ty d
ensit
y
40 60 80 100 120 14020Temperature (∘C)
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
(a)
40 60 80 100 120 14020Temperature (∘C)
0
005
01
015
02
025
Prob
abili
ty d
ensit
y
Frequency is 05Frequency is 067
Frequency is 083Frequency is 1
(b)
Figure 7The probability density of the hotspot temperature for inactive cores (a)The number of active cores is six (b)The number of activecores is seven
It can be seen that various numbers of active cores resultin different probabilistic distributions of frequencies anddifferent probabilities for triggering DFS When all cores arepowered on the probability that the processor runs at themaximum frequency is only 4416 and the probability fortriggering DFS is 5584This means that all cores will run atthe full speed only when the IPCs of tasks are less If the IPCsincrease to an extent the running frequency will be scaleddown to control the temperature under the threshold Asthe number of active cores decreases the probability that theprocessor runs at the maximum frequency increases and theprobability for triggering DFS decreases When the numberof active cores is less than six no matter what the IPCs arethe saved power by shutting off more than two cores makes itdeterministic for the active cores to run at the full speed sothe probability that the processor runs at the maximum fre-quency is 100 and the probability for triggering DFS is 0
75 Comparisons of Temperatures with and without DFS IftheDFS technique is not used then the processor always runs
at the full speed that is the running frequency is constantly 1So the average hotspot temperature of the active core withoutthe DFS can be obtained according to (52) If both the power-gating and DFS techniques are used simultaneously for thedynamic thermal management then the running frequencycan be scaled to control the temperature of the processorunder the thermal threshold According to (63) the averagehotspot temperature of the active core with the DFS canbe obtained In terms of the average hotspot temperatureand the probability that the hotspot temperature exceeds thethreshold Table 3 presents the comparative results betweenthe active cores with and without the DFS when the numberof active cores is 6 7 and 8
If the DFS technique is used by the processor the runningfrequency will be scaled down to reduce the temperatureonce the hotspot temperature reaches the threshold So thehotspot temperature will not exceed the threshold that isthe probability that the hotspot temperature exceeds thethreshold is 0 If the DFS technique is not used by the pro-cessor the running frequency is always the maximum and
Mathematical Problems in Engineering 15
Number of active cores
Frequency is 1 Frequency is 083Frequency is 067 Frequency is 05
0
20
40
60
80
100
Prob
abili
ty o
f fre
quen
cy (
)
6 7 8
Figure 8 Probabilistic distribution of frequencies
6 7 8Number of active cores
0
10
20
30
40
50
60
Prob
abili
ty fo
r trig
gerin
g D
FS (
)
Figure 9 Probability for triggering DFS
the hotspot temperature is possible to exceed the thresholdthat is the probability that the hotspot temperature exceedsthe threshold is larger than 0Therefore the average hotspottemperature of the processor with the DFS is lower than thatwithout the DFS
As the number of active cores decreases the saved powerby shutting off cores makes it more possible for the activecores to execute at the full speed and the effect of the DFSon cooling down the processor weakens until it disappearsTherefore for the processors with and without the DFS the
average hotspot temperatures become closer as the number ofactive cores decreases as shown in Table 3 Even though theDFS is not used the probability that the hotspot temperatureexceeds the threshold will reduce until 0 when more coresare powered off
When the number of active cores is lower than 6 that ismore than 2 cores are powered off no matter what speed theprocessor runs at the hotspot temperature will not exceedthe threshold so all active cores are not necessary to triggerthe DFS for managing the temperature of the processor andcan run at the maximum frequency Therefore when thenumber of active cores is lower than 6 the probability thatthe hotspot temperature exceeds the threshold is 0 andthe average hotspot temperatures of the processor with andwithout theDFS are sameThere is no differencewith theDFSandwithout theDFSwhen the number of active cores is lowerthan 6 so the comparisons in this situation are not given inTable 3
8 Conclusions
This paper presents a probabilistic method for analyzing the temperature and frequency of multicore processors that takes workload variation into account. It is proved theoretically that (1) the hotspot temperatures of both active and inactive cores are linear functions of the IPC; (2) the hotspot temperature follows a normal distribution, under the assumption that the IPCs of all cores follow the same normal distribution; and (3) the running frequency follows a probabilistic distribution.
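The step from results (1) to (2) rests on a standard property of the normal distribution, sketched here. The symbols $a_i$, $b$, $\mu$, and $\sigma$ are illustrative and are not the paper's notation, and independence of the per-core IPCs is an added assumption of this sketch: a linear function of independent, identically distributed normal IPCs is itself normal.

```latex
T = b + \sum_{i=1}^{n} a_i \,\mathrm{IPC}_i ,
\qquad \mathrm{IPC}_i \sim \mathcal{N}(\mu, \sigma^2) \ \text{independent}
\quad \Longrightarrow \quad
T \sim \mathcal{N}\!\left( b + \mu \sum_{i=1}^{n} a_i ,\; \sigma^2 \sum_{i=1}^{n} a_i^2 \right)
```

If the IPCs were correlated, $T$ would still be normal as long as they are jointly normal, but the variance would also contain the covariance terms.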
The experimental results show that (1) the estimation error rates of the dynamic powers vary significantly across functional units, indicating that the linear correlation between the IPC and the dynamic power differs from one functional unit to another; (2) the estimation error rates of the leakage powers are similar for active and inactive cores, showing similar linear correlations between temperature and leakage power across functional units; (3) a higher estimation accuracy of leakage powers can be obtained in the temperature interval between 60°C and 100°C, at the cost of lower accuracy in other intervals; (4) the hotspot temperature of the processor is not deterministic and varies significantly for a given number of active cores and a given frequency, and the number of active cores and the running frequency jointly determine the probabilistic distribution of the hotspot temperature; and (5) different numbers of active cores result in different probabilistic distributions of frequencies and different probabilities of triggering DFS.
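One plausible way to obtain a distribution over discrete frequencies, as in result (5), is sketched below in Python. It is an illustration under stated assumptions and not necessarily the paper's derivation. It assumes that the processor settles at the highest frequency whose steady-state hotspot temperature stays at or below the threshold, that each frequency therefore has a maximum admissible IPC, and that the lowest frequency is always thermally safe. The function name `frequency_distribution`, the IPC distribution, and the IPC limits are hypothetical.

```python
import math
from statistics import NormalDist


def frequency_distribution(ipc: NormalDist, ipc_limits: dict) -> dict:
    """Map each frequency to the probability that it is the highest safe one.

    ipc_limits[f] is the largest IPC for which the hotspot temperature at
    frequency f stays at or below the threshold (hypothetical inputs).
    """
    freqs = sorted(ipc_limits, reverse=True)  # highest frequency first
    probs = {}
    covered = 0.0  # probability mass already assigned to higher frequencies
    for f in freqs[:-1]:
        feasible = ipc.cdf(ipc_limits[f])  # P(frequency f is thermally safe)
        probs[f] = max(feasible - covered, 0.0)
        covered = max(covered, feasible)
    probs[freqs[-1]] = 1.0 - covered  # lowest frequency absorbs the remainder
    return probs


# Hypothetical IPC distribution and per-frequency IPC limits (normalized frequencies).
ipc = NormalDist(mu=1.0, sigma=0.2)
ipc_limits = {1.0: 0.9, 0.83: 1.1, 0.67: 1.3, 0.5: math.inf}

distribution = frequency_distribution(ipc, ipc_limits)
dfs_trigger_probability = 1.0 - distribution[1.0]  # any frequency below the maximum

for f, p in distribution.items():
    print(f"frequency {f}: {p:.4f}")
print(f"P(DFS triggered) = {dfs_trigger_probability:.4f}")
```

In this sketch the probability of triggering DFS is simply the probability that the maximum frequency is not thermally safe, so lowering the mean hotspot temperature (for example, by powering off cores) raises the IPC limits and shifts probability mass toward the maximum frequency.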
Competing Interests
The authors declare no competing interests.
Authors' Contributions
Biying Zhang performed the theoretical analysis and wrote the major part of this paper. Gang Cui and Zhongchuan Fu are the supervisor and the cosupervisor of Biying Zhang and his current PhD work. Hongsong Chen performed the experiments.
Acknowledgments
This work is supported by the project under Grant no. SGS-DWH00YXJS1500229, the Beijing Natural Science Foundation (4142034), and the Fundamental Research Funds for the Central Universities (no. FRF-BR-15-056A).
References
[1] J Kong S W Chung and K Skadron ldquoRecent thermalmanagement techniques formicroprocessorsrdquoACMComputingSurveys vol 44 no 3 article 13 2012
[2] K Skadron M R Stan K Sankaranarayanan W Huang SVelusamy and D Tarjan ldquoTemperature-aware microarchitec-ture modeling and implementationrdquo ACM Transactions onArchitecture and Code Optimization vol 1 no 1 pp 94ndash1252004
[3] W Huang S Ghosh S Velusamy K Sankaranarayanan KSkadron and M R Stan ldquoHotSpot A compact thermalmodeling methodology for early-stage VLSI designrdquo IEEETransactions on Very Large Scale Integration (VLSI) Systems vol14 no 5 pp 501ndash513 2006
[4] D Li S X-D Tan EH Pacheco andMTirumala ldquoParameter-ized architecture-level dynamic thermal models for multicoremicroprocessorsrdquo ACM Transactions on Design Automation ofElectronic Systems vol 15 no 2 article 16 2010
[5] H Wang S X-D Tan D Li A Gupta and Y Yuan ldquoCom-posable thermal modeling and simulation for architecture-level thermal designs of multicore microprocessorsrdquo ACMTransactions on Design Automation of Electronic Systems vol18 no 2 article no 28 2013
[6] V Hanumaiah and S Vrudhula ldquoTemperature-aware DVFSfor hard real-time applications on multicore processorsrdquo IEEETransactions on Computers vol 61 no 10 pp 1484ndash1494 2012
[7] V Hanumaiah S Vrudhula and K S Chatha ldquoPerformanceoptimal online DVFS and task migration techniques for ther-mally constrained multi-core processorsrdquo IEEE Transactions onComputer-Aided Design of Integrated Circuits and Systems vol30 no 11 pp 1677ndash1690 2011
[8] W S Lawrence and P R Kumar ldquoImproving the performanceof thermally constrained multi core processors using DTMtechniquesrdquo in Proceedings of the International Conference onInformation Communication and Embedded Systems (ICICESrsquo13) pp 1035ndash1040 IEEE Chennai India February 2013
[9] K Chen E Chang H Li and A Wu ldquoRC-based temperatureprediction scheme for proactive dynamic thermal managementin throttle-based 3D NoCsrdquo IEEE Transactions on Parallel andDistributed Systems vol 26 no 1 pp 206ndash218 2015
[10] B Wojciechowski K S Berezowski P Patronik and J BiernatldquoFast and accurate thermal modeling and simulation of many-core processors and workloadsrdquo Microelectronics Journal vol44 no 11 pp 986ndash993 2013
[11] H B Jang J Choi I Yoon et al ldquoExploiting applicationsystem-dependent ambient temperature for accurate microarchitec-tural simulationrdquo IEEE Transactions on Computers vol 62 no4 pp 705ndash715 2013
[12] B Shi Y Zhang and A Srivastava ldquoDynamic thermal manage-ment under soft thermal constraintsrdquo IEEETransactions onVeryLarge Scale Integration (VLSI) Systems vol 21 no 11 pp 2045ndash2054 2013
[13] Z Liu T Xu S X-D Tan and H Wang ldquoDynamic thermalmanagement for multi-core microprocessors considering tran-sient thermal effectsrdquo in Proceedings of the 18th Asia and SouthPacific Design Automation Conference (ASP-DAC rsquo13) pp 473ndash478 Yokohama Japan January 2013
[14] D Das P P Chakrabarti and R Kumar ldquoThermal analysisof multiprocessor SoC applications by simulation and verifi-cationrdquo ACM Transactions on Design Automation of ElectronicSystems vol 15 no 2 article 15 2010
[15] J Lee and N S Kim ldquoAnalyzing potential throughput improve-ment of power- and thermal-constrained multicore processorsby exploiting DVFS and PCPGrdquo IEEE Transactions on VeryLarge Scale Integration (VLSI) Systems vol 20 no 2 pp 225ndash235 2012
[16] R Rao and S Vrudhula ldquoFast and accurate prediction of thesteady-state throughput of multicore processors under thermalconstraintsrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 28 no 10 pp 1559ndash15722009
[17] C-Y Cher and E Kursun ldquoExploring the effects of on-chipthermal variation on high-performance multicore architec-turesrdquo ACM Transactions on Architecture and Code Optimiza-tion vol 8 no 1 article 2 2011
[18] B Zhang and G Cui ldquoPower estimation for alpha 21264using performance events and impact of ambient temperaturerdquoInternational Journal of Control and Automation vol 7 no 6pp 159ndash168 2014
[19] V Hanumaiah S Vrudhula and K S Chatha ldquoMaximizingperformance of thermally constrainedmulti-core processors bydynamic voltage and frequency controlrdquo in Proceedings of theIEEEACM International Conference on Computer-Aided Design(ICCAD rsquo09) pp 310ndash313 San Jose Calif USANovember 2009
[20] B Wojciechowski K S Berezowski P Patronik and J Bier-nat ldquoFast and accurate thermal simulation and modelling ofworkloads of many-core processorsrdquo in Proceedings of 17thInternational Workshop on Thermal Investigations of ICs andSystems (THERMINIC rsquo11) September 2011
[21] J Lee and N S Kim ldquoOptimizing throughput of power- andthermal-constrainedmulticore processors usingDVFS and per-core power-gatingrdquo inProceedings of the 46thACMIEEEDesignAutomation Conference (DAC rsquo09) pp 47ndash50 San FranciscoCalif USA July 2009
[22] R Rao S Vrudhula and C Chakrabarti ldquoThroughput of multi-core processors under thermal constraintsrdquo inProceedings of theInternational Symposium on Low Power Electronics and Design(ISLPED rsquo07) pp 201ndash206 Portland Ore USA August 2007
[23] M Mohaqeqi M Kargahi and A Movaghar ldquoAnalyticalleakage-aware thermal modeling of a real-time systemrdquo IEEETransactions on Computers vol 63 no 6 pp 1377ndash1391 2014
[24] R E Kessler ldquoThe alpha 21264 microprocessorrdquo IEEE Microvol 19 no 2 pp 24ndash36 1999
[25] W Liao L He and K M Lepak ldquoTemperature and supplyvoltage aware performance and power modeling at microarchi-tecture levelrdquo IEEE Transactions on Computer-Aided Design ofIntegrated Circuits and Systems vol 24 no 7 pp 1042ndash10532005
[26] M R Guthaus J S Ringenberg D Ernst T M Austin TMudge and R B Brown ldquoMiBench a free commercially
Mathematical Problems in Engineering 17
representative embedded benchmark suiterdquo in Proceedings ofthe IEEE International Workshop of Workload Characterization(WWC rsquo01) Washington DC USA 2001
[27] J L Henning ldquoSPEC CPU2000 measuring CPU performancein the new millenniumrdquo Computer vol 33 no 7 pp 28ndash352000
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
top related