TRANSCRIPT
How Intel Power Management Works

ABSTRACT: Intel® Xeon® processor 5500 series-based servers offer the capability to let the operator set power consumption directly and trade off the power consumption level against performance. This capability is valuable from two perspectives. First, power capping brings predictable power consumption within the specified capping range. Second, servers implementing power capping offer actual power readouts as a bonus: their power supplies are PMBus*-enabled, and their historical power consumption can be retrieved through standard APIs. With actual historical power data, it is possible to optimize the loading of power-limited racks, whereas before the most accurate estimate of power consumption had to be derived from derated nameplate data. The nameplate estimate is a static measure that requires allowing a considerable safety margin, and this conservative approach to power sizing leads to over-provisioning of power. That was acceptable when energy costs were a second-order consideration; it no longer is today. Integration with the Intel® Data Center Manager SDK allows dialing the power consumed by groups of over a thousand servers, providing a power control authority of tens of thousands of watts in data centers.

How does power capping work?

The foundation for power management resides in CPU voltage and frequency scaling implemented by the Intel® Xeon® processor 5500 series architecture. More likely than not, the CPUs are the most power-hungry components in a server. If we can regulate the power consumed by the CPUs, we can have an appreciable effect on the power consumed by the server as a whole. Multiply this control over the thousands of servers in a data center, and through this mechanism we can alter the power consumed in that data center in significant ways.

Nesting Technologies for Scaling Power Management

Figure 1 below depicts a series of abstractions for a large deployment of servers in a data center, and cloud storage appliances in particular. Chipsets bind together the CPUs and the memory in a server. A server carries direct-attached storage (DASD). Servers are organized in racks, and racks are organized in rows. The aggregation of rows encompasses all the servers in a data center. Today our main lever for power control is throttling the power consumed by the CPUs up and down. In the near future we can expect to see memory power control added to the mix.
The basic mechanism of CPU voltage and frequency scaling allows moving the power consumption of a CPU by a few tens of watts up or down. Aggregating this capability over the tens of thousands of CPUs in a data center expands the range of attainable power control to tens of kilowatts or even hundreds.

There is also a potential synergistic effect captured by the PUE (power usage effectiveness) factor as defined by The Green Grid* industry group. If servers use less power, the power allocated to the cooling equipment can be ratcheted down as well.

Figure 1 Power control hierarchy.

Power control for groups of servers is attained by composing power control for each server. Likewise, power control of each server is attained by composing CPU power control, as illustrated in Figure 1. Conceptually, power control for the thousands of servers in a data center is implemented through a coordinated set of nested mechanisms.

Managing CPU Power Consumption

The lowest level is implemented through frequency and voltage scaling: laws of physics dictate that for a given architecture, power consumption is proportional to the CPU's frequency and to the square of the voltage used to power the CPU.
There are mechanisms built into the CPU architecture that allow a certain number of discrete combinations of voltage and frequency. In ACPI standard nomenclature, these discrete combinations are called P-states. The highest performing state is nominally identified as P0, and the lower power consumption states are identified as P1, P2, and so on. An Intel® Xeon® processor 5500 series CPU supports over ten P-states, the actual number depending on the processor model; please consult the CPU specifications for the actual values.

For the sake of an example, a CPU in P0 may have been assigned a voltage of 1.4 volts and a frequency of 3.6 GHz, at which point it draws about 100 watts. As the CPU transitions to lower power states, it may reach a state P4 using 1.2 volts, running at 2.8 GHz, and consuming about 70 watts. Table 1 illustrates this example.

Table 1 P-State Table Example.
Power State    CPU Voltage (V)    CPU Frequency (GHz)    CPU Power Draw (W)
P0             1.40               3.6                    103
P1             1.35               3.4                     94
P2             1.30               3.2                     85
P3             1.25               3.0                     76
P4             1.20               2.8                     68
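The proportionality described above, with power roughly tracking frequency times the square of the voltage, can be checked against the example values in Table 1. The short Python sketch below is illustrative only; the scaling constant is fitted to the P0 row of this hypothetical table rather than taken from any Intel specification.

```python
# Illustrative only: dynamic CPU power scales roughly as P ~ k * f * V^2.
# The constant k is fitted to the hypothetical P0 entry of Table 1,
# not taken from any published Intel specification.

P_STATES = {            # state: (voltage in volts, frequency in GHz)
    "P0": (1.40, 3.6),
    "P1": (1.35, 3.4),
    "P2": (1.30, 3.2),
    "P3": (1.25, 3.0),
    "P4": (1.20, 2.8),
}

def estimated_power(voltage, freq_ghz, k):
    """Dynamic-power estimate in watts from the f * V^2 rule of thumb."""
    return k * freq_ghz * voltage ** 2

# Fit k so that P0 reproduces the 103 W figure from Table 1.
k = 103 / (3.6 * 1.40 ** 2)

for state, (v, f) in P_STATES.items():
    print(f"{state}: ~{estimated_power(v, f, k):5.1f} W")
```

Running this prints roughly 103, 90, 79, 68, and 59 W for P0 through P4. The Table 1 figures sit a few watts above the naive estimate at the lower states, which is expected: the f·V² rule captures only dynamic switching power and ignores leakage and other fixed consumption.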
The P-states by themselves can't control the power consumed by a server, and the CPU itself has no mechanism to measure the power it consumes. This measurement and control capability is implemented by firmware running in the Intel® Xeon® processor 5500 series chipset. This firmware implements the Intel® Intelligent Power Node Manager technology.

Managing Server Power Consumption

If what we want is to measure the power consumed by a server, looking only at CPU consumption does not provide the whole picture. For this purpose, the power supplies in Intel® Intelligent Power Node Manager-enabled servers are instrumented to provide actual power readings for the whole server through the PMBus* standard. This process is shown in Figure 2.

With these components in place, it is now possible to establish a classic control feedback loop where we compare a target power against the actual power indicated by the power supplies.
The Intel® Intelligent Power Node Manager code manipulates the P-states up or down until the desired target power is reached. If the desired power lies between two P-states, the Intel® Intelligent Power Node Manager code rapidly switches between the two states until the average power consumption meets the set power. This is an implementation of another classic control scheme, affectionately called bang-bang control for obvious reasons.

Figure 2 Power Control Loop in an Intel® Xeon® Processor 5500 Series Server.

Figure 3 provides an expanded detail of the control loop. Here we can see that the Intel® Intelligent Power Node Manager firmware implements the difference engine. Intel® Intelligent Power Node Manager directs the operating system or the hypervisor to change to a target P-state.

Figure 3 Expanded view of the Intel® Xeon® processor 5500 series power control loop.

Note that the target P-state is set through an in-band mechanism, that is, through the operating system. Intel® Intelligent Power Node Manager does not set P-states directly; it sends requests to the operating system or hypervisor through an API. This is necessary to coordinate power policies with other power policies that the operating system or hypervisor might be carrying out.
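To make the feedback loop above concrete, the sketch below implements the same idea in miniature: compare the power reading against the target and step the P-state up or down, dithering between two neighboring states when the target falls between them. Every name in it (read_power_watts, request_pstate, the state count) is a hypothetical placeholder; the real logic lives in the Node Manager firmware, not in host software.

```python
import time

# Hypothetical P-state numbering: index 0 (P0) is the highest-performance,
# highest-power state; higher indices consume progressively less power.
NUM_PSTATES = 5

def read_power_watts():
    """Placeholder for a PMBus power readout of the whole server."""
    raise NotImplementedError("provided by platform instrumentation")

def request_pstate(index):
    """Placeholder for an OS/hypervisor request to move to a given P-state."""
    raise NotImplementedError("provided by the OS power-management interface")

def power_cap_loop(target_watts, interval_s=0.01):
    """Toy bang-bang controller: step toward the target every interval.

    When the target lies between two adjacent P-states, the loop keeps
    toggling between them, so the average power settles near the target.
    """
    pstate = 0                      # start at P0, full performance
    while True:
        actual = read_power_watts()
        if actual > target_watts and pstate < NUM_PSTATES - 1:
            pstate += 1             # over budget: drop to a lower-power state
        elif actual < target_watts and pstate > 0:
            pstate -= 1             # headroom available: restore performance
        request_pstate(pstate)
        time.sleep(interval_s)
```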
Changes in processor P-state induce changes in the level of power consumption registered by the PMBus-enabled power supplies.

Managing Power Consumption in Server Groups

From a data center perspective, the ability to regulate the power consumption of just a single server has a small impact and is not intrinsically useful. Harnessing the "power of the masses" represents a key capability. We need the means to control servers as a group, and just as we were able to obtain power supply readouts for one server, we need to monitor the power for the group of servers to allow meeting a global power target for that group of servers. This function is provided by the Intel® Data Center Manager software development kit and shown in Figure 4.

Figure 4 Power control loop in a group managed by Intel® Data Center Manager.

Note that Intel® Data Center Manager implements a feedback control mechanism very similar to the mechanism that regulates power consumption for a single server, but at a much larger scale. Instead of watching one or two power supplies, Intel® Data Center Manager oversees the power consumption of multiple servers or "nodes" whose number can range up to the thousands. At this level Intel® Data Center Manager is in charge of implementing the difference engine.

Figure 5 depicts an expanded view of the Intel® Data Center Manager control loop as well as the relationship with the NM control loop underneath. No specific agents need to run in each node. Intel® Data Center Manager communicates with the board management controller (BMC) in each node for setting power targets and for doing readouts of the actual power consumed. Intel® Intelligent Power Node Manager firmware takes care of ensuring that the individual server meets the assigned power consumption target.
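To illustrate the division of labor just described, the sketch below shows one simple way a group-level manager could apportion a group power target across nodes and read back actual consumption through each node's management controller. The function names and the proportional-split policy are assumptions made for illustration; Intel® Data Center Manager's actual allocation algorithms and APIs are not shown here.

```python
# Illustrative sketch of group-level power capping. The two bmc_* calls stand
# in for whatever transport a real implementation uses to reach each node's
# board management controller; the proportional-split policy is an assumption.

def bmc_read_power(node):
    """Placeholder: actual power consumed by one node, in watts."""
    raise NotImplementedError

def bmc_set_power_target(node, watts):
    """Placeholder: hand a per-node cap to the node's management controller."""
    raise NotImplementedError

def distribute_group_target(nodes, group_target_watts):
    """Split a group power target across nodes, proportional to current draw.

    Node Manager firmware on each node is then responsible for holding its
    own cap; the group-level manager only periodically re-balances.
    """
    readings = {node: bmc_read_power(node) for node in nodes}
    total = sum(readings.values()) or 1.0   # avoid division by zero
    for node, watts in readings.items():
        bmc_set_power_target(node, group_target_watts * watts / total)
    return readings
```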
Intel® Data Center Manager was purposely architected as an SDK, as a building block for industry players to build more sophisticated and valuable capabilities for the benefit of data center operators. One possible application is shown in Figure 6, where Intel® Data Center Manager has been integrated into a cloud storage appliance application. Some Intel® Intelligent Power Node Manager-enabled servers come with inlet temperature sensors. This allows the application to monitor the inlet temperature of a group of servers, and if the temperature rises above a certain threshold, it can take a number of measures, from throttling back power consumption to reducing the thermal stresses. At this level the application code is in charge of the difference engine.

Figure 5 Intel® Data Center Manager power control loop, detailed view.

An application interfacing with Intel® Data Center Manager no longer "sees" individual server nodes; the application code's power policy engine designates a power consumption target to Intel® Data Center Manager through the Intel® Data Center Manager API. Intel® Data Center Manager in turn breaks down the power target into power targets for the individual nodes under its command.
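A minimal sketch of the inlet-temperature policy described above might look like the following. The sensor and group-capping calls are hypothetical stand-ins for a management-layer API, and the 27 °C threshold and 20 percent trim are arbitrary illustration values, not Intel recommendations.

```python
# Toy inlet-temperature policy for a group of servers. The sensor and
# group-capping calls are hypothetical stand-ins for a DCM-style API, and
# the threshold and trim factor are arbitrary illustration values.

INLET_LIMIT_C = 27.0      # illustrative threshold, not an Intel recommendation
THROTTLE_FACTOR = 0.8     # trim the group cap by 20% while any inlet runs hot

def read_inlet_temps(group):
    """Placeholder: inlet temperature in degrees C for each server."""
    raise NotImplementedError

def set_group_power_target(group, watts):
    """Placeholder: hand a group-level power cap to the group manager."""
    raise NotImplementedError

def thermal_policy(group, nominal_cap_watts):
    """Throttle the group power cap whenever any inlet exceeds the limit."""
    hottest = max(read_inlet_temps(group).values())
    if hottest > INLET_LIMIT_C:
        set_group_power_target(group, nominal_cap_watts * THROTTLE_FACTOR)
    else:
        set_group_power_target(group, nominal_cap_watts)
```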
Cloud Storage Application Power Management

The application code also needs to mind the power consumed by the storage subsystem. Less sophisticated implementations may have a monitor-only capability; the application code reads out the power consumed by the storage subsystem by querying the intelligent PDU that feeds it and then sets the server power in a way that the total consumption for the appliance matches the overall set power.

Figure 6 Storage appliance control loop.

Figure 7 provides a more detailed view of this process. In this figure we see the appliance control algorithm implemented by the application's power policy engine. The constituent components are loosely coupled through Web services APIs.

One potential concern with three nested loops operating concurrently is the potential for oscillatory or unstable behaviors. This issue does not arise in practice; the time constants across the three levels are pretty spread out: the time constant involved with the Intel® Intelligent Power Node Manager loop is in the order of milliseconds, whereas the Intel® Data Center Manager time constant is in the order of a few seconds. Finally, the time constant associated with power management for an appliance is in the order of tens of seconds. Because of the time constant spreads, it is possible to optimize each loop individually to ensure that no under-damped, oscillatory behaviors occur.
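The appliance-level arithmetic is simple enough to show directly. In the monitor-only case described above, the storage draw is read from the PDU and whatever remains of the appliance budget becomes the server-group target. The PDU and group-manager calls below are hypothetical placeholders, and the worked numbers in the docstring are illustration values only.

```python
# Monitor-only appliance policy: the storage subsystem's power can only be
# read (via the smart PDU), not capped, so the server group absorbs whatever
# budget remains. Both calls below are hypothetical placeholders.

def pdu_read_storage_power():
    """Placeholder: watts drawn by the storage subsystem, read from the PDU."""
    raise NotImplementedError

def set_server_group_target(watts):
    """Placeholder: group-level server cap handed to the power manager."""
    raise NotImplementedError

def balance_appliance(appliance_target_watts, min_server_watts):
    """Keep total appliance draw near the appliance set point.

    Example: with a 10,000 W appliance target and 3,200 W measured at the
    storage PDU, the server group would be capped at 6,800 W.
    """
    storage_watts = pdu_read_storage_power()
    server_budget = max(appliance_target_watts - storage_watts,
                        min_server_watts)   # never starve the servers outright
    set_server_group_target(server_budget)
    return server_budget
```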
Figure 7 Expanded view of appliance power control.

For more information about power management and cloud computing, please refer to the book Creating the Infrastructure for Cloud Computing: An Essential Handbook for IT Professionals by Enrique Castro-Leon, Bernard Golden, and Miguel Gomez.

About the Authors

Enrique Castro-Leon is an enterprise and data center architect and technology strategist for the Intel Digital Enterprise Group, working in OS design and architecture, software engineering, high-performance computing, platform definition, and business development. His work involves matching emerging technologies with innovative business models in developed and emerging markets. In this capacity he has served as a consultant for corporations large and small, nonprofits, and governmental organizations on issues of technology and data center planning.

He has undertaken roles as visionary, solution architect, project manager, and has undertaken the technical execution in advanced proof of concept projects with corporate end users, illustrating the use of emerging technologies. He has published over 40 articles, conference papers, and white papers on technology strategy and management as well as SOA and Web services. He holds PhD and MS degrees in Electrical Engineering and Computer Science from Purdue University.

Bernard Golden has been called "a renowned open source expert" (IT Business Edge) and "an open source guru" (SearchCRM.com) and is regularly featured in magazines like Computerworld, InformationWeek, and
Inc. His blog “The Open Source” is one of the most popular features of CIO Magazine’s Web site. Bernard is a frequent speaker at industry conferences like LinuxWorld, the Open Source Business Conference, and the Red Hat Summit. He is the author of Succeeding with Open Source (Addison-Wesley, 2005, published in four languages), which is used in over a dozen university open source programs throughout the world. Bernard is the CEO of Navica, a Silicon Valley IT management consulting firm.
Miguel Gomez is a Technology Specialist for the Networks and Service Platforms Unit of Telefónica Investigación y Desarrollo, working on innovation and technological consultancy projects related to the transformation and evolution of telecommunications services infrastructure. He has published over 20 articles and conference papers on next-generation service provisioning infrastructures and service infrastructure management. He holds PhD and MS degrees in Telecommunications Engineering from Universidad Politécnica de Madrid.
Copyright © 2010 Intel Corporation. All rights reserved.
This article is based on material found in the book Creating the Infrastructure for Cloud Computing: An Essential Handbook for IT Professionals by Enrique Castro-Leon, Bernard Golden, and Miguel Gomez. Visit the Intel Press web site to learn more about this book: http://www.intel.com/intelpress/sum_virtn.htm, or our Recommended Reading List for similar topics: www.intel.com/technology/rr
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4744. Requests to the Publisher for permission should be addressed to the Publisher, Intel Press, Intel Corporation, 2111 NE 25 Avenue, JF3-330, Hillsboro, OR 97124-5961. E-mail: [email protected] .