Energy-efficient data centers: Exploiting knowledge about application and resources
DESCRIPTION
Presentation by José M. Moya at the IEEE Region 8 SB & GOLD Congress (25 – 29 July, 2012). Current techniques for data center energy optimization, based on efficiency metrics such as PUE, pPUE, ERE, and DCcE, do not take into account the static and dynamic characteristics of the applications and resources (computing and cooling). However, knowledge about the current state of the data center, its past history, the resource characteristics, and the characteristics of the jobs to be executed can be used very effectively to guide decision-making at all levels of the data center in order to minimize energy needs. For example, allocating jobs to the available machines while taking into account the most energy-appropriate architecture for each job, as well as the type of jobs expected to arrive later, can reduce energy needs by 30%. Moreover, to achieve significant reductions in the energy consumption of state-of-the-art data centers (low PUE), a comprehensive, multi-level approach is becoming increasingly important, i.e., acting at different abstraction levels (scheduling and resource allocation, application, operating system, compilers and virtual machines, architecture, and technology) and at different scopes (chip, server, rack, room, and multi-room).
TRANSCRIPT
“Ingeniamos el futuro” (“We engineer the future”)
CAMPUS OF INTERNATIONAL EXCELLENCE
Energy-efficient data centers: Exploiting knowledge about application and resources
José M. Moya <[email protected]>
Integrated Systems Laboratory
José M. Moya | Madrid (Spain), July 27, 2012
Data centers
Power distribution
Power distribution (Tier 4)
Contents
• Motivation
• Our approach
– Scheduling and resource management
– Virtual machine optimizations
– Centralized management of low-power modes
– Processor design
• Conclusions
Motivation
• Energy consumption of data centers
– 1.3% of worldwide energy production in 2010
– USA: 80 million MWh/year in 2011 = 1.5 × NYC
– 1 data center = 25 000 houses
• More than 43 Million Tons of CO2 emissions per year (2% worldwide)
• More water consumption than many industries (paper, automotive, petrol, wood, or plastic)
Jonathan Koomey. 2011. Growth in Data center electricity use 2005 to 2010
Motivation
• Total data center electricity use is expected to exceed 400 TWh/year by 2015.
• The required energy for cooling will continue to be at least as important as the energy required for the computation.
• Energy optimization of future data centers will require a global and multi-disciplinary approach.
[Figure: World server installed base (thousands), 2000 to 2010, by class: high-end servers, mid-range servers, volume servers]
[Figure: Electricity use (billion kWh/year), 2000 to 2010, by component: infrastructure, communications, storage, high-end servers, mid-range servers, volume servers]
• 5.75 million new servers per year
• 10% unused servers (CO2 emissions similar to 6.5 million cars)
Temperature-dependent reliability problems
Time-dependent dielectric breakdown (TDDB)
Electromigration (EM)
Stress migration (SM)
Thermal cycling (TC)
Cooling a data center
Server improvements
• Virtualization - 27%
• Energy Star server conformance = 6.500
• Better capacity planning 2.500
Cooling improvements
• Improvements in air flow management and wider temperature ranges
• Energy savings up to 25% 25.000
• Return on investment in only 2 years
Infrastructure improvements
• AC → DC
– 20% reduction of power losses in the conversion process
– 47 million dollars in savings on real-estate costs
– Up to 97% efficiency; the energy saved is enough to power an iPad for 70 million years
Best practices
And… what about IT people?
PUE: Power Usage Effectiveness
• State of the art: PUE ≈ 1.2
– The important part is IT energy consumption
– Current work on energy-efficient data centers is focused on decreasing PUE
– Decreasing P_IT does not decrease PUE, but it does show up in the electricity bill
• But how can we reduce P_IT?
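The point about P_IT can be checked with a few lines of arithmetic. The sketch below uses the standard definition PUE = total facility power / IT power. The function name `pue` and every kW figure are illustrative assumptions, not values from the talk.

```python
def pue(p_it_kw: float, p_overhead_kw: float) -> float:
    """PUE = total facility power / IT power (standard definition)."""
    return (p_it_kw + p_overhead_kw) / p_it_kw


# Hypothetical facility: 1000 kW of IT load plus 200 kW of cooling
# and power-delivery overhead.
baseline = pue(1000.0, 200.0)

# Assumption: a 30% cut in IT power shrinks the overhead by the same factor.
scaled = pue(700.0, 140.0)

# Assumption: the same cut, but the overhead does not shrink at all.
fixed = pue(700.0, 200.0)

print(f"baseline: PUE {baseline:.2f}, total 1200 kW")
print(f"IT cut, overhead scales: PUE {scaled:.2f}, total 840 kW")
print(f"IT cut, overhead fixed: PUE {fixed:.2f}, total 900 kW")
```

In both scenarios the facility draws less total power even though PUE stays flat or gets worse, which is why PUE alone is a poor guide once it is already low.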
Potential energy savings by abstraction level
Our approach
• Global strategy to allow the use of multiple information sources to coordinate decisions in order to reduce the total energy consumption
• Use of knowledge about the energy demand characteristics of the applications, and characteristics of computing and cooling resources to implement proactive optimization techniques
Holistic approach
Chip Server Rack Room Multi-room
Sched & alloc 2 1
app
OS/middleware
Compiler/VM 3 3
architecture 4 4
technology 5
1. Room-level resource management
Leveraging heterogeneity
• Use heterogeneity to minimize energy consumption from a static/dynamic point of view
– Static: Finding the best data center set-up, given a number of heterogeneous machines
– Dynamic: Optimization of task allocation in the Resource Manager
• We show that the best solution implies a heterogeneous data center
– Most data centers are heterogeneous (several generations of computers)
M. Zapater, J.M. Moya, J.L. Ayala. Leveraging Heterogeneity for Energy Minimization in Data Centers, CCGrid 2012
Current scenario
WORKLOAD → Scheduler → Resource Manager → Execution
Potential improvements with best practices
Cooling-aware scheduling and resource allocation
iMPACT Lab (Arizona State U)
Application-aware scheduling and resource allocation
LSI-UPM
[Diagram blocks: Workload, Profiling and Classification, Energy Optimization, Resource Manager (SLURM), Execution]
Application-aware scheduling and resource allocation
Scenario
• Workload:
– 12 tasks from SPEC CPU INT 2006
– Random workload composed of 2000 tasks, divided into job sets
– Random job set arrival time
• Servers:
Application-aware scheduling and resource allocation
Energy profiling
Workload characterization
Application-aware scheduling and resource allocation
Optimization
Energy minimization:
• Minimization subject to constraints
• MILP problem (solved with CPLEX)
• Static and dynamic
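To make the shape of the optimization concrete, here is a toy instance of a constrained energy-minimizing assignment. It is a sketch only: it enumerates every assignment exhaustively instead of calling a MILP solver such as CPLEX, and the task names, energy figures, and slot capacities are all hypothetical.

```python
from itertools import product

# Hypothetical energy (in joules) each task would consume on each server type.
ENERGY_J = {
    "t1": {"sparc": 4.0, "amd": 6.0},
    "t2": {"sparc": 5.0, "amd": 3.0},
    "t3": {"sparc": 2.0, "amd": 2.5},
    "t4": {"sparc": 3.0, "amd": 7.0},
}
# Hypothetical number of free slots per server type.
CAPACITY = {"sparc": 2, "amd": 2}


def best_allocation(energy, capacity):
    """Return (total_energy, {task: server_type}) with minimal total energy."""
    tasks = list(energy)
    types = list(capacity)
    best_cost, best_assignment = float("inf"), None
    for choice in product(types, repeat=len(tasks)):
        # Constraint: no server type receives more tasks than it has slots.
        if any(choice.count(t) > capacity[t] for t in types):
            continue
        cost = sum(energy[task][srv] for task, srv in zip(tasks, choice))
        if cost < best_cost:
            best_cost, best_assignment = cost, dict(zip(tasks, choice))
    return best_cost, best_assignment


cost, assignment = best_allocation(ENERGY_J, CAPACITY)
print(cost, assignment)
```

Exhaustive search grows exponentially with the number of tasks, so it only works for tiny instances like this one; a workload of thousands of tasks is where a real MILP solver becomes necessary.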
Application-aware scheduling and resource allocation
Static optimization
• Definition of optimal data center
– Given a pool of 100 servers of each kind
– 1 job set from workload
– The optimizer chooses the best selection of servers
– Constraints of cost and space
Best solution:
• 40 Sparc
• 27 AMD
Savings:
• 5 to 22% energy
• 30% time
Application-aware scheduling and resource allocation
Dynamic optimization
• Optimal workload allocation
– Complete workload (2000 tasks)
– Good enough resource allocation in terms of energy (not the best)
– Run-time evaluation and optimization
Energy savings ranging from 24% to 47%
Application-aware scheduling and resource allocation
Conclusions
• First proof-of-concept regarding the use of heterogeneity to save energy
• Automatic solution
• Automatic processor selection offers notable energy savings
• Easy implementation in real scenarios
– SLURM Resource Manager
– Realistic workloads and servers
2. Server-level resource management
Scheduling and resource allocation policies in MPSoCs
A. Coskun, T. Rosing, K. Whisnant, and K. Gross. "Static and dynamic temperature-aware scheduling for multiprocessor SoCs." IEEE Trans. Very Large Scale Integr. Syst., vol. 16, no. 9, pp. 1127–1140, 2008
UCSD – System Energy Efficiency Lab
Scheduling and resource allocation policies in MPSoCs
• Energy characterization of applications makes it possible to define proactive scheduling and resource allocation policies that minimize hotspots
• Hotspot reduction makes it possible to raise the cooling temperature
+1 °C means around 7% cooling energy savings
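As a rough illustration of what the 7% per degree figure implies over several degrees, the sketch below compounds the saving for each degree of setpoint increase. Compounding is an assumption (the slide only gives the per-degree figure), and the function name and the 100 kWh baseline are hypothetical.

```python
def cooling_energy_after_raise(base_kwh: float, delta_c: int,
                               saving_per_degree: float = 0.07) -> float:
    """Cooling energy after raising the setpoint by delta_c degrees Celsius."""
    # Assumption: each degree removes the same fraction of the remaining energy.
    return base_kwh * (1.0 - saving_per_degree) ** delta_c


for delta in (1, 2, 3):
    print(delta, round(cooling_energy_after_raise(100.0, delta), 2))
```

Under this assumption, three degrees of headroom gained by removing hotspots would cut cooling energy by roughly a fifth.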
3. Application-aware and resource-aware virtual machine
JIT compilation in virtual machines
• Virtual machines compile applications into native code (JIT compilation) for performance reasons
• The optimizer is general-purpose and focused on performance optimization
JIT compilation for energy minimization
• Application-aware compiler
– Energy characterization of applications and transformations
– Application-dependent optimizer
– Global view of the data center workload
• Energy optimizer
– Currently, compilers for high-end processors are oriented to performance optimization
[Diagram labels: Front-end, Back-end, Optimizer, Code generator]
Energy saving potential for the compiler (MPSoCs)
T. Simunic, G. de Micheli, L. Benini, and M. Hans. "Source code optimization and profiling of energy consumption in embedded systems." International Symposium on System Synthesis, pp. 193–199, Sept. 2000
– 77% energy reduction in MP3 decoder
Y. Fei, S. Ravi, A. Raghunathan, and N. K. Jha. "Energy-optimizing source code transformations for OS-driven embedded software." In Proceedings of the International Conference on VLSI Design, pp. 261–266, 2004
– Up to 37.9% (mean 23.8%) energy savings in multiprocess applications running on Linux
4. Global automatic management of low-power modes
DVFS – Dynamic Voltage and Frequency Scaling
• As supply voltage decreases, power decreases quadratically
• But delay increases (performance decreases) only linearly
• The maximum frequency also decreases linearly
• Currently, low-power modes, if used at all, are triggered by the server operating system when it detects inactivity
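The trade-off can be illustrated with the usual first-order CMOS model, P = C_eff · V² · f. This model is an assumption brought in for illustration and is not written on the slide; the sketch also assumes frequency scales linearly with voltage and ignores leakage power. All names and operating points are hypothetical.

```python
def dynamic_power(c_eff: float, v: float, f_hz: float) -> float:
    """First-order CMOS dynamic power: P = C_eff * V^2 * f."""
    return c_eff * v * v * f_hz


def task_energy(c_eff: float, v: float, f_hz: float, cycles: float) -> float:
    """Energy to run a fixed number of cycles at (v, f_hz); leakage is ignored."""
    runtime_s = cycles / f_hz
    return dynamic_power(c_eff, v, f_hz) * runtime_s


# Hypothetical operating points: nominal, then voltage and frequency both at 80%.
C_EFF, CYCLES = 1e-9, 2e9
nominal = task_energy(C_EFF, 1.0, 2.0e9, CYCLES)
scaled = task_energy(C_EFF, 0.8, 1.6e9, CYCLES)

print(f"energy ratio {scaled / nominal:.2f}, runtime ratio {2.0e9 / 1.6e9:.2f}")
```

Under these assumptions the task takes 25% longer but uses about 36% less dynamic energy. Once leakage is included, running slower is not always a win, which is one reason the decision benefits from workload knowledge.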
Room-level DVFS
• To minimize energy consumption, changes between modes should be minimized
• There exist optimal algorithms for a known task set (YDS)
• Workload knowledge makes it possible to schedule low-power modes globally without any impact on performance
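A minimal sketch of the YDS (Yao, Demers, Shenker) speed-scaling algorithm named above, for a task set known in advance. It repeatedly finds the interval with the highest work density, runs the jobs inside it at that density, and removes the interval from the timeline. The sketch returns only the speed assigned to each job, not the full earliest-deadline-first timeline; the tuple layout and job names are hypothetical, and job names are assumed to be unique.

```python
def yds_speeds(jobs):
    """jobs: list of (name, release, deadline, work). Returns {name: speed}."""
    remaining = list(jobs)
    speeds = {}
    while remaining:
        times = sorted({t for _, r, d, _ in remaining for t in (r, d)})
        best = None  # (intensity, t1, t2, names of the jobs inside)
        for i, t1 in enumerate(times):
            for t2 in times[i + 1:]:
                # Jobs whose whole [release, deadline] window fits in [t1, t2].
                inside = [j for j in remaining if j[1] >= t1 and j[2] <= t2]
                if not inside:
                    continue
                intensity = sum(j[3] for j in inside) / (t2 - t1)
                if best is None or intensity > best[0]:
                    best = (intensity, t1, t2, {j[0] for j in inside})
        intensity, t1, t2, done = best
        for name in done:
            speeds[name] = intensity
        length = t2 - t1

        def squeeze(t):
            # Cut the critical interval [t1, t2] out of the timeline.
            if t <= t1:
                return t
            if t >= t2:
                return t - length
            return t1

        remaining = [(n, squeeze(r), squeeze(d), w)
                     for n, r, d, w in remaining if n not in done]
    return speeds


# Hypothetical task set: J1 is urgent, J2 has a loose deadline.
print(yds_speeds([("J1", 0, 4, 8), ("J2", 0, 10, 3)]))
```

In this example the urgent job runs at speed 2.0 during its window and the relaxed job is spread over the remaining six time units at speed 0.5, so each job runs at one constant speed and mode changes are kept to a minimum.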
Parallelism to save energy
5. Temperature-aware floorplanning of MPSoCs and many-cores
Temperature-aware floorplanning
Potential energy savings with floorplanning
– Up to 21 °C reduction of maximum temperature
– Mean: -12 °C in maximum temperature
– Better results in the most critical examples
Y. Han, I. Koren, and C. A. Moritz. Temperature Aware Floorplanning. In Proc. of the Second Workshop on Temperature-Aware Computer Systems, June 2005
Temperature-aware floorplanning in 3D chips
• 3D chips are attracting interest due to:
– Scalability: reduces 2D equivalent area
– Performance: shorter wire length
– Reliability: less wiring
• Drawback:
– Large increase in hotspots compared with equivalent 2D designs
Temperature-aware floorplanning in 3D chips
• Up to 30 °C reduction per layer in a 3D chip with 4 layers and 48 cores
There is still much more to be done
• Smart Grids
– Consume energy when everybody else does not
– Decrease energy consumption when everybody else is consuming
• Reducing the electricity bill
– Variable electricity rates
– Reactive power coefficient
– Peak energy demand
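The variable-rate item can be illustrated with a small sketch that places a deferrable job in the cheapest contiguous window of an hourly tariff. It assumes the job draws constant power and can start at any hour; the function name and the price list are hypothetical.

```python
def cheapest_window(prices_per_kwh, duration_h):
    """Return (start_hour, price_sum) of the cheapest contiguous window."""
    best_start, best_sum = 0, sum(prices_per_kwh[:duration_h])
    window = best_sum
    for start in range(1, len(prices_per_kwh) - duration_h + 1):
        # Slide the window one hour to the right.
        window += prices_per_kwh[start + duration_h - 1] - prices_per_kwh[start - 1]
        if window < best_sum:
            best_start, best_sum = start, window
    return best_start, best_sum


# Hypothetical hourly tariff, in currency units per kWh.
PRICES = [12, 10, 8, 7, 9, 14, 18, 20]
start, total = cheapest_window(PRICES, 3)
print(f"start at hour {start}, summed price {total}")
```

The job consumes the same energy wherever it is placed; only the bill changes.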
Conclusions
• Reducing PUE is not the same as reducing energy consumption
– IT energy consumption dominates in state-of-the-art data centers
• Application and resources knowledge can be effectively used to define proactive policies to reduce the total energy consumption
– At different levels
– In different scopes
– Taking into account cooling and computation at the same time
• Proper management of the knowledge of the data center thermal behavior can reduce reliability issues
• Reducing energy consumption is not the same as reducing the electricity bill
Contact
José M. Moya
+34 607 082 [email protected]
ETSI de Telecomunicación, B104
Avenida Complutense, 30
Madrid 28040, Spain
Thank you