TRANSCRIPT
Virtual Hierarchies to Support Server Consolidation
Michael Marty and Mark Hill, University of Wisconsin-Madison
What is Server Consolidation?
Multiple server applications are deployed onto Virtual Machines (VMs), running on a single, more powerful server.
Feasibility
Virtualization Technology (VT): hardware and software support
Many-core CMPs: Sun's Niagara (32 threads); Intel's Tera-scale project (100s of tiles)
CMP Running Consolidated Servers
Characteristics
Isolating the function of VMs
Isolating the performance of consolidated servers
Facilitating dynamic reassignment of VM resources (processors, memory)
Supporting inter-VM memory sharing (content-based page sharing)
How Should the Memory System Be Optimized?
Minimize average memory access time (AMAT) by servicing misses within a VM
Minimize interference among separate VMs to isolate performance
Facilitate dynamic reassignment of cores, caches, and memory to VMs
Support inter-VM page sharing
Current CMP Memory Systems
Global broadcast: not viable for such a large number of tiles
Global directory: forces memory accesses to cross the chip, failing to minimize AMAT or isolate performance
Statically distributing the directory among tiles: better, but complicates memory allocation, VM reassignment, and scheduling, and limits sharing opportunities
DRAM Directory with Directory Cache (DRAM-DIR)
Main directory in DRAM; directory cache in the memory controller
Any tile may be a sharer of the data
Any miss issues a request to the directory; see the miss-path sketch below
Problems:
1. Fails to minimize AMAT: significant latency to reach the directory, even when the data is nearby.
2. Allows the performance of one VM to affect others, due to interconnect and directory contention.
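A minimal sketch of the DRAM-DIR miss path, with illustrative names and sizes (dir_cache, DIR_CACHE_SETS, and the entry layout are assumptions, not from the talk): every miss consults the memory controller's directory cache before falling back to the full directory in DRAM.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIR_CACHE_SETS 1024   /* illustrative size, not from the talk */

typedef struct {
    uint64_t tag;
    uint64_t sharers;   /* one bit per tile that may hold a copy */
    bool     valid;
} dir_entry_t;

static dir_entry_t dir_cache[DIR_CACHE_SETS];

/* Every coherence miss travels to the memory controller and consults
 * the directory, even when the data sits in a neighboring tile. That
 * round trip is why DRAM-DIR fails to minimize AMAT, and the shared
 * structure is where inter-VM contention arises. */
dir_entry_t lookup_directory(uint64_t block_addr)
{
    uint64_t set = block_addr % DIR_CACHE_SETS;
    if (dir_cache[set].valid && dir_cache[set].tag == block_addr)
        return dir_cache[set];                  /* dir-cache hit */
    dir_entry_t e = { block_addr, 0, true };    /* miss: fetch the */
    dir_cache[set] = e;                         /* entry from DRAM */
    return e;
}
```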
Duplicate Tag Directory (TAG-DIR)
Centrally located, so it fails to minimize AMAT
Suffers directory contention
Challenging as the number of cores increases: searching duplicate tags for 64 cores with 16-way caches amounts to a 1024-way associative lookup
Static Cache Bank Directory (STATIC-BANK-DIR)
Home tile determined by block address or page frame number
The home tile maintains sharers and coherence state
A local miss queries the home tile
A replacement at the home tile invalidates all copies
Fails to minimize AMAT or isolate VMs (even worse, due to the invalidations), as the sketch below illustrates
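A sketch of how a static home tile could be picked, assuming 64 tiles, 64-byte blocks, and 4 KB pages (all illustrative): the home is a fixed hardware function of the address, so it ignores VM boundaries entirely.

```c
#include <stdint.h>

#define NUM_TILES  64   /* illustrative tile count */
#define BLOCK_BITS 6    /* assuming 64-byte coherence blocks */
#define PAGE_BITS  12   /* assuming 4 KB pages */

/* Home chosen by block address: consecutive blocks interleave across
 * all tiles, regardless of which VM owns them. */
static inline unsigned home_by_block(uint64_t paddr)
{
    return (unsigned)((paddr >> BLOCK_BITS) % NUM_TILES);
}

/* Alternative: home chosen by page frame number, letting software
 * steer a page's home through page allocation. */
static inline unsigned home_by_page(uint64_t paddr)
{
    return (unsigned)((paddr >> PAGE_BITS) % NUM_TILES);
}
```

Because the mapping is fixed, a VM's misses routinely land on home tiles inside other VMs, which is where the AMAT and isolation failures come from.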
Solution: Two-Level Virtual Hierarchy
Level-one directory for intra-VM coherence: minimizes memory access time and isolates performance
Two alternative global level-two protocols for inter-VM coherence: allow inter-VM sharing caused by migration, reconfiguration, and page sharing
The two options: VHA and VHB
Level 1: Intra-VM Directory Protocol
Home tile lies within the VM
Who is the home? VMs are not necessarily power-of-2 sized, and resources are dynamically reassigned
Home tiles are therefore chosen dynamically, via a per-tile VM Config Table (64 entries); see the sketch below
Each directory entry holds a 64-bit sharer vector
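A minimal sketch of the dynamic home lookup, assuming 64-byte blocks and hypothetical names (vm_config_table_t, l1_home_tile): six address bits index the hypervisor-written 64-entry VM Config Table, whose entry names the level-one home tile, so a VM of any size, not just a power of two, keeps its homes on its own tiles.

```c
#include <stdint.h>

#define TABLE_SIZE 64   /* the 64-entry VM Config Table from the talk */
#define BLOCK_BITS 6    /* assuming 64-byte blocks */

/* Per-tile table, installed by the hypervisor when VMs are (re)assigned. */
typedef struct {
    uint8_t home_tile[TABLE_SIZE];
} vm_config_table_t;

/* Six address bits index the table; the entry names the dynamic
 * level-one home tile inside the requester's VM. */
static inline uint8_t l1_home_tile(const vm_config_table_t *t, uint64_t paddr)
{
    return t->home_tile[(paddr >> BLOCK_BITS) % TABLE_SIZE];
}

/* Example: a 3-tile VM on tiles {5, 6, 9} spreads the 64 entries
 * round-robin over its own tiles, so every indexed home falls inside
 * the VM even though 3 is not a power of two. */
void configure_vm(vm_config_table_t *t, const uint8_t *tiles, int n)
{
    for (int i = 0; i < TABLE_SIZE; i++)
        t->home_tile[i] = tiles[i % n];
}
```

Reassigning a core to another VM then only requires the hypervisor to rewrite these tables (plus level-two support for blocks cached under the old mapping).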
Level 2 – Option 1: VHA
Directory in DRAM, with a directory cache in the memory controller
Each entry contains a full 64-bit tile vector (sketched below)
Why not just a home tile ID? Home tiles change under dynamic reconfiguration, so the level-two directory tracks all possible copies itself rather than pointing at a level-one home that may be stale
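A sketch of a VHA level-two entry, assuming 64 tiles (the struct and function names are illustrative): the full tile vector lets level two probe or invalidate every possible copy directly.

```c
#include <stdint.h>

/* Stand-in for the network primitive; elided in this sketch. */
static void send_invalidate(int tile) { (void)tile; }

/* Illustrative VHA level-two entry: a full 64-bit tile vector, so
 * level two can find every cached copy directly, without relying on
 * a level-one home whose identity may have changed since the block
 * was cached. */
typedef struct {
    uint64_t tile_vector;   /* bit i set => tile i may cache the block */
} vha_l2_entry_t;

/* Invalidate every possible copy, e.g. on an inter-VM write miss. */
void vha_invalidate_all(vha_l2_entry_t *e)
{
    for (int t = 0; t < 64; t++)
        if (e->tile_vector & (1ULL << t))
            send_invalidate(t);
    e->tile_vector = 0;
}
```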
Brief Summary
The level-one intra-VM protocol handles most of the coherence
The level-two protocol is only used for inter-VM sharing and dynamic reconfiguration of VMs
Can we reduce the complexity of the level-two protocol?
Level 2 – Option 2: VHB
A single bit per memory block tracks whether the block has any cached copies.
If the bit is set, misses that need inter-VM sharing are broadcast to all tiles; see the sketch below.
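A sketch of the VHB level-two decision; the helper functions are hypothetical stand-ins for the memory controller's datapath, and only the control flow reflects VHB:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins; real hardware would consult a per-block
 * bit in memory and drive the on-chip network. */
static bool cached_bit(uint64_t block)          { (void)block; return true; }
static void respond_from_memory(uint64_t b)     { printf("mem   %llx\n", (unsigned long long)b); }
static void broadcast_to_all_tiles(uint64_t b)  { printf("bcast %llx\n", (unsigned long long)b); }

/* Handle a request the level-one (intra-VM) protocol could not
 * satisfy, e.g. due to inter-VM page sharing or VM reconfiguration. */
void vhb_level2_request(uint64_t block)
{
    if (!cached_bit(block))
        respond_from_memory(block);      /* no copies exist anywhere */
    else
        broadcast_to_all_tiles(block);   /* find any inter-VM sharers */
}
```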
Advantage of Level 2 Broadcast
Reduces protocol complexity: eliminates many transient states
Enables the level-one protocol to be inexact:
Using a limited or coarse-grain sharer vector (sketched below)
Even keeping no state at all, with broadcast within the VM
No home tag needed for private data
Victimizing a tag without invalidating its sharers
Accessing memory on a prediction, without checking the home tile first
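One of the listed simplifications, sketched with an illustrative coarsening factor of four tiles per bit: a coarse-grain sharer vector over-approximates the sharer set, and the extra probes stay safe because level two (e.g. VHB's broadcast) remains the correctness backstop.

```c
#include <stdint.h>

#define NUM_TILES     64
#define TILES_PER_BIT 4    /* illustrative coarsening factor */

/* Coarse-grain sharer vector: 16 bits instead of 64. A set bit marks
 * a whole 4-tile group as a possible sharer, so the state is inexact
 * (a superset) but far cheaper to store. */
static inline uint16_t mark_sharer(uint16_t vec, unsigned tile)
{
    return (uint16_t)(vec | (1u << (tile / TILES_PER_BIT)));
}

/* Expand each coarse bit back to the 4 tiles that must be probed;
 * some probes may be spurious, which is harmless given the level-two
 * fallback. */
static inline uint64_t tiles_to_probe(uint16_t vec)
{
    uint64_t mask = 0;
    for (unsigned g = 0; g < NUM_TILES / TILES_PER_BIT; g++)
        if (vec & (1u << g))
            mask |= 0xFULL << (g * TILES_PER_BIT);
    return mask;
}
```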
Uncontended L1-to-L1 Sharing Latency
Normalized Runtime: Homogeneous
STATIC-BANK-DIR and VHA consume tag space in static or dynamic home tiles
VHB: no home tiles needed for private data
Memory System Stall Cycles
Cycles per Transaction for Mixed Workloads
VHB: best overall performance, lowest cycles per transaction
DRAM-DIR: 45%-55% hit rate in the 8 MB directory cache (no partitioning)
STATIC-BANK-DIR: slightly better for oltp but worse for jbb in mixed1; it allows interference, letting oltp use other VMs' resources
Conclusion
Future memory systems should be optimized for workload consolidation as well as single workloads:
Maximize shared-memory accesses serviced within a VM
Minimize interference among separate VMs
Facilitate dynamic reassignment of resources