
Understanding Major Performance Pitfalls in vSphere

Vernak.com
7/29/2013

    Technical White Paper by Dan McGee

    This document is designed to help IT Professionals avoid

    painful design mistakes or oversights in a vSphere

    environment.

Table of Contents

1. Introduction
2. CPU
   2.1 Understanding CPU % Ready and the CPU Scheduler
   2.2 Processor Caching
   2.3 Failure to Complete P2V Migrations
3. Memory
   3.1 Overcommitted Memory
   3.2 Memory Limits on Gold Images
   3.3 Failure to Maintain Like VMs
4. Network
   4.1 Understanding Default NIC Configuration
   4.2 STP, STA, BPDU Guard & BPDU Filter
   4.3 Broadcasts, Flat VLANs and Interrupts
5. Storage
   5.1 Dangers of Thin Provisioning
   5.2 Array Misconfiguration
   5.3 Alignment Issues
6. Further Reading


    1. Introduction

A colleague of mine recently asked me, "What are the things that just grind our virtualized servers to a halt? What are the big performance nightmares in vSphere?" They did not just want to know some troubleshooting methodology and how it changes going from physical to virtual servers. They wanted to know the juicy stuff. They wanted the dark, disturbing stories from organizations that have had it all go to hell in a handbasket: companies that have suffered performance issues they can't explain, lower ROI, resume-generating events, and enormous piles of cash parted with for hardware that wasn't appropriate for them. This White Paper is my answer to the heart of the inquiry, and it is my hope that these examples provide a basis for better decision making in a vSphere environment.

The examples to follow are by no means mutually exclusive from one another, nor are they collectively exhaustive. Also, I will not be discussing certain features such as vMotion, HA, and DRS in any significant detail. This is not to discredit their incredible value to a healthy vSphere environment by any means. For more information on these and other technologies, please reference the Further Reading section at the end of this document.

Virtual environments are inherently dynamic, and utilization levels can change drastically and quickly. Virtual machines can be created or cloned in minutes by administrator actions or sometimes by automation tools. HA restarts or DRS events can also add to the less predictable virtualized world. Unlike physical environments, the underlying resource categories are shared, and this requires IT professionals to learn and adapt to new rules and guidelines in the data center. The four primary resource categories are CPU, Memory, Network (I/O), and Storage (Capacity & I/O). The following sections will briefly explain how they are shared and demonstrate soul-crushing scenarios in each category and how they could be resolved or avoided.


    2. CPU

    2.1 Understanding CPU % Ready and the CPU Scheduler

In Larry Louks's "Critical VMware Mistakes You Should Avoid," the following foundational rules are well laid out for server processor management:

1. The ESXi host server schedules VMs onto and off of processors as needed.
2. Whenever a VM is scheduled to a processor, all of its cores must be available for the VM to be scheduled, or the VM cannot be scheduled at all.
3. If a VM cannot be scheduled to a processor when it needs access to one, VM performance can suffer tremendously.
4. When VMs are ready for a processor but unable to be scheduled, this creates what VMware calls CPU % Ready values.
5. CPU % Ready manifests as a utilization issue but is actually a scheduling issue.

Due to the rules mentioned above, mixing VMs with greatly varied vCPU counts on the same ESXi host can create major scheduling problems. This issue is amplified if the host has a low density of CPU cores. Whenever possible, reduce VMs to a single vCPU. Often this is not recommended or possible, but never do the opposite and assume that arbitrarily throwing vCPUs at a VM will actually make it perform faster.
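Before blaming raw CPU utilization, it helps to translate the ready counter the charts report into a percentage. The sketch below is a minimal illustration of the usual conversion, assuming the 20-second sample interval used by realtime performance charts; the 10% per-vCPU figure in the comment is a common community rule of thumb, not something taken from this paper.

```python
def cpu_ready_percent(ready_summation_ms: float,
                      interval_seconds: int = 20,
                      num_vcpus: int = 1) -> float:
    """Convert a CPU ready 'summation' value (in milliseconds) from a
    performance chart into a CPU %RDY figure.

    interval_seconds: the chart's sample interval (realtime charts sample
    every 20 seconds); num_vcpus averages the result per vCPU.
    """
    return (ready_summation_ms / (interval_seconds * 1000.0)) * 100.0 / num_vcpus

# Example: a 4-vCPU VM reporting 8,000 ms of ready time in one 20 s sample
# averages 10% ready per vCPU, which is generally treated as a warning sign.
print(cpu_ready_percent(8000, interval_seconds=20, num_vcpus=4))
```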

    2.2 Processor Caching

Processor caches contain information that allows the guest operating system and applications to perform better. To this end, vSphere attempts to schedule a VM on the same core over and over again. If the virtual machine is moved to a new core on a new socket where the cache isn't shared (which it usually isn't between sockets), the cache for the new core must be loaded over time with the information formerly cached on the previous core. The reason I discuss processor caching is that VMware administrators need to understand there is a temporary switching cost associated with things such as a hot migration with vMotion.

    2.3 Failure to Complete P2V Migrations

Physical to Virtual (P2V) migrations involve imaging an entire physical server, including OS, applications, and data, into a virtual disk. The often neglected second phase of a proper migration is stripping the new virtual machine of physical device drivers and related hardware-vendor-specific tools. Plain and simple, physical drivers need to be replaced with virtual ones (VMware Tools). Here is a list of steps for a proper P2V migration (a post-migration audit sketch follows the list):

1. Plan capacity
2. Pre-migration cleanup
3. Full shutdown migration of critical apps and databases, ensuring stable data
4. Exclude unneeded partitions
5. Split array partitions into separate virtual disks
6. Post-migration cleanup
7. Adjust VM and Guest OS to a single vCPU where possible
8. Remove unnecessary applications
9. Review virtual hardware assignments
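To make steps 6 through 9 concrete, here is a minimal sketch of a post-migration audit. The inventory format and the vendor keyword list are illustrative assumptions rather than any VMware API; feed it whatever your own tooling reports about each migrated VM.

```python
# A minimal post-P2V audit sketch. The dict format and the keyword list
# below are illustrative assumptions, not a VMware interface.
PHYSICAL_LEFTOVER_KEYWORDS = [
    "dell openmanage", "hp insight", "broadcom", "intel proset", "raid manager",
]

def audit_vm(vm: dict) -> list[str]:
    """Return a list of findings for one migrated VM.

    vm is a plain dict such as:
      {"name": "app01", "tools_installed": True,
       "services": ["HP Insight Agent", "Print Spooler"], "vcpus": 4}
    """
    findings = []
    if not vm.get("tools_installed", False):
        findings.append("VMware Tools not installed")
    for svc in vm.get("services", []):
        if any(k in svc.lower() for k in PHYSICAL_LEFTOVER_KEYWORDS):
            findings.append(f"leftover physical-vendor software: {svc}")
    if vm.get("vcpus", 1) > 1:
        findings.append("review vCPU count; reduce to 1 where possible")
    return findings

print(audit_vm({"name": "app01", "tools_installed": False,
                "services": ["HP Insight Agent"], "vcpus": 4}))
```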


    3. Memory

    3.1 Overcommitted Memory

The total physical RAM in an ESXi host is available to VMs, aside from a small amount reserved for the VMkernel. If I have an ESXi host with 64GB of physical RAM, I could build VMs on it and assign more than 64GB total to the VMs. When this is done, I am overcommitting memory. By default, a virtual machine does not power on with all of the RAM assigned to it; it boots with half of its RAM assignment unless the minimum value is changed for the VM.

This memory overcommitment does not cause major issues if the VMs don't actually need all of the RAM assigned to them. However, if the ESXi host is overcommitted and VMs actually need RAM in excess of the RAM available, there are a few ESXi mechanisms that attempt to keep it all up and running. These include balloon driver activity and the VMkernel swap file.

The Memory Control Driver (also known as the Balloon Driver) is installed as part of the VMware Tools package that should be installed on all of the VMs in the environment. The purpose of the Balloon Driver is to force VMs to give up some of their RAM and begin using their swap file. In other words, if there is more demand for RAM than what the ESXi host can provide, the balloon driver inflates and VMs are forced to move some contents of their own RAM into the guest OS swap file within their own virtual disks. As is the case with a physical server, this will cause degradation of performance, since the VM must rewrite parts of its page file back into memory for processing.

The VMkernel swap file is created when a new VM is first powered on, in the VMFS volume that contains the VM. This file is equal in size to the amount of RAM assigned to the virtual machine. The file is only used in emergency situations and otherwise sits idle in a healthy vSphere environment. If there is a crisis with memory overcommitment, the ESXi host will first use the balloon driver to reclaim RAM. If the crisis continues, the ESXi host will grab parts of the VM's memory and move them out to VMkernel swap. In this scenario, VMs could be swapping both with a page file and again with a VMkernel swap file. This takes awful performance and drags it right down to apocalyptic.

Just because you can overcommit memory does not mean that you should. ESXi is very resilient in keeping virtual machines up and running even in these scenarios, but at an obvious cost of performance.
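As a back-of-the-envelope check, the sketch below computes a simple overcommitment ratio for a host. The VM list and the VMkernel overhead figure are invented for illustration, and real reclamation behavior depends on reservations, shares, and active memory rather than configured memory alone.

```python
# A rough sketch of memory overcommitment; all figures are hypothetical.
def overcommit_ratio(host_ram_gb: float, vm_ram_gb: list[float],
                     vmkernel_overhead_gb: float = 2.0) -> float:
    """Ratio of RAM assigned to VMs versus RAM the host can actually supply."""
    usable = host_ram_gb - vmkernel_overhead_gb
    return sum(vm_ram_gb) / usable

# A 64 GB host carrying VMs assigned a total of 96 GB is roughly 1.55x
# overcommitted; ballooning and VMkernel swap become likely under load.
print(round(overcommit_ratio(64, [8, 8, 16, 16, 16, 32]), 2))
```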

    3.2 Memory Limits on Gold Images

The base images that we use as templates for our various guest operating systems must be configured properly, lest they yield cloned virtual machines that share their flaws. A relatively common mistake found in a lot of vSphere implementations is that this master image has memory limits set. Because the limit always wins, a VM assigned 8GB of RAM will not use 8GB if a limit was accidentally set at 2GB. The VM will only ever be allocated up to 2GB, but the VM guest thinks it has the full 8GB.


If the VM actually needs the RAM you have assigned to it beyond the limit, severe performance issues occur. Multiply this by the number of times a misconfigured template has been cloned in the environment, and you can see how the nightmare scenario might play out.
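A quick report along the following lines can catch the problem before a template is cloned a hundred times. The inventory here is a plain list of dicts standing in for whatever your reporting tooling returns; the convention that a limit of -1 means "unlimited" mirrors vSphere's, but the field names are my own assumptions.

```python
# A sketch that flags VMs whose memory limit is below their configured RAM.
def find_limited_vms(vms: list[dict]) -> list[str]:
    flagged = []
    for vm in vms:
        limit_mb = vm.get("mem_limit_mb", -1)   # -1 stands for "unlimited"
        if limit_mb != -1 and limit_mb < vm["mem_config_mb"]:
            flagged.append(f"{vm['name']}: configured {vm['mem_config_mb']} MB, "
                           f"limited to {limit_mb} MB")
    return flagged

clones = [
    {"name": "web01", "mem_config_mb": 8192, "mem_limit_mb": 2048},
    {"name": "web02", "mem_config_mb": 8192, "mem_limit_mb": -1},
]
print(find_limited_vms(clones))   # only web01 is flagged
```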

    3.3 Failure to Maintain Like VMs

ESXi is able to scan the RAM pages of its virtual machines and identify any duplicate pages. After this detection, the ESXi host will use a single RAM location and point multiple virtual machines to that single page of shared RAM. The shared page location is flagged read-only to prevent corruption. The VM guest operating systems are unaware of this underlying sharing; they simply have what they need available in RAM. When we scale this concept up to having, let's say, one thousand Windows 2008 R2 servers, you can imagine how much RAM we could save by having OS builds that are similarly patched and maintained. This is not to say that you should only run one type of guest operating system in your environment, because that simply isn't practical in many, if not most, environments. However, if attention is paid to proper operating system patching, application patching, and reducing disparate VMs and sprawl, it is possible to see high percentages of shared RAM on your ESXi hosts.
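The toy model below illustrates the idea behind this page sharing: hash every guest page and count how much physical RAM remains occupied once duplicates are collapsed to a single read-only copy. The page contents are fabricated, and real ESXi hashes and compares pages far more carefully than this sketch does.

```python
# A toy illustration of transparent page sharing with fabricated page data.
import hashlib

def shared_savings(pages: list[bytes], page_size: int = 4096) -> int:
    """Bytes saved if duplicate pages are backed by a single physical page."""
    unique = {hashlib.sha1(p).digest() for p in pages}
    return (len(pages) - len(unique)) * page_size

# Ten VMs booted from the same patched template share many identical pages.
common = b"\x00" * 4096                      # stand-in for an identical OS page
distinct = [bytes([i]) * 4096 for i in range(10)]
pages = [common] * 10 + distinct
print(shared_savings(pages))                 # 9 duplicates * 4 KiB = 36864 bytes
```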


    4. Network

    4.1 Understanding Default NIC Configuration

The virtual machines hosted on ESXi are commonly assigned virtual network interface cards. These virtual NICs connect to virtual switches, which provide a path from the virtual NICs to the physical adaptor(s). The physical NICs on the ESXi hosts do not have IP addresses themselves, but rather act as a passthrough for the virtual machine traffic to the physical network. Proportional shares are utilized in order to prevent virtual machines from monopolizing a network path. Shares are enforced only in situations where there is contention for network resources. Otherwise, virtual machines are simply given what they need in terms of network resources.

When two or more physical NICs are assigned to a virtual switch, the default load-balancing mechanism routes each virtual NIC's traffic based on its originating virtual port ID, effectively pinning each VM to one uplink rather than performing true round-robin. This default methodology works fine for many environments but can lead to the network cards carrying uneven loads. In contrast, the virtual switch can be set up to perform link aggregation (route based on IP hash, with a matching EtherChannel/LACP configuration on the physical switch). In this configuration, the VMs share the combined throughput of the NICs while maintaining failover ability.
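The following toy model shows why pinning by virtual port ID can leave uplinks unevenly loaded. The modulo pinning rule and the traffic figures are simplifications for illustration, not the actual teaming algorithm.

```python
# A toy model of port-ID based teaming: each vNIC is pinned to one uplink,
# so a few chatty VMs can saturate one NIC while another sits nearly idle.
from collections import defaultdict

def uplink_loads(vms: list[dict], num_uplinks: int) -> dict[int, int]:
    loads = defaultdict(int)
    for vm in vms:
        uplink = vm["port_id"] % num_uplinks      # simplified pinning rule
        loads[uplink] += vm["mbps"]
    return dict(loads)

vms = [{"port_id": 0, "mbps": 900}, {"port_id": 2, "mbps": 800},
       {"port_id": 1, "mbps": 50},  {"port_id": 3, "mbps": 40}]
print(uplink_loads(vms, 2))   # {0: 1700, 1: 90} -- one uplink near saturation
```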

    4.2 STP, STA, BPDU Guard & BPDU Filter

Spanning Tree Protocol (STP) enabled switches in a redundant Local Area Network (LAN) need to exchange information with each other for STP to work properly. Bridge Protocol Data Units (BPDUs) are messages exchanged between switches in a redundant LAN. The BPDU frames contain information such as Switch ID, originating port, MAC address, switch port cost, switch port priority, etc. The BPDUs are sent out as multicast messages, and when they are received, the switch uses the Spanning Tree Algorithm (STA) to recognize when there is a Layer 2 switching loop in the network and to determine which redundant ports need to be shut down. BPDUs and STA are designed to avoid Layer 2 switching loops as well as broadcast storms.

As VMware so eloquently puts it in KB 2047822, the STP process of identifying the root bridge and finding whether the switch ports are in a forwarding or blocking state takes somewhere around 30 to 50 seconds. During that time, no data can be passed from those switch ports. If a server connected to the port cannot communicate for that long, the applications running on it will time out. To avoid this issue of timeouts on the servers, the best practice is to enable the Port Fast configuration on the switch ports where the servers' NICs are connected. The Port Fast configuration puts the physical switch port immediately into the STP forwarding state.

So now that we have the important STP best practice out of the way, let's discuss how Standard and Distributed vSwitches use STP and BPDUs: they don't! vSwitches do not support STP, nor do they exchange BPDU frames. By default, they just forward them and think nothing of it. That said, an STP boundary is often created by using BPDU Guard on the physical switch ports.


In this BPDU Guard example, any BPDU frame received on the physical switch port causes the port to become blocked. Since the vSwitch will simply forward BPDU frames through to the physical switch ports, this can cause an uplink traffic path failure if a compromised virtual machine generates BPDU frames. Then the vSphere host moves the traffic to another uplink, which then disables another switch port. This process chains until the entire cluster is compromised by the Denial of Service attack. This attack is especially cruel and damaging in a multi-tenant environment, where a nasty neighbor can attack the neighborhood.

From KB 2047822: to prevent such Denial of Service attack scenarios, the BPDU Filter feature is supported as part of the vSphere 5.1 release. After configuring this feature at the ESXi host level, BPDU frames from any virtual machine will be dropped by the vSwitch. This feature is available on both Standard and Distributed vSwitches. I believe that it is critically important for VMware Administrators and their networking colleagues to understand how to prevent this cluster-wide catastrophe. As an aside, there are those in the VMware community who believe that VMware should implement a true BPDU Guard on the vSwitches and not just BPDU Filter, but that is a discussion that will not be picked up here. Please refer to the Further Reading section of this paper for appropriate links to related materials.
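To make the chain reaction concrete, here is a deliberately simplified walk-through in code. The uplink names are placeholders, and the loop only mimics the sequence described above (BPDU forwarded, port err-disabled, failover, repeat); it is not a model of real switch or vSwitch behavior.

```python
# A toy walk-through of the BPDU Guard failure chain: a compromised VM keeps
# emitting BPDU frames, the vSwitch forwards them, BPDU Guard blocks the
# receiving physical port, the host fails over to the next uplink, and the
# cycle repeats until every uplink is disabled. Purely illustrative.
def bpdu_guard_chain(uplinks: list[str]) -> list[str]:
    blocked = []
    active = list(uplinks)
    while active:                      # the VM keeps sending BPDUs
        port = active[0]               # current active uplink receives the frame
        blocked.append(port)           # BPDU Guard err-disables that switch port
        active.pop(0)                  # host fails traffic over to the next uplink
    return blocked

print(bpdu_guard_chain(["vmnic0", "vmnic1", "vmnic2", "vmnic3"]))
# -> all four uplinks end up blocked; with BPDU Filter enabled the guest's
#    BPDU frames are dropped at the vSwitch and the chain never starts.
```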

    4.3 Broadcasts, Flat VLANs and Interrupts

Broadcasts tend to be very small packets and very bursty in nature. Although broadcast packets are sent to every device on the same VLAN as the sending device, we tend not to worry too much about broadcasts anymore because available bandwidth is usually so high that the impact of small broadcast packets is negligible. Also, contemporary switches have many features, such as STP, to limit or eliminate networking loops and broadcast storms.

However, there are two major reasons why we still have to care at some level: bad (read: flat) design, and interrupts. If your organization has one flat VLAN with all your management traffic, network storage, and virtual machine traffic playing together, then they are not going to play nicely. When you do this (don't be this guy/girl), you create a scenario where broadcasts of everything affect everything else. Do not cut corners in your network design and use VLANs that far exceed industry-recommended sizes. If you do, you will suffer the wrath of broadcasts and their best friend: interrupts.

In the years before Plug-N-Play, devices had interrupts set manually by the use of jumpers or DIP switches. Plug-N-Play effectively makes the jumper/DIP switch decisions for us now. However, devices like NICs still need their interrupts, and every time a packet hits a network card in an Intel architecture device, it triggers an interrupt. Devices use interrupt requests (IRQs) to get an operating system or application to stop and make sure the communication with them is correct and complete before resuming. So when we talk about interrupts, we are talking about literally stopping the processor. Usually, these interrupt stoppages are brief and infrequent enough to not be noticeable or well understood by many administrators. Sometimes, though, organizations for whatever reasons (or lack thereof) implement flat VLANs where an enormous number of broadcast packets are hitting Intel architecture devices such as ESXi hosts. This then causes incredible slowdowns due to production workload processing getting interrupted constantly. Proper network segmentation from the start is the most effective way to mitigate this risk. Remember, interrupts are like bee stings: one or two on occasion is tolerable, but you do not want to ever suffer the swarm.
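The arithmetic below is a rough, invented illustration of the scale involved. The per-interrupt cost and the broadcast rates are assumptions for the example, and real NICs soften some of this with interrupt coalescing.

```python
# A back-of-the-envelope estimate of broadcast-driven interrupt overhead on
# a flat VLAN. All figures are hypothetical, not measured ESXi numbers.
def interrupt_cpu_percent(broadcasts_per_sec: float,
                          cost_us_per_interrupt: float = 10.0) -> float:
    """Approximate % of one CPU core spent servicing broadcast interrupts."""
    busy_us_per_sec = broadcasts_per_sec * cost_us_per_interrupt
    return busy_us_per_sec / 1_000_000 * 100

# A well-segmented VLAN seeing 500 broadcasts/s barely registers, while a
# huge flat VLAN pushing 50,000 broadcasts/s burns half a core on interrupt
# handling before any real work happens.
print(interrupt_cpu_percent(500))      # 0.5
print(interrupt_cpu_percent(50_000))   # 50.0
```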


    5. Storage

    5.1 Dangers of Thin Provisioning

Thin-provisioned virtual disks are disks for which the assigned amount of disk space is not allocated at creation. Only enough disk space is allocated to accommodate the existing data, with a small buffer space added. The VMDK file for the VM will then grow incrementally as more data is saved. The guest operating system sees the full amount that has been assigned to the virtual disk, but the underlying storage is conserved until needed. At first glance, this looks great to new VMware admins. Reducing wasted disk space can be quite tempting, but it comes with a price.

First and foremost, since the virtual disk is dynamically growing as needed in small increments, there is a performance hit as the LUN is actually assigning the storage that the guest OS already thinks it has. While we're on the subject of the guest OS, that brings me to my other point: you are lying to it. As stated before, the guest operating system sees the full provisioned amount, so you are effectively lying to the OS and expecting everything to work out fine. If you want to see a train wreck up close and personal, fill a LUN unexpectedly when there are active, expanding VMDK files. Hint: it is going to come to a screeching halt.

The error-checking mechanisms in the guest operating system cannot detect that the VMFS volume is full or almost full, because the administrator wrote the server a bad check. The guest believes there is enough room left to write, but there is no space left physically to write to. This nightmare scenario can definitely lead to data corruption as VMs crash with partially saved data. In a production environment, it is highly recommended to keep thin provisioning to an absolute minimum or avoid it completely. It is suitable for a test lab or for VMware Workstation, but do not rely on it for a production environment.
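A simple capacity sanity check like the sketch below can flag the "bad check" scenario before it happens. The numbers are hypothetical; you would pull capacity, used, and provisioned figures from your own monitoring or reporting tooling.

```python
# A sketch for spotting a datastore where thin-provisioned VMDKs could
# outgrow the physical capacity. All figures are illustrative.
def datastore_risk(capacity_gb: float, used_gb: float,
                   provisioned_gb: float) -> str:
    """Classify a datastore by how far guests *think* they can still write."""
    if provisioned_gb <= capacity_gb:
        return "safe: provisioned space fits within physical capacity"
    headroom = capacity_gb - used_gb
    promised = provisioned_gb - used_gb   # space guests believe they still have
    return (f"overcommitted: guests expect {promised:.0f} GB more but only "
            f"{headroom:.0f} GB physically remains")

print(datastore_risk(capacity_gb=2000, used_gb=1700, provisioned_gb=3500))
```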

    5.2 Array Misconfiguration

Virtualized environments take advantage of shared storage types such as NAS, iSCSI, and Fibre Channel, ranging roughly from low to high disk I/O. While Fibre Channel storage allows administrators to decide on array configurations, many NAS and iSCSI options do not. This complicates storage purchase decisions for an organization, and several do not successfully avoid the following scenario. In a Fibre Channel setup, the administrator can decide which disks to carve up into multiple arrays with different RAID configurations.

The impact of VM storage I/O is then isolated to the arrays upon which the VM is installed. The ability to break the storage down into these separate arrays and RAID groups prevents a storage I/O bottleneck. However, in the NAS and iSCSI example with a single array, every disk affects every other disk in the storage device. If an iSCSI device, for instance, supports only a single array that the admin formats in RAID 5, any change to any disk is going to be striped across every single disk. The lesson to be learned here when purchasing storage is that it is necessary to properly understand the IOPS and configuration capabilities of the device, not just the capacity and expandability. When organizations do not heed this warning, they hit their IOPS limits long before the storage fills, and they potentially spend heaps of money on temporary storage while reconfiguring the poorly provisioned array.
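As a sizing aid, the sketch below applies the common RAID write-penalty rule of thumb (roughly 2 backend I/Os per frontend write for RAID 1/10, 4 for RAID 5, 6 for RAID 6). The workload mix in the example is invented, and real arrays complicate the picture with caching.

```python
# Backend IOPS estimation using the common RAID write-penalty rule of thumb.
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def backend_iops(read_iops: float, write_iops: float, raid: str) -> float:
    """Backend IOPS the disks must deliver for a given frontend workload."""
    return read_iops + write_iops * WRITE_PENALTY[raid]

# 2,000 frontend IOPS at a 70/30 read/write mix on a single RAID 5 array
# needs roughly 3,800 backend IOPS -- far more than the raw number suggests.
print(backend_iops(read_iops=1400, write_iops=600, raid="raid5"))
```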

    5.3 Alignment Issues

If we think of storage in a virtual environment as three layers, we have the guest operating system partition (such as NTFS), the VMFS/VMDK layer, and the physical disk. Misalignment can occur between the virtual disk blocks (.vmdk) and the underlying physical disk blocks. It is also possible to find misalignment between the guest operating system partition and the underlying virtual disk blocks. Misalignment in the environment causes unnecessary IOPS to be consumed to compensate for the misalignment. Performance degradation ranges from moderate to severe in these instances, depending on the type and scope of the alignment problem. Most newer guest operating systems install with the guest partition properly aligned, but it is still worth checking periodically within the environment. Several organizations, such as Dell, provide free or paid tools to scan vSphere and report this type of storage issue.
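The core of an alignment check is simple modular arithmetic, as in the sketch below. The 4 KiB block size and the example offsets are illustrative; the right divisor depends on your array's stripe or block size.

```python
# A minimal alignment check: a partition (or VMDK) start offset should be an
# even multiple of the underlying block/stripe size, otherwise each guest I/O
# can straddle two physical blocks. Offsets below are examples.
def is_aligned(start_offset_bytes: int, block_size_bytes: int = 4096) -> bool:
    return start_offset_bytes % block_size_bytes == 0

# Legacy Windows partitions often started at sector 63 (63 * 512 = 32256 B),
# which is misaligned; modern installers start at 1 MiB, which is aligned.
print(is_aligned(63 * 512))        # False -> misaligned
print(is_aligned(1024 * 1024))     # True  -> aligned
```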

6. Further Reading

http://blog.ipspace.net/2012/09/dear-vmware-bpdu-filter-bpdu-guard.html
http://blog.ipspace.net/2011/11/virtual-switches-need-bpdu-guard.html
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Network-Technical-Whitepaper.pdf
http://rickardnobel.se/esxi-5-1-bdpu-guard/
http://pubs.vmware.com/vsphere-51/topic/com.vmware.ICbase/PDF/vsphere-esxi-vcenter-server-51-resource-management-guide.pdf