X-Pod for Citrix VDI on UCS with ISE 700 Hybrid Storage Array

Validated Reference Architecture January 2015 X-Pod for VDI Providing a simple yet best-of-breed converged infrastructure for virtualized desktop solutions Enabled by: X-IO ISE 740 Hybrid Storage Array Citrix XenDesktop 7.5 VMware ESX 5.5 Cisco UCS

Upload: x-io-technologies

Post on 20-Jul-2015


X-Pod for Citrix XenDesktop 7.5 2

Table of Contents

Introduction
    Key Takeaway
Executive Overview
    Why VDI?
VDI - Business Benefits
    Flexible Desktop Environment
    Better, Easier Desktop Management
    Desktop by Template
    Security and Compliance
    BYOD Support
Virtual Desktop Implementation Risks
    Performance and Capacity: Not a Trade-Off
    Reliability and Redundancy
    Redefining “Steady-State” Operations
VDI Planning Questions to Ask
Solution Overview
    Components
    Citrix XenDesktop
Solution Architecture
    Hardware Components
    Software Components
    Cisco Unified Compute System (UCS)
    VMware vSphere 5
    XenDesktop Environment Architecture
    Datastores
Performance Analysis of Tested Configurations
    Test Methodology
    Workload Analysis
    Virtual Desktop Deployment Operations
    Virtual Desktop Boot Storm
Conclusion
    Contact X-IO Technologies
Appendix
    Appendix A: IOPS Comparison

Introduction

This white paper provides a primer for the X-Pod for VDI converged infrastructure solution, powered by

X-IO storage. X-Pod is a reference architecture designed to deliver a repeatable, high-performance virtual

desktop infrastructure. It utilizes an industry-leading XenDesktop 7.5 environment, hosted on Cisco

Unified Computing System (UCS) Blade Servers, with storage housed on a single high-performance X-IO

ISE 740 Hybrid Storage Array.

The environment described herein has been extensively tested by X-IO. This document is intended to

provide insight into the components, the architecture, and the performance required to meet the demands

of a 450- and 500-seat virtual desktop infrastructure (VDI) deployment.

Key Takeaway

This paper details the extensive testing that X-IO has performed and

provides guidance on the performance requirements of various operations

based on the X-Pod core components of Cisco UCS server and networking

hardware together with the X-IO Intelligent Storage Element (ISE) systems.

This validated testing demonstrates that this solution is capable of

delivering a high-performance desktop experience at up to 95%

concurrency.

This paper is designed to act as both a primer for XenDesktop deployments

and as a technical overview of the X-Pod architecture. Those readers who

are comfortable with the business and technical benefits of deploying

XenDesktop and who understand the pitfalls of implementing such an

infrastructure should progress to the Solution Overview section of this

document.

Executive Overview

When discussing VDI architectures with customers and solution providers, X-IO has noticed two

concerns:

• Whether the technical statistics and benchmarks provided are “real-world” or marketing-hyped numbers.

• Whether VDI architects should be utilizing hyper-converged systems or using best-of-breed components.

When it comes to VDI design, there are many possible pitfalls. This has led to a frightening statistic: for

every successful VDI implementation, 7 to 10 fail, the vast majority of these due to an inappropriate

storage design. X-IO has therefore worked with real-world benchmark tools such as Login VSI to ensure

that any benchmarks carried out are true reflections of user actions rather than synthetic workloads

applied by I/O testing tools. All of the user counts reflected in this document are high-utilization

loads and, if anything, are worst-case numbers rather than optimistic figures based upon assumptions or

unrealistic scenarios (e.g., zero user data or ridiculously high data de-duplication forecasts).

Regarding the argument for hyper-converged systems, the systems out there today undoubtedly have

their place; however, they need to be precisely architected for today’s workload and are relatively

inflexible for future growth or change of use. This has often led to a difficult decision for architects as to

whether to endure the complexity and risk of using best-of-breed components or to compromise and

deploy a pre-converged, collapsed stack. With the arrival of converged solutions such as X-Pod, best-of-

breed components can be deployed, but using pre-architected and tested blueprints that give the peace

of mind previously found only with hyper-converged systems.

Why VDI?

Virtualization of servers and IT infrastructure has been well established in the business landscape.

Operating expense (OPEX) cost reductions are routine in a virtualized data center due to the reduced

number of physical servers, more centralized management tools, energy savings, and many other factors.

These cost savings make desktop virtualization a promising opportunity to transform a major cost for IT

organizations. Unfortunately, virtualized desktop workloads are unlike most of the server virtualization

workloads with which organizations have experience. While sizing is commonly based on “steady-state”

desktop operations (for example, 10-20 IOPS/desktop), it is the “non-standard” operations and

misconfigured sizing that eventually cause poor user experience and make VDI such a

challenging, and expensive, solution to design and implement.

By leveraging desktop virtualization solutions such as Citrix XenDesktop, IT organizations can provide

their customers with a superior desktop experience while decreasing management cost and increasing

flexibility of the organization. IT organizations can now increase the leverage of their IT staff by

consolidating desktop management and hardware into the enterprise virtualization computing model. For

example, relatively few IT staff can manage hundreds, even thousands, of desktops for patch

management or application upgrades. When tools such as Machine Creation Services or Provisioning

Server manage image deployment, gold images can be quickly reverted to a known state if required,

and capacity consumption is reduced because only data changes are stored for each snapshot. By contrast, traditional

methods of patch management require excessive involvement from the IT organization across hundreds,

even thousands, of physical machines that could be spread around the world.

In addition to desktop virtualization, Citrix XenApp is available to virtualize applications as well. End users

can be provisioned individual applications that can be delivered to any device, whether that is a laptop,

tablet, or smartphone. As the “bring your own device” (BYOD) phenomenon continues to grow, users

demand more flexible access to the resources they need to perform their jobs. Policies that prohibit

employees from using personal devices, which the company does not pay for, are becoming more

difficult to defend.

Implementing a virtual desktop solution can be immensely beneficial to the organization, but if

implemented incorrectly, it can be an equally impressive failure. User experience is the “gold standard”

that any solution will be held to, because users will evaluate the new solution based on the physical

desktop that was just replaced. Proof of Concept (POC) testing is essential when evaluating any new

solution, because the process of configuring the test gives essential information about how the system

will perform. Tools such as Login Virtual Session Indexer (Login VSI™) are critical for simulating “like”

production workloads and are a worthwhile investment as part of evaluating different solutions.

Inadequately designed VDI implementations often perform well in the POC phase but are overwhelmed

when placed into production. In a majority of these cases, storage performance is the cause of poor

user experience because the storage was undersized for performance. Storage can also be the most expensive single

component of any VDI solution.

VDI—Business Benefits

Properly sized and implemented, a VDI project can provide users with a “whole” desktop experience that

surpasses their physical desktop machine. The benefits to the business are similar to those of server

virtualization but are applied at an order of magnitude greater scale. Enterprise-class computing hardware

can be applied to end-user computing (reliability/availability/performance) and can leverage large-scale

virtualization management techniques (capacity/scale). This can allow for greater leveraging of existing IT

staff, enable higher levels of reliability (including disaster recovery), and reduce overall ongoing costs to

the business.

While VDI can be a tremendous advantage for the business, data storage costs are the single largest

cost component of the solution and the most common cause of poor user experience and failed

implementations. Listed below are some advantages to the business and the role that storage has to play

in each.

Flexible Desktop Environment

Many types of desktop needs are seen across various organizations and industries. Based upon the load

pattern and use case, different kinds of desktop virtualization options are available. In fact, it is likely that

multiple types of implementations exist in a single, larger organization.

For example, the resource requirements for graphic designers, software developers, Microsoft Office

power users, and executives are quite different from the requirements for a call center or kiosk desktop.

The power users will generally have higher CPU and memory requirements and will likely require that the

desktop be persistent. Power users will expect any changes to their desktop to be preserved, leaving it in

the same state as when they logged off.

In a call center (or kiosk) configuration, users expect to log in to any desktop and have the exact same

experience with nothing preserved on the desktop. If one desktop has an issue, the user simply moves to

another desktop instance and tries again. There is usually a standard suite of software that these virtual

desktops use to support the business functions. This is especially true if, for example, a cloud-based

customer relationship management (CRM) service or Office 365 is being used.

Better, Easier Desktop Management

With physical desktops, the IT organization in some cases has to physically touch each desktop

to remediate a problem. This geographic dispersion, even if within the same office building,

increases the staffing requirements to manage the end-user desktops and increases the time required for

“break/fix” functions. There is also an increased risk of data theft when users store corporate data on

physical desktops that reside anywhere in the organization. Desktop virtualization allows for centralized

management of the most commonly touched component in the desktop solution, the desktop itself.

Backups can also be more effective, since all user data is kept in the datacenter, even though it looks to

the user like an attached physical disk housed on local storage. This eliminates many of the issues seen

with trying to back up hundreds and thousands of remote physical desktop machines. Network

interruptions, issues with individual operating system bugs, failing physical hard drives, and so on are all

mitigated by using a common storage platform. In addition, Citrix Profile Management provides the ability

to store necessary user preference data to a network location for backup, management, and replication.

VMware has for years provided a rich API to enable storage partners to better integrate with the VMware

ecosystem. This enables the virtualization administrator (vAdmin) to quickly and efficiently assign storage

resources where required while the complex provisioning of storage resources up through the different

virtualization and hardware layers is done by software. This greatly increases the amount of storage the

vAdmin can effectively manage and ensures that best practices are followed for storage configuration,

thereby reducing risk. With the addition of the ISE Manager plug-in for the vCenter console, the vAdmin

has additional insight into the performance and health of the underlying storage without having to view an

additional console.

Desktop by Template

IT organizations can create a library of virtual desktop templates, each of which can be carefully tuned

and configured for the business need. These templates can then be used to provision virtual desktop

machines very quickly, in minutes in most cases.

For example, if an organization routinely uses contractors for a few different types of functions, a “gold

image” template can be created for each, which can include the appropriate operating system, security

settings, and all the applications contractors would need to accomplish their tasks. When a new contract

employee is added, the IT staff need only deploy a new virtual desktop based upon the appropriate

template. This can be done literally in minutes, but this can also generate an enormous amount of storage

traffic—or I/Os per second (IOPS)—with a relatively small number of deploy operations.

NOTE: Deploying a desktop is one of the most performance-demanding operations that storage will

encounter in a desktop virtualization solution. Performing these operations during “steady-state”

operations can put an abnormally large strain on the storage infrastructure and can impact the experience

of all other users on the system.

Security and Compliance

VDI allows an IT organization to have much better control over corporate data. As such, it makes the job

of implementing consistent, common security and compliance features easier. Depending upon the

organization’s industry, compliance and security concerns can be important or strictly mandated.

With physical desktops and mobile devices, corporations face the increased risk of data loss and theft of

any device where data is stored “locally.” Desktop virtualization enables employees to work with data

securely on centralized corporate resources. In some cases, users can work from any device in any

location with their data meeting the security requirements of the organization.

Depending on the role of the desktop users, certain data regulations (e.g., financial, legal, and medical)

that require data separation and isolation may apply. This requires separate storage devices that can

support the high IOPS and capacity levels that the separated pools require. Modular-based storage

systems are inherently designed to accommodate this data security requirement.

Citrix XenDesktop gives an administrator the flexibility to assign security features such as

preventing the saving of data to a local resource or the copying and pasting of data from the XenDesktop

session to the local resource. With a Citrix NetScaler added to the infrastructure, the admin can

allow or prevent these features based on the identity of the source device or the network entry point. For

example, an internal user can be allowed to save to a local resource but denied that option when connecting

from a personal device.

BYOD Support

Increasingly, organizations are supporting (and are being asked to support) “bring your own device”

(BYOD) functionality. The workforce is changing, and the growing expectation is that applications and

desktop access will be available on whatever device the employee has, from laptop to tablet to

smartphone. This also has an immense benefit to the business, as the employee is providing a

“preferred” end-point device on which to be productive.

Citrix StoreFront combined with Citrix Receiver offers a very rich interface that allows users to access

their applications and desktop on any mobile device. Individual applications published in Citrix XenApp

enable the employee to quickly gain access to an application outside of the desktop, allowing for easier

access to applications on the go.

Virtual Desktop Implementation Risks

While the list of benefits that can be realized by VDI implementation, both in reduction of manpower and

operating expenses (OPEX), is impressive, the opportunities for failure are equally concerning. This

section lists several areas that can put a VDI initiative at risk. In all cases, sufficient planning for the final

production load and careful monitoring of operations are required to ensure smooth operations.

Performance and Capacity: Not a Trade-Off

Performance has a direct effect on end-user experience, and poor storage performance sizing is usually

the number one reason for a poor experience. Many VDI solutions are sized only for “steady-state” performance requirements.

Common VDI operations, such as boot storms, login storms, deploy operations, recovery, and other

“maintenance operations,” can require an enormous amount of transactional performance (IOPS) from

storage systems. VDI instances have to respond with little or no discernable impact to the end user when

these maintenance operations are conducted.

Performance is not the only thing to consider when sizing a VDI solution, as there must be enough

capacity to satisfy the requirements for user data, operating systems, application data, and user

personae. While there are several techniques to increase the amount of effective capacity a solution can

provide, there is still an underlying requirement for capacity and performance. Unlike regular “file server”

class capacity, this data has performance requirements that go across all of the data (e.g., there is little

“dead data”). This broad requirement means that while there may be areas of high-performance

concentration (base images or replicas), the rest of the data has a high IOPS/TB requirement as well.

The rest of the data is what the individual VDI instances have to work with, so any slowdown in accessing it

will directly affect the end-user experience. Solutions that rely on calculated deduplication and

compression techniques will see degrading performance ratios as the capacity is consumed, as there are

limited amounts of processing power and memory capacity in the storage controllers (i.e., the number of

calculations increases with increasing capacity).

According to VDI support organizations, storage systems account for 80 to 90% of the VDI performance

issues reported. Often this is due to read and write latency in the storage system.

Reliability and Redundancy

No virtualized datacenter is sustainable without reliability and redundancy. In the past, a disruption to services running in

the data center did not necessarily represent a disaster. For example, a

service event affecting the CRM applications or email services would be painful to the organization if they were

sluggish, non-responsive, or unavailable for a period of time, but users could still use their desktops for other

productive functions, such as Microsoft Office applications or researching on the internet.

Virtual Desktop Infrastructures, on the other hand, can represent a true disaster if they become sluggish

or non-responsive. The user is no longer impacted by only one or two applications but by the very

desktop itself. Imagine the majority of the workforce reduced to typing 10 or 20 words per minute or facing login

times in excess of 30 minutes. Any slowdown in the VDI ecosystem has a very “public” result. Storage

systems incur a tremendous performance penalty when performing disk drive recovery operations, and

these have a dramatically negative impact on the end-user experience. Even if data is separated into

“groups” or “sets,” centralized storage controllers are now responsible for managing recovery operations

in addition to ongoing operations. Storage companies would not generally consider this a “single point of

failure,” but users will.

Redefining “Steady-State” Operations

When sizing a VDI environment, many different operations must be planned for. The low-latency

experience that hosted desktop users expect during “steady state” must also be delivered during maintenance

operations. Boot storms and login/logout storms involving relatively few users can push a storage

system designed only for “steady state” well past its breaking point. These operations are part of the

normal desktop access activity for users, and as such they should be included in any storage sizing

discussion for “normal” operations.

In testing with Login VSI, storage performance (IOPS) required during the user login phase was four

times the “steady-state” workload. Considerable thought should be given to user behavior patterns and

how many users will be concurrently initiating connections to the desktop environment. Login VSI is

capable of adjusting the rate at which users log in to the environment for the test and is an excellent

method for exploring the performance of the solution during various login activity levels. In the testing

X-IO performed, user login rates of 9/min and 33/min were tested. The single greatest impact to the login

times was the high compute (CPU and RAM) utilization of the UCS blade servers when the majority of the

users had connected to the solution.
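
The sizing argument above can be made concrete with a small calculation. The following is a minimal sketch, not a sizing tool: it uses only the figures quoted in this paper (10-20 IOPS per desktop at steady state, a login phase at four times the steady-state workload, and login rates of 9/min and 33/min). The function names, the 500-seat and 95% concurrency inputs, and the simplification that every active desktop is in the login phase at the same moment are assumptions made for illustration; they are not measured results.

```python
# Illustrative sizing sketch. All function names are hypothetical, and the
# "every desktop logs in at once" simplification is a worst-case assumption.

def active_desktops(seats: int, concurrency_percent: int) -> int:
    """Desktops in use at peak, using integer math to avoid float rounding."""
    return seats * concurrency_percent // 100


def steady_state_iops(desktops: int, iops_per_desktop: int) -> int:
    """Aggregate IOPS when every desktop is doing routine work."""
    return desktops * iops_per_desktop


def login_phase_iops(desktops: int, iops_per_desktop: int, multiplier: int = 4) -> int:
    """Aggregate IOPS if the login phase costs `multiplier` times steady state."""
    return steady_state_iops(desktops, iops_per_desktop) * multiplier


def login_ramp_minutes(desktops: int, logins_per_minute: int) -> float:
    """Minutes needed to log in all desktops at a fixed login rate."""
    return desktops / logins_per_minute


if __name__ == "__main__":
    desktops = active_desktops(seats=500, concurrency_percent=95)
    for per_desktop in (10, 20):
        print(
            f"{per_desktop} IOPS/desktop: "
            f"steady state {steady_state_iops(desktops, per_desktop)} IOPS, "
            f"login phase {login_phase_iops(desktops, per_desktop)} IOPS"
        )
    for rate in (9, 33):
        print(f"{rate} logins/min: {login_ramp_minutes(desktops, rate):.1f} minutes to log in")
```

A design sized only for the steady-state line of this output would fall well short of the login-phase line, which is the gap this section warns about.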

VDI Planning Questions to Ask

Proper planning is essential to a successful VDI roll-out. Here is a non-exhaustive set of questions to help

guide your investigation into a VDI pilot program and your full production program.

• What are the different kinds of users you will have to support (e.g., kiosks, developers, power users, tellers, knowledge workers)?

• What is the scope of each user type?

• How many simultaneous desktops will be required?

• What are the expected resource demands for each user (e.g., CPU, memory, disk space, network traffic)?

• What is the planned concurrency for your user pools, and will their usage be offset from each other (e.g., shifts, geographical support)?

• What are the relative benefits of virtualizing desktop access for each user type?

• What are the relative risks for each user type if there are access issues?

• What existing infrastructure can be used for the VDI implementation?

• For each user type, would persistent or non-persistent desktops be more appropriate?

• For each user type, would provisioned VMs, full clones, or dedicated virtual machines be more appropriate?

• What existing IT management and monitoring tools do you have in place?

• Will your infrastructure support iSCSI? Fibre Channel? Is either preferable to you?

• Will the pilot program be based on actual users or will it use validation software like Login VSI?

• For the pilot program, what are your success criteria?

• How much will it cost to extend the warranty beyond the base support period?

• What metrics are important to you? For example:

    o Total number of IOPS

    o Total throughput

    o Recompose wall-clock duration

    o Maximum latency as seen by end user

    o Virtual machine boot time

    o Login / logout times

• How will the transition from pilot program to full production implementation be done?
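
The questions about simultaneous desktops, planned concurrency, and offset usage can be explored with a simple model before a pilot begins. The sketch below is one hypothetical way to do that: the `UserPool` class, the pool names, the user counts, the concurrency percentages, and the shift hours are all invented for illustration and do not come from X-IO testing. It estimates how many desktops are active in each hour of the day and reports the peak.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class UserPool:
    """A hypothetical group of users sharing one desktop type and one shift."""
    name: str
    users: int
    concurrency_percent: int  # share of the pool logged in during its shift
    shift_start: int          # hour of day, 0-23
    shift_end: int            # hour of day (exclusive); may wrap past midnight

    def active_at(self, hour: int) -> int:
        """Desktops this pool needs during the given hour of the day."""
        if self.shift_start <= self.shift_end:
            on_shift = self.shift_start <= hour < self.shift_end
        else:  # shift wraps past midnight, e.g. 22:00 to 06:00
            on_shift = hour >= self.shift_start or hour < self.shift_end
        return self.users * self.concurrency_percent // 100 if on_shift else 0


def peak_concurrent_desktops(pools: List[UserPool]) -> Tuple[int, int]:
    """Return (peak desktop count, first hour of the day at which it occurs)."""
    per_hour = [sum(pool.active_at(hour) for pool in pools) for hour in range(24)]
    peak = max(per_hour)
    return peak, per_hour.index(peak)


# Example pools; every number here is an assumption for illustration only.
pools = [
    UserPool("call center, day shift", 300, 90, 8, 16),
    UserPool("call center, evening shift", 200, 90, 16, 24),
    UserPool("knowledge workers", 150, 80, 9, 17),
]

if __name__ == "__main__":
    peak, hour = peak_concurrent_desktops(pools)
    total = sum(pool.users for pool in pools)
    print(f"{total} users in total, peak of {peak} concurrent desktops at {hour}:00")
```

Because the shifts in this example are offset, the peak is well below the total user count, which is why the concurrency and shift questions matter for sizing.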

Solution Overview

This X-Pod reference architecture white paper describes a compact configuration to deliver a high-

performance VDI environment that supports up to 850 virtual desktops. In the following sections, Cisco

UCS, Cisco Nexus, Cisco MDS, VMware vSphere, Citrix XenDesktop, and the ISE hybrid storage array

are described.

Components

The following components were chosen for this X-Pod reference architecture.

Cisco Unified Computing System (UCS)

The underlying premise of a VDI solution is to run user desktops on powerful datacenter servers rather

than on distributed physical machines. Cisco has focused on the characteristics needed to support this

functionality in datacenter servers and has developed the following innovations:

• Extended memory

• Virtualization optimization, with Cisco VN-Link technology

• Unified I/O access and unified fabric

• Unified, centralized management

• Service profiles

The bottom line is that Cisco has developed and refined the Unified Computing System specifically to

meet enterprise VDI requirements. Simplified architecture and management geared toward

datacenter fulfillment of virtual desktops lead to reduced total cost of ownership (TCO) by lowering

acquisition costs and ongoing operating costs.

The UCS unites compute, network, storage access, and virtualization into a cohesive system. The system

is integrated on a low-latency, 10-Gigabit Ethernet (10GbE) unified network fabric with enterprise-class,

x86-architecture servers. It is an integrated, multi-chassis platform in which all resources participate in a

unified management domain. The Cisco UCS accelerates the delivery of new services simply, reliably,

and securely through end-to-end provisioning and migration support for both virtualized and non-virtualized

systems.

For more on UCS:

http://www.cisco.com/c/dam/en/us/solutions/collateral/data-center-virtualization/unified-computing/at_a_glance_c45-523181.pdf

Cisco Nexus

The Cisco Nexus 5548P is a one-rack-unit (1U), 1 GbE, 10 GbE, and FCoE access-layer switch built to

provide 960 Gbps of throughput with very low latency. It has 32 fixed 1 GbE or 10 GbE ports that accept

modules and cables meeting the Small Form-Factor Pluggable Plus (SFP+) form factor. One expansion

module slot can be configured to support up to 16 additional 1 GbE or 10 GbE ports, or eight Fibre

Channel ports plus eight 1 GbE or 10 GbE ports. The switch has a single serial console port and a

single out-of-band 10/100/1000-Mbps Ethernet management port.
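
The 960 Gbps figure is consistent with the port counts given above if throughput is counted across every port at 10 GbE in both directions. That full-duplex counting convention is an assumption of this sketch; it is not stated in this paper. The function name is likewise illustrative.

```python
# Worked arithmetic for the switch throughput figure. The duplex factor of 2
# (counting transmit and receive separately) is an assumption of this sketch.

FIXED_PORTS = 32        # fixed ports on the switch
EXPANSION_PORTS = 16    # maximum additional ports from the expansion module
PORT_SPEED_GBPS = 10    # every port running at 10 GbE


def aggregate_throughput_gbps(ports: int, speed_gbps: int, duplex_factor: int = 2) -> int:
    """Total switching throughput with all ports busy in both directions."""
    return ports * speed_gbps * duplex_factor


if __name__ == "__main__":
    total_ports = FIXED_PORTS + EXPANSION_PORTS
    total = aggregate_throughput_gbps(total_ports, PORT_SPEED_GBPS)
    print(f"{total_ports} ports x {PORT_SPEED_GBPS} Gbps x 2 directions = {total} Gbps")
```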

Cisco MDS

Cisco MDS 9148 Multilayer Fabric Switch is a high-performance Fibre Channel switch platform. It

provides low power consumption and high density, with up to 48 line-rate 8 Gbps ports in one rack unit

(1U).

Citrix XenDesktop

Citrix XenDesktop delivers Windows applications and desktops as secure mobile services. With

XenDesktop, IT can mobilize the business while reducing costs by centralizing control and security for

intellectual property. Incorporating the full power of XenApp, XenDesktop can deliver full desktops or just

the applications to any device. HDX technologies enable XenDesktop to deliver a native touch-enabled

look-and-feel that is optimized for the type of device as well as the network.

Citrix XenDesktop provides the best method available today to provision, host, manage, and deliver

virtual desktops and applications. Citrix has the benefit of 25 years of advanced expertise in optimizing

remote protocols and virtualizing user workloads.

Citrix XenDesktop greatly enhances the user experience by providing high-performance, 24x7 access

to all corporate applications and data from anywhere and from any type of device (PC, Mac, Linux,

tablets, iOS, Android, etc.).

For the IT department, Citrix centralizes management in dramatically more efficient ways. Rather than

maintaining the legacy approach of installing and managing operating systems, applications, patches,

and customizations on every single device, Citrix allows a “One to Many” solution. A single desktop image

can serve hundreds, thousands, or hundreds of thousands of users, providing a single point of update for

the entire IT environment. At the same time, security and performance are greatly improved by keeping

the applications close to the data in the secure boundary of the data center.

X-IO Technologies Intelligent Storage Element (ISE)

Consolidation and business intelligence are key themes in today’s IT. Consolidation brings with it

challenges in server multi-tenancy and hosted desktops while database management systems become

the keys to a successful business. In both cases, fast and reliable solutions lead to a more productive and

profitable enterprise. The ISE 700 series hybrid storage system keeps pace with the performance

demands of today’s IT without the high cost it takes traditional storage systems to keep up. The ISE 700

series provides an ideal balance of price, performance, capacity, and reliability by combining SSD and

HDD into a single hybrid pool of capacity to provide SSD performance at HDD pricing. ISE outperforms

systems that are up to ten times more expensive and provides an outstanding TCO by reducing operating

costs associated with management, power, cooling, and datacenter footprint.

ISE Manager Suite

Enterprise storage is facing not only a challenging future but also a challenging present. Today’s

datacenters are increasingly heterogeneous and multifaceted. Virtualization technologies, diverse storage

platforms, and cloud services create obstacles for traditional storage systems and storage management.

How can you as a storage administrator be expected to efficiently and effectively manage storage in such

a complex environment? The answer lies in your management tools, which must have deep integration

with virtualization technologies and host operating systems while being precise, streamlined, and user-

friendly.

Ideal for the challenging storage scenarios of modern enterprises, ISE Manager 4.0 is the solution for

today and tomorrow. It is an intuitive, flexible interface that provides simplified end-to-end storage

management for multiple physical, virtual, and cloud environments from a single interface. It lets you

simplify, centralize, and automate storage administration with software tailored to modern datacenters.

Figure 1 - ISE Manager Suite

Login VSI

Login Virtual Session Indexer (Login VSI) is the industry-standard load

testing tool for virtualized desktop environments. Login VSI can be used to

test the performance and scalability of VMware Horizon View, Citrix

XenDesktop and XenApp, Microsoft Remote Desktop Services (Terminal

Services), or any other Windows-based virtual desktop solution. Login VSI

may be used to compare and validate the performance of different software

and hardware solutions in an environment. Login VSI provides a method to

measure the maximum capacity of an infrastructure. Simulated users work

with the same applications as an average employee, such as Word, Excel,

Outlook, and Internet Explorer.

For more information, download a trial at www.loginvsi.com.

Solution Architecture

This section highlights the hardware and software configurations used to assemble this reference

architecture for 350 and 500 virtual desktops delivered with Citrix XenDesktop 7.5 on vSphere 5.5 U1.

This environment was built on top of two Cisco UCS B-Series chassis, Cisco networking components,

and X-IO 700 Series hybrid storage arrays.

Figure 2 shows the logical diagram of the solution architecture for 350 and 500 hosted desktops. The

same infrastructure was used for all three tests.

Figure 2 - Logical Diagram for 350 and 500 Desktop Reference Architecture

Hardware Components

The following hardware components were leveraged to support the Login VSI test of the initial 500 and

750 desktop stress tests and the subsequent 350 and 500 VDI desktop loads.

Hardware | Quantity | Configuration
Servers
Cisco UCS 5100 B-Series Chassis 1 | 1 | 350 Desktop Cluster “Standard Performance”
2208XP Fabric I/O Extenders | 2 |
Cisco UCS B200 M3 | 1 | Two Intel Xeon E5-2680 2.7-GHz CPU (16 cores total), 128 GB RAM, Infrastructure Blade
Cisco UCS B200 M3 | 7 | Two Intel Xeon E5-2680 2.7-GHz CPU (16 cores total), 128 GB RAM, vSphere desktop cluster
VIC 1280 | 8 |
Cisco UCS 5100 B-Series Chassis 2 | 1 | 500 Desktop Cluster “High Performance”
2208XP Fabric I/O Extenders | 2 |
Cisco UCS B200 M3 | 8 | Two Intel Xeon E5-2697 2.7-GHz CPU (24 cores total), 256 GB RAM, vSphere desktop cluster
VIC 1280 | 8 |
Cisco UCS 5100 B-Series Chassis 3 | 1 | Login VSI Infrastructure
2208XP Fabric I/O Extenders | 2 |
Cisco UCS B440 | 1 | Two Intel Xeon E7-4870 2.4-GHz CPU (24 cores total), 256 GB RAM, Login VSI Server
UCS-VIC-M82-8P | 1 |
Networking
Cisco Nexus 5548 | 2 |
Cisco MDS 9148 | 2 | 8 Gb/s Fibre Channel Switch, 2 ports per ISE 700
Cisco UCS 6248 Fabric Interconnect | 2 |
Storage
X-IO ISE 710 Hybrid Storage Array | 1 | 8 Gb/s Fibre Channel – for Boot from SAN Array
X-IO ISE 730 Hybrid Storage Array | 1 | 8 Gb/s Fibre Channel – for Login VSI Share
X-IO ISE 740 Hybrid Storage Array | 1 | 8 Gb/s Fibre Channel – for all XenDesktop machine catalogs

Software Components

See the table below for software details.

Software | Version
vSphere
ESXi | 5.5 update 1
vCenter Server | 5.5 update 1
Operating System | Windows Server 2008 R2 64-bit Standard Ed.
Microsoft .NET | 3.5 SP1
Microsoft SQL Server | 2008 R2
Citrix XenDesktop
Desktop Controller | 7.5
Operating System | Windows Server 2008 R2 64-bit Standard Ed.
Provisioning Services | 7.1.3
Operating System | Windows Server 2008 R2 64-bit Standard Ed.
Microsoft Software Platforms
Active Directory, DNS, DHCP | Windows Server 2012
Login VSI, VSIshare Server
Operating System | Windows Server 2008 R2 64-bit Standard Ed.
Microsoft .NET | 3.5
Login VSI | 4.0
Virtual Desktops: Target Desktop
Operating System | Windows 7 32-bit
Microsoft Office | 2010
Adobe Reader | v. 11
Java SE | 7 U13
DoroPDF |
Citrix Receiver | 4.1.2
Virtual Desktop: Launch Desktop
OS | Windows 7 32-bit

Cisco Unified Computing System (UCS)

The Cisco UCS configuration was connected to the X-IO ISE 700 storage arrays through dual, redundant paths to the MDS 9148 switches, which were in turn connected to the Cisco 6248UP fabric interconnects with dual, redundant 10 Gbps connections. This design provides both high availability and high performance.

Each Cisco UCS chassis was connected to each fabric interconnect with four 10-Gbps network

connections.

The dual fabric interconnect pairs have primary and subordinate roles in this configuration. For more

information about optimizing and configuring the Cisco UCS server service profiles for VDI, please refer to

the following link:

http://www.cisco.com/go/unifiedcomputing

VMware vSphere 5

For this test configuration, the VMware vSphere 5.5 Update 1 ESXi hypervisor was deployed. All of the hosts were set up to boot from SAN (optional) using the X-IO ISE 710 hybrid storage array.

Clusters

The environment is organized into three main components: the VDI clusters, the management cluster, and the Login VSI Launcher cluster. The two VDI clusters encompass a total of 15 UCS blades, support all target virtual desktops, and were tested until server resources were saturated. The first VDI cluster of 8 nodes was tested at 500 desktops and found to run 350 target VDI desktops proficiently. The second VDI cluster of 8 nodes was tested at 750 desktops and found to run 500 target VDI desktops proficiently. One UCS blade (in its own cluster) runs all infrastructure functionality, including a Domain Controller, vCenter, vCenter Operations Manager, and all XenDesktop infrastructure servers. The last cluster, the Login VSI Launcher cluster, is a UCS blade that runs 80 Login VSI Launcher desktops.

VDI Clusters

The two clusters under test in this reference architecture are the VDI clusters. They consist of 16 hosts,

as described in the “Hardware Components” section above. These work together to support the test loads

of 350 and 500 virtual desktop machines.

Figure 3 - VDI Cluster for 350 Desktops

Figure 4 - VDI Cluster for 500 Desktops

Infrastructure

The Infrastructure UCS blade houses all virtual machines to support the vSphere environment, the Citrix

XenDesktop environment, and the Login VSI testing environment.

Figure 5 - Infrastructure VM for Reference Architecture Validation

XenDesktop Environment Architecture

The VDI environment in this solution is created and managed in XenDesktop 7.5. For the Provisioning

Services test, version 7.1.3 was used. Four Desktop Groups were created with four corresponding

Machine Catalogs: an MCS and PVS group each for the high-performance cluster and an MCS and PVS

group each for the standard performance cluster.

For the Machine Creation Services tests, MCS was used from within the Desktop Controllers to deploy

and update the VMs. For the Provisioning Services tests, two PVS servers were used to deploy and host

images for the VMs. Both had a local store attached as a hard drive to each server, and images were

replicated between servers.

Datastores

The virtual machines in the 350 desktop VDI cluster were stored on four datastores, which in turn reside on four ISE LUNs. Each LUN has 1 TB of total capacity, is configured with RAID-1 protection, and is housed on a single ISE 740 hybrid storage array, as shown in the following ISE Manager and vSphere Client figures.

Figure 6 - Datastores for 350 Desktop Groups

Figure 7 - LUNs for 350 Desktop View Pools (ISE Manager)

The 500 VM Desktop Group is configured in the same manner. Below is a screen shot of the datastores as

they are configured.

Figure 8 - Datastores for 500 Desktop View Pools (vSphere Client)

Target VMs

Base Image
OS | Windows 7 32-bit, Enterprise
VM Hardware Version | 8
vCPU | 1
Memory | 1.5 GB
HD – MCS | 24 GB (Shared between VMs for OS)
HD – PVS | 6 GB

The target desktop virtual machines were created from a base image built on a stock Windows 7 32-bit Enterprise operating system. Windows updates were applied to bring it up to current levels. Several applications were then installed, including Microsoft Office, Adobe Acrobat Reader, and FreeMind. The Citrix XenDesktop agent was installed on the gold image, and a snapshot (for MCS) or an image upload (for PVS) was created.
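As a rough sizing check, the per-VM disk sizes listed above can be multiplied out against the four 1 TB LUNs described under Datastores. The following is a minimal sketch rather than a sizing tool: the function names are hypothetical, binary units are assumed, every PVS cache disk is assumed to be fully allocated, and MCS differencing disks are left out because their size depends on how much each desktop writes.

```python
# Rough capacity estimate from the figures in this section (illustrative only).
GB_PER_TB = 1024  # assumption: binary units


def pvs_write_cache_gb(desktops, cache_gb=6):
    """Total PVS cache-disk space if every 6 GB cache disk is fully allocated."""
    return desktops * cache_gb


def mcs_fixed_gb(desktops, luns=4, master_gb=24, identity_mb=16):
    """MCS space that does not depend on user activity.

    One 24 GB master image copy per LUN plus a 16 MB identity disk per
    desktop. Thin-provisioned differencing disks are NOT included.
    """
    return luns * master_gb + desktops * identity_mb / 1024


capacity_gb = 4 * 1 * GB_PER_TB  # four 1 TB LUNs per desktop group

for n in (350, 500):
    pvs = pvs_write_cache_gb(n)
    mcs = mcs_fixed_gb(n)
    print(f"{n} desktops: PVS cache {pvs} GB "
          f"({pvs / capacity_gb:.0%} of LUN capacity), MCS fixed {mcs:.1f} GB")
```

Under these assumptions, fully allocated PVS cache disks would occupy 2,100 GB for 350 desktops and 3,000 GB for 500 desktops, both within the 4 TB presented to each desktop group.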

X-IO ISE 740 Hybrid Storage Array

The ISE 740 hybrid storage array provides 28.8 TB of usable capacity, combining mission-critical 10K RPM SAS drives and enterprise-grade MLC SSDs into a single pool of flash-enabled storage. The ISE 740 is fully redundant with active-active controllers, each including four 8 Gb/s Fibre Channel ports. The ISE 700 Series includes patented Continuous Adaptive Data Placement (CADP) software, which analyzes the behavior of host I/O and automatically places hotspot data onto SSD only if measurable performance gains will be achieved. CADP runs continuously and makes data movement decisions every 5 seconds.

In each of the user tests the ISE 740 delivered low-latency read and write transactions for the entire

duration of the tests and project.

Figure 10 - The ISE 740 Hybrid Storage Array (VMware Ready and Cisco Compatible)

Performance Analysis of Tested Configurations

Test Methodology

While storage vendors have for years utilized synthetic benchmark tools (Iometer, SQLIO, fio, iozone) to

simulate performance loads, nothing can provide more insight into performance requirements than a

testing tool that replicates actual end-user usage. Testing in this “systems view” methodology allows for

many different facets of the solution to be evaluated, as different virtual desktop operations have

drastically different requirements from storage. Simply using a load generator to show the performance

possible from a storage array and somehow relating it to desktop virtualization workloads completely

ignores the challenges that are unique to this solution design.

Login VSI was used as the load generation tool, as this is capable of mimicking end-user functions, such

as working with Microsoft Office applications, running Java, browsing web pages, and other common user

functions. If the console is left open in one of the target desktops, this activity can be watched as the test

progresses. Login VSI provides a valuable framework to gather much more information than just the main

workload run, as will be detailed in the sections below. Other virtual desktop management operations

were also performed as part of the setup and environment maintenance throughout the testing period.

Performing these actions proved invaluable to learning about the different workloads involved in the

solution.

Determining the scale of the ISE 740 storage array was one of the goals of the testing. Login VSI test runs were first performed with 500 and 750 active users to determine maximum capacity. Further testing was then done within capacity, and server resource saturation was reached at 350 and 500 active user desktops, respectively. The “office worker” workload setting was used for this testing series because, per Login VSI, it can be considered an average workload for a virtual desktop user. Login VSI measures the end-user desktop experience and produces a metric, VSImax, that indicates the number of desktops a given solution can support with acceptable performance. The point at which VSImax is reached is the estimated number of desktops the solution can be expected to support. In all testing performed, Cisco UCS CPU utilization was the main limiting factor to achieving higher numbers of desktops. The ISE 740 accommodated all of the tested user levels with no signs of approaching a performance limit.

Workload Analysis

Machine Creation Services Versus Provisioning Services

The primary focus of the workload analysis will be on how VMs deployed with Machine Creation Services

(MCS) use resources compared to VMs deployed with Provisioning Services (PVS). Both tools are

available for Citrix XenDesktop but operate in different manners that affect how they perform on similar

infrastructures.

To use MCS for the creation and management of your VDI objects, you create a gold image and then take a snapshot of it. To deploy your objects, you tell MCS which snapshot to use and on which defined storage space. MCS copies the snapshot to each LUN, where it acts as a “master” image, and then creates objects that all use the master image on their LUN as their boot partition. MCS then creates a small 16 MB drive for each object’s identity disk and a second, thin-provisioned drive for writing changes to the VM. MCS handles load balancing for the initial deployment, evenly distributing the required objects over all defined LUNs in the host entry used. No additional infrastructure is needed to use MCS, as it is a function of the Desktop Controllers and is built into the XenDesktop Desktop Studio console.
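The even distribution of objects across LUNs described above can be pictured as a simple round-robin assignment. The sketch below is only an illustration of that idea and is not Citrix’s placement logic; the function name, VM names, and datastore names are all hypothetical.

```python
from collections import defaultdict


def distribute_evenly(vm_names, datastores):
    """Assign VMs to datastores round-robin so counts differ by at most one."""
    placement = defaultdict(list)
    for i, vm in enumerate(vm_names):
        placement[datastores[i % len(datastores)]].append(vm)
    return dict(placement)


vms = [f"desktop-{i:03d}" for i in range(350)]  # hypothetical VM names
luns = ["ds-1", "ds-2", "ds-3", "ds-4"]          # hypothetical datastore names

placement = distribute_evenly(vms, luns)
for ds in luns:
    print(ds, len(placement[ds]))
```

With 350 desktops and four datastores, this places 88 desktops on each of the first two datastores and 87 on each of the other two.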

For PVS, a gold image is still needed. However, instead of creating copies of that image that reside on

the LUNs, PVS uses tools to upload a copy of the gold image to a virtual disk it keeps on its own store.

The store can be local to the PVS server or mapped so that all PVS servers point to the same store. PVS

then creates objects that it manages through PXE boot infrastructure. These objects usually only contain

a small drive for identity management and write caching, typically 6 GB in size. Upon a successful PXE

boot, PVS will stream the OS from the image to the RAM of the object and manage its caching based on

the image setting. PVS requires additional infrastructure, namely PVS servers, and the PXE boot requires

DHCP configuration, the use of a boot ISO mapped to CDROM, or a boot disk built into the master

template. PVS is not as simple to use and manage as MCS, but its additional complexity brings a wider range of options for a flexible and dynamic environment designed for thousands of users.

For this reference architecture, two PVS caching modes are reviewed. The first is Cache to Hard Disk, wherein the VM writes cached information directly to a 6 GB disk attached to the object. The second is an option new with XenDesktop 7.x, Cache to RAM with overwrite to hard disk (listed in the PVS console as “Cache in device RAM with overflow on hard disk”). Using this method, the VM attempts to cache the OS into RAM first and, as it needs to free up memory, writes the cached information to the 6 GB attached disk.
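The difference between the two caching modes can be illustrated with a toy model that tracks only how much cached data stays in RAM and how much reaches the attached disk. This is a conceptual sketch, not the PVS implementation; the class name, the 256 MB RAM limit, and the 40 MB write size are hypothetical values chosen for illustration.

```python
class RamOverflowCache:
    """Toy model of a write cache that fills RAM first, then spills to disk."""

    def __init__(self, ram_limit_mb):
        self.ram_limit_mb = ram_limit_mb
        self.ram_mb = 0       # data currently held in the RAM cache
        self.disk_mb = 0      # data that has been moved to the attached disk
        self.disk_writes = 0  # number of write requests that touched the disk

    def write(self, size_mb):
        self.ram_mb += size_mb
        overflow = self.ram_mb - self.ram_limit_mb
        if overflow > 0:
            # Free memory by moving the excess to the attached cache disk.
            self.ram_mb -= overflow
            self.disk_mb += overflow
            self.disk_writes += 1


# Cache to RAM with overflow: hypothetical 256 MB RAM cache.
ram_first = RamOverflowCache(ram_limit_mb=256)
# A RAM limit of zero behaves like Cache to Hard Disk: every write hits disk.
disk_only = RamOverflowCache(ram_limit_mb=0)

for _ in range(10):
    ram_first.write(40)
    disk_only.write(40)

print("RAM first:", ram_first.disk_mb, "MB on disk in", ram_first.disk_writes, "writes")
print("Disk only:", disk_only.disk_mb, "MB on disk in", disk_only.disk_writes, "writes")
```

In this model, ten 40 MB writes send 144 MB to disk in four operations when RAM is used first, compared with 400 MB in ten operations when everything goes straight to disk, which is the kind of reduction in storage I/O this mode is intended to provide.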

Login VSI User Workload

All users in this reference configuration were logged in and simulated by Login VSI. The workload chosen

for each of the remote users in all tests was “office worker.”

Login VSI produces a metric called VSImax. This is a measure of the number of concurrent virtual

desktops that a given solution can support with “acceptable” desktop performance. Test iterations are

performed, and the goal is to closely match or exceed the number of desktops that are planned to be

concurrently run in production with the VSImax score.

Below are the results from the two test iterations (350 and 500 desktops). A clear, linear increase in the

VSImax score can be seen as the number of users was increased, indicating that the VDI solution was

able to accommodate the workload with no signs of encountering a performance bottleneck until 97% or

greater concurrency was achieved.

Figure 9 - VSImax for 750 and 500 user tests – VSImax was a result of server resource saturation

Because of the VSImax levels observed, to get a more accurate calculation of IOPS, further tests were

conducted with the following user loads:

High Performance: PVS – 500 users, MCS – 450 users

Standard Performance: PVS – 350 users, MCS – 350 users

Login VSI Steady-State Workload

In the graphs above, VSImax is reached at 449 (MCS), 566 (PVS – C/HD), and 569 (PVS – C/RAM/HD) for the 750-user tests, and at 341 (MCS), 333 (PVS – C/HD), and 317 (PVS – C/RAM/HD) for the 500-user tests. This is due to the high resource utilization of the host servers in this configuration.

Figure 10 shows that the primary bottleneck is from CPU utilization. Once booted, there is very little

increase in memory utilization. Both PVS variations as well as the MCS tests showed very similar patterns

in host resource utilization.

Figure 10 - Server Resources During 750 User Test - PVS
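Because VSImax was bounded by host CPU, it can be useful to express the 750-user results as desktops per host and per physical core. The sketch below assumes the high-performance cluster of eight B200 M3 blades with 24 cores each, as listed in the hardware table; the function name is hypothetical and the result is a simple ratio, not a Login VSI metric.

```python
# VSImax values reported above for the 750-user tests (high-performance cluster).
vsimax_750 = {"MCS": 449, "PVS C/HD": 566, "PVS C/RAM/HD": 569}

HOSTS = 8            # B200 M3 blades in the high-performance cluster
CORES_PER_HOST = 24  # two Intel Xeon E5-2697 CPUs, 24 cores total per blade


def density(vsimax, hosts=HOSTS, cores_per_host=CORES_PER_HOST):
    """Return (desktops per host, desktops per physical core)."""
    return vsimax / hosts, vsimax / (hosts * cores_per_host)


for name, score in vsimax_750.items():
    per_host, per_core = density(score)
    print(f"{name}: {per_host:.1f} desktops/host, {per_core:.2f} desktops/core")
```

On these assumptions, the PVS runs work out to roughly 71 desktops per blade and the MCS run to roughly 56.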

One of the things that makes the virtual desktop workload so challenging for storage solutions is the high proportion of write operations required, especially considering the write penalties involved with RAID operations. During steady-state testing, write operations to storage were observed to be 86% of total IOPS for PVS cache to RAM, overwrite to hard drive, and 90% for PVS cache to HD.
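The effect of a RAID write penalty on a write-heavy workload can be estimated with the common rule of thumb that back-end operations equal reads plus writes multiplied by the penalty. The sketch below assumes a penalty of 2 for RAID-1 (each host write lands on both mirror copies) and a hypothetical load of 1,000 front-end IOPS; it does not model ISE caching or CADP, so actual back-end activity may differ.

```python
def backend_iops(frontend_iops, write_fraction, write_penalty=2):
    """Estimate back-end disk operations for a given front-end load.

    write_penalty=2 is the usual rule of thumb for RAID-1; it is an
    assumption here, not a measured ISE value.
    """
    writes = frontend_iops * write_fraction
    reads = frontend_iops - writes
    return reads + writes * write_penalty


cases = (
    ("PVS cache to RAM, overwrite to HD", 0.86),
    ("PVS cache to HD", 0.90),
)
for label, write_fraction in cases:
    estimate = backend_iops(1000, write_fraction)
    print(f"{label}: {estimate:.0f} back-end IOPS per 1,000 front-end IOPS")
```

With the write ratios observed above, 1,000 front-end IOPS translate to roughly 1,860 and 1,900 back-end operations, respectively, which is close to double the host-visible load.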

Below are graphs of the total ISE system I/O (write and read) in the six test runs. Write I/O can be clearly

seen dominating the workload mix for PVS, while read I/O dominates MCS because of the lack of

caching.

Write IOPS were observed to be higher in the login phase of the test run for PVS cache to HD and MCS. PVS cache to RAM, overwrite to HD remained at a steady rate throughout the login and steady-state Login VSI tests and required the fewest IOPS from the ISE 740 hybrid storage array.

Figure 11 - Login Phase and Steady State for PVS and MCS on High Performance

Figure 12 shows that the ISE 740 array completed all of the write operations in under 1 ms during the tests. Values below 1 ms are reported as 0 ms, so all three series can be viewed as having no observable latency over 1 ms for the testing period.

Read latency is also an important measure of system performance, and increases in read latency were observed in the MCS test due to the lack of desktop caching. The highest single value was a spike in the PVS cache to RAM, overwrite to HD series, while the MCS series showed consistently more reads, with 95% of the average values below 1 ms. The majority of the PVS cache to RAM, overwrite to HD values were below 2 ms, with 99% below 1 ms. PVS cache to HD had more spikes, but all were below 2 ms and most were below 1 ms. MCS, while more read intensive, was consistently at 1 ms, with occasional spikes to 2 ms.

Figure 12 - Write Latency for PVS and MCS on High Performance

Figure 13 - Read Latency for PVS and MCS on High Performance
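Statements such as “95% below 1 ms” can be checked against exported latency samples with a short calculation. The sketch below uses made-up sample values purely to show the arithmetic; they are not measurements from these tests, and the function name is hypothetical. Note that the reporting tools show values below 1 ms as 0 ms, so whole-millisecond samples are used.

```python
def fraction_at_or_below(samples_ms, threshold_ms):
    """Share of latency samples that are at or below the threshold."""
    if not samples_ms:
        raise ValueError("no samples")
    hits = sum(1 for s in samples_ms if s <= threshold_ms)
    return hits / len(samples_ms)


# Hypothetical read-latency samples in milliseconds (NOT measured data).
samples = [0, 0, 1, 0, 0, 1, 2, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

share = fraction_at_or_below(samples, 1)
print(f"{share:.0%} of samples at or below 1 ms, peak {max(samples)} ms")
```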

Read latency is the value that reacts first as load on the storage system increases. However, read IOPS make up a small percentage of the overall workload in the PVS scenarios. For MCS, read latency is the more critical factor.

The ISE 740 hybrid storage array demonstrated that it was able to satisfy the Login VSI workloads up to

the point of saturation (100% utilization) for the Cisco UCS CPU resources of the 16 blade servers. If

large volumes of users are logging in/out of the environment concurrently, storage performance will play

an important role.

Virtual Desktop Deployment Operations

Deployment operations are something that every environment must go through. Whether performing the

initial creation of the desktops or having to add capacity during production hours, it is vital that the impact

to the infrastructure is known to the administrator.

PVS deploys in a routine manner that offers the least impact to performance. A VM is cloned, booted for differential operations, then shut down. This occurs in a production-line manner, with one clone operation starting after another has finished. At any time, one clone operation is in progress and 2-4 VMs are booted for differential operations. It should be noted that this approach takes the longest time to complete.

MCS creates VMs in a different fashion. The “gold image” is copied to all datastores available in the

cluster as defined in the Desktop Controller. This causes a burst of write activity while the gold images are

created. Once done, VMs are created on each datastore and booted to complete differential operations,

creating a second burst of activity. While “chattier” in terms of IOPS, it completes in a significantly faster

time frame than PVS deployments.

If there is a need to regularly deploy new VMs during production hours, PVS will have the least impact.

The deploy operation is biased toward storage write IOPS, which account for just over 70% of the workload. Total storage IOPS demand was seen to regularly approach 2,000 IOPS for both PVS and MCS, with high values approaching 6,000 IOPS for PVS and 3,500 IOPS for MCS. This process operates over the entire “active” data set of the VMs and generates I/O across all of the new desktop capacity.

Figure 14 - ISE 740, Total Read and Write IOPS During Deploy Operations

Response time is also an important measure to examine when performing system “stress testing.” The

deploy process is one example of what a virtualization administrator may conduct to prove out storage

systems being proposed for VDI deployments. The ISE 740 shows excellent read and write response for this workload, with the majority of the write and read latency values below 1 ms and 4 ms, respectively, for PVS and below 9 ms and 12 ms, respectively, for MCS.

The reduction in read and write latency seen at the beginning of the test run is due to the ISE

management of data to SSD in real time. This is Continuous Adaptive Data Placement (CADP) in action,

as it learns the workload and automatically optimizes for best performance.

Figure 15 - Read and Write Latency During Deploy Operation

When images were updated, the method used by MCS was identical to a deployment, wherein the gold

image was copied to datastores and then each VM was updated to point to the new image and booted to

perform differential operations with the updated OS. In contrast, an update on PVS was identical to a

normal boot operation since there is no need to copy data to the datastores as the image resides on the

PVS server store. This resulted in significantly lower IOPS activity for the PVS image update process.

Virtual Desktop Boot Storm

The virtual machine boot process was the most taxing on the CPU utilization. The figure below shows the

processor and memory utilization of a single blade server during this process. CPU utilization reaches

saturation (100%) as the different pools of desktops are booted. Limiting the desktop pool sizes should be

considered, as this can limit the impact and duration of the event if an entire desktop pool needs to be

booted or rebooted.
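One way to act on the recommendation to limit pool sizes is to boot a large pool in fixed-size batches rather than all at once. The sketch below only shows how a pool might be split; the batch size of 100 is a hypothetical value, not a tested recommendation, and the function name is illustrative.

```python
def boot_batches(desktops, batch_size):
    """Split a pool into consecutive batches of at most batch_size desktops."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [range(start, min(start + batch_size, desktops))
            for start in range(0, desktops, batch_size)]


for pool in (350, 500):
    batches = boot_batches(pool, batch_size=100)  # hypothetical batch size
    sizes = [len(b) for b in batches]
    print(f"{pool} desktops -> {len(batches)} batches of sizes {sizes}")
```

A suitable batch size would need to be chosen by watching host CPU utilization during a trial boot, since CPU rather than storage was the limiting resource in these tests.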

The boot process was heavily read-focused, with read IOPS dominating over 90% of the time. Initially, there is a large read workload, followed in PVS by write IOPS as information begins to be written to cache. Total IOPS during this period were observed to exceed 80,000 IOPS for MCS and 20,000 IOPS for PVS.

Figure 16 - Total Read and Write IOPS During Boot Storm

Figure 17 - Total IOPS and KB/s for Boot Storm

Response times of the ISE 740 were well within what would be considered normal for database

operations, proving that the ISE was not approaching any limit in performance for this operation. As

shown below, the greatest latency was seen in the MCS boot storm, where writes reached 4 ms and

reads up to 10 ms. PVS never exceeded 2 ms.

Figure 18 - Read and Write Latency During Boot Storm

Boot storms are traditionally extremely difficult for storage systems to keep up with. The broad mix of read and write requirements, combined with the need for high IOPS, is usually where most storage systems have significant issues. In this test, the main limiting factor was the Cisco UCS CPU resources, as all servers were pushed to 100% CPU utilization. When planning how many desktops can safely be started at the same time, careful attention should be paid to the processor utilization of the ESXi servers once high-performance storage (such as the ISE 740 hybrid storage array) is implemented.

Conclusion

There has long been a debate in Citrix communities as to which provisioning methodology, MCS or PVS, is better. Prior to 7.1.3, available versions of PVS were buggy and unreliable, sending most administrators to MCS as a reliable tool. However, as your environment grows, PVS clearly has the tools, such as its method for library management of images, to give an administrator much more flexibility to manage and deploy VMs.

It is clear from these tests that, with the performance improvement of the PVS “cache to RAM, overwrite to HD” caching method combined with the maturity of 7.1.3, PVS offers a substantial performance improvement to administrators who find that storage is their primary performance factor. Without any changes to the VM itself, PVS significantly reduces the IOPS that can be a bottleneck to the end-user experience.

When it comes to deploying virtualized desktops, there clearly is a dangerous combination of misleading marketing statistics and many implementation pitfalls. The purpose of the X-Pod for VDI solution, however, is to provide insight and proof points into the performance and sizing of a virtual desktop infrastructure with Cisco UCS B-Series Servers, based on an X-IO ISE 700 Series hybrid storage array. This paper provides a converged infrastructure design from which a suitable architecture for a high-performance hosted desktop end-user experience can be built.

While this paper provides a simple, easy-to-deploy model for the user counts suggested, it should be

noted that these are high-end assumed guidelines and X-IO and its VDI partners will be happy to help

provide a customized X-Pod solution to meet the VDI specifications needed.

Contact X-IO Technologies

Website: http://www.xiostorage.com/

Get in touch with us:

http://xiostorage.com/contact/

or

Visit our website and chat with us to get more information.

United States: 866.472.6764

International: +1.719.388.5500

9950 Federal Drive, Suite 100 | Colorado Springs, CO 80921 | www.x-io.com

X-IO, X-IO Technologies, ISE and CADP are trademarks of Xiotech Corporation. Product names mentioned herein may be trademarks and/or registered trademarks of their respective companies. © Xiotech Corporation. All rights reserved. RA-0004-20150112

Appendix

Appendix A: IOPS Comparison

Operation | Metric | MCS | PVS, Cache to HD | PVS, Cache to RAM, o/w to HD
Boot Storm | Max per VM | 82 | 19 | 20
Boot Storm | Average per VM | 7 | 7 | 2
Login | Max per VM | 34 | 12 | 2
Login | Average per VM | 8 | 4 | 0.5
Steady Operating State | Max per VM | 18 | 8 | 1.5
Steady Operating State | Average per VM | 4 | 3 | 0.5
Deploy * | Max | 3376 | 6120 | Not performed **
Deploy * | Average | 1647 | 625 | Not performed **

* Because of the clustered nature of the deployment for MCS, the max and average are presented for the whole duration rather than per VM.

** Since changing between the two PVS modes is a small administrative setting, deploying VMs with the Cache to RAM, overwrite to HD was not performed.
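The per-VM figures in Appendix A can be scaled to a planned desktop count for a first-pass IOPS estimate. The sketch below assumes IOPS scale linearly with the number of desktops, which is a simplification; the function and variable names are hypothetical, and the Deploy rows are omitted because they are not per-VM values.

```python
# Per-VM IOPS from Appendix A, in column order:
# MCS, PVS cache to HD, PVS cache to RAM o/w to HD.
METHODS = ("MCS", "PVS cache to HD", "PVS cache to RAM, o/w to HD")
PER_VM_IOPS = {
    ("Boot Storm", "max"): (82, 19, 20),
    ("Boot Storm", "avg"): (7, 7, 2),
    ("Login", "max"): (34, 12, 2),
    ("Login", "avg"): (8, 4, 0.5),
    ("Steady State", "max"): (18, 8, 1.5),
    ("Steady State", "avg"): (4, 3, 0.5),
}


def aggregate_iops(phase, stat, desktops):
    """Scale per-VM IOPS linearly to a desktop count (simplifying assumption)."""
    values = PER_VM_IOPS[(phase, stat)]
    return {method: value * desktops for method, value in zip(METHODS, values)}


for phase in ("Boot Storm", "Login", "Steady State"):
    print(phase, aggregate_iops(phase, "avg", 500))
```

For 500 desktops in steady state, the averages work out to 2,000 IOPS for MCS, 1,500 for PVS cache to HD, and 250 for PVS cache to RAM, overwrite to HD. Scaling the per-VM maximums this way will tend to overstate real peaks, since not every desktop peaks at the same moment.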