
EqualLogic PS Series Load Balancers and Tiering, a Look Under the Covers

Keith Swindell

Dell Storage Product Planning Manager

• Guiding principles

• Network load balancing

• MPIO

• Capacity load balancing

– Disk spreading (wide striping)

– RAID optimizer

• Free space balancing

• Automatic Performance Load Balancer

• Tiered Array

– Shock absorber

• Summary

2

Topics

EqualLogic Load Balancers: overview and guiding principles

3

• As you increase the work on a device, its latency goes up

– When you overload a device, the observed latency becomes unacceptable

• SANs typically run multiple workloads

– The net effect is that the sum of the workloads is an ever-growing random I/O workload

• Workloads change

– Sometimes we know about it

– Many times we don’t (we react to it)

• This is addressed by automatically spreading work across the available hardware proportionally

Observations

4

EqualLogic Load Balancers

The NLB (Network Load Balancer) manages the assignment of individual iSCSI connections to Ethernet ports on the group members

EqualLogic MPIO extends network load balancing end to end, host <-> array port

The CLB (Capacity Load Balancer) manages the utilization of the disk capacity in the pool

Free Space balancing manages dynamic space usage between arrays

The APLB (Automatic Performance Load Balancer) manages the distribution of high I/O data within the pool.

The Hybrid Array Load Balancer manages the distribution of high I/O data within a hybrid array.

5

Network Load Balancing

6

Goals:

• Interface load is balanced

• Internal communication is balanced

• I/O is directed to the most-used array

• The best network paths are preferred

Network Load balancer

7

• Each array port’s statistics are analyzed every 6 minutes.

– All traffic on port is considered:

› Host access

› Replication

› Internal communications (if more than 1 array)

• Are any of the ports overloaded?

– Near maximum

– Large difference in utilization vs. other ports

• If yes, what is the overloaded port and most eligible connection on that port?

– Is there a port that is less loaded we can move to?

– If yes, move the connection to that [array] port*

› This is done via iSCSI commands

› It is transparent to the host

• Only one move is made per 6-minute interval, so that its effect can be taken into account in the next analysis (a sketch of this decision loop follows this slide)

*In multi-member groups the selection will take into account that a server is doing more I/O to a particular member’s data - the connection load balancer will move the connection to that member.

Automatic Network Load Balancer Operation

8
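The following is a minimal sketch of the 6-minute rebalancing pass described above, written in Python for illustration. The thresholds, the data structures, and the choice of the busiest connection as the "most eligible" one are assumptions for the sketch; only the one-move-per-pass behavior and the overload criteria (near maximum, or a large utilization gap) come from the slide.

```python
# Hypothetical sketch of one network-load-balancer pass. Thresholds and data
# structures are illustrative assumptions, not the actual firmware behavior.

NEAR_MAX_UTIL = 0.90      # "near maximum" utilization threshold (assumed)
LARGE_GAP = 0.30          # "large difference vs. other ports" (assumed)

def rebalance_once(ports):
    """ports: list of dicts like {"util": 0.95, "connections": [...]}.
    Makes at most one connection move per pass (one pass every 6 minutes)."""
    least = min(ports, key=lambda p: p["util"])
    for port in sorted(ports, key=lambda p: p["util"], reverse=True):
        if port is least:
            continue
        overloaded = (port["util"] >= NEAR_MAX_UTIL or
                      port["util"] - least["util"] >= LARGE_GAP)
        if not overloaded or not port["connections"]:
            continue
        # Pick the most eligible connection (here: the busiest one) and move it
        # to the less-loaded port via an iSCSI redirect, transparently to the host.
        conn = max(port["connections"], key=lambda c: c["throughput"])
        port["connections"].remove(conn)
        least["connections"].append(conn)
        return conn, port, least       # only one move per 6-minute interval
    return None
```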

• The EqualLogic Host Integration Tools include support for Enhanced MPIO on Windows, VMware, and Linux

– Enhanced MPIO automatically creates additional connections and sends I/Os directly to the member which contains the data requested by the initiator

– Very helpful in a congested networking environment

• The network connection load balancer has built-in knowledge to communicate with the OS-based Enhanced MPIO processes

• The group and the Enhanced MPIO module agree on the optimal connection matrix for best performance

• Only interface balancing is needed when using Enhanced MPIO (a routing sketch follows this slide)

9

Enhanced MPIO effect on network connection load balancing
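As a rough illustration of the direct-routing idea, the sketch below sends each I/O over the session to the member that owns the addressed region. The SliceMap class, the LBA-range ownership model, and the sessions dictionary are hypothetical; the real Enhanced MPIO module negotiates the connection matrix with the group rather than consulting a static map.

```python
# Hypothetical sketch of Enhanced MPIO direct routing: each I/O is sent on a
# session to the member that owns the addressed region of the volume.
# SliceMap and the session objects are illustrative, not a real API.
import bisect

class SliceMap:
    def __init__(self, boundaries, owners):
        # boundaries[i] is the first LBA owned by owners[i + 1];
        # LBAs below boundaries[0] belong to owners[0]
        self.boundaries, self.owners = boundaries, owners

    def owner(self, lba):
        return self.owners[bisect.bisect_right(self.boundaries, lba)]

def route_io(slice_map, sessions, lba, payload):
    """Send the I/O directly to the member that holds this LBA."""
    member = slice_map.owner(lba)
    return sessions[member].send(lba, payload)

# Example: a volume split across two members at LBA 1_000_000
# smap = SliceMap([1_000_000], ["member01", "member02"])
# route_io(smap, sessions, 42, data)   # goes straight to member01
```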

• It is best practice to run MPIO on your servers

– Improves reliability of storage access

– Improves resilience to errors

– Improves performance

• EqualLogic MPIO is easier to use than host/OS MPIO

– Automatically manages iSCSI sessions

› Creates iSCSI sessions based on the SAN configuration to ensure high availability and maximize I/O performance

› The number of sessions automatically rises and falls based on operating needs

– It comes with the product and does not cost extra.

• It performs better than host/OS MPIO

– Optimizes network performance and throughput

– Provides end-to-end network balancing (rather than array-to-switch balancing)

– Routes I/O directly to the member which will be servicing it

› Reduces overall network traffic

10

Why use EqualLogic Multipathing?

• iSCSI connections are “long lived”

– Connections are created by servers, and commonly exist for days/weeks

• The operations of the network load balancer can be seen in the PS Series GUI or SANHQ

› You see iSCSI connections spread across multiple arrays and array ports

› You see event messages for server iSCSI logins that are not tied to reboots/restarts

– In your host MPIO

› Connections are spread across host ports and array IP addresses

› Event messages show iSCSI connections being adjusted

• Some of my hosts see a disruption during network load balancing operations

– This is not normal; call support

– Typically it is a combination of

› Network errors (switches not properly set up, cabling or configuration problems)

› Server OS settings not properly configured (iSCSI settings or MPIO settings)

11

Other Details – Automatic network load balancing

Goals:

• Spread volume data among the member arrays, keeping in-use and free-space percentages equivalent

• Keep capacity balanced within a pool

• Honor volume RAID preference hints if possible

• In larger pools (3 or more members)

– Tries to keep a volume on no more than 3 members

› Could spread to more if needed

– Applies an automatic RAID 10 preference where advantageous

Automatic Capacity Balancer Operation: Fit volumes optimally across pool members

12

• Data within a volume is first distributed across members via the CLB.

– The CLB looks at the total amount of volume reserve space

– This is compared with the free space on each member to determine the optimal members to receive the data

– The collection of volume data that the CLB places on a member makes up a volume slice

– By default, the CLB strives to create no more than 3 slices per volume

› This is not a hard limit; a volume can have as many slices as there are members in the pool if that is required to hold the volume's capacity (a placement sketch follows this slide)

Automatic Capacity Balancer Operation Details

13
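A minimal sketch of the slice-placement idea is shown below, assuming a simple model in which the volume's reserve is split across the members with the most free space, in proportion to that free space. The proportional split, the ranking by free space, and the function name are assumptions; only the preference for at most 3 slices, with more added when capacity requires it, comes from the slides.

```python
# Hypothetical sketch of CLB slice placement. Only the "prefer 3 slices,
# add more if capacity requires it" behavior comes from the slides; the
# proportional split by free space is an illustrative assumption.

def place_volume(volume_reserve_gb, member_free_gb, preferred_slices=3):
    """member_free_gb: dict of member name -> free space in GB.
    Returns a dict of member -> slice size in GB."""
    ranked = sorted(member_free_gb, key=member_free_gb.get, reverse=True)
    start = min(preferred_slices, len(ranked))
    for count in range(start, len(ranked) + 1):
        targets = ranked[:count]
        total_free = sum(member_free_gb[m] for m in targets)
        if total_free >= volume_reserve_gb:
            # split the reserve in proportion to each target's free space
            return {m: volume_reserve_gb * member_free_gb[m] / total_free
                    for m in targets}
    raise ValueError("pool does not have enough free space for this volume")

# Example: a 600 GB volume in a three-member pool
# place_volume(600, {"m1": 900, "m2": 600, "m3": 300})
# -> roughly {"m1": 300, "m2": 200, "m3": 100}
```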

Automatic Capacity Balance: When does the capacity balancer run?

14

A member

• Is added or is removed from the group

• Is merged or moved to another pool

• Following correction of member free space trouble.

A volume

• Is created, deleted, expanded, has a change in preferences, or is bound/unbound (bind / unbind in the CLI)

Timer

• If a balance operation has not run for 36 hours, a timer starts a balance evaluation. An actual balance operation may or may not execute, depending on whether or not the pool is already sufficiently balanced. (A sketch of these triggers follows this list.)
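The sketch below restates these triggers as a simple check, assuming hypothetical event names and a recorded last-balance time; only the event list and the 36-hour timer come from the slide.

```python
# Hypothetical sketch of when a capacity-balance evaluation is scheduled.
# Event names and the function shape are illustrative assumptions.
import time

MEMBER_EVENTS = {"member_added", "member_removed", "member_merged",
                 "member_moved_to_pool", "free_space_trouble_cleared"}
VOLUME_EVENTS = {"volume_created", "volume_deleted", "volume_expanded",
                 "preference_changed", "volume_bound", "volume_unbound"}
TIMER_SECONDS = 36 * 3600     # 36 hours with no balance triggers an evaluation

def should_evaluate(event, last_balance_time, now=None):
    """Return True if a balance evaluation should start. The evaluation itself
    may still decide the pool is sufficiently balanced and do nothing."""
    now = time.time() if now is None else now
    if event in MEMBER_EVENTS or event in VOLUME_EVENTS:
        return True
    return now - last_balance_time >= TIMER_SECONDS
```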

• At volume create (or later) an administrator can select a “RAID preference” for the volume.

– If the desired RAID type is available in the pool, the CLB will attempt to honor the request

– After re-calculating for RAID preference, the pool rebalance begins

• In larger groups (3 members or more) the CLB heuristically determines whether RAID 10 would be advantageous for the volume

• The Automatic Performance Load Balancer in many cases renders automatic RAID placement obsolete

Automatic Capacity Balancer Operation: RAID Placement

15

• In PS groups there are many volume operations that consume and release free space dynamically

– Volumes grow/shrink – thin provisioning (map/unmap)

– Snapshot space grows/shrinks – create/delete; space grows as writes occur to volumes or other snapshots

– Replication – recovery points, freeze of data for transmission

• In multi-member pools, this can cause free space imbalances between members in the pool

– Data typically changes more quickly on faster members (e.g., consuming snapshot space faster)

– Free space balancing adjusts this in the background, shifting in-use and free pages between members (a sketch follows this slide)

• When capacity gets low, the member enters free space trouble state.

– In worst case scenarios, these imbalances can affect the dynamic operations

– When in the free space trouble state, the load balancer works to more rapidly free space on the member that is running low by swapping in-use pages for free pages with other members.

– In groups with more than 3 members it may also change the slice layout among members

Free Space Balancing

16
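The following is a minimal sketch of the free-space balancing decision, assuming a simple page-count model. The 5-percentage-point imbalance threshold, the pages-per-pass cap, and the member dictionaries are illustrative assumptions; the slides only state that in-use and free pages are shifted in the background to correct imbalances.

```python
# Hypothetical sketch of free-space balancing: if one member's free-space
# percentage lags the rest of the pool, swap some of its in-use pages for
# free pages on a roomier member. Threshold and pass size are assumptions.

IMBALANCE_THRESHOLD = 0.05     # 5 percentage points of free space (assumed)

def plan_page_swap(members, pages_per_pass=1000):
    """members: dict of name -> {"free_pages": int, "total_pages": int}.
    Returns (tight_member, roomy_member, pages_to_move), or None if balanced."""
    def free_pct(name):
        m = members[name]
        return m["free_pages"] / m["total_pages"]
    tight = min(members, key=free_pct)
    roomy = max(members, key=free_pct)
    if free_pct(roomy) - free_pct(tight) < IMBALANCE_THRESHOLD:
        return None                      # close enough; nothing to do this pass
    pages = min(pages_per_pass, members[roomy]["free_pages"])
    # In-use pages move from the tight member to the roomy member, and the
    # corresponding free pages move back, raising free space where it is low.
    return tight, roomy, pages
```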

• Free Space Trouble: Member FST vs. Pool FST

– Member FST occurs when a member of the group is low on space

– Pool FST occurs when a pool is low on space.

• Are there ways I can create more free space in a pool?

– Yes

– unmap non-replicated volumes (V6 FW)

– use snapshot space borrowing (V6)

– adjust snapshot or replication reserves

– convert volumes from full to thin provisioned

– Delete unneeded snapshots (or volumes)

– Adjust schedule keep counts

– Add more storage to pool

• I’m seeing a lot of performance issues, yet workloads do not appear very heavy – what do I do?

– Call support “Houston we have a problem”

– It may be that free space balancing is occurring too frequently

› If the demand for dynamic page allocations is greater than the balancing rate, free space can be in short supply on a member

– Common solution is to increase free space in pool

• Very high-capacity arrays mixed with lower-capacity arrays in a pool

› Example: a PS65x0 with 3 TB drives alongside a 6100XV with 146 GB disks

– The proportions used for the capacity spread are adjusted (the PS65x0 array's capacity is discounted) to prevent the large-capacity system from carrying the bulk of the workload

17

Other Details – Capacity and Free space balancing

Automatic Performance Load Balancing

18

• Automatically operates when there are multiple members in a pool

• APLB is designed to minimize response time for all pool members

– Arrays in the pool are optimized by trading hot and cold data

– Trigger: significant latency differences between members

• The APLB operates in near real time

– Data exchanges occur as frequently as every 2 minutes

– APLB can adjust the distribution of data and free space as conditions change

Automatic Performance Load Balancer

19

20

How does each of the Load Balancers work? Automatic Performance Load Balancer

• Data Movement Automation is based on the POOL…

• Data is moved through SAN infrastructure

• Data resides on arrays based on:

– Capacity balancing

– Access frequency vs. Array latency

[Diagram: an EQL Group with a two-member pool (Member01 and Member02) holding Vol2 on the SAN; member latencies are shown going from 10 ms and 40 ms to 20 ms and 20 ms.]

Automatic Performance Load Balancer: Data Swapping Algorithm

21

• Evaluate member latencies over a 2-minute interval

• Track hot data through fine-grained statistics of volume heat by region

• If an out-of-balance condition is detected, select some hot data on the overloaded member and an equal amount of cold data from the same volume on a less loaded member

• Identify the cold member involved in the swap based on member latency and the headroom left on the member

– First choice is assigned (reserved) unallocated data

– Second choice is cold allocated in-use data

• Swap the data (up to 150 MB), wait, re-evaluate the pool, and continue if needed

• Wait a minimum of 2 minutes, then re-evaluate (a sketch of one pass follows this slide)

22
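Below is a minimal sketch of one APLB pass following the steps above. The 10 ms latency-gap trigger, the per-member dictionaries, and the selection rules are illustrative assumptions; only the 150 MB swap cap and the 2-minute cadence come from the slides.

```python
# Hypothetical sketch of one APLB pass. Only the 150 MB swap cap and the
# 2-minute evaluation interval come from the slides; thresholds and the
# member model are illustrative assumptions.

SWAP_CAP_MB = 150
LATENCY_GAP_MS = 10            # "significant latency difference" (assumed)

def aplb_pass(members):
    """members: dict of name -> {"latency_ms": float, "hot_mb": float,
    "headroom_mb": float}. Returns a planned swap, or None if in balance."""
    hot = max(members, key=lambda m: members[m]["latency_ms"])
    candidates = [m for m in members
                  if m != hot and members[m]["headroom_mb"] > 0]
    if not candidates:
        return None
    # Cold member: lowest latency among members with headroom left.
    cold = min(candidates, key=lambda m: members[m]["latency_ms"])
    gap = members[hot]["latency_ms"] - members[cold]["latency_ms"]
    if gap < LATENCY_GAP_MS:
        return None                # in balance; wait 2 minutes and re-evaluate
    move_mb = min(SWAP_CAP_MB,
                  members[hot]["hot_mb"],
                  members[cold]["headroom_mb"])
    # Equal amounts are exchanged: hot data moves hot -> cold, and the same
    # amount of cold (preferably unallocated) data moves cold -> hot.
    return {"hot_member": hot, "cold_member": cold, "mb": move_mb}
```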

Automatic Performance Load Balancer: Different array types within a Pool (Tiered Pool)

23

Observing Automatic Performance Load Balancing

Dell internal testing, May 2011 – Group setup: a single pool containing 3 arrays, PS6500 10K (fastest), PS6000XV 15K (second fastest), and PS6000E 7.2K (slowest)

[Charts: per-member performance for Member A and Member B during the balancing run.]

Group performance results

Avg. IOPS goes up, latency comes down

Queue depth reduced

• The configuration is multiple members in a pool

– Single member pools can use hybrid arrays for similar results

• The algorithm works well with

– Different disk capacities

– Different disk speeds

– Different RAID types (and RAID performance differences)

• Optimal tiering depends on adequate resources at different tiers

– SANHQ can help look into operating data

• If workloads do not optimize over time

– There may not be enough resources at a specific tier

– The workload may not have a tiered (hot/cold) access pattern

– Technical specialists or support can assist in analyzing SANHQ data

Other Details – Automatic Performance Load Balancing

24

Tiering within an Array

25

• Common array models are

– Comprised of a single disk speed

› (7200 RPM, 10K, 15K, or SSD)

– Configured as a single RAID type

› R6, R10, R50

• There are models that have multiple disk types

– 60x0XVS (15K + SSD)

– 61x0XS (10K + SSD)

– 65x0ES (7200 + SSD)

• These are sometimes called hybrid arrays

How to tier with a single member pool (or group)?

26

• 1st balancing technology: fast shifting of hot and cold data within the array between HDD and SSD drives

– Hot data is monitored and can be quickly shifted from HDD to SSD

› The balancer can start reacting within 10 seconds of a workload shift (a tiering sketch follows this slide)

• 2nd Technology: Write cache extension / accelerator (shock absorber)

– A portion of SSD is used as a controller write cache extension

– Dramatically increases the size of the write cache

– I/Os are later played back to the normal storage

• The total usable capacity of the storage array is the sum of HDD and SSD (after RAID and the shock absorber)

– Hybrids exclusively use a RAID type called "Accelerated R6"

Hybrid Operation Two unique technologies

27
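A minimal sketch of the first technology, hot-page promotion between HDD and SSD, is shown below. The page model, the hot-access threshold, and the demotion rule are illustrative assumptions; the slides only state that hot data is monitored and shifted from HDD to SSD, with the balancer reacting within about 10 seconds of a workload shift.

```python
# Hypothetical sketch of hybrid in-array tiering: promote pages whose access
# rate crosses a threshold from HDD to SSD, demoting the coldest SSD pages if
# the SSD tier is full. Threshold and page model are illustrative assumptions.

HOT_IOPS_THRESHOLD = 50        # accesses/sec that make a page "hot" (assumed)

def retier(pages, ssd_page_capacity):
    """pages: list of dicts {"id": ..., "tier": "HDD" or "SSD", "iops": float}.
    Mutates page tiers in place so the hottest pages end up on SSD."""
    ssd = [p for p in pages if p["tier"] == "SSD"]
    for page in sorted(pages, key=lambda p: p["iops"], reverse=True):
        if page["tier"] == "SSD" or page["iops"] < HOT_IOPS_THRESHOLD:
            continue
        if len(ssd) >= ssd_page_capacity:
            coldest = min(ssd, key=lambda p: p["iops"])
            if coldest["iops"] >= page["iops"]:
                break                    # SSD already holds hotter data
            coldest["tier"] = "HDD"      # demote the coldest page to make room
            ssd.remove(coldest)
        page["tier"] = "SSD"             # promote the hot page
        ssd.append(page)
```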

• Volumes are initially placed on SSD until about 2/3 full, then spread across both SAS and SSD drives

28

How does each of the Load Balancers work? Hybrid Arrays – Auto-tiering within an array

[Diagram: a PS Series array in a storage pool, with its disks configured as RAID 6 Accelerated pages across SAS and SSD drives; Oracle, SQL, Exchange, and archive volumes are accessed over switched Gb Ethernet, with automatic and transparent load balancing within the array.]

• When data on a volume is frequently accessed, it becomes hot

• If the hot data is on HDD, it will be moved to the SSD drives

Hybrid SANHQ View: Hybrid Array SSDs vs. SAS Drives, IOPS at workload saturation

80% from SSD

20% from SAS

Source: Benefits of Automatic Data Tiering in OLTP Database Environments with Dell EqualLogic Hybrid Arrays, Dell TR-PS002 March 2011

29

Hybrid Operation: Multi-tiered VDI workload

[Diagram: servers connected over switched Gb Ethernet to a PS61x0XS SSD/HDD hybrid array hosting a gold image volume and two desktop linked-clone volumes; the legend distinguishes high-IOPS "hot" data from low-IOPS "warm" data across the SSD and HDD tiers.]

30

• OLTP in production exhibits similar characteristics

– high % of I/O directed at small % of total dataset

• EqualLogic hybrid arrays automatically tier OLTP workloads by intelligently and continuously moving the "hot" datasets to the SSD tier

• Performance continues to be optimized even as access patterns change

Hybrid Results with SQL Server: Testing conclusions

Source: Benefits of Automatic Data Tiering in OLTP Database Environments with Dell EqualLogic Hybrid Arrays, Dell TR-PS002 March 2011

[Chart: Normalized improvements in transactions, hybrid SSD array vs. SAS array]

31

• How can I determine if my application needs SSD (or SSD tiering)?

– SANHQ provides a "Group I/O Load Space Distribution" display

• Do hybrid arrays work with the other load balancers (APLB, network, etc.)?

– Yes

– Hybrid arrays can be mixed in pool with non-hybrid arrays

• Can I have multiple hybrids in a pool?

– Yes

• Can I bind volumes to hybrids?

– Yes – choose RAID preference for Accelerated R6

• Can I bind volumes to the SSD in a hybrid?

– No – data placement within a hybrid is done only by the load balancer

• In the current-generation hybrids (XS, ES), how many SSDs and how much SSD capacity?

– 7 SSDs, ~ 2TB of SSD capacity per array

• Are there all SSD arrays?

– Yes, 61x0S models

Other Details – Hybrid Arrays

32

Summary

33

• The various load balancers that work in an EqualLogic PS Series pool provide flexible, dynamic operation that quickly adapts to shifting workload requirements.

• For maximum automatic balancing, pool multiple arrays together

• Follow Best Practices

– Keep firmware up to date

› Consider V6 FW for latest in load balancing features

– Array choice & setup

› Choose RAID types for required reliability

– Use SANHQ

– Host setup

› MPIO and iSCSI settings

– Network setup

› Switch settings & configuration

Summary

34

Q&A

35

Thank You!

36

These features are representative of feature areas under development. Nothing in this presentation constitutes a commitment that these features will be available in future products. Feature commitments must not be included in contracts, purchase orders, sales agreements of any kind. Technical feasibility and market demand will affect final delivery. THIS PRESENTATION REQUIRES A DELL NDA AND MAY NOT BE PROVIDED ELECTRONICALLY OR AS HARDCOPY TO CUSTOMERS OR PARTNERS.

Notices & Disclaimers

37