VM13: VNX Mixed Workloads
Posted on 24-Dec-2014
Mixed Workloads on EMC VNX Storage Arrays
Tony Pittman | @pittmantony | TPittman@Varrow.com
Martin Valencia | @ubergiek | MValencia@Varrow.com
Goals For This Session
Discuss:
• How VNX storage pools work
• How common workloads compare
• Which workloads are compatible
• How to monitor performance
• How to mitigate performance problems

Also check out the 2:55 EMC session: "VNX Performance Optimization and Tuning" with David Gadwah, EMC.
IOPS - Mixed Workloads
[Chart: IOPS for mixed workloads versus number of users, by platform (VNX5100, CX4-120/VNX5300, CX4-240/VNX5500, CX4-480/VNX5700, CX4-960/VNX7500), comparing CX4, VNX Series with rotating drives, and VNX Series with flash drives]
VNX Basics
• VNX is EMC's mid-tier unified storage array
• VNX shines at mixed workloads
• FC, iSCSI or FCoE block connectivity
• Multiple SAS buses on the backend
• NFS and CIFS file connectivity
• Built for flash
VNX Architecture
[Diagram: hardware layout showing redundant power supplies, standby power supplies (SPS), link control cards (LCC), and backend buses connecting Flash, SAS, and Near-Line SAS drives]
VNX SP Failover
[Diagram: hosts on the LAN reaching LUNs through either storage processor, with failover between the two SPs]
VNX Unified Storage
[Diagram: clients, application servers, Oracle servers, Exchange servers, and virtual servers connecting to the array; file access (NFS/CIFS) over 10 Gb Ethernet through VNX X-Blades running VNX OE FILE, with X-Blade failover; block access (FC, iSCSI, FCoE) over the SAN to the VNX SPs running VNX OE BLOCK; object access via Atmos VE]
VNX Architecture
• Two Storage Processors with DRAM cache, frontend ports (FC, iSCSI, FCoE) and backend ports (6 Gb SAS)
• Each LUN is owned by one SP but accessible through both
• Both SPs have active connections
VNX Architecture
• FAST Cache
– Second layer of read/write cache, housed on solid-state drives
– Operates in 64 KB chunks
– Reactive in nature
– Great for random I/O
– Don't use it for sequential I/O
VNX Architecture
• Storage Pools
– Based on RAID (RAID 5, RAID 1/0, RAID 6)
– FAST VP: Fully Automated Storage Tiering
• Pools with multiple drive types: EFD, SAS, NL-SAS
• Sub-LUN tiering
• Operates in 1 GB chunks
• Adjusts over time, not immediately (FAST Cache is more immediate)
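As a toy illustration of the sub-LUN tiering idea above (this is not EMC's actual FAST VP relocation algorithm, and the function and names are invented for this sketch): rank 1 GB slices by recent I/O "temperature" and fill the fastest tier first.

```python
# Toy sketch of sub-LUN tiering: hottest 1 GB slices land on the fastest
# tier until its capacity is consumed, then spill to the next tier.
# Assumes total tier capacity is at least the number of slices.

def place_slices(slice_temps, tier_capacity_gb):
    """slice_temps: {slice_id: recent IOPS}.
    tier_capacity_gb: tiers in fastest-first order, e.g. {"efd": 1, ...}.
    Returns {slice_id: tier}."""
    placement = {}
    ranked = sorted(slice_temps, key=slice_temps.get, reverse=True)
    tiers = iter(tier_capacity_gb.items())
    tier, free = next(tiers)
    for slice_id in ranked:          # each slice is 1 GB
        while free == 0:             # current tier full: move down a tier
            tier, free = next(tiers)
        placement[slice_id] = tier
        free -= 1
    return placement

temps = {"a": 900, "b": 15, "c": 400, "d": 2}
print(place_slices(temps, {"efd": 1, "sas": 2, "nl-sas": 10}))
# {'a': 'efd', 'c': 'sas', 'b': 'sas', 'd': 'nl-sas'}
```

The real array reranks and relocates on a schedule, which is why the slide notes it "adjusts over time, not immediately."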
VNX Architecture
When should I use traditional RAID Groups? As the exception:
• Very specific performance tuning (MetaLUNs)
• Internal array features (write intent logs, clone private LUNs)
• Maybe RecoverPoint journals
• Supportability (I'm looking at you, Meditech)
Remember the limitations:
• Maximum of 16 drives
• Expand via MetaLUNs
• No tiering
VNX Architecture
• IOPS per drive type (for sizing):
– 3500 IOPS: EFD
– 180 IOPS: 15k rpm drive
– 140 IOPS: 10k rpm drive
– 90 IOPS: 7200 rpm drive
• Effects of RAID:
– Parity calculations (RAID 5 and RAID 6)
– Effect on response times
– Write penalty: RAID 1/0 = 2x, RAID 5 = 4x, RAID 6 = 6x
VNX Architecture
• Real-world effect of the write penalty:
– 10x 600 GB 15k SAS drives = 1800 read IOPS
• With RAID 1/0, capable of 900 write IOPS (1 write takes 2 I/O operations)
• With RAID 5, capable of 450 write IOPS (1 write takes 4 I/O operations)
• With RAID 6, capable of 300 write IOPS (1 write takes 6 I/O operations)
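The write-penalty arithmetic above can be sketched as a small sizing helper. This is an illustrative back-of-the-envelope calculation using the per-drive IOPS figures and penalties from the slides, not an EMC tool; the function name and interface are invented here.

```python
# Per-drive IOPS rules of thumb and RAID write penalties from the slides.
DRIVE_IOPS = {"efd": 3500, "15k": 180, "10k": 140, "7200": 90}
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

def effective_iops(drives, drive_type, raid, read_pct):
    """Host IOPS a drive set can sustain for a given read fraction (0..1)."""
    backend = drives * DRIVE_IOPS[drive_type]   # raw backend IOPS
    write_pct = 1.0 - read_pct
    # Each host read costs 1 backend I/O; each host write costs `penalty`.
    return backend / (read_pct + write_pct * WRITE_PENALTY[raid])

# The slide's example: 10x 15k SAS drives = 1800 backend IOPS.
print(effective_iops(10, "15k", "raid5", read_pct=1.0))   # 1800.0 (all reads)
print(effective_iops(10, "15k", "raid10", read_pct=0.0))  # 900.0 (all writes)
print(effective_iops(10, "15k", "raid5", read_pct=0.0))   # 450.0
print(effective_iops(10, "15k", "raid6", read_pct=0.0))   # 300.0
```

Real workloads sit between the extremes, which is why the read/write mix matters as much as the drive count when comparing RAID types.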
Workloads
Common workloads seen in the field:
• Virtual Disks/VMFS (RAID 5)
• DB data files (RAID 5)
• DB transaction files (RAID 1/0)
• Unstructured data, backups (RAID 6)
Benchmarking Real-World Performance
Real-World Workloads: Standard Performance Evaluation Corporation (SPEC)
• Non-profit
• Uses generic applications rather than specific applications
• SPEC benchmarks rely on a mix of I/O to simulate a generic application
• This balances the need for real-world performance with consistency over time
Ideal Scenario
• Array with a single application
• No budget constraints
• Separate storage pools for different sub-workloads
Ideal Scenario
• The ideal SQL Server layout:
– PCIe flash and XtremSW Cache on the host
– FAST Cache in the array
– tempdb: data files on a separate RAID 5 storage pool
– User DBs:
• Each has transaction logs on a separate RAID 1/0 storage pool
• Each has data files on one or more RAID 5 storage pools, with the appropriate drive configuration (EFD + FAST)
– Backups / dump files: a separate RAID 6 storage pool, maybe a separate array
Reality: Can't Isolate Every Workload
Full isolation is cost prohibitive, and do we have to?
• Business-critical application? Maybe.
• Management and lower-tier applications? Probably not.
Basic Storage Pool Layout
• One or two RAID 5 pools (e.g. Gold and Silver)
– FAST with EFD, SAS, and NL-SAS according to skew, or the 5/20/75 rule
• RAID 1/0 pool for transaction logs
– 15k SAS drives
• RAID 6 pool for backup files and unstructured data
– 7.2k NL-SAS drives
RAID 5 Pools
• VMFS
• DB data files
• Good for a random read/write mix
• Use FAST Cache

Drive Composition: Skew
Example:
• Gold Pool: 5x EFD, 15x SAS, 16x NL-SAS
• Silver Pool: 15x SAS, 16x NL-SAS
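A rough sketch of the 5/20/75 rule mentioned earlier: absent measured skew, size a FAST VP pool as roughly 5% EFD, 20% SAS, and 75% NL-SAS by capacity. The helper names, drive sizes, and ceiling-based rounding below are illustrative assumptions, not an EMC sizing formula.

```python
import math

# 5/20/75 capacity split across tiers (percentages, fastest first).
TIER_PCT = {"efd": 5, "sas": 20, "nl-sas": 75}

def tier_capacities(total_tb):
    """Split a target usable capacity (TB) across tiers per 5/20/75."""
    return {tier: total_tb * pct / 100 for tier, pct in TIER_PCT.items()}

def drives_per_tier(total_tb, usable_tb_per_drive):
    """Rough drive count per tier, given usable TB per drive in each tier."""
    caps = tier_capacities(total_tb)
    return {tier: math.ceil(caps[tier] / usable_tb_per_drive[tier])
            for tier in caps}

# Example: a 100 TB pool built from 0.5 TB EFD, 0.6 TB SAS, 2 TB NL-SAS drives.
print(tier_capacities(100))   # {'efd': 5.0, 'sas': 20.0, 'nl-sas': 75.0}
print(drives_per_tier(100, {"efd": 0.5, "sas": 0.6, "nl-sas": 2.0}))
# {'efd': 10, 'sas': 34, 'nl-sas': 38}
```

Counts would still need to be rounded to valid RAID group sizes for each tier; when skew data is available (as in the Gold/Silver example above), it should override the generic split.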
RAID 1/0 Pool
• Transaction logs for many applications
• Specifically for small sequential writes
• Do not use FAST Cache
– It'll be wasted
– It'll hurt performance
Example:
• 8x 15k SAS drives
RAID 6 Pool
• Unstructured data
– Office files (.doc, .xls, etc.)
– Images
• Backup files
– Split into a separate pool if necessary
• Low I/O, high capacity
• Good for long sequential writes
• Do not use FAST Cache
– It'll be wasted
Example:
• 16x 7.2k rpm NL-SAS drives
Pool Layout
Monitoring and Troubleshooting
There is no "set it and forget it."
Workloads change over time:
• Users get added
• Transaction load increases
• Requirements change
Often no one tells us.
Problem Identification
• Proactive performance review
– Admins wear too many hats
– Low priority
• Reactive to user impact (too late)
– Crisis management
Troubleshooting Metrics
Where do we start? What do we look at?
• Cache utilization
– Exceeding a high-water mark forces the array to flush cache to disk
– Forced flushes
• SP performance
– Balance the SP load
• Pool LUN migration (metadata)
• Online LUN migration
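To make the high-water-mark behavior concrete, here is an illustrative sketch (not EMC code, and the watermark values and function name are assumptions for the example): once dirty pages in write cache exceed the high watermark, the SP destages to disk until utilization drops back to the low watermark.

```python
def pages_to_flush(dirty_pct, high_wm=80, low_wm=60, cache_pages=100_000):
    """Dirty pages to destage once the high watermark is exceeded.

    dirty_pct: current dirty-page percentage of write cache.
    high_wm/low_wm: watermark percentages (illustrative defaults).
    cache_pages: total write-cache pages (illustrative size).
    """
    if dirty_pct <= high_wm:
        return 0                       # below high watermark: no forced flush
    dirty = dirty_pct / 100 * cache_pages
    target = low_wm / 100 * cache_pages
    return int(dirty - target)         # flush down to the low watermark

print(pages_to_flush(75))   # 0: still under the high watermark
print(pages_to_flush(95))   # 35000: forced flush from 95% down to 60%
```

Sustained forced flushing is the symptom to watch for in Unisphere Analyzer: it means the backend drives cannot absorb writes as fast as hosts submit them.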
The "Toolbox"
• Unisphere Analyzer (on array)
– Proactively gathers data for review
– Data logging must be enabled on the array
• VNX Monitoring and Reporting (off array)
– Historical data collection
– Streamlined application based on Watch4net
• EMC miTrend
– Leverages NAR (Navisphere Analyzer data) retrieved from the array
– Need EMC or a partner (us) to perform the analysis
Troubleshooting / Problem Mitigation
Several options for mitigating a performance problem:
• Add drives
– OE 32 is required to rebalance existing data
– Pre OE 32, the pool must be expanded by the original drive count, and existing data will not be rebalanced
• Migrate to a different pool
– Live migration avoids the need for an outage
– Performance throttling minimizes the performance impact
Troubleshooting / Problem Mitigation
• Rebalance at the application layer
– Storage vMotion
– Host-based data migration (Open Migrator, etc.)
• Migrate data between arrays
– SAN Copy
– Replication (mirroring / RecoverPoint)
• Reduce workload
– Reschedule for off-hours (backups, for example)
– Decommission non-critical workloads
Questions
Thank You!