Storage Considerations
Yaniv Weinberg, VMware Professional Services, March 2009
VMWARE CONFIDENTIAL INFORMATION
AGENDA
1. Storage Considerations with VMware
   Virtual Machine Monitor
   Storage Volume Considerations (Expansion / Partition Alignment)
   SCSI Reservations
   Storage I/O Queue
2. Trouble-Shooting Tips
vmxbuslogic: SCSI emulation, SAN friendly, BT-958 emulation
vmxlsilogic: SCSI emulation, SAN friendly, LSI53C1030 emulation
Shipping with ESX v2.x and ESX v3.x
NOTE: Check the VMware ESX release notes and Server System Guide for details or restrictions on vmxbuslogic versus vmxlsilogic for guest operating systems.
ESX Architecture: Virtual Machine Monitor
All eggs in 1 basket
Fewer LUNs, easier to manage
Single RAID level

Modular for DR strategies
More LUNs, more I/Os in flight, but harder to manage
Employ multiple RAID levels for application-specific demands
ESX Architecture: Virtual Machine Monitor
[Figure: 10 VMs on a single LUN share L1.Bandwidth; 5 VMs on each of two LUNs share L1.Bandwidth + L2.Bandwidth. Label: "10 VMs to fill up 90 GB". Ratios shown: 10:1, 5:1, 10:1, >5:1.]
Storage Volume Considerations
[Figure: first VMFS partition starting at LBA 63. The 1 MB VMFS blocks (Block 0, Block 1, ...) are drawn against 64KB tracks (Track 0, Track 1, Track 16, Track 32, Track 64, ... Track N).]
Storage Volume Considerations
[Figure: first VMFS partition starting at LBA 128, after the MBR. The 1 MB VMFS blocks (Block 0, Block 1, ...) are drawn against 64K tracks (Track 0, Track 1, Track 16, Track 32, Track 64, ... Track N).]
VI3 client will automatically align a VMFS-3 volume on LBA 128.
Storage Volume Considerations
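The difference between the two starting points can be checked with a little arithmetic. The following is a minimal sketch, assuming 512-byte sectors and the 64KB boundary from the slides; the function name is illustrative and not part of any VMware tool.

```python
SECTOR_BYTES = 512          # assumed sector size
BOUNDARY_BYTES = 64 * 1024  # 64KB boundary used in the slides


def is_aligned(start_lba, boundary=BOUNDARY_BYTES, sector=SECTOR_BYTES):
    """Return True if a partition starting at start_lba begins on a boundary."""
    return (start_lba * sector) % boundary == 0


for lba in (63, 128):
    offset = lba * SECTOR_BYTES
    print(f"LBA {lba}: byte offset {offset}, 64KB aligned: {is_aligned(lba)}")
```

Under these assumptions LBA 63 lands at byte 32256, which is not a multiple of 65536, while LBA 128 lands exactly on 65536.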
Steps to align the partition manually – Linux/ESX
1. On the service console, execute "fdisk /dev/sd<x>"
   (/dev/sd<x> is the device on which you would like to create the VMFS)
2. Type "n" to create a new partition
3. Type "p" to create a primary partition
4. Type "1" to create partition #1
5. Select the defaults to use the complete disk
6. Type "x" to get into expert mode
Storage Volume Considerations
Steps to align the partition manually – Linux/ESX
7. Type "b" to specify the starting block for partitions
8. Type "1" to select partition #1
9. Type "128" to make partition #1 align on a 64KB boundary
10. Type "r" to return to the main menu
11. Type "t" to change the partition type
12. Type "1" to select partition 1
13. Type "fb" to set the type to fb (VMFS volume)
14. Type "w" to write the label and the partition information to disk
Storage Volume Considerations
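For reference, the fdisk steps can be written down as an ordered keystroke list. This is only a sketch: it prints the sequence and does not run fdisk. It assumes that "select the defaults" means pressing Enter at two prompts (start and end of the partition), which may differ between fdisk versions. The names `FDISK_STEPS` and `keystroke_script` are hypothetical.

```python
# Keystrokes from the manual alignment steps, in order. An empty string
# stands for pressing Enter to accept a default (assumed: two prompts).
FDISK_STEPS = [
    ("n", "create a new partition"),
    ("p", "primary partition"),
    ("1", "partition number 1"),
    ("", "accept default start"),
    ("", "accept default end"),
    ("x", "enter expert mode"),
    ("b", "specify the starting block"),
    ("1", "select partition 1"),
    ("128", "start at LBA 128 (64KB boundary)"),
    ("r", "return to the main menu"),
    ("t", "change the partition type"),
    ("1", "select partition 1"),
    ("fb", "set type fb (VMFS volume)"),
    ("w", "write the partition table to disk"),
]


def keystroke_script(steps=FDISK_STEPS):
    """Join the keystrokes into one newline-terminated string."""
    return "".join(key + "\n" for key, _ in steps)


for key, meaning in FDISK_STEPS:
    print(f"{key or '<Enter>':8} {meaning}")
```

Keeping the meaning next to each key makes it easier to review the sequence before typing it on a real device, since step 14 writes to disk.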
Do we also need to align the partitions with the Guest OS?
The VI client for ESX 3 and VC 2.x automatically aligns VMFS-3 file systems. However, there has been a lot of discussion around partition alignment within the Guest OS (Windows/Linux) on forums and mailing aliases. The general consensus is yes: you also have to align the Guest OS partitions to get the full performance advantage of partition alignment on ESX.
Storage Volume Considerations
Partition alignment in the Guest OS:
Linux: same procedure as discussed previously. The starting partition is at LBA=63, C/H/S of 0/0/63. Move it to LBA=128.
Windows: the starting partition is also misaligned by default, since the offset is 32256 bytes (LBA=63) and not 32768 bytes. For this, consider using diskpart, a standard Windows tool.
More information on track alignment can be found at http://www.vmware.com/pdf/esx3_partition_align.pdf.
Storage Volume Considerations
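One way to see why a misaligned start costs performance is to count how many 64KB chunks a single 64KB I/O touches on the underlying device. The sketch below assumes 512-byte sectors and a 64KB chunk size taken from the slides; real arrays may use other sizes, and the function name is illustrative.

```python
SECTOR_BYTES = 512         # assumed sector size
CHUNK_BYTES = 64 * 1024    # assumed chunk/track size from the slides


def chunks_touched(partition_start_lba, io_offset_bytes, io_size_bytes):
    """Count the 64KB chunks covered by an I/O issued inside a partition."""
    first_byte = partition_start_lba * SECTOR_BYTES + io_offset_bytes
    last_byte = first_byte + io_size_bytes - 1
    return last_byte // CHUNK_BYTES - first_byte // CHUNK_BYTES + 1


for lba in (63, 128):
    print(f"start LBA {lba}: a 64KB I/O touches "
          f"{chunks_touched(lba, 0, CHUNK_BYTES)} chunk(s)")
```

With a start at LBA 63 the I/O straddles a chunk boundary and touches two chunks; with a start at LBA 128 it touches one.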
1. SCSI reservations preserve the integrity of the VMFS filesystem when more than one host is updating it.
2. ESX currently uses SCSI-2 non-persistent reservations, which means that only a single host can reserve the LUN in question at any one time. A reboot of that host will clear the reservation on the LUN (non-persistent).
SCSI Reservations – Why?
Nov 15 07:48:05 frasvmhst06 vmkernel: 0:12:44:32.598 cpu4:1046)SCSI: vm 1046: 5509: Sync CR at 64
Nov 15 07:48:06 frasvmhst06 vmkernel: 0:12:44:33.520 cpu4:1046)SCSI: vm 1046: 5509: Sync CR at 48
Nov 15 07:48:07 frasvmhst06 vmkernel: 0:12:44:34.512 cpu4:1046)SCSI: vm 1046: 5509: Sync CR at 32
Nov 15 07:48:08 frasvmhst06 vmkernel: 0:12:44:35.482 cpu4:1046)SCSI: vm 1046: 5509: Sync CR at 16
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)SCSI: vm 1046: 5509: Sync CR at 0
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)WARNING: SCSI: 5519: Failing I/O due to too many reservation conflicts
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)WARNING: SCSI: 5615: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba2:0:3. residual R 919, CR 0, ER 3
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)FSS: 343: Failed with status 0xbad0022 for f530 28 2 45378334 5f8825b2 17007a7d 1d624ca4 4 1 0 0 0 0 0
SCSI Reservations – sample log
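When scanning a vmkernel log for this pattern, it can help to count reservation conflicts per device. The sketch below is an illustration only: it assumes the message format shown in the sample log, and the regular expression and function name are not part of any VMware tool.

```python
import re
from collections import Counter

# Three lines copied from the sample log above.
SAMPLE_LOG = """\
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)SCSI: vm 1046: 5509: Sync CR at 0
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)WARNING: SCSI: 5519: Failing I/O due to too many reservation conflicts
Nov 15 07:48:09 frasvmhst06 vmkernel: 0:12:44:36.490 cpu2:1046)WARNING: SCSI: 5615: status SCSI reservation conflict, rstatus 0xc0de01 for vmhba2:0:3. residual R 919, CR 0, ER 3
"""

# Matches the "status SCSI reservation conflict ... for vmhbaA:T:L" message.
CONFLICT_RE = re.compile(
    r"reservation conflict, rstatus \S+ for (vmhba\d+:\d+:\d+)"
)


def conflicts_by_device(log_text):
    """Count reservation-conflict status messages per vmhba device."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CONFLICT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


print(dict(conflicts_by_device(SAMPLE_LOG)))
```

A device that shows up repeatedly in this count is a candidate for the best practices listed later in the deck.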
SCSI Reservations -- Conditions
1. Virtual Machine Disk Files
When a VMDK is created, deleted, placed in REDO mode, has a snapshot (delta) file, is migrated (reservations from the source ESX and from the target ESX), or when the VM is suspended (since a suspend file is written).
If the VMDK is created via a template, then we get SCSI reservations on the source and the target.
If a template is created from a VMDK.
2. Service Console
When the file system is modified via fdisk or similar, when vm-support is run from the Service Console, or when vdf is run from the Service Console after a modification to the file system is made.
3. When a file on the VMFS has its ownership, permissions, size, access times, or modification times changed
However once a .vmdk is file-locked and owned, these changes can take place without any additional SCSI Reservations.
4. When a LUN is first attached to ESX.
VMotion operations
SCSI Reservations -- Conditions
1. Simplify deployments so that a VM does not SPAN more than one LUN.
This will ensure that a SCSI Reservation doesn’t impact more than one LUN.
2. Before starting an operation that may cause a SCSI Reservation, determine whether any other operations are already running on that LUN.
VCB backups, VMotion operations, template deployments, etc.
3. Choose one ESX Server as your deployment server to avoid conflicts with multiple ESX servers trying to deploy templates.
SCSI Reservations – Best Practices
4. Inside Virtual Center, limit access to the administrative operations so that you control who can enact an operation.
5. Schedule each VM reboot so that there is only one reboot per LUN at any given time.
A power-off and a power-on are considered separate operations.
6. Use care when scheduling backups.
Verify that there is no current backup operation happening.
7. Scaling (number of VMs and queue length)
SCSI Reservations – Best Practices
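The "one reboot per LUN at any given time" practice can be sketched as a simple slotting problem. The example below is hypothetical: the VM and LUN names are made up, and the function only groups VMs into sequential slots. It does not talk to Virtual Center or ESX.

```python
from collections import defaultdict


def schedule_reboots(vm_to_lun):
    """Group VMs into sequential slots so no slot has two VMs on one LUN."""
    per_lun = defaultdict(list)
    for vm, lun in sorted(vm_to_lun.items()):
        per_lun[lun].append(vm)

    # The busiest LUN determines how many slots are needed.
    slot_count = max((len(vms) for vms in per_lun.values()), default=0)
    slots = []
    for i in range(slot_count):
        slots.append(
            [vms[i] for _, vms in sorted(per_lun.items()) if i < len(vms)]
        )
    return slots


# Hypothetical inventory: two VMs on lun1, one VM on lun2.
example = {"vm-a": "lun1", "vm-b": "lun1", "vm-c": "lun2"}
for number, slot in enumerate(schedule_reboots(example), start=1):
    print(f"slot {number}: {slot}")
```

Because a power-off and a power-on count as separate operations, each slot would need to finish both before the next slot starts.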
Understanding the Network Storage Stack SCSI Queuing and Throttling
SCSI is a connect/disconnect protocol, so the array can make certain optimizations.
Wait queue - I/Os buffered in the HBA/sd queue
Active queue - I/Os buffered in the storage array
Service queue - I/Os being serviced on the disk (read miss) or cache (read hit, or fast write)
IO QUEUING
GOS (VM) VMKERNEL HBA Storage Processor Disk
IO QUEUING
1 Barrel can hold 64 liters of wine
GOS (VM) VMKERNEL HBA Storage Processor Disk
HKLM,SYSTEM\CurrentControlSet\Services\Lsi_fc\Parameters\Device,MaximumTargetQueueDepth,0x00010001,0x40
IO QUEUING
GOS (VM) VMM VMKERNEL HBA Storage Processor Disk
1. GOS (VM): 64 (default) for Windows GOS
2. VMM: 512 (4x128 per virtual SCSI bus)
3. VMkernel: thousands
4. HBA: thousands of IOs can be queued per HBA
   Emulex: 30 default per LUN (max of 128)
   Qlogic: 32 default per LUN (max of 128)
5. Storage Processor: thousands
6. Disk: 64 per disk
IO QUEUING
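The per-layer numbers suggest where I/Os pile up. The sketch below is a rough worst-case illustration. It assumes every VM shares one LUN through one HBA and keeps its guest queue full, and it uses only the guest default (64) and the QLogic per-LUN default (32) from the list. Layers described as "thousands" are left out because no exact number is given. The function name is hypothetical.

```python
GUEST_QUEUE_DEPTH = 64    # Windows guest default, from the list above
HBA_LUN_QUEUE_DEPTH = 32  # QLogic default per LUN, from the list above


def outstanding_vs_lun_depth(num_vms,
                             guest_depth=GUEST_QUEUE_DEPTH,
                             lun_depth=HBA_LUN_QUEUE_DEPTH):
    """Return (potential, sent_to_lun, waiting) I/O counts in the worst case."""
    potential = num_vms * guest_depth       # I/Os the guests could issue
    sent_to_lun = min(potential, lun_depth) # I/Os the HBA allows in flight
    waiting = potential - sent_to_lun       # I/Os held back above the HBA
    return potential, sent_to_lun, waiting


print(outstanding_vs_lun_depth(10))
```

For 10 VMs this gives 640 potential I/Os, 32 in flight to the LUN and 608 waiting, which is why the per-LUN HBA depth is usually the first value to look at.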
Which driver module?
# esxcfg-module -l | grep -i ql
qla2300_707 fc true true
What is the current Q-depth?
# cat /proc/scsi/qla2300/? | grep -i "queue depth"
Device queue depth = 0x20
Device queue depth = 0x20
# grep -i Queue /var/log/initrdlogs/vmklog.qla2300_707
0:00:00:29.948 cpu3:1033)LinSCSI: 689: Queue depth for device vmhba1:0:0 is 32
0:00:00:29.948 cpu3:1033)LinSCSI: 689: Queue depth for device vmhba1:0:1 is 32
IO QUEUING
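The vmklog lines can be turned into a device-to-depth mapping for quick review. This is a minimal sketch that assumes the "Queue depth for device ... is N" message format shown above; the function name is illustrative.

```python
import re

# Two lines copied from the vmklog output above.
VMKLOG_LINES = [
    "0:00:00:29.948 cpu3:1033)LinSCSI: 689: Queue depth for device vmhba1:0:0 is 32",
    "0:00:00:29.948 cpu3:1033)LinSCSI: 689: Queue depth for device vmhba1:0:1 is 32",
]

DEPTH_RE = re.compile(r"Queue depth for device (\S+) is (\d+)")


def depths_from_vmklog(lines):
    """Map each device name to the queue depth reported in the log."""
    depths = {}
    for line in lines:
        match = DEPTH_RE.search(line)
        if match:
            depths[match.group(1)] = int(match.group(2))
    return depths


print(depths_from_vmklog(VMKLOG_LINES))
```

The decimal 32 in the log corresponds to the 0x20 reported under /proc/scsi.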
Change Queue Depth to 64: (reboot required)
# esxcfg-module -s ql2xmaxqdepth=64 qla2300_707
# esxcfg-boot -b
# reboot
After reboot: check it!
# cat /proc/scsi/qla2300/* | grep -i "queue depth"
Device queue depth = 0x40
Device queue depth = 0x40
# grep -i Queue /var/log/initrdlogs/vmklog.qla2300_707
0:00:00:29.753 cpu1:1033)LinSCSI: 689: Queue depth for device vmhba1:0:0 is 64
0:00:00:29.762 cpu1:1033)LinSCSI: 689: Queue depth for device vmhba2:3:0 is 64
# esxcfg-module -g qla2300_707
qla2300_707 enabled = 1 options = 'ql2xmaxqdepth=64'
IO QUEUING
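After the reboot, the configured option and the reported depth should agree. The sketch below compares the two, assuming the output formats shown above (a decimal value in the module options, hexadecimal values under /proc/scsi). The function names are hypothetical.

```python
import re


def configured_depth(module_options):
    """Extract the ql2xmaxqdepth value from a module options string."""
    match = re.search(r"ql2xmaxqdepth=(\d+)", module_options)
    return int(match.group(1)) if match else None


def reported_matches(module_options, proc_lines):
    """Check that every reported (hex) queue depth equals the configured one."""
    expected = configured_depth(module_options)
    reported = [int(line.split("=")[1], 16) for line in proc_lines]
    return expected is not None and all(d == expected for d in reported)


options = "ql2xmaxqdepth=64"
proc_output = ["Device queue depth = 0x40", "Device queue depth = 0x40"]
print(reported_matches(options, proc_output))
```

A mismatch would suggest the option was not picked up, for example because the boot configuration was not rebuilt before rebooting.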
Commands queued in VMkernel
Commands sent to driver
Adapter max. queue length
Device max. queue length
World max. queue length
Queuing in VMkernel – where to look for numbers
Use esxtop and press d to show disk statistics
IO QUEUING
LUN max. queue length
- Active and queued I/Os for this VM
IO QUEUING
Queuing per VM ...
If using BusLogic with Windows 2000, 2003 or XP, the BusLogic driver queue depth is limited to 1 on these OSs. This can impact the performance of your VM.
http://kb.vmware.com/kb/1614 documents the issue.
The recommendation is to use LSILogic rather than BusLogic drivers.
IO QUEUING
1. Check cabling, zoning, and LUN presentation against drafted configuration diagram
2. Verify all paths are operational
Check HBA, switch and array LEDs
Verify HBA ports are initialized at correct speed
Verify switch port is initialized at correct speed
Verify correct array state
Verify topology
3. Collect vm-support
Troubleshooting -- Basic
1. Use Virtual Center and check paths state
2. Use command line and check paths state
esxcfg-mpath -l
esxcfg-vmhbadevs
3. Look for clues in logs
/var/log/messages
/var/log/vmkernel
4. Collect vm-support
Troubleshooting -- Intermediate
1. Switch logs
Link or connectivity failures
High level of CRC errors
2. Array controller logs
I/O errors
High level of CRC errors
Link or connectivity errors
3. Fibre Channel trace
4. Collect vm-support with options
Troubleshooting -- Advanced
You can run vm-support on an ESX Server when logged in as "root".
1. For gathering general debugging information
Run vm-support with "-n" or "-a" options
2. For gathering performance information
Run vm-support with "-s" or "-S" or "-d" or "-i"
3. For gathering information about a specific virtual machine
Run vm-support with "-x" or "-X"
See the vm-support man page for more information.
Troubleshooting – advanced vm-support options