Francis Daly & Javier Menendez

STO1315BU

#VMworld #STO1315BU

vSAN Troubleshooting Deep Dive

VMworld 2017 Content: Not for publication or distribution

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

#STO1315BU CONFIDENTIAL 2

Agenda

1 Using common sense

2 vSAN Overview

3 vSAN Tools

4 Health

5 Use cases

6 Questions

Plan Your Build

Ensure You Have Backups (That Have Been Tested)

vSAN Cluster – The World is Your Oyster

[Diagram: Host 1, Host 2, Host 3 … Host 64 connected over 10GbE; each host holds disk groups 1 to 5, each with one SSD cache device and capacity devices 1 to 7 (SSD or HDD)]

• Cluster: 2-64 physical hosts

• Host: 1-5 disk groups

• Disk Group:

– 1 flash device for cache

– 1-7 flash or HDD devices for capacity
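
The limits above can be checked mechanically before a build. The following is a minimal sketch, assuming a hand-written inventory; the `DiskGroup` and `Host` classes and the `layout_problems` function are hypothetical names for illustration, not part of any vSAN tool, and the only limits encoded are the ones listed on this slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskGroup:
    cache_devices: int      # flash devices used for cache
    capacity_devices: int   # flash or HDD devices used for capacity


@dataclass
class Host:
    name: str
    disk_groups: List[DiskGroup]


def layout_problems(hosts: List[Host]) -> List[str]:
    """Return a list of violations of the limits quoted on the slide."""
    problems = []
    # Cluster: 2-64 physical hosts
    if not 2 <= len(hosts) <= 64:
        problems.append(f"cluster has {len(hosts)} hosts, expected 2-64")
    for host in hosts:
        # Host: 1-5 disk groups
        if not 1 <= len(host.disk_groups) <= 5:
            problems.append(
                f"{host.name}: {len(host.disk_groups)} disk groups, expected 1-5"
            )
        for index, group in enumerate(host.disk_groups, start=1):
            # Disk group: exactly 1 flash device for cache
            if group.cache_devices != 1:
                problems.append(
                    f"{host.name} group {index}: {group.cache_devices} cache devices, expected 1"
                )
            # Disk group: 1-7 capacity devices
            if not 1 <= group.capacity_devices <= 7:
                problems.append(
                    f"{host.name} group {index}: {group.capacity_devices} capacity devices, expected 1-7"
                )
    return problems
```

An empty list means the planned layout stays within the slide's limits; it says nothing about hardware compatibility, which still has to be checked separately.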

Important Considerations

• Ensure all hardware is supported to avoid performance degradation

– Disks/Controllers/Firmware/Drivers

• Objects:

– VM Home, VM Swap, VMDK

– Delta Disk, Memory Delta

[Diagram: vSphere and vSAN, backed by the vSAN datastore]

What if I Choose Incorrect Policy?

• Applied at the VM level or the VMDK level

• Defines protection level and performance

vSAN Objects and Components

• The vSAN datastore is an object store

• Each object made up of one or more components

• Policy will determine how components are distributed across cluster

[Diagram: Keeper_VM with a 200GB object, FTT=1: a RAID-1 mirror over RAID-0 replicas with components C1 and C2, plus a witness (W)]

vSAN Objects and Components

• The vSAN datastore is an object store

• Object store allows you to meet granular availability and performance requirements

• Each object made up of one or more components

• Data (components) is distributed across cluster based on VM storage policy

[Diagram: 200GB object, FTT=1: a RAID-1 mirror over RAID-0 replicas with components C1 and C2, plus a witness (W)]

RAID-1 requires 2n+1 hosts

Need >50% of components for object to remain active (quorum)
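
The two rules on this slide reduce to simple arithmetic. The sketch below is illustrative only: `hosts_required` and `has_quorum` are hypothetical helper names, and counting one vote per component is a simplifying assumption.

```python
def hosts_required(failures_to_tolerate: int) -> int:
    """RAID-1 needs 2n+1 hosts to tolerate n failures."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be non-negative")
    return 2 * failures_to_tolerate + 1


def has_quorum(available_components: int, total_components: int) -> bool:
    """An object stays active only with strictly more than 50% of its components."""
    return 2 * available_components > total_components
```

With FTT=1 and the simplest layout (two replicas plus one witness), `hosts_required(1)` gives 3. Losing one of the three components leaves 2 of 3, so the object stays active; losing two leaves 1 of 3, so it does not.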

Network Partition

• The vSAN datastore is an object store

• Object store allows you to meet granular availability and performance requirements

• Each object made up of one or more components

• Data (components) is distributed across cluster based on VM storage policy

[Diagram: 200GB object, FTT=1: a RAID-1 mirror over RAID-0 replicas with components C1 and C2, plus a witness (W)]

Witness or votes are used as tie breaker if 2N+1 not satisfied

vSAN Object Component States

[Diagram: FTT=1 object: a RAID-1 mirror over two RAID-0 stripes, each with components C1, C2 and C3, plus a witness (W)]

• Active. Component accessible

• Absent. Inaccessible, but no explicit error codes sensed

– Ex: Host outage, or EMM with “ensure accessibility”

– Rebuild begins after 60 minute timeout window

• Degraded. Inaccessible, with error codes sensed

– Ex: Device failure

– Rebuild begins immediately.

• Can check state in UI, CLI, RVC and more
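
The difference between the states is mainly when a rebuild starts. A minimal sketch of that rule follows; the `ComponentState` enum and `rebuild_delay_minutes` function are hypothetical names, and the 60-minute value is the default quoted on this slide.

```python
from enum import Enum
from typing import Optional


class ComponentState(Enum):
    ACTIVE = "active"        # component accessible
    ABSENT = "absent"        # inaccessible, no explicit error codes sensed
    DEGRADED = "degraded"    # inaccessible, error codes sensed


# Timeout window from the slide; treated here as a fixed default.
ABSENT_REBUILD_DELAY_MINUTES = 60


def rebuild_delay_minutes(state: ComponentState) -> Optional[int]:
    """Minutes before a rebuild begins, or None if no rebuild is needed."""
    if state is ComponentState.ACTIVE:
        return None
    if state is ComponentState.ABSENT:
        return ABSENT_REBUILD_DELAY_MINUTES
    # DEGRADED: rebuild begins immediately
    return 0
```

A host outage therefore maps to `ABSENT` and a 60-minute wait, while a device failure maps to `DEGRADED` and an immediate rebuild.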

Tools

Useful Tools for Trouble-shooting

Health Check

RVC

ESXCLI

vRealize Ops

vSAN Observer

ESXCLI

VC Unavailable – Now What?

Health Cluster List

Investigating Yellow State

• Threshold is 30%

• Disk A = 25%, Disk B = 60%, Delta = 35%

• Rebalance required

• Consider performance impact
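
The arithmetic behind this example can be sketched as follows. `needs_rebalance` is a hypothetical helper, not a vSAN command, and it assumes a rebalance is called for when the gap between the fullest and emptiest disk exceeds the threshold.

```python
from typing import Dict, Tuple


def needs_rebalance(
    usage_percent: Dict[str, float], threshold: float = 30.0
) -> Tuple[float, bool]:
    """Return (delta, rebalance_required) for per-disk usage percentages."""
    if not usage_percent:
        raise ValueError("at least one disk is required")
    # Delta is the spread between the fullest and the emptiest disk.
    delta = max(usage_percent.values()) - min(usage_percent.values())
    return delta, delta > threshold
```

For the slide's numbers, `needs_rebalance({"Disk A": 25.0, "Disk B": 60.0})` yields a delta of 35.0, which is above the threshold of 30, so a rebalance is required.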

Where are the Most Serious Issues in my Cluster?

What is an object? vmdk, vswp, etc.

So What Happened?

Diving Deeper

Fundamentals of vSAN Resyncs

Resync is like DNA Replication!

Resync Summary

Where else can I see this?

vSphere Web Client

RVC: Resync Dashboard

Log Insight

vSAN Observer

Resync Summary

Do not

• Reboot hosts/Maintenance mode

• Remove hosts/disks

Do

• Configure hosts to reboot after PSOD

• Increase default clomd repair time

vSAN Debug List

Are my disks keeping up?

• Network: VM IO workload exceeds bandwidth

• Resync: Throttling will prioritize VM IOs

Performance Bottlenecks

Fast disks but still seeing large queues and failed IOs

• Manage expectations

• Controller queue depth

• SSD grade

• Network setup

RVC

vsan.check_state

• What state is my cluster in?

• I want to quickly know if there is anything serious happening in my cluster

vsan.disks_stats 0

KB 2145267 – Understand vSAN on-disk format

vsan.cluster_info

Health UI

Cluster Level Quick Peek

CEIP will need to be enabled

Issues at Cluster Level

vSAN Disk Balance

What to Do if Rebalance is Required?

vSAN Disk Balance

What Happens During a Rebalance?

Rebalance in Action

Post Rebalance

Health Service CLI

vCenter vSAN Health Status Script

• Provides a text output for all the Health Tests

• Might be useful to run on the vCenter as the log bundles are being collected

• Manually copy off the text file

python /usr/lib/vmware-vpx/vsan-health/vsan-vc-health-status.py > /tmp/vsan-vc-health-status.txt

vCenter vSAN Health Status Script

What else does it do for me?

Runs the following RVC commands and collects their output

• vsan.cluster_info on each cluster

• vsan.host_info for each host in every cluster

• vsan.vm_object_info on a host-per-host basis

• vsan.disks_info for each host on a per-cluster basis

• vsan.disks_stats

• vsan.check_limits

• vsan.check_state

• vsan.lldpnetmap

• vsan.obj_status_report

• vsan.resync_dashboard

• vsan.disk_object_info

How much time would that save you on a call?

ESXi vSAN Health Status Script

What is the vSAN Health Status script?

• Command-line tool to display the vSAN health for a particular node

Where do I find it?

• On each host individually, not vCenter

• /usr/lib/vmware/vsan/bin/vsan-health-status.pyc

How do I run it?

• [root@esxi] python /usr/lib/vmware/vsan/bin/vsan-health-status.pyc

See KB 2107705

When to Use it

When vCenter is down

• The Health Service on vCenter is not available

• You can check the health of individual nodes and their components

• Output is quite different to vCenter version

ESXi vSAN Health Status script – What Will I Find in the Output

vSAN HCL related hardware info

• Displays key information about controllers to check against HCL

Limits summary

• How many components vs limit on the host

• How much space is consumed by components

Network summary

• Subnet used by vSAN

• Multicast addresses

Physical vSAN disk summary

• Info about physical disks including cmmds UUIDs

Missing Disks

I Added Hosts to my Cluster, Not Seeing Space

Which Tools/Logs Do I Need

Track down missing MDs

What is vobd.log?

What is boot.log?

What happens in cmmds?

esxcli vsan storage list – check if in cmmds
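
Checking the "In CMMDS" field by eye gets tedious on hosts with many disks. The sketch below filters saved command output for disks that are not in CMMDS. It assumes the output consists of `Key: Value` lines with a `Device` field followed by an `In CMMDS` field for each disk; `disks_not_in_cmmds` is a hypothetical helper, and the exact field layout should be confirmed against real output from your build.

```python
from typing import List


def disks_not_in_cmmds(storage_list_output: str) -> List[str]:
    """Return the device IDs whose 'In CMMDS' field is false."""
    missing = []
    current_device = None
    for raw_line in storage_list_output.splitlines():
        key, separator, value = raw_line.strip().partition(":")
        if not separator:
            continue  # not a 'Key: Value' line
        key, value = key.strip(), value.strip()
        if key == "Device":
            # Start of a new disk entry (assumed field name)
            current_device = value
        elif key == "In CMMDS" and value.lower() == "false":
            if current_device is not None:
                missing.append(current_device)
    return missing
```

One way to use it is to redirect the command output to a file on the host, copy the file off, and pass its contents to the function; the returned IDs are the ones to search for in the logs.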

esxcli vsan storage list

What to Do Next

Take a disk naa ID that is not “In CMMDS”

cd /var/run/log

less vobd.log

/”Disk UUID 12345678”

Disk not found

What to Do Next

cd /var/log

cat boot.log | less

/”Disk UUID 12345678”

LVM: 8355:Device naa.”Disk UUID 12345678” detected to be a snapshot:
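
The same search can be run offline against log files copied from the host. This is a minimal sketch, assuming the logs are plain text; `lines_mentioning` is a hypothetical helper and `12345678` is the placeholder disk ID used on these slides.

```python
from typing import Iterable, List, Tuple


def lines_mentioning(log_lines: Iterable[str], disk_id: str) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for every line containing disk_id."""
    matches = []
    for number, line in enumerate(log_lines, start=1):
        if disk_id in line:
            matches.append((number, line.rstrip("\n")))
    return matches


# Example with a copied log file (the file name is an assumption):
# with open("boot.log", encoding="utf-8", errors="replace") as handle:
#     for number, line in lines_mentioning(handle, "12345678"):
#         print(number, line)
```

Because the function accepts any iterable of lines, the same call works for vobd.log and boot.log, which makes it easy to compare what each log says about the same disk.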

What Happened and Resolution

The disk cannot be added to the cluster because there is a file system on it

The disk already had a UUID: it was in a cluster and had been used at some point

Verified with the customer that we could delete the data

Used partedUtil to remove the partitions

Deleted the disk groups and created new disk groups

Poor VM performance

Some VMs on the vSAN datastore are performing poorly, which has resulted in effective data unavailability. Performance improved after some vSAN disks entered a failed state.

What Happened to the Disk?

less vmkernel.log | grep failed

Check the sense codes – scsi decoder

What Else Happened? Controllers

Host ID

Sense Code Output

What Happened?

Three disks failed in a short space of time

Controllers were aborting

What would you do?

VMware vSAN Training

Training

VMware vSAN: Deploy and Manage [V6.6]

• Classroom

• Live Online

• Onsite

More Options

On Demand Courses

• Self-paced learning

• Meets certification requirements

VMware Learning Zone

• vSAN Troubleshooting

• Cloud-based learning

• Supplements traditional training

View complete options and descriptions: www.vmware.com/education

Global Support Services

Learn more about how VMware is radically transforming Customer Support through VMware Skyline™ technology.

• See a demo in the VMware booth in the Solutions Exchange

• Sign up for a Meet the Experts roundtable in the Schedule Builder on the VMworld mobile app or visit the Meet the Experts, Level 2

• Visit www.vmware.com/support/service/skyline

Additional Education Resources

At VMworld 2017

• Education & Certification Lounge: VM Village

• Certification Exam Center: Jasmine EFG, Level 3

Online

• VMware Training: www.vmware.com/education

• VMware Certification: www.vmware.com/certification

Save 50% off VCP & VCAP exams at VMworld 2017

Prove Your Expertise with VMware Digital Badges

What are VMware Digital Badges?

• Single source that combines your credential and a complete overview of your skills

• Way for you to easily share your accomplishments in social media

• Provides employers with easy, valid verification of VMware credentials

Digital Badges Online

• vmware.com/go/vSAN2017badge

• vmware.com/go/VxRail2017badge

VMware vSAN 2017 Specialist

Dell EMC VMware Co-Skilled - VxRail 2017

NEW! VMware Digital Badges

Prove your expertise on vSAN and vSAN HCI environments
