vSAN – Architecture and Design


Virtual SAN Design & Architecture

Simone Di Mambro Sr. Presales System Engineer [email protected]

Who I am and why I'm here

• I have been part of the Presales team since July 2016

• 4 years at VMware in Professional Services

• Passions I have developed in life:

– SmartWorld and the like (…yes, go ahead and consider me a Nerd!)

– Dieting!

• "Life motivator" for friends

– Runner (to lose weight, but also to enjoy the occasional treat!)

2

3

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

4

The VMware Software-Defined Storage Vision: Transforming Storage the Way Server Virtualization Transformed Compute

VMware® vSphere® Storage Policy-Based Mgmt

• App-centric storage automation

• Common mgmt across heterogeneous arrays

VMware® Virtual SAN™

• Hyper-converged architecture

• Data persistence delivered from the hypervisor

Tiered Hybrid and All-Flash Options

5

All-Flash: 90K IOPS per host with sub-millisecond latency

• Caching tier: flash devices (SSD, PCIe, Ultra DIMM); writes are cached first, reads go directly to the capacity tier

• Capacity tier (data persistence): flash devices

Hybrid: 40K IOPS per host

• Caching tier: flash devices (SSD, PCIe, Ultra DIMM) used as read and write cache

• Capacity tier (data persistence): SAS / NL-SAS / SATA magnetic disks

6

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

Deduplication and Compression for Space Efficiency

• Near-line deduplication and compression at the disk group level.

– Collectively referred to as "Space Efficiency"

• Space Efficiency is enabled at the cluster level

• Data is deduplicated when de-staging from the cache tier to the capacity tier

– Fixed block-length deduplication (4 KB blocks)

• Data is compressed after deduplication (see the sketch after the diagram below)

7

(Beta, All-Flash only. Diagram: three ESXi hosts, esxi-01 to esxi-03, with vmdk objects on vSphere & Virtual SAN.)
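Purely as a conceptual sketch of the fixed block-length dedupe-then-compress idea (this is not vSAN's on-disk implementation; the hashing and data are illustrative), the following Python snippet splits a buffer into 4 KB blocks, deduplicates identical blocks by hash, and compresses only the unique blocks:

import hashlib
import zlib

BLOCK_SIZE = 4096  # vSAN 6.2 deduplicates fixed 4 KB blocks

def space_efficiency(data: bytes):
    """Conceptual sketch: dedupe 4 KB blocks by hash, then compress the unique blocks."""
    unique = {}  # block hash -> block contents
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        unique.setdefault(hashlib.sha256(block).hexdigest(), block)
    # Compression happens only after deduplication, on the unique blocks
    compressed = sum(len(zlib.compress(b)) for b in unique.values())
    return len(data), len(unique) * BLOCK_SIZE, compressed

if __name__ == "__main__":
    raw, deduped, stored = space_efficiency(b"A" * 4096 * 100 + b"B" * 4096 * 20)
    print(f"logical {raw} B -> after dedupe {deduped} B -> after compression {stored} B")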

RAID-5 Erasure Coding

• With "Failures to tolerate = 1" (FTT=1), availability can be provided via RAID-5

– 3+1 layout (4-host minimum)

– 1.33x overhead instead of 2x

• A 20 GB disk that would normally consume 40 GB now takes just ~27 GB

8

(Diagram: RAID-5, All-Flash only. Data and parity components are distributed across four ESXi hosts, with the parity component rotating from host to host.)

RAID-6 Erasure Coding

• With "Failures to tolerate = 2" (FTT=2), availability can be provided via RAID-6

– 4+2 layout (6-host minimum)

– 1.5x overhead instead of 3x

• A 20 GB disk that would normally consume 60 GB now takes just ~30 GB (see the calculation sketch after the diagram)

9

(Diagram: RAID-6, All-Flash only. Data and double parity components are distributed across six ESXi hosts, with the parity components rotating from host to host.)
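A back-of-the-envelope sketch of where the 1.33x and 1.5x figures come from, using the 20 GB vmdk from the slides (Python, values are illustrative):

def raw_capacity_needed(vmdk_gb: float, data_blocks: int, parity_blocks: int) -> float:
    """Raw capacity consumed by one vmdk for a given erasure-coding layout."""
    overhead = (data_blocks + parity_blocks) / data_blocks
    return vmdk_gb * overhead

vmdk = 20  # GB
print("RAID-1 FTT=1 (2 full copies):", vmdk * 2, "GB")                          # 40 GB
print("RAID-5 FTT=1 (3+1):", round(raw_capacity_needed(vmdk, 3, 1), 1), "GB")   # ~26.7 GB
print("RAID-1 FTT=2 (3 full copies):", vmdk * 3, "GB")                          # 60 GB
print("RAID-6 FTT=2 (4+2):", round(raw_capacity_needed(vmdk, 4, 2), 1), "GB")   # 30 GB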

Virtual SAN Quality of Service

10

• Complete visibility into IOPS consumed per VM / virtual disk

• One-click configuration of IOPS limits

• Eliminates noisy-neighbor issues

• Granularly manage performance SLAs, independent of VM provisioning order (a conceptual sketch follows below)

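The IOPS limit itself is set through the storage policy; purely to illustrate how a per-object cap tames a noisy neighbor, here is a minimal token-bucket sketch in Python (not vSAN's actual scheduler):

import time

class IopsLimiter:
    """Minimal token-bucket sketch of a per-object IOPS cap (illustrative only)."""
    def __init__(self, limit_iops: int):
        self.limit = limit_iops
        self.tokens = float(limit_iops)
        self.last = time.monotonic()

    def allow_io(self) -> bool:
        now = time.monotonic()
        # Refill tokens proportionally to elapsed time, capped at one second's worth
        self.tokens = min(self.limit, self.tokens + (now - self.last) * self.limit)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the noisy neighbor is throttled instead of starving other VMs

noisy_vm = IopsLimiter(limit_iops=500)
accepted = sum(noisy_vm.allow_io() for _ in range(10_000))
print(f"{accepted} of 10000 back-to-back I/Os admitted under a 500 IOPS limit")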

Enhanced Virtual SAN Management with New Health Service

Built-in performance monitoring

Health and performance APIs and SDK

Storage capacity reporting

And many more health checks…

(Screenshots: Performance Monitoring, Capacity Monitoring)

11

12

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

Virtual SAN Disk Groups

• Virtual SAN uses the concept of disk groups to pool flash devices and magnetic disks together into single management constructs

• Disk groups are composed of at least 1 flash device and 1-7 capacity devices

– Flash devices are used as read cache / write buffer

– Storage capacity can be provided by either magnetic disks or flash-based devices

– Disk groups cannot be created without a flash device

– Up to 5 disk groups per host = up to 35 capacity devices per host (see the sizing sketch below)

13

(Diagram) Each host: 5 disk groups max. Each disk group: 1 flash device + 1-7 capacity devices
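A small sizing sketch of how those per-host maximums add up; the 1.2 TB capacity device size below is an assumption for illustration:

MAX_DISK_GROUPS_PER_HOST = 5
MAX_CAPACITY_DEVICES_PER_GROUP = 7

def host_raw_capacity_tb(disk_groups: int, devices_per_group: int, device_tb: float) -> float:
    """Raw capacity contributed by one host; cache devices do not add to capacity."""
    assert disk_groups <= MAX_DISK_GROUPS_PER_HOST
    assert devices_per_group <= MAX_CAPACITY_DEVICES_PER_GROUP
    return disk_groups * devices_per_group * device_tb

# Example: a fully populated host with 1.2 TB capacity devices
print(host_raw_capacity_tb(5, 7, 1.2), "TB raw per host")  # 42.0 TB from 35 devices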

Virtual SAN Disk Groups

• Virtual SAN All-Flash disk group configurations are composed of at least 1 performance flash device and 1 capacity flash device.

– Flash devices are used in a two-tier format for caching and capacity.

– Capacity flash devices are used for storage capacity similar to magnetic disks.

– All-Flash disk groups cannot be created without a capacity flash device.

14

(Diagram) Each host: 5 disk groups max. Each disk group: 1 performance device + 1 to 7 capacity devices

Virtual SAN Datastore

• Virtual SAN is an object store solution that is presented to vSphere as a file system.

• The object store mounts the volumes from all hosts in a cluster and presents them as a single shared datastore.

– Only members of the cluster can access the Virtual SAN datastore

– Not all hosts need to contribute storage, but it is recommended that they do.

15

(Diagram) All hosts are connected over the Virtual SAN network; the disk groups from each host (1 SSD + 1 to 7 HDDs, max 5 disk groups per host) are pooled into a single Virtual SAN Datastore.

Virtual SAN Objects

• Virtual SAN manages data in the form of flexible data containers called objects.

• Virtual machine files are referred to as objects.

– There are five different types of virtual machine objects:

• VM Home

• VM swap

• VMDK

• Snapshots

• Memory (vmem)

16

(Diagram) The Virtual SAN Datastore spanning the cluster, as on the previous slide.

• Virtual machine objects are split into multiple components based on the performance and availability requirements defined in the VM storage profile.

Virtual SAN Components

• Virtual SAN components are chunks of objects distributed across multiple hosts in a cluster in order to tolerate simultaneous failures and meet performance requirements.

• Virtual SAN utilizes a Distributed RAID architecture to distribute data across the cluster.

• Components are distributed with the use of two main techniques:

– Striping (RAID0)

– Mirroring (RAID1)

17

(Diagram) A VMDK object is mirrored as replica-1 and replica-2 (RAID-1) across different hosts' disk groups on the Virtual SAN Datastore.

• The number of component replicas and copies created is based on the object policy definition.

Storage Policies

18

Simplest configuration: 1 object containing 1 component

19

(Diagram) esxi-01 to esxi-04. Virtual SAN Policy: "Number of failures to tolerate = 0". The VMDK object consists of a single vmdk component (RAID-0) placed on one host.

More complicated layout: 1 object containing 3 components

20

(Diagram) esxi-01 to esxi-04. Virtual SAN Policy: "Number of failures to tolerate = 1". The VMDK object consists of two vmdk replicas in RAID-1 plus a witness, spread across three hosts.

More complicated layout: 1 object containing 5 components

21

(Diagram) esxi-01 to esxi-04. VSAN Policy: "Number of failures to tolerate = 1" + "Stripe Width = 2". The VMDK object consists of two RAID-0 stripe pairs (stripe-1a/1b and stripe-2a/2b) mirrored in RAID-1, plus a witness. The arithmetic is sketched below.
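The three layouts above follow a simple pattern for RAID-1 objects; here is a hedged sketch of the arithmetic (it assumes the single-witness case shown on these slides, whereas real vSAN may place additional witnesses or votes):

def raid1_component_count(ftt: int, stripe_width: int, witnesses: int = 1) -> int:
    """Components of a RAID-1 object: (FTT + 1) replicas, each striped, plus witnesses.
    FTT=0 objects need no witness in this simple model."""
    replicas = ftt + 1
    return replicas * stripe_width + (witnesses if ftt > 0 else 0)

print(raid1_component_count(ftt=0, stripe_width=1))  # 1 component
print(raid1_component_count(ftt=1, stripe_width=1))  # 3 components (2 replicas + witness)
print(raid1_component_count(ftt=1, stripe_width=2))  # 5 components (2x2 stripes + witness)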

22

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

2-Node Remote Office Branch Office Solution – More details

23


(Diagram) Centralized Data Center: ROBO1, ROBO2 and ROBO3 each run a 2-node vSphere + Virtual SAN cluster (HDD + SSD), all centrally managed by one vCenter Server; each site's witness runs as a vESXi appliance in the central data center.

2-Node ROBO Solution Overview

• The 2 nodes in a ROBO share a single L2 domain. This requires a multicast setup with <5 ms RTT, and 1 Gbps for <10 VMs or 10 Gbps for >10 VMs.

• Communication between the ROBO and the witness is unicast only: no multicast is needed, and the witness is reachable via L3. Requires <500 ms RTT over 1.5 Mbps.

• Resource requirements for the witness, assuming 5-10 VMs in each ROBO:

• Memory: 8 GB (use overcommit to save costs)

• CPU: 1 vCPU

• Storage: 5 GB for the capacity tier & 1 GB for the caching tier (use thin provisioning to save costs)

New Hardware Options

2-node clusters for ROBO

New Flash HW options:

• Intel NVMe

• Diablo ULLtraDIMM™

Hardware Deployment options

Certified Hardware

Integrated Systems

Virtual SAN Stretch Cluster

24

• Active-Active architecture

• Supported on both Hybrid and All-Flash

• Extremely easy to setup through UI

• Automated failover

(Diagram) Stretched Cluster: two active fault domains of vSphere + Virtual SAN hosts (HDD + SSD) plus a third fault domain hosting the Witness; <=5 ms latency over >10/20/40 Gbps L2 with multicast between the active fault domains.

Virtual SAN Stretched Clusters – Overview

25

Active-Active Data Centers

(Diagram) FD1 and FD2 each run vSphere + Virtual SAN (HDD + SSD), connected over L2 with multicast at <5 ms RTT over >10/20/40 Gbps; FD3 hosts the Witness Appliance.

Overview

• Site-level protection with zero data loss and near-instantaneous recovery. This enables support for active-active data centers

• The architecture is based on Fault Domains: the Virtual SAN cluster is split across 3 failure domains

• Network latency between the two main sites must be less than 5 milliseconds RTT

• The Witness VM resides on a 3rd site (it could be another data center, vCloud Air, or a colo). It holds only metadata and does not run any VMs

• Max FTT is 1

• Automated failover in case of site failures

• Communication between the main sites and the witness is unicast (FD1 and FD2 share a single L2 domain; FD3 is only reachable via L3 from FD1 & FD2)

• Each VSAN stretched cluster scales to 31+31+1

• Customers can deploy many VSAN stretched clusters

Benefits

• Disaster avoidance

• Planned maintenance

Virtual SAN Stretched Clusters – Sites Network Requirements

26

Network Requirements

(Diagram) As on the previous slide: FD1 and FD2 over L2 with multicast, <5 ms RTT over >10/20/40 Gbps; FD3 hosts the Witness Appliance.

Details

• Max network latency of 5 ms RTT and enough bandwidth for the workloads

• Sites up to 100 km apart are supported as long as the network requirements are met

– <=200 ms RTT over >=100 Mbps to the witness (L3, no multicast)

– <=5 ms RTT over 10/20/40 Gbps to the data fault domains (L2 with multicast)

• Network bandwidth requirement for write operations between the two main sites, in Kbps = N (number of nodes on one site) * W (amount of 4K write IOPS per node) * 125 Kbps. Minimum of 1 Gbps each way (see the worked example below)

• For a 5+5+1 config on medium servers and ~300 VMs, the requirement is around 4 Gbps in total (2 Gbps each way)

• Layer 2 network communication is required
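Worked example of the bandwidth formula above for a hypothetical 5+5+1 cluster; the per-node 4K write IOPS figure is an assumption chosen to reproduce the ~4 Gbps total from the slide:

def intersite_bandwidth_gbps(nodes_per_site: int, write_iops_4k_per_node: int) -> float:
    """Slide formula: Kbps = N * W * 125; converted to Gbps and floored at the 1 Gbps minimum."""
    kbps = nodes_per_site * write_iops_4k_per_node * 125
    return max(kbps / 1_000_000, 1.0)

# Hypothetical 5+5+1 stretched cluster, ~3200 4K write IOPS per node
each_way = intersite_bandwidth_gbps(nodes_per_site=5, write_iops_4k_per_node=3200)
print(f"{each_way:.2f} Gbps each way, {2 * each_way:.2f} Gbps total")  # 2.00 Gbps each way, 4.00 Gbps total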

Virtual SAN Stretched Cluster with vSphere Replication and SRM

28

– Live migrations and automated HA restarts between stretched cluster sites

– Replication between Virtual SAN datastores enables RPOs as low as 5 minutes

– The 5-minute RPO is exclusively available to Virtual SAN 6.x

– Lower RPOs are achievable thanks to Virtual SAN's efficient vsanSparse snapshot mechanism

– SRM does not support standalone Virtual SAN with one vCenter Server

(Diagram) A stretched cluster across site a and site b (Active-Active, <5 ms RTT over >10/20/40 Gbps, L2 with multicast, witness appliance) is replicated with vSphere Replication (VR) over any distance at a >5 min RPO to a DR site x running vSphere + Virtual SAN, with SRM and vCenter at both ends.

29

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

Understanding Failure Events

Virtual SAN recognizes two different types of hardware device events in order to determine the failure scenario:

– Absent

– Degraded

Absent: VSAN knows/thinks the data is temporarily unavailable

– Good chance the data copy will come back soon

Absent events trigger the delayed (60-minute) recovery operations.

– Virtual SAN will wait 60 minutes before starting the object and component recovery operations

– 60 minutes is the default setting for all absent events

– Configurable via a host advanced setting (see the sketch below)
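A minimal sketch of the resulting decision logic; the advanced option named in the comment (VSAN.ClomRepairDelay, settable for example with esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i 90) is how the delay is commonly exposed on ESXi hosts, but verify the option name against your vSAN release before relying on it:

from enum import Enum

class DeviceState(Enum):
    ABSENT = "absent"      # data may come back (host down, NIC/network failure, pulled disk)
    DEGRADED = "degraded"  # data is known lost (disk, flash device, or controller I/O error)

# Default delay before rebuilding ABSENT components; configurable per host via an
# advanced setting (commonly VSAN.ClomRepairDelay -- name assumed here, check your release).
REPAIR_DELAY_MINUTES = 60

def minutes_until_rebuild(state: DeviceState) -> int:
    """Degraded components are rebuilt immediately; absent ones wait out the repair delay."""
    return 0 if state is DeviceState.DEGRADED else REPAIR_DELAY_MINUTES

print(minutes_until_rebuild(DeviceState.DEGRADED))  # 0  -> instant mirror copy
print(minutes_until_rebuild(DeviceState.ABSENT))    # 60 -> wait, the copy may return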

30

Host Failure – 60 Minute Delay

• Absent – Virtual SAN will wait the default 60 minutes before starting to copy objects and components onto other disks, disk groups, or hosts.

• Greater impact on the cluster overall compute and storage capacity.

(Diagram) esxi-01 to esxi-04 on the vsan network; a host failure leaves one vmdk replica of the RAID-1 object absent, and after the 60-minute wait a new mirror copy of the impacted component is created on another host.

Understanding Failure Events

Degraded: VSAN knows/thinks the data copy is permanently lost

– No hope data copy will come back

– No reason to wait, data copy is known to be lost

Degraded events trigger immediate recovery operations.

– Triggers the immediate recovery operation of objects and components

– Not configurable

Any of the following detected I/O errors are always deemed degraded:

– Magnetic disk failures

– Flash based devices failures

– Storage controller failures

Any of the following detected I/O errors are always deemed absent:

– Network failures

– Network Interface Cards (NICs)

– Host failures

Unplugging a disk causes an absent event, not a degraded one – this allows mistakes (unplugging the wrong drive) to be rectified within 60 minutes

32

Cache/Capacity Device Failure – Instant mirror copy

• Degraded – resynchronization of all impacted components on the failed capacity device starts instantly, re-creating the data onto other disks, disk groups, or hosts.

– Resynchronization can take time depending on the amount of data.

(Diagram) A disk failure on one host triggers an instant mirror copy of the impacted component of the RAID-1 object onto another host.

Network Failure – 60 Minute Delay

• Absent – Virtual SAN will wait the default 60 minutes before starting to copy objects and components onto other disks, disk groups, or hosts.

• NIC failures and physical network failures can lead to network partitions.

– Multiple hosts in the cluster could be impacted.

(Diagram) A network failure isolates a host; after the 60-minute wait, a copy of the impacted component of the RAID-1 object is created on a reachable host.

Virtual SAN 1 host isolated – HA restart

(Diagram) One host is isolated from the vsan network; vSphere HA restarts the VM on one of the remaining hosts, which own >50% of the RAID-1 object's components.

Virtual SAN 2 hosts isolated – HA restart

(Diagram) Two hosts are isolated from the vsan network; vSphere HA restarts the VM on ESXi-02 / ESXi-03, since they own >50% of the components.

Virtual SAN partition – With HA restart

(Diagram) The cluster splits into Partition 1 and Partition 2; vSphere HA restarts the VM in Partition 2, since it owns >50% of the components (RAID-1 replicas plus witness).

Monitoring Rebuilds in the UI

38

3 Nodes or 4?

• 3 Nodes – Nowhere to re-build components

• 4 Nodes – Offers the ability to rebuild components

39

(Diagram) A vmdk component fails (X); with a 4th node available, Virtual SAN can rebuild the failed component on the spare node.

Quorum Mechanism

• In version 5.5, a witness was required so that >50% of an object's components could remain available for the data to stay online

• In 6.0, not all disk objects have a witness

• 6.0 introduced a "voting" mechanism where each component can have more than one vote

• In 6.x, >50% of the votes are required for the data to be available (see the sketch below)

40

(Diagram) A vmdk replica, a witness, and a second vmdk replica each hold 33.33% of the votes.
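A minimal sketch of the >50%-of-votes rule; the vote shares mirror the three equal shares in the diagram, while real vSAN assigns votes per component and may weight them differently:

def data_available(votes_reachable: float, votes_total: float) -> bool:
    """An object stays available only if strictly more than 50% of its votes are reachable."""
    return votes_reachable > votes_total / 2

votes = {"replica-1": 33.33, "witness": 33.33, "replica-2": 33.34}
total = sum(votes.values())

# One replica fails: the surviving replica + witness still hold >50% of the votes
print(data_available(votes["replica-2"] + votes["witness"], total))  # True
# A network partition isolates a lone replica: it cannot claim ownership of the object
print(data_available(votes["replica-1"], total))                     # False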

41

What’s new in vSAN 6.2

How it Works

Use Cases

In case of failure?

Reference

Overview and principles

Three Ways to Get Started with Virtual SAN Today

42

1 – Online Hands-on Lab

• Test-drive Virtual SAN right from your browser with an instant Hands-on Lab

• Register and your free, self-paced lab is up and running in minutes

2 – Download Evaluation

• 60-day free Virtual SAN evaluation

• VMUG members get a 6-month EVAL or a 1-year EVALExperience for $200

3 – VSAN Assessment

• Reach out to your VMware Partner, SEs or Rep for a FREE VSAN Assessment

• Results in just 1 week!

• The VSAN Assessment tool collects and analyzes data from your vSphere storage environment and provides technical and business recommendations.

Learn more… vmware.com/go/virtual-san

• Virtual SAN Product Overview Video

• Virtual SAN Datasheet

• Virtual SAN Customer References

• Virtual SAN Assessment

• VMware Storage Blog

• @vmwarevsan

vmware.com/go/try-vsan-en

Thank you

Blog: http://virtualmaster.it
Twitter: @sidimam
LinkedIn: linkedin.com/in/simonedimambro
[email protected]