All Rights Reserved, Copyright (C) FUJITSU 2007

SCSI support on Xen

Matsumoto Hitoshi

Fujitsu Ltd.

Agenda

- Architecture
- Current status
- Next challenge


Architecture

Requirements for VM in data center

- Conventional application support
- Server consolidation
- HW fault tolerance
- Performance

Requirements for VM in data center

- Conventional application support
  Some applications issue SCSI commands, e.g. DB (enterprise), backup.
  -> pvSCSI driver
- Server consolidation
- HW fault tolerance
- Performance

Data center

Data center management
- Enterprise data are stored on FC/SCSI devices.
- Reliability and availability are required.
- There are many SCSI devices in a data center.
- Typical operations: hardware snapshot, tape operation.

[Figure: a DB server (Oracle) and a backup server, connected by a LAN, issue SCSI commands over the SAN to storage (RAID) holding the data files and a snapshot, and to a tape drive. Labeled operations: hardware snapshot; load, unload, reset.]

Minimum backup window

example : SCSI command for storage

[Figure: comparison of backup methods: Disk to Tape (D2T), Disk to disk (D2D), and hardware snapshot with D2D and D2T on line. The hardware snapshot case is labeled as minimizing the backup window.]

example : SCSI command for tape

- Robot : move cartridge
- Tape : load, unload, rewind

[Figure: backup and restore between the disk (RAID) and the tape drive; a tape cartridge is loaded into and unloaded from the drive, and can be taken out of the box.]
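As a rough illustration of the tape commands named on this slide, the sketch below builds the 6-byte command blocks (CDBs) for REWIND and LOAD UNLOAD. The opcodes 0x01 and 0x1B are the usual values from the SCSI stream command set; the helper names are hypothetical and nothing here is taken from the talk's code.

```python
# Hypothetical helpers for illustration; not part of the pvSCSI driver.
REWIND = 0x01        # REWIND opcode for sequential-access (tape) devices
LOAD_UNLOAD = 0x1B   # LOAD UNLOAD opcode for sequential-access devices


def rewind_cdb(immed: bool = False) -> bytes:
    """Build a 6-byte REWIND CDB. IMMED (byte 1, bit 0) asks for early status."""
    return bytes([REWIND, 0x01 if immed else 0x00, 0, 0, 0, 0])


def load_unload_cdb(load: bool, immed: bool = False) -> bytes:
    """Build a 6-byte LOAD UNLOAD CDB. Byte 4, bit 0: 1 = load, 0 = unload."""
    return bytes([LOAD_UNLOAD, 0x01 if immed else 0x00, 0, 0,
                  0x01 if load else 0x00, 0])


for name, cdb in [("load", load_unload_cdb(True)),
                  ("rewind", rewind_cdb()),
                  ("unload", load_unload_cdb(False))]:
    print(f"{name:7s} {cdb.hex(' ')}")
```

A backup application on a guest would hand such CDBs to the SCSI layer; with SCSI passthrough they reach the tape drive unchanged.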

pvSCSI driver (SCSI passthrough)

- The pvSCSI driver (SCSI passthrough) consists of a SCSI frontend driver and a SCSI backend driver.
- Each guest can issue SCSI commands via the host.
- Each guest can occupy its own FC HBA card.
- Conventional applications work on the guest.

[Figure: guest 1 and guest 2 run on the hypervisor beside the host. The SCSI frontend driver in each guest OS passes SCSI commands to a SCSI backend driver in the host OS, which uses an FC (SCSI) native driver and an FC HBA card connected to the SAN.]
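The frontend/backend split on this slide can be pictured with a small simulation. This is only a sketch under stated assumptions: the `Ring` class stands in for whatever shared-memory channel the real drivers use, and every class and function name here is hypothetical rather than taken from the posted pvSCSI code.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    req_id: int
    cdb: bytes      # SCSI command block issued by the guest


@dataclass
class Response:
    req_id: int
    status: int     # 0 = GOOD


class Ring:
    """Stand-in for the channel shared by frontend and backend."""
    def __init__(self):
        self.requests = deque()
        self.responses = deque()


class Frontend:
    """Runs in the guest: queues SCSI commands, collects completions."""
    def __init__(self, ring):
        self.ring = ring
        self.next_id = 0

    def submit(self, cdb):
        req_id = self.next_id
        self.next_id += 1
        self.ring.requests.append(Request(req_id, cdb))
        return req_id

    def reap(self):
        done = list(self.ring.responses)
        self.ring.responses.clear()
        return done


class Backend:
    """Runs in the host: forwards each command to the native driver."""
    def __init__(self, ring, native_driver):
        self.ring = ring
        self.native_driver = native_driver

    def poll(self):
        while self.ring.requests:
            req = self.ring.requests.popleft()
            status = self.native_driver(req.cdb)
            self.ring.responses.append(Response(req.req_id, status))


def fake_native_driver(cdb):
    """Placeholder for the FC (SCSI) native driver; always reports GOOD."""
    return 0


ring = Ring()
front = Frontend(ring)
back = Backend(ring, fake_native_driver)
front.submit(bytes(6))   # TEST UNIT READY: six zero bytes
back.poll()
results = front.reap()
print(results)
```

The point of the split is that the guest only ever sees the frontend, while the backend alone touches the native driver and the HBA.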

Requirements for VM in data center

- Conventional application support
- Server consolidation
  All resources are consolidated on VM.
  -> pvSCSI driver + NPIV
- HW fault tolerance
- Performance

Data center (Enterprise system)

- There are many servers in a data center.
- Each server has several storage devices.

[Figure: a DB server, a backup server and other servers on a LAN. Each server issues SCSI commands through its own SCSI driver and HBAs to disks and tape.]

Data center (Enterprise system) on VM

- Server consolidation on VM
- Many HBA cards are needed for the data center.

[Figure: guests (servers, a DB server and a backup server) run on the hypervisor beside the host. Guests use VBD frontend drivers or SCSI frontend drivers; the host runs the matching VBD backend and SCSI backend drivers on top of SCSI drivers, with many HBAs (including an FC HBA) attached to disk and tape.]

NPIV support

- NPIV : technology that creates many vHBAs (VPs) on one physical HBA.
- Each guest can have its own vHBA.
- The number of physical HBAs can be reduced.

[Figure: guest 1 and guest 2 with SCSI frontend drivers; the host OS runs two SCSI backend drivers and one FC (SCSI) native driver with NPIV. A single FC HBA card with NPIV exposes two VPs to the SAN.]

NPIV (N-Port Identifier Virtualization)

- A virtual port can connect to the SAN independently, just like a physical port.
- Each virtual port is allocated to its owner guest.
- NPIV is standardized as part of the Fibre Channel standards (INCITS T11).

[Figure: without NPIV, each HBA card exposes only its physical port (PP) to the SAN; with NPIV, an HBA card exposes several virtual ports (VP) on one physical port. VP : Virtual Port, PP : Physical Port.]
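The ownership model described here (several virtual ports on one physical port, each allocated to one guest) could be modeled as below. This is a minimal sketch, not an HBA driver API: the class, the method names and the WWPN strings are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalHBA:
    """One physical port (PP) that can host several virtual ports (VP)."""
    wwpn: str
    vports: dict = field(default_factory=dict)   # virtual WWPN -> owner guest

    def create_vport(self, vwwpn: str, guest: str) -> None:
        # Each virtual port needs its own unique name on the fabric.
        if vwwpn == self.wwpn or vwwpn in self.vports:
            raise ValueError(f"WWPN {vwwpn} already in use")
        self.vports[vwwpn] = guest

    def owner(self, vwwpn: str) -> str:
        return self.vports[vwwpn]


# Hypothetical port names, for illustration only.
hba = PhysicalHBA(wwpn="pp-0")
hba.create_vport("vp-1", "guest 1")
hba.create_vport("vp-2", "guest 2")
print(hba.owner("vp-2"))
```

Two guests now share one card while the SAN still sees two distinct ports, which is why the number of physical HBAs can be reduced.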

Requirements for VM in data center

- Conventional application support
- Server consolidation
- HW fault tolerance
  Containment of hardware failure
  -> pvSCSI driver + NPIV + driver domain
  Redundancy against hardware failure
  -> pvSCSI driver + driver domain + multi path driver
- Performance

Containment with NPIV

- A crash on the VP of guest 1 does not affect guest 2.

[Figure: guest 1 and guest 2 with SCSI frontend drivers; the host OS runs two SCSI backend drivers and one FC (SCSI) native driver with NPIV. A single FC HBA card with NPIV exposes two VPs to the SAN.]

driver domain

- Even if the host goes down during an I/O operation, the whole system does not go down.
- Guests cannot access I/O devices directly.

[Figure: guest 1 and guest 2 send SCSI commands from their SCSI frontend drivers to SCSI backend drivers in a driver domain, which runs the FC (SCSI) native driver for the HBA cards connected to the SAN. The host appears as a separate domain on the hypervisor.]

Containment with driver domain

- A crash of driver domain 1, which serves guest 1, does not affect guest 2.

[Figure: the host (host OS), driver domain 1, driver domain 2, guest 1 and guest 2 run on the hypervisor. Each driver domain has its own SCSI backend driver, FC (SCSI) native driver and FC HBA card connected to the SAN. Driver domain 1 is marked as crashed.]

multi path driver

- fail over: retry on an alternate path
- load balance: access over multiple paths
- Linux provides a multi path driver as part of "device mapper".
- Many vendors provide their own multi path drivers.

[Figure: two I/O stacks, one labeled fail over and one labeled load balance. Each shows application, sd/st/sg, scsi_mod, multi path driver, two HBA cards and a disk.]
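The two policies named on this slide, fail over and load balance, can be sketched in a few lines. This is an illustrative model only and does not reflect how device mapper or any vendor driver is implemented; `MultipathDevice`, `make_path` and the path names are hypothetical.

```python
import itertools


class PathFailed(Exception):
    """Raised by a path that cannot deliver the command."""


class MultipathDevice:
    def __init__(self, paths):
        self.paths = list(paths)          # callables: path(cmd) -> result
        self._next = itertools.cycle(range(len(self.paths)))

    def submit_failover(self, cmd):
        """Always prefer the first path; retry on the alternates if it fails."""
        for path in self.paths:
            try:
                return path(cmd)
            except PathFailed:
                continue
        raise PathFailed("all paths failed")

    def submit_balanced(self, cmd):
        """Round-robin over the paths, still retrying on failure."""
        start = next(self._next)
        for i in range(len(self.paths)):
            path = self.paths[(start + i) % len(self.paths)]
            try:
                return path(cmd)
            except PathFailed:
                continue
        raise PathFailed("all paths failed")


def make_path(name, healthy=True):
    def path(cmd):
        if not healthy:
            raise PathFailed(name)
        return f"{cmd} via {name}"
    return path


broken = MultipathDevice([make_path("hba0", healthy=False), make_path("hba1")])
failover_result = broken.submit_failover("READ")

healthy = MultipathDevice([make_path("hba0"), make_path("hba1")])
balanced = [healthy.submit_balanced("READ") for _ in range(4)]
print(failover_result, balanced)
```

In the fail over case the command silently lands on the second HBA; in the load balance case consecutive commands alternate between the two.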

redundancy with driver domain

- Each guest has an alternate path to the I/O device via the driver domains, so each guest can continue to work when an HBA card or a driver domain crashes.

[Figure: guest 1 and guest 2 each run a multi path driver above their SCSI frontend driver. Driver domain 1 and driver domain 2 each run SCSI backend drivers and an FC (SCSI) native driver on an FC HBA card with NPIV (two VPs) connected to the SAN. One path is marked as crashed.]

Requirements for VM in data center

- Conventional application support
- Server consolidation
- HW fault tolerance
- Performance
  The performance of the pvSCSI driver is almost the same as VBD.
  More performance: the guest issues I/O to the device without going via the host.
  -> direct I/O
  PV domain and HVM domain

direct I/O

- Each guest can access hardware without the host.

[Figure: guest 1 and guest 2 each run their own FC (SCSI) native driver and send SCSI commands through the direct I/O hardware to their FC HBA cards on the SAN. The host domain (host OS) is not on the I/O path.]

direct I/O / NPIV architecture

- The chipset for direct I/O guarantees that a PCI Express device cannot perform unauthorized access to a memory portion.
- The device can be assigned to a guest domain.

[Figure: an HBA with a DMA controller and VPs accesses memory through the address translation stage of the chipset for direct I/O. Memory is split into a region for guest A and a region for guest B; authorized accesses are marked "Access Allowed" and unauthorized accesses "Access Denied". The address translation table carries the label "Managed by Hypervisor/Dom0 but guest domain".]
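The allow/deny decision in this diagram amounts to a per-device lookup in a translation table. The sketch below models that idea in software under simple assumptions (4 KiB pages, one table keyed by device and guest page); real chipsets implement this in hardware, and the class and device names are hypothetical.

```python
PAGE = 4096


class DmaRemapTable:
    """Toy address translation table, filled in by the hypervisor/Dom0."""

    def __init__(self):
        self._map = {}   # (device, guest page number) -> host page number

    def grant(self, device, guest_page, host_page):
        self._map[(device, guest_page)] = host_page

    def translate(self, device, guest_addr):
        """Return the host address, or raise if the access is unauthorized."""
        page, offset = divmod(guest_addr, PAGE)
        try:
            host_page = self._map[(device, page)]
        except KeyError:
            raise PermissionError(
                f"{device}: access denied at {guest_addr:#x}") from None
        return host_page * PAGE + offset


table = DmaRemapTable()
table.grant("vp_guest_a", guest_page=0, host_page=100)   # memory for guest A
table.grant("vp_guest_b", guest_page=0, host_page=200)   # memory for guest B

allowed = table.translate("vp_guest_a", 0x10)
try:
    table.translate("vp_guest_a", 5 * PAGE)   # page never granted
    denied = False
except PermissionError:
    denied = True
print(allowed, denied)
```

Because each virtual port has its own entries, a DMA issued on behalf of guest A can never resolve to memory that belongs to guest B.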


current status

current status

- The basic function of the pvSCSI driver code was posted to the Xen community.
- NPIV works.
- The pvSCSI driver on a driver domain is under evaluation.
- The pvSCSI driver works on HVM domain and PV domain.
- Oracle RMAN works on a guest with the pvSCSI driver.

performance (same as VBD)

- Memory : 2GB
- cpu : 2 for each domain
- tool : iogen1.3.6
- Dom0 vs VBD vs pvSCSI: the performance of VBD and pvSCSI as a ratio to Dom0 (100%).
- Performance of pvSCSI is almost the same as VBD.

[Figure: two bar charts, read and write. Each compares Dom0, VBD and pvSCSI at block sizes 8k, 128k and 256k; the vertical axis runs from 10% to 100%, with Dom0 = 100%. The bar values are not recoverable from the transcript.]
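The charts normalize every result to Dom0 = 100%. A minimal sketch of that normalization follows; the throughput figures are placeholders invented for the example and are not measurements from the talk, and the function name is hypothetical.

```python
def ratio_to_dom0(throughput, baseline="Dom0"):
    """Express each configuration's throughput as a percentage of the baseline."""
    base = throughput[baseline]
    return {name: 100.0 * value / base for name, value in throughput.items()}


# Placeholder numbers for illustration only (not results from the slides).
sample_mb_per_s = {"Dom0": 200.0, "VBD": 180.0, "pvSCSI": 178.0}
ratios = ratio_to_dom0(sample_mb_per_s)
for name, pct in ratios.items():
    print(f"{name:7s} {pct:5.1f}%")
```

Running the same normalization once per block size (8k, 128k, 256k) and once per direction (read, write) would reproduce the layout of the two charts.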

Oracle RMAN works on guest with pvSCSI driver.

[Figure: a DB server on a guest runs Oracle and RMAN and reaches the storage (RAID) on the SAN through pvSCSI. Items shown on the storage side: data files, control data, archive log, RMAN backup set and hardware snapshot.]


Next challenge

Next challenge

- Complexity of direct I/O and non-direct I/O
- LUN assignment

Complexity of direct I/O and non-direct I/O

- PV domain uses pvSCSI driver.
- HVM domain uses direct I/O.

[Figure: guest 1 and guest 2 (PV domains) each run a multi path driver and a SCSI frontend driver, served by SCSI backend drivers and FC (SCSI) native drivers in driver domain 1 and driver domain 2. Guest 3 and guest 4 (HVM domains) each run a multi path driver and FC (SCSI) native drivers and reach the hardware through direct I/O hardware. All paths end at FC HBAs with NPIV (two VPs each) in front of the I/O devices.]

LUN assignment

- LUN allocation to guest with pvSCSI driver

[Figure: six guests run on the hypervisor: guests 3 and 4 are PV domains, guests 1, 2, 5 and 6 are HVM domains. SCSI frontend drivers in the guests connect to the SCSI backend driver in the driver domain; direct I/O hardware and VPs connect to the SAN. LUN0 to LUN5 on the SAN are allocated to individual guests.]

Special thanks to

- Intel Corporation
- QLogic Corporation
- Emulex Corporation
- Brocade Communications Systems Inc
- Sun Microsystems Inc
- Xen community engineers!

This work was partly funded by the Ministry of Economy, Trade and Industry of Japan as the secure platform project of the Association of Super-Advanced Electronics Technologies (ASET).


Appendix

LUN Assignment to Guest Domain (Direct I/O without VT-d)

- Direct I/O
- But all LUNs are assigned to "one" guest domain.
- An HBA must be occupied by an owner guest.

[Figure: one HBA with six LUNs. Labels: guest domains with SCSI frontend drivers, host domain with SCSI backend driver.]

LUN Assignment (Driver Domain (DD) with pvSCSI)

- A portion of the LUNs can be assigned to the appropriate guest domain (LUN filtering by pvSCSI).
- But not direct I/O (via DD).

[Figure: one HBA with six LUNs is owned by the DD, which runs one SCSI backend driver per guest domain; each guest domain runs a SCSI frontend driver.]
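LUN filtering, as named on this slide, means the backend only shows a guest the LUNs assigned to it. The sketch below illustrates that idea with a plain lookup table; it is an assumption about the general approach, not the pvSCSI implementation, and the class name and the example assignments are hypothetical.

```python
class LunFilter:
    """Per-guest view of the LUNs behind one HBA (illustrative only)."""

    def __init__(self, assignments):
        # guest name -> {LUN number seen by the guest: physical LUN number}
        self.assignments = assignments

    def report_luns(self, guest):
        """LUN numbers this guest is allowed to see."""
        return sorted(self.assignments.get(guest, {}))

    def route(self, guest, virtual_lun):
        """Map a guest LUN to the physical LUN, rejecting anything unassigned."""
        try:
            return self.assignments[guest][virtual_lun]
        except KeyError:
            raise PermissionError(
                f"{guest} has no LUN {virtual_lun}") from None


# Hypothetical assignment of six physical LUNs to two guests.
lun_filter = LunFilter({
    "guest1": {0: 0, 1: 5},
    "guest2": {0: 1, 1: 2},
})
visible = lun_filter.report_luns("guest1")
physical = lun_filter.route("guest2", 1)
print(visible, physical)
```

Each guest sees a small private LUN numbering, and a command for a LUN outside its table never reaches the HBA.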

LUN Assignment to Guest Domain (Driver Domain (DD) with pvSCSI/NPIV)

- A portion of the LUNs can be assigned to the appropriate guest domain.
- But not yet direct I/O (via DD).

[Figure: one HBA with two virtual ports (vp) and six LUNs is owned by the DD, which runs one SCSI backend driver per guest domain; each guest domain runs a SCSI frontend driver.]

LUN Assignment to Guest Domain (Direct I/O with VT-d/SR-IOV)

- A portion of the LUNs can be assigned to the appropriate guest domain.
- And direct I/O.

[Figure: guest domains access their LUNs through virtual ports (vp) on one HBA.]

Virtualization layers

- Management interface depends on SCSI protocol.

[Figure: three layers: server (xen), network (SAN, network virtual storage) and storage (storage box). The layers are connected by the SCSI protocol.]

network virtual storage

[Figure: an application on a server uses a virtual disk taken from a virtual storage pool. A virtual storage manager on the SAN switch handles configuration, migration, etc. between the server and the storage on the SAN.]

storage data migration

- Virtual storage has an online migration function based on the SCSI protocol.

[Figure: three stages: pre copy (application reads and writes), copying (application reads and writes while data is copied), migration complete (application reads and writes).]

replication

- Network/storage box virtualization has snapshot copy.

[Figure: backup, restore and replication on virtual storage.]

Example 1 : Guest Migration

- Keep the connection to the same v-WWN 2.

[Figure: before migration, guest 1 and guest 2 use VPs with V-WWN 1 and V-WWN 2 on the same HBA, which reach guest 1 data and guest 2 data on the SAN. After "migrate guest 2", V-WWN 1 stays on the first HBA and V-WWN 2 is on a VP of the second HBA.]
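The key property in this example is that the virtual WWN travels with the guest, so the SAN keeps seeing the same initiator. A toy model of that hand-over is sketched below; the `Hba` class and `migrate_guest` function are hypothetical, and the real NPIV login and logout steps are not modeled.

```python
class Hba:
    """Physical HBA holding virtual ports, keyed by their virtual WWN."""

    def __init__(self, name):
        self.name = name
        self.vports = {}   # v-WWN -> owner guest


def migrate_guest(guest, src, dst):
    """Move every virtual port owned by `guest` from src to dst, keeping its WWN."""
    moved = [wwn for wwn, owner in src.vports.items() if owner == guest]
    for wwn in moved:
        dst.vports[wwn] = src.vports.pop(wwn)
    return moved


hba_a = Hba("HBA on source server")
hba_b = Hba("HBA on destination server")
hba_a.vports = {"V-WWN 1": "guest 1", "V-WWN 2": "guest 2"}

moved = migrate_guest("guest 2", hba_a, hba_b)
print(moved, hba_a.vports, hba_b.vports)
```

Since zoning and LUN masking on the SAN are typically keyed by WWN, keeping V-WWN 2 unchanged means guest 2 reaches the same data after the move.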

Example 2 : Guest Migration

- Keep the connection to guest 2.

[Figure: guest 1 and guest 2 each have a vNIC and reach storage over FCoE / iSCSI, before and after "migrate guest 2".]