VMworld 2013: Failsafe at PCIe Level: Enabling PCIe Hot Swap

Post on 13-Jul-2015

Failsafe at PCIe Level: Enabling PCIe Hot Swap

Wenchao Cui, VMware

Caixue Lin, VMware

TEX5316

#TEX5316

Disclaimer

This presentation may contain product features that are currently under development.

This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Agenda

Background and overview

Behind the scene: Native driver & PCIe hot-plug

Case study: PCIe SSD driver with hot plug capability

Get involved

Challenges and future work

Conclusion

Q&A

Background

Benefits of adopting PCIe hot plug

• Extend hardware with no downtime

• Remove or replace hardware with no downtime

• Enable failsafe at the PCIe level

• PCIe SSDs are expected to be a major driving factor

SFF-8639

Background (Cont.)

Differences between PCIe hot plug and disk hot plug

Disk Hot Plug

• Events delivered to HBA

• PCI device still functional

• PCI config space intact

• Interrupts still active

• Impacts SCSI LUNs and Paths

• Minimal resource cleanup

PCIe Hot Plug

• Events delivered to PCI Bridge

• PCI device not functional

• PCI config space invalidated

• Interrupts lost

• Impacts SCSI LUNs, Paths, and VMHBAs

• Full resource cleanup

Overview: What’s New - Native Driver Model

New vSphere Driver Architecture with Native VMkernel API

Reference PCIe SSD Driver

Development Kits and Tools

Behind the Scene

Native Driver Components

Device model

Device lifecycle management

Hot plug in the picture

Hot unplug impact to storage I/O

Quick summary

Native Driver Model Overview (Big Picture)

[Figure: block diagram of the native driver model. Layers: User World, VMkernel, Hardware. Labels: Device Manager, PCI Subsystem, PCI Bus Support, PSA Support, Storage Stack, Network Support, Network Stack, VMKLinux, VMKLinux Driver, Native Driver, PCI Bridge, PCI Devices. Legend: existing components, modified components, new components.]

Native Driver Model Components

Device Manager

• Manages device and driver mapping

• Maintains device tree and driver bindings

• Delivers events to drivers

Logical PCI bus driver

• Claims PCI bridges

• Monitors PCI bus changes

• Registers PCI device nodes

Logical function driver

• Registers function objects to the IO stack

Device driver

• Claims and operates devices
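
The division of labor above can be illustrated with a small model. This is a minimal sketch in plain Python, not VMkernel code: the `Device`, `Driver`, and `DeviceManager` classes, the `kind` field, and the event tuples are all hypothetical names chosen for illustration. It shows only the idea that a device manager keeps a device tree, keeps driver bindings, and notifies a driver when it is matched to a device.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Driver:
    """Hypothetical driver record: a name plus a match predicate."""
    name: str
    matches: Callable[["Device"], bool]


@dataclass(eq=False)
class Device:
    """Hypothetical device-tree node (physical or logical)."""
    name: str
    kind: str                                  # e.g. "pci-bridge", "pci"
    parent: Optional["Device"] = None
    children: List["Device"] = field(default_factory=list)
    driver: Optional[Driver] = None            # current binding, if any


class DeviceManager:
    """Keeps the device tree and the device-to-driver bindings."""

    def __init__(self) -> None:
        self.root = Device("root", "root")
        self.drivers: List[Driver] = []
        self.events: List[Tuple[str, str, str]] = []   # (event, driver, device)

    def walk(self, node: Optional[Device] = None) -> Iterator[Device]:
        """Yield every node in the tree, parents before children."""
        node = node or self.root
        yield node
        for child in node.children:
            yield from self.walk(child)

    def register_device(self, parent: Device, dev: Device) -> None:
        """Called by a bus driver when it discovers a new child device."""
        dev.parent = parent
        parent.children.append(dev)
        self._bind(dev)

    def register_driver(self, driver: Driver) -> None:
        """Add a driver and offer it every device that is still unclaimed."""
        self.drivers.append(driver)
        for dev in self.walk():
            self._bind(dev)

    def _bind(self, dev: Device) -> None:
        if dev.driver is not None:
            return
        for drv in self.drivers:
            if drv.matches(dev):
                dev.driver = drv
                self.events.append(("attach", drv.name, dev.name))
                return
```

In this model a device registered before its driver is loaded simply stays unclaimed until a matching driver arrives, which is one plausible reading of "manages device and driver mapping".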

[Figure: block diagram of the native driver model components. Layers: User World, VMkernel, Hardware. Labels: Device Layer in UW, Device Layer in Kernel, PCI Subsystem, PCI Bus Support, HBA Support, PSA Support, Storage Stack, Network Support, Network Stack, Native Driver, PCI Bridge, PCI Devices.]

Behind the Scene: Device Model

[Figure: device model inside the vmkernel, relating I/O Subsystems, the Device Manager, the Device Layer, Drivers, and Device and Driver Objects. Legend: physical device, logical device, relationship.]

Behind the Scene: Device Lifecycle Management

[Figure: device lifecycle state diagram. States: Unregistered; Registered, Unclaimed; Claimed, Quiesced; IO-able. Bring-up operations: Register Device, Attach Device, Start Device, Scan Device. Tear-down operations: Quiesce Device, Detach Device, Unregister Device.]
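
The lifecycle on this slide reads naturally as a state machine. The sketch below is a plain Python model, not VMkernel code. The state and operation names come from the slide; which operation connects which pair of states is an assumption inferred from the order of the labels, and treating Scan Device as an operation that leaves the device IO-able is likewise an assumption.

```python
from enum import Enum


class State(Enum):
    UNREGISTERED = "Unregistered"
    REGISTERED_UNCLAIMED = "Registered, Unclaimed"
    CLAIMED_QUIESCED = "Claimed, Quiesced"
    IO_ABLE = "IO-able"


# (current state, operation) -> next state.
# Assumed mapping; the slide only lists the states and operations.
TRANSITIONS = {
    (State.UNREGISTERED, "register"): State.REGISTERED_UNCLAIMED,
    (State.REGISTERED_UNCLAIMED, "attach"): State.CLAIMED_QUIESCED,
    (State.CLAIMED_QUIESCED, "start"): State.IO_ABLE,
    (State.IO_ABLE, "scan"): State.IO_ABLE,          # assumed self-transition
    (State.IO_ABLE, "quiesce"): State.CLAIMED_QUIESCED,
    (State.CLAIMED_QUIESCED, "detach"): State.REGISTERED_UNCLAIMED,
    (State.REGISTERED_UNCLAIMED, "unregister"): State.UNREGISTERED,
}


class LifecycleError(Exception):
    """Raised when an operation is not valid in the current state."""


def step(state, op):
    """Apply one lifecycle operation and return the new state."""
    try:
        return TRANSITIONS[(state, op)]
    except KeyError:
        raise LifecycleError(
            f"{op!r} is not valid in state {state.value!r}"
        ) from None


def run(ops, state=State.UNREGISTERED):
    """Apply a sequence of operations starting from `state`."""
    for op in ops:
        state = step(state, op)
    return state
```

The table makes the tear-down order explicit: under this mapping a device that is still IO-able cannot be detached directly, it has to be quiesced first.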

PCIe Device Hot Add

[Figure: hot-add sequence in the vmkernel, numbered steps 1 to 6, involving the PCI Bridge & PCI Bus Driver, the Device Manager, I/O Subsystems, Drivers, and Device and Driver Objects. Legend: physical device, logical device, relationship.]

PCIe Device Hot Remove

[Figure: hot-remove sequence in the vmkernel, numbered steps 1 to 7, involving the PCI Bridge & PCI Bus Driver, the Device Manager, I/O Subsystems, Drivers, and Device and Driver Objects. Arrow labels: “Forget Device” (two arrows) and “Quiesce/Detach” (two arrows). Legend: physical device, logical device, relationship.]

Device Unplug Impact to Storage I/O

Permanent Device Loss (PDL)

• All LUNs of the removed HBA are considered PDL

• Limitations of PDL apply

• Not all of the storage stack supports PDL

• End user needs to ensure all open handles are closed for full cleanup

• Device removal might fail partway through due to PDL limitations

Driver’s attention required

• Prepare for MMIO read/write failures

• Prepare for I/O disruption

• Return proper codes for all outstanding SCSI commands

• Manage resources according to device life cycle
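
The driver-side points above can be sketched as a small simulation. This is plain Python, not driver code, and every name in it (`FakeBar`, `SsdDriver`, the `"NO_CONNECT"` status string) is hypothetical. It relies on one assumption that holds on many platforms but is not stated in the slides: a read from a device that has been pulled commonly returns all ones, so a driver can treat `0xFFFFFFFF` from a register that never legitimately holds that value as a sign of removal.

```python
ALL_ONES = 0xFFFFFFFF   # typical value of a 32-bit read from an absent device


class FakeBar:
    """Stand-in for a memory-mapped register window (not a real MMIO API)."""

    def __init__(self):
        self.present = True
        self.regs = {0x0: 0x1}          # offset 0x0: made-up status register

    def read32(self, offset):
        if not self.present:
            return ALL_ONES             # simulate a read after surprise removal
        return self.regs.get(offset, 0)


class SsdDriver:
    """Toy driver that tracks outstanding commands by tag."""

    def __init__(self, bar):
        self.bar = bar
        self.outstanding = {}           # tag -> completion callback
        self.removed = False

    def submit(self, tag, on_done):
        """Queue a command, or fail it immediately if the device is gone."""
        if self.removed:
            on_done(tag, "NO_CONNECT")  # placeholder status name
            return
        self.outstanding[tag] = on_done

    def read_status(self):
        """Read the status register; return None if the device has vanished."""
        value = self.bar.read32(0x0)
        if value == ALL_ONES:
            self.handle_removal()
            return None
        return value

    def handle_removal(self):
        """Complete every outstanding command with an error, exactly once."""
        self.removed = True
        pending, self.outstanding = self.outstanding, {}
        for tag, on_done in pending.items():
            on_done(tag, "NO_CONNECT")
```

Swapping the `outstanding` dictionary for an empty one before invoking the callbacks keeps a command from being completed twice if a callback submits new work.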

Quick Summary

New VMkernel driver architecture

• Native VMkernel API

• Device tree

• Device lifecycle management

With hot plug/unplug awareness

• Device layer coordinates driver load, attach, and detach

• Parent/bus driver monitors device insertion and removal

• Surprise device removal supported via “Forget Device” notification
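
One way to picture surprise removal is as a walk over the affected part of the device tree. The sketch below is a plain Python model with hypothetical names (`Node`, `hot_remove`). The ordering it uses is an assumption based on the labels in the hot-remove figure: the “Forget Device” notification goes to the removed device and everything below it first, then each node is quiesced and detached from the leaves upward, and finally the removed device is unlinked from its parent.

```python
class Node:
    """Hypothetical device-tree node; links itself under `parent`."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def subtree(node):
    """Yield `node` and all descendants, parents before children."""
    yield node
    for child in node.children:
        yield from subtree(child)


def hot_remove(node, log):
    """Model the notifications for a surprise removal of `node`."""
    nodes = list(subtree(node))

    # 1. Tell every affected driver that the hardware is already gone.
    for n in nodes:
        log.append(("forget", n.name))

    # 2. Quiesce and detach, children before their parents.
    for n in reversed(nodes):
        log.append(("quiesce", n.name))
        log.append(("detach", n.name))

    # 3. Unlink the removed device from the tree.
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = None
```

For a bridge with one SSD that exposes one logical device, removing the SSD produces forget notifications for the SSD and the logical device, then quiesce and detach for the logical device, then for the SSD.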

Case Study: PCIe SSD with Hot Plug

Get Involved

Program

Implementation and Testing

Certification

Get Involved: Program

IO Vendor Program

Developer Center

Get Involved: Development and Testing

Acquire VMKAPI DDK

Review Docs and Sample Code

Develop Drivers

Test with CLI tools and Manual Cases

Get Involved: Certification

Certification for PCIe hot-plug (In Progress)

• IOVP certification (for testing driver/device functions)

• Hot-plug tests are manual and must be requested on demand

• HCL listings do not yet include the PCIe hot-plug feature

Challenges and Future Work

Challenges

• Device tree and device life cycle management

• PCI Config Space re-initialization

• I/O disruption handling

Future work

• Development/validation of a wider range of devices

• Device manager enhancements

• I/O stack enhancements

• Certification/HCL listing support for hot swappable devices and drivers

Conclusion

Key Takeaways

• New driver architecture with PCIe hot-plug awareness

• Driver development considerations

• Program enrollment and certification

Q&A

TAP Membership Renewal – Great Benefits

• TAP Access membership includes:

• New TAP Access NFR Bundle

• Access to NDA Roadmap sessions at VMworld, PEX and Onsite/Online

• VMware Solution Exchange (VSX) and Partner Locator listings

• VMware Ready logo (ISVs)

• Partner University and other resources in Partner Central

• TAP Elite includes all of the above plus:

• 5X the number of licenses in the NFR Bundle

• Unlimited product technical support

• 5 instances of SDK Support

• Services Software Solutions Bundle

• Annual Fees

• TAP Access - $750

• TAP Elite - $7,500

• Send email to [email protected]

TAP Resources

TAP

• TAP support: 1-866-524-4966

• Email: [email protected]

• Partner Central: http://www.vmware.com/partners/partners.html

TAP Team

• Kristen Edwards – Sr. Alliance Program Manager

• Sheela Toor – Marketing Communication Manager

• Michael Thompson – Alliance Web Application Manager

• Audra Bowcutt –

• Ted Dunn –

• Dalene Bishop – Partner Enablement Manager, TAP

VMware Solution Exchange

• Marketplace support – [email protected]

• Partner Marketplace @ VMware booth pod TAP1

THANK YOU
