Page 1: Enabling Efficient Hypervisor-as-a-Service Clouds with Ephemeral Virtualization

Dan Williams†, Yaohui Hu‡, Umesh Deshpande∗, Piush K Sinha‡, Nilton Bila†, Kartik Gopalan‡, Hani Jamjoom†

†IBM T.J. Watson Research Center ‡Binghamton University ∗IBM Almaden Research Center

Funded in part by the NSF

©2016 IBM Corporation, 29 June 2016

Page 2: Hypervisors…

§ Thin and secure layer in the cloud

-- or --

[Diagram labels: Guest 1, VM, Hypervisor]

Page 3: Hypervisors can do a lot!

§ Thin and secure layer in the cloud

-- or --

§ Feature-filled cloud differentiators
– Rootkit detection
– Live patching
– Intrusion detection
– High availability service
– Other VMI…

[Diagram labels: Guest 1, VM, Hypervisor]

Page 4: Splitting the hypervisor

§ Cloud provider runs hyperplexor
– Multiplex hardware
§ Each guest selects a featurevisor
– Implements rich services
§ “Hypervisor-as-a-service”
§ Can implement with nested virtualization
– Pay nesting overhead all of the time

[Diagram labels: Guest 1, VM, Featurevisor, Hyperplexor]

Page 5: Ephemeral virtualization

§ VM runs on hyperplexor when not needing featurevisor services
§ VM runs on featurevisor when needing its services
§ Implemented “Dichotomy”
– 80 ms switching times
– Up to 12% better performance than nested with 2.5 s switching interval

[Diagram labels: Guest 1, VM, Featurevisor, Hyperplexor]


Page 8: Roadmap

§ When does this make sense?
§ Design of Dichotomy
§ Future directions
§ Evaluation
§ Related work
§ Conclusions


Page 12: Hypervisor-level applications

§ One-time
– Snapshot, guest patching, VM management …
§ Sample-based
– Rootkit detection, logging, VMI …
§ Continuous
– HA snapshot (Remus), memory dedup …

[Three timelines along a time axis, each with levels labeled H and F]


Page 16: Progress during one cycle

[Plot: rate of progress versus time t over one cycle, with the interval marked "Featurevisor is active". Rate levels: X, α X, β X. Curves: Ephemeral, Nested.]


Page 18: Factors affecting performance

§ Overhead of nesting
§ Amount of time on hyperplexor vs featurevisor
§ Frequency of switching
§ Overhead of switching
– Design goal: minimal switching overhead

[Callout: Dependent on featurevisor and workload]
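
The four factors above can be combined into a back-of-the-envelope model. The sketch below is an illustration only and is not taken from the slides: it assumes the guest progresses at rate 1.0 on the hyperplexor, at a hypothetical `nested_rate` while it runs on the featurevisor (or all the time when always nested), makes no progress while switching, and switches twice per period. All function and parameter names are hypothetical.

```python
def nested_progress(nested_rate):
    """Average rate of progress when the guest always runs nested."""
    return nested_rate


def ephemeral_progress(period, service_time, switch_time, nested_rate):
    """Average rate of progress over one period with ephemeral virtualization.

    Assumptions (not from the slides): the base rate on the hyperplexor is 1.0,
    the guest runs at nested_rate for service_time seconds on the featurevisor,
    and each period pays two switches during which the guest is paused.
    """
    switching = 2 * switch_time
    if service_time + switching > period:
        raise ValueError("period too short for the service and switch times")
    on_hyperplexor = period - service_time - switching
    work = on_hyperplexor * 1.0 + service_time * nested_rate
    return work / period


if __name__ == "__main__":
    # 80 ms switches and a 2.5 s period appear in the slides;
    # the nested_rate values are made up for illustration.
    for nested_rate in (0.80, 0.90, 0.95):
        e = ephemeral_progress(period=2.5, service_time=0.0,
                               switch_time=0.08, nested_rate=nested_rate)
        print(f"nested={nested_progress(nested_rate):.3f} ephemeral={e:.3f}")
```

Under these assumptions, two 80 ms switches cost 0.16 / 2.5 = 6.4% of a 2.5 s period, so the ephemeral approach comes out ahead in this toy model only when always-on nesting would cost more than that.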

Page 19: Dichotomy

§ Memory management
§ Switching

[Architecture diagram labels: Guest, Hyperplexor, Featurevisor, Virtual Guest EPT, Memory Management, Migration Engine (twice), Trigger, Perform action, Guest state]

Page 20: Memory management

§ Non-nested
§ Nested
§ Dichotomy

Page 21: Memory management: non-nested

§ Hypervisor manages the mapping between guest-physical and machine-physical pages in the Guest EPT (extended page table)
§ EPT faults may cause the hypervisor to update the mapping

[Diagram labels: L0, Guest, Guest EPT, EPT Fault Handler, EPT fault]
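
A toy model can make the non-nested case concrete. The sketch below treats the Guest EPT as a dictionary from guest-physical page numbers to machine-physical page numbers and backs a page on first touch. Demand allocation is only one possible reason for an EPT fault and is assumed here for illustration; the class and method names are hypothetical and do not correspond to any real hypervisor API.

```python
class Hypervisor:
    """Toy L0 hypervisor that owns the guest EPT (hypothetical names)."""

    def __init__(self, free_machine_pages):
        self.free_machine_pages = list(free_machine_pages)
        # guest-physical page number -> machine-physical page number
        self.guest_ept = {}

    def handle_ept_fault(self, guest_page):
        """Back a faulting guest-physical page with a free machine page."""
        if not self.free_machine_pages:
            raise MemoryError("no free machine pages")
        machine_page = self.free_machine_pages.pop(0)
        self.guest_ept[guest_page] = machine_page
        return machine_page

    def translate(self, guest_page):
        """Translate a guest-physical page, taking an EPT fault on a miss."""
        if guest_page not in self.guest_ept:
            return self.handle_ept_fault(guest_page)
        return self.guest_ept[guest_page]
```

The first `translate` call for a page goes through the fault handler and updates the mapping; later calls for the same page hit the existing entry.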

Page 22: Memory management: nested

§ Guest pages mapped into L1 hypervisor
§ L1 manages mappings via a virtual EPT for the guest
§ L0 backs virtual EPT with shadow EPT

[Diagram labels: L0, L1, Guest (L2), Virtual Guest EPT, Virtual EPT, Virtual EPT Trap Handler, Shadow EPT, L1 EPT, EPT Fault Handler, Trampoline, EPT fault, bounce EPT fault, Virtual EPT modification]
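
The relationship between the three tables in the nested case can be sketched as a composition of mappings. This is a simplified illustration under stated assumptions, not the KVM implementation: page tables are plain dictionaries keyed by page number, and the function names are hypothetical.

```python
def build_shadow_ept(virtual_ept, l1_ept):
    """Compose two mappings into a shadow EPT.

    virtual_ept: L2 guest-physical page -> L1 guest-physical page (managed by L1)
    l1_ept:      L1 guest-physical page -> machine-physical page (managed by L0)
    Entries whose L1 page is not yet backed are left out, so they fault later.
    """
    return {
        l2_page: l1_ept[l1_page]
        for l2_page, l1_page in virtual_ept.items()
        if l1_page in l1_ept
    }


def on_virtual_ept_modification(shadow_ept, l1_ept, l2_page, l1_page):
    """L0 trap handler: L1 changed one virtual EPT entry, so refresh the shadow."""
    if l1_page in l1_ept:
        shadow_ept[l2_page] = l1_ept[l1_page]
    else:
        # The new target is not backed yet; drop the entry so the next access faults.
        shadow_ept.pop(l2_page, None)
```

In this sketch the hardware would walk only the shadow EPT, which is why L0 has to trap every modification L1 makes to its virtual EPT.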

Page 23: Memory management: Dichotomy

[Diagram labels: Guest, Hyperplexor, Featurevisor, EPT Fault Handler (in both), Virtual EPT, Virtual EPT Trap Handler, Trampoline, F-EPT, Dual Guest/Shadow EPT, Change Tracker, Change Receiver (H and F), EPT fault, bounce EPT fault, Forward changes, Virtual EPT modification]

Page 24: Switching

§ Similar to VM migration
– Pause VM
– Transfer VCPU, I/O, unsynchronized page table mappings
– No copying of memory

Hyperplexor:

    forever:
        execute guest while waiting for trigger
    on_trigger:
        relinquish_guest
        wait_for_guest

Featurevisor:

    forever:
        wait_for_guest
        do action
        relinquish_guest
        finish action
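
The two loops above can be mimicked with two threads that hand a single guest-state object back and forth. This is a minimal simulation under stated assumptions, not the Dichotomy implementation: the trigger fires after every burst of guest execution, the "action" is just a copy of the guest state, and `run_handoff` and the queue names are hypothetical.

```python
import queue
import threading


def run_handoff(cycles):
    """Pass one guest back and forth between two simulated hypervisors."""
    to_featurevisor = queue.Queue()  # carries guest state H -> F
    to_hyperplexor = queue.Queue()   # carries guest state F -> H
    log = []

    def hyperplexor():
        guest = {"instructions": 0}
        for _ in range(cycles):
            guest["instructions"] += 1000   # execute guest until the trigger fires
            log.append("H: relinquish")
            to_featurevisor.put(guest)      # relinquish_guest
            guest = to_hyperplexor.get()    # wait_for_guest
        to_featurevisor.put(None)           # tell the featurevisor to stop

    def featurevisor():
        while True:
            guest = to_featurevisor.get()   # wait_for_guest
            if guest is None:
                return
            snapshot = dict(guest)          # do action (here: copy the state)
            log.append("F: action")
            to_hyperplexor.put(guest)       # relinquish guest
            # finish action: work on the copy while the guest runs elsewhere
            log.append(f"F: finished at {snapshot['instructions']}")

    threads = [threading.Thread(target=hyperplexor),
               threading.Thread(target=featurevisor)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Only the guest state moves through the queues, which loosely mirrors the "no copying of memory" point: the featurevisor keeps working on its snapshot after it has already given the guest back.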

Page 25: Triggers

§ Time-based
§ On new network flow?
§ On particular memory access?
§ On program counter value?
§ What’s the right set of triggers?

Page 26: Future work

§ Multiple featurevisors for one guest
– How do featurevisors interact?
§ Multiple guests on one featurevisor
– E.g., temporary high-speed memory channel
§ Common hyperplexor services
– Dirty page tracking service?
§ What is the right featurevisor interface?

Page 27: Evaluation

§ Implementation based on KVM/QEMU
– Time-based triggers
§ When is Dichotomy better than nested?
§ How small are switching times?

Page 28: Evaluation: setup

§ Featurevisor applications
– No-op
– Volatility (VMI rootkit detection)
– Netmon (1 second tcpdump)
§ Guest workloads
– Quicksort 800 MB
– Kernbench
– Netperf

Page 29: Workload performance: no-op featurevisor

[Three plots, one per workload (quicksort, kernbench, netperf): performance relative to base (0.0 to 1.0) versus period (0 to 16 seconds), with curves for Dichotomy, Nested, and Pre-copy]

§ Dichotomy outperforms nesting if period is sufficiently long (2.5 to 6 seconds)
§ Pre-copy suffers because of slow switching times

Page 30: Workload performance: volatility featurevisor

§ Service time of volatility is sufficiently long (4 s)
– Dichotomy always outperforms nesting

[Three plots, one per workload (quicksort, kernbench, netperf): performance relative to base (0.0 to 1.0) versus period (0 to 16 seconds), with curves for Dichotomy, Nested, and Pre-copy]

Page 31: How fast is switching time?

§ 80 ms switching time, independent of guest memory size
§ Compared to ~300 ms downtime with live migration

[Plot: switching time (s), 0 to 12, versus guest memory size (GB), 0 to 10, with curves for Pre-copy and Dichotomy]

Page 32: Related Work

§ On-demand switching to emulation
– Taint-based protection (Ho et al., EuroSys 2006)
§ Switching between bare metal and virtualization
– On-demand virtualization (Kooburat and Swift, HotOS 2011)
– Mercury (Chen et al., ICPP 2007)
§ Reducing hypervisor to its essentials
– Microvisor (Lowell et al., ASPLOS 2004)
– CloudVisor (Zhang et al., SOSP 2011)
§ Disaggregating administrative domain
– Breaking up is hard to do (Colp et al., SOSP 2011)

Page 33: Conclusion

§ Cloud providers shouldn’t need to choose between small and feature-filled hypervisors
§ Dichotomy: split the role of the hypervisor into hyperplexor and featurevisor
§ Switch between them with ephemeral virtualization
– 80 ms switching times
– Up to 12% better performance than nested with 2.5 s switching interval

[Diagram labels: Guest, VM, Hyperplexor, Featurevisor, Gain control, Relinquish control]


©2015 IBM Corporation