Faculty of Computer Science, Institute of Systems Architecture, Operating Systems
THE NOVA KERNEL API
Julian Stecklina ([email protected])
Dresden, 5.2.2012
00 Disclaimer
This is not about OpenStack Compute.
NOVA is mainly the work of Udo Steinberg (kernel) and Bernhard Kauer (userland).
http://hypervisor.org/
TU Dresden, 5.2.2012 The NOVA Kernel API Slide 2 of 26
00 Goals
• not talking about virtualization propaganda,
• giving a very short overview of NOVA as a whole
• introducing basic concepts of the kernel API
In the end you should be able to pick up the NOVA API manual and make heads or tails of it.
01 NOVA OS Virtualization Architecture
http://os.inf.tu-dresden.de/papers_ps/steinberg_eurosys2010.pdf
01 What works, what doesn’t
Works
• x86 32-bit
• SMP
• VT-x, AMD-V
• VT-d (Intel IOMMU)
• SR-IOV
• grub, syslinux, . . .
• Linux, L4, . . .
• emulates AHCI, igb, . . .
• drivers for AHCI, some Intel NICs, . . .
• experimental libvirt support
Doesn’t work yet
• Windows
• Migration
• Recursive Virtualization
• 64-bit
• being user-friendly ;-)
• . . .
02 NOVA Architecture
Reduce complexity of hypervisor:
• hypervisor provides low-level protection domains
  – address spaces
  – virtual machines
• one VMM per guest in (root mode) userspace,
  – possibly specialized VMMs to reduce attack surface
  – only one generic VMM implemented so far
Demo
03 The L4 Influence
NOVA cannot deny its roots in the L4 family:
• task, threads, synchronous IPC
• recursive mapping of memory
03 Capability-Based
Syscalls operate on capabilities to kernel objects:
• Protection Domain (PD) (“task”): create_pd
• Execution Context (EC) (“thread”): create_ec, ec_ctrl
• Scheduling Context (SC): create_sc, sc_ctrl
• Portal (PT): create_pt, call, reply
• Semaphore (SM): create_sm, sm_ctrl
03 Capabilities
Userspace can
• create capabilities to objects (by creating kernel objects),
• delegate capabilities (recursively, just as memory).
Capabilities are stored per PD in a capability space in the kernel.
• A PD uses an index into its capability space to name a capability.
• Capabilities are unforgeable.
(Think file descriptors.)
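The file-descriptor analogy can be made concrete with a toy model. This is only a sketch of the idea, not of the real kernel data structures: the class and method names (`ProtectionDomain`, `create_object`, `delegate`, `invoke`) are made up for illustration and are not NOVA syscalls.

```python
class KernelObject:
    """Stand-in for a kernel object such as a portal or a semaphore."""
    def __init__(self, kind):
        self.kind = kind


class ProtectionDomain:
    """Hypothetical PD model: the capability space lives 'inside the kernel'."""
    def __init__(self, name):
        self.name = name
        self._cap_space = {}  # index (selector) -> KernelObject

    def create_object(self, selector, kind):
        # Creating a kernel object installs a capability at the chosen index.
        if selector in self._cap_space:
            raise ValueError("selector already in use")
        self._cap_space[selector] = KernelObject(kind)
        return selector

    def delegate(self, selector, target_pd, target_selector):
        # Delegation installs a reference to the same object in another PD.
        # The receiver may name it by a different index.
        target_pd._cap_space[target_selector] = self._lookup(selector)

    def _lookup(self, selector):
        # An index with no capability behind it names nothing, so guessing
        # numbers does not grant access.
        if selector not in self._cap_space:
            raise PermissionError("no capability at selector %d" % selector)
        return self._cap_space[selector]

    def invoke(self, selector):
        return self._lookup(selector).kind


server = ProtectionDomain("server")
client = ProtectionDomain("client")
server.create_object(5, "portal")
server.delegate(5, client, 12)
```

In this model the client reaches the portal through its own index 12, while index 5 means nothing in the client's capability space, much like a file descriptor number is only meaningful inside one process.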
03 Communication
EC (thread)
• bound to one PD (address space)
• either thread or vCPU
• has a special memory region (UTCB) for IPC
Portals
• entry point (instruction pointer)
• bound to one EC
• per client/function/. . .
• pass data, delegate capabilities from UTCB to UTCB
• can be called or implicitly used by exceptions (if a thread has the cap)
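A minimal sketch of how these pieces relate, assuming a plain Python list as a stand-in for the UTCB and a Python function as a stand-in for the portal's entry instruction pointer. All names here are hypothetical, and capability delegation through the UTCB is left out.

```python
class ExecutionContext:
    def __init__(self, name):
        self.name = name
        self.utcb = []  # message words, standing in for the UTCB


class Portal:
    def __init__(self, ec, entry):
        self.ec = ec        # the one EC this portal is bound to
        self.entry = entry  # stands in for the entry instruction pointer

    def call(self, caller):
        # The message is transferred from the caller's UTCB to the handler's.
        self.ec.utcb = list(caller.utcb)
        # The handler EC starts executing at the portal's entry point ...
        self.entry(self.ec)
        # ... and the reply travels back from UTCB to UTCB.
        caller.utcb = list(self.ec.utcb)


def to_upper(ec):
    ec.utcb = [word.upper() for word in ec.utcb]


def reverse(ec):
    ec.utcb = list(reversed(ec.utcb))


service = ExecutionContext("service")
# Two portals bound to the same EC, one per function.
upper_pt = Portal(service, to_upper)
reverse_pt = Portal(service, reverse)

client = ExecutionContext("client")
client.utcb = ["write", "data"]
upper_pt.call(client)
```

After the call, `client.utcb` holds the reply. Handing a client only `upper_pt` and not `reverse_pt` is the "per client/function" point: which portals a client holds decides which entry points it can reach.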
03 ECs and SCs
There are two kinds of threads:
with time
• “global thread” or vCPU
• attach SC to newly created EC
• causes startup exception when first scheduled
without time
• “local thread”
• bind portals to ECs
• when portal invoked, starts executing at portal EIP
• caller hands in time to handle the request (no scheduling decision)
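The difference between the two kinds can be sketched as time accounting. This is a toy model with made-up names and abstract time units, not the kernel's scheduler: a global thread owns an SC, a local thread does not, and a portal call charges the handler's work to the caller's SC.

```python
class SchedulingContext:
    def __init__(self, budget):
        self.budget = budget  # abstract time units


class GlobalThread:
    """EC with its own SC: the scheduler can pick it."""
    def __init__(self, name, budget):
        self.name = name
        self.sc = SchedulingContext(budget)


class LocalThread:
    """EC without an SC: it only runs while a caller lends its time."""
    def __init__(self, name):
        self.name = name


class Portal:
    def __init__(self, local_thread, handler, cost):
        self.local_thread = local_thread
        self.handler = handler
        self.cost = cost  # time units the handler consumes (made up)

    def call(self, caller, request):
        # No scheduling decision: the handler runs on the caller's SC.
        if caller.sc.budget < self.cost:
            raise RuntimeError("caller has no time left to hand in")
        caller.sc.budget -= self.cost
        return self.handler(request)


def schedulable(threads):
    # Only ECs that own an SC with budget left are visible to the scheduler.
    return [t.name for t in threads if getattr(t, "sc", None) and t.sc.budget > 0]


client = GlobalThread("client", budget=10)
server = LocalThread("server")
portal = Portal(server, handler=lambda x: x * 2, cost=3)
result = portal.call(client, 21)
```

The server never appears in `schedulable(...)`, yet the request is handled, and the three time units it cost are missing from the client's budget.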
03 Basic Server Scenario
[Figure, built up over three steps: ECglobal and EClocal; (1) call, then (2) reply.]
03 Resource Contention
ECs are not reentrant.
What happens when a second client wants to call a service?
[Figure, built up over several steps: ECglobal and EClocal; (1) call, (2) call, (3) reply, (4) reply.]
03 NOVA’s time management
With SCs only bound to some threads, it is possible to build (simple) servers without time reservation.
• How much time should service foo need anyway?
• fewer things to schedule,
• contended resources get “boosted” by clients as needed.
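One way to read the "boosting" point, sketched as a toy model: if a server EC is still busy with an earlier request, the next caller's time is first spent on finishing that request, and only then on its own. The class, its fields, and the time units are assumptions for illustration. The real kernel mechanism is not described on the slides.

```python
class LocalServer:
    """Hypothetical non-reentrant server EC: one request in progress at most."""
    def __init__(self):
        self.current = None  # [client, remaining work] of the request in progress
        self.log = []

    def call(self, client, work, budget):
        """Call with `budget` time units to hand in; returns the unused budget."""
        self.log.append("call from %s" % client)
        if self.current is not None:
            # The EC is busy: the caller's time goes into the request that is
            # already in progress, which "boosts" the contended server.
            budget = self._run(budget)
            if self.current is not None:
                return budget  # still busy; the caller has helped but must retry
        self.current = [client, work]
        return self._run(budget)

    def _run(self, budget):
        owner, remaining = self.current
        spent = min(budget, remaining)
        self.current[1] -= spent
        if self.current[1] == 0:
            self.log.append("reply to %s" % owner)
            self.current = None
        return budget - spent


server = LocalServer()
left_a = server.call("A", work=5, budget=2)   # A runs out of time mid-request
left_b = server.call("B", work=1, budget=10)  # B finishes A's request, then its own
```

In this run the log reads call, call, reply, reply. The server itself never needed a time reservation: every unit of work was paid for by one of its clients.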
04 Hardware Support for Virtualization
Late Pentium 4 (2004) introduced hardware support for virtualization: Intel VT. (AMD-V is conceptually very similar.)
• root mode vs. non-root mode
  – root mode runs hypervisor
  – non-root mode runs guest
• situations that Intel VT cannot handle trap to root mode (VM Exit)
• special memory region (VMCS) holds guest state
• reduced software complexity
Supported by all major virtualization solutions today.
04 VT-x Problems
VMCS (memory region holding guest state) needs to be manipulated by VMM, yet
• cannot be mapped into userspace,
• have to use privileged VMREAD/VMWRITE instructions to access,
• reading all content for every VM Exit is expensive.
Kernel has to manage VMCS access.
04 Virtualization on NOVA
VM Exits (and normal exceptions) vector through special portals.
• Portals created with bit field denoting interesting information (Message Transfer Descriptor, MTD)
  – for WRMSR or CPUID we need only general purpose registers
  – for page fault we need complete vCPU state
• kernel puts this data in handler’s UTCB
• handler produces new MTD on reply
Reduce number of expensive VMREAD/VMWRITE in the kernel.
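A sketch of the MTD idea, assuming a made-up set of state groups. The flag names and their bit positions below are purely illustrative; the real MTD layout is defined in the NOVA interface specification. Each portal carries a bit field, and only the groups named there are copied into the handler's UTCB, which is what keeps the number of VMREADs down.

```python
from enum import IntFlag


class Mtd(IntFlag):
    """Illustrative state groups, not the real MTD bit layout."""
    GPR = 1 << 0            # general purpose registers
    IP = 1 << 1             # instruction pointer
    FLAGS = 1 << 2
    SEGMENTS = 1 << 3
    CONTROL_REGS = 1 << 4
    QUALIFICATION = 1 << 5  # exit details such as a faulting address


ALL = (Mtd.GPR | Mtd.IP | Mtd.FLAGS | Mtd.SEGMENTS
       | Mtd.CONTROL_REGS | Mtd.QUALIFICATION)

# Hypothetical per-exit portals, each with the MTD chosen at creation time.
PORTAL_MTD = {
    "cpuid": Mtd.GPR,
    "wrmsr": Mtd.GPR,
    "page_fault": ALL,
}


def deliver_exit(reason, vcpu_state):
    """Copy only the groups named by the portal's MTD into the 'UTCB'."""
    mtd = PORTAL_MTD[reason]
    return {g.name: vcpu_state[g.name] for g in Mtd if g in mtd}


def apply_reply(vcpu_state, utcb, reply_mtd):
    """On reply, write back only the groups the handler's new MTD names."""
    for g in Mtd:
        if g in reply_mtd:
            vcpu_state[g.name] = utcb[g.name]


vcpu = {g.name: "%s-state" % g.name.lower() for g in Mtd}
utcb = deliver_exit("cpuid", vcpu)
utcb["GPR"] = "cpuid-result"
apply_reply(vcpu, utcb, Mtd.GPR)
```

In this model a CPUID exit moves one state group in each direction, while a page fault exit would move all six.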
04 Writing to disk
[Figure, built up over several steps. Components: VMM, Drv, vCPU, Exc Handler, IRQ, Shared Memory. Labels in order of appearance: MMIO, "write data", "working on it", recall, recall exception, inject irq.]
http://os.inf.tu-dresden.de/papers_ps/steinberg_eurosys2010.pdf
05 There is also . . .
• Userspace Timer Service
• Admission Server
• Device Drivers (IOMMU!)
• . . .
05 Summary
The NOVA microhypervisor is a
• fast capability-based microkernel
• with virtualization in mind.
Supported by:
http://ict-passive.eu/
Code at http://hypervisor.org/
Discuss at http://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
06 Multiple CPUs
Thread-related kernel objects are bound to one CPU:
• Portals,
• Execution Contexts,
• Scheduling Contexts.
Semaphores work cross-CPU. Communication via semaphores/recall.
“Non-donating” cross-CPU IPC never really needed.
Servers can be CPU-topology aware!
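The semaphore pattern can be sketched with ordinary Python threads standing in for ECs on two different CPUs. `threading.Semaphore` plays the role of an SM kernel object and a `deque` plays the role of a shared memory region. This is an analogy only, and the function names are made up.

```python
import threading
from collections import deque

shared_memory = deque()                 # stands in for a shared memory region
request_ready = threading.Semaphore(0)  # stands in for an SM kernel object
reply_ready = threading.Semaphore(0)
replies = []


def server_on_cpu1():
    # The server blocks on a semaphore instead of being entered via a portal.
    request_ready.acquire()
    replies.append(shared_memory.popleft().upper())
    reply_ready.release()


def client_on_cpu0(message):
    shared_memory.append(message)
    request_ready.release()  # "up": wakes the server on the other CPU
    reply_ready.acquire()    # "down": wait until the request is done
    return replies.pop()


server = threading.Thread(target=server_on_cpu1)
server.start()
answer = client_on_cpu0("ping")
server.join()
```

Unlike a portal call, nothing is donated here: the server thread needs its own time on its own CPU, which is why this model corresponds to a server with an SC of its own.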
http://os.inf.tu-dresden.de/papers_ps/steinberg_eurosys2010.pdf
06 Livelock
It’s possible to construct helping loops... Ouch!
• Kernel detects loop
• Random IPC is aborted
[Figure: Pager, Srv A, Srv B, Client.]
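The two bullets can be illustrated with generic cycle detection over a "blocked in a call to" relation. How the kernel actually detects the loop is not stated on the slide, so this is a sketch under assumptions: the function names are hypothetical, and the edges in `waits_on` are made up, borrowing only the names from the figure.

```python
import random


def find_helping_loop(waits_on, start):
    """Follow 'X is blocked in a call to Y' edges; return the loop, if any."""
    path = []
    node = start
    while node is not None:
        if node in path:
            return path[path.index(node):]
        path.append(node)
        node = waits_on.get(node)
    return None


def break_loop(waits_on, loop, rng=random):
    """Abort one arbitrarily chosen IPC on the loop."""
    victim = rng.choice(loop)
    del waits_on[victim]
    return victim


# Made-up wait-for relation using the names from the figure.
waits_on = {
    "Client": "Srv A",
    "Srv A": "Srv B",
    "Srv B": "Pager",
    "Pager": "Srv A",
}
loop = find_helping_loop(waits_on, "Client")
```

Here a helper starting at the client would chase Srv A, Srv B, and the pager forever. Removing any single edge on the loop is enough to let the chain terminate.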