TRANSCRIPT
Penn State CSE
“Optimizing Network Virtualization in Xen”
Aravind Menon, Alan L. Cox, Willy Zwaenepoel
Presented by: Arjun R. Nath
Introduction
● Paper to appear in USENIX 2006
– Alan L. Cox, Rice University
– Aravind Menon and Willy Zwaenepoel, EPFL, Lausanne
● Outline: the paper introduces three optimizations to the existing Xen (2.0) design to improve networking efficiency:
– Improving the virtual network interface
– A faster I/O channel for exchanging packets between host and guest
– Virtual memory modifications to reduce TLB misses by guest domains
Xen Network I/O Architecture
1. Improving the Virtual Network Interface
● Front-end, virtualized network driver for Xen guests
– Simple, low-level interface: allows support for a large number of physical NICs.
– However, this prevents the virtual driver from using advanced NIC capabilities, such as:
● Checksum offload
● Scatter/gather DMA support
● TCP segmentation offload
– Improve by making use of some of these NIC capabilities where possible
1. Improving the Virtual Network Interface
● TCP Segmentation Offload (or TCP Large Send): the NIC accepts packets much larger than the supported maximum transmission unit (MTU) of a given medium, and the work of dividing those large packets into MTU-sized segments is offloaded to the NIC. (Less CPU processing.)
● Scatter/Gather I/O: enables the OS to construct packets directly from the file system buffers without needing to copy them to a contiguous memory location.
● Checksum Offload: compute the TCP data checksum on NIC hardware rather than in software.
Modified Network I/O Architecture
Offload driver in Driver Domain
Results of Network modifications
Offload benefits
2. I/O Channel Improvements
Pre-modification operations
● Packet transfer between the guest and driver domain is done via a zero-copy page-remapping mechanism.
● The page containing the packet is remapped into the address space of the target domain.
● This mechanism requires each packet to be allocated on a separate page.
● Each packet receive/transmit requires three/two address remaps and two memory allocation/deallocation operations.
2. I/O Channel Improvements
Optimizations
● Transmit:
– Let the driver domain examine the MAC header of the packet to check whether the destination is the driver domain itself or a broadcast address.
– Then construct the network packet from the packet header and the (possibly unmapped) packet fragments and send it over the bridge. (Needs gather DMA support in the NIC.)
● Receive:
– Reduce small-packet overheads by doing a data copy instead of a page remap.
– Implemented using shared pages between Dom0 and the guest.
Results of I/O channel changes
I/O channel optimization benefit (transmit)
3. Virtual Memory Improvements
● Observations
– High number of TLB misses for guest domains in Xen during network operations, compared to native Linux
– Possibly due to an increase in working-set size
– Absence of support for virtual memory primitives such as superpage mappings and global page mappings
3. Virtual Memory Improvements
Optimizations
● Modified the guest OS to use a superpage mapping for a virtual address range only if the associated physical pages are contiguous
● Modified the memory allocator in the guest OS to try to group memory pages into a contiguous range (within a superpage)
● Modified the VMM to allow guests to use global page mappings in their address space (limited benefit)
● Avoid a TLB flush when switching from a busy domain to the idle domain and back
Results for virtual memory changes
TLB measurements using Xenoprof
Results for virtual memory changes
Data TLB measurements
Virtual Memory Issues
● Transparent page sharing between VMs (currently not implemented in Xen) breaks the contiguity of physical page frames. Bad for superpages.
● Use of the ballooning driver also breaks the contiguity of physical pages. (Solution: do coarse-grained ballooning in units of the superpage size.)
Results Transmit (Overall)
Transmit throughput measurements
Results Receive (Overall)
Receive Throughput measurements
Results overall
● Transmit throughput of guests improved by more than 300%
● Receive throughput improved by 30%
● Receive performance is still a bottleneck
Questions to the Authors
● Why did they use Xen 2.0 rather than 3.x?
– They had started this work when Xen was at 2.0.6 and continued working on it even after Xen 3.0 was announced.
● Will these optimizations be included in the Xen 3.x codebase?
– Yes, they are looking to do that (it's not done yet).
● Any more?
Thanks !