QEMU Disk IO: Which Performs Better, Native or Threads?
Pradeep Kumar Surisetty, Red Hat, Inc.
devconf.cz, February 2016
Outline
● KVM IO Architecture
● Storage transport choices in KVM
● Virtio-blk Storage Configurations
● Performance Benchmark tools
● Challenges
● Performance Results with Native & Threads
● Limitations
● Future Work
KVM I/O Architecture
[Figure: KVM I/O architecture. Each KVM guest (applications, file system and block layer, drivers, guest kernel) runs on vcpu0 … vcpuN threads plus an iothread, on top of hardware emulation (QEMU), KVM (kvm.ko), the host's file systems, block devices and physical drivers, and the hardware (cpu0 … cpuM).]
● Hardware emulation (QEMU) generates I/O requests to the host on the guest's behalf and handles events.
Notes:
● Each guest CPU has a dedicated vcpu thread that uses the kvm.ko module to execute guest code.
● There is an I/O thread that runs a select(2) loop to handle events.
Storage transport choices in KVM
● Full virtualization: IDE, SATA, SCSI
  ● Good guest compatibility
  ● Lots of trap-and-emulate, bad performance
● Paravirtualization: virtio-blk, virtio-scsi
  ● Efficient guest ↔ host communication through the virtio ring buffer (virtqueue)
  ● Good performance
  ● Provides a more virtualization-friendly interface, higher performance
  ● In the AIO case, io_submit() is under the global mutex
● Device assignment (passthrough)
  ● Passes hardware to the guest; high-end usage, high performance
  ● Limited number of PCI devices
  ● Hard for live migration
[Figure: Storage transport choices in KVM, full virtualization vs. paravirtualization]
Ring buffer with paravirtualization
[Figure: a virtio PCI controller and virtio device on both the guest side and the QEMU side, connected by the vring; the guest signals QEMU with a kick.]
Virtio-blk-data-plane:
● Accelerated data path for para-virtualized block I/O driver
● Threads are defined by -object iothread,id=<id>, and the user can set up arbitrary device->iothread mappings (multiple devices can share an iothread)
● No need to acquire big QEMU lock
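The device->iothread mapping described above can be sketched as a small helper that builds QEMU command-line arguments. This is only an illustration: the helper name, the drive ids, and the image paths are hypothetical, and the option strings (`-object iothread,id=...`, `-drive ...,if=none,...`, `-device virtio-blk-pci,...,iothread=...`) should be checked against the QEMU version in use.

```python
def qemu_iothread_args(mapping, aio="native"):
    """Build QEMU arguments for a {image_path: iothread_id} mapping.

    Several images may name the same iothread id; in that case the
    iothread object is declared once and shared by those devices.
    """
    args = []
    declared = set()
    for index, (image, iothread) in enumerate(mapping.items()):
        if iothread not in declared:
            # Declare each iothread object only once.
            args += ["-object", f"iothread,id={iothread}"]
            declared.add(iothread)
        drive_id = f"drive{index}"  # hypothetical naming scheme
        args += [
            "-drive",
            f"file={image},if=none,id={drive_id},format=raw,cache=none,aio={aio}",
        ]
        # Attach the virtio-blk device to its iothread.
        args += ["-device", f"virtio-blk-pci,drive={drive_id},iothread={iothread}"]
    return args
```

Passing two images that name the same iothread id yields a single `-object` entry and two `-device` entries that share it.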
[Figure: two stacks of KVM guest, QEMU event loop, and host kernel; in the data-plane case, dedicated virtio-blk-data-plane thread(s) sit beside the QEMU event loop and use Linux AIO and irqfd.]
Virtio-blk Storage Configurations
[Figure: Device-backed virtual storage. Applications in the KVM guest use an LVM volume on virtual devices (/dev/vda, /dev/vdb, …), which map through paravirtualized drivers with direct I/O to host block devices (/dev/sda, /dev/sdb, …) on physical storage.]
[Figure: File-backed virtual storage. Applications in the KVM guest use an LVM volume whose virtual devices map through paravirtualized drivers with direct I/O to a RAW or QCOW2 file on the host server, stored on an LVM volume on physical storage.]
Openstack: Libvirt: AIO mode for disk devices
1) Asynchronous IO (AIO=Native)
Using io_submit calls
2) Synchronous (AIO=Threads)
pread64, pwrite64 calls
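The difference between the two modes can be illustrated with a rough sketch of the aio=threads model: each request is a blocking pread issued from a worker thread, so the submitter is not blocked per request. This is not QEMU's code; the function name and the pool size are made up for illustration. aio=native would instead hand requests to the kernel with io_submit, which has no standard-library Python binding and is therefore not shown.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_blocks_with_threads(path, block_size, count, workers=4):
    """Read `count` blocks of `block_size` bytes using blocking pread calls
    issued from a pool of worker threads (the aio=threads idea)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        offsets = [i * block_size for i in range(count)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Each submitted task performs one synchronous pread.
            futures = [pool.submit(os.pread, fd, block_size, off) for off in offsets]
            return [future.result() for future in futures]
    finally:
        os.close(fd)
```

The cost of this model is one thread hand-off per request, which is one plausible reason the CPU usage of the two modes differs.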
Default choice in OpenStack is aio=threads*
Ref: https://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/libvirt-aio-mode.html
* Before solving this problem
Example XML
● aio=native:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native'/>
  <source file='/home/psuriset/xfs/vm2-native-ssd.qcow2'/>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>
● aio=threads:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='threads'/>
  <source file='/home/psuriset/xfs/vm2-threads-ssd.qcow2'/>
  <target dev='vdc' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
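As a rough illustration, disk definitions like the ones in this example can be generated programmatically so that the io attribute is the only thing that changes between test runs. The function name and the paths below are hypothetical; only the element and attribute names are taken from the example XML.

```python
import xml.etree.ElementTree as ET


def disk_element(image_path, target_dev, io_mode, image_format="qcow2"):
    """Return a libvirt-style <disk> element with the given AIO mode."""
    if io_mode not in ("native", "threads"):
        raise ValueError("io_mode must be 'native' or 'threads'")
    disk = ET.Element("disk", {"type": "file", "device": "disk"})
    # cache='none' matches the example XML shown in the slides.
    ET.SubElement(disk, "driver", {
        "name": "qemu", "type": image_format, "cache": "none", "io": io_mode,
    })
    ET.SubElement(disk, "source", {"file": image_path})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "virtio"})
    return disk


if __name__ == "__main__":
    for mode, dev in (("native", "vdb"), ("threads", "vdc")):
        element = disk_element(f"/tmp/vm-{mode}.qcow2", dev, mode)
        print(ET.tostring(element, encoding="unicode"))
```

The PCI address element is left out here on the assumption that libvirt assigns one when it is omitted.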
[Figure: CPU usage with aio=native]
[Figure: CPU usage with aio=threads]
Multiple layers evaluated with virtio-blk
● File systems: ext4, XFS
● NFS
● Disks: SSD, HDD
● Images: qcow2, qcow2 (with falloc), qcow2 (with fallocate), raw (preallocated)
● Backing: file, block device
● Jobs: seq read, seq write, rand read, rand write, rand read/write
● Block sizes: 4k, 16k, 64k, 256k
● Number of VMs: 1, 16 (concurrent)
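To give a feel for the size of this test matrix, the layers can be enumerated as a Cartesian product. This is a simplified sketch: it leaves out the image formats, NFS, and the LVM case, and the variable names are illustrative rather than taken from the actual test harness.

```python
from itertools import product

FILE_SYSTEMS = ["ext4", "xfs"]
DISKS = ["ssd", "hdd"]
JOBS = ["read", "write", "randread", "randwrite", "randrw"]
BLOCK_SIZES_KB = [4, 16, 64, 256]
VM_COUNTS = [1, 16]
AIO_MODES = ["native", "threads"]


def build_matrix():
    """Return every combination of the layers as a list of dicts."""
    keys = ("fs", "disk", "job", "bs_kb", "vms", "aio")
    combos = product(FILE_SYSTEMS, DISKS, JOBS, BLOCK_SIZES_KB, VM_COUNTS, AIO_MODES)
    return [dict(zip(keys, combo)) for combo in combos]


if __name__ == "__main__":
    print(len(build_matrix()), "combinations")
```

Even this reduced set yields 2 × 2 × 5 × 4 × 2 × 2 = 320 combinations, which is why automation of the runs matters.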
Test Environment
Hardware
● 2 x Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
● 256 GiB memory @1866MHz
● 1 x 1 TB NVMe PCI SSD
● 1 x 500 GB HDD
Software
● Host: RHEL 7.2, kernel 3.10.0-327
● Qemu: 2.3.0-31 + AIO Merge Patch
● VM: RHEL 7.2
Tools: What is Pbench?
pbench (perf bench) aims to:
● Provide easy access to benchmarking & performance tools on Linux systems
● Standardize the collection of telemetry and configuration information
● Automate benchmark execution
● Output effective visualization for analysis and allow for ingestion into Elasticsearch
Pbench continued: tool visualization
[Figure: sar tool, total CPU consumption]
[Figure: iostat tool, disk request size]
[Figure: proc-interrupts tool, function call interrupts/sec]
[Figure: proc-vmstat tool, NUMA stats: entries in /proc/vmstat which begin with "numa_" (delta/sec)]
Pbench continued: pbench benchmarks
Example: fio benchmark
# pbench_fio --config=baremetal-hdd
● Runs a default set of iterations: [read,rand-read]*[4KB, 8KB….64KB]
● Takes 5 samples per iteration and computes avg, stddev
● Handles start/stop/post-process of tools for each iteration
Other fio options:
--targets=<devices or files>
--ioengine=[sync, libaio, others]
--test-types=[read,randread,write,randwrite,randrw]
--block-sizes=[<int>,[<int>]] (in KB)
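The per-iteration summary mentioned above (5 samples, average and standard deviation) amounts to the following calculation. This is a sketch only: the sample values are made up, and whether pbench uses the sample or the population standard deviation is an assumption here (sample standard deviation is used).

```python
from statistics import mean, stdev


def summarize(samples):
    """Return average, sample standard deviation, and stddev as a percentage."""
    avg = mean(samples)
    sd = stdev(samples)
    return {"avg": avg, "stddev": sd, "stddev_pct": 100.0 * sd / avg}


if __name__ == "__main__":
    # Hypothetical throughput samples from five runs of one iteration.
    iops_samples = [100, 102, 98, 101, 99]
    print(summarize(iops_samples))
```

A high stddev percentage is a hint that the iteration should be rerun before comparing native against threads.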
FIO: Flexible IO Tester
● IO type: Defines the IO pattern issued to the file(s). We may only be reading sequentially from the file(s), or we may be writing randomly, or even mixing reads and writes, sequentially or randomly.
● Block size: In how large chunks are we issuing IO? This may be a single value, or it may describe a range of block sizes.
● IO size: How much data are we going to be reading/writing?
● IO engine: How do we issue IO? We could be memory mapping the file, we could be using regular read/write, we could be using splice, async IO, syslet, or even SG (SCSI generic sg).
● IO depth: If the IO engine is async, how large a queuing depth do we want to maintain?
● IO type: Should we be doing buffered IO, or direct/raw IO?
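The parameters above map onto options in a fio job file. The sketch below renders one job section as text; the helper name and the default values are illustrative, while `rw`, `bs`, `size`, `ioengine`, `iodepth`, and `direct` are fio option names.

```python
def fio_job(name, filename, rw="randread", bs="4k", size="1g",
            ioengine="libaio", iodepth=32, direct=True):
    """Render a single fio job section as job-file text."""
    lines = [
        f"[{name}]",
        f"filename={filename}",
        f"rw={rw}",              # IO type: read, write, randread, randwrite, randrw
        f"bs={bs}",              # block size
        f"size={size}",          # IO size
        f"ioengine={ioengine}",  # IO engine
        f"iodepth={iodepth}",    # queue depth, meaningful for async engines
        f"direct={1 if direct else 0}",  # direct (1) vs buffered (0) IO
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(fio_job("randread-4k", "/dev/vdb"))
```

Writing the returned text to a file and passing that file to fio runs the job; the target device here is only a placeholder.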
[Figure: Guest and host iostat during 4k seq read with aio=native]
AIO Native
● aio=native uses Linux AIO io_submit(2) for read and write requests, and request completion is signaled using eventfd.
● Virtqueue kicks are handled in the iothread. When the guest writes to the virtqueue kick hardware register, the kvm.ko module signals the ioeventfd which the main loop thread is monitoring.
● Requests are collected from the virtqueue and submitted (after write request merging) either via aio=threads or aio=native.
● Request completion callbacks are invoked in the main loop thread and an interrupt is injected into the guest.
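The completion path described above, where a main loop monitors a file descriptor that is signaled when a request finishes, can be sketched as follows. This is a simplified stand-in and not QEMU's implementation: a pipe plays the role of the eventfd, and worker threads play the role of the kernel completing requests. All names are illustrative.

```python
import os
import select
import threading


def run_request(notify_fd, results, request_id):
    """Pretend to complete one I/O request, then signal the main loop."""
    results.append(request_id)
    os.write(notify_fd, b"\x01")  # one byte per completed request


def main_loop(expected):
    """Wait on the notification fd until `expected` completions arrive."""
    read_fd, write_fd = os.pipe()
    results = []
    workers = [
        threading.Thread(target=run_request, args=(write_fd, results, i))
        for i in range(expected)
    ]
    for worker in workers:
        worker.start()
    completed = 0
    while completed < expected:
        # Block until the notification fd becomes readable.
        readable, _, _ = select.select([read_fd], [], [])
        if read_fd in readable:
            completed += len(os.read(read_fd, 64))
    for worker in workers:
        worker.join()
    os.close(read_fd)
    os.close(write_fd)
    return sorted(results)
```

In the real path, the completion callback would run at the point where the loop counts a completion, followed by the interrupt injection into the guest.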
Challenges for Read with aio=native
● virtio-blk does *not* merge read requests in qemu-kvm. It only merges write requests.
● QEMU submits each 4 KB request through a separate io_submit() call.
● QEMU would submit only one request at a time, even though multiple requests were waiting to be processed.
● A batching method was implemented for both virtio-scsi and virtio-blk-data-plane disks.
Batch Submission
What is I/O batch submission?
● Handle more requests in a single system call (io_submit), so the number of io_submit calls can be reduced a lot
Abstracting with generic interfaces
● bdrv_io_plug() / bdrv_io_unplug()
● Merged in fc73548e444ae3239f6cef44a5200b5d2c3e85d1 (virtio-blk: submit I/O as a batch)
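The effect of batching can be shown with a toy model of the plug/unplug idea: while the queue is plugged, requests accumulate, and unplugging submits them together in one call. This is loosely modeled on the interface names above and is not QEMU's implementation; the class and its `submit` callback (standing in for one io_submit call) are hypothetical.

```python
class BatchQueue:
    """Toy request queue that submits in batches while plugged."""

    def __init__(self, submit):
        self._submit = submit  # called once per batch, like one io_submit
        self._plugged = 0
        self._pending = []

    def plug(self):
        self._plugged += 1

    def unplug(self):
        self._plugged -= 1
        if self._plugged == 0:
            self._flush()

    def enqueue(self, request):
        self._pending.append(request)
        if self._plugged == 0:
            # Not plugged: every request costs its own submit call.
            self._flush()

    def _flush(self):
        if self._pending:
            batch, self._pending = self._pending, []
            self._submit(batch)
```

Enqueuing three requests without plugging triggers three submit calls, while the same three requests between plug() and unplug() trigger one call carrying all three.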
Performance Comparison Graphs
Results

| Test specifications | Single VM results | Multiple VM (16) results |
|---|---|---|
| Disk: SSD; FS: none (used LVM); Image: raw; Preallocated: yes | aio=threads has better performance with LVM. 4K read and randread performance is 10-15% higher. 4K write is 26% higher. | Native and threads perform equally in most cases, but native does better in a few cases. |
| Disk: SSD; FS: EXT4; Image: raw; Preallocated: yes | Native performs well with randwrite, write, and randread-write. Threads 4K read is 10-15% higher. 4K randread is 8% higher. | Both have similar results. 4K seq reads: threads 1% higher. |
| Disk: SSD; FS: XFS; Image: raw; Preallocated: yes | aio=threads has better performance. | Native and threads perform equally in most cases, but native does better in a few cases. Threads is better in seq writes. |
| Disk: SSD; FS: EXT4; Image: raw; Preallocated: yes; NFS: yes | Native performs well with randwrite, write, and randread-write. Threads does well with 4K/16K read and randread, 12% higher. | Native and threads perform equally in most cases, but native does better in a few cases. |
| Disk: SSD; FS: XFS; Image: raw; Preallocated: yes; NFS: yes | Native performs well with all tests except read and randread, where threads performs better. | Native performs well with all tests. |
| Disk: SSD; FS: EXT4; Image: qcow2; Preallocated: no | Native does well with all tests. Threads outperforms native by <10% for read and randread. | Native is better than threads in most cases. Seq reads are 10-15% higher with native. |
| Disk: SSD; FS: XFS; Image: qcow2; Preallocated: no | Native performs well with all tests except seq read, where threads is 6% higher. | Native performs better than threads except seq write, where threads is 8% higher. |
| Disk: SSD; FS: EXT4; Image: qcow2; Preallocated: with falloc (using qemu-img) | Native is optimal for almost all tests. Threads is slightly better (<10%) for seq reads. | Native is optimal with randwrite, write, and randread-write. Threads has slightly better performance for read and randread. |
| Disk: SSD; FS: XFS; Image: qcow2; Preallocated: with falloc | Native is optimal for write and randread-write. Threads is better (<10%) for read and randread. | Native is optimal for all tests. Threads is better for seq writes. |
| Disk: SSD; FS: EXT4; Image: qcow2; Preallocated: with fallocate | Native performs better for randwrite, write, and randread-write. Threads does better for read and randread. 4K and 16K read and randread are 12% higher. | Native outperforms threads. |
| Disk: SSD; FS: XFS; Image: qcow2; Preallocated: with fallocate | Native is optimal for randwrite, write, and randread. Threads is better (<10%) for read. | Native is optimal for all tests. Threads is optimal for randread and 4K seq write. |
| Disk: HDD; FS: none (used LVM); Image: raw; Preallocated: yes | Native outperforms threads in all tests. | Native outperforms threads in all tests. |
| Disk: HDD; FS: EXT4; Image: raw; Preallocated: yes | Native outperforms threads in all tests. | Native outperforms threads in all tests. |
| Disk: HDD; FS: XFS; Image: raw; Preallocated: yes | Native outperforms threads in all tests. | Native is optimal or equal in all tests. |
| Disk: HDD; FS: EXT4; Image: qcow2; Preallocated: no | Native is optimal or equal in all test cases except randread, where threads is 30% higher. | Native is optimal except for 4K seq reads. |
| Disk: HDD; FS: XFS; Image: qcow2; Preallocated: no | Native is optimal except for seq writes, where threads is 30% higher. | Native is optimal except for 4K seq reads. |
| Disk: HDD; FS: EXT4; Image: qcow2; Preallocated: with falloc (using qemu-img) | Native is optimal or equal in all cases except randread, where threads is 30% higher. | Native is optimal or equal in all tests except 4K read. |
| Disk: HDD; FS: XFS; Image: qcow2; Preallocated: with falloc (using qemu-img) | Native is optimal or equal in all cases except seq write, where threads is 30% higher. | Native is optimal or equal in all cases except for 4K randread. |
| Disk: HDD; FS: EXT4; Image: qcow2; Preallocated: with fallocate | Native is optimal or equal in all tests. | Native is optimal or equal in all tests except 4K randread, where threads is 15% higher. |
| Disk: HDD; FS: XFS; Image: qcow2; Preallocated: with fallocate | Native is optimal in all tests except for seq write, where threads is 30% higher. | Native is better. Threads has slightly better performance (<3-4%), excluding randread where threads is 30% higher. |
Performance Graphs

Legend for the raw-image graphs (sets 1 to 4): FS: none (used LVM), EXT4, and XFS, each with aio=native and aio=threads.

Legend for the qcow2 graphs (sets 5 to 8):
1 FS: EXT4, aio=native, Img: qcow2
2 FS: EXT4, aio=threads, Img: qcow2
3 FS: XFS, aio=native, Img: qcow2
4 FS: XFS, aio=threads, Img: qcow2
5 FS: EXT4, aio=native, Img: qcow2, Prealloc: Falloc
6 FS: EXT4, aio=threads, Img: qcow2, Prealloc: Falloc
7 FS: XFS, aio=native, Img: qcow2, Prealloc: Falloc
8 FS: XFS, aio=threads, Img: qcow2, Prealloc: Falloc
9 FS: EXT4, aio=native, Img: qcow2, Prealloc: Fallocate
10 FS: EXT4, aio=threads, Img: qcow2, Prealloc: Fallocate
11 FS: XFS, aio=native, Img: qcow2, Prealloc: Fallocate
12 FS: XFS, aio=threads, Img: qcow2, Prealloc: Fallocate

Legend for the NFS graphs (set 9): 1. FS: EXT4, aio=native; 2. FS: EXT4, aio=threads; 3. FS: XFS, aio=native; 4. FS: XFS, aio=threads.

1. Disk: SSD, Image: raw, Preallocated: yes, VMs: 16
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
2. Disk: HDD, Image: raw, Preallocated: yes, VMs: 16
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
3. Disk: SSD, Image: raw, Preallocated: yes, VMs: 1
[Figures: RandRead, RandReadWrite, Seq Read, Seq Write]
4. Disk: HDD, Image: raw, Preallocated: yes, VMs: 1
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
5. Disk: SSD, Image: qcow2, VMs: 16
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
6. Disk: HDD, Image: qcow2, VMs: 16
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read]
7. Disk: SSD, Image: qcow2, VMs: 1
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
8. Disk: HDD, Image: qcow2, VMs: 1
[Figures: Seq Read, RandReadWrite, RandWrite, RandRead, Seq Write]
9. Disk: SSD, Image: raw, NFS: yes, VMs: 1
[Figures: RandRead, RandReadWrite, RandWrite, Seq Read, Seq Write]
https://review.openstack.org/#/c/232514/7/specs/mitaka/approved/libvirt-aio-mode.rst,cm
Performance Brief
● https://access.redhat.com/articles/2147661
Conclusion & Limitations
● Throughput increased a lot because the I/O thread uses less CPU to submit I/O.
● aio=native is the preferable choice, with a few limitations.
● Native AIO can block the VM if the file is not fully allocated and is therefore not recommended for use on sparse files.
● Writes to sparsely allocated files are more likely to block than fully preallocated files. Therefore it is recommended to only use aio=native on fully preallocated files, local disks, or logical volumes.
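The recommendation above suggests checking whether an image file is sparse before choosing an AIO mode. A minimal sketch, assuming a Linux host where `st_blocks` counts 512-byte units; the function names are hypothetical, and the check is a heuristic that does not inspect qcow2 metadata or the filesystem's extent state.

```python
import os


def is_sparse(path):
    """Return True if the file occupies fewer bytes on disk than its size."""
    st = os.stat(path)
    # st_blocks is reported in 512-byte units on Linux.
    return st.st_blocks * 512 < st.st_size


def recommended_aio(path):
    """Suggest aio=threads for sparse files and aio=native otherwise."""
    return "threads" if is_sparse(path) else "native"
```

A file created by truncating to a large size reports as sparse, while one whose full length has been written and synced does not.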
Future work
● Evaluate virtio data plane performance
● Reduce CPU utilization for aio=threads and consider
Questions
References
● Stefan Hajnoczi, Optimizing the QEMU Storage Stack, Linux Plumbers 2010
● Asias He, Virtio-blk Performance Improvement, KVM Forum 2012
● Khoa Huynh, Exploiting The Latest KVM Features For Optimized Virtualized Enterprise Storage Performance, LinuxCon 2012
● Pbench: http://distributed-system-analysis.github.io/pbench/ and https://github.com/distributed-system-analysis/pbench
● FIO: https://github.com/axboe/fio/
Special thanks to
Andrew Theurer
Stefan Hajnoczi
Thanks
Irc: #psuriset
Blog: psuriset.com