COLO: COarse-grain LOck-stepping Virtual Machine for Non-stop Service
Li Zhijian <[email protected]>
Fujitsu Limited.
Agenda
Background
COarse-grain LOck-stepping
Summary
Non-Stop Service with VM Replication
Typical Non-stop Service Requires
Expensive hardware for redundancy
Extensive software customization
VM Replication: Cheap Application-agnostic Solution
Existing VM Replication Approaches
Replication Per Instruction: Lock-stepping
Execute in parallel for deterministic instructions
Lock and step for non-deterministic instructions
Replication Per Epoch: Continuous Checkpoint
Secondary VM is synchronized with Primary VM per epoch
Output is buffered within an epoch
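The per-epoch output buffering of continuous checkpointing can be sketched as below. This is a minimal illustration, not code from any real hypervisor: `EpochBuffer`, `release` and `sync_secondary` are hypothetical names, and releasing output only after the secondary has been synchronized is an assumption about how such schemes typically behave.

```python
class EpochBuffer:
    """Hypothetical model of output buffering in continuous checkpointing."""

    def __init__(self, release):
        self.release = release   # callback that delivers a packet to the client
        self.buffered = []       # output produced during the current epoch

    def on_guest_output(self, packet):
        # Output is held back instead of being sent immediately.
        self.buffered.append(packet)

    def end_epoch(self, sync_secondary):
        # Assumption: the secondary is synchronized first, then output is released.
        sync_secondary()
        for packet in self.buffered:
            self.release(packet)
        self.buffered.clear()
```

Every packet waits until the end of its epoch, which is where the extra network latency of this approach comes from.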
Problems
Lock-stepping
Excessive replication overhead
Memory access in an MP-guest is non-deterministic
Continuous Checkpoint
Extra network latency
Excessive VM checkpoint overhead
Agenda
Background
COarse-grain LOck-stepping
Summary
Why COarse-grain LOck-stepping (COLO)
VM Replication is an overly strong condition
Why do we care about the VM state?
The client cares about the response only
Can control fail over without "precise VM state replication"?
Coarse-grain lock-stepping VMs
Secondary VM is a replica as long as it has generated the same responses as the primary so far
Able to fail over without stopping the service
Non-stop service focuses on server response, not internal machine state!
Architecture of COLO
Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM
[Figure: COLO architecture. Labels in the diagram: Primary Node and Secondary Node, each with a VM (Primary VM / Secondary VM), QEMU, Heartbeat, COLO Disk Manager, VM Checkpoint, Failover, Proxy module, KVM, Kernel, Storage and External Network; Disk IO and Net IO; an Internal Network between the two nodes.]
Network topology of COLO
[eth0]: client and VM communication
[eth1]: migration/checkpoint, storage replication and proxy
Pnode: primary node; PVM: primary VM; Snode: secondary node; SVM: secondary VM
Network Process
Guest-RX
Pnode
Receive a packet from the client
Copy the packet and send the copy to Snode
Send the packet to PVM
Snode
Receive the packet from Pnode
Adjust the packet's ack_seq number
Send the packet to SVM
Guest-TX
Snode
Receive the packet from SVM
Adjust the packet's seq number
Send the SVM packet to Pnode
Pnode
Receive the packet from PVM
Receive the packet from Snode
Compare the PVM and SVM packets
Same: release the packet to the client
Different: trigger a checkpoint and release the packet to the client
Based on QEMU's netfilter and SLIRP
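The Guest-TX comparison on Pnode can be sketched roughly as follows. This is an illustration only, not the QEMU implementation: `PacketComparer`, `send_to_client` and `trigger_checkpoint` are hypothetical names, packets are modelled as plain byte strings, and releasing the PVM copy on a mismatch is an assumption.

```python
from collections import deque


class PacketComparer:
    """Hypothetical Pnode-side comparer for PVM and SVM output packets."""

    def __init__(self, send_to_client, trigger_checkpoint):
        self.pvm_queue = deque()   # packets received from PVM
        self.svm_queue = deque()   # packets forwarded by Snode
        self.send_to_client = send_to_client
        self.trigger_checkpoint = trigger_checkpoint

    def on_pvm_packet(self, payload):
        self.pvm_queue.append(payload)
        self._compare()

    def on_svm_packet(self, payload):
        self.svm_queue.append(payload)
        self._compare()

    def _compare(self):
        # Compare packets pairwise, in arrival order, once both sides have one.
        while self.pvm_queue and self.svm_queue:
            pvm = self.pvm_queue.popleft()
            svm = self.svm_queue.popleft()
            if pvm != svm:
                # Responses diverged: resynchronize SVM before answering.
                self.trigger_checkpoint()
            # Assumption: the client always receives the PVM packet.
            self.send_to_client(pvm)
```

As long as both VMs produce the same output, no checkpoint is taken; a checkpoint happens only on demand, when the responses differ.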
Storage Process
Write
Pnode
Send the write request to Snode
Write the write request to storage
Snode
Receive the PVM write request
Read the original data into the SVM cache, then write the PVM write request to storage (Copy On Write)
Write SVM write requests to the SVM cache
Read
Snode
Read from the SVM cache, or from storage on an SVM cache miss
Pnode
Read from storage
Checkpoint
Drop the SVM cache
Failover
Write the SVM cache to storage
Based on QEMU's quorum, NBD, backup driver and backing file
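The Snode side of the storage process can be modelled with a small in-memory sketch. It is an assumption-laden illustration, not the QEMU block-layer code: `SecondaryDisk` and its method names are hypothetical, blocks are dictionary entries, and an unwritten block is assumed to read back as empty bytes.

```python
EMPTY = b""  # assumption: content of a block that was never written


class SecondaryDisk:
    """Hypothetical model of Snode storage plus the SVM cache."""

    def __init__(self, storage):
        self.storage = dict(storage)  # block number -> data (replica of PVM disk)
        self.cache = {}               # SVM cache: SVM writes and preserved originals

    def pvm_write(self, block, data):
        # Copy on write: keep the data SVM currently sees before overwriting it.
        if block not in self.cache:
            self.cache[block] = self.storage.get(block, EMPTY)
        self.storage[block] = data

    def svm_write(self, block, data):
        # SVM writes never touch storage directly.
        self.cache[block] = data

    def svm_read(self, block):
        # Read from the SVM cache, falling back to storage on a miss.
        if block in self.cache:
            return self.cache[block]
        return self.storage.get(block, EMPTY)

    def checkpoint(self):
        # After a checkpoint SVM adopts the PVM disk state.
        self.cache.clear()

    def failover(self):
        # SVM takes over, so its view of the disk becomes the stored one.
        self.storage.update(self.cache)
        self.cache.clear()
```

Between checkpoints SVM keeps a consistent view of its own disk even while PVM writes land in storage underneath it.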
Memory Sync Process
[Figure: memory sync. Labels in the diagram: Primary Node (Primary VM, QEMU, VM Checkpoint, KVM, Kernel, "Get Dirty Pages"); Internal Network ("Transfer Dirty Pages"); Secondary Node (Secondary VM, QEMU, VM Checkpoint, KVM, Kernel, PVM Memory Cache, "Update SVM state").]
• Pnode
– Track PVM dirty pages, send them to Snode periodically
• Snode
– Receive the PVM dirty pages, save them to PVM Memory Cache
– On checkpoint, update SVM memory with PVM Memory Cache
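The memory sync steps above can be sketched with two small classes. This is a simplified model, not QEMU's migration code: `PrimaryMemory` and `SecondaryMemory` are hypothetical names, pages are dictionary entries, and the PVM Memory Cache is assumed to hold a full image of PVM memory so that a checkpoint also discards pages SVM changed on its own.

```python
class PrimaryMemory:
    """Hypothetical Pnode side: tracks which PVM pages were written."""

    def __init__(self, pages):
        self.pages = dict(pages)  # page number -> content
        self.dirty = set()

    def write(self, page, data):
        self.pages[page] = data
        self.dirty.add(page)

    def collect_dirty(self):
        # Called periodically; returns dirty pages and resets tracking.
        out = {page: self.pages[page] for page in self.dirty}
        self.dirty.clear()
        return out


class SecondaryMemory:
    """Hypothetical Snode side: SVM memory plus the PVM Memory Cache."""

    def __init__(self, pages):
        self.svm_pages = dict(pages)
        self.pvm_cache = dict(pages)  # assumption: full copy of PVM memory

    def receive_dirty_pages(self, dirty):
        # Dirty pages go into the cache, not into running SVM memory.
        self.pvm_cache.update(dirty)

    def checkpoint(self):
        # SVM memory is replaced by the cached PVM memory image.
        self.svm_pages = dict(self.pvm_cache)
```

Staging pages in the cache lets the transfer proceed in the background while SVM keeps running on its own memory until the next checkpoint.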
Checkpoint Process
Need to modify the migration process in QEMU to support checkpoints
Why Better
Compared with continuous VM checkpoint
No buffering-introduced latency
Lower checkpoint frequency
On demand vs. periodic
Compared with lock-stepping
Eliminates the excessive overhead of non-deterministic instruction execution due to MP-guest memory access
Agenda
Background
COarse-grain LOck-stepping
Summary
Summary
Performance
Summary
COLO status
colo frame: patchset v9 has been posted (by
colo-block: most of the patches are reviewed (by [email protected])
colo-proxy: (by [email protected])
netfilter-related patches are reviewed
packet compare is under development
Summary
Next steps:
Redesign based on feedback
Develop and send out for review
Optimize performance