kernel vm#9 powerkvm-dist-20131208

The World of KVM on Power — カーネル/VM探検隊 (Kernel/VM Expedition) #9 (52 slides)

Upload: manabu-ori

Post on 22-May-2015

TRANSCRIPT

  • 1. The World of KVM on Power (カーネル/VM探検隊 #9)

2. (Trademark notice) AIX, AIX 5L, AIX 6, AIX 7, IBM, PowerPC, POWER4, POWER5, POWER6, POWER7, IBM System, IBM eServer, Power Systems, pSeries, BladeCenter, i5/OS, Systems Director VMControl, and IBM Systems Director Active Energy Manager are trademarks of International Business Machines Corporation; UNIX is a trademark of The Open Group; Intel and Pentium are trademarks of Intel Corporation; Linux is a trademark of Linus Torvalds. 2013/12/16 © 2013 IBM Corporation

3. (Speaker) Twitter: @orimanabu — IBM, Linux/OSS work. Earlier talk: "PostgreSQL on ppc64" http://www.slideshare.net/orimanabu/pgcon2012ori20120224 ; PostgreSQL 9.2 release notes: http://www.postgresql.org/docs/9.2/static/release-9-2.html

4. (Agenda) KVM on x86 / POWER / PowerVM / KVM on Power

5. (Section) KVM on x86

6. Classic x86 does not satisfy the Popek & Goldberg virtualization requirements: some sensitive CPU instructions do not trap. Workarounds: paravirtualization (Xen), (LilyVM), binary translation (VMware); later, hardware assistance with Intel-VT and AMD-V.

7. Intel-VT adds VMX root mode and VMX non-root mode, each with its own User and Privilege levels; VMEntry switches the CPU into non-root mode, VMExit returns it to root mode.

8. KVM (Kernel-based Virtual Machine) builds on Intel-VT/AMD-V. Qemu drives /dev/kvm through ioctl(2) and emulates CPU and I/O in userspace; the in-kernel side (KVM) enters the guest with vmlaunch/vmresume.
9. (Guest entry) Qemu opens /dev/kvm and issues an ioctl(2); in the Linux kernel, KVM executes vmlaunch and the guest kernel runs in VMX non-root mode.

10. (I/O emulation) A guest I/O access triggers a VMExit back to VMX root mode; the ioctl returns to Qemu, which emulates the I/O in userspace, and KVM then re-enters the guest with vmresume.

11. (Section) POWER

12. POWER Architecture (POWER: Performance Optimized With Enhanced RISC): RISC, memory-mapped I/O, defined by the Power ISA. Deployed from Blue Gene and servers (Book 3s) to embedded (Book 3e): Wii, Playstation 3, RAID controllers, and space probes (Pathfinder, Spirit, Phoenix).

13. Power Instruction Set Architecture documents:
  PowerPC Architecture Book v2.01:
    http://cseweb.ucsd.edu/classes/wi07/cse240b/Assignments/es-archpub1.pdf
    http://cseweb.ucsd.edu/classes/wi07/cse240b/Assignments/es-archpub2.pdf
    http://cseweb.ucsd.edu/classes/wi07/cse240b/Assignments/es-archpub3.pdf
  v2.02: http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html
  Power ISA v2.03: https://www.power.org/documentation/power-isa-version-2-03/
  v2.04: https://www.power.org/documentation/power-isa-version-2-04/
  v2.05: https://www.power.org/documentation/power-isa-version-2-05-2/
  v2.06 Revision B: https://www.power.org/documentation/power-isa-version-2-06-revision-b/
  v2.06 Revision B Addendum: https://www.power.org/documentation/power-isa-version-2-06-revision-b-addendum/
  v2.07: https://www.power.org/documentation/power-isa-version-2-07/

14. (figure)

15. PowerPC market share: "In cores, IBM's Power dominates in networking and comms, but Intel and ARM are on the rise." "In chips, Freescale dominates in networking and comms, but will lose some share to Intel." (Source: Linley Group) http://www.eetimes.com/document.asp?doc_id=1262633

16.–18. (figures)
19. Server vs Embedded. Server (Book 3s): MMU with a hardware-walked hash table; real mode (MMU disabled); segmentation differentiates address spaces; sPAPR; hypervisor calls. POWER satisfies the Popek & Goldberg requirements and has User, Supervisor, and Hypervisor modes. Embedded (Book 3e): software-controlled TLB (on-chip array, no hardware page walker); no real mode; a PID register differentiates address spaces; ePAPR.

20. Two-step address translation: effective address → virtual address → real address. Effective → virtual is managed by the OS (supervisor) through the SLB; segments are 256MB or 1TB (Linux uses 256MB). Virtual → real goes through the MMU hash table; supported page sizes: 4KB, 64KB, 16MB, 16GB.

21. (Section) PowerVM

22. PowerVM: virtualization for Power Systems (formerly Advanced Power Virtualization). Managed through the HMC (Hardware Management Console); the hypervisor is the Power Hypervisor (PHYP, pHyp). Guest OSes (AIX, Linux, IBM i) follow PAPR (POWER Architecture Platform Requirements) and enter the hypervisor via Hypervisor Calls (hcalls). A guest = an LPAR (Logical Partition).

23. PowerVM features: Dynamic Logical Partitioning; Micro-Partitioning (CPU allocated in 1/100 units); Live Partition Mobility; I/O through the Virtual I/O Server (VIOS); Lx86 ran 32-bit x86 Linux binaries on POWER, using technology from Transitive (the company behind Rosetta, which ran PowerPC binaries on Intel Macs); IBM acquired Transitive, and Lx86 has since been discontinued.

24. POWER7 LPAR support used by pHyp: hypervisor mode, ownership of I/O and the MMU hash table, hypervisor decrementer (HDECR) interrupts, the Processor version register, privileged instructions and registers, SMT; (Power5+, Power6) VSX.
25. (Section) KVM on Power

26. @ Red Hat Summit 2013 ...

27. OpenPower Consortium: IBM opens POWER technology (through to GPU interconnect) to partners — IBM, Google, NVIDIA, Mellanox, Tyan. "Open Innovation" around POWER.

28. KVM on Power — Overview: the same KVM model as on x86 — Qemu drives the kernel through ioctl(2). Uses the POWER7 hypervisor facilities (HV-mode KVM). Instead of PowerVM (PHYP), POWER7 runs OPAL (Open Power Abstraction Layer) firmware, whose implementation is code-named Sapphire. Qemu is shared upstream with x86.

29. Two KVM flavors: HV KVM, which uses the CPU's hypervisor mode (PPC970, POWER7, Freescale), and PR KVM, which runs without HV mode.

30. KVM on Power — Architecture: Linux with the kvm module takes the place of pHyp, on top of OPAL. QEMU emulates RTAS calls: guest hypercall → QEMU → return control to guest. Layering: Hypervisor / RTAS, OPAL / Open Firmware.

31. KVM on Power — Technology: guest OSes built for PAPR on PowerVM (pHyp) — RHEL, SLES (AIX and IBM i are out of scope) — run on a PAPR-compliant KVM. I/O: as with KVM on x86, Qemu provides both PAPR virtual I/O and Linux virtio devices, plus emulated PCI.

32. PowerVM vs PowerKVM: PowerVM is firmware (pHyp) with PCI passthrough and VIOS, managed by HMC/IVM/FSP; PowerKVM is Linux + KVM, with PCI and other I/O going through Qemu.

33. Power(PC) KVM history: 2007–2008, PowerPC 4xx support by IBM (Hollis Blanchard, Christian Ehrhardt); 2009, preliminary port to e500v2 by Freescale (Yu Liu) and port to server Book III S (Alexander Graf); 2010–2011, improved e500v2 support, port to e500mc, and HV KVM support on POWER7 and PowerPC 970.

34. KVM on POWER7: hypervisor-mode entry/exit; the MMU is set up by the host; PAPR-compatible with PowerVM, so existing Linux guests (RHEL6, SLES11 SP1, ...) run unmodified. Layering: Qemu in user mode; guest kernel in supervisor mode, issuing hypervisor calls; KVM in the host Linux kernel, in hypervisor mode.

35. KVM on POWER7: Qemu machine type "pseries"; KVM book3s_hv flavor and book3s_pr flavor; virtual I/O via PAPR (VSCSI, VETH, console) and virtio (virtio-blk, virtio-net); SLOF (Slim-Line Open Firmware); powernv (Power non-virtualized) platform: CONFIG_PPC_POWERNV, arch/powerpc/platforms/powernv.

36. KVM on POWER7 constraints: POWER7 and PPC970 hosts (IBM Power Systems, IBM PowerLinux, YDL PowerStation; G5 Macs are not supported); guest memory backed by large pages (16MB, not pageable/swappable); SMT4 vs ST CPU scheduling; I/O through PAPR or virtio.

37. Server processors (PPC970, POWER7): sPAPR as under pHyp, or PR KVM. Embedded processors: ePAPR; PR KVM also presents sPAPR-style interfaces.

38. ...
39. Book3s sPAPR — kernel: upstream HV KVM. MMU setup, SMT4/ST handling, MMIO coalescing, MMU notifiers, POWER7 (and PPC970); in-kernel XICS: ICP presentation controllers in real mode (no exits for IPIs, no exits for MSIs in most cases), ICS source controllers in kernel virtual mode; complete CPU state save/restore; MMU hash table save/restore; VFIO.

40. Book3s sPAPR — Qemu: upstream. IOMMU; sPAPR virtual I/O: vscsi, veth, vterm (with -M pseries); PCI: virtio-pci, OHCI, e1000, ..., VGA; SLOF boots from vscsi, veth, virtio-blk, virtio-net, and virtio-scsi; NVRAM; MMU hash table save/restore; VFIO; xHCI fixes.

41. Book3e & ePAPR: embedded has no BIOS/ACPI/UEFI; per VDC Research 2011, around 50% of embedded deployments use in-house OSes; VM use cases exist there too.

42. Book3e & ePAPR status: e500mc (32-bit) upstream; e5500 (64-bit); SMP; hugetlbfs; e500 Qemu; Freescale IOMMU (PAMU). To do: e6500, LRAT (Logical-to-Real Address Translation), Qemu gdb stub, VFIO, Datapath Acceleration Architecture (DPAA), in-kernel interrupt controller (MPIC), libvirt.

43. HV-mode KVM stack: Sapphire (the OPAL implementation) sits under KVM on Power in place of pHyp; the Linux-based bootloader Petitboot boots the host via kexec; the FSP (service processor) remains.

44. OpenPower work (IBM, Google, Tyan, Nvidia, Mellanox): OpenStack, libvirt, libguestfs; Power LE/BE; nested virtualization for OpenStack CI; PCI passthrough.

45. Nested virtualization for OpenStack CI: full emulation or PR-style KVM. Full emulation: Qemu emulates POWER6/7/8. PR KVM: paravirtualized MMU; PR KVM inside HV KVM, or inside full emulation. Gaps: data breakpoints (watchpoints); POWER8 transactional memory under PR KVM vs HV KVM.

46. libvirt / oVirt / OpenStack: virsh, virt-manager, virt-install; RHEV-H; nova-compute. Example libvirt domain (values surviving from the slide's XML): name tmp, uuid aa94bbbe-750d-c841-6198-c58399e4a0ad, memory/currentMemory 1048576, 1 vcpu, type hvm, on_poweroff destroy, on_reboot restart, on_crash restart, emulator /usr/bin/qemu-system-ppc64.
47. The KVM_RUN ioctl path from Qemu down to the PowerPC run loop:

virt/kvm/kvm_main.c:

static long kvm_vcpu_ioctl(struct file *filp,
                           unsigned int ioctl, unsigned long arg)
{
        struct kvm_vcpu *vcpu = filp->private_data;
        void __user *argp = (void __user *)arg;
        int r;
        struct kvm_fpu *fpu = NULL;
        struct kvm_sregs *kvm_sregs = NULL;

        if (vcpu->kvm->mm != current->mm)
                return -EIO;

#if defined(CONFIG_S390) || defined(CONFIG_PPC) || defined(CONFIG_MIPS)
        /*
         * Special cases: vcpu ioctls that are asynchronous to vcpu execution,
         * so vcpu_load() would break it.
         */
        if (ioctl == KVM_S390_INTERRUPT || ioctl == KVM_INTERRUPT)
                return kvm_arch_vcpu_ioctl(filp, ioctl, arg);
#endif

        r = vcpu_load(vcpu);
        if (r)
                return r;
        switch (ioctl) {
        case KVM_RUN:
                r = -EINVAL;
                if (arg)
                        goto out;
                r = kvm_arch_vcpu_ioctl_run(vcpu, vcpu->run);
                trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
                break;
        case KVM_GET_REGS: {
                struct kvm_regs *kvm_regs;
                (snip)

arch/powerpc/kvm/powerpc.c:

int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
{
        int r;
        sigset_t sigsaved;

        if (vcpu->sigset_active)
                sigprocmask(SIG_SETMASK, &vcpu->sigset, &sigsaved);

        if (vcpu->mmio_needed) {
                if (!vcpu->mmio_is_write)
                        kvmppc_complete_mmio_load(vcpu, run);
                vcpu->mmio_needed = 0;
        } else if (vcpu->arch.dcr_needed) {
                if (!vcpu->arch.dcr_is_write)
                        kvmppc_complete_dcr_load(vcpu, run);
                vcpu->arch.dcr_needed = 0;
        } else if (vcpu->arch.osi_needed) {
                u64 *gprs = run->osi.gprs;
                int i;

                for (i = 0; i < 32; i++)
                        kvmppc_set_gpr(vcpu, i, gprs[i]);
                vcpu->arch.osi_needed = 0;
        } else if (vcpu->arch.hcall_needed) {
                int i;

                kvmppc_set_gpr(vcpu, 3, run->papr_hcall.ret);
                for (i = 0; i < 9; ++i)
                        kvmppc_set_gpr(vcpu, 4 + i, run->papr_hcall.args[i]);
                vcpu->arch.hcall_needed = 0;
#ifdef CONFIG_BOOKE
        } else if (vcpu->arch.epr_needed) {
                kvmppc_set_epr(vcpu, run->epr.epr);
                vcpu->arch.epr_needed = 0;
#endif
        }

        r = kvmppc_vcpu_run(run, vcpu);

        if (vcpu->sigset_active)
                sigprocmask(SIG_SETMASK, &sigsaved, NULL);

        return r;
}

arch/powerpc/kvm/book3s.c:

int kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
{
        struct kvm *kvm = vcpu->kvm;

        /*
         * If HV mode hasn't been selected by now, make it PR mode
         * from now on.
         */
        if (kvm->arch.kvm_mode == KVM_MODE_UNKNOWN) {
                mutex_lock(&kvm->lock);
                if (kvm->arch.kvm_mode == KVM_MODE_UNKNOWN)
                        kvm->arch.kvm_mode = KVM_MODE_PR;
                mutex_unlock(&kvm->lock);
        }

        VCPU_DO_PR(vcpu, return kvmppc_vcpu_run_pr(kvm_run, vcpu));
        VCPU_DO_HV(vcpu, return kvmppc_vcpu_run_hv(kvm_run, vcpu));
        return -EINVAL;
}

48.–49. (Both slides show the same function.) arch/powerpc/kvm/book3s_hv.c:

int kvmppc_vcpu_run_hv(struct kvm_run *run, struct kvm_vcpu *vcpu)
{
        int r;
        int srcu_idx;

        if (!vcpu->arch.sane) {
                run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
                return -EINVAL;
        }

        kvmppc_core_prepare_to_enter(vcpu);

        /* No need to go into the guest when all we'll do is come back out */
        if (signal_pending(current)) {
                run->exit_reason = KVM_EXIT_INTR;
                return -EINTR;
        }

        atomic_inc(&vcpu->kvm->arch.vcpus_running);
        /* Order vcpus_running vs. rma_setup_done, see kvmppc_alloc_reset_hpt */
        smp_mb();

        /* On the first time here, set up HTAB and VRMA or RMA */
        if (!vcpu->kvm->arch.rma_setup_done) {
                r = kvmppc_hv_setup_htab_rma(vcpu);
                if (r)
                        goto out;
        }

        flush_fp_to_thread(current);
        flush_altivec_to_thread(current);
        flush_vsx_to_thread(current);
        vcpu->arch.wqp = &vcpu->arch.vcore->wq;
        vcpu->arch.pgdir = current->mm->pgd;
        vcpu->arch.state = KVMPPC_VCPU_BUSY_IN_HOST;

        do {
                r = kvmppc_run_vcpu(run, vcpu);

                if (run->exit_reason == KVM_EXIT_PAPR_HCALL &&
                    !(vcpu->arch.shregs.msr & MSR_PR)) {
                        r = kvmppc_pseries_do_hcall(vcpu);
                        kvmppc_core_prepare_to_enter(vcpu);
                } else if (r == RESUME_PAGE_FAULT) {
                        srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
                        r = kvmppc_book3s_hv_page_fault(run, vcpu,
                                vcpu->arch.fault_dar, vcpu->arch.fault_dsisr);
                        srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
                }
        } while (r == RESUME_GUEST);

 out:
        vcpu->arch.state = KVMPPC_VCPU_NOTREADY;
        atomic_dec(&vcpu->kvm->arch.vcpus_running);
        return r;
}

arch/powerpc/kvm/book3s_hv.c:

static int kvmppc_run_vcpu(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu)
{
        (snip)
        /*
         * This happens the first time this is called for a vcpu.
         * If the vcore is already running, we may be able to start
         * this thread straight away and have it join in.
         */
        if (!signal_pending(current)) {
                if (vc->vcore_state == VCORE_RUNNING &&
                    VCORE_EXIT_COUNT(vc) == 0) {
                        kvmppc_create_dtl_entry(vcpu, vc);
                        kvmppc_start_thread(vcpu);
                } else if (vc->vcore_state == VCORE_SLEEPING) {
                        wake_up(&vc->wq);
                }
        }
        (snip)
        while (vcpu->arch.state == KVMPPC_VCPU_RUNNABLE &&
               !signal_pending(current)) {
                (snip)
                if (!vc->n_runnable || vcpu->arch.state != KVMPPC_VCPU_RUNNABLE)
                        break;
                vc->runner = vcpu;
                vcpu->arch.is_master = 1;
                n_ceded = 0;
                list_for_each_entry(v, &vc->runnable_threads, arch.run_list) {
                        if (!v->arch.pending_exceptions)
                                n_ceded += v->arch.ceded;
                        else
                                v->arch.ceded = 0;
                }
                if (n_ceded == vc->n_runnable)
                        kvmppc_vcore_blocked(vc);
                else
                        kvmppc_run_core(vc);
                vc->runner = NULL;
                vcpu->arch.is_master = 0;
        }
        (snip)
}

50. arch/powerpc/kvm/book3s_hv.c:

static void kvmppc_run_core(struct kvmppc_vcore *vc)
{
        struct kvm_vcpu *vcpu, *vnext;
        long ret;
        u64 now;
        int i, need_vpa_update;
        int srcu_idx;
        struct kvm_vcpu *vcpus_to_update[threads_per_core];

        /* don't start if any threads have a signal pending */
        need_vpa_update = 0;
        list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
                if (signal_pending(vcpu->arch.run_task))
                        return;
                if (vcpu->arch.vpa.update_pending ||
                    vcpu->arch.slb_shadow.update_pending ||
                    vcpu->arch.dtl.update_pending)
                        vcpus_to_update[need_vpa_update++] = vcpu;
        }
        (snip)
        /*
         * Make sure we are running on thread 0, and that
         * secondary threads are offline.
         */
        if (threads_per_core > 1 && !on_primary_thread()) {
                list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
                        vcpu->arch.ret = -EBUSY;
                goto out;
        }
        (snip)
        vc->pcpu = smp_processor_id();
        list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
                kvmppc_start_thread(vcpu);
                kvmppc_create_dtl_entry(vcpu, vc);
        }

        /* Set this explicitly in case thread 0 doesn't have a vcpu */
        get_paca()->kvm_hstate.kvm_vcore = vc;
        get_paca()->kvm_hstate.ptid = 0;

        vc->vcore_state = VCORE_RUNNING;
        preempt_disable();
        spin_unlock(&vc->lock);

        kvm_guest_enter();

        srcu_idx = srcu_read_lock(&vc->kvm->srcu);

        __kvmppc_vcore_entry();

        spin_lock(&vc->lock);
        /* disable sending of IPIs on virtual external irqs */
        list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list)
                vcpu->cpu = -1;
        /* wait for secondary threads to finish writing their state to memory */
        if (vc->nap_count < vc->n_woken)
                kvmppc_wait_for_nap(vc);
        (snip)
}

arch/powerpc/kvm/book3s_hv_interrupts.S:

_GLOBAL(__kvmppc_vcore_entry)

        /* Write correct stack frame */
        mflr    r0
        std     r0,PPC_LR_STKOFF(r1)

        /* Save host state to the stack */
        stdu    r1, -SWITCH_FRAME_SIZE(r1)

        /* Save non-volatile registers (r14 - r31) and CR */
        SAVE_NVGPRS(r1)
        mfcr    r3
        std     r3, _CCR(r1)

        /* Save host DSCR */
BEGIN_FTR_SECTION
        mfspr   r3, SPRN_DSCR
        std     r3, HSTATE_DSCR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)

        (snip)

        /* Jump to partition switch code */
        bl      .kvmppc_hv_entry_trampoline
        nop

        /*
         * We return here in virtual mode after the guest exits
         * with something that we can't handle in real mode.
         * Interrupts are enabled again at this point.
         */

        /*
         * Register usage at this point:
         *
         * R1  = host R1
         * R2  = host R2
         * R12 = exit handler id
         * R13 = PACA
         */

        /* Restore non-volatile host registers (r14 - r31) and CR */
        REST_NVGPRS(r1)
        ld      r4, _CCR(r1)
        mtcr    r4

        addi    r1, r1, SWITCH_FRAME_SIZE
        ld      r0, PPC_LR_STKOFF(r1)
        mtlr    r0
        blr
51. arch/powerpc/kvm/book3s_hv_rmhandlers.S:

_GLOBAL(kvmppc_hv_entry_trampoline)
        mflr    r0
        std     r0, PPC_LR_STKOFF(r1)
        stdu    r1, -112(r1)
        mfmsr   r10
        LOAD_REG_ADDR(r5, kvmppc_call_hv_entry)
        li      r0,MSR_RI
        andc    r0,r10,r0
        li      r6,MSR_IR | MSR_DR
        andc    r6,r10,r6
        mtmsrd  r0,1            /* clear RI in MSR */
        mtsrr0  r5
        mtsrr1  r6
        RFI

kvmppc_call_hv_entry:
        /* Indicate that we are now in real mode */
        ld      r5, HSTATE_KVM_VCORE(r13)
        li      r0, 1
        stw     r0, VCORE_RM_THREADS(r5)
        /* any guest vcpu to run? */
        ld      r4, HSTATE_KVM_VCPU(r13)
        cmpdi   r4, 0
        beq     kvmppc_call_no_guest
        bl      kvmppc_hv_entry
        (snip)
        /* Back from guest - restore host state and return to caller */
BEGIN_FTR_SECTION
        /* Restore host DABR and DABRX */
        ld      r5,HSTATE_DABR(r13)
        li      r6,7
        mtspr   SPRN_DABR,r5
        mtspr   SPRN_DABRX,r6
END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
        /* Restore SPRG3 */
        ld      r3,PACA_SPRG3(r13)
        mtspr   SPRN_SPRG3,r3
        (snip)

.global kvmppc_hv_entry
kvmppc_hv_entry:
        /* Required state:
         *
         * R4  = vcpu pointer
         * MSR = ~IR|DR
         * R13 = PACA
         * R1  = host R1
         * R2  = TOC
         * all other volatile GPRS = free
         */
BEGIN_FTR_SECTION
        /* Set partition DABR */
        /* Do this before re-enabling PMU to avoid P7 DABR corruption bug */
        lwz     r5,VCPU_DABRX(r4)
        ld      r6,VCPU_DABR(r4)
        mtspr   SPRN_DABRX,r5
        mtspr   SPRN_DABR,r6
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_206)
        (snip)
        /* Increment entry count iff exit count is zero. */
        ld      r5,HSTATE_KVM_VCORE(r13)
        addi    r9,r5,VCORE_ENTRY_EXIT
21:     lwarx   r3,0,r9
        cmpwi   r3,0x100        /* any threads starting to exit? */
        bge     secondary_too_late /* if so we're too late to the party */
        addi    r3,r3,1
        stwcx.  r3,0,r9
        bne     21b

        /*
         * POWER7 host -> guest partition switch code.
         * We don't have to lock against concurrent tlbies,
         * but we do have to coordinate across hardware threads.
         */
        /* Primary thread switches to guest partition. */
        ld      r9,VCPU_KVM(r4)         /* pointer to struct kvm */
        lbz     r6,VCPU_IS_MASTER(r4)
        cmpwi   r6,0
        beq     20f
        /* wait for thread 0 to get into real mode */
        HMT_LOW
50:     lwz     r6,VCORE_RM_THREADS(r5)
        cmpwi   r6,0
        beq     50b
        HMT_MEDIUM
        ld      r6,KVM_SDR1(r9)
        lwz     r7,KVM_LPID(r9)
        li      r0,LPID_RSVD            /* switch to reserved LPID */
        mtspr   SPRN_LPID,r0
        ptesync
        mtspr   SPRN_SDR1,r6            /* switch to partition page table */
        mtspr   SPRN_LPID,r7
        isync
        (snip)

52. (Summary) POWER/PowerPC has long had hardware virtualization support (predating Intel-VT/AMD-V); KVM on Power builds on those facilities. 2013/12/16 © 2013 IBM Corporation