nflug xen presentation
Xen Virtualization
Niagara Frontier LUG, May 2008
Erek Dyskant
Virtualization
Separation of administrative zones
Separation of software failure
Consolidation of hardware resources
Full utilization of hardware
Easier hardware provisioning -- Want a server? You’ve got a server.
Excellent test environments
What virtualization isn’t
Not an HA solution by itself
Naïve implementation: not suitable for some secure applications
  Timing of private keys
  Unknown -- lots of new code
  Host OS adds a new point of entry
May actually increase complexity
  Adds host OSes to manage
  Adds to total number of points of management
  Encourages “guerilla” server projects
[Diagram labels: Mail, Web, Directory, Database]
Container Virtualization
Works at the kernel level, masking processes running on other partitions.
All guests share the same filesystem tree.
Same kernel on all machines
Unprivileged VMs can’t mount drives or change network settings
Native speeds, no emulation overhead
Any OS crash affects all machines
OpenVZ, Virtuozzo, Solaris Containers, FreeBSD Jails, Linux-Vserver
Full Virtualization
Hardware Virtual Machines
  VMware, Xen HVM, KVM, Microsoft VM, Parallels
Runs unmodified guests
Generally worst performance, but often acceptable
Simulates BIOS, communicates with VMs through ACPI emulation, BIOS emulation, sometimes custom drivers
Can sometimes virtualize across architectures, although this is out of fashion.
VMware Server
Very well-developed GUI
Decent performance
Excellent documentation
Backed by a single vendor
Free version
  Very functional. Easy setup.
  No server-server communication/failover or supported shared storage
Non-free version
  Shared storage, centralized management, automated provisioning.
VMware Server 2
Para-virtualization
Hypervisor runs on the bare metal. Handles CPU scheduling and memory compartmentalization.
Dom0, a modified Linux kernel, handles networking and block storage for all guests. Dom0 is also privileged to manage the VMs on the system.
DomU, or the guest OS, sends some requests straight to the hypervisor, and others to the Dom0.
Because the kernel knows it's virtualized, features can be built into it: hot connection/disconnection of resources, friendly shutdown, serial console.
Other paravirtualization schemes: Sun Logical Domains, VMware (sometimes)
Elements of a Xen VM
Virtual Block Device
  Image file
  Real block device (either LVM or physical)
Network Bridges
  Routed, terminates at the Dom0
  Bridged, terminates at the network interface
Virtual Framebuffer
  VNC Server
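In an xm config file, the block device and network elements appear as `disk` and `vif` strings, and the file itself uses Python syntax. The following is a minimal sketch, not taken from the slides, that loads such a config and splits those strings into their parts. The helper names `load_config` and `parse_disk` are hypothetical, the values are copied from the example config on the next slide, and the parsing assumes the simple `backend:path,device,mode` form used there.

```python
def load_config(text):
    """Evaluate xm-style config text (Python syntax) and return its settings."""
    settings = {}
    # Builtins are emptied because a config only needs literals.
    exec(text, {"__builtins__": {}}, settings)
    return settings


def parse_disk(spec):
    """Split a disk spec such as 'tap:aio:/path/img,xvda,w' into its parts."""
    source, device, mode = spec.split(",")
    # Everything before the last colon is the backend (e.g. 'tap:aio', 'phy').
    backend, _, path = source.rpartition(":")
    return {"backend": backend, "path": path, "device": device, "mode": mode}


CONFIG = '''
name = "DomU-1"
memory = 512
vcpus = 2
disk = [ "tap:aio:/var/lib/xen/images/Centos5Image.img,xvda,w" ]
vif = [ "mac=00:16:3e:79:fd:8d,bridge=xenbr0" ]
'''

cfg = load_config(CONFIG)
disk = parse_disk(cfg["disk"][0])
# A vif entry is a comma-separated list of key=value pairs.
vif = dict(item.split("=", 1) for item in cfg["vif"][0].split(","))
print(cfg["name"], disk["backend"], disk["path"], vif["bridge"])
```

Here the image-file case shows up as the `tap:aio` backend and the bridged network as `bridge=xenbr0`; a real block device would use a different backend prefix.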
Example VM Config
name = "DomU-1"
maxmem = 512                      # memory ceiling, in MB
memory = 512                      # memory at boot, in MB
vcpus = 2
bootloader = "/usr/bin/pygrub"    # boot the kernel found inside the guest image
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [ "type=vnc,vncunused=1,keymap=en-us" ]
disk = [ "tap:aio:/var/lib/xen/images/Centos5Image.img,xvda,w" ]    # image file, writable
vif = [ "mac=00:16:3e:79:fd:8d,bridge=xenbr0" ]
xm -- Xen Manager
Command-line tool on Dom0 for managing VMs.
Quick overview of options:
  console -- attach to a DomU’s console
  create -- boot a DomU from a config file
  destroy -- immediately stop a DomU
  list -- list running DomUs
  migrate -- migrate a DomU to another Dom0
  pause/unpause -- akin to suspend. TCP connections will time out
  shutdown -- tell a DomU to shut down
  network-attach/network-detach
  block-attach/block-detach
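Scripts on the Dom0 often need the output of `xm list` in structured form. The following is a rough sketch of a parser, not something from the talk. The `SAMPLE` text is hand-written for illustration and assumes the usual six-column layout (name, ID, memory, VCPUs, state, time); `parse_xm_list` is a hypothetical helper name, and it only handles domains that have a numeric ID.

```python
# Hand-written sample in the assumed column layout, not captured output.
SAMPLE = """\
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     1024     2 r-----   1234.5
DomU-1                                     1      512     2 -b----     56.7
"""


def parse_xm_list(output):
    """Return one dict per domain from `xm list`-style text."""
    domains = []
    for line in output.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        # Split from the right so only the last five columns are fixed.
        name, dom_id, mem, vcpus, state, seconds = line.rsplit(None, 5)
        domains.append({
            "name": name.strip(),
            "id": int(dom_id),
            "mem_mib": int(mem),
            "vcpus": int(vcpus),
            "state": state,
            "time_s": float(seconds),
        })
    return domains


# Domain ID 0 is the Dom0 itself; everything else is a DomU.
domus = [d for d in parse_xm_list(SAMPLE) if d["id"] != 0]
print([d["name"] for d in domus])
```

On a real Dom0 the text would come from something like `subprocess.run(["xm", "list"], capture_output=True, text=True).stdout`.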
Red Hat/CentOS virt-manager
Simple graphical interface.
Basically does what xm does, plus:
  Built-in short-term performance graphing
  Built-in VNC client
Quick tour...
Main View
Create VM
Name Machine
Choose Method
Choose Media Location
Networking Config
Memory, CPU allocation
Confirmation Screen
VNC Window
Graph View
Benchmarks
[Charts: Small Images and Moodle. Axis label: Kilobytes / Second. Series: Bare Machine, Xen Image, Xen Device, VMware]
More Benchmarks
MySQL Benchmark Suite
[Chart: Seconds to Completion for the count, create-delet, insert, select, and update tests. Series: VMware, Xen Image, Xen Partition, Hardware]
Data labels, in extraction order: 248, 202, 203, 214, 309, 198, 266, 275, 480, 544, 558, 384, 864, 676, 676, 767, 271, 200, 219, 217
Xen Live Migration
Migrate machines off during upgrades or balance load
Set xend's config (xend-config.sxp) to allow migration from other Xen Dom0s.
Machine must reside on shared storage.
Must be on the same layer 2 network
xm migrate -l Machine dest.ip.addr.ess
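Emptying a host before an upgrade means repeating that `xm migrate -l` command for every DomU on it. The following is a small sketch of how the commands could be generated; `migrate_commands` and the DomU names are hypothetical, and nothing is executed here.

```python
def migrate_commands(domu_names, dest_host, live=True):
    """Build one `xm migrate` argument list per DomU (nothing is run)."""
    flag = ["-l"] if live else []
    return [["xm", "migrate", *flag, name, dest_host] for name in domu_names]


# Placeholder destination, as on the slide.
for argv in migrate_commands(["DomU-1", "DomU-2"], "dest.ip.addr.ess"):
    print(" ".join(argv))
```

Each list can be handed to `subprocess.run(argv, check=True)` on the source Dom0, one DomU at a time, so a failed migration stops the loop instead of being silently skipped.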
Shared Storage Options
NFS
  Simple hardware failover
  Well-understood configuration
  Spotty reliability history
Block-level storage (iSCSI or FC)
  More complex configuration
  Multipathing
  Commercial solutions are expensive
  We’re seeing traction for open iSCSI lately.
What to Look for In Storage
Redundant host connections
Snapshotting
Replication
Sensible volume management
Thin provisioning
IP-based failover, esp. if x86 based
Storage Systems
OpenFiler
  Nice frontend.
  Replication with DRBD
  iSCSI with Linux iscsi-target
OpenSolaris/ZFS
  Thin provisioning
  Too many ZFS features to list
  StorageTek AVS -- replication in many forms
  Complex configuration
NexentaStor
  ZFS/AVS in Debian.
  Rapidly evolving
SAN/IQ
  Failover, storage virtualization, n(y) redundancy
  Expensive and wickedly strict licensing
Too many proprietary hardware systems to list
Network Segmentation
802.1q VLAN tagging
  All VLANs operate on the same physical network, but packets carry an extra tag that indicates which network they belong in.
Create an interface and a bridge for each VLAN.
Connect Xen DomUs to their appropriate VLAN
Configure host’s switch ports as VLAN trunk ports.
Configure a router somewhere; a layer 3 switch is useful here.
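The "interface and bridge per VLAN" step can be sketched as a generator of Dom0 commands. This is an illustration under assumptions, not the setup used in the talk: it assumes the `ip` and `brctl` tools are available, and the `xenbr<id>` bridge naming and the helper names are hypothetical. The commands are only built, not run.

```python
def vlan_bridge_commands(nic, vlan_id):
    """Commands to create a tagged interface and a bridge for one VLAN."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("802.1q VLAN IDs range from 1 to 4094")
    tagged = f"{nic}.{vlan_id}"   # e.g. eth0.10
    bridge = f"xenbr{vlan_id}"    # hypothetical naming scheme
    return [
        ["ip", "link", "add", "link", nic, "name", tagged,
         "type", "vlan", "id", str(vlan_id)],
        ["brctl", "addbr", bridge],
        ["brctl", "addif", bridge, tagged],
        ["ip", "link", "set", tagged, "up"],
        ["ip", "link", "set", bridge, "up"],
    ]


def vif_line(mac, vlan_id):
    """DomU config line that attaches the guest to that VLAN's bridge."""
    return f'vif = [ "mac={mac},bridge=xenbr{vlan_id}" ]'


for argv in vlan_bridge_commands("eth0", 10):
    print(" ".join(argv))
print(vif_line("00:16:3e:79:fd:8d", 10))
```

The DomU never sees the tag: the Dom0 interface strips it, so the guest's `vif` only names the bridge for its VLAN.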
Commercial Xens
Citrix XenServer
Oracle VM
VirtualIron
Typical features:
  Resource QoS
  Performance trending
  Physical machine failure detection
  Pretty GUI!
  API for server provisioning
Recovery strategies
Mount virtual block device on Dom0
  losetup /dev/loop0 XenVBlockImage.img
  losetup -a
  kpartx -a /dev/loop0
  pvscan (if using LVM inside VM)
  vgchange -a y VolGroup00
  mount /dev/mapper/VolGroup00-LogVol00 /mnt/xen
  chroot /mnt/xen (or whatever recovery steps you take next)
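The mount sequence above has to be undone in reverse order before the DomU is started again. The following is a sketch that builds both the setup commands from the slide and a matching teardown. The teardown commands are an addition that the slide does not show, `recovery_plan` is a hypothetical helper, and nothing is executed here.

```python
def recovery_plan(image, loop="/dev/loop0", vg="VolGroup00",
                  lv="LogVol00", mountpoint="/mnt/xen"):
    """Return (setup, teardown) command lists for mounting a DomU image."""
    setup = [
        ["losetup", loop, image],        # attach the image to a loop device
        ["kpartx", "-a", loop],          # map the partitions inside it
        ["pvscan"],                      # find LVM physical volumes
        ["vgchange", "-a", "y", vg],     # activate the guest's volume group
        ["mount", f"/dev/mapper/{vg}-{lv}", mountpoint],
    ]
    # Teardown mirrors setup in reverse (assumed, not from the slide).
    teardown = [
        ["umount", mountpoint],
        ["vgchange", "-a", "n", vg],
        ["kpartx", "-d", loop],
        ["losetup", "-d", loop],
    ]
    return setup, teardown


setup, teardown = recovery_plan("XenVBlockImage.img")
for argv in setup + teardown:
    print(" ".join(argv))
```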
Xen Recovery -- cont
Boot from recovery CD as HVM
  disk = [ 'tap:aio:/home/xen/domains/damsel.img,ioemu:hda,w', 'file:/home/jack/knoppix.iso,ioemu:hdc:cdrom,r' ]
  builder="hvm"
  extid=0
  device_model="/usr/lib/xen/bin/qemu-dm"
  kernel="/usr/lib/xen/boot/hvmloader"
  boot="d"    # boot from the CD-ROM
  vnc=1
  vncunused=1
  apic=0
  acpi=1
Create custom Xen kernel OS image for rescues
Pitfalls
Failure to segregate network
  802.1q and iptables firewalls everywhere
Creating single points of failure
  Make sure that VMs are clustered
  If they can’t be clustered, auto-started on another machine
  Assess reliability of shared storage
Storage bottlenecks
Not planning for extra points of management
  cfengine, puppet, centralized authentication
Less predictable performance modeling
Maintaining HA
Hardware will fail
Individual VMs will crash
Cluster multiple VMs for each application
Load balancers can be VMs too.
HA -- Continued
Failure detection: make VMs restart on different machines if a machine fails
Make VMs migrate off a host when you shut it down
Build your testing system into the VM scheme.
  At least one testing system per type of host. Diligently do all changes on that before rolling out.
  Have at least one development VM per VM cluster.
Make sure that networking equipment and storage are redundant too
If running web servers, keep a physical web server on hand to serve a “We’re sorry, come back later” page. For mail servers, an independent backup MX.
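The failure-detection point can be illustrated with a toy planner. This is only a sketch under simple assumptions: hosts report heartbeats as timestamps, a host is considered failed once its heartbeat is older than a timeout, and each host has a fixed number of VM slots. All names, hosts, and numbers are hypothetical, and the commercial products mentioned earlier may work quite differently.

```python
def failed_hosts(last_heartbeat, now, timeout=30.0):
    """Hosts whose most recent heartbeat is older than `timeout` seconds."""
    return sorted(h for h, seen in last_heartbeat.items() if now - seen > timeout)


def restart_plan(placement, failed, capacity):
    """Map each VM on a failed host to the surviving host with the most free slots."""
    free = {
        host: slots - sum(1 for h in placement.values() if h == host)
        for host, slots in capacity.items()
        if host not in failed
    }
    plan = {}
    for vm in sorted(placement):
        if placement[vm] not in failed:
            continue
        if not free or max(free.values()) <= 0:
            raise RuntimeError(f"no surviving host has room for {vm}")
        # Ties go to the alphabetically first host.
        target = max(sorted(free), key=free.get)
        plan[vm] = target
        free[target] -= 1
    return plan


heartbeats = {"host-a": 100.0, "host-b": 158.0, "host-c": 159.0}
placement = {"web1": "host-a", "web2": "host-b", "mail1": "host-a", "db1": "host-c"}
capacity = {"host-a": 4, "host-b": 4, "host-c": 4}

down = failed_hosts(heartbeats, now=160.0)
print(down, restart_plan(placement, down, capacity))
```

Spreading restarts across the least-loaded survivors keeps one failure from overloading a single remaining host, which matters when the surviving hosts are already running clustered peers of the same application.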