Red Hat Enterprise Linux 5 / Xen Troubleshooting
Security Agility Reduced Cost
Jan Mark [email protected] Engineer Emerging Technology Group
May 2007
RH Summit 2007 / RHEL5/Virt Troubleshooting
Product features subject to change prior to availability
Outline
• Some basic rules
• Key directories and files
• Some common examples
• Q&A
Basic Building Blocks
Before you start
• Read the documentation and QuickStart guides
  − RHEL5 Virt Guide
    • http://www.redhat.com/docs/manuals/enterprise/RHEL-5-manual/Virtualization-en-US/index.html
    • http://www.redhat.com/docs/
  − Fedora Core 6 Xen QuickStart
    • http://fedoraproject.org/wiki/FedoraXenQuickstartFC6
  − Kbase articles
    • http://kbase.redhat.com
    • http://kbase.redhat.com/faq/topten_108_0.shtm for virtualization specific topics
  − Virtualization “brain dump”
    • http://et.redhat.com/~jmh/docs/
    • Look for Installing_RHEL5_Virt.pdf/odt
  − XenWiki
    • http://wiki.xensource.com/xenwiki/
Before you start
• Remember you can use most of the standard Linux/RHEL tools for troubleshooting
  − Very few Xen specific tools (mostly xm and virsh commands)
• Develop a standard build and naming conventions
  − Hostnames and Volumegroups
Guest/Hypervisor Matrix with PAE
Guest types: 32bit PAE paravirt guest, 32bit HVM guest, 64bit paravirt guest, 64bit HVM guest
Hypervisor / dom0 types: 32bit (PAE), 64bit
• For paravirtual, the guest has to be equal to dom0.
• For HVM, the guest has to be equal to or less than dom0.
• The RHEL5/Xen hypervisor itself must be EQUAL to dom0.
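The two guest rules above can be sketched as a small shell check. This is only an illustration of the stated rule: the function name guest_supported and its argument order are hypothetical, and "bits" stands for 32 (PAE) or 64.

```shell
#!/bin/sh
# guest_supported DOM0_BITS GUEST_TYPE GUEST_BITS
# Hypothetical helper that encodes only the rule stated on the slide:
#   pv  guests: guest bitness must be equal to dom0 bitness
#   hvm guests: guest bitness must be equal to or less than dom0 bitness
guest_supported() {
    dom0_bits=$1
    guest_type=$2
    guest_bits=$3
    case "$guest_type" in
        pv)  [ "$guest_bits" -eq "$dom0_bits" ] ;;
        hvm) [ "$guest_bits" -le "$dom0_bits" ] ;;
        *)   return 2 ;;   # unknown guest type
    esac
}

if guest_supported 64 hvm 32; then
    echo "32bit HVM guest on 64bit dom0: supported"
fi
if ! guest_supported 64 pv 32; then
    echo "32bit PAE paravirt guest on 64bit dom0: not supported"
fi
```

The check is deliberately limited to the two rules; it does not know about anything else that may prevent a guest from starting.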
Installing RHEL5/virt
• Considerations
  − Secure RHEL5 platform layer before installing any virtual machines or applications
  − Enable SELinux to run in 'enforcing' mode
  − Remove or disable any unwanted services
    • AutoFS, NFS, FTP, WWW, NIS, telnetd, sendmail etc...
  − Only add minimum number of user accounts needed for platform management
  − Avoid running applications on dom0/Hypervisor
    • Running applications in dom0 may impact virtual machine performance
  − Use central location for virtual machine installations
    • Will make it easier to move to shared storage later on
  − If a laptop with a wireless adapter is used as virt platform, special steps are needed to make Xen networking functional
    • Set up dummy network interface and NAT traffic through WiFi
Basic Information and Hints
Configuration – Key files and directories
/etc/xen
● xend-config.sxp
  ● xend configuration file
● Domain / guest config files
/etc/xen/scripts
● (customizable) control scripts
  ● Device management, e.g. networking
/etc/xen/auto
● Softlinks to guests that should start automatically
● xendomains service starts these guests
  ● Typically disabled by default
Configuration – Key files and directories
/var/lib/xen
● dump
  ● Kernel dumps generated with “xm dump-core <Domain>”
● images
  ● Guest images (covered by SELinux policy)
● xend-db
  ● Xen internal database
/var/lib/xend
● Sockets used by xend (xend-socket / relocation-socket)
/var/lib/xenstored
● xend database (tdb)
● Can be “read” with xenstore-ls
Logging – Key files and directories
/var/log/xen
● xend.log
  ● Logfile used by xend for logging
  ● Primary file to review in case of problems
● xend-debug.log
  ● Debug output from xend
● xen-hotplug.log
  ● Logfile for hotplug events
  ● Will record information in case of hotplug failure of devices
● qemu-dm.{PID}.log
  ● Logfile used by the qemu-dm process
All logfiles are human readable and can be viewed with the standard Linux utilities (or “xm log” for xend.log)
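Since xend.log is the primary file to review, a first pass is often to pull out only the ERROR entries. A minimal sketch, assuming the entry layout seen in the xend.log excerpts in this deck ("[date time source pid] LEVEL (module:line) message"); the function name xend_errors is hypothetical.

```shell
#!/bin/sh
# xend_errors [LOGFILE]
# Hypothetical helper: prints only the ERROR entries of a xend log.
# The path defaults to the xend.log location named on the slide.
xend_errors() {
    log=${1:-/var/log/xen/xend.log}
    # Entries look like: [timestamp source pid] ERROR (Module:line) message
    grep ' ERROR (' "$log"
}
```

Running "xend_errors | tail -n 20" on dom0 would show the most recent errors; the surrounding DEBUG lines in the full log usually carry the context.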
Logging/Capabilities – Key files and directories
/proc/cpuinfo
● Shows whether the CPU is pae, svm, vmx enabled
/proc/xen
● Various information about xen
/sys/hypervisor
● More information about domain, including UUID
Xen Python scripts are logging to debug facility (not enabled by default)
● Add *.* /var/log/debug.log to syslog config file
Configuration – Key files and directories
/var/lib/xen/images
● Default location for virtual machine images
  ● Recommended, not mandatory
  ● Included in recent SELinux policies
/var/log/xen
● xend.log – main log file
● xend-debug.log – debugging information
● qemu* - logs for QEMU
Core RPMS: xen, xen-libs and kernel-xen
Basic Xen commands
• Once you have your first guest installed you can use the following commands for some basic management
• To start up a guest
  − # /usr/sbin/xm create -c GuestName
  − Where GuestName is the name you gave for your guest during the installation
  − The -c will attach a xen console to your vm
• A variety of other commands are available via xm including
  − # /usr/sbin/xm help or # virsh help
  − For a list of commands that can be run
  − Use '--long' in addition for extended help text
  − You can also use # /usr/sbin/xm help --help 'Command' for a specific command
Basic Xen commands contd.
• # /usr/sbin/xm list (--long) or # virsh list
  − List running domains/guests and their status/accumulated CPU time
• # /usr/sbin/xm top
  − For a display showing what your virtual machines are doing, similar to that provided by top
• # /usr/sbin/xm shutdown GuestName or # virsh shutdown GuestName
  − To nicely shut down a guest OS, where GuestName is the name of your guest
• # /usr/sbin/xm destroy GuestName or # virsh destroy GuestName
  − To power down a guest (hard reset)
Basic Xen commands (Suspend/Resume)
• # /usr/sbin/xm save GuestName GuestName.sav or # virsh save GuestName GuestName.sav
  − To save the state of the guest 'GuestName' to the file GuestName.sav
• # /usr/sbin/xm restore GuestName.sav or # virsh restore GuestName.sav
  − To restore the above saved guest
• # /usr/sbin/xm pause GuestName or # virsh pause GuestName
  − To suspend a running guest (release CPU cycles but retain memory footprint)
• # /usr/sbin/xm unpause GuestName or # virsh unpause GuestName
  − To resume a previously suspended guest
Basic Xen commands (Resource Management)
• # /usr/sbin/xm vcpu-set <dom> <value> or # virsh setvcpus <dom> <vcpus>
  − Set the number of CPUs available to <dom> to <value> (only works for dom0/paravirtualized guests)
• # /usr/sbin/xm vcpu-list or # virsh vcpuinfo
  − List the physical-virtual CPU bindings
• # /usr/sbin/xm mem-set <dom> <value> or # virsh setmem <dom> <value>
  − Balloon <dom> up or down to <value> (only works for dom0/paravirtualized guests)
• # /usr/sbin/xm sched-credit -d <DomainID>
  − Display credit scheduler information and set cap/weight for an individual domain
Other Useful Commands and Tools
• Basic Xen commands
  − xm log
  − xm dmesg
  − xm info / virsh nodeinfo
  − xm top
• virsh commands (extract)
  − virsh dominfo <dom>
  − virsh domstate <dom>
  − virsh dumpxml <dom>
• Tools
  − strace, lsof, iostat/vmstat
  − Systemtap
  − /var/log/messages
  − /var/log/xen
  − AVC messages (setroubleshoot)
  − sosreport has a plugin that automatically gathers all of the above information
    • Plugin not included in RHEL-5 GA, should be included in day0 errata
Installation Issues
• Different installation sources are required for PV vs FV installations
  − PV requires a network based install tree
  − FV requires a local boot.iso or DVD
    • An upcoming version of virt-manager/virt-install will allow specifying a network tree for boot.iso
• VM image not in /var/lib/xen/images with SELinux enabled in enforcing mode
  − Will cause the backend device not being “connected” and generate a hotplug event
  − Installation seems to hang
• If iSCSI is used as the root device for a VM, a local boot slab needs to be created
  − Also needs a custom network script if iSCSI is not reachable via the default network
Installation Issues
• If multiple Xen bridges are configured, use virt-install to specify a specific bridge
  − virt-manager will use the bridge attached to the default network/route
Installation Issues / VT/AMD-V Extensions
• Intel/VT extensions not enabled in BIOS
  − Unable to install HVM guest (box grayed out)
• You can use the following commands to verify whether the virtualization extensions have been enabled
  ● On an Intel/VT based system, look for “VMX”
    [root@woodie ~]# xm dmesg | grep VMX
    (XEN) VMXON is done
    (XEN) VMXON is done
  ● and to verify the CPU flags have been set
    [root@woodie ~]# cat /proc/cpuinfo |grep vmx
    flags: fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
• You should have a VMXON for each reported processor. If you have any other messages, visit your BIOS settings
• There is no reason to go any further until you have VMXON reported - it just isn't going to work
Installation Issues / VT/AMD-V Extensions
• AMD/AMD-V extensions not enabled in BIOS
  − Unable to install HVM guest (box grayed out)
• On an AMD-V based system, look for “SVM”
    [root@perf3 ~]# xm dmesg|grep SVM
    (XEN) AMD SVM Extension is enabled for cpu 0
    (XEN) AMD SVM Extension is enabled for cpu 1
  ● and to verify the CPU flags have been set
    [root@perf3 ~]# cat /proc/cpuinfo |grep svm
    flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm cr8legacy ts fid vid ttp tm stc
  ● You should have an “AMD SVM Extension is enabled” line for each reported processor. If you have any other messages, visit your BIOS settings
  ● There is no reason to go any further until you have the SVM extension reported - it just isn't going to work
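The /proc/cpuinfo half of the Intel and AMD checks can be combined into one helper. A minimal sketch; the function name hvm_flag is hypothetical, and it only reports what the flags line says, so the xm dmesg check described on these slides is still needed to confirm the hypervisor actually enabled the extension.

```shell
#!/bin/sh
# hvm_flag [CPUINFO_FILE]
# Hypothetical helper: prints "vmx" (Intel VT), "svm" (AMD-V) or "none",
# based on the flags line(s) of the given cpuinfo file.
hvm_flag() {
    cpuinfo=${1:-/proc/cpuinfo}
    if grep -E '^flags[[:space:]]*:' "$cpuinfo" | grep -qw vmx; then
        echo vmx
    elif grep -E '^flags[[:space:]]*:' "$cpuinfo" | grep -qw svm; then
        echo svm
    else
        echo none
    fi
}
```

On dom0, "hvm_flag" with no argument reads /proc/cpuinfo; a result of "none" means the installer will not offer HVM guests.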
Architecture – xend
• Xen management daemon
  − Runs in dom0
  − Provides management interface for VMM
• Accessed through
  − HTTP Interface
  − UNIX domain socket
• Configuration file : /etc/xen/xend-config.sxp
xend configuration
Read once when xend service starts
● /etc/xen/xend-config.sxp
● Python style syntax
Default installation
● Local management only
● No migrations
● Bridged networking
● No remote graphics
● Minimum memory for dom0 - 256 MB
xend configuration
Allow remote management through HTTP
● Note : NO security - USE IPTABLES
● Provides HTTP based API (xml/rpc), not a GUI
    (xend-http-server yes)
    (xend-port 8000)
Allow remote graphical consoles (vnc) with no password
    (vnc-listen '0.0.0.0')
    (vncpasswd '')
Guest Domain configuration
Typically stored in /etc/xen
● Default location
● xm utility looks first in $PWD, then /etc/xen
● Recommended location
Domain configurations created using tools
● virt-manager
● virt-install
● or manually
Networking
The plug-in network scripts are responsible for connecting real, dom0 and domU interfaces
● For example 'bridge' eth0, vif0.0 & vif1.0
Networking is configured in xend-config.sxp
Changes require a restart of xend
There are two network scripts for each 'model'
● network-script : called when xend starts
● vif-script : called when a domU is created or destroyed
Networking
network-script
● Responsible for network initialization for dom0
● Bridge example (simplified) :
  ● Creates bridge interface
  ● Connects eth0 to bridge interface
  ● Disconnects (and removes) when xend is stopped
vif-script
● Responsible for connecting domU vif to the bridge
● Disconnects vif from bridge when domU is stopped
Networking
Scripts are stored in /etc/xen/scripts
Configured in /etc/xen/xend-config.sxp
    (network-script network-bridge)
    (vif-script vif-bridge)
This configuration declares that
● “network-bridge” should handle the network setup
● “vif-bridge” should handle each domU add/remove
Networking / Multi-bridge config
Multi Network Bridge Solution :
● Modify /etc/xen/xend-config.sxp
  ● Replace call to network-bridge script with call to custom script (i.e. multi-network-bridge)
● Create custom script in /etc/xen/scripts
  ● Inside the custom script call network-bridge script with additional parameters/interfaces
Networking / Example for multi-bridge config
#!/bin/sh
# network-xen-multi-bridge
set -e
# First arg is the operation.
OP=$1
shift
script=/etc/xen/scripts/network-bridge.xen
case ${OP} in
    start)
        $script start vifnum=0 bridge=xenbr0 netdev=eth1
        $script start vifnum=1 bridge=xenbr1 netdev=eth2
        $script start vifnum=2 bridge=xenbr2 netdev=eth3
        ;;
    stop)
        $script stop vifnum=0 bridge=xenbr0 netdev=eth1
        $script stop vifnum=1 bridge=xenbr1 netdev=eth2
        $script stop vifnum=2 bridge=xenbr2 netdev=eth3
        ;;
    status)
        $script status vifnum=0 bridge=xenbr0 netdev=eth1
        $script status vifnum=1 bridge=xenbr1 netdev=eth2
        $script status vifnum=2 bridge=xenbr2 netdev=eth3
        ;;
    *)
        echo 'Unknown command: ' ${OP}
        echo 'Valid commands are: start, stop, status'
        exit 1
esac
Networking
Laptop WiFi Solution :
● Use dummy network driver
● Requirement :
  ● Must use static IPs in domU
  ● DHCP server isn't available offline
  ● Cannot run dhcp server on dummy interface
Migration
• Requires shared storage
  − Both domain 0's must access same disk image or physical device
• Using physical device
  − LUN on SAN
  − Exported block device
    • iSCSI, GNBD
  − Both domain 0's should use same name for device
    • e.g. /dev/sdg
    • Use UDEV rules for mapping if required
Migration
Using file based disk image
● Disk image stored on shared file system
  ● NFS *
  ● Samba
  ● GFS
● Should be mounted on same mount point
  ● Both domain 0's see same directory structure
  ● e.g. /mnt/vm
Migration
Enable relocation server in /etc/xen/xend-config.sxp
(xend-relocation-server yes)
  Enables relocation server
(xend-relocation-port 8002)
  Listens on port 8002 (default)
  ● Use lsof to verify relocation port is active (port 8002 is listed as teradataordbms in /etc/services)
    [root@grumble]$ sudo lsof -i |grep tera
    python 3873 root 5u IPv4 12329 TCP *:teradataordbms (LISTEN)
(xend-relocation-address '')
  Binds to all IP addresses
(xend-relocation-hosts-allow '')
  Allows any host to migrate to this domain 0
  ● Authorized hosts can be listed by IP or name
  ● '^myserver.mydomain.com$ ^.*\.redhat\.com$'
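Before restricting the hosts-allow list it can help to test which names a given list would accept. A minimal sketch, assuming the list is a space-separated set of regular expressions as in the example above; the function name host_allowed is hypothetical, and grep -E only approximates the regular expression matching done by xend itself.

```shell
#!/bin/sh
# host_allowed HOST "REGEX [REGEX...]"
# Hypothetical helper: succeeds if HOST matches any expression in the list.
# Runs in a subshell so that disabling globbing does not leak to the caller.
host_allowed() (
    host=$1
    list=$2
    # An empty list means any host may migrate to this dom0 (see slide).
    [ -z "$list" ] && exit 0
    set -f    # keep "*" in the expressions from being expanded as a file glob
    for re in $list; do
        if printf '%s\n' "$host" | grep -Eq -- "$re"; then
            exit 0
        fi
    done
    exit 1
)
```

For example, host_allowed et.redhat.com '^myserver.mydomain.com$ ^.*\.redhat\.com$' succeeds, while a host outside both expressions is rejected.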
A few examples
Troubleshooting (Memory Ballooning)
• Failed domain creation due to memory shortage, i.e. unable to balloon domain
  ● A domain might fail to start if there's either not enough memory available or if dom0 has not ballooned down enough to provide space for the newly created/started guest
  ● A typical error message in your /var/log/xen/xend.log would be
    [2006-11-21 20:33:31 xend 3198] DEBUG (balloon:133) Balloon: 558432 KiB free; 0 to scrub; need 1048576; retries: 20.
    [2006-11-21 20:33:52 xend.XendDomainInfo 3198] ERROR (XendDomainInfo:202) Domain construction failed
  ● You can verify the amount of memory currently used by dom0 using the command “xm list Domain-0”. If dom0 indeed is not ballooned down, you can use the command “xm mem-set Domain-0 NewMemSize” (where NewMemSize should be a smaller value)
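To judge how far dom0 needs to balloon down, the "Balloon:" debug line can be turned into a shortfall figure. A minimal sketch, assuming the line format shown above and that the "need" value is in KiB like the "free" value; the function name balloon_shortfall is hypothetical.

```shell
#!/bin/sh
# balloon_shortfall: reads xend.log lines on stdin and prints, for each
# "Balloon:" debug line, how much memory is still missing (need - free).
# Hypothetical helper; assumes both numbers are in KiB.
balloon_shortfall() {
    sed -n 's/.*Balloon: \([0-9][0-9]*\) KiB free; .*need \([0-9][0-9]*\);.*/\1 \2/p' |
    while read -r free need; do
        echo $((need - free))
    done
}
```

Used as "balloon_shortfall < /var/log/xen/xend.log", the example line above yields 490144, i.e. roughly 479 MiB that would have to be freed, for instance by lowering dom0 with xm mem-set.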
Troubleshooting (non-Xen/PAE kernel)
• Wrong kernel image (i.e. non-Xen kernel in a para-virt guest)
  ● If you try to boot a non-xen kernel in a para-virtualized guest you will see the following error message
    [root@grumble]# xm create testVM
    Using config file "./testVM".
    Going to boot Red Hat Enterprise Linux Server (2.6.18-1.2839.el5)
    kernel: /vmlinuz-2.6.18-1.2839.el5
    initrd: /initrd-2.6.18-1.2839.el5.img
    Error: (22, 'Invalid argument')
  ● In the above example you can see that the kernel line shows that it's trying to boot a non-xen kernel. The correct entry would be “kernel: /vmlinuz-2.6.18-1.2839.el5xen”
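The kernel line can be checked mechanically before a guest is started. A minimal sketch, assuming that Xen kernels are recognizable by the "xen" suffix on the version string, as in the example above; the function name is_xen_kernel is hypothetical.

```shell
#!/bin/sh
# is_xen_kernel KERNEL_PATH
# Hypothetical helper: succeeds if the kernel file name ends in "xen",
# e.g. /vmlinuz-2.6.18-1.2839.el5xen.
is_xen_kernel() {
    case "$1" in
        *xen) return 0 ;;
        *)    return 1 ;;
    esac
}

if ! is_xen_kernel /vmlinuz-2.6.18-1.2839.el5; then
    echo "not a Xen kernel: expect Error: (22, 'Invalid argument') on xm create"
fi
```

The suffix test is a naming convention check only; it does not inspect the kernel image itself.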
Troubleshooting (networking/bridging)
• Wrong bridge configured in guest configuration file, causing Xen hotplug scripts to time out
  ● If you have moved configuration files between different hosts you may need to make sure your guest configuration files have been updated to reflect any change in network topology/configuration such as Xen bridge numbering etc
    [root@grumble xen]# xm create r5b2-mySQL01
    Using config file "r5b2-mySQL01".
    Going to boot Red Hat Enterprise Linux Server (2.6.18-1.2747.el5xen)
    kernel: /vmlinuz-2.6.18-1.2747.el5xen
    initrd: /initrd-2.6.18-1.2747.el5xen.img
    Error: Device 0 (vif) could not be connected. Hotplug scripts not working
  ● In /var/log/xen/xen-hotplug.log you will see the following error being logged
    bridge xenbr1 does not exist!
Troubleshooting (networking/bridging) contd
● In /var/log/xen/xend.log you will see the following messages being logged
    [2006-12-14 15:07:08 xend 3874] DEBUG (DevController:143) Waiting for devices vif.
    [2006-12-14 15:07:08 xend 3874] DEBUG (DevController:149) Waiting for 0.
    [2006-12-14 15:07:08 xend 3874] DEBUG (DevController:464) hotplugStatusCallback /local/domain/0/backend/vif/2/0/hotplug-status.
    [2006-12-14 15:07:08 xend 3874] DEBUG (DevController:464) hotplugStatusCallback /local/domain/0/backend/vif/2/0/hotplug-status.
    [2006-12-14 15:08:48 xend 3874] DEBUG (DevController:464) hotplugStatusCallback /local/domain/0/backend/vif/2/0/hotplug-status.
    [2006-12-14 15:08:48 xend 3874] DEBUG (DevController:464)
Zombie Domains
When shutting down a domain or migrating, it doesn't actually die but ends up leaving a domain named Zombie-<dom>
● Typically caused by restarting xend while domains are running
● Xend does not properly reconnect certain devices (most notably frame buffers), so when the domain goes to shut down, xend does not know how to find all of the resources for the domain to destroy
● If it happens during migration, the likely cause is either a networking problem or the remote relocation server not being enabled (status can also change to “migrating-<dom>”)
    [root@dhcp78-237 ~]# xm list
    Name                 ID  Mem(MiB)  VCPUs  State   Time(s)
    Domain-0              0     14627      8  r-----  18883.9
    Zombie-rhel4u5pv01    9       511      1  -p----   1072.2
    rhel5gapv01           4       511      1  -b----   1961.1
● To resolve the issue
  ● Reboot (only reliable solution)
  ● Balloon down Zombie domain to make room for other domains
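Zombie domains can be picked out of the xm list output by their name prefix. A minimal sketch, assuming the column layout shown above (name in the first column, ID in the second); the function name zombie_domains is hypothetical.

```shell
#!/bin/sh
# zombie_domains: reads "xm list" output on stdin and prints the name and ID
# of every domain whose name starts with "Zombie-". Hypothetical helper.
zombie_domains() {
    # NR > 1 skips the header line
    awk 'NR > 1 && $1 ~ /^Zombie-/ { print $1, $2 }'
}
```

On dom0 this would be used as "xm list | zombie_domains"; for the listing above it prints "Zombie-rhel4u5pv01 9".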
Storage Troubleshooting (Filebased)
Unable to start more than 8 file based guests
● If file based guest images are used, one may have to increase the number of configured loop devices. The default configuration allows up to 8 loop devices to be active. If more than 8 file based guests/loop devices are needed, the number of loop devices configured can be adjusted in /etc/modprobe.conf
● Simply edit /etc/modprobe.conf and add the following line to it
    options loop max_loop=64
● You can substitute the number '64' with a number which fits your local configuration
Storage Troubleshooting (Filebased)
Check if tapdisk process is actually running
[root@dhcp78-237 ~]# ps auxw|grep tap
root 3338 0.0 0.0 95644 680 ? Ssl Apr29 0:00 blktapctrl
root 1053 0.0 0.0 60268 708 pts/7 S+ 06:09 0:00 grep tapd
root 5238 0.0 0.0 30176 612 ? Sl Apr29 0:04 tapdisk /dev/xen/tapctrlwrite2 /dev/xen/tapctrlread2
root 6843 0.0 0.0 30172 612 ? Sl Apr29 0:03 tapdisk /dev/xen/tapctrlwrite3 /dev/xen/tapctrlread3
Storage Troubleshooting (Filebased)
For each disk there should be a corresponding entry in your domain configuration
[root@woodie ~]# virsh dumpxml 4
<domain type='xen' id='4'>
<name>rhel5gapv01</name>
<os>
<type>linux</type>
<kernel>/var/lib/xen/vmlinuz.Q0Psrs</kernel>
<initrd>/var/lib/xen/initrd.B_e0Gm</initrd>
<cmdline>ro root=/dev/VolGroup00/LogVol00 rhgb quiet</cmdline>
</os>
<devices>
<disk type='file' device='disk'>
<driver name='tap' type='aio'/>
<source file='/var/lib/xen/images/rhel5gapv01.dsk'/>
<target dev='xvda'/>
</disk>
<console tty='/dev/pts/3'/>
</devices>
</domain>
Other random bits
A fully virtualized x86_64 guest fails to boot with “Your CPU does not support long mode. Use a 32bit distribution”.
● Resolution:
  ● Make sure “pae=1” is set in the domain configuration file
Over time, the /var/lib/xen directory becomes full of kernel.xxxxx and initrd.xxxxx files
● Xen tools don't always clean up after errors
● Resolution:
  ● Safe to just delete all of the files
Message: “FATAL: Module microcode not found” during domain boot
● It's not really a fatal error, it is just a warning that the virtual machine couldn't update the CPU microcode
● Resolution:
  ● Nothing required
  ● To get rid of the error message, disable the microcode_ctl service
Resources
• Red Hat
  − http://www.redhat.com/
• Virtualization Infocenter
  − http://www.openvirtualization.com/
• Libvirt
  − http://www.libvirt.org/
• Virt-Manager
  − http://virt-manager.et.redhat.com/
• Red Hat Cluster Suite
  − http://www.redhat.com/solutions/gfs/
• Red Hat Emerging Technology Group
  − http://et.redhat.com/