Solaris Kernel Debugging V1.0

Upload: jarod-wang

Post on 06-May-2015

TRANSCRIPT

Page 1: Solaris Kernel Debugging V1.0

Solaris Kernel Debugging
Mdb and DTrace

Oliver Yang
Software Engineer
Sun Microsystems, Inc.

Page 2: Solaris Kernel Debugging V1.0

Agenda
• Kernel Debug Overview
• Modular Debugger - Mdb
• Dynamic Tracing - DTrace
• References

Page 3: Solaris Kernel Debugging V1.0

Skill Sets of Kernel Debugging
• Key elements for kernel debugging
  > Kernel source code
    – http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/
  > Kernel debugging tools
  > System architecture
    – x86/x64/SPARC
  > Programming skills
    – C/Assembly/D/Shell/Awk/Sed/Perl

Page 4: Solaris Kernel Debugging V1.0

Kernel Debugging Tools
• Debug in code
  > cmn_err(9F) - Kernel version of printf(3C)
  > ASSERT - Only effective in debug kernel
• In-situ kernel debuggers
  > Kmdb, SPARC OBP
• Run-time tracing
  > DTrace, Lockstat, Kmem allocator, etc.
• Post-mortem debuggers
  > Mdb, ACT, SCAT

Page 5: Solaris Kernel Debugging V1.0

Difficulties of Kernel Debugging...
• The problems you may encounter
  > System panic
  > System hang
  > Memory leaks & corruption
  > Performance issues
  > Any other functionality issues
• Some hot bugs are found on customer sites...
  > Cannot debug with a non-production kernel
  > Cannot debug on mission-critical machines
  > May not be deterministically reproducible
  > May only have the crash dumps

Page 6: Solaris Kernel Debugging V1.0

Agenda
• Kernel Debug Overview
• Modular Debugger - Mdb
• Dynamic Tracing - DTrace
• References

Page 7: Solaris Kernel Debugging V1.0

Mdb - The Modular Debugger
• Mdb targets
  > User processes
  > User process core files
  > Live kernel, read-only, through /dev/kmem and /dev/ksyms
  > Live kernel with execution control, through kmdb
  > System crash dumps
  > User process images inside system crash dumps
  > ELF object files
  > Raw data files

Page 8: Solaris Kernel Debugging V1.0

Live Kernel Debug – Read Only
• How to run it?
  > mdb -k
• What you can do?
  > Inspect kernel data structures and kernel pages
  > /dev/kmem
    – Access kernel virtual address space, excluding memory that is associated with an I/O device
  > /dev/ksyms
    – Access kernel symbols as kernel ELF definitions

Page 9: Solaris Kernel Debugging V1.0

Live Kernel Debug - Execution Control
• How to run it?
  > mdb -K
  > Boot system with kmdb loaded
    – x86: "-k" option in the GRUB menu
    – SPARC: "-k" or "kmdb" option in OBP
• What you can do?
  > Instruction-level control of kernel threads executing on each CPU
  > Set breakpoints, single-step the kernel, and inspect data structures in real time

Page 10: Solaris Kernel Debugging V1.0

Live Kernel Debug - Execution Control
• dcmds
  > [addr]:b
  > [addr]:d
  > ::events or $b
  > :z
  > :c
  > :e
  > :s
  > [syscall]::sysbp
  > addr [,len]::wp

Page 11: Solaris Kernel Debugging V1.0

Post-mortem Debug - Crash Dumps
• How to use it
  > mdb unix.<n> vmcore.<n>
• What you can do?
  > Access kernel memory pages and user process images inside a system crash dump
  > Inspect kernel/user process data structures and kernel/user process pages

Page 12: Solaris Kernel Debugging V1.0

Post-mortem Debug - Crash Dumps
• You can get a crash dump by...
  > A real panic
  > Reboot with -d
  > Enter kmdb, run $<systemdump
  > Deadman timer
    – Setting snooping to 1 in /etc/system, reboot
    – Setting deadman_enabled to 1 via mdb -kw
• savecore(1M) & dumpadm(1M)
        Dump content: kernel pages
         Dump device: /dev/dsk/c0d0s1 (swap)
  Savecore directory: /var/crash/<hostname>
    Savecore enabled: yes

Page 13: Solaris Kernel Debugging V1.0

Modular Debugger Basic
• General dcmds
  > ::help
  > ::dcmds
  > ::formats
  > ::dmods -l [module...]
  > ::log -e file
  > ::quit or $q

Page 14: Solaris Kernel Debugging V1.0

Modular Debugger Basic
• Inspect memory and data structures
  > addr[,b]::dump [-g sz] [-e]
  > addr::dis
  > addr::print type field
  > ::sizeof type
  > ::offsetof type field
  > ::enum enumname
  > addr::array [type count] [var]
  > addr::list type field [var]

Page 15: Solaris Kernel Debugging V1.0

Crash Dumps Analysis - Panic
• Panic procedures
  > Panic messages
    – Panic thread
    – Trap number
    – Pointer to trap frame
    – CPU registers
    – Back trace
  > Dump memory to dump device
  > Dump CPU registers to dump device
  > Reboot
  > Savecore (from dump device to file system)

Page 16: Solaris Kernel Debugging V1.0

Crash Dumps Analysis - Panic
• dcmds
  > ::status
  > ::showrev
  > ::prtconf
  > ::modinfo
  > ::msgbuf
  > [addr]$c/::stack/::stackregs
  > [addr]::dis
  > ::regs
  > [rp]::print struct regs
• Know the ABIs of x86/x64/SPARC

Page 17: Solaris Kernel Debugging V1.0

Crash Dumps Analysis – Hang
• What conditions cause hangs?
  > Deadlock
  > Resource exhaustion
  > Hardware problems
• Debugging system hangs
  > Live debugging with kmdb
  > Forcing a crash dump and analyzing it with mdb

Page 18: Solaris Kernel Debugging V1.0

Crash Dumps Analysis – Hang
• Dispatcher and kernel threads
  > [id]::cpuinfo
  > ::cycinfo
  > [addr]::threadlist
  > [addr]::thread
  > [addr]::findstack
  > [addr]::mutex
  > [addr]::rwlock
  > [addr]::wchaninfo
  > [addr]::whatthread or ::kgrep

Page 19: Solaris Kernel Debugging V1.0

Crash Dumps Analysis – Hang
• Kernel memory
  > ::memstat
  > ::findleaks
  > ::kmastat/::kmem_cache/::walk <cache name>
  > ::kmausers
  > ::vmem/::walk vmem_seg/::vmem_seg
  > [addr]::whatis
  > [addr]::bufctl
  > [addr]::allocdby/[addr]::freedby
• Some of these dcmds need kmem allocator tracing
  > Set kmem_flags = 0xf in /etc/system, then reboot

Page 20: Solaris Kernel Debugging V1.0

Agenda
• Kernel Debug Overview
• Modular Debugger - Mdb
• Dynamic Tracing - DTrace
• References

Page 21: Solaris Kernel Debugging V1.0

Dynamic Tracing Framework
• DTrace framework includes...
  > Consumer programs running in user land
    – dtrace(1M)/intrstat(1M)/lockstat(1M)...
  > Kernel modules that provide probes to gather tracing data
    – dtrace(7D) and providers: syscall/fbt/sdt/vminfo...
  > A library interface that consumer programs use to access the DTrace facility through the dtrace driver

Page 22: Solaris Kernel Debugging V1.0

DTrace Big Picture

Page 23: Solaris Kernel Debugging V1.0

Provider
• How a provider works
  > A provider represents a methodology for instrumenting the system
  > A provider covers a certain aspect of the system
  > A provider makes probes available to the DTrace framework
  > DTrace informs providers when a probe is to be enabled
  > The provider transfers control to DTrace when an enabled probe fires
• Different ways to use providers
  > Watch code path
    – fbt/sdt/syscall/pid/fsinfo/io/vminfo/proc/sched, etc.
  > Get statistical data
    – mib/lockstat/profile/sysinfo, etc.

Page 24: Solaris Kernel Debugging V1.0

Providers

Provider   Description
lockstat   lock contention statistics, or understanding locking behavior
profile    a time-based interrupt firing every fixed, specified interval
fbt        entry to and return from most functions in the Solaris kernel
syscall    entry to and return from every system call in the system
sdt        locations that a programmer has formally designated
sysinfo    correspond to kernel statistics classified by the name sys
vminfo     correspond to the vm kernel statistics
proc       process creation and termination, sending and handling signals
sched      related to CPU scheduling
io         related to disk input and output
mib        related to counters in MIB (management information bases)
pid        entry and return of any function in a user process

Page 25: Solaris Kernel Debugging V1.0

Running DTrace
• D scripts
  > Run *.d scripts
    #!/usr/sbin/dtrace -s
    probe
    /predicate/
    {
        actions
    }
• Command line
  > Run the dtrace command, see dtrace(1M)
    dtrace -n 'probe /predicate/ { actions }'
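A minimal sketch of the script layout above, filled in with one probe, one predicate, and one action. The aggregation name @calls is chosen here for illustration; the script assumes a system with the syscall provider and must be run with DTrace privileges.

```d
#!/usr/sbin/dtrace -s

/*
 * Count system calls per process name.
 * probe:     every system call entry
 * predicate: skip the kernel's own "sched" process
 * action:    bump a per-execname counter (@calls is an illustrative name)
 */
syscall:::entry
/execname != "sched"/
{
        @calls[execname] = count();
}
```

Stop it with Ctrl-C; dtrace prints the aggregation when it exits. The same clause fits on the command line as dtrace -n 'syscall:::entry /execname != "sched"/ { @calls[execname] = count(); }'.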

Page 26: Solaris Kernel Debugging V1.0

Probe
• provider:module:function:name
  > Provider
    – The instrumentation method to be used. For example, the syscall provider is used to monitor system calls while the io provider is used to monitor disk I/O.
  > Module
    – The kernel module you want to observe
  > Function
    – The kernel function you want to observe
  > Name
    – Represents the location in the function. For example, use entry for name to instrument when you enter the function.

Page 27: Solaris Kernel Debugging V1.0

Probe

Probe Description     Explanation
fbt::bge_intr:entry   entry into the bge_intr function
fbt::bge_*:entry      entry into any kernel function whose name starts with bge_
fbt:bge::entry        entry into any bge driver function
fbt:::entry           entry into any kernel function
fbt:::                all probes published by the fbt provider

• A probe...
  > is defined as a 4-attribute tuple
  > can be listed by dtrace -l [-f|-i|-m|-n|-P]
  > supports wildcard matching
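As a rough illustration of a wildcarded probe description, the sketch below enables every fbt entry probe in the bge module and counts calls per function. It assumes the bge driver is loaded (dtrace reports an error if no probes match); the aggregation name and the ten-second limit are arbitrary choices made here.

```d
#!/usr/sbin/dtrace -s

/* An empty function field matches every function in the bge module. */
fbt:bge::entry
{
        /* probefunc is the function part of the probe that fired. */
        @calls[probefunc] = count();
}

/* Stop after ten seconds; the aggregation is printed on exit. */
tick-10s
{
        exit(0);
}
```

To see what would be enabled before running it, dtrace -l -n 'fbt:bge::entry' lists the matching probes without tracing anything.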

Page 28: Solaris Kernel Debugging V1.0

Predicate

Predicate               Explanation
cpu == 0                true if the probe executes on cpu0
execname != "sched"     true if the process is not the scheduler
pid == 1029             true if the pid of the process that caused the probe to fire is 1029
ppid != 0 && arg0 == 0  true if the parent process id is not 0 and the first argument is 0

• A predicate...
  > can be any D expression; the result is treated as a boolean
  > must be true for the actions to be executed
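A small sketch of a predicate in use, reusing the last expression from the table. For the read system call, arg0 is the file descriptor, so the clause only fires for reads from descriptor 0 issued by processes whose parent is not pid 0. The aggregation name is illustrative.

```d
#!/usr/sbin/dtrace -s

syscall::read:entry
/ppid != 0 && arg0 == 0/
{
        /* Reached only when the predicate above evaluates to true. */
        @stdin_reads[execname, pid] = count();
}
```

Filtering in the predicate rather than inside the clause keeps the probe cheap: when the predicate is false, none of the actions run.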

Page 29: Solaris Kernel Debugging V1.0

Action
• An action...
  > is executed when a probe fires
  > has two categories
    – Data recording action/Destructive action

Action        Explanation
trace()       trace the D expression results
printf()      print something using C-style printf()
printa()      print the aggregations
ustack()      print the user stack trace
stack()       print the kernel stack trace
tracemem()    copy data from an address in memory to a buffer
breakpoint()  a kernel breakpoint, causes the system to drop into kmdb
panic()       cause a kernel panic
chill()       spin for the specified number of nanoseconds
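The sketch below combines two of the data recording actions from the table, printf() and ustack(), on one probe. It assumes the script is started with -c or -p so that the $target macro resolves to a process ID; the choice of the write system call is only an example. It uses no destructive actions, which would additionally require the -w option.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* For write(2): arg0 is the file descriptor, arg2 the byte count. */
syscall::write:entry
/pid == $target/
{
        printf("%s (pid %d): write(fd=%d, %d bytes)\n",
            execname, pid, arg0, arg2);

        /* Show the user-level call path that led to this write. */
        ustack();
}
```

One way to run it, assuming the file is saved as actions.d: dtrace -s actions.d -c date.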

Page 30: Solaris Kernel Debugging V1.0

Aggregation

Function     Explanation
count()      number of times the function is called
sum()        total value of the specified expressions
avg()        arithmetic average of the specified expressions
min()        smallest value among the specified expressions
max()        largest value among the specified expressions
quantize()   a power-of-two frequency distribution of the values of the specified expressions
lquantize()  a linear frequency distribution of the values of the specified expressions, sized by the specified range

• Aggregation syntax
  > @name[ keys ] = aggfunc( args );
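To make the syntax concrete, here is a sketch that applies three aggregating functions to the same value, the byte count returned by read. In a syscall return probe, arg0 holds the return value. The aggregation names and the 0..4096 range with a step of 512 are arbitrary choices for illustration.

```d
#!/usr/sbin/dtrace -s

syscall::read:return
/arg0 > 0/
{
        /* Total bytes read, keyed by process name. */
        @bytes[execname] = sum(arg0);

        /* Power-of-two distribution of read sizes per process. */
        @sizes[execname] = quantize(arg0);

        /* Linear distribution: buckets of 512 bytes from 0 to 4096. */
        @linear["all processes"] = lquantize(arg0, 0, 4096, 512);
}
```

All three aggregations are printed when dtrace exits; printa() can be used instead to control when and how they are printed.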

Page 31: Solaris Kernel Debugging V1.0

Variables
  > Scalar variables
    – Represent individual fixed-size data objects
  > Associative arrays
    – name [ key ] = expression ;
  > Thread-local variables
    – self->[variable name]
  > Clause-local variables
    – this->[variable name]
  > Built-in variables
    – Pre-defined scalar global variables
  > External variables
    – The "`" (backquote) is a scoping operator for accessing variables that are defined in the OS, e.g. `kmem_flags
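The following sketch puts three of these variable kinds side by side: an external variable read with the backquote operator, a thread-local variable carrying a timestamp from entry to return, and a clause-local variable for a temporary result. The variable and aggregation names are illustrative, and timing the read system call is only an example.

```d
#!/usr/sbin/dtrace -s
#pragma D option quiet

BEGIN
{
        /* External variable: the kernel's own kmem_flags. */
        printf("kmem_flags = 0x%x\n", `kmem_flags);
}

syscall::read:entry
{
        /* Thread-local: each thread gets its own copy of start. */
        self->start = timestamp;
}

syscall::read:return
/self->start/
{
        /* Clause-local: valid only while this clause executes. */
        this->elapsed = timestamp - self->start;
        @latency_ns[execname] = quantize(this->elapsed);

        /* Assigning zero releases the thread-local storage. */
        self->start = 0;
}
```

A thread-local variable is the usual way to pair an entry with its return, because the same thread fires both probes while other threads may be inside the same call.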

Page 32: Solaris Kernel Debugging V1.0

Built-in Variables

Type and Name         Explanation
int64_t arg0...arg9   The first 10 input arguments
cpuinfo_t *curcpu     The CPU information for the current CPU
processorid_t cpu     The CPU identifier for the current CPU
kthread_t *curthread  kthread_t address of the current kernel thread
pid_t pid             The process ID of the current process
pid_t ppid            The parent process ID of the current process
uint_t ipl            IPL on the current CPU at probe firing time
int errno             Error value returned by the last system call
string execname       Name passed to exec(2) to execute the process
uint64_t timestamp    A nanosecond timestamp counter; it increments from an arbitrary point in the past and should only be used for relative computations
uint64_t vtimestamp   A nanosecond counter of the time the current thread has been running on a CPU, minus the time spent in predicates and actions
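A brief sketch using several of the built-ins from the table: errno in the predicate, and execname, probefunc, and errno as aggregation keys. probefunc is another built-in (the function part of the firing probe, here the system call name) that is not listed on the slide; the aggregation name is illustrative.

```d
#!/usr/sbin/dtrace -s

/* Count failing system calls by process, call name, and error number. */
syscall:::return
/errno != 0/
{
        @failures[execname, probefunc, errno] = count();
}
```

Checking errno only in syscall return probes matters, since that is where the value for the call that just finished is meaningful.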

Page 33: Solaris Kernel Debugging V1.0

Agenda
• Kernel Debug Overview
• Modular Debugger - Mdb
• Dynamic Tracing - DTrace
• References

Page 34: Solaris Kernel Debugging V1.0

Documentation & Links - Mdb
• Solaris Internals Second Edition
  > www.solarisinternals.com
• Solaris Modular Debugger Guide
  > docs.sun.com/app/docs/doc/817-2543
• OpenSolaris mdb community
  > opensolaris.org/os/community/mdb
• Crash dump analysis
  > opensolaris.org/os/community/documentation/files/book.pdf

Page 35: Solaris Kernel Debugging V1.0

Documentation & Links - DTrace
• Solaris Internals Second Edition
  > www.solarisinternals.com/wiki/index.php/DTrace_Topics
• Solaris Dynamic Tracing Guide
  > docs.sun.com/app/docs/doc/819-3620
• OpenSolaris DTrace community
  > opensolaris.org/os/community/dtrace
• DTrace Tools
  > www.brendangregg.com/dtrace.html

Page 36: Solaris Kernel Debugging V1.0

Q&A

Oliver Yang