lkcd linux kernel crash dumps
LKCD Linux Kernel Crash Dumps Matt D. Robinson [email protected]

Post on 31-Dec-2015

Page 1: LKCD Linux Kernel Crash Dumps

LKCD Linux Kernel Crash Dumps

Matt D. Robinson

[email protected]

Page 2: LKCD Linux Kernel Crash Dumps

04/19/23 Version 1.0 2

LKCD Overview

Description

Kernel Implementation

Configuration

Invocation/Kernel State

User-Level Analysis (lcrash)

lcrash Example Output

Future Development/Evolution

Page 3: LKCD Linux Kernel Crash Dumps

Description

LKCD is a set of kernel and application code to configure, implement, and analyze system crash dumps.

These slides will cover a high-level view of the kernel side of LKCD, with a brief introduction to the user-level analysis tools.

Page 4: LKCD Linux Kernel Crash Dumps

Kernel Implementation

dump.o is the primary kernel driver, and can be either a module or built directly into the kernel

Dump driver is dormant until invoked, either for configuration or for dumping

Configuration of the dump device determines what occurs on invocation

Disruptive and non-disruptive dumping available

Page 5: LKCD Linux Kernel Crash Dumps

Kernel Implementation

Dump compression available through modules (or standalone) – GZIP or RLE

Access to the dump driver is through /dev/dump (major/minor device numbers 227,0)

panic() or die_if_kernel() will invoke the dumping process – dumping only occurs if dumps are configured
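As a rough illustration of why run-length encoding suits memory pages, which often contain long runs of identical bytes, here is a minimal generic RLE sketch. It is not LKCD's on-disk RLE format, which these slides do not describe; the names `rle_encode` and `rle_decode` are made up for this example.

```python
def rle_encode(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs; runs are capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and run < 255 and data[i + run] == data[i]:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        count, value = encoded[k], encoded[k + 1]
        out += bytes([value]) * count
    return bytes(out)


page = bytes(4096)  # an all-zero 4 KiB page
packed = rle_encode(page)
print(len(page), "->", len(packed), "bytes")  # 4096 -> 34 bytes
```

A page with no repeated bytes would double in size under this naive scheme, which is one reason a raw (uncompressed) dump option remains useful.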

Page 6: LKCD Linux Kernel Crash Dumps

Kernel Implementation

Current dump path uses existing I/O subsystem for dumping

Disks (primarily swap) are used for now – future direction will be MUCH different

Dump call path (diagram; functions in the order shown on the slide):

panic() or die_if_kernel()

dump()

dump_execute()

dump_add_page()

dump_write_pages()

dump_compress_page()

I/O subsystem (disk, network, etc.)
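The call path listed above can be modelled in user space as a small pipeline. This is only a conceptual sketch: the Python functions borrow their names from the slide, but their signatures, the way they call each other, the page size, and the use of zlib as a stand-in compressor are all assumptions, not LKCD's kernel code.

```python
import io
import zlib

PAGE_SIZE = 4096  # assumed page size (i386)


def dump_compress_page(page: bytes) -> bytes:
    """Stand-in compressor (zlib here; LKCD offers GZIP or RLE)."""
    return zlib.compress(page)


def dump_add_page(pending: list, page: bytes) -> None:
    """Compress one page and queue it for writing."""
    pending.append(dump_compress_page(page))


def dump_write_pages(device, pending: list) -> None:
    """Hand queued chunks to the 'I/O subsystem' (any file-like object here)."""
    for chunk in pending:
        device.write(chunk)


def dump_execute(memory: bytes, device) -> int:
    """Walk memory page by page, then flush the queued pages."""
    pending = []
    for off in range(0, len(memory), PAGE_SIZE):
        dump_add_page(pending, memory[off:off + PAGE_SIZE])
    dump_write_pages(device, pending)
    return len(pending)


def dump(memory: bytes, device, configured: bool) -> int:
    """Entry point reached from panic(); a no-op unless dumps are configured."""
    if not configured:
        return 0
    return dump_execute(memory, device)


disk = io.BytesIO()
pages = dump(bytes(3 * PAGE_SIZE), disk, configured=True)
print(pages, "pages dumped,", len(disk.getvalue()), "bytes written")
```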

Page 7: LKCD Linux Kernel Crash Dumps

Configuration

Dump configuration takes place via ioctl() calls to the kernel driver:

DIOSDUMPLEVEL
  DUMP_LEVEL_NONE – Don’t dump any pages
  DUMP_LEVEL_ALL – Dump all memory pages
  DUMP_LEVEL_KERN – Dump just kernel-level pages

DIOSDUMPFLAGS
  DUMP_FLAGS_NONE – No flags set
  DUMP_FLAGS_NONDISRUPT – Try to continue standard system operation after a dump takes place

Page 8: LKCD Linux Kernel Crash Dumps

Configuration

DIOSDUMPCOMPRESS
  DUMP_COMPRESS_NONE – Raw dump format
  DUMP_COMPRESS_RLE – Use RLE compression
  DUMP_COMPRESS_GZIP – Use GZIP compression

DIOSDUMPDEV
  This is the device to dump to (for example, /dev/sda4)

Whether a given configuration parameter is accepted depends on the system state, for example whether a dump compression module is loaded into the kernel.
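A user-space configuration tool would issue these ioctl() calls one at a time on the open dump device. The sketch below shows the shape of such a helper. The request numbers and constant values are HYPOTHETICAL placeholders, since the slides give names but no numbers, and it assumes each request takes a plain integer argument. DIOSDUMPDEV is left out because the slides do not say in what form the device is passed.

```python
# HYPOTHETICAL numbers for illustration only; the real values must come from
# the LKCD kernel headers.
DIOSDUMPLEVEL, DIOSDUMPFLAGS, DIOSDUMPCOMPRESS = 1, 2, 3
DUMP_LEVEL_KERN = 1
DUMP_FLAGS_NONE = 0
DUMP_COMPRESS_GZIP = 2


def configure_dump(fd, level, flags, compress, ioctl=None):
    """Issue one ioctl per setting; return the (request, value) pairs sent."""
    if ioctl is None:
        import fcntl  # Unix only; imported lazily so the helper can be tested
        ioctl = fcntl.ioctl
    sent = []
    for request, value in ((DIOSDUMPLEVEL, level),
                           (DIOSDUMPFLAGS, flags),
                           (DIOSDUMPCOMPRESS, compress)):
        ioctl(fd, request, value)  # raises OSError if the driver rejects it
        sent.append((request, value))
    return sent
```

On a system with the dump driver present, this might be called with a descriptor obtained from `os.open("/dev/dump", os.O_RDWR)` (the open mode is also an assumption). Passing a fake `ioctl` callable makes it possible to exercise the helper without the driver.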

Page 9: LKCD Linux Kernel Crash Dumps

User-Level Analysis (lcrash)

Linux Crash (lcrash) is used for analyzing system crash dumps. It gives support and engineering personnel a powerful way to find the causes of kernel crashes:

Evaluates CPU state
  Mode, register settings, etc.

Displays all tasks
  Includes which task is running on a given CPU

Stack trace for each running task
  This is accomplished WITHOUT frame pointers built into the kernel (-fomit-frame-pointer)

Allows for memory dumping, struct analysis, finding symbols, etc.
  lcrash is amazingly versatile for problem analysis
  Crash dump reports can be created automatically on boot-up after a system crash
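The slides do not say how lcrash recovers a stack trace without frame pointers. One common heuristic, sketched here purely as an assumption, is to scan the saved stack for words that fall inside the kernel text range and treat them as candidate return addresses. The text range and the stack contents below are invented for illustration; the three code addresses are borrowed from the sshd trace in the example output.

```python
TEXT_START, TEXT_END = 0xC0100000, 0xC0300000  # hypothetical kernel text range


def candidate_return_addresses(stack_words):
    """Return the stack words that point into kernel text, in stack order."""
    return [w for w in stack_words if TEXT_START <= w < TEXT_END]


# Invented stack: plain data values mixed with code addresses.
stack = [0x00000005, 0xC0111250, 0xDFFF2000, 0xC0110D89, 0x00000000, 0xC014251A]
for depth, addr in enumerate(candidate_return_addresses(stack)):
    print(depth, hex(addr))
```

A real tool would also have to reject stale values left on the stack by earlier calls, so a range check like this is only the first filtering step.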

Page 10: LKCD Linux Kernel Crash Dumps

lcrash Example Output

>> stat | head

sysname : Linux

nodename : crashme.atmyhouse.com

release : 2.4.8

version : #9 SMP Mon Dec 10 00:05:19 PST 2001

machine : i686

domainname : (none)

LOG_BUF:

>> dump log_buf 10

0xc0332c60: 4c3e343c 78756e69 72657620 6e6f6973 : <4>Linux version

0xc0332c70: 342e3220 2820382e 746f6f72 74617740 : 2.4.8 (root@cra

0xc0332c80: 79657265 70612e65 : shme.atm
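The ASCII column of `dump log_buf` can be reproduced by hand: on i686 each 32-bit word is stored little-endian, so its bytes appear in the reverse of the printed hex order. A short sketch of the decoding (the helper name `words_to_text` is made up here):

```python
import struct


def words_to_text(words):
    """Pack 32-bit words little-endian and show printable ASCII, '.' otherwise."""
    raw = b"".join(struct.pack("<I", w) for w in words)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# First row of the dump at 0xc0332c60.
first_row = [0x4C3E343C, 0x78756E69, 0x72657620, 0x6E6F6973]
print(words_to_text(first_row))  # <4>Linux version
```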

Page 11: LKCD Linux Kernel Crash Dumps

lcrash Example Output

>> task
ADDR        UID   PID  PPID STATE FLAGS CPU NAME
======================================================================
0xc02e4000    0     0     0     0     0   - swapper
0xdfffc000    0     1     0     0 0x100   - init
0xdfff2000    0     2     1     1  0x40   - keventd
0xdffee000    0     3     0     0  0x40   - ksoftirqd_CPU0
[ . . . ]
0xde47a000    0   867     1     1 0x100   - mingetty
0xda0fe000    0  1017   660     0 0x140   - sshd
0xd9c06000    0  1018  1017     1 0x100   - bash
0xde4b4000    0  1101  1018     0 0x100   0 insmod
======================================================================
31 active task structs found

Page 12: LKCD Linux Kernel Crash Dumps

lcrash Example Output

>> t 0xda0fe000
=========================================================
STACK TRACE FOR TASK: 0xda0fe000(sshd)
 0 schedule+1040 [0xc0111250]
 1 schedule_timeout+121 [0xc0110d89]
 2 do_select+506 [0xc014251a]
 3 sys_select+820 [0xc01428c4]
 4 system_call+44 [0xc0106ed4]
=========================================================

>> fsym panic_timeout
ADDR        OFFSET TYPE        NAME
============================================================
0xc0332804       0 GLOBAL_DATA panic_timeout
============================================================
1 symbol found

>> od panic_timeout
0xc0332804: 00000005 : ....
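Each trace line has the form symbol+offset [address]. Producing that form amounts to finding the nearest symbol at or below the address. In this sketch the two symbol start addresses are derived from the trace above (address minus offset); the table and the function `resolve` are otherwise illustrative and say nothing about how lcrash stores its symbols.

```python
import bisect

# (start address, name), sorted by address. Starts derived from the trace:
# 0xc0110d89 - 121 and 0xc0111250 - 1040.
SYMBOLS = [(0xC0110D10, "schedule_timeout"), (0xC0110E40, "schedule")]
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr):
    """Format addr as 'name+offset' using the closest preceding symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        raise ValueError("address below first symbol")
    start, name = SYMBOLS[i]
    return f"{name}+{addr - start}"


print(resolve(0xC0111250))  # schedule+1040
print(resolve(0xC0110D89))  # schedule_timeout+121
```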

Page 13: LKCD Linux Kernel Crash Dumps

lcrash Example Output

>> px ((struct task_struct *)0xd8abf000).thread.esp0
0x15a159

>> px ((struct task_struct *)0xd8abf000).thread.debugreg[0]
0x0

>> whatis user_struct
struct user_struct {
    atomic_t __count;
    atomic_t processes;
    atomic_t files;
    struct user_struct *next;
    struct user_struct **pprev;
    uid_t uid;
};

>> px (struct user_struct *)(((struct task_struct *)0xd8abf000).user).uid
0xfffff000
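Evaluating a struct member in `px` boils down to reading memory at the struct's address plus the member's offset. The layout printed by `whatis` can be mirrored with `ctypes` to compute such an offset. The field sizes are assumptions for a 32-bit i686 kernel (4-byte `atomic_t`, pointers, and `uid_t`); they are not stated on the slide.

```python
import ctypes


class UserStruct(ctypes.Structure):
    # Mirrors the whatis output; 32-bit pointers are modelled as c_uint32.
    _fields_ = [
        ("count", ctypes.c_int32),      # atomic_t __count
        ("processes", ctypes.c_int32),  # atomic_t processes
        ("files", ctypes.c_int32),      # atomic_t files
        ("next", ctypes.c_uint32),      # struct user_struct *next
        ("pprev", ctypes.c_uint32),     # struct user_struct **pprev
        ("uid", ctypes.c_uint32),       # uid_t uid
    ]


print("offset of uid:", UserStruct.uid.offset)        # offset of uid: 20
print("size of struct:", ctypes.sizeof(UserStruct))   # size of struct: 24
```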

Page 14: LKCD Linux Kernel Crash Dumps

Future Development/Evolution

The 2.5 implementation of LKCD will use dump methods to allow multiple dumping paths through the kernel (multiple devices!)

Low-level device drivers will register their own set of dump functions so that each driver does what it thinks is correct

lcrash and other LKCD utilities will be extended to support this functionality

LKCD will be extended to work on other operating systems (such as FreeBSD)
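The idea of drivers registering their own dump functions can be pictured as a table of callbacks keyed by device. The sketch below is a user-space model only; the names `register_dump_method` and `dump_to`, and the two example writers, are hypothetical and do not come from the LKCD 2.5 design.

```python
from typing import Callable, Dict, Iterable

DumpWriter = Callable[[bytes], int]  # takes a page, returns bytes written
_dump_methods: Dict[str, DumpWriter] = {}


def register_dump_method(device: str, writer: DumpWriter) -> None:
    """Called by a 'driver' to advertise how it writes dump pages."""
    _dump_methods[device] = writer


def dump_to(device: str, pages: Iterable[bytes]) -> int:
    """Send every page through the writer registered for the device."""
    writer = _dump_methods[device]
    return sum(writer(page) for page in pages)


disk_log = []


def disk_writer(page: bytes) -> int:
    """Pretend disk driver: stores the whole page."""
    disk_log.append(page)
    return len(page)


def net_writer(page: bytes) -> int:
    """Pretend network driver: only reports a fixed 16-byte packet per page."""
    return 16


register_dump_method("disk", disk_writer)
register_dump_method("net", net_writer)

print(dump_to("disk", [bytes(4096)] * 2))  # 8192
print(dump_to("net", [bytes(4096)] * 2))   # 32
```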

Page 15: LKCD Linux Kernel Crash Dumps

Questions/Comments?