Post on 20-May-2015


Page 1: Introduction to windows kernel

Introduction to Windows Kernel

Sisimon Soman

Page 2: Introduction to windows kernel

How system call works

• User-mode code cannot enter kernel space directly with a jmp or call instruction.

• When a program makes a system call (through an API such as CreateFile or ReadFile), the OS enters kernel mode (Ring 0) using the instruction int 2E, a software interrupt that is routed through an interrupt gate.

• The code segment descriptor contains information about the ‘Ring’ at which the code can run. For kernel-mode modules it is always Ring 0. If a user-mode program tries to do ‘jmp <kernel mode address>’, it causes an access violation, because the processor’s protection checks require it to be in Ring 0.

• Because kernel mode is entered very frequently (most Windows API calls end up in kernel mode), newer processors provide sysenter, an optimized instruction for entering kernel mode.

Page 3: Introduction to windows kernel

Rings continued..

Page 4: Introduction to windows kernel

System Call continued..

• Windows maintains a system service dispatch table, which is similar to the IDT. Each entry in the system service table points to a kernel-mode system call routine.

• The int 2E handler probes and copies the parameters from the user-mode stack to the thread’s kernel-mode stack, then fetches and executes the correct system call routine from the system service table.

• There are multiple system service tables: one for the NT native APIs, and others for components such as IIS and GDI.
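
The table lookup described on this slide can be sketched as a toy model in Python. This is only an illustration, not kernel code: the names `SERVICE_TABLE`, `nt_write_file` and `system_service_dispatch` are hypothetical, and the service number 0x0E is borrowed from the NtWriteFile listing on the next slide.

```python
# Toy model of a system service dispatch table (illustrative only).
# Real tables hold kernel routine addresses and per-service argument sizes.

def nt_write_file(handle, data):
    """Hypothetical stand-in for a kernel-mode service routine."""
    return f"wrote {len(data)} bytes to handle {handle}"

# Hypothetical table: service number -> (routine, number of arguments).
SERVICE_TABLE = {
    0x0E: (nt_write_file, 2),
}

def system_service_dispatch(service_number, user_stack):
    """Model of the trap handler: validate, copy arguments, call the routine."""
    if service_number not in SERVICE_TABLE:
        raise ValueError("invalid system service number")
    routine, arg_count = SERVICE_TABLE[service_number]
    if len(user_stack) < arg_count:
        raise ValueError("not enough arguments on the user stack")
    # Copy the arguments so the routine never works on the caller's stack directly.
    kernel_stack = list(user_stack[:arg_count])
    return routine(*kernel_stack)
```

Calling `system_service_dispatch(0x0E, [4, b"abc"])` looks up entry 0x0E, copies two arguments, and runs the stand-in routine; an unknown service number is rejected before anything is called.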

Page 5: Introduction to windows kernel

Let’s try it in WinDbg..

NtWriteFile:
    mov eax, 0x0E   ; build 2195 system service number for NtWriteFile
    mov ebx, esp    ; point to parameters
    int 0x2E        ; execute system service trap
    ret 0x2C        ; pop parameters off stack and return to caller

Page 6: Introduction to windows kernel

[Diagram: path of an IO request. User Land: the app issues ReadFile, which calls NtReadFile. Kernel Land: the IO Manager creates an IRP and sends it to the driver stack (File System, Volume Manager, Disk Class Driver, Hardware Driver).]

Page 7: Introduction to windows kernel

What is IO Request Packet (IRP)

• An IO operation passes through,
– Different stages.
– Different threads.
– Different drivers.

• The IRP encapsulates the IO request.

• The IRP is thread independent.

Page 8: Introduction to windows kernel

IRP Continued..

• Compare an IRP with a Windows message (the MSG structure).

• Each driver in the stack does its own task and then forwards the IRP to the next lower driver in the stack.

• An IRP can be processed synchronously or asynchronously.

Page 9: Introduction to windows kernel

IRP Continued..

• Usually the lower-level hardware driver takes more time. The H/W driver can mark the IRP as pending and return.

• When the H/W finishes the IO, the H/W driver completes the IRP by calling IoCompleteRequest().

• IoCompleteRequest() calls the IO completion routines set by the drivers in the stack and completes the IO.
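
The forward-then-complete flow can be sketched as a small Python model. It is an illustration under stated assumptions, not the WDK API: `Irp`, `make_driver`, `complete_request` and `send_down_stack` are hypothetical names, and `complete_request` only mimics the ordering behaviour attributed to IoCompleteRequest() above.

```python
# Toy model of an IRP travelling down a driver stack and being completed.
# All names are hypothetical; real drivers use the WDK routines instead.

class Irp:
    def __init__(self):
        self.completion_routines = []  # registered by drivers on the way down
        self.status = "pending"
        self.log = []

def make_driver(name):
    """Return a dispatch function that logs its work and registers a completion routine."""
    def dispatch(irp):
        irp.log.append(f"{name}: dispatch")
        irp.completion_routines.append(
            lambda i: i.log.append(f"{name}: completion routine"))
    return dispatch

def complete_request(irp):
    """Mimics the completion step: run completion routines from the lowest driver up."""
    irp.status = "completed"
    while irp.completion_routines:
        irp.completion_routines.pop()(irp)

def send_down_stack(irp, drivers, hardware_name="Hardware Driver"):
    """Each driver does its part and forwards; the lowest driver completes the IRP."""
    for dispatch in drivers:
        dispatch(irp)
    irp.log.append(f"{hardware_name}: I/O finished, completing IRP")
    complete_request(irp)
```

Building the stack from "File System", "Volume Manager" and "Disk Class Driver" and passing an `Irp` to `send_down_stack` leaves a log in which the dispatch entries appear top-down and the completion routines appear in the reverse order.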

Page 10: Introduction to windows kernel

Structure of IRP

• Fixed IRP Header

• Variable Stack locations
– One sub stack per driver

[Diagram: an IRP laid out as the IRP Header followed by Stack Location 1, Stack Location 2, Stack Location 3, ... Stack Location N.]

Page 11: Introduction to windows kernel

Flow of IRP

[Diagram: IRP for the storage stack. The IRP Header is followed by Stack Locations 1 to 4, drawn next to the storage stack drivers: File System, Volume Manager, Disk Class Driver, Hardware Driver. Caption: Forward IRP to lower driver in the stack.]

Page 12: Introduction to windows kernel

Flow of IRP Completion

[Diagram: IRP for the storage stack during completion. The IRP Header is followed by Stack Locations 1 to 4, drawn next to the File System, Volume Manager and Disk Class Driver completion routines and the Hardware Driver, which completes the IRP. Caption: Call the completion routine while completing the IRP.]

Page 13: Introduction to windows kernel

IRP Header

• IO buffer information.

• Flags
– Paging IO flag
– No-caching IO flag

• IO status – set on completion to report that the IO has completed.

• IRP cancel routine

Page 14: Introduction to windows kernel

IRP Stack Location

• The IO Manager gets the number of drivers in the stack from the top device object in the stack.

• While creating an IRP, the IO Manager allocates as many IO stack locations as the device count recorded in the top device object.
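
A minimal Python sketch of this allocation rule follows. It assumes each device object records a count that is one more than the device below it, so the top object alone tells the IO Manager how many stack locations to allocate. `DeviceObject`, `attach_on_top`, `AllocatedIrp` and `allocate_irp` are hypothetical names chosen for the model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DeviceObject:
    name: str
    lower: Optional["DeviceObject"] = None
    stack_size: int = 1  # number of devices from this one down to the bottom

def attach_on_top(name, lower):
    """Create a new top device whose count is one more than the device below it."""
    return DeviceObject(name, lower, lower.stack_size + 1)

@dataclass
class AllocatedIrp:
    stack_locations: List[dict] = field(default_factory=list)

def allocate_irp(top_device):
    """Allocate one stack location per device, reading only the top device's count."""
    return AllocatedIrp([{} for _ in range(top_device.stack_size)])
```

Stacking a disk class driver, volume manager and file system on top of a hardware driver gives a top device with a count of 4, so `allocate_irp` returns an IRP with four stack locations.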

Page 15: Introduction to windows kernel

Contents of IO Stack Location

• IO Completion routine specific to the driver.

• File object specific to the request.

Page 16: Introduction to windows kernel

Software Interrupt Request Levels (IRQLs)

• Windows has its own interrupt priority scheme, known as IRQL.

• IRQLs range from 0 to 31; a higher number means a higher-priority interrupt level.

• The HAL maps hardware interrupts to IRQL 3 (Device 1) through IRQL 31 (High).

• When a higher-priority interrupt occurs, it masks all lower-priority interrupts and its ISR executes.

• After that ISR has executed, the kernel lowers the interrupt level and executes the ISRs of the pending lower-priority interrupts.

• An ISR should do minimal work and defer the major chunk of work to a Deferred Procedure Call (DPC), which runs at the lower IRQL 2.
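
The masking and deferral rules can be illustrated with a small single-processor model in Python. This is a simplified sketch, not how the kernel or the HAL is implemented: the `Processor` class, its methods and the interrupt names used with it are hypothetical, and setting `irql` directly stands in for code that has raised the level.

```python
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2

class Processor:
    def __init__(self):
        self.irql = PASSIVE_LEVEL
        self.pending = []    # masked interrupts waiting for the IRQL to drop: (irql, name)
        self.dpc_queue = []  # deferred work queued by ISRs
        self.log = []

    def interrupt(self, irql, name):
        """Deliver an interrupt; its ISR runs only if it is above the current IRQL."""
        if irql <= self.irql:
            self.pending.append((irql, name))  # masked until the IRQL drops
            return
        previous, self.irql = self.irql, irql
        self.log.append(f"ISR {name} at IRQL {irql}")
        self.dpc_queue.append(name)            # ISR does minimal work, defers the rest
        self.lower_irql(previous)

    def lower_irql(self, new_irql):
        """Lower the IRQL, then run whatever the old level was holding back."""
        self.irql = new_irql
        # Masked interrupts that are now above the current level, highest first.
        for entry in sorted(self.pending, reverse=True):
            if entry[0] > self.irql and entry in self.pending:
                self.pending.remove(entry)
                self.interrupt(*entry)
        # DPCs run at dispatch level once nothing above it is active.
        if self.irql < DISPATCH_LEVEL and self.dpc_queue:
            self.irql = DISPATCH_LEVEL
            while self.dpc_queue:
                self.log.append(f"DPC {self.dpc_queue.pop(0)} at IRQL {DISPATCH_LEVEL}")
            self.irql = new_irql
```

With the processor at IRQL 5, an interrupt at IRQL 3 is held back while one at IRQL 8 runs immediately; after `lower_irql(PASSIVE_LEVEL)` the held interrupt's ISR runs, and only then are the queued DPCs drained at IRQL 2.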

Page 17: Introduction to windows kernel

Software Interrupt Request Levels (IRQLs)

Page 18: Introduction to windows kernel

IRQL and DPC

• The DPC concept exists in other operating systems too; in Linux it is called the bottom half.

• DPC queues are per processor, so a dual-processor SMP box has two DPC queues.

• The ISR generally fetches data from the hardware and queues a DPC for further processing.

• IRQL priority is different from thread scheduling priority.

Page 19: Introduction to windows kernel

IRQL and DPC

• The scheduler (dispatcher) also runs at IRQL 2.

• So code that executes at or above IRQL 2 (dispatch level) cannot be preempted by the scheduler.

• As the diagram shows, only hardware interrupts and some higher-priority interrupts such as clock and power fail are above IRQL 2.

• Most of the time the OS is at IRQL 0 (passive level).

• All user programs and most kernel code execute at passive level only.

Page 20: Introduction to windows kernel

IRQL continued..

• The scheduler runs at IRQL 2, so what happens if my driver tries to wait at or above dispatch level?

• Simple: the system will crash with a ‘Blue Screen’, usually with the bug check ID IRQL_NOT_LESS_OR_EQUAL.

• Because if a thread waits at or above dispatch level, no one is there to come and switch the thread.

• What happens if code tries to access PagedPool at or above dispatch level?

• If the pages are on disk, a page fault exception happens; the current thread needs to wait while the page fault handler reads the pages from the page file into page frames in memory.

• If a page fault happens at or above dispatch level, no one is there to stop the current thread and schedule the page fault handler. Thus PagedPool cannot be accessed at or above dispatch level.
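
Both rules on this slide can be expressed as checks in a short Python model. It is a sketch for illustration only: `BugCheck`, `wait_for_object` and `access_memory` are hypothetical names, and the model reuses the bug check ID named above for both cases, which is an assumption rather than a statement about which code a real system reports.

```python
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2

class BugCheck(Exception):
    """Stands in for a system crash (blue screen) in this model."""

def wait_for_object(current_irql):
    """Blocking needs the scheduler, which cannot run at or above dispatch level."""
    if current_irql >= DISPATCH_LEVEL:
        raise BugCheck("IRQL_NOT_LESS_OR_EQUAL")
    return "thread blocked; scheduler runs another thread"

def access_memory(current_irql, paged, resident):
    """Touch memory; only a non-resident page in paged pool needs a page fault."""
    if not paged or resident:
        return "access succeeds"
    if current_irql >= DISPATCH_LEVEL:
        # The page fault handler would have to wait for disk IO, which is not possible here.
        raise BugCheck("IRQL_NOT_LESS_OR_EQUAL")
    return "page fault handled; page read from the page file"
```

The model also shows why such bugs can stay hidden: a paged-pool access at dispatch level succeeds whenever the page happens to be resident and fails only when it has been paged out.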

Page 21: Introduction to windows kernel

IRQL 1 - APCs

• Asynchronous Procedure Calls (APCs) run at IRQL 1.

• The main duty of an APC is to deliver data into the context of a user thread.

• The APC queue is thread specific; each thread has its own APC queue.

• A user-space thread initiates a read operation from a device and either waits for it to finish or continues with another job.

• The IO may finish some time later, and the buffer then needs to be delivered to the calling thread’s process context. That is the duty of the APC.
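
The per-thread APC queue can be sketched in Python as follows. This is a toy model under stated assumptions: `Thread`, `queue_completion_apc` and `deliver_apcs` are hypothetical names, and "delivery" here simply means running the queued function on behalf of the thread that asked for the IO.

```python
from collections import deque

class Thread:
    def __init__(self, name):
        self.name = name
        self.apc_queue = deque()  # each thread owns its own APC queue
        self.user_buffer = None

def queue_completion_apc(thread, data):
    """The IO finished in some other context: queue an APC to the requesting thread."""
    def apc():
        # Runs later, in the requesting thread's context.
        thread.user_buffer = data
    thread.apc_queue.append(apc)

def deliver_apcs(thread):
    """Drain the thread's queue when it next runs; return how many APCs ran."""
    delivered = 0
    while thread.apc_queue:
        thread.apc_queue.popleft()()
        delivered += 1
    return delivered
```

Queuing an APC for one thread leaves every other thread's queue untouched, and the requesting thread sees the data in its buffer only after `deliver_apcs` has run for it.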