cs 550 operating systems spring 2018huilu/slides/6-os-systemcall.pdf · •linux system call: int...
CS 550 Operating Systems, Spring 2018
System Call
Recap: The need for protection
When running user processes, the OS needs to protect itself and other system components
• For reliability: buggy programs
• For security: malicious user programs
Recap: The need for protection
• How can we provide this protection?
  • Treat operations that try to access or modify critical system resources as “privileged operations”
  • Allow only the OS kernel to perform the privileged operations
    • How?
Recap: Dual-mode operation
• Allows OS to protect itself and other system components
  • User mode and kernel mode
• Mode bit provided by hardware
  • Provides ability to distinguish when system is running user code or kernel code
  • Some instructions are designated as privileged (e.g., those that access or change system state or critical resources) and are only executable in kernel mode
  • If executed in user mode, the hardware raises an exception
• To perform privileged operations, a program must transition into the OS through well-defined interfaces
  • Interrupt handlers
  • System calls
CPU’s ‘fetch-execute’ cycle
How can external devices notify the CPU about certain events?
Interrupts
IP: Instruction Pointer (or Program Counter, PC)
CPU’s ‘fetch-execute’ cycle with interrupt
[Figure: flowchart. Fetch instruction at IP → advance IP to next instruction → decode the fetched instruction → execute the decoded instruction → IRQ? If no, loop back to fetch. If yes: save context → get INTR # → look up ISR → execute ISR → IRET, then resume the user program (ld, add, st, mul, ld, sub, bne, add, jmp, …) at IP.]
Interrupt hardware (legacy systems)
• I/O devices have (unique or shared) Interrupt Request Lines (IRQs)
• IRQs are mapped by special hardware to interrupt numbers, and passed to the CPU
  • This hardware is called a Programmable Interrupt Controller (PIC)
The Programmable Interrupt Controller (PIC)
• Responsible for telling the CPU when a specific external device wishes to ‘interrupt’
  • Needs to tell the CPU which one among several devices is the one needing service
• PIC translates IRQ to interrupt number
  • Raises interrupt to CPU
  • Interrupt # available in register
• Interrupts can have varying priorities
  • PIC also needs to prioritize multiple requests
• Possible to “mask” (disable) interrupts at PIC or CPU
Interrupt Descriptor Table
• The ‘entry-point’ to the interrupt-handler is located via the Interrupt Descriptor Table (IDT)
• Interrupt Service Routine = IDT[Interrupt number]
  • Also called interrupt handler
• IDT is in memory, initialized by OS at boot
• How to locate base of IDT?
  • CPU has a register, idtr, pointing to IDT, initialized by OS via the LIDT (x86) instruction at boot
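The lookup "Interrupt Service Routine = IDT[Interrupt number]" can be sketched in user-space C as a plain array of function pointers. This is only a simulation under stated assumptions: the names `idt`, `idt_init`, `dispatch_interrupt`, and the two handlers are hypothetical, and a real x86 IDT holds gate descriptors that the CPU finds through idtr, not bare pointers.

```c
#include <stddef.h>

#define IDT_ENTRIES 256   /* x86 defines 256 interrupt vectors */

typedef void (*isr_t)(void);

/* Simulated IDT: a real IDT holds gate descriptors, not bare pointers. */
isr_t idt[IDT_ENTRIES];

/* Hypothetical handlers that only record which vector ran. */
int last_vector = -1;
void isr_divide_error(void) { last_vector = 0; }
void isr_syscall(void)      { last_vector = 0x80; }

/* "Boot-time" setup: the OS fills in the table before enabling interrupts. */
void idt_init(void) {
    idt[0]    = isr_divide_error;
    idt[0x80] = isr_syscall;
}

/* What the hardware does conceptually: index the table and jump there.
 * Returns 0 on success, -1 if the vector is out of range or unset. */
int dispatch_interrupt(unsigned vector) {
    if (vector >= IDT_ENTRIES || idt[vector] == NULL)
        return -1;
    idt[vector]();
    return 0;
}
```

After `idt_init()`, calling `dispatch_interrupt(0x80)` runs `isr_syscall`, while an unset vector such as 3 is rejected in this sketch.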
Same interrupt mechanism used for other control transfers
• We’ve seen interrupts: raised externally by a device
• Traps (or exceptions): raised internally by the CPU
  • 0: divide-overflow fault
  • 3: breakpoint
  • 6: Undefined Opcode
  • 13: General Protection Exception
• System calls can be implemented this way too
  • Linux system call: int 80h
  • The “int” instruction generates a “software interrupt” or “trap”, causing the transition from user mode to kernel mode. 80h is the interrupt ID.
Dual-mode operation
• Allows OS to protect itself and other system components
  • User mode and kernel mode
• Mode bit provided by hardware
  • Provides ability to distinguish when system is running user code or kernel code
  • Some instructions are designated as privileged (e.g., those that access or change system state or critical resources) and are only executable in kernel mode
  • If executed in user mode, the hardware raises an exception
• To perform privileged operations, a program must transition into the OS through well-defined interfaces
  • Interrupt handlers
  • System calls
System calls
• A special type of “protected procedure call” that allows user-level processes to request services from the kernel.
• System calls provide:
  • An abstraction layer between processes and hardware, allowing the kernel to provide access control and arbitration
  • A virtualization of the underlying system
  • A well-defined interface for system services
System calls vs. Library functions
• What are the similarities and differences between system calls and library functions (e.g., libc functions)?
• Similarity
  • Both appear to be APIs that can be called by programs to obtain a given service
  • E.g., open
  • E.g., strlen
System calls vs. Library functions
• System calls make explicit requests to the kernel, and can only be initiated by a special trap instruction such as a software interrupt
• Each system call has a corresponding standard C library wrapper routine, which hides the details of system call entry/exit.
  • strlen() (<string.h>)? All in user space
  • open() (<fcntl.h>)? Calls sys_open()
  • printf() (<stdio.h>)? Calls write(), which calls sys_write()
  • sprintf() (<stdio.h>)? All in user space
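The difference between a wrapper routine and the underlying system call can be seen from user space. The sketch below assumes a Linux system with glibc, where `syscall()` and `SYS_write` are available; the function names `write_via_wrapper`, `write_via_raw_syscall`, and `my_strlen` are made up for illustration.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Go through the libc wrapper write(), which enters the kernel for us. */
long write_via_wrapper(int fd, const char *msg) {
    return (long)write(fd, msg, strlen(msg));
}

/* Request the same service by system call number, skipping write(). */
long write_via_raw_syscall(int fd, const char *msg) {
    return syscall(SYS_write, fd, msg, strlen(msg));
}

/* strlen-style work needs no kernel service: it only reads user memory. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

Both write functions should deliver the same bytes to the file descriptor; only the first hides the system call number behind a named routine. `my_strlen` never leaves user mode.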
Invoking system calls
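[Figure: control flow across the user/kernel boundary. In user mode (restricted privileges), the app making the system call calls xyz(); the wrapper routine xyz() in the standard C library executes int 0x80 (int 80h). In kernel mode (unrestricted privileges), the system call handler system_call calls the system call service routine sys_xyz() { … }, which returns to the handler; the handler executes iret to return to the wrapper, which returns to the app.]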
Invoking system calls: more details
• In user program
  • Call the library function that includes a system call
• In library function
  • Preparation work
  • Save the syscall number in %eax (x86)
  • Execute “int 80h” (Linux)
• Hardware: locate the system call trap handler using the interrupt ID 80h
• In trap handler:
  • Save user process context
  • Look up the intended system call in the “system call table”
• In system call:
  • Perform the requested service
  • Return to user mode by the “iret” instruction, which restores the user process context
Next: Syscall Wrapper Macros
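[Figure: path of printf(“hello world!\n”). User mode: the program calls printf(…) in libc, which sets %eax = sys_write # and executes int 0x80. Kernel mode: IDT entry 0x80 leads to system_call(), which looks up fn = syscalls[%eax] in the syscalls table and calls sys_write(…), which does the real work.]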
Designing the syscall interface
• Important to keep interface small, stable (for binary and backward compatibility)
• Early UNIXes had about 60 system calls, Linux 2.6 has about 300; Solaris more, Windows more still
• Aside: Windows does not publicly document syscalls and only documents library wrapper routines (unlike UNIX/Linux)
• Syscall numbers cannot be reused (!)
  • Why?
  • Deprecated syscalls are implemented by a special “not implemented” syscall (sys_ni)
The system-call jump-table (system call table)
• There are approximately 300 system-calls in Linux 2.6.
• Any specific system-call is selected by its ID-number (i.e., the system call number, which is placed into register %eax)
• An array of function-pointers is directly accessed (using the ID-number)
• This array is named ‘sys_call_table[]’ in Linux
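The indexed lookup described above can be sketched as follows. This is a user-space simulation, not kernel code: `sys_call_table_sim`, `system_call_sim`, `sys_add`, and `sys_double` are hypothetical names, and `sys_ni` here only mimics the role of a "not implemented" entry by returning `-ENOSYS`.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*syscall_fn)(long a0, long a1, long a2);

/* Hypothetical service routines; real ones would do privileged work. */
long sys_add(long a, long b, long c)    { (void)c; return a + b; }
long sys_double(long a, long b, long c) { (void)b; (void)c; return 2 * a; }

/* Placeholder for retired or unassigned numbers. */
long sys_ni(long a, long b, long c) {
    (void)a; (void)b; (void)c;
    return -ENOSYS;
}

/* The jump table: the system call number is the array index. */
syscall_fn sys_call_table_sim[] = { sys_add, sys_ni, sys_double };

#define NR_SYSCALLS_SIM \
    (sizeof sys_call_table_sim / sizeof sys_call_table_sim[0])

/* Handler: check the untrusted number, then make one indexed call. */
long system_call_sim(unsigned long nr, long a0, long a1, long a2) {
    if (nr >= NR_SYSCALLS_SIM)
        return -ENOSYS;
    return sys_call_table_sim[nr](a0, a1, a2);
}
```

The lookup costs one bounds check and one array access regardless of how many entries the table has. Slot 1 shows why numbers are not reused: a retired number keeps its position and simply points at the placeholder.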
The system-call jump-table – assembly language (.data)
number  abi     name   entry point
0       common  read   sys_read
1       common  write  sys_write
2       common  open   sys_open
3       common  close  sys_close
4       common  stat   sys_newstat
5       common  fstat  sys_newfstat
6       common  lstat  sys_newlstat
// …etc (arch/x86/entry/syscalls/syscall_64.tbl)
The ‘jump-table’ idea
[Figure: sys_call_table as an array of pointers, indexed by system call number, into routines in .section .text]
0  sys_restart_syscall
1  sys_exit
2  sys_fork
3  sys_read
4  sys_write
5  sys_open
6  sys_close
…etc…
Discussion
• Instead of using a system call table, can we use if-else tests or a switch statement to transfer to the service routine’s entry point?
  • Functionality-wise, yes.
  • But a chain of tests would be inefficient: the lookup cost grows with the number of system calls, while an indexed table lookup takes constant time.
  • System call invocations are synchronous, so a slow system call path degrades the performance of the calling program.
Syscall Naming Convention
• Usually a library function “foo()” will do some work and then call a system call (“sys_foo()”)
• In Linux, all system calls begin with “sys_”
• Often “sys_foo()” just does some simple error checking and then calls a worker function named “do_foo()”
Syscall return values
• Recall that library calls return -1 on error, and place a specific error code in the global variable errno
• System calls return specific negative values to indicate an error
  • On x86, the return value is put into %eax, so that the library wrapper function can access it.
• The library wrapper function is responsible for conforming the return values to the errno convention
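The conversion a wrapper performs can be sketched in a few lines. The function name `syscall_ret_to_errno` is hypothetical, and the range check assumes the Linux convention that raw return values from -4095 to -1 are error codes; other systems may signal errors differently.

```c
#include <errno.h>

/* Turn a raw kernel-style return value into the libc convention:
 * a negative error code becomes -1 with errno set to the positive code;
 * any other value is passed through unchanged.
 * Assumption: error codes occupy -4095..-1, as on Linux. */
long syscall_ret_to_errno(long raw) {
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

For example, a raw result of `-ENOENT` would reach the caller as -1 with `errno == ENOENT`, while a successful result such as a byte count is returned as-is and errno is left alone.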
System call argument passing
Three general methods used to pass arguments to the OS:
• Method 1: pass the arguments in registers (simplest)
  • Any drawbacks?
• Method 2: arguments are placed, or pushed, onto the stack by the program and popped off the stack by the OS kernel code (i.e., the syscall implementation)
• Method 3: arguments are stored in a block, or table, in memory, and the address of the block is passed as a parameter in a register
  • This approach is taken by Solaris, and by Linux when a call has more parameters than fit in registers
• Which method does xv6 use?
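Methods 1 and 3 can be contrasted with a small user-space simulation. Everything here is illustrative: `struct regs_sim` stands in for the registers saved at trap time, `struct add3_args` is a made-up parameter block, and the two `sys_add3_*` routines are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Simulated register contents at the moment of the trap. */
struct regs_sim { intptr_t r0, r1, r2; };

/* Method 3 parameter block, laid out by the caller in its own memory. */
struct add3_args { long x, y, z; };

/* Method 1: each argument is read straight from a saved register.
 * The number of registers limits the number of arguments. */
long sys_add3_regs(const struct regs_sim *r) {
    return (long)(r->r0 + r->r1 + r->r2);
}

/* Method 3: one register carries the address of the block. The block is
 * copied into kernel-owned storage before use so the caller cannot change
 * it mid-call. A real kernel would validate the address first. */
long sys_add3_block(const struct regs_sim *r) {
    struct add3_args k;
    memcpy(&k, (const void *)r->r0, sizeof k);
    return k.x + k.y + k.z;
}
```

With registers holding 1, 2, and 3, `sys_add3_regs` sums them directly. With `r0` holding the address of a block `{10, 20, 30}`, `sys_add3_block` reaches the same kind of result through one extra copy, and the block can grow without needing more registers.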
Discussion
To a programmer, a system call looks like any other call to a library function. Is it important that a programmer knows which library procedures result in system calls? Under what circumstances and why?
• As far as program logic is concerned, it does not matter whether a call to a library procedure results in a system call. But if performance is an issue, if a task can be accomplished without a system call the program will run faster.
• Every system call involves overhead time in switching from the user context to the kernel context.
• Furthermore, on a multiuser system the operating system may schedule another process to run when a system call completes, further slowing the progress in real time of a calling process.
Library calls are much faster than system calls. If you can do it in user space, you should.
Discussion
Consider a hypothetical system call, zeroFill, which fills a user buffer with zeroes:
zeroFill(char* buffer, int bufferSize);
The following kernel implementation of zeroFill contains a security vulnerability. What is the vulnerability, and how would you fix it?
void sys_zeroFill(char* buffer, int bufferSize) {
    for (int i = 0; i < bufferSize; i++) {
        buffer[i] = 0;
    }
}
Discussion
• The user buffer pointer is untrusted, and could point anywhere. In particular, it could point inside the kernel address space. This could lead to a system crash or security breakdown.
• Fix: verify the pointer is a valid user address
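One way to express the fix is to check the whole range before touching it. The sketch below is a user-space simulation under stated assumptions: `user_base_sim`, `user_top_sim`, `set_user_range_sim`, `user_range_ok`, and `sys_zeroFill_checked` are hypothetical names, and the "user address space" is whatever range the caller registers. A real kernel knows the range from its memory layout and must also cope with unmapped pages inside it.

```c
#include <stddef.h>
#include <stdint.h>

uintptr_t user_base_sim;  /* lowest valid user address (inclusive) */
uintptr_t user_top_sim;   /* one past the highest valid user address */

/* Simulation helper: declare which addresses count as user space. */
void set_user_range_sim(void *base, size_t len) {
    user_base_sim = (uintptr_t)base;
    user_top_sim  = (uintptr_t)base + len;
}

/* Return 1 if [addr, addr + size) lies entirely in user space, else 0.
 * Comparing size against the remaining room avoids overflow in addr + size. */
int user_range_ok(uintptr_t addr, size_t size) {
    if (addr < user_base_sim || addr > user_top_sim)
        return 0;
    return size <= user_top_sim - addr;
}

/* Checked variant of sys_zeroFill: 0 on success, -1 on a bad argument. */
int sys_zeroFill_checked(char *buffer, int bufferSize) {
    if (bufferSize < 0 ||
        !user_range_ok((uintptr_t)buffer, (size_t)bufferSize))
        return -1;
    for (int i = 0; i < bufferSize; i++)
        buffer[i] = 0;
    return 0;
}
```

Checking only the starting pointer is not enough: a valid start with a large `bufferSize` could still run past the end of user space, which is why the sketch validates the end of the range and rejects negative sizes as well.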
Discussion
• Is it a security risk to execute the zeroFill function in user-mode?
void zeroFill(char* buffer, int bufferSize) {
    for (int i = 0; i < bufferSize; i++) {
        buffer[i] = 0;
    }
}
Discussion
• No. User-mode code does not have permission to access the kernel’s address space. If it tries, the hardware raises an exception, which is safely handled by the OS
• More generally, no user mode code should ever be a security vulnerability.
  • Unless the OS has a bug…
Assignment
• Read the xv6 code/books about system call implementation
• In Assignment 1, you will be implementing your own system calls.
• It’s already late if you have not started!
Midterm 1
• 2/26, in class
• Coverage: Processes, IPC, and system calls.