![Page 1: Corey: An Operating System For Many Corescourses.cs.vt.edu/cs5204/fall14-butt/lectures/corey.pdf · Corey: An Operating System For Many Cores Silas Boyd-Wickizer˚ Haibo Chen‡ Rong](https://reader034.vdocuments.mx/reader034/viewer/2022050423/5f92145fc0f0763edc28160a/html5/thumbnails/1.jpg)
Corey: An Operating System For Many Cores
Silas Boyd-Wickizer*, Haibo Chen‡, Rong Chen‡, Yandong Mao‡, Frans Kaashoek*, Robert Morris*, Aleksey Pesterev*,
Lex Stein§, Ming Wu§, Yuehua Dai†, Yang Zhang*, Zheng Zhang§
*MIT, ‡Fudan University
§Microsoft Research Asia, †Xi'an Jiaotong University
OSDI’08
1
Background, Motivation
• what is an operating system?
  • a layer between applications and hardware
  • a resource and service provider
  • it tries to generalize over all possible workloads
• observation: as the number of cores grows, some systems suffer from unnecessary resource sharing
• applications know better what they need, so return the power to the applications
2
Virtual Address Space
• Each process sees its memory address space as linear, but the physical pages backing it need not be contiguous
• Each process has its own page table
• The kernel has its own page table in kernel space
• Translation Lookaside Buffer (TLB)
3
source: http://en.wikipedia.org/wiki/File:Virtual_memory.svg
Traditional Address Space Management
4
PROBLEM #1
file descriptor duplication
• unnecessary shared resource contention
  • shared data structures
  • locks (cache miss cost)
5
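[Figure: call graph of the benchmark. main() runs a loop, executed by many threads, that duplicates a file descriptor and calls close(). sys_dup() calls fget(), which uses rcu_read_lock(), and dupfd(), which takes spin_lock() and then calls fd_install(), which takes spin_lock() again.]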
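The loop in the figure can be approximated in user space. The following is a minimal sketch, assuming POSIX threads and a pipe as the descriptor to duplicate; `run_dup_close`, `dup_close_worker`, and `MAX_THREADS` are illustrative names, not taken from the paper or the slides.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_THREADS 64  /* illustrative upper bound, not from the paper */

/* Per-thread argument: the descriptor to duplicate and the iteration count. */
struct worker_arg {
    int fd;     /* descriptor each thread duplicates */
    int iters;  /* dup()/close() pairs to execute */
    long ok;    /* pairs that succeeded, written by the worker */
};

/* Each thread repeatedly duplicates fd and closes the copy. All threads of a
 * process share one descriptor table, so every call touches shared state. */
static void *dup_close_worker(void *p)
{
    struct worker_arg *a = p;

    for (int i = 0; i < a->iters; i++) {
        int copy = dup(a->fd);
        if (copy < 0)
            continue;
        if (close(copy) == 0)
            a->ok++;
    }
    return NULL;
}

/* Hypothetical driver: runs nthreads workers and returns the number of
 * successful dup()/close() pairs, or -1 on bad arguments or setup failure. */
long run_dup_close(int nthreads, int iters)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg[MAX_THREADS];
    int pipefd[2];
    int started = 0;
    long total = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS || iters < 0)
        return -1;
    if (pipe(pipefd) != 0)
        return -1;

    for (int i = 0; i < nthreads; i++) {
        arg[i].fd = pipefd[0];
        arg[i].iters = iters;
        arg[i].ok = 0;
        if (pthread_create(&tid[i], NULL, dup_close_worker, &arg[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
        total += arg[i].ok;
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return started == nthreads ? total : -1;
}
```

Timing `run_dup_close` for a growing thread count is one way to observe the contention on the shared descriptor table; the sketch itself only counts successful iterations.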
sys_dup()
asmlinkage long sys_dup(unsigned int fildes)
{
    int ret = -EBADF;
    struct file *file = fget(fildes);

    if (file)
        ret = dupfd(file, 0, 0);
    return ret;
}
7
fget()
look up fd in fd_table
struct file *fget(unsigned int fd)
{
    struct file *file;
    struct files_struct *files = current->files;

    rcu_read_lock(); // RCU read-side critical section: readers take no lock and write no shared lock word
    file = fcheck_files(files, fd);
    if (file) {
        if (!atomic_inc_not_zero(&file->f_count)) {
            /* File object ref couldn't be taken */
            rcu_read_unlock();
            return NULL;
        }
    }
    rcu_read_unlock();

    return file;
}
8
dupfd()
duplicate given file descriptor
static int dupfd(struct file *file, unsigned int start, int cloexec)
{
    struct files_struct *files = current->files;
    struct fdtable *fdt;
    int fd;

    spin_lock(&files->file_lock);
    fd = locate_fd(files, file, start);
    if (fd >= 0) {
        /* locate_fd() may have expanded fdtable, load the ptr */
        fdt = files_fdtable(files);
        FD_SET(fd, fdt->open_fds);
        if (cloexec)
            FD_SET(fd, fdt->close_on_exec);
        else
            FD_CLR(fd, fdt->close_on_exec);
        spin_unlock(&files->file_lock);

        fd_install(fd, file); // write to fd array
    } else {
        spin_unlock(&files->file_lock);
        fput(file);
    }

    return fd;
}
9
fd_install()
where things really went wrong
void fd_install(unsigned int fd, struct file *file)
{
    struct files_struct *files = current->files;
    struct fdtable *fdt;

    spin_lock(&files->file_lock);
    fdt = files_fdtable(files);
    BUG_ON(fdt->fd[fd] != NULL);
    rcu_assign_pointer(fdt->fd[fd], file);
    spin_unlock(&files->file_lock);
}
// recall that fd_install is called by every thread
10
PROBLEM #2
cache misses are expensive, and lock contention makes them worse
• unnecessary shared resource contention
  • shared data structures
  • locks (cache miss cost)
11
[Figure from the original paper; surviving labels: 255, 3]
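LOCKS
spin lock: all waiters spin on one global variable; a cache miss happens whenever a thread releases or acquires the lock, so cache-coherence traffic is high
Test-and-Set (TAS) lock: acquires with an atomic test-and-set on the lock word; the test-and-test-and-set variant spins on a locally cached copy, which is better than a naive spin lock, but there is no fairness guarantee
MCS lock: each waiter spins on its own local variable; waiters are linked atomically into a FIFO queue; when a thread releases the lock, it hands ownership over to the next node in the queue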
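As a minimal sketch of the test-and-set idea, here is a test-and-test-and-set lock written with C11 atomics. The `ttas_*` names are illustrative and not from the slides or from Linux; the point is only that waiters read the lock word from their own cache and issue the atomic exchange when the lock looks free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative test-and-test-and-set lock: one shared flag. */
typedef struct {
    atomic_bool locked;
} ttas_lock;

void ttas_init(ttas_lock *l)
{
    atomic_init(&l->locked, false);
}

/* One acquisition attempt; returns true if the caller now holds the lock. */
bool ttas_try_acquire(ttas_lock *l)
{
    /* Test: a plain load, served from the local cache while the lock is held. */
    if (atomic_load_explicit(&l->locked, memory_order_relaxed))
        return false;
    /* Test-and-set: atomic exchange that returns the previous value. */
    return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/* Spin until the lock is acquired. No ordering among waiters is enforced,
 * so a waiter can be overtaken repeatedly (no fairness guarantee). */
void ttas_acquire(ttas_lock *l)
{
    while (!ttas_try_acquire(l))
        ;
}

void ttas_release(ttas_lock *l)
{
    atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

Every release still invalidates the cached copy on all waiting cores at once, which is the traffic a queue-based lock is designed to avoid.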
12
MCS LOCK (QUEUE)
13
[Figure: MCS lock queue; node flags False, True, False, False; labels NULL and tail_of_queue]
MCS LOCK (QUEUE)
14
[Figure: MCS lock queue with per-node flags (True/False) and labels NULL and tail_of_queue]
a thread releases the lock:
    myLock->prev->acquire = true
a thread attempts to acquire the lock:
    myLock->prev = tail
    myLock->next = tail->next
    myLock->acquire = false
    tail->next = myLock
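The queue operations above can be sketched with C11 atomics. This is a hedged sketch of the standard singly linked MCS formulation (each node holds a `next` pointer and a flag that only its owner spins on), not a transcription of the prev/next naming on the slide; all `mcs_*` identifiers are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiting thread; the waiter spins only on its own flag. */
struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool granted;
};

/* The lock itself is just a pointer to the tail of the waiter queue. */
struct mcs_lock {
    _Atomic(struct mcs_node *) tail;
};

void mcs_init(struct mcs_lock *l)
{
    atomic_init(&l->tail, NULL);
}

void mcs_acquire(struct mcs_lock *l, struct mcs_node *me)
{
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->granted, false, memory_order_relaxed);

    /* Atomically append this node to the queue. */
    struct mcs_node *prev =
        atomic_exchange_explicit(&l->tail, me, memory_order_acq_rel);
    if (prev == NULL)
        return; /* queue was empty: the lock is ours */

    /* Link behind the predecessor, then spin on our own flag only. */
    atomic_store_explicit(&prev->next, me, memory_order_release);
    while (!atomic_load_explicit(&me->granted, memory_order_acquire))
        ;
}

void mcs_release(struct mcs_lock *l, struct mcs_node *me)
{
    struct mcs_node *succ =
        atomic_load_explicit(&me->next, memory_order_acquire);

    if (succ == NULL) {
        /* No visible successor: try to swing the tail back to empty. */
        struct mcs_node *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                &l->tail, &expected, NULL,
                memory_order_acq_rel, memory_order_acquire))
            return;
        /* A successor is enqueueing; wait until it links itself in. */
        while ((succ = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Hand ownership to the next node in FIFO order. */
    atomic_store_explicit(&succ->granted, true, memory_order_release);
}
```

Because a release writes only the successor's flag, one cache line moves between two cores per handoff instead of every waiter missing on a shared lock word.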
Corey
• Inspired by Exokernel: protect but do not manage system resources
• 3 abstractions
  • address ranges
  • kernel cores
  • shares
15
Address Range
Why Not Both
• An abstraction that corresponds to a range of virtual-to-physical mappings.
• An address range is either:
  - private (default): only the owner core is able to access it
  - shared: sharing is assigned by the application
• avoids contention
• created with ar_alloc()
16
Address Range Evaluation
• private memory access: memclone
  • each core allocates 100MB from its own DRAM pool
  • new cores are allocated in round-robin order
• shared memory access: mempass
  • one core allocates 100MB
  • each core accesses every page
17
Memclone
18
Memclone
19
• Linux single memory
• shared memory access: mempass
  • one core allocates 100MB
  • each core accesses every page
Mempass
20
Kernel Core
A Hint Of Resource Isolation
• an abstraction that dedicates a core to kernel functions and data
• a kernel core can manage hardware devices and execute system calls
• kernel cores communicate with other cores via IPC
• increases scalability by avoiding cache misses
• less TLB invalidation (the TLB is flushed on a context switch between address spaces)
21
Kernel Core Evaluation
• simple TCP service that accepts connections and responds with a 128-byte message to each connection before closing it
• 2 modes
  • dedicated: one kernel core handles everything except for computation
  • polling: a kernel core only polls for received-packet notifications and transmit completions
• In both cases, every other core runs a private TCP/IP service with a private TCP/IP stack
22
Kernel Core Evaluation
Dedicated Mode
23
[Figure: the NIC is attached to a dedicated core, which exchanges buffers via IPC with Service_0, Service_1, and Service_2. The service cores do the computation; the dedicated core does the handling.]
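One common way to exchange buffers between exactly two cores is a single-producer, single-consumer ring in shared memory. The sketch below is an assumption about how such an exchange could look, not Corey's actual IPC mechanism; `pkt_ring`, `ring_push`, `ring_pop`, and `RING_SLOTS` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8 /* hypothetical capacity */

/* Single-producer, single-consumer ring of buffer pointers.
 * head is written only by the consumer, tail only by the producer. */
struct pkt_ring {
    void *slot[RING_SLOTS];
    atomic_size_t head; /* next slot to read */
    atomic_size_t tail; /* next slot to write */
};

void ring_init(struct pkt_ring *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Producer side (for example, the dedicated core). Returns false if full. */
bool ring_push(struct pkt_ring *r, void *buf)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SLOTS)
        return false;
    r->slot[tail % RING_SLOTS] = buf;
    /* Publish the slot contents before advancing tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

/* Consumer side (for example, a service core). Returns NULL if empty. */
void *ring_pop(struct pkt_ring *r)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail)
        return NULL;
    void *buf = r->slot[head % RING_SLOTS];
    /* Release the slot back to the producer. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return buf;
}
```

With one ring per service core, no lock is shared among the service cores; each ring is touched by only two cores.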
Kernel Core Evaluation
Polling Mode
24
[Figure: NIC, polling kernel core, and Service_0, Service_1, Service_2. Edge labels: receive packet notification & transmission completion; DMA buffer transfer; receive packets notification.]
Kernel Core Evaluation
Throughput
25
each core is able to handle more connections per second in the Dedicated configuration than in Polling
Kernel Core Evaluation
L3 Cache Miss
26
Shares
An Explicit Way Of Avoiding Cache Misses
• a bookkeeping mechanism
• conceptually similar to encapsulation in OOP
• applications can specify whether a kernel object is shared among cores
27
SHARES EVALUATION
add/remove per-core segment to global/local share
28
modifying a global data structure causes contention
SHARES EVALUATION
add/remove per-core segment to global/local share
29
APPLICATION
• MapReduce
  • framework: Metis
  • bound to a core by calling sched_setaffinity()
  • benefits mostly from address ranges
• webd
  • benefits mostly from kernel cores
30
Mapreduce Evaluation
• word inverted index
• 1GB input, 2GB for intermediate values
• reducers copy intermediate values to the shared address space
31
Mapreduce Evaluation
32
Mapreduce Evaluation (Cont.)
33
Linux’s soft page fault handler is about 10% faster than Corey’s when there is no contention
Webd Evaluation
• 8 webd cores, 8 application cores
• FileSum: returns the sum of the bytes of a given file
• 2 modes
  • random: webd is allowed to pass a request to any application core
  • locality: each webd passes requests only to a certain application core
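The FileSum computation itself is small. A minimal sketch in standard C, assuming "sum of the bytes" means the sum of the unsigned byte values in the file; `file_sum` is an illustrative name and not webd's actual code.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the sum of all byte values in the file at `path`,
 * or -1 if the file cannot be opened or a read error occurs. */
long long file_sum(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    unsigned char buf[4096];
    long long sum = 0;
    size_t n;

    /* Read the file in fixed-size chunks and accumulate every byte. */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        for (size_t i = 0; i < n; i++)
            sum += buf[i];

    int err = ferror(f);
    fclose(f);
    return err ? -1 : sum;
}
```

The work is dominated by reading the file's bytes, so throughput depends on whether the file is already in the cache of the core that runs the request.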
34
Webd Evaluation (Cont.)
35
When the file is small, both modes are limited by webd’s network stack.
When the file is big, locality mode has (some) advantage of being able to cache bigger files (L3 cache = 2MB).
Discussion
36
Baumann, Andrew, et al. "The multikernel: a new OS architecture for scalable multicore systems." Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles. ACM, 2009.
Han, Sangjin, et al. "PacketShader: a GPU-accelerated software router." ACM SIGCOMM Computer Communication Review 41.4 (2011): 195-206.
Boyd-Wickizer, Silas, et al. "An analysis of Linux scalability to many cores." (2010).
Linux still has hope
Discussion (Cont.)
37
Yoo, Richard M., Anthony Romano, and Christos Kozyrakis. "Phoenix rebirth: Scalable MapReduce on a large-scale shared-memory system." IEEE International Symposium on Workload Characterization (IISWC 2009). IEEE, 2009.
Soares, Livio, and Michael Stumm. "FlexSC: Flexible system call scheduling with exception-less system calls." Proceedings of the 9th USENIX Conference on Operating Systems Design and Implementation. USENIX Association, 2010.
THE END
38
SUPPLEMENT SLIDE
39
source: http://upload.wikimedia.org/wikipedia/commons/thumb/3/32/Virtual_address_space_and_physical_address_space_relationship.svg/773px-Virtual_address_space_and_physical_address_space_relationship.svg.png
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d6/AMD_K10_Arch.svg/2000px-AMD_K10_Arch.svg.png
“The cache-management logic for the L3 cache is unique. When an item is loaded from L3 cache into a core’s L1 cache (the L2 cache is always by-passed), the item is sometimes removed from the L3 cache and sometimes not. The determining factor is whether other cores are still accessing the item. If so, it’s not removed from L3 and a copy of the data is loaded into L1. If no other cores are accessing the data item, then it is removed from the L3 cache”
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d6/AMD_K10_Arch.svg/2000px-AMD_K10_Arch.svg.png
SYS_CLOSE
CLOSE FILE POINTED TO BY FD
asmlinkage long sys_close(unsigned int fd)
{
    //... initialize

    spin_lock(&files->file_lock);
    fdt = files_fdtable(files);

    //magic that releases the file descriptor
    spin_unlock(&files->file_lock);
    retval = filp_close(filp, files); // more locks in filp_close()

    //magic that checks error, in case of error, go to out_unlock
    return retval;

out_unlock:
    spin_unlock(&files->file_
    return -EBADF;
}

#define files_fdtable(files) (rcu_dereference((files)->fdt))
43
mm_struct
How Processes Use Page Table
struct mm_struct {
    int count;
    pgd_t *pgd; //page global directory
    unsigned long context;
    unsigned long start_code, end_code, start_data, end_data;
    unsigned long start_brk, brk, start_stack, start_mmap;
    unsigned long arg_start, arg_end, env_start, env_end;
    unsigned long rss, total_vm, locked_vm;
    unsigned long def_flags;
    struct vm_area_struct *mmap;
    struct vm_area_struct *mmap_avl;
    struct semaphore mmap_sem;
};
44
mm_struct (cont.)
45
Page Global Directory
Page Middle Directory
Page Table Entry
user data in physical memory
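The three levels named above can be illustrated with a toy walk. This is a sketch under simplifying assumptions: each level has only 4 entries and pages are 4KB, which is far smaller than real, architecture-specific page tables; `toy_walk` and the `*_table` types are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 4     /* toy table size, not a real architecture's */
#define PAGE_SHIFT 12 /* 4KB pages */

struct pte_table { uint64_t frame[ENTRIES]; uint8_t present[ENTRIES]; };
struct pmd_table { struct pte_table *pte[ENTRIES]; };
struct pgd_table { struct pmd_table *pmd[ENTRIES]; };

/* Toy virtual address layout: | pgd:2 | pmd:2 | pte:2 | offset:12 |
 * Walks PGD -> PMD -> PTE and stores the physical address in *pa.
 * Returns 0 on success, or -1 if any level is missing (a page fault). */
int toy_walk(const struct pgd_table *pgd, uint32_t va, uint64_t *pa)
{
    unsigned gi = (va >> (PAGE_SHIFT + 4)) & 3; /* page global directory index */
    unsigned mi = (va >> (PAGE_SHIFT + 2)) & 3; /* page middle directory index */
    unsigned ti = (va >> PAGE_SHIFT) & 3;       /* page table entry index */

    const struct pmd_table *pmd = pgd->pmd[gi];
    if (!pmd)
        return -1;
    const struct pte_table *pt = pmd->pte[mi];
    if (!pt || !pt->present[ti])
        return -1;

    /* Physical address = frame number plus the offset within the page. */
    *pa = (pt->frame[ti] << PAGE_SHIFT) | (va & ((1u << PAGE_SHIFT) - 1));
    return 0;
}
```

The hardware caches the result of such a walk in the TLB, which is why updates to a shared page table can force TLB invalidations on other cores.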
source: http://pdos.csail.mit.edu/~sbw/corey/osdi-12-08-2008.pdf
NOTES
• this presentation covers Boyd-Wickizer, Silas, et al. "Corey: An Operating System for Many Cores." OSDI. Vol. 8. 2008.
• graphs used in slides 4, 5, 11, 16, 18, 20, 25, 26, 28, 29, 32, 33, 35 are from the original paper
• source code used in slides 7, 8, 9, 10, 43, 44 is from the Linux kernel 2.6.25 source code at http://www.cs.fsu.edu/~baker/devices/lxr/http/search?v=2.6.25
• these slides were created and presented by Jin, Yilong for CS5204 Fall 2014 on Oct. 16th 2014