Don't shoot down TLB shootdowns! - EuroSys 2020 · April 29, 2020
©2019 VMware, Inc.
Don't shoot down TLB shootdowns!
Nadav Amit, Amy Tai, Michael Wei
April 2020
Translation Lookaside Buffer (TLB)
TLB = cache for virtual-to-physical address translations
[Figure: a virtual address is translated (VA→PA) either by the TLB or by the page tables (PGD, PUD, PMD, PTE)]
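To make the page-table levels on this slide concrete, here is a minimal sketch of how a virtual address could be split into the indices used by a walk through PGD, PUD, PMD and PTE. It assumes x86-64 with 4-level paging, 4 KiB pages and 9 index bits per level; the constant and function names are illustrative, not taken from the talk.

```python
# Sketch: split an x86-64 virtual address into the indices used by a
# 4-level page-table walk (PGD -> PUD -> PMD -> PTE) with 4 KiB pages.
# Assumption: 48-bit virtual addresses, 9 index bits per level, 12 offset bits.

PAGE_SHIFT = 12          # 4 KiB pages
BITS_PER_LEVEL = 9       # 512 entries per table
LEVELS = ("pgd", "pud", "pmd", "pte")  # walked top-down


def split_virtual_address(va: int) -> dict:
    """Return the per-level table indices and the page offset for `va`."""
    parts = {"offset": va & ((1 << PAGE_SHIFT) - 1)}
    # The PTE index sits just above the page offset; each higher level
    # uses the next 9 bits.
    for depth, name in enumerate(reversed(LEVELS)):
        shift = PAGE_SHIFT + depth * BITS_PER_LEVEL
        parts[name] = (va >> shift) & ((1 << BITS_PER_LEVEL) - 1)
    return parts


if __name__ == "__main__":
    print(split_virtual_address(0x7F12_3456_7ABC))
```

A TLB hit skips all four lookups, which is why a cached translation is so much cheaper than a walk.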
TLB Coherency
Hardware does not keep TLBs coherent
The problem is left to software (the OS)
[Figure: PTEs change over time (VA→PA, VA→PA′, VA→PA″) while the TLB still holds an older translation and becomes incoherent]
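The incoherence described on this slide can be reproduced with a toy model. The sketch below is only an illustration: the `Core` class, the page numbers and the frame numbers are all made up, and a Python dictionary stands in for both the page table and the TLB.

```python
class Core:
    """One CPU core with a private TLB in front of shared page tables."""

    def __init__(self, page_table: dict):
        self.page_table = page_table   # shared: virtual page -> physical frame
        self.tlb = {}                  # private cache of translations

    def translate(self, vpn: int) -> int:
        if vpn not in self.tlb:                 # TLB miss: walk the page table
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn]                    # TLB hit: may be stale

    def flush(self, vpn: int) -> None:
        self.tlb.pop(vpn, None)                 # drop the cached translation


page_table = {0x10: 0xA}                 # VA page 0x10 -> PA frame 0xA
core0, core1 = Core(page_table), Core(page_table)
core0.translate(0x10)
core1.translate(0x10)                    # both cores now cache VA->PA

page_table[0x10] = 0xB                   # the OS remaps the page: VA->PA'
stale = core1.translate(0x10)            # still the old frame: incoherent
core1.flush(0x10)                        # software must invalidate the entry
fresh = core1.translate(0x10)            # refilled from the page table
```

Nothing in the model updates `core1.tlb` when `page_table` changes, which mirrors the slide's point: the flush has to be requested explicitly by software.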
TLB Shootdown (in Linux)
[Figure: timelines of the initiator and the responder during a shootdown]
Challenge
TLB shootdowns are expensive.
How can we further optimize them?
This work focuses on:
• Linux/x86 – common lessons
• Userspace mappings – common case
Lessons are relevant to other environments
Existing Solutions
Hardware-based TLB invalidations
• Not available on all architectures
• Do not coexist (yet) with software techniques:
  – No selective target cores for TLB invalidation
Software solutions
• Replicating page-tables [RadixVM, Clements’13]
  – Can increase overhead with low-latency IPIs
• Aggressive batching [LATR, Kumar’18]
  – Breaks POSIX semantics
TLB Flushes in Linux and FreeBSD
[Figure: timelines of the initiator and the responder, with a busy-wait phase]
Optimization 1: Concurrent Flushes (forgotten lesson)
RP3 TLB consistency algorithm [Rosenburg’89]
[Figure: timelines of the initiator and the responder]
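A rough way to see why concurrent flushes help is a back-of-the-envelope latency model. The sketch below assumes that the baseline initiator flushes its own TLB first, then sends the IPI and busy-waits, while the concurrent variant sends the IPI first and flushes locally during the wait. That ordering is an interpretation of the slides, and every cycle count is invented purely for illustration.

```python
def baseline_latency(local_flush, ipi_delivery, remote_flush, ack):
    """Initiator flushes locally, then sends the IPI and busy-waits."""
    return local_flush + ipi_delivery + remote_flush + ack


def concurrent_latency(local_flush, ipi_delivery, remote_flush, ack):
    """Initiator sends the IPI first and flushes locally while waiting."""
    remote_path = ipi_delivery + remote_flush + ack
    return max(local_flush, remote_path)


# Hypothetical costs in cycles; these are not measurements from the talk.
costs = dict(local_flush=500, ipi_delivery=1500, remote_flush=700, ack=300)

base = baseline_latency(**costs)
concurrent = concurrent_latency(**costs)
```

In this model the local flush disappears from the critical path whenever it is shorter than the remote round trip, so the saving is bounded by the cost of the local flush.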
TLB Shootdown Responder
[Figure: responder stages Entry, SMP and TLB, annotated with Page Table Isolation]
Optimization 2: Cacheline Consolidation
[Figure: responder stages Entry, SMP and TLB, and the SMP info and TLB flush info they read from memory]
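The idea behind the slide title can be sketched by counting how many cachelines the responder has to fetch. The layouts below are hypothetical: the field names only loosely resemble what a kernel might store, the addresses are invented, and a 64-byte cacheline is assumed.

```python
import ctypes

CACHELINE = 64  # bytes; typical on x86, assumed here


class SmpInfo(ctypes.Structure):
    """Hypothetical per-request SMP call data (not the kernel's real layout)."""
    _fields_ = [("func", ctypes.c_uint64),
                ("info", ctypes.c_uint64),
                ("flags", ctypes.c_uint32)]


class TlbFlushInfo(ctypes.Structure):
    """Hypothetical description of what the responder must flush."""
    _fields_ = [("mm", ctypes.c_uint64),
                ("start", ctypes.c_uint64),
                ("end", ctypes.c_uint64),
                ("new_tlb_gen", ctypes.c_uint64),
                ("stride_shift", ctypes.c_uint32),
                ("freed_tables", ctypes.c_bool)]


class Consolidated(ctypes.Structure):
    """Both pieces laid out back to back so one line fetch brings in both."""
    _fields_ = [("smp", SmpInfo), ("flush", TlbFlushInfo)]


def lines_touched(addresses_and_sizes):
    """Count distinct cachelines covered by (address, size) ranges."""
    lines = set()
    for addr, size in addresses_and_sizes:
        first = addr // CACHELINE
        last = (addr + size - 1) // CACHELINE
        lines.update(range(first, last + 1))
    return len(lines)


# Made-up addresses: two unrelated locations versus one aligned block.
separate = lines_touched([(0x1000, ctypes.sizeof(SmpInfo)),
                          (0x2040, ctypes.sizeof(TlbFlushInfo))])
consolidated = lines_touched([(0x3000, ctypes.sizeof(Consolidated))])
```

Each line the responder does not have to pull from the initiator's cache is one fewer coherence miss on the shootdown path.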
Optimization 3: Early Acknowledgment
[Figure: responder stages Entry, SMP and TLB]
Safe: flush will happen
Better: initiator is faster
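The two claims on this slide can be checked against a simple event log. This sketch is only a model of the ordering: the `responder` function and its event names are hypothetical, and it says nothing about the conditions under which a real kernel may acknowledge early.

```python
def responder(early_ack: bool, log: list) -> None:
    """Handle one shootdown IPI; `log` records the order of events."""
    log.append("enter IPI handler")
    if early_ack:
        log.append("ack")          # the initiator may stop waiting here
    log.append("flush TLB")        # still happens before the handler returns
    if not early_ack:
        log.append("ack")
    log.append("return to interrupted code")


late_log, early_log = [], []
responder(early_ack=False, log=late_log)
responder(early_ack=True, log=early_log)
```

In both logs the flush precedes the return to the interrupted code, which is the "safe" part. With early acknowledgment the `ack` entry moves ahead of the flush, so the initiator's wait no longer includes the flush itself.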
Optimization 4: In-Context Flushes
[Figure: responder stages Entry, SMP and TLB]
1. Efficient
2. Better batching
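One possible reading of "in-context" is that the responder does not invalidate user mappings from the kernel's context, but records them and flushes once it is back in the user address space. The sketch below models that reading only; the class, its fields and the page numbers are assumptions, and the actual mechanism is described in the paper rather than on the slide.

```python
class ResponderCore:
    """Toy responder that defers user-mapping flushes to return-to-user."""

    def __init__(self):
        self.user_tlb = {0x10: 0xA, 0x11: 0xB, 0x12: 0xC}  # vpn -> frame
        self.pending = set()       # user pages to invalidate later
        self.flush_passes = 0      # how many times we actually flushed

    def handle_shootdown(self, vpns):
        """In the kernel context: only record what must be flushed."""
        self.pending.update(vpns)

    def return_to_user(self):
        """In the user context: invalidate every pending page in one pass."""
        if self.pending:
            for vpn in self.pending:
                self.user_tlb.pop(vpn, None)
            self.pending.clear()
            self.flush_passes += 1


core = ResponderCore()
core.handle_shootdown([0x10])
core.handle_shootdown([0x11])      # a second request arrives before returning
entries_before_return = dict(core.user_tlb)
core.return_to_user()
```

Two shootdown requests are served by a single flush pass here, which is one way to picture the "better batching" point. Stale entries stay in `user_tlb` only while the core runs kernel code.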
In the Paper
Userspace-safe batching
• Deferring TLB shootdowns while the kernel runs
Avoiding TLB flushes on Copy-on-Write
• Special case we can optimize
TLB flushes in virtualization
• The effect of page size mismatch
Many important and subtle details
Evaluation: Unmapping and Flushing 10 PTEs
madvise(MADV_DONTNEED)
[Figure: two bar charts, Initiator and Responder; y-axis: cycles; x-axis groups: samecore, samesocket, diffsocket; series: base, concurrent, cacheline, early-ack, in-context. Values printed on the bars, in extraction order (their assignment to groups and series is not recoverable from the transcript): first chart 16208, 7685, 14361, 6247, 16475, 6929; second chart 8411, 6785, 7313, 5879, 8039, 6290]
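For readers who want to see the operation this benchmark exercises, the following sketch issues the same call on ten pages. It is not the authors' benchmark: it measures nothing, it assumes Linux with Python 3.8 or later, and whether a remote shootdown actually occurs depends on other threads of the process running on other cores. The function name is made up.

```python
import mmap
import sys

N_PAGES = 10  # the slide's benchmark unmaps and flushes 10 PTEs


def unmap_ten_pages():
    """Touch N_PAGES anonymous pages, then drop them with MADV_DONTNEED.

    Returns the first byte of each page after the call, or None when the
    platform does not expose MADV_DONTNEED through the mmap module.
    """
    if not sys.platform.startswith("linux") or not hasattr(mmap, "MADV_DONTNEED"):
        return None
    length = N_PAGES * mmap.PAGESIZE
    # Private anonymous mapping, so dropped pages read back as zero on Linux.
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE)
    try:
        for page in range(N_PAGES):
            region[page * mmap.PAGESIZE] = 1           # fault the page in
        region.madvise(mmap.MADV_DONTNEED, 0, length)  # drop all ten pages
        return [region[page * mmap.PAGESIZE] for page in range(N_PAGES)]
    finally:
        region.close()


if __name__ == "__main__":
    print(unmap_ten_pages())
```

Reading a dropped page faults in a fresh zero page, so the kernel must ensure that no core keeps using the old translation. That requirement is what makes this call a convenient trigger for the TLB flush path.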
Evaluation: SysBench – Random Writes
[Figure: speedup (y-axis, 1 to 1.25) versus number of threads (x-axis, 0 to 25); series: base, concurrent, cacheline-consol, early-ack, in-context flushes, userspace-safe batching]
Random writes
Periodic flushes
Memory-mapped file
Emulated persistent memory, no write-cache
Conclusions
TLB shootdown can be improved
Doing it well in software ⇒ better hardware interfaces
We are working to push these enhancements to Linux