From A to E: Analyzing TPC’s OLTP Benchmarks
![Page 1: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/1.jpg)
From A to E: Analyzing TPC’s OLTP Benchmarks
The obsolete, the ubiquitous, the unknown
Pınar Tözün, Ippokratis Pandis*, Cansu Kaynak, Djordje Jevdjic, Anastasia Ailamaki
École Polytechnique Fédérale de Lausanne
*IBM Almaden Research Center
![Page 2: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/2.jpg)
OLTP Benchmarks of TPC
• Allow fair product comparisons
• Drive innovations for better performance
Timeline (axis 1985 to 2015):
• TPC-A (1989), banking: obsolete
• TPC-B (1990), banking: obsolete
• TPC-C (1992), wholesale supplier: ubiquitous, most common
• TPC-E (2007), brokerage house: unknown, results from one DBMS vendor
![Page 3: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/3.jpg)
How is TPC-E different?
• Hardware: micro-architectural behavior
  – Under-utilization due to instruction stalls
  – Fewer cache misses and higher IPC
• Storage manager: where does time go?
  – Harder to partition requests
  – Logical lock contention
• Workload: characteristics/statistics
  – More page re-use
  – Complex schema & transactions
  – Longer held locks
![Page 4: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/4.jpg)
Outline
• Preview
• Setup & Methodology
• Micro-architectural behavior
• Within the storage manager
• Conclusions
![Page 5: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/5.jpg)
Experimental Setup

| Server | Fat (Intel Xeon X5660) | Lean (Sun Niagara T2) |
|---|---|---|
| #Sockets | 2 | 1 |
| #Cores per Socket | 6 (OoO) | 8 (in-order) |
| #HW Contexts | 24 | 64 |
| Clock Speed | 2.80GHz | 1.40GHz |
| Memory | 48GB | 64GB |
| L3 | 12MB (shared) | – |
| L2 | 256KB (per core) | 4MB (shared) |
| L1-D | 32KB (per core) | 8KB (per core) |
| L1-I | 32KB (per core) | 16KB (per core) |
| OS | Ubuntu 10.04, Linux kernel 2.6.32 | SunOS 5.10 Generic_141414-10 |
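The hardware-context counts in the setup table follow from sockets, cores per socket, and hardware threads per core. The table does not list the threads-per-core figure, so this minimal sketch derives it from the table's own numbers; the function name `threads_per_core` is introduced here for illustration only.

```python
def threads_per_core(hw_contexts: int, sockets: int, cores_per_socket: int) -> int:
    """Hardware threads per core implied by the totals in the setup table."""
    cores = sockets * cores_per_socket
    if hw_contexts % cores != 0:
        raise ValueError("hardware contexts do not divide evenly across cores")
    return hw_contexts // cores


# Values copied from the Experimental Setup table.
servers = {
    "Intel Xeon X5660": {"hw_contexts": 24, "sockets": 2, "cores_per_socket": 6},
    "Sun Niagara T2": {"hw_contexts": 64, "sockets": 1, "cores_per_socket": 8},
}

smt = {name: threads_per_core(**spec) for name, spec in servers.items()}
for name, ways in smt.items():
    print(f"{name}: {ways} hardware threads per core")
```

The lean machine therefore relies on many more hardware threads per core than the fat one, which is worth keeping in mind when reading the per-core IPC results.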
![Page 6: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/6.jpg)
Methodology
• Shore-MT*
  – Scalable open-source storage manager
• Shore-Kits*
  – Application layer for Shore-MT
  – Workloads: TPC-B, TPC-C, TPC-E, ++
• Micro-architectural
  – Xeon X5660: VTune, Niagara T2: cputrack
  – Measured at peak throughput
• Storage manager profiling
  – Niagara T2: dtrace
*https://sites.google.com/site/shoremt
![Page 7: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/7.jpg)
![Page 8: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/8.jpg)
IPC on Fat & Lean Cores
[Figure: two bar charts of instructions per cycle (y-axis 0 to 4) for TPC-B, TPC-C, and TPC-E, one panel for Intel Xeon X5660 and one for Sun Niagara T2, each with the maximum IPC marked.]
• OLTP utilizes lean cores better
• TPC-E has higher IPC
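As a reminder of how the metric on this slide is defined, IPC is the number of retired instructions divided by the number of elapsed cycles, and utilization compares it with the core's maximum. The sketch below uses hypothetical counter readings and a hypothetical maximum of 4.0; neither is a measurement from the talk.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions retired per clock cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def utilization(measured_ipc: float, max_ipc: float) -> float:
    """Fraction of the core's peak instruction rate that is actually used."""
    if max_ipc <= 0:
        raise ValueError("max_ipc must be positive")
    return measured_ipc / max_ipc


# Hypothetical counter readings, not measurements from the talk.
measured = ipc(instructions=1_200_000, cycles=2_000_000)
print(f"IPC = {measured:.2f}")
# max_ipc=4.0 is an assumed peak used only for illustration.
print(f"utilization = {utilization(measured, max_ipc=4.0):.0%}")
```

Comparing utilization rather than raw IPC is what makes the fat and lean cores comparable, since their maximum IPC differs.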
![Page 9: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/9.jpg)
Execution Cycles and Stalls
Intel Xeon X5660
[Figure: two stacked bar charts (0% to 100%) for TPC-B, TPC-C, and TPC-E: execution cycles breakdown (busy vs. stalled) and core stalls (instruction vs. rest).]
• More than half of execution time goes to stalls
• Instruction stalls are the main problem
![Page 10: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/10.jpg)
Cache Misses
[Figure: stacked bar charts of misses per k-instructions (y-axis 0 to 100) for TPC-B, TPC-C, and TPC-E, broken down by cache level (L1-I, L1-D, L2-I, L2-D, and LLC in one panel). Panels: Intel Xeon X5660 (32KB L1-I & 32KB L1-D) and Sun Niagara T2 (16KB L1-I & 8KB L1-D).]
• TPC-E has lower data miss ratio (MPKI)
• L1-I misses dominate
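MPKI, the unit on this slide's y-axis, is cache misses normalized per thousand executed instructions. A minimal sketch of the calculation follows; the miss counts are hypothetical placeholders chosen only to show the arithmetic, not values read off the charts.

```python
def mpki(misses: int, instructions: int) -> float:
    """Misses per thousand (kilo) instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * 1000 / instructions


# Hypothetical counter readings, not measurements from the talk.
instructions = 1_000_000
miss_counts = {"L1-I": 30_000, "L1-D": 12_000, "L2-I": 4_000, "L2-D": 6_000}

breakdown = {level: mpki(count, instructions) for level, count in miss_counts.items()}
dominant = max(breakdown, key=breakdown.get)

for level, value in breakdown.items():
    print(f"{level}: {value:.1f} MPKI")
print(f"dominant miss source: {dominant}")
```

Normalizing by instruction count lets workloads with very different transaction lengths be compared on the same axis.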
![Page 11: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/11.jpg)
Why does TPC-E have a lower miss ratio?
[Figure: two bar charts of averages per transaction for TPC-B, TPC-C, and TPC-E: #records accessed (y-axis 0 to 150) and #pages accessed, split into heap and index pages (y-axis 0 to 40).]
• More scans in TPC-E
• Increased page reuse
![Page 12: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/12.jpg)
![Page 13: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/13.jpg)
From A to E: Schema
[Figure: schema diagrams for TPC-B, TPC-C, and TPC-E, with tables marked as fixed, scaling, or growing; labeled tables include branch, warehouse, and customer.]
• Increasing schema complexity
![Page 14: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/14.jpg)
From A to E: Transactions

| | TPC-B | TPC-C | TPC-E |
|---|---|---|---|
| #Transactions | 1 | 5 | 12 |
| Transaction Mix | RW 100% | RW 92% | RW 23% |
| Secondary Indexes | None | 2 transactions | 10 transactions |
| Transaction Input includes | Branch ID | Warehouse ID | Customer ID or Broker ID or Trade ID or … |

• Harder to partition
• More complexity & variety in transaction mix
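The "RW" percentages are the share of the transaction mix that updates data. The slides do not list per-transaction weights, so the sketch below assumes the commonly used TPC-C mix (New-Order 45%, Payment 43%, and 4% each for Order-Status, Delivery, and Stock-Level) to show how a 92% read-write share can arise. The dictionary layout and function name are illustrative, not taken from Shore-Kits.

```python
# Assumed TPC-C mix: name -> (weight in percent, is_read_write).
# The weights are an assumption; the slides only state the 92% total.
TPCC_MIX = {
    "NewOrder": (45, True),
    "Payment": (43, True),
    "OrderStatus": (4, False),
    "Delivery": (4, True),
    "StockLevel": (4, False),
}

# TPC-B has a single transaction type, which is read-write.
TPCB_MIX = {"AccountUpdate": (100, True)}


def read_write_share(mix: dict) -> float:
    """Percentage of the mix made up of read-write transactions."""
    total = sum(weight for weight, _ in mix.values())
    read_write = sum(weight for weight, is_rw in mix.values() if is_rw)
    return 100 * read_write / total


print(f"TPC-B read-write share: {read_write_share(TPCB_MIX):.0f}%")
print(f"TPC-C read-write share: {read_write_share(TPCC_MIX):.0f}%")
```

The same function applied to TPC-E's twelve transaction types would need their weights, which the slides do not give.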
![Page 15: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/15.jpg)
Within the Storage Manager
Sun Niagara T2, 64 HW contexts
[Figure: stacked bar chart of time breakdown (0% to 100%) into Locking, Logging, BPool, Btree, and Other at 4, 16, and 48 HW contexts for TPC-B, TPC-C, and TPC-E. Panel labels: SF 64 – 0.6GB Spread; SF 64 – 8.2GB Spread; SF 1 – 20GB No-Spread.]
![Page 16: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/16.jpg)
Within the Storage Manager
Sun Niagara T2, 64 HW contexts
[Figure: stacked bar chart of time breakdown (0% to 100%) into Locking, Logging, BPool, Btree, and Other at 4, 16, and 48 HW contexts for TPC-B, TPC-C, and TPC-E. Panel labels: SF 64 – 0.6GB Spread; SF 64 – 8.2GB Spread; SF 1 – 20GB No-Spread.]
• Lock manager is the main bottleneck for TPC-E
![Page 17: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/17.jpg)
Inside the Lock Manager
[Figure: stacked bar chart of lock manager time breakdown (0% to 100%) into Lock Acquisition, Logical Contention, and Physical Contention at 4, 16, and 48 HW contexts for TPC-B, TPC-C, and TPC-E. Panel labels: SF 64 – 0.6GB Spread; SF 64 – 8.2GB Spread; SF 1 – 20GB No-Spread.]
• Logical contention even for a large DB
![Page 18: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/18.jpg)
Conclusions
• Modern hardware is still highly under-utilized
  – TPC-E: fewer misses, less stall time, higher IPC
  – OLTP utilizes less aggressive cores better
• Instruction footprint is too large to fit in L1-I
  – Spread instructions, (software guided) prefetching
  – Code/Compiler optimizations
• Logical lock contention due to hotspots
  – Increased complexity in schema and transactions
  – TPC-E: harder to physically partition
  – Logical partitioning, OCC
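The last bullet names OCC (optimistic concurrency control) as one way to avoid holding locks on hotspots. The slides give no design, so the following is only a toy, single-threaded sketch of the general idea: transactions read without locking, record the versions they saw, and validate those versions at commit. The class names `VersionedStore` and `Transaction` are hypothetical and unrelated to Shore-MT.

```python
class VersionedStore:
    """Toy key-value store that keeps a version counter per key."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def read(self, key):
        # Missing keys read as (None, version 0).
        return self._data.get(key, (None, 0))

    def commit(self, read_set, write_set):
        # Validation phase: abort if any key read has changed since it was read.
        for key, seen_version in read_set.items():
            if self.read(key)[1] != seen_version:
                return False
        # Write phase: install new values and bump versions.
        # A real system would make validation and writes atomic.
        for key, value in write_set.items():
            _, version = self.read(key)
            self._data[key] = (value, version + 1)
        return True


class Transaction:
    """Buffers writes locally and remembers the version of every key it read."""

    def __init__(self, store):
        self.store = store
        self.read_set = {}   # key -> version observed at first read
        self.write_set = {}  # key -> new value

    def get(self, key):
        if key in self.write_set:
            return self.write_set[key]
        value, version = self.store.read(key)
        self.read_set.setdefault(key, version)
        return value

    def put(self, key, value):
        self.write_set[key] = value

    def commit(self):
        return self.store.commit(self.read_set, self.write_set)


store = VersionedStore()
t1, t2 = Transaction(store), Transaction(store)
# Both transactions touch the same hot record without taking a lock.
t1.put("balance", (t1.get("balance") or 0) + 10)
t2.put("balance", (t2.get("balance") or 0) + 5)
print("t1 committed:", t1.commit())
print("t2 committed:", t2.commit())  # fails validation and would be retried
```

Under OCC a hotspot still causes conflicts, but they surface as aborts and retries at commit time rather than as time spent waiting on logical locks.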
![Page 19: From A to E: Analyzing TPC’s OLTP Benchmarks](https://reader035.vdocuments.mx/reader035/viewer/2022070408/568143a3550346895db02700/html5/thumbnails/19.jpg)
The obsolete: TPC-B
The ubiquitous: TPC-C
The unexplored: TPC-E
Directed by
Produced by
Also starring: Shore-MT, Xeon X5660, Niagara T2