-
H-Store: A Specialized Architecture for High-throughput OLTP Applications
Evan Jones (MIT)
Andrew Pavlo (Brown)
13th Intl. Workshop on High Performance Transaction Systems
October 26, 2009
-
Intel Xeon E5540 (Nehalem/Core i7)
Source: Intel 64 and IA-32 Architectures Optimization Reference Manual
-
Distributed Clusters
-
Scaling OLTP on Multi-Core?
Use a distributed shared-nothing design
-
How to Make a Faster OLTP DBMS
Main memory storage
Replication for durability
Explicitly partition the data
Specialized concurrency control
Stored procedures only
Single partition: execute one transaction at a time
Multiple partitions: supported but slow
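The single-partition case above can be pictured with a minimal sketch. It is not H-Store's implementation: the `Partition` class, the `register` and `execute` methods, and the `increment` procedure are hypothetical names chosen for illustration. It only shows the idea that a partition holding its data in main memory and running pre-registered stored procedures one at a time needs no locks or latches.

```python
from typing import Callable, Dict, List

# Hypothetical stored-procedure type: receives the partition's data plus arguments.
Procedure = Callable[..., object]


class Partition:
    """One main-memory partition that runs stored procedures serially."""

    def __init__(self) -> None:
        self.data: Dict[int, int] = {}              # main-memory storage
        self.procedures: Dict[str, Procedure] = {}  # stored procedures only

    def register(self, name: str, proc: Procedure) -> None:
        self.procedures[name] = proc

    def execute(self, name: str, *args: object) -> object:
        # Assumed to be called from a single thread, so each procedure runs
        # to completion before the next one starts; no locking is required.
        return self.procedures[name](self.data, *args)


def increment(data: Dict[int, int], key: int) -> int:
    """Illustrative procedure: add one to the counter stored under key."""
    data[key] = data.get(key, 0) + 1
    return data[key]


partition = Partition()
partition.register("increment", increment)
results: List[object] = [partition.execute("increment", 0) for _ in range(3)]
```

A multi-partition transaction would have to coordinate several such objects, which is where the extra cost noted on the slide comes from.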
-
Source: S. Harizopoulos, D. J. Abadi, S. Madden, M. Stonebraker, “OLTP Under the Looking Glass”, SIGMOD 2008.
OLTP: Where does the time go?
-
Users Rely on Partitioning
Source: R. Shoup, D. Pritchett, “The eBay Architecture,” SD Forum, Nov. 2006.
-
What about multi-core?
Traditional approach:
One database process
Thread per connection
Shared-memory, locks and latches
H-Store approach:
Thread per partition
Distributed transactions
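As a rough illustration of "thread per partition", the sketch below gives each partition its own worker thread and inbox queue, so only that thread ever touches the partition's data. The names `PartitionWorker`, `route`, and `NUM_PARTITIONS`, and the modulo routing rule, are assumptions made for this example and are not taken from H-Store.

```python
import queue
import threading
from typing import Dict

NUM_PARTITIONS = 4  # assumed value for illustration


class PartitionWorker(threading.Thread):
    """Owns one partition's data; only this thread reads or writes it."""

    def __init__(self) -> None:
        super().__init__(daemon=True)
        self.inbox: "queue.Queue" = queue.Queue()
        self.counters: Dict[int, int] = {}

    def run(self) -> None:
        while True:
            item = self.inbox.get()
            if item is None:  # shutdown signal
                break
            key, reply = item
            # Requests are processed one at a time, so no lock is needed.
            self.counters[key] = self.counters.get(key, 0) + 1
            reply.put(self.counters[key])


def route(key: int) -> int:
    """Hypothetical router: choose the partition that owns a key."""
    return key % NUM_PARTITIONS


workers = [PartitionWorker() for _ in range(NUM_PARTITIONS)]
for worker in workers:
    worker.start()

reply: "queue.Queue" = queue.Queue()
keys = [0, 1, 0, 5, 0]
for key in keys:
    workers[route(key)].inbox.put((key, reply))
replies = sorted(reply.get() for _ in keys)

for worker in workers:
    worker.inbox.put(None)
for worker in workers:
    worker.join()
```

Contrast this with the traditional design, where every connection thread may touch any row and must therefore take locks and latches on shared structures.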
-
Example Microbenchmark
One table per client
Table(id INTEGER, counter INTEGER)
Each client executes the following query:
UPDATE Table
SET counter = counter + 1
WHERE id = 0;
Add clients to find maximum throughput
Data on RAM disk
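For readers who want to try something similar, here is a small sketch of the microbenchmark's shape using Python's built-in `sqlite3` module with an in-memory database. This is an assumption-laden stand-in: the talk does not say which DBMS or driver was used, the table names `table_0`, `table_1` are hypothetical (a bare `Table` is a reserved word in SQLite), and the clients here run one after another rather than concurrently.

```python
import sqlite3
import time


def run_client(conn: sqlite3.Connection, client_id: int, num_updates: int) -> int:
    """Create this client's table and run the counter update repeatedly."""
    table = f"table_{client_id}"  # one table per client (hypothetical naming)
    conn.execute(f"CREATE TABLE {table} (id INTEGER, counter INTEGER)")
    conn.execute(f"INSERT INTO {table} VALUES (0, 0)")
    for _ in range(num_updates):
        conn.execute(f"UPDATE {table} SET counter = counter + 1 WHERE id = 0")
    conn.commit()
    row = conn.execute(f"SELECT counter FROM {table} WHERE id = 0").fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")  # stands in for "data on RAM disk"
start = time.perf_counter()
final_counters = [run_client(conn, client_id, 1000) for client_id in range(2)]
elapsed = time.perf_counter() - start
throughput = sum(final_counters) / elapsed  # updates per second
conn.close()
```

Finding the maximum throughput, as on the slide, would mean running the clients concurrently and increasing their number until the measured rate stops rising.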
-
Experimental Configuration
[Figure: the two configurations compared, labeled Threads and Partitions]
-
Partitions versus Threads
[Figure: relative speedup (y-axis, 0 to 12) versus number of CPUs (x-axis, 1 to 16) for three series: Ideal, Partitions, Threads]
-
Scalability Analysis
Partitions scale better than threads.
Threads: contention for shared resources [1]
Partitions: memory bottleneck causes sublinear scaling
H-Store: Not just for distributed shared-nothing clusters
[1] R. Johnson et al., "Shore-MT: A Scalable Storage Manager for the Multicore Era," EDBT 2009.
-
Multi-core Design Problem
How to automatically create a data placement scheme to improve multi-core throughput?
Data Partitioning:
Maximize the number of single-partition transactions.
Data Placement:
Maximize the number of single-node transactions.
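The partitioning objective above can be made concrete with a small scoring function. This is only a sketch: the function name `single_partition_fraction`, the representation of a transaction as a list of keys, and the two modulo partitioning rules are all assumptions for illustration, not the design tool described in the talk.

```python
from typing import Callable, Iterable, List


def single_partition_fraction(
    transactions: Iterable[List[int]],
    partition_of: Callable[[int], int],
) -> float:
    """Return the fraction of transactions whose keys all map to one partition."""
    single = 0
    total = 0
    for keys in transactions:
        total += 1
        if len({partition_of(key) for key in keys}) == 1:
            single += 1
    return single / total if total else 0.0


# Hypothetical workload: each transaction is the list of keys it touches.
workload = [[1, 5], [2, 6], [1, 3], [3]]

score_mod_4 = single_partition_fraction(workload, lambda key: key % 4)
score_mod_2 = single_partition_fraction(workload, lambda key: key % 2)
```

Under these assumptions, a candidate scheme with a higher score leaves fewer transactions needing cross-partition coordination. The same measure applies to placement if `partition_of` is replaced by a function that returns the node holding a key.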
-
Database Partitioning
Select partitioning keys and construct schema tree.
[Figure: the TPC-C schema (WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK, ITEM) shown next to the resulting schema tree of WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM and STOCK, with ITEM marked as replicated]
-
Database Partitioning
Combine table fragments into partitions.
[Figure: the schema tree tables (WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK) are split into fragments labeled P1 through P5, and the fragments with the same label are combined into one partition; each partition also holds a copy of the replicated ITEM table]
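One way to picture combining fragments into partitions is the sketch below. It assumes the warehouse id is the partitioning key and that ITEM is copied to every partition, in line with the figure. The column names (`w_id`, `d_id`, `c_id`, `i_id`), the modulo rule, and the `build_partitions` function are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

NUM_PARTITIONS = 5            # the figure labels fragments P1 through P5
REPLICATED_TABLES = {"ITEM"}  # copied to every partition

Row = Dict[str, int]
PartitionData = Dict[str, List[Row]]


def build_partitions(rows: List[Tuple[str, Row]]) -> List[PartitionData]:
    """Group rows by warehouse id; replicate ITEM rows everywhere."""
    partitions: List[PartitionData] = [{} for _ in range(NUM_PARTITIONS)]
    for table, row in rows:
        if table in REPLICATED_TABLES:
            targets = list(range(NUM_PARTITIONS))
        else:
            # Assumed rule: warehouse ids start at 1 and map round-robin.
            targets = [(row["w_id"] - 1) % NUM_PARTITIONS]
        for index in targets:
            partitions[index].setdefault(table, []).append(row)
    return partitions


rows = [
    ("WAREHOUSE", {"w_id": 1}),
    ("DISTRICT", {"w_id": 1, "d_id": 7}),
    ("CUSTOMER", {"w_id": 2, "c_id": 42}),
    ("ITEM", {"i_id": 100}),
]
partitions = build_partitions(rows)
```

Because every row that descends from the same warehouse lands in the same partition, a transaction that touches a single warehouse can run on one partition, and reads of ITEM never leave it.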
-
Data Placement
Assign partitions to cores on each node.
[Figure: partitions P1 through P5, each with its copy of ITEM, are mapped by partition affinity onto the hardware threads (HT1, HT2) of the cores (Core1, Core2) of cluster nodes Node 1 through Node n]
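The placement step can be sketched as a mapping from partitions to (node, core, hardware thread) slots. The version below simply fills one node before moving to the next. That fill order is an assumption made for the example, as is the function name `place_partitions`; the talk does not describe the actual affinity-based algorithm, which would need to account for which partitions are accessed together.

```python
from itertools import product
from typing import Dict, List, Tuple

Slot = Tuple[int, int, int]  # (node, core, hardware thread), all 1-based


def place_partitions(
    partitions: List[str],
    nodes: int,
    cores_per_node: int,
    threads_per_core: int,
) -> Dict[str, Slot]:
    """Assign each partition its own hardware thread, filling nodes in order."""
    slots = list(
        product(
            range(1, nodes + 1),
            range(1, cores_per_node + 1),
            range(1, threads_per_core + 1),
        )
    )
    if len(partitions) > len(slots):
        raise ValueError("more partitions than hardware threads")
    return dict(zip(partitions, slots))


placement = place_partitions(
    ["P1", "P2", "P3", "P4", "P5"],
    nodes=2,
    cores_per_node=2,
    threads_per_core=2,
)
```

With two cores of two hardware threads each, as in the figure, the first four partitions share Node 1 and the fifth spills to the next node.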
-
H-Store’s Future
New Name. New Company.
Six full-time developers.
Open-source project (GPL)
Beta by end of 2009
Multiple deployments in financial service areas.
-
More Information
H-Store Info + Papers:
http://db.cs.yale.edu/hstore/
VoltDB Project Information:
http://www.voltdb.org/